2017-12-31 56 views
-3

Leading whitespace and capitalization

How can I remove a single leading space from the start of a paragraph and capitalize the first letter of the paragraph using Python?

Input:

this is a sample sentence. This is a sample second sentence. 

Output:

This is a sample sentence. This is a sample second sentence.

My attempt so far:

import spacy, re 
nlp = spacy.load('en_core_web_sm') 
doc = nlp(unicode(open('2.txt').read().decode('utf8'))) 
tagged_sent = [(w.text, w.tag_) for w in doc] 
normalized_sent = [w.capitalize() if t in ["NN","NNS"] else w for (w,t) in tagged_sent] 
normalized_sent1 = normalized_sent[0].capitalize() 
string = re.sub(" (?=[\.,'!?:;])", "", ' '.join(normalized_sent1)) 
rtn = re.split('([.!?] *)', string) 
final = ''.join([i.capitalize() for i in rtn]) 
print final 

This capitalizes the first word of every sentence in the paragraph except the one at the very start of the paragraph.

Output: 
on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 

Expected output: 
On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 
+2

What have you tried so far? Please post your code. – James

+1

Please define "paragraph". – Sweeper

+0

Can I use the 'nltk' library? –

Answers

1

If your requirement is only to remove the leading space and capitalize the first letter, you can try something like this:

your_data=' on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. you can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. when you create pictures, charts, or diagrams, they also coordinate with your current document look. ' 
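# Drop a single leading space, if there is one.
conversion = list(your_data)
if conversion and conversion[0] == ' ':
    del conversion[0]

# Split into words, then capitalize the first word and any word that follows a period.
capitalize = "".join(conversion).split()
for j, i in enumerate(capitalize):
    try:
        if j == 0:
            capitalize[j] = capitalize[j].capitalize()

        if '.' in i:
            capitalize[j + 1] = capitalize[j + 1].capitalize()
    except IndexError:
        # The last word has no successor to capitalize.
        pass

print(" ".join(capitalize))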

Output:

On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 
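
If only the start of the paragraph needs fixing, as the question states, a shorter sketch is possible. The helper name `fix_paragraph_start` is hypothetical and not part of the answer above. It removes at most one leading space and uppercases the first character, leaving the rest of the text untouched.

```python
def fix_paragraph_start(text):
    """Remove one leading space (if any) and uppercase the first character."""
    # Hypothetical helper, written for illustration only.
    if text.startswith(' '):
        text = text[1:]
    # Slicing instead of indexing keeps this safe for an empty string.
    return text[:1].upper() + text[1:]


print(fix_paragraph_start(' this is a sample sentence. This is a sample second sentence.'))
# This is a sample sentence. This is a sample second sentence.
```

Unlike the loop above, this does not touch later sentences, so it only fits input where those are already capitalized.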
2

You can use a regular expression and str.capitalize():

import re 
s = " this is a sample sentence. This is a sample second sentence." 
# Raw strings keep the regex escapes (\. and \s) from being read as string escapes.
new_s = '. '.join(i.capitalize() for i in re.split(r'\.\s', re.sub(r'^\s+', '', s)))

Output:

'This is a sample sentence. This is a sample second sentence.' 
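
One thing to keep in mind with this approach: `str.capitalize()` lowercases every character after the first, so a sentence such as `the NLTK library` becomes `The nltk library`. A possible variant, sketched here with a hypothetical helper `capitalize_sentences`, uppercases only the first letter of each sentence by passing a replacement function to `re.sub`. Treating `!` and `?` as sentence ends is an assumption that goes beyond the original answer.

```python
import re


def capitalize_sentences(text):
    """Strip leading whitespace and uppercase the first letter of each sentence."""
    # Hypothetical helper, written for illustration only.
    text = text.lstrip()
    # Group 1 is either the start of the text or a sentence end ('.', '!', '?')
    # followed by whitespace; group 2 is the lowercase letter to uppercase.
    pattern = re.compile(r'(^|[.!?]\s+)([a-z])')
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), text)


print(capitalize_sentences(' this uses the NLTK library. is it fine? yes.'))
# This uses the NLTK library. Is it fine? Yes.
```

Because the rest of each sentence is left alone, acronyms and proper nouns keep their casing. Abbreviations such as "e.g. something" would still be treated as sentence boundaries.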
+0

Thanks, but my expected output is: This is a sample sentence. This is a sample second sentence. –

+0

@Programmer_nltk Please see my recent edit. – Ajax1234

1

A simple solution would be the following (though I recommend @Ajax1234's answer):

x = 'on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. ' 
print('. '.join(map(lambda s: s.strip().capitalize(), x.split('.')))) 

Output:

On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.