2017-11-20

StanfordCoreNLP - setting pipelineLanguage to German not working?

I am using the pycorenlp client to talk to the Stanford CoreNLP server. In my setup I set pipelineLanguage to german like this:

from pycorenlp import StanfordCoreNLP 

nlp = StanfordCoreNLP('http://localhost:9000') 

text = 'Das große Auto.' 

output = nlp.annotate(text, properties={ 
    'annotators': 'tokenize,ssplit,pos,depparse,parse', 
    'outputFormat': 'json', 
    'pipelineLanguage': 'german' 
    }) 

However, judging by the result, it does not seem to work:

output['sentences'][0]['tokens'] 

returns

[{'after': ' ', 
    'before': '', 
    'characterOffsetBegin': 0, 
    'characterOffsetEnd': 3, 
    'index': 1, 
    'originalText': 'Das', 
    'pos': 'NN', 
    'word': 'Das'}, 
{'after': ' ', 
    'before': ' ', 
    'characterOffsetBegin': 4, 
    'characterOffsetEnd': 9, 
    'index': 2, 
    'originalText': 'große', 
    'pos': 'NN', 
    'word': 'große'}, 
{'after': '', 
    'before': ' ', 
    'characterOffsetBegin': 10, 
    'characterOffsetEnd': 14, 
    'index': 3, 
    'originalText': 'Auto', 
    'pos': 'NN', 
    'word': 'Auto'}, 
{'after': '', 
    'before': '', 
    'characterOffsetBegin': 14, 
    'characterOffsetEnd': 15, 
    'index': 4, 
    'originalText': '.', 
    'pos': '.', 
    'word': '.'}] 

The tags should look more like this:

 Das große Auto 
POS: DT  JJ NN 

For some reason, setting 'pipelineLanguage': 'german' does not seem to have any effect.
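To make the mismatch easier to see, the token list can be reduced to (word, tag) pairs. This is only a minimal sketch that assumes the JSON layout shown above; `pos_pairs` is a hypothetical helper name, not part of pycorenlp, and `sample` is a trimmed copy of the response from this question.

```python
def pos_pairs(output):
    """Return (word, pos) pairs for every token in a CoreNLP JSON result."""
    return [
        (token["word"], token.get("pos"))
        for sentence in output.get("sentences", [])
        for token in sentence.get("tokens", [])
    ]


# Trimmed copy of the response shown above (only the fields used here).
sample = {
    "sentences": [
        {
            "tokens": [
                {"index": 1, "word": "Das", "pos": "NN"},
                {"index": 2, "word": "große", "pos": "NN"},
                {"index": 3, "word": "Auto", "pos": "NN"},
                {"index": 4, "word": ".", "pos": "."},
            ]
        }
    ]
}

for word, pos in pos_pairs(sample):
    print(word, pos)
```

With a live server, the same helper could be called on the `output` dictionary returned by `nlp.annotate(...)`.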

I started the server with:

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000 


This is what I get from the logger:

[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000 
[pool-1-thread-3] ERROR CoreNLP - Failure to load language specific properties: StanfordCoreNLP-german.properties for german 
[pool-1-thread-3] INFO CoreNLP - [/127.0.0.1:60700] API call w/annotators tokenize,ssplit,pos,depparse,parse 
Das große Auto. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos 
[pool-1-thread-3] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.5 sec]. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 99996, Elapsed Time: 8.645 (s) 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [9.8 sec]. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.3 sec]. 

Apparently the server is loading the models for English, without warning me about it.
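A quick way to spot this fallback is to scan the server log for the properties error and for the model paths that were loaded. The following is a rough sketch, assuming the log lines have been captured as strings (for example by redirecting the server's output to a file); `summarize_log` and the two constants are hypothetical names introduced for illustration.

```python
import re

# Matches model paths as they appear in the log above, e.g.
# "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz".
MODEL_PATH = re.compile(r"edu/stanford/nlp/models/\S+")
PROPS_ERROR = "Failure to load language specific properties"


def summarize_log(lines):
    """Collect loaded model paths and language-properties errors from log lines."""
    models, errors = [], []
    for line in lines:
        if PROPS_ERROR in line:
            errors.append(line.strip())
        models.extend(MODEL_PATH.findall(line))
    return models, errors


# Three lines copied from the log in this question.
log = [
    "[pool-1-thread-3] ERROR CoreNLP - Failure to load language specific properties: StanfordCoreNLP-german.properties for german",
    "[pool-1-thread-3] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.5 sec].",
    "[pool-1-thread-3] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.3 sec].",
]

models, errors = summarize_log(log)
print("errors:", len(errors))
for path in models:
    print("loaded:", path)
```

Here the single ERROR line, followed by model paths that contain "english", is the hint that the German properties were never found.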

Answer


Alright, I just downloaded the models jar for German from the Stanford CoreNLP website and moved it into the directory where I had extracted the server, e.g.

~/Downloads/stanford-corenlp-full-2017-06-09 $ 

After restarting the server, the German models were loaded successfully:

[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos 
[pool-1-thread-3] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger ... done [5.1 sec]. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model file: edu/stanford/nlp/models/parser/nndep/UD_German.gz ... 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 99984, Elapsed Time: 11.419 (s) 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [12.2 sec]. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse 
[pool-1-thread-3] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/germanFactored.ser.gz ... done [1.0 sec]. 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma 
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner 
[pool-1-thread-3] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz ... done [0.7 sec].
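Since the server is started with -cp "*", only jars in the current directory end up on the classpath. Before restarting, one could check whether any jar there actually ships the properties file named in the error message. This is a sketch under the assumption that the file sits at the top level of the German models jar; `jars_containing` is a hypothetical helper, and the directory is the example path from this answer.

```python
import zipfile
from pathlib import Path


def jars_containing(directory, resource):
    """Return names of the .jar files in `directory` that contain `resource`."""
    matches = []
    for jar in sorted(Path(directory).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if resource in archive.namelist():
                    matches.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. a partial download).
            continue
    return matches


if __name__ == "__main__":
    # Example location from this answer; adjust to your own setup.
    server_dir = Path.home() / "Downloads" / "stanford-corenlp-full-2017-06-09"
    print(jars_containing(server_dir, "StanfordCoreNLP-german.properties"))
```

An empty list would suggest that the German models jar is missing from the server directory, which matches the "Failure to load language specific properties" error in the question.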