with open() block using NaiveBayesClassifier takes too long

I have a CSV file with 3483 rows, 460K characters, and 65K words, and I am trying to train a NaiveBayes classifier in Scikit on this corpus.

The problem is that when I run the statement below, it takes too long and never finishes.

from textblob import TextBlob
from textblob.classifiers import NaiveBayesClassifier
import csv

# Train directly from the open CSV file; this is the call that never finishes.
with open('train.csv', 'r') as fp:
    cl = NaiveBayesClassifier(fp, format="csv")

Any idea what I am doing wrong?

Thanks in advance.

Source

2017-02-12 Flavio

Is your CSV file formatted like this: '''instagrama, instagram''' ? See http://textblob.readthedocs.io/en/dev/classifiers.html – vendaTrout

Yes @vendaTrout, here is an example of the file: ''' #fb, facebook facebookio, facebook facebooktime messenger iphone, facebook whatsapp com, whatssup facebooko #fb, facebook facebookiokio #fb, facebook instagramas: instagram facebook https: fb, facebook facebook #fb, facebook ''' – Flavio

Assuming each training text and its label sit on their own row, separated by "\n", could you try a smaller CSV, or profile the function? Take a look at the stdlib [profiling](https://docs.python.org/3/library/profile.html) module. – vendaTrout
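Both suggestions in the comment above (a smaller CSV and profiling) can be tried with the standard library alone. The following is only a minimal sketch: `load_rows`, `profile_call`, and `stand_in_trainer` are hypothetical helper names, the three sample rows are made up, and the stand-in trainer exists only so the snippet runs without TextBlob installed.

```python
import cProfile
import csv
import io
import itertools
import pstats


def load_rows(fp, limit=None):
    """Read up to `limit` (text, label) pairs from an open CSV file."""
    reader = csv.reader(fp)
    # Skip rows that do not have both a text column and a label column.
    rows = ((row[0], row[1]) for row in reader if len(row) >= 2)
    return list(itertools.islice(rows, limit))


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def stand_in_trainer(rows):
    # Placeholder for the real training call (assumption, not TextBlob code).
    return {label for _, label in rows}


# Made-up sample data standing in for train.csv.
sample = io.StringIO("fb post,facebook\ninsta pic,instagram\nwa msg,whatsapp\n")
subset = load_rows(sample, limit=2)
labels, report = profile_call(stand_in_trainer, subset)
print(report)
```

With TextBlob installed, `stand_in_trainer` could be swapped for `NaiveBayesClassifier`, which according to its documentation also accepts a list of `(text, label)` tuples. Raising `limit` step by step should show at what size the training call starts to stall.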

This library has a known problem, documented at the following links:

https://github.com/sloria/TextBlob/pull/136

https://github.com/sloria/TextBlob/issues/77

Long story short: the library does not deal well with large datasets.
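One way to check this kind of scaling claim on your own data is to time training on growing prefixes of the corpus. The sketch below is illustrative only: `time_training` and `stand_in_trainer` are hypothetical names, the rows are synthetic, and the stand-in builds one feature entry per vocabulary word per row purely to show a cost that grows faster than linearly. It is not TextBlob's code.

```python
import time


def time_training(train, rows, sizes):
    """Return a list of (size, seconds) for training on the first `size` rows."""
    timings = []
    for size in sizes:
        start = time.perf_counter()
        train(rows[:size])
        timings.append((size, time.perf_counter() - start))
    return timings


def stand_in_trainer(rows):
    # Hypothetical trainer: work is proportional to rows x vocabulary size.
    vocabulary = {word for text, _ in rows for word in text.split()}
    features = []
    for text, _ in rows:
        tokens = set(text.split())
        features.append({word: word in tokens for word in vocabulary})
    return features


# Synthetic corpus: every row adds one new word to the vocabulary.
rows = [(f"word{i} common", "label") for i in range(200)]
timings = time_training(stand_in_trainer, rows, [50, 100, 200])
for size, seconds in timings:
    print(f"{size} rows: {seconds:.4f} s")
```

If the time roughly quadruples whenever the row count doubles, the full 3483-row file will cost far more than a small sample suggests, and a different classifier implementation may be the more practical route.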

Source

2017-02-15 11:50:13 Flavio
