Applying a function to a column of a pandas dataframe

I have a dataframe of users' comments on movies, and I would like to parse out examples where a user describes a movie as "movie1" meets "movie2".

User id    Old id_New id    Score  Comments
947952018  3101_771355141   3.0    If you want to see a comedy and have a stupid ...
805407067  11903_18330      5.0    Argento?s fever dream masterpiece. Fairy tale ...
901306244  16077_771225176  4.5    Evil Dead II meets Brothers Grimm and Hawkeye ...
901306244  NaN_381422014    1.0    Biggest disappointment! There&#39;s a host of ...
15169683   NaN_22471        3.0    You know in the original story of Pinocchio he...

I wrote a function that finds the word "meets" and takes the first n words before and after it, which (hopefully) captures the essence of movie1 and movie2. Later I plan to fuzzy-match these against titles in another dataframe.

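def parse_movie(comment, num_words):
    # Split the comment around the first occurrence of 'meets'
    words = comment.partition('meets')
    # rsplit keeps the last num_words words separate (split would lump the tail into one item)
    words_before = words[0].rsplit(maxsplit=num_words)[-num_words:]
    words_after = words[2].split(maxsplit=num_words)[:num_words]
    movie1 = ' '.join(words_before)
    movie2 = ' '.join(words_after)
    return movie1, movie2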

How can I apply this function to the Comments column of the original pandas dataframe and put the returned movie1 and movie2 titles in separate columns? I tried:

df['Comments'].apply(parse_movie)

but I also want to pass num_words. Calling the function directly on the column does not work for me either, and I don't know how to put the resulting movies into new columns.

parse_movie(sample['Comments'], 4) 
AttributeError: 'Series' object has no attribute 'partition'

Any suggestions would be appreciated.

Source

2017-12-19 Matt

You can pass arguments to the function through `apply()` using its `args` argument. See the [docs](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html).
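A minimal sketch of what this comment suggests, assuming a small made-up dataframe (the sample rows are hypothetical, and `parse_movie` is restated here in condensed form so the snippet runs on its own). `Series.apply` forwards the tuple given in `args` as extra positional arguments after each value:

```python
import pandas as pd

def parse_movie(comment, num_words):
    # Split around the first occurrence of 'meets'
    before, _, after = comment.partition('meets')
    movie1 = ' '.join(before.split()[-num_words:])
    movie2 = ' '.join(after.split()[:num_words])
    return movie1, movie2

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'Comments': [
    'Evil Dead II meets Brothers Grimm and Hawkeye',
    'Biggest disappointment!',
]})

# args must be a tuple, hence the trailing comma
pairs = df['Comments'].apply(parse_movie, args=(2,))
```

Here `pairs` is a Series of `(movie1, movie2)` tuples, so a further step is still needed to spread them into two columns.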

This is based on the answer to "how to split column of tuples in pandas dataframe?". It uses a lambda function and apply(pd.Series), and stores the results in the dataframe columns 'movie1' and 'movie2'.

import pandas as pd
num_words = 4
df[['movie1','movie2']] = df['Comments'].apply(lambda comment: parse_movie(comment, num_words)).apply(pd.Series)
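As a possible alternative, not part of the original answer: `apply(pd.Series)` builds a Series for every row, which can be slow on large frames. The sketch below collects the tuples in a list and builds the two columns in one step. The sample rows are hypothetical, and `parse_movie` is restated in condensed form so the snippet is self-contained.

```python
import pandas as pd

def parse_movie(comment, num_words):
    # Split around the first occurrence of 'meets'
    before, _, after = comment.partition('meets')
    return (' '.join(before.split()[-num_words:]),
            ' '.join(after.split()[:num_words]))

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'Comments': [
    'Evil Dead II meets Brothers Grimm and Hawkeye',
    'Biggest disappointment!',
]})
num_words = 2

# One (movie1, movie2) tuple per comment
pairs = [parse_movie(comment, num_words) for comment in df['Comments']]

# Build both columns at once, reusing the original index so rows line up
titles = pd.DataFrame(pairs, index=df.index, columns=['movie1', 'movie2'])
df = df.join(titles)
```

Comments without "meets" get an empty `movie2`, so those rows can be filtered out before any fuzzy matching.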

Source

2017-12-19 02:47:07
