2016-12-27 3 views

TypeError: 'Builder' object is not callable in Spark Structured Streaming

I am running the example given in the Python section of the Structured Streaming Programming Guide: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html

I get the error below:
TypeError: 'Builder' object is not callable

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode
from pyspark.sql.functions import split

spark = SparkSession.builder()\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()

# Create DataFrame representing the stream of input lines from connection to localhost:9999
lines = spark\
    .readStream\
    .format('socket')\
    .option('host', 'localhost')\
    .option('port', 9999)\
    .load()

# Split the lines into words
words = lines.select(
    explode(
        split(lines.value, ' ')
    ).alias('word')
)

# Generate running word count
wordCounts = words.groupBy('word').count()

# Start running the query that prints the running counts to the console
query = wordCounts\
    .writeStream\
    .outputMode('complete')\
    .format('console')\
    .start()

query.awaitTermination()

Error:

[email protected]:~/thesis/backUp$ spark-submit structured.py 
Traceback (most recent call last): 
    File "/home/omkar/thesis/backUp/structured.py", line 8, in <module> 
    spark = SparkSession.builder()\ 
TypeError: 'Builder' object is not callable 

Answer


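spark = SparkSession.builder()\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()

Fix this by replacing .builder() with .builder. In PySpark, SparkSession.builder is an attribute that already holds a Builder instance, not a method, so adding parentheses tries to call that instance and raises the TypeError:

spark = SparkSession.builder\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()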

Source: https://issues.apache.org/jira/browse/SPARK-18426
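
To see why the parentheses matter, the behaviour can be reproduced without Spark. The following is only a minimal sketch in plain Python: the Builder and Session classes are hypothetical stand-ins written for illustration, not PySpark's actual implementation. It assumes only that the builder is stored as a class attribute holding an instance, which is what the error message suggests.

```python
class Builder:
    """Hypothetical stand-in for the builder object (not PySpark code)."""

    def __init__(self):
        self._options = {}

    def appName(self, name):
        # Record the option and return self so that calls can be chained.
        self._options["spark.app.name"] = name
        return self

    def getOrCreate(self):
        # A real builder would return a session; this sketch returns the options.
        return dict(self._options)


class Session:
    """Hypothetical stand-in for SparkSession."""

    # An attribute holding a Builder instance, not a method.
    builder = Builder()


# Attribute access (no parentheses) works and allows chaining.
conf = Session.builder.appName("StructuredNetworkWordCount").getOrCreate()

# Calling the instance fails, because Builder defines no __call__ method.
try:
    Session.builder()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The captured message matches the one in the traceback from the question, which is why dropping the parentheses is enough to fix it.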