2017-09-28

Storing data in Hadoop using Flume

I followed every step of the Hadoop installation and Flume tutorials. I am new to big data tools. I get the error below and do not understand it. Where is the problem?

I have read many posts about the installation but still face this issue. My ultimate goal is to perform Twitter sentiment analysis using R.

17/09/29 02:25:39 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting 
17/09/29 02:25:39 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/home/shivam/apache-flume-1.6.0-bin/conf/flume.conf 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Processing:HDFS 
17/09/29 02:25:39 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent] 
17/09/29 02:25:39 INFO node.AbstractConfigurationProvider: Creating channels 
17/09/29 02:25:39 INFO channel.DefaultChannelFactory: Creating instance of channel MemChannel type memory 
17/09/29 02:25:39 INFO node.AbstractConfigurationProvider: Created channel MemChannel 
17/09/29 02:25:39 INFO source.DefaultSourceFactory: Creating instance of source Twitter, type org.apache.flume.source.twitter.TwitterSource 
17/09/29 02:25:39 INFO twitter.TwitterSource: Consumer Key:  'fRw12aumIqkAWD6PP5ZHk7vva' 
17/09/29 02:25:39 INFO twitter.TwitterSource: Consumer Secret:  'K9K0yL2pwngp3JXEdMGWUOEB7AaGWswXcq72WveRvnD4ZSphNQ' 
17/09/29 02:25:39 INFO twitter.TwitterSource: Access Token:  '771287280438968320-XnbtNtBt40cs6gUOk6F9bjgmUABM0qG' 
17/09/29 02:25:39 INFO twitter.TwitterSource: Access Token Secret: 'afUppGRqcRi2p9fzLhVdYQXkfMEm72xduaWD6uNs3HhKg' 
17/09/29 02:25:39 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs 
17/09/29 02:25:39 INFO node.AbstractConfigurationProvider: Channel MemChannel connected to [Twitter, HDFS] 
17/09/29 02:25:39 INFO node.Application: Starting new configuration:{ sourceRunners:{Twitter=EventDrivenSourceRunner: { source:org.apache.flume.source.twitter.TwitterSource{name:Twitter,state:IDLE} }} sinkRunners:{HDFS=SinkRunner: { policy:[email protected] counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} } 
17/09/29 02:25:39 INFO node.Application: Starting Channel MemChannel 
17/09/29 02:25:39 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean. 
17/09/29 02:25:39 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: MemChannel started 
17/09/29 02:25:39 INFO node.Application: Starting Sink HDFS 
17/09/29 02:25:39 INFO node.Application: Starting Source Twitter 
17/09/29 02:25:39 INFO twitter.TwitterSource: Starting twitter source org.apache.flume.source.twitter.TwitterSource{name:Twitter,state:IDLE} ... 
17/09/29 02:25:39 INFO twitter.TwitterSource: Twitter source Twitter started. 
17/09/29 02:25:39 INFO twitter4j.TwitterStreamImpl: Establishing connection. 
17/09/29 02:25:39 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean. 
17/09/29 02:25:39 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started 
17/09/29 02:25:42 INFO twitter4j.TwitterStreamImpl: Connection established. 
17/09/29 02:25:42 INFO twitter4j.TwitterStreamImpl: Receiving status stream. 
17/09/29 02:25:42 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false 
17/09/29 02:25:42 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/user/flume/tweets/FlumeData.1506632142370.tmp 
17/09/29 02:25:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/09/29 02:25:44 WARN hdfs.HDFSEventSink: HDFS IO error 
java.net.ConnectException: Call From maverick/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1407) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
    at com.sun.proxy.$Proxy13.create(Unknown Source) 
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) 
    at com.sun.proxy.$Proxy14.create(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623) 
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703) 
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638) 
    at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) 
    at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) 
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:444) 
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) 
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) 
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) 
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) 
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776) 
    at org.apache.flume.sink.hdfs.HDFSDataStream.doOpen(HDFSDataStream.java:86) 
    at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:113) 
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:246) 
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:235) 
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679) 
    at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50) 
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
    at java.lang.Thread.run(Thread.java:748) 
Caused by: java.net.ConnectException: Connection refused 
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) 
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) 
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) 
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) 
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) 
    ... 34 more 
17/09/29 02:25:45 INFO twitter.TwitterSource: Processed 100 docs 
17/09/29 02:25:45 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/user/flume/tweets/FlumeData.1506632142371.tmp 
17/09/29 02:25:45 WARN hdfs.HDFSEventSink: HDFS IO error 
java.net.ConnectException: Call From maverick/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
17/09/29 02:25:48 INFO twitter.TwitterSource: Processed 200 docs 
17/09/29 02:25:50 INFO twitter.TwitterSource: Processed 300 docs 
17/09/29 02:25:50 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/user/flume/tweets/FlumeData.1506632142373.tmp 
17/09/29 02:25:50 WARN hdfs.HDFSEventSink: HDFS IO error 
java.net.ConnectException: Call From maverick/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 

Is there a complete solution? I can redo everything from scratch if needed.

Answer

Flume is trying, without success, to connect to Hadoop's NameNode, which is supposedly listening at localhost:9000.

This behavior is expected: Hadoop's NameNode usually listens on TCP/8020 or TCP/9000 for the Inter-Process Communication (IPC) related to Hadoop's file system (HDFS), and Flume is attempting TCP/9000, as the hdfs://localhost:9000/... paths in the log show.
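To see which of those ports actually accepts connections, a small probe can help. The following is a minimal sketch, not part of the original answer: the helper name `is_listening` is hypothetical, and the host and the two ports are simply the ones mentioned in this thread.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8020 and 9000 are the common NameNode IPC ports named above;
    # "localhost" is the host that appears in the Flume log.
    for port in (8020, 9000):
        state = "open" if is_listening("localhost", port) else "refused or unreachable"
        print(f"localhost:{port} -> {state}")
```

If neither port is open, the NameNode is probably not running or is bound to a different port, which would match the "Connection refused" in the log.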

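Can you confirm that such a process is running on localhost and listening on TCP/9000? You can check with the lsof or netstat commands. You should also check your Hadoop configuration to see which port Hadoop opens for the NameNode's IPC.

Then there are two options:

  • Change your NameNode's IPC listening port to 9000. This is done by configuring the fs.default.name property in the core-site.xml file (in Hadoop 2 and later the property is named fs.defaultFS, and the old name is deprecated).

  • Configure Flume to connect to the port you configured in Hadoop. This is done by setting the sink's hdfs.path property to hdfs://127.0.0.1:<your_port>/your/path/.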
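The second option can be sketched in Python: read the NameNode URI from core-site.xml and derive a matching hdfs.path value for the sink. This is an illustration under assumptions, not the answerer's method. The helper names `namenode_uri` and `flume_hdfs_path` are hypothetical, the sample XML with port 8020 is made up, and the agent name, sink name, and path are taken from the log in the question.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlparse

# Hypothetical core-site.xml content; a real file lives in the Hadoop conf directory.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>"""


def namenode_uri(core_site_xml: str) -> Optional[str]:
    """Return fs.defaultFS (or the older fs.default.name) if it is set."""
    root = ET.fromstring(core_site_xml)
    props = {
        p.findtext("name", "").strip(): p.findtext("value", "").strip()
        for p in root.iter("property")
    }
    return props.get("fs.defaultFS") or props.get("fs.default.name")


def flume_hdfs_path(uri: str, path: str) -> str:
    """Build an hdfs.path value that targets the host:port found in uri."""
    netloc = urlparse(uri).netloc  # e.g. "localhost:8020"
    return f"hdfs://{netloc}/{path.strip('/')}/"


if __name__ == "__main__":
    uri = namenode_uri(SAMPLE)
    if uri is None:
        print("No NameNode URI configured in core-site.xml")
    else:
        # Agent "TwitterAgent", sink "HDFS", and the path come from the log above.
        print("TwitterAgent.sinks.HDFS.hdfs.path =",
              flume_hdfs_path(uri, "/user/flume/tweets"))
```

Whichever side you change, the host and port in the sink's hdfs.path and in core-site.xml must agree, and the NameNode must actually be running on that port.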

I tried that, but I am still getting the error. @frb – Shivam

I reinstalled everything and posted another question: https://stackoverflow.com/questions/46530583/fetching-twitter-data-using-flume – Shivam

I used many ports for the Hadoop installation, such as 50070, and the NameNode is listening on a different port, whereas the port in my flume.conf is 9000. – Shivam