
I set up a small Hadoop cluster following these instructions, but the cluster does not run MapReduce jobs - a problem with the scheduler

(This is a follow-up to a discussion about a previous problem I had.) I am using Hadoop version 2.7.4. The cluster appears to be working fine, but I cannot run MapReduce jobs. Specifically:

17/11/27 16:35:21 INFO client.RMProxy: Connecting to ResourceManager at ec2-yyy.eu-central-1.compute.amazonaws.com/xxx:8032 
Running 0 maps. 

Job started: Mon Nov 27 16:35:22 UTC 2017 

17/11/27 16:35:22 INFO client.RMProxy: Connecting to ResourceManager at ec2-yyy.eu-central-1.compute.amazonaws.com/xxx:8032 


17/11/27 16:35:22 INFO mapreduce.JobSubmitter: number of splits:0 

17/11/27 16:35:22 INFO mapreduce.JobSubmitter: Submitting tokens for 
job: job_1511799491035_0006 

17/11/27 16:35:22 INFO impl.YarnClientImpl: Submitted application 
application_1511799491035_0006 

17/11/27 16:35:22 INFO mapreduce.Job: The url to track the job: http://ec2-yyy.eu-central-1.compute.amazonaws.com:8088/proxy/application_1511799491035_0006/ 

17/11/27 16:35:22 INFO mapreduce.Job: Running job: 
job_1511799491035_0006 

It never gets past this state. The job I am trying to run is

$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar randomwriter out 

In the job tracker's log files I found the following, which suggests that there is a problem with the capacity scheduler:

2017-11-27 13:50:29,202 INFO org.apache.hadoop.conf.Configuration: found resource capacity-scheduler.xml at file:/usr/local/hadoop/etc/hadoop/capacity-scheduler.xml 
2017-11-27 13:50:29,252 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration: max alloc mb per queue for root is undefined 
2017-11-27 13:50:29,252 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration: max alloc vcore per queue for root is undefined 
2017-11-27 13:50:29,256 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: root, capacity=1.0, asboluteCapacity=1.0, maxCapacity=1.0, asboluteMaxCapacity=1.0, state=RUNNING, acls=ADMINISTER_QUEUE:*SUBMIT_APP:*, labels=*, reservationsContinueLooking=true 
2017-11-27 13:50:29,256 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Initialized parent-queue root name=root, fullname=root 
2017-11-27 13:50:29,265 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration: max alloc mb per queue for root.default is undefined 
2017-11-27 13:50:29,265 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration: max alloc vcore per queue for root.default is undefined 

Looking further in the log files, I found that the application is stuck at

ACCEPTED: waiting for AM container to be allocated, launched and 
register with RM. 

The file capacity-scheduler.xml looks as follows:

<configuration> 

    <property> 
    <name>yarn.scheduler.capacity.maximum-applications</name> 
    <value>10000</value> 
    <description> 
     Maximum number of applications that can be pending and running. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name> 
    <value>0.1</value> 
    <description> 
     Maximum percent of resources in the cluster which can be used to run 
     application masters i.e. controls number of concurrent running 
     applications. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.resource-calculator</name> 
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value> 
    <description> 
     The ResourceCalculator implementation to be used to compare 
     Resources in the scheduler. 
     The default i.e. DefaultResourceCalculator only uses Memory while 
     DominantResourceCalculator uses dominant-resource to compare 
     multi-dimensional resources such as Memory, CPU etc. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.queues</name> 
    <value>default</value> 
    <description> 
     The queues at the this level (root is the root queue). 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.capacity</name> 
    <value>100</value> 
    <description>Default queue target capacity.</description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name> 
    <value>1</value> 
    <description> 
     Default queue user limit a percentage from 0.0 to 1.0. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name> 
    <value>100</value> 
    <description> 
     The maximum capacity of the default queue. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.state</name> 
    <value>RUNNING</value> 
    <description> 
     The state of the default queue. State can be one of RUNNING or STOPPED. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name> 
    <value>*</value> 
    <description> 
     The ACL of who can submit jobs to the default queue. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name> 
    <value>*</value> 
    <description> 
     The ACL of who can administer jobs on the default queue. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.node-locality-delay</name> 
    <value>40</value> 
    <description> 
     Number of missed scheduling opportunities after which the CapacityScheduler 
     attempts to schedule rack-local containers. 
     Typically this should be set to number of nodes in the cluster, By default is setting 
     approximately number of nodes in one rack which is 40. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.queue-mappings</name> 
    <value></value> 
    <description> 
     A list of mappings that will be used to assign jobs to queues 
     The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]* 
     Typically this list will be used to map users to queues, 
     for example, u:%user:%user maps all users to queues with the same name 
     as the user. 
    </description> 
    </property> 

    <property> 
    <name>yarn.scheduler.capacity.queue-mappings-override.enable</name> 
    <value>false</value> 
    <description> 
     If a queue mapping is present, will it override the value specified 
     by the user? This can be used by administrators to place jobs in queues 
     that are different than the one specified by the user. 
     The default is false. 
    </description> 
    </property> 

</configuration> 
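
For context, the relevant setting above is yarn.scheduler.capacity.maximum-am-resource-percent: with 0.1, at most 10% of the cluster's resources may be used by ApplicationMasters, and on a very small cluster that can be less than the default MapReduce AM size of 1536 MB, which by itself will hold a job in ACCEPTED (the "max alloc mb ... undefined" lines in the scheduler log are routine INFO output, not errors). A minimal sketch of an override, where 0.5 is an illustrative value and not taken from this thread:

    <property> 
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name> 
    <!-- illustrative: allow up to half of the cluster's resources for 
         ApplicationMasters, so a single AM can start on a small cluster --> 
    <value>0.5</value> 
    </property> 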

I would appreciate any hints on how to approach this.

Thanks


OK. Thanks. I have added the contents of the file. – clog14

Answer


All was well with the cluster configuration, but when it came to job execution, the RAM provided by the t2.micro instances was simply not enough to run MapReduce jobs, so larger instances have to be used for cluster creation and job execution.
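
Independent of the instance size, it is worth checking that the NodeManagers actually advertise enough memory for an AM container. A minimal yarn-site.xml sketch; the values are illustrative assumptions for a 16 GB worker such as an m4.xlarge, not settings taken from this cluster:

    <property> 
    <name>yarn.nodemanager.resource.memory-mb</name> 
    <!-- memory this node offers to YARN containers --> 
    <value>12288</value> 
    </property> 

    <property> 
    <name>yarn.scheduler.minimum-allocation-mb</name> 
    <!-- smallest container the ResourceManager will grant --> 
    <value>1024</value> 
    </property> 

    <property> 
    <name>yarn.scheduler.maximum-allocation-mb</name> 
    <!-- largest single container; must be at least the AM size --> 
    <value>12288</value> 
    </property> 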


Hi, I am not using t2.micro instances but m4.xlarge. I chose 20GB of hard disk per worker, could this be a problem? – clog14
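
It can be, though not because of the disk size as such: the NodeManager's disk health checker marks a disk as bad once it crosses a utilization threshold (90% by default), and a node with no healthy disks advertises zero resources, which leaves applications waiting for an AM container exactly like this. The NodeManager web UI on port 8042 and yarn node -list show whether nodes report as unhealthy. A hedged yarn-site.xml sketch of the relevant knob, where 99.0 is an illustrative value:

    <property> 
    <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name> 
    <!-- illustrative: only mark a disk unhealthy above 99% utilization --> 
    <value>99.0</value> 
    </property> 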