I need to limit the mapper bandwidth when using distcp from S3 to a local cluster. I am on CDH5.2 with MR1 and want to use distcp2 for this.
So I downloaded hadoop-distcp-2.5.0-cdh5.2.0-20141009.063640-188.jar from https://repository.cloudera.com. Here is the download link:
https://repository.cloudera.com/artifactory/public/org/apache/hadoop/hadoop-distcp/2.5.0-cdh5.2.0-SNAPSHOT/hadoop-distcp-2.5.0-cdh5.2.0-20141009.063640-188.jar
I then ran the following distcp command, but it failed with an error. Am I doing anything wrong?
export HADOOP_USER_CLASSPATH_FIRST=true && HADOOP_CLASSPATH=hadoop-distcp-2.5.0-cdh5.2.0-20141009.063640-188.jar hadoop org.apache.hadoop.tools.DistCp -bandwidth 1 s3n://com.xyz/2014/10/23/ hdfs:///user/abc/2014-10-23/
14/11/05 09:54:55 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[s3n://com.xyz.rtb/2014/10/23], targetPath=hdfs:/user/abc/2014-10-23, targetPathExists=true, preserveRawXattrs=false}
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(Lorg/apache/hadoop/mapreduce/Cluster;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/fs/Path;
    at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:379)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:155)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)
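For context on what -bandwidth 1 is meant to achieve once the job actually runs: assuming DistCp v2 applies -bandwidth per map task in MB/s (as its usage text describes), the overall ceiling is roughly that value times the number of maps. A small sketch with the values from this post (-bandwidth 1 from the command, maxMaps=20 from the DistCpOptions log line); the helper name aggregate_cap is made up for illustration, and the real number of maps may be lower than maxMaps:

```shell
# Hypothetical helper: upper bound on total copy throughput in MB/s,
# assuming the -bandwidth limit is enforced per map task.
aggregate_cap() {
    per_map_mb_s=$1   # value passed to -bandwidth
    max_maps=$2       # maxMaps from the DistCpOptions log line (or -m)
    echo $(( per_map_mb_s * max_maps ))
}

# Values from the post: -bandwidth 1, maxMaps=20
echo "aggregate cap: $(aggregate_cap 1 20) MB/s"
```

If the total needs to stay lower than that, the number of maps can be reduced with -m in addition to -bandwidth.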