My MapReduce job fails
I have a MapReduce program in Eclipse that I want to run. I am following this tutorial:
http://www.orzota.com/step-by-step-mapreduce-programming/
I did everything the page says and ran the program. This is the console output:
13/09/03 12:19:11 INFO util.ProcessTree: setsid exited with exit code 0
13/09/03 12:19:11 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected]
13/09/03 12:19:11 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclip/Runs/input/BX-Books.csv:0+33554432
13/09/03 12:19:11 INFO mapred.MapTask: numReduceTasks: 1
13/09/03 12:19:12 INFO mapred.MapTask: io.sort.mb = 100
13/09/03 12:19:12 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/03 12:19:12 INFO mapred.MapTask: record buffer = 262144/327680
13/09/03 12:19:12 INFO mapred.JobClient: map 0% reduce 0%
13/09/03 12:19:13 INFO mapred.MapTask: Starting flush of map output
13/09/03 12:19:14 INFO mapred.MapTask: Finished spill 0
13/09/03 12:19:14 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000000_0 is done. And is in the process of commiting
13/09/03 12:19:14 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:0+33554432
13/09/03 12:19:14 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000000_0' done.
13/09/03 12:19:14 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000000_0
13/09/03 12:19:14 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000001_0
13/09/03 12:19:14 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected]
13/09/03 12:19:14 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:33554432+33554432
13/09/03 12:19:14 INFO mapred.MapTask: numReduceTasks: 1
13/09/03 12:19:14 INFO mapred.MapTask: io.sort.mb = 100
13/09/03 12:19:14 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/03 12:19:14 INFO mapred.MapTask: record buffer = 262144/327680
13/09/03 12:19:14 INFO mapred.JobClient: map 20% reduce 0%
13/09/03 12:19:15 INFO mapred.MapTask: Starting flush of map output
13/09/03 12:19:15 INFO mapred.MapTask: Finished spill 0
13/09/03 12:19:15 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000001_0 is done. And is in the process of commiting
13/09/03 12:19:15 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:33554432+33554432
13/09/03 12:19:15 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000001_0' done.
13/09/03 12:19:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000001_0
13/09/03 12:19:15 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000002_0
13/09/03 12:19:15 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected]
13/09/03 12:19:15 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Book-Ratings.csv:0+30682276
13/09/03 12:19:15 INFO mapred.MapTask: numReduceTasks: 1
13/09/03 12:19:15 INFO mapred.MapTask: io.sort.mb = 100
13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680
13/09/03 12:19:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000003_0
13/09/03 12:19:16 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected]
13/09/03 12:19:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Users.csv:0+12284157
13/09/03 12:19:16 INFO mapred.MapTask: numReduceTasks: 1
13/09/03 12:19:16 INFO mapred.MapTask: io.sort.mb = 100
13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680
13/09/03 12:19:16 INFO mapred.LocalJobRunner: Starting task: attempt_local1379860058_0001_m_000004_0
13/09/03 12:19:16 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected]
13/09/03 12:19:16 INFO mapred.MapTask: Processing split: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:67108864+10678575
13/09/03 12:19:16 INFO mapred.MapTask: numReduceTasks: 1
13/09/03 12:19:16 INFO mapred.MapTask: io.sort.mb = 100
13/09/03 12:19:16 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/03 12:19:16 INFO mapred.MapTask: record buffer = 262144/327680
13/09/03 12:19:16 INFO mapred.JobClient: map 40% reduce 0%
13/09/03 12:19:17 INFO mapred.MapTask: Starting flush of map output
13/09/03 12:19:17 INFO mapred.MapTask: Finished spill 0
13/09/03 12:19:17 INFO mapred.Task: Task:attempt_local1379860058_0001_m_000004_0 is done. And is in the process of commiting
13/09/03 12:19:17 INFO mapred.LocalJobRunner: file:/home/ubuntu/Eclipse/Runs/input/BX-Books.csv:67108864+10678575
13/09/03 12:19:17 INFO mapred.Task: Task 'attempt_local1379860058_0001_m_000004_0' done.
13/09/03 12:19:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local1379860058_0001_m_000004_0
13/09/03 12:19:17 INFO mapred.LocalJobRunner: Map task executor complete.
13/09/03 12:19:17 WARN mapred.LocalJobRunner: job_local1379860058_0001
java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 3
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
at org.orzota.bookx.mappers.MyHadoopMapper.map(MyHadoopMapper.java:17)
at org.orzota.bookx.mappers.MyHadoopMapper.map(MyHadoopMapper.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
13/09/03 12:19:17 INFO mapred.JobClient: map 60% reduce 0%
13/09/03 12:19:17 INFO mapred.JobClient: Job complete: job_local1379860058_0001
13/09/03 12:19:17 INFO mapred.JobClient: Counters: 16
13/09/03 12:19:17 INFO mapred.JobClient: File Input Format Counters
13/09/03 12:19:17 INFO mapred.JobClient: Bytes Read=77795631
13/09/03 12:19:17 INFO mapred.JobClient: FileSystemCounters
13/09/03 12:19:17 INFO mapred.JobClient: FILE_BYTES_READ=178484057
13/09/03 12:19:17 INFO mapred.JobClient: FILE_BYTES_WRITTEN=6981917
13/09/03 12:19:17 INFO mapred.JobClient: Map-Reduce Framework
13/09/03 12:19:17 INFO mapred.JobClient: Map output materialized bytes=2971356
13/09/03 12:19:17 INFO mapred.JobClient: Map input records=271380
13/09/03 12:19:17 INFO mapred.JobClient: Spilled Records=271380
13/09/03 12:19:17 INFO mapred.JobClient: Map output bytes=2428578
13/09/03 12:19:17 INFO mapred.JobClient: Total committed heap usage (bytes)=883687424
13/09/03 12:19:17 INFO mapred.JobClient: CPU time spent (ms)=0
13/09/03 12:19:17 INFO mapred.JobClient: Map input bytes=77787439
13/09/03 12:19:17 INFO mapred.JobClient: SPLIT_RAW_BYTES=306
13/09/03 12:19:17 INFO mapred.JobClient: Combine input records=0
13/09/03 12:19:17 INFO mapred.JobClient: Combine output records=0
13/09/03 12:19:17 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
13/09/03 12:19:17 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
13/09/03 12:19:17 INFO mapred.JobClient: Map output records=271380
13/09/03 12:19:17 INFO mapred.JobClient: Job Failed: NA java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.orzota.bookx.mappers.MyHadoopDriver.main(MyHadoopDriver.java:44)
But it shows me an error and my job fails. The program creates the output folder, but it is empty. Here is my code:
package org.orzota.bookx.mappers;

import java.io.IOException;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MyHadoopMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);

    public void map(LongWritable _key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String st = value.toString();
        // Split the record on the ";" separator (quote, semicolon, quote)
        String[] bookdata = st.split("\";\"");
        // Emit the fourth field with a count of 1
        output.collect(new Text(bookdata[3]), one);
    }
}
package org.orzota.bookx.mappers;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

public class MyHadoopReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text _key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        int freq = 0;
        // Sum the counts emitted for this key
        while (values.hasNext()) {
            freq += values.next().get();
        }
        output.collect(_key, new IntWritable(freq));
    }
}
package org.orzota.bookx.mappers;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

public class MyHadoopDriver {
    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(MyHadoopDriver.class);
        conf.setJobName("BookCrossing1.0");

        // Output key and value types
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Mapper and reducer classes
        conf.setMapperClass(MyHadoopMapper.class);
        conf.setReducerClass(MyHadoopReducer.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] is the input path, args[1] is the output path
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I think the error is here:
output.collect(new Text(bookdata[3]), one);
but I don't know what it is telling me. Could someone please help? Is the error on this line? Thanks.
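I have been checking the CSV file from the site, and there may be some problems. Not all records are delimited by ";". Some lines have a "; on their own, and if you copy and paste there are also parts that end with, for example, THUMBZZZ.jpg". You can also add a check to your map function to see whether this solves the problem. – DDW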
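A guard of the kind suggested above could look roughly like the following. This is only a sketch and not the snippet from the original comment: SafeFieldExtractor and field are hypothetical names, and the helper is kept free of Hadoop classes so it can be tried on its own. It assumes the same ";" separator as the mapper in the question and returns null instead of throwing ArrayIndexOutOfBoundsException when a record has too few fields.

```java
// Hypothetical helper (not from the original post): extracts one field from a
// line whose fields are assumed to be separated by the three characters ";"
final class SafeFieldExtractor {
    private SafeFieldExtractor() {}

    // Returns the field at the given index, or null if the record is too short.
    static String field(String line, int index) {
        String[] parts = line.split("\";\"");
        if (index < 0 || index >= parts.length) {
            return null; // malformed or short record: the caller should skip it
        }
        return parts[index];
    }
}

// Possible use inside MyHadoopMapper.map(...), replacing the direct bookdata[3] access:
//   String year = SafeFieldExtractor.field(value.toString(), 3);
//   if (year != null) {
//       output.collect(new Text(year), one);
//   }
```

With a guard like this, short records are silently skipped, so it may be worth counting them (for example through the Reporter) to see how much input is being dropped.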
Thanks for your reply. The error is solved. The problem was in my input file. I changed it and the job now works.
Please upvote my answer, and good luck with Hadoop! – DDW
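Since the problem turned out to be in the input, it can help to check the input before running the job. In the log above, the map tasks for the BX-Book-Ratings.csv and BX-Users.csv splits start but never report being done, which would be consistent with records that have fewer than four fields. The sketch below counts such records in a list of lines. ShortRecordCounter and countShortRecords are hypothetical names, and the ";" separator is an assumption carried over from the mapper in the question.

```java
import java.util.List;

// Hypothetical diagnostic (not from the original post): counts records that
// would make bookdata[3] fail in the mapper from the question.
final class ShortRecordCounter {
    private ShortRecordCounter() {}

    // Counts lines that split into fewer than minFields parts on the ";" separator.
    static long countShortRecords(List<String> lines, int minFields) {
        long shortRecords = 0;
        for (String line : lines) {
            if (line.split("\";\"").length < minFields) {
                shortRecords++;
            }
        }
        return shortRecords;
    }
}
```

The lines can come from any source, for example from reading each file in the input directory. A non-zero count for a file with minFields set to 4 suggests that file should be fixed or kept out of the job's input path.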