
NoSuchElementException when exporting LZO-compressed data with Sqoop

I am trying to export some data from HDFS to MySQL using Sqoop. The export works fine when the file is not compressed, but when I try to export the same file compressed with LZO, the Sqoop job fails. I am running this on the standard Cloudera CDH4 VM. The columns in the file are tab-separated, and NULLs are represented as '\N'.

File contents:

$ cat dipayan-test.txt 
dipayan koramangala 29 
raju marathahalli 32 
raju marathahalli 32 
raju \N 32 
raju marathahalli 32 
raju \N 32 
raju marathahalli 32 
raju marathahalli \N 
raju marathahalli \N 

Description of the table in MySQL:

mysql> describe sqooptest; 
+---------+--------------+------+-----+---------+-------+ 
| Field   | Type         | Null | Key | Default | Extra | 
+---------+--------------+------+-----+---------+-------+ 
| name    | varchar(100) | YES  |     | NULL    |       | 
| address | varchar(100) | YES  |     | NULL    |       | 
| age     | int(11)      | YES  |     | NULL    |       | 
+---------+--------------+------+-----+---------+-------+ 
3 rows in set (0.01 sec) 

The file in HDFS:

$ hadoop fs -ls /user/cloudera/dipayan-test 
Found 1 items 
-rw-r--r-- 3 cloudera cloudera  138 2014-02-16 23:18 /user/cloudera/dipayan-test/dipayan-test.txt.lzo 

Sqoop command:

sqoop export --connect "jdbc:mysql://localhost/bigdata" --username "root" --password "XXXXXX" --driver "com.mysql.jdbc.Driver" --table sqooptest --export-dir /user/cloudera/dipayan-test/ --input-fields-terminated-by '\t' -m 1 --input-null-string '\\N' --input-null-non-string '\\N' 

Error:

$ sqoop export --connect "jdbc:mysql://localhost/bigdata" --username "root" --password "mysql" --driver "com.mysql.jdbc.Driver" --table sqooptest --export-dir /user/cloudera/dipayan-test/ --input-fields-terminated-by '\t' -m 1 --input-null-string '\\N' --input-null-non-string '\\N' 
14/02/16 23:19:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
14/02/16 23:19:26 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 
14/02/16 23:19:26 INFO manager.SqlManager: Using default fetchSize of 1000 
14/02/16 23:19:26 INFO tool.CodeGenTool: Beginning code generation 
14/02/16 23:19:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0 
14/02/16 23:19:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0 
14/02/16 23:19:27 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce 
14/02/16 23:19:27 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar 
Note: /tmp/sqoop-cloudera/compile/676bc185f1efffa3b0de0a924df4a02d/sqooptest.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
14/02/16 23:19:29 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/676bc185f1efffa3b0de0a924df4a02d/sqooptest.jar 
14/02/16 23:19:29 INFO mapreduce.ExportJobBase: Beginning export of sqooptest 
14/02/16 23:19:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0 
14/02/16 23:19:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
14/02/16 23:19:31 INFO input.FileInputFormat: Total input paths to process : 1 
14/02/16 23:19:31 INFO input.FileInputFormat: Total input paths to process : 1 
14/02/16 23:19:31 INFO mapred.JobClient: Running job: job_201402162201_0013 
14/02/16 23:19:32 INFO mapred.JobClient: map 0% reduce 0% 
14/02/16 23:19:41 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
    at org.apache.hadoop.mapred.Child.main(Child.java:262) 
Caused by: java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at sqooptest.__loadFromFields(sqooptest.java:225) 
    at sqooptest.parse(sqooptest.java:174) 
    at org.apach 
14/02/16 23:19:48 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_1, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
    at org.apache.hadoop.mapred.Child.main(Child.java:262) 
Caused by: java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at sqooptest.__loadFromFields(sqooptest.java:225) 
    at sqooptest.parse(sqooptest.java:174) 
    at org.apach 
14/02/16 23:19:55 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
    at org.apache.hadoop.mapred.Child.main(Child.java:262) 
Caused by: java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at sqooptest.__loadFromFields(sqooptest.java:225) 
    at sqooptest.parse(sqooptest.java:174) 
    at org.apach 
14/02/16 23:20:04 INFO mapred.JobClient: Job complete: job_201402162201_0013 
14/02/16 23:20:04 INFO mapred.JobClient: Counters: 7 
14/02/16 23:20:04 INFO mapred.JobClient: Job Counters 
14/02/16 23:20:04 INFO mapred.JobClient:  Failed map tasks=1 
14/02/16 23:20:04 INFO mapred.JobClient:  Launched map tasks=4 
14/02/16 23:20:04 INFO mapred.JobClient:  Data-local map tasks=4 
14/02/16 23:20:04 INFO mapred.JobClient:  Total time spent by all maps in occupied slots (ms)=29679 
14/02/16 23:20:04 INFO mapred.JobClient:  Total time spent by all reduces in occupied slots (ms)=0 
14/02/16 23:20:04 INFO mapred.JobClient:  Total time spent by all maps waiting after reserving slots (ms)=0 
14/02/16 23:20:04 INFO mapred.JobClient:  Total time spent by all reduces waiting after reserving slots (ms)=0 
14/02/16 23:20:04 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead 
14/02/16 23:20:04 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 33.5335 seconds (0 bytes/sec) 
14/02/16 23:20:04 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 
14/02/16 23:20:04 INFO mapreduce.ExportJobBase: Exported 0 records. 
14/02/16 23:20:04 ERROR tool.ExportTool: Error during export: Export job failed! 

If the file is not compressed and I work with the dipayan-test.txt file directly, the export works perfectly.

I need help debugging this, and I would like to know whether I am missing something when working with LZO files.
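One way to narrow this down (a sketch, assuming the standard CDH4 VM paths above; the config location is an assumption) is to check whether Hadoop itself can decode the .lzo file. hadoop fs -text decompresses through any registered codec, so if it fails or prints binary noise, the LZO codec is not available to MapReduce and the Sqoop mapper would be parsing raw compressed bytes instead of tab-separated rows:

# Can Hadoop decode the LZO file? Readable tab-separated rows mean the
# codec is registered; errors or binary noise mean hadoop-lzo is missing.
hadoop fs -text /user/cloudera/dipayan-test/dipayan-test.txt.lzo | head

# Check whether an LZO codec is listed in io.compression.codecs
# (/etc/hadoop/conf is the usual CDH config directory; adjust if needed).
grep -A 2 'io.compression.codecs' /etc/hadoop/conf/core-site.xml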

Answers

0

It is possible that your table does not have the right columns. You can always go to the .java file that Sqoop generates for you and debug from there: sqooptest.java:225
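For what it's worth, the generated class parses each input line into a list of fields and pulls them off an iterator, so a NoSuchElementException inside __loadFromFields usually means a record yielded fewer fields than the table has columns, which is what happens when the delimiters (or the compression handling) are wrong. A sketch of how you could regenerate and inspect that file (the --outdir value is illustrative):

# Regenerate the class Sqoop compiles for the export job.
sqoop codegen --connect "jdbc:mysql://localhost/bigdata" \
    --username root -P \
    --table sqooptest \
    --outdir /tmp/sqoop-codegen

# Look at the parsing code around the line from the stack trace (225).
sed -n '215,235p' /tmp/sqoop-codegen/sqooptest.java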

2

An export can fail for a number of reasons:

* Loss of connectivity from the Hadoop cluster to the database (either due to hardware fault, or server software crashes) 
* Attempting to INSERT a row which violates a consistency constraint (for example, inserting a duplicate primary key value) 
* Attempting to parse an incomplete or malformed record from the HDFS source data 
* Attempting to parse records using incorrect delimiters 
* Capacity issues (such as insufficient RAM or disk space) 

(Taken from here.)

In my case I also got a NoSuchElementException, and setting the correct field terminator with --fields-terminated-by '\t' solved the problem. If it is not specified, Sqoop assumes MySQL-like default terminators: ',' as the field terminator and '\n' as the line terminator.
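For completeness, a sketch of the export with the input parsing made fully explicit (same connection details as in the question; note that for sqoop export the parsing flags are the --input-* variants, and whether explicit terminators alone fix the LZO case is not confirmed here):

sqoop export \
    --connect "jdbc:mysql://localhost/bigdata" \
    --username root -P \
    --table sqooptest \
    --export-dir /user/cloudera/dipayan-test/ \
    --input-fields-terminated-by '\t' \
    --input-lines-terminated-by '\n' \
    --input-null-string '\\N' \
    --input-null-non-string '\\N' \
    -m 1

Dropping the explicit --driver also lets Sqoop pick its MySQL connection manager instead of falling back to GenericJdbcManager, as the warning in the log above suggests.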