- 모드 : 마지막 값을 알고있는 열에는 append를 사용할 수 있습니다.
- 모드 : timestamp 열에 lastmodified 모드를 사용할 수 있습니다. 마지막으로 수정 한 타임 스탬프를 기억하는 것이 어려울 수 있습니다. 마지막으로 수정 한 타임 스탬프를 알고 있으면 첫 번째 방법으로 문제를 해결할 수 있습니다.
모드 : Sqoop을 명령을 사용
mysql> describe emp;
+--------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| name | varchar(100) | YES | | NULL | |
| deg | varchar(100) | YES | | NULL | |
| salary | int(11) | YES | | NULL | |
| dep | varchar(10) | YES | | NULL | |
+--------+--------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
수입 데이터를 추가. 테이블에
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
-m 3 \
--table emp \
--split-by id \
--columns id,name,deg \
--warehouse-dir /user/sqoop/ \
--delete-target-dir \
--as-textfile
HDFS 출력
[[email protected] lib]$ hadoop fs -ls -R /user/sqoop/
drwxr-xr-x - cloudera supergroup 0 2017-12-02 13:14 /user/sqoop/emp
-rw-r--r-- 1 cloudera supergroup 0 2017-12-02 13:14 /user/sqoop/emp/_SUCCESS
-rw-r--r-- 1 cloudera supergroup 70 2017-12-02 13:14 /user/sqoop/emp/part-m-00000
-rw-r--r-- 1 cloudera supergroup 64 2017-12-02 13:14 /user/sqoop/emp/part-m-00001
-rw-r--r-- 1 cloudera supergroup 86 2017-12-02 13:14 /user/sqoop/emp/part-m-00002
mysql> select * from emp;
+------+---------+--------------+--------+------+
| id | name | deg | salary | dep |
+------+---------+--------------+--------+------+
| 1201 | gopal | manager | 50000 | tp |
| 1202 | manisha | Proof reader | 50000 | TP |
| 1203 | php dev | TECH WRITER | 50000 | AC |
| 1204 | Nilesh | Domino dev | 70000 | AF |
| 1205 | Vinayak | Java dev | 50000 | IT |
| 1206 | Amish | Cog dev | 60000 | IT |
| 1207 | Jatin | Oracel dev | 40001 | IT |
| 1208 | Viren | Java dev | 70004 | IT |
| 1209 | Ashish | Oracel dev | 40001 | IT |
| 1210 | Satish | Java dev | 70004 | IT |
+------+---------+--------------+--------+------+
10 rows in set (0.00 sec)
삽입 새로운 기록. 증분 가져 오기이
mysql> insert into emp values(1211,'Jag', 'be', 20000, 'IT');
Query OK, 1 row affected (0.03 sec)
mysql> select * from emp;
+------+---------+--------------+--------+------+
| id | name | deg | salary | dep |
+------+---------+--------------+--------+------+
| 1201 | gopal | manager | 50000 | tp |
| 1202 | manisha | Proof reader | 50000 | TP |
| 1203 | php dev | TECH WRITER | 50000 | AC |
| 1204 | Nilesh | Domino dev | 70000 | AF |
| 1205 | Vinayak | Java dev | 50000 | IT |
| 1206 | Amish | Cog dev | 60000 | IT |
| 1207 | Jatin | Oracel dev | 40001 | IT |
| 1208 | Viren | Java dev | 70004 | IT |
| 1209 | Ashish | Oracel dev | 40001 | IT |
| 1210 | Satish | Java dev | 70004 | IT |
| 1211 | Jag | be | 20000 | IT |
+------+---------+--------------+--------+------+
11 rows in set (0.00 sec)
명령
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table emp \
--split-by id \
--check-column id \
--incremental append \
--last-value 1210 \
--warehouse-dir /user/sqoop/ \
--as-textfile
[[email protected] lib]$ hadoop fs -ls -R /user/sqoop/emp
-rw-r--r-- 1 cloudera supergroup 0 2017-12-02 13:14 /user/sqoop/emp/_SUCCESS
-rw-r--r-- 1 cloudera supergroup 70 2017-12-02 13:14 /user/sqoop/emp/part-m-00000
-rw-r--r-- 1 cloudera supergroup 64 2017-12-02 13:14 /user/sqoop/emp/part-m-00001
-rw-r--r-- 1 cloudera supergroup 86 2017-12-02 13:14 /user/sqoop/emp/part-m-00002
-rw-r--r-- 1 cloudera cloudera 21 2017-12-02 13:48 /user/sqoop/emp/part-m-00003
[[email protected] lib]$ hadoop fs -cat /user/sqoop/emp/part-m-00003
1211,Jag,be,20000,IT
모드를 가져온 후 :
을 LASTMODIFIED
mysql> describe orders;
+-------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-------------+------+-----+---------+----------------+
| order_id | int(11) | NO | PRI | NULL | auto_increment |
| order_date | datetime | NO | | NULL | |
| order_customer_id | int(11) | NO | | NULL | |
| order_status | varchar(45) | NO | | NULL | |
+-------------------+-------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
가져 오기 HDFS에 순서
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table orders \
--split-by order_id \
--target-dir /user/sqoop/orders \
--as-textfile
후 수입
[[email protected] lib]$ hadoop fs -ls -R /user/sqoop/orders
-rw-r--r-- 1 cloudera supergroup 0 2017-12-02 16:01 /user/sqoop/orders/_SUCCESS
-rw-r--r-- 1 cloudera supergroup 741597 2017-12-02 16:01 /user/sqoop/orders/part-m-00000
-rw-r--r-- 1 cloudera supergroup 753022 2017-12-02 16:01 /user/sqoop/orders/part-m-00001
-rw-r--r-- 1 cloudera supergroup 752368 2017-12-02 16:01 /user/sqoop/orders/part-m-00002
-rw-r--r-- 1 cloudera supergroup 752940 2017-12-02 16:01 /user/sqoop/orders/part-m-00003
업데이트 주문 데이터
mysql> select * from orders where order_id=10;
+----------+---------------------+-------------------+-----------------+
| order_id | order_date | order_customer_id | order_status |
+----------+---------------------+-------------------+-----------------+
| 10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT |
+----------+---------------------+-------------------+-----------------+
1 row in set (0.00 sec)
mysql> update orders set order_status='CLOSED', order_date=now() where order_id=10;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> select * from orders where order_id=10;
+----------+---------------------+-------------------+--------------+
| order_id | order_date | order_customer_id | order_status |
+----------+---------------------+-------------------+--------------+
| 10 | 2017-12-02 16:19:23 | 5648 | CLOSED |
+----------+---------------------+-------------------+--------------+
1 row in set (0.00 sec)
가져 오기 추가 데이터
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table orders \
--split-by order_id \
--check-column order_date \
--merge-key order_id \
--incremental lastmodified \
--target-dir /user/sqoop/orders1 \
--as-textfile
출력
[[email protected] lib]$ hadoop fs -ls -R /user/sqoop/orders1
-rw-r--r-- 1 cloudera cloudera 0 2017-12-02 16:07 /user/sqoop/orders1/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 2999918 2017-12-02 16:07 /user/sqoop/orders1/part-r-00000
참고 : 우리가 같은 디렉토리를 사용하는 경우 (주문) 이전은 이전 파일을 삭제하고 새로 생성 될 때 부품.