2017-10-25

Cloned Hive table is much larger than the original

I have a table, table1, and cloned it using "create table table2 as select * from table1 where partition_key is not null;". table1 is 463.2 GB, but table2 shows up as 2.8 TB. Why did this happen?

P.S.: I just ran show partitions, and table1 and table2 appear to be partitioned differently. So, to add to my question: how do I copy a table and keep the original partition information?

Table 1: hdfs dfs -du -s -h /user/hive/warehouse/map_services.db/userhistory1/*

7.9 G 23.7 G /user/hive/warehouse/map_services.db/userhistory/datestr=1970-01-01 
25.7 G 77.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-01 
18.8 G 56.3 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-02 
16.8 G 50.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-03 
17.5 G 52.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-04 
18.0 G 53.9 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-05 
22.4 G 67.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-06 
27.3 G 81.8 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-07 

Table 2: hdfs dfs -du -s -h /user/hive/warehouse/map_services.db/userhistory2/*

929.2 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000000_0 
651.1 M 1.9 G /user/hive/warehouse/map_services.db/userhistory2/000001_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000002_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000003_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000004_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000005_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000006_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000007_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000008_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000009_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000010_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000011_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000012_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000013_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000014_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000015_0 
1.2 G 3.6 G /user/hive/warehouse/map_services.db/userhistory2/000016_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000017_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000018_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000019_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000020_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000021_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000022_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000023_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000024_0 
908.7 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000025_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000026_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000027_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000028_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000029_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000030_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000031_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000032_0 
1.6 G 4.8 G /user/hive/warehouse/map_services.db/userhistory2/000033_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000034_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000035_0 
940.0 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000036_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000037_0 
1.2 G 3.6 G /user/hive/warehouse/map_services.db/userhistory2/000038_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000039_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000040_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000041_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000042_0 
1.0 G 3.1 G /user/hive/warehouse/map_services.db/userhistory2/000043_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000044_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000045_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000046_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000047_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000048_0 
706.3 M 2.1 G /user/hive/warehouse/map_services.db/userhistory2/000049_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000050_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000051_0 
872.2 M 2.6 G /user/hive/warehouse/map_services.db/userhistory2/000052_0 
1.2 G 3.5 G /user/hive/warehouse/map_services.db/userhistory2/000053_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000054_0 
943.9 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000055_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000056_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000057_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000058_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000059_0 
961.5 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000060_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000061_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000062_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000063_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000064_0 
924.4 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000065_0 
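
A note on reading listings like these: `hdfs dfs -du` prints two sizes per path, the logical size first and then the disk space consumed across all replicas. The sketch below sums both columns for the Table 1 listing; `sum_du` is a hypothetical helper name, and it assumes the `-h` layout shown here (value, unit, value, unit, path). The replicated total comes out close to the 463.2 GB quoted in the question, so when comparing two tables it is worth checking that both totals are taken from the same column.

```shell
#!/usr/bin/env sh
# Hypothetical helper: sum both size columns of `hdfs dfs -du -h` output, in GB.
# Column 1 is the logical size; column 2 is the space used by all replicas.
sum_du() {
  awk '
    function to_gb(value, unit) {
      if (unit == "K") return value / 1024 / 1024
      if (unit == "M") return value / 1024
      if (unit == "T") return value * 1024
      return value   # assumes "G" for anything else
    }
    { logical += to_gb($1, $2); replicated += to_gb($3, $4) }
    END { printf "logical=%.1fG replicated=%.1fG\n", logical, replicated }
  '
}

# Sample input: the Table 1 listing from the question.
sum_du <<'EOF'
7.9 G 23.7 G /user/hive/warehouse/map_services.db/userhistory/datestr=1970-01-01
25.7 G 77.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-01
18.8 G 56.3 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-02
16.8 G 50.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-03
17.5 G 52.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-04
18.0 G 53.9 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-05
22.4 G 67.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-06
27.3 G 81.8 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-07
EOF
# prints: logical=154.4G replicated=462.9G
```

The roughly 3x ratio between the two totals is consistent with a replication factor of 3, although the listing itself does not state the factor.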

Answer


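The target table is not compressed and not partitioned. A plain "create table ... as select" does not carry over the source table's partitioning, so all rows end up in unpartitioned files.

Create the table with the same partitioning:

create table table2 like table1;

Switch on compression before inserting:

SET hive.exec.compress.output=true;

Insert overwrite using dynamic partitioning:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table table2 partition(partition_key)
select * from table1;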
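
After re-copying, one way to confirm that the partition layout was preserved is to compare the output of `SHOW PARTITIONS` for both tables. This is a minimal sketch, assuming the `hive` CLI is on the PATH; `compare_partitions` is a hypothetical helper name, and nothing runs until the function is called.

```shell
#!/usr/bin/env sh
# Hypothetical helper: check that a copied table has the same partitions
# as its source. Assumes the `hive` CLI is available on PATH.
compare_partitions() {
  # SHOW PARTITIONS prints one partition spec per line, e.g. datestr=2017-10-01.
  # -S runs Hive in silent mode, -e executes the quoted statement.
  src_parts=$(hive -S -e "SHOW PARTITIONS $1;" | sort)
  copy_parts=$(hive -S -e "SHOW PARTITIONS $2;" | sort)
  if [ "$src_parts" = "$copy_parts" ]; then
    echo "partition lists match"
  else
    echo "partition lists differ"
    return 1
  fi
}

# Usage (not executed here):
#   compare_partitions table1 table2
```

If the copy is still unpartitioned, `SHOW PARTITIONS` is expected to fail for it and the helper reports a difference.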