Unloading multiple files: error when unloading multiple tables from Redshift to a specific S3 bucket

Hello, I am trying to unload multiple tables from Redshift to a specific S3 bucket, but I am getting the error below:

psycopg2.InternalError: Specified unload destination on S3 is not empty. Consider using a different bucket/prefix, manually removing the target files in S3, or using the ALLOWOVERWRITE option.

If I add the 'allowoverwrite' option to the unload function, each table overwrites the one unloaded before it, so only the last table remains in S3. The error is complaining that the data is being saved to the same destination.

import psycopg2


def unload_data(r_conn, aws_iam_role, datastoring_path, region, table_name):
    # Build the UNLOAD statement. Every table is written to the same
    # datastoring_path; the region argument is accepted but not used.
    unload = '''unload ('select * from {}')
        to '{}'
        credentials 'aws_iam_role={}'
        manifest
        gzip
        delimiter ',' addquotes escape parallel off '''.format(table_name, datastoring_path, aws_iam_role)

    print("Exporting table {} to {}".format(table_name, datastoring_path))
    cur = r_conn.cursor()
    cur.execute(unload)
    r_conn.commit()


def main():
    host_rs = 'dataingestion.*********.us******2.redshift.amazonaws.com'
    port_rs = '5439'
    database_rs = '******'
    user_rs = '******'
    password_rs = '********'
    rs_tables = ['Employee', 'Employe_details']

    iam_role = 'arn:aws:iam::************:role/RedshiftCopyUnload'
    s3_datastoring_path = 's3://mysamplebuck/'
    s3_region = 'us_*****_2'
    print("Exporting from source")
    src_conn = psycopg2.connect(host=host_rs,
                                port=port_rs,
                                database=database_rs,
                                user=user_rs,
                                password=password_rs)
    print("Connected to RS")

    for table in rs_tables:
        # This check only compares the first and last character of the
        # table name; it does not skip the unload.
        if table[0] == table[-1]:
            print("No files to read!")
        unload_data(src_conn, aws_iam_role=iam_role, datastoring_path=s3_datastoring_path, region=s3_region, table_name=table)
        print(table)


if __name__ == "__main__":
    main()

Source

2017-10-10 Chandana Puppy

You said there is a problem when using the 'allowoverwrite' option, but I did not really follow what you mean. Could you explain it better?

Thanks for the reply. If I add 'allowoverwrite' to the unload variable, as in

unload = '''unload ('select * from {}') to '{}' credentials 'aws_iam_role={}' manifest gzip delimiter ',' addquotes escape allowoverwrite'''.format(table_name, datastoring_path, aws_iam_role)

then all the tables are written to the S3 bucket, each one overwriting the previous one. In the end I can only see the last table in the S3 bucket.

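With the code as posted, every table is unloaded to the same destination. This is like copying all the files on your computer into the same directory: the files overwrite one another.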
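The overwriting can be illustrated without connecting to Redshift. The following is a minimal sketch that models the bucket as a plain dictionary. The helper `unload_keys` is hypothetical, and the object names it produces (`000.gz` plus a `manifest` file under the prefix) are only an approximation of what an UNLOAD with GZIP, MANIFEST and PARALLEL OFF writes. The exact names do not matter; what matters is that they depend only on the prefix, not on the table.

```python
def unload_keys(prefix, files=1):
    """Approximate the object names one UNLOAD writes under a prefix (assumed naming)."""
    data_files = ["{}{:03d}.gz".format(prefix, i) for i in range(files)]
    return data_files + [prefix + "manifest"]


tables = ["Employee", "Employe_details"]
shared_prefix = "s3://mysamplebuck/"  # same destination for every table, as in the question

bucket = {}  # stand-in for the S3 bucket: object key -> table that wrote it
for table in tables:
    for key in unload_keys(shared_prefix):
        # With ALLOWOVERWRITE, a later write to an existing key replaces it.
        bucket[key] = table

print(bucket)
```

Every key ends up owned by the last table in the list, which matches the behaviour described in the question.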

You should change your datastoring_path so that it is different for each table, for example:

.format(table_name, datastoring_path + '/' + table_name, aws_iam_role)
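As a slightly fuller sketch of the same idea, the statement can be built by a small function that derives a separate prefix for each table. The helper names `table_prefix` and `build_unload` are hypothetical and not part of the original code. This variant also strips a trailing slash from the base path (the path in the question already ends in `/`) and ends each prefix with `/`, which is an assumption about the desired layout rather than a requirement. The table names are interpolated directly into the SQL, so this assumes they come from a trusted list.

```python
def table_prefix(base_path, table_name):
    """Return a per-table S3 prefix such as s3://bucket/Employee/."""
    return "{}/{}/".format(base_path.rstrip("/"), table_name)


def build_unload(table_name, base_path, aws_iam_role):
    """Build an UNLOAD statement whose destination is unique to the table."""
    return (
        "unload ('select * from {table}') "
        "to '{dest}' "
        "credentials 'aws_iam_role={role}' "
        "manifest gzip delimiter ',' addquotes escape parallel off"
    ).format(
        table=table_name,
        dest=table_prefix(base_path, table_name),
        role=aws_iam_role,
    )


# Values copied from the question; the account id is masked there as well.
iam_role = "arn:aws:iam::************:role/RedshiftCopyUnload"
for name in ["Employee", "Employe_details"]:
    print(build_unload(name, "s3://mysamplebuck/", iam_role))
```

The resulting string can be passed to `cur.execute(...)` in place of the `unload` variable in the question. Because each table now has its own prefix, ALLOWOVERWRITE is only needed when the same table is unloaded again.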

Source

2017-10-10 21:57:02

Thank you very much. I had also tried to put the table name into each path, but I am new to Python and could not get it to work. Your answer gave me the exact solution.
