Posted to commits@impala.apache.org by ta...@apache.org on 2020/02/18 01:40:29 UTC

[impala] 01/02: Add log of created files for data load

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit ba903deba7e529df6d3136589086105a99c878c1
Author: norbert.luksa <no...@cloudera.com>
AuthorDate: Mon Feb 17 15:36:10 2020 +0100

    Add log of created files for data load
    
    As Joe pointed out in IMPALA-9351, logging the files created during data
    loading would make it easier to debug issues with missing files.
    
    With this commit, running create-load-data.sh now logs the created
    files into created-files.log.
    
    Change-Id: I4f413810c6202a07c19ad1893088feedd9f7278f
    Reviewed-on: http://gerrit.cloudera.org:8080/15234
    Reviewed-by: Zoltan Borok-Nagy <bo...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 testdata/bin/create-load-data.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/testdata/bin/create-load-data.sh b/testdata/bin/create-load-data.sh
index 65d18e3..48ba7f5 100755
--- a/testdata/bin/create-load-data.sh
+++ b/testdata/bin/create-load-data.sh
@@ -719,6 +719,9 @@ if [ "${TARGET_FILESYSTEM}" = "hdfs" ]; then
       create-internal-hbase-table
 
   run-step "Checking HDFS health" check-hdfs-health.log check-hdfs-health
+
+  # Saving the list of created files can help in debugging missing files.
+  run-step "Logging created files" created-files.log hdfs dfs -ls -R /test-warehouse
 fi
 
 # TODO: Investigate why all stats are not preserved. Theoretically, we only need to
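
One way the resulting created-files.log might be used when chasing a missing file is sketched below. This is only an illustration, not part of the commit: the helper names list_logged_files and logged_file_exists are hypothetical, and the sketch assumes the log holds the usual 8-column output of `hdfs dfs -ls -R` (permission string first, path last) and that paths contain no whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Impala repository.

# Print the paths of regular files recorded in a log produced by
# `hdfs dfs -ls -R`. Assumes the 8-column listing format; directory
# entries (permission string starting with "d") and any other lines
# with fewer than 8 fields are skipped.
list_logged_files() {
  local log_file="$1"
  awk 'NF >= 8 && $1 ~ /^-/ { print $8 }' "$log_file"
}

# Succeed (exit status 0) if the given path appears as a regular file
# in the log, fail otherwise. The match is exact and whole-line.
logged_file_exists() {
  local log_file="$1" path="$2"
  list_logged_files "$log_file" | grep -Fxq -- "$path"
}
```

For example, `logged_file_exists created-files.log /test-warehouse/<table>/<file>` (placeholders, to be filled in with the path a failing test complains about) would show whether the file was present right after the load step, which helps separate "never created" from "removed later".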