Posted to commits@impala.apache.org by st...@apache.org on 2022/03/23 00:42:22 UTC

[impala] 03/03: IMPALA-11192: Batch uploading files in test_scanner_fuzz.py

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 7273c9f8a9698609816b680a79f21f9389bb265f
Author: stiga-huang <hu...@gmail.com>
AuthorDate: Thu Mar 17 10:44:58 2022 +0800

    IMPALA-11192: Batch uploading files in test_scanner_fuzz.py
    
    test_scanner_fuzz.py runs much slower on ORC than on other formats. Most
    of the time is spent uploading local files one by one to the HDFS table
    folder.
    
    The local files are copied from HDFS and randomly corrupted by the test.
    The local directory layout is the same as that of the table folder, and
    there are no staging dirs that need to be skipped. So the whole local
    folder can be uploaded at once, which saves a lot of test time.
    
    Tested locally and verified the profiles of the queries that succeeded.
    They all scan the expected number of rows.
    
    Change-Id: I504e160b84b3cc01d3be0b4e242d3c372692d181
    Reviewed-on: http://gerrit.cloudera.org:8080/18329
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/query_test/test_scanners_fuzz.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/query_test/test_scanners_fuzz.py b/tests/query_test/test_scanners_fuzz.py
index 9c4b48a..0576132 100644
--- a/tests/query_test/test_scanners_fuzz.py
+++ b/tests/query_test/test_scanners_fuzz.py
@@ -191,7 +191,7 @@ class TestScannersFuzzing(ImpalaTestSuite):
       table_loc = self._get_table_location(fq_fuzz_table_name, vector)
       check_call(['hdfs', 'dfs', '-copyToLocal', table_loc + "/*", tmp_table_dir])
       partitions = self.walk_and_corrupt_table_data(tmp_table_dir, num_copies, rng)
-      self.path_aware_copy_files_to_hdfs(tmp_table_dir, table_loc)
+      self.filesystem_client.copy_from_local(tmp_table_dir, table_loc)
     else:
       self.execute_query("create table %s.%s like %s.%s" % (fuzz_db, fuzz_table,
           src_db, src_table))
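
The difference between the two upload strategies described in the commit message can be illustrated with a minimal sketch. It is not the Impala implementation: the bodies of path_aware_copy_files_to_hdfs and filesystem_client.copy_from_local are not shown in this diff, so the helper names, the demo partition layout, and the choice of "hdfs dfs -put -f" below are all assumptions made for illustration. The sketch only builds the command lines and does not run them, so it needs no cluster.

```python
import os
import posixpath
import tempfile


def per_file_put_commands(local_dir, hdfs_dir):
    """Hypothetical helper: one 'hdfs dfs -put' command per local file."""
    commands = []
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # make the traversal order deterministic
        rel_root = os.path.relpath(root, local_dir)
        if rel_root == ".":
            dest_dir = hdfs_dir
        else:
            # Mirror the local sub-directory under the HDFS table folder.
            dest_dir = posixpath.join(hdfs_dir, *rel_root.split(os.sep))
        for name in sorted(files):
            commands.append(
                ["hdfs", "dfs", "-put", "-f", os.path.join(root, name), dest_dir])
    return commands


def batch_put_command(local_dir, hdfs_dir):
    """Hypothetical helper: a single command that uploads every top-level
    entry of local_dir, relying on the layout matching the table folder."""
    entries = sorted(os.path.join(local_dir, e) for e in os.listdir(local_dir))
    return ["hdfs", "dfs", "-put", "-f"] + entries + [hdfs_dir]


def build_demo_tree():
    """Create a small partitioned layout (assumed, for demonstration only)."""
    root = tempfile.mkdtemp()
    for part in ("year=2009", "year=2010"):
        os.makedirs(os.path.join(root, part))
        for i in range(3):
            with open(os.path.join(root, part, "file%d.orc" % i), "w") as f:
                f.write("x")
    return root


if __name__ == "__main__":
    tree = build_demo_tree()
    print(len(per_file_put_commands(tree, "/warehouse/fuzz_table")), "commands")
    print(" ".join(batch_put_command(tree, "/warehouse/fuzz_table")))
```

With the per-file strategy the number of client invocations grows with the number of files, while the batch strategy issues one invocation regardless of file count. The batch form is only safe under the condition the commit message states: the local layout matches the table folder and contains nothing that must be skipped. How "-put -f" treats directories that already exist at the destination should be checked against the Hadoop documentation before relying on it.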