Posted to commits@impala.apache.org by st...@apache.org on 2023/02/01 09:32:26 UTC

[impala] 02/02: IMPALA-11756: Disable auto analyze table triggered by Hive

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit c3ed44268838f4a00624e7e7fbf6e614f0a54a48
Author: stiga-huang <hu...@gmail.com>
AuthorDate: Tue Jan 31 16:11:05 2023 +0800

    IMPALA-11756: Disable auto analyze table triggered by Hive
    
    Hive automatically recomputes stats asynchronously after major
    compactions. These tasks can fail if the table is removed before they
    run, which often happens with our test tables. Such a failure can cause
    further failures in other query-based compactions in the same session,
    resulting in test failures. See the analysis in the JIRA for an example.
    
    We don't rely on Hive to recompute the stats, so we can disable this
    feature to avoid the issue. This patch sets
    "hive.compactor.gather.stats" to false in the test hive-site.xml.
    
    Tests:
     - Ran exhaustive tests
    
    Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
    Reviewed-on: http://gerrit.cloudera.org:8080/19464
    Reviewed-by: Riza Suminto <ri...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 fe/src/test/resources/hive-site.xml.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fe/src/test/resources/hive-site.xml.py b/fe/src/test/resources/hive-site.xml.py
index 3b3e5fb0b..e7cf99d45 100644
--- a/fe/src/test/resources/hive-site.xml.py
+++ b/fe/src/test/resources/hive-site.xml.py
@@ -159,6 +159,11 @@ if hive_major_version >= 3:
    # (the default value is 5 mins which is way too long). Setting it to 2 seconds.
    'hive.compactor.wait.timeout': '2000',
 
+   # No need to automatically compute stats after compactions. It might cause failures
+   # if we trigger compaction on temp tables in tests. The stats computation is async and
+   # will fail if the temp tables are removed. See an example in IMPALA-11756.
+   'hive.compactor.gather.stats': 'false',
+
    # Since HIVE-22589, Hive uses Julian Calendar for writing dates before 1582-10-15,
    # whereas Impala uses proleptic Gregorian Calendar. This affects the results Impala
    # gets when querying tables written by Hive.
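
For illustration, the new entry can be read as one more key in a Python dict that is later rendered into Hadoop-style configuration XML. The sketch below is not Impala's actual generator: the CONFIG dict and the render_hive_site helper are hypothetical names, and only the two property keys and their values come from the diff above.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of the test settings. Only these two keys and values
# are taken from the diff; the real file contains many more entries.
CONFIG = {
    'hive.compactor.wait.timeout': '2000',
    # Turn off the asynchronous stats gathering that follows compactions.
    'hive.compactor.gather.stats': 'false',
}


def render_hive_site(config):
    """Render a dict of properties as Hadoop-style <configuration> XML.

    Hypothetical helper for illustration; it is not part of Impala.
    """
    root = ET.Element('configuration')
    for name, value in sorted(config.items()):
        prop = ET.SubElement(root, 'property')
        ET.SubElement(prop, 'name').text = name
        ET.SubElement(prop, 'value').text = value
    return ET.tostring(root, encoding='unicode')


xml_text = render_hive_site(CONFIG)
print(xml_text)
```

Parsing the rendered text back and checking that hive.compactor.gather.stats maps to false is a quick way to confirm that the setting would reach Hive as intended.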