Posted to commits@impala.apache.org by st...@apache.org on 2023/02/28 08:41:47 UTC

[impala] branch master updated: IMPALA-11953: Declare num_trues and num_falses in TIntermediateColumnStats as optional

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new 7c854e117 IMPALA-11953: Declare num_trues and num_falses in TIntermediateColumnStats as optional
7c854e117 is described below

commit 7c854e117be659abdfed4ead43df75e5903d9132
Author: stiga-huang <hu...@gmail.com>
AuthorDate: Tue Feb 28 08:48:50 2023 +0800

    IMPALA-11953: Declare num_trues and num_falses in TIntermediateColumnStats as optional
    
    TIntermediateColumnStats is the representation of incremental stats
    which are stored in HMS partition properties using keys like
    "impala_intermediate_stats_chunk0", "impala_intermediate_stats_chunk1",
    "impala_intermediate_stats_chunk2", etc.
    
    Fields in TIntermediateColumnStats should be optional to ensure
    backward compatibility. IMPALA-8205 added two required fields, num_trues
    and num_falses, to TIntermediateColumnStats. This breaks incremental
    stats loading in newer versions of Impala when the stats were generated
    by older Impala versions (< 4.0). This patch makes both fields
    optional.
    
    Tests:
     - Verified that incremental stats generated by a CDH Impala cluster can
       be loaded by a CDP Impala cluster with this fix.
    
    Change-Id: I4f74d5d0676e7ce9eb4ea8061a15610846db3ca5
    Reviewed-on: http://gerrit.cloudera.org:8080/19555
    Reviewed-by: Riza Suminto <ri...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 common/thrift/CatalogObjects.thrift | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
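
The compatibility problem described in the commit message can be illustrated with a toy decoder. This is only a sketch in Python and does not use Thrift's real wire format or API: the names FieldSpec, decode, OLD_SCHEMA and NEW_SCHEMA are hypothetical, and only field ids 6 to 8 and their names are taken from the diff. It assumes a reader that rejects a payload when a field declared required is absent, which is the failure mode the commit message describes for stats written before num_trues and num_falses existed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class MissingRequiredField(Exception):
    """Raised by the toy decoder when a required field is absent."""


@dataclass(frozen=True)
class FieldSpec:
    name: str
    required: bool


# Hypothetical schema tables. Only the ids and names mirror the diff.
# Before this commit: fields 7 and 8 are required.
OLD_SCHEMA = {
    6: FieldSpec("num_rows", required=False),
    7: FieldSpec("num_trues", required=True),
    8: FieldSpec("num_falses", required=True),
}

# After this commit: fields 7 and 8 are optional.
NEW_SCHEMA = {
    6: FieldSpec("num_rows", required=False),
    7: FieldSpec("num_trues", required=False),
    8: FieldSpec("num_falses", required=False),
}


def decode(schema: Dict[int, FieldSpec],
           payload: Dict[int, int]) -> Dict[str, Optional[int]]:
    """Map field ids in payload to names, enforcing requiredness."""
    result: Dict[str, Optional[int]] = {}
    for field_id, spec in schema.items():
        if field_id in payload:
            result[spec.name] = payload[field_id]
        elif spec.required:
            # A strict reader cannot accept a struct with this field unset.
            raise MissingRequiredField(spec.name)
        else:
            # Optional and absent: leave the field unset.
            result[spec.name] = None
    return result


# Stats written by an older writer that never knew about fields 7 and 8.
legacy_payload = {6: 1000}

try:
    decode(OLD_SCHEMA, legacy_payload)
    old_schema_error = None
except MissingRequiredField as exc:
    old_schema_error = str(exc)

new_schema_result = decode(NEW_SCHEMA, legacy_payload)
```

With the required declarations the legacy payload is rejected at the first missing field, while the optional declarations let the same payload load with num_trues and num_falses left unset. Code consuming the struct then has to handle the unset case.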

diff --git a/common/thrift/CatalogObjects.thrift b/common/thrift/CatalogObjects.thrift
index 12d87ab49..7dc214d3a 100644
--- a/common/thrift/CatalogObjects.thrift
+++ b/common/thrift/CatalogObjects.thrift
@@ -227,6 +227,8 @@ struct TColumnStats {
 
 // Intermediate state for the computation of per-column stats. Impala can aggregate these
 // structures together to produce final stats for a column.
+// Fields should be optional for backward compatibility since this is stored in HMS
+// partition properties.
 struct TIntermediateColumnStats {
   // One byte for each bucket of the NDV HLL computation
   1: optional binary intermediate_ndv
@@ -247,8 +249,8 @@ struct TIntermediateColumnStats {
   6: optional i64 num_rows
 
   // The number of true and false value, of the column
-  7: required i64 num_trues
-  8: required i64 num_falses
+  7: optional i64 num_trues
+  8: optional i64 num_falses
 
   // The low and the high value
   9: optional Data.TColumnValue low_value