Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/03/02 01:10:00 UTC
[jira] [Resolved] (IMPALA-11953) num_trues and num_falses in TIntermediateColumnStats should be optional
[ https://issues.apache.org/jira/browse/IMPALA-11953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang resolved IMPALA-11953.
-------------------------------------
Fix Version/s: Impala 4.3.0
Target Version: Impala 4.1.2
Resolution: Fixed
> num_trues and num_falses in TIntermediateColumnStats should be optional
> -----------------------------------------------------------------------
>
> Key: IMPALA-11953
> URL: https://issues.apache.org/jira/browse/IMPALA-11953
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.0.0, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Blocker
> Fix For: Impala 4.3.0
>
>
> IMPALA-8205 added two required fields to TIntermediateColumnStats:
> {code:java}
> struct TIntermediateColumnStats {
>   // One byte for each bucket of the NDV HLL computation
>   1: optional binary intermediate_ndv
>   // If true, intermediate_ndv is RLE-compressed
>   2: optional bool is_ndv_encoded
>   // Number of nulls seen so far (or -1 if nulls are not counted)
>   3: optional i64 num_nulls
>   // The maximum width, in bytes, of the column
>   4: optional i32 max_width
>   // The average width (in bytes) of the column
>   5: optional double avg_width
>   // The number of rows counted, needed to compute NDVs from intermediate_ndv
>   6: optional i64 num_rows
> +
> +  // The number of true and false value, of the column
> +  7: required i64 num_trues
> +  8: required i64 num_falses
> }{code}
> TIntermediateColumnStats is the representation of incremental stats which are stored in HMS partition properties using keys like "impala_intermediate_stats_num_chunks" and "impala_intermediate_stats_chunk0", "impala_intermediate_stats_chunk1", "impala_intermediate_stats_chunk2", etc.
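To illustrate the chunked layout described above, here is a minimal sketch of how the chunk values could be stitched back together from a partition's parameter map. The class ChunkedStatsSketch and its reassemble method are hypothetical names and are not Impala's PartitionStatsUtil; only the property keys come from the issue text. Whatever decoding and Thrift deserialization Impala applies to the joined string is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Impala's actual implementation.
final class ChunkedStatsSketch {
  // Property keys quoted in the issue description.
  static final String NUM_CHUNKS_KEY = "impala_intermediate_stats_num_chunks";
  static final String CHUNK_KEY_PREFIX = "impala_intermediate_stats_chunk";

  /**
   * Concatenates chunk0..chunk(N-1) in index order. Returns null when the
   * partition carries no incremental stats at all.
   */
  static String reassemble(Map<String, String> partitionParams) {
    String numChunksStr = partitionParams.get(NUM_CHUNKS_KEY);
    if (numChunksStr == null) return null;
    int numChunks = Integer.parseInt(numChunksStr);
    StringBuilder joined = new StringBuilder();
    for (int i = 0; i < numChunks; i++) {
      String chunk = partitionParams.get(CHUNK_KEY_PREFIX + i);
      if (chunk == null) {
        throw new IllegalStateException("Missing chunk " + i + " of " + numChunks);
      }
      joined.append(chunk);
    }
    return joined.toString();
  }

  public static void main(String[] args) {
    // Placeholder chunk contents; real values are opaque encoded stats.
    Map<String, String> params = new HashMap<>();
    params.put(NUM_CHUNKS_KEY, "3");
    params.put(CHUNK_KEY_PREFIX + "0", "AAA");
    params.put(CHUNK_KEY_PREFIX + "1", "BBB");
    params.put(CHUNK_KEY_PREFIX + "2", "CC");
    System.out.println(reassemble(params));
  }
}
```

Because the stats live in HMS partition properties, they outlive any single Impala version, so a newer reader has to cope with whatever an older writer stored.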
> After upgrading to Impala 4.0, incremental stats written by earlier versions can't be parsed because the serialized data does not contain these fields.
> {noformat}
> W0227 09:06:49.057451 31105 HdfsPartition.java:1337] Failed to set partition stats for table reptest.test partition loaddate=2022
> Java exception follows:
> org.apache.impala.common.InternalException: Required field 'num_trues' was not found in serialized data! Struct: org.apache.impala.thrift.TIntermediateColumnStats$TIntermediateColumnStatsStandardScheme@377da96a
> at org.apache.impala.common.JniUtil.deserializeThrift(JniUtil.java:138)
> at org.apache.impala.catalog.PartitionStatsUtil.partStatsBytesFromParameters(PartitionStatsUtil.java:114)
> at org.apache.impala.catalog.HdfsPartition$Builder.extractAndCompressPartStats(HdfsPartition.java:1334)
> at org.apache.impala.catalog.HdfsPartition$Builder.setMsPartition(HdfsPartition.java:1310)
> at org.apache.impala.catalog.HdfsTable.createOrUpdatePartitionBuilder(HdfsTable.java:906)
> at org.apache.impala.catalog.HdfsTable.createPartitionBuilder(HdfsTable.java:895)
> at org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:698)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1244)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1138)
> at org.apache.impala.catalog.TableLoader.load(TableLoader.java:114)
> at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:245)
> at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:242)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){noformat}
> num_trues and num_falses are not used in planning, so they should be changed to optional to unblock the migration.
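The difference between the two field modes can be modeled with a small sketch. This is a hand-written approximation of a required-field check, not Thrift's generated code; the class RequiredFieldSketch, the map-based "serialized" record, and the sample values are all assumptions made for illustration. It shows why a record written before num_trues and num_falses existed is rejected when they are required and accepted when they are optional.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of required-field validation; not Thrift's real deserializer.
final class RequiredFieldSketch {
  /** Throws if any field listed as required is absent from the serialized record. */
  static void validate(Map<String, Long> serializedFields, List<String> requiredFields) {
    for (String field : requiredFields) {
      if (!serializedFields.containsKey(field)) {
        throw new IllegalStateException(
            "Required field '" + field + "' was not found in serialized data!");
      }
    }
  }

  /** A record as an older writer might have stored it: no num_trues/num_falses. */
  static Map<String, Long> oldRecord() {
    Map<String, Long> fields = new HashMap<>();
    fields.put("num_nulls", 0L);    // placeholder value
    fields.put("num_rows", 100L);   // placeholder value
    return fields;
  }

  public static void main(String[] args) {
    // Fields declared optional: the old record passes validation.
    validate(oldRecord(), Collections.<String>emptyList());
    // Fields declared required: the old record is rejected.
    try {
      validate(oldRecord(), Arrays.asList("num_trues", "num_falses"));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under this model, making the two fields optional lets old records load, and readers then need to treat the counts as possibly unset.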