Posted to commits@hudi.apache.org by "Manoj Govindassamy (Jira)" <ji...@apache.org> on 2022/01/25 02:55:00 UTC
[jira] [Assigned] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manoj Govindassamy reassigned HUDI-3316:
----------------------------------------
Assignee: Manoj Govindassamy
> HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics
> ----------------------------------------------------------------------
>
> Key: HUDI-3316
> URL: https://issues.apache.org/jira/browse/HUDI-3316
> Project: Apache Hudi
> Issue Type: Task
> Components: writer-core
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Priority: Blocker
> Fix For: 0.11.0
>
>
> HoodieColumnRangeMetadata includes the following stats about a Parquet column:
> * columnName
> * minValue
> * maxValue
> * numNulls
>
> Parquet's ColumnChunkMetaData has more stats, and we need to include them all in our index:
> * distinct
> * num values
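
For illustration, the two stat lists above could be combined into a single per-column record along the following lines. This is a minimal sketch under stated assumptions, not Hudi's actual HoodieColumnRangeMetadata: the class name ColumnRangeStats, the numValues and numDistinct field names, and the UNKNOWN sentinel are all hypothetical. It assumes the distinct count is optional, since a writer may not record it.

```java
/**
 * Hypothetical holder for per-column statistics. It is NOT Hudi's real class;
 * it only mirrors the fields listed in the issue, plus the two proposed additions.
 */
final class ColumnRangeStats<T extends Comparable<T>> {
    /** Sentinel (assumption of this sketch) for a count the writer did not record. */
    static final long UNKNOWN = -1L;

    final String columnName;
    final T minValue;        // existing stat
    final T maxValue;        // existing stat
    final long numNulls;     // existing stat
    final long numValues;    // proposed addition: "num values"
    final long numDistinct;  // proposed addition: "distinct", or UNKNOWN

    ColumnRangeStats(String columnName, T minValue, T maxValue,
                     long numNulls, long numValues, long numDistinct) {
        if (columnName == null || columnName.isEmpty()) {
            throw new IllegalArgumentException("columnName must be non-empty");
        }
        if (numNulls < 0 || numValues < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        if (numDistinct < 0 && numDistinct != UNKNOWN) {
            throw new IllegalArgumentException("numDistinct must be >= 0 or UNKNOWN");
        }
        this.columnName = columnName;
        this.minValue = minValue;
        this.maxValue = maxValue;
        this.numNulls = numNulls;
        this.numValues = numValues;
        this.numDistinct = numDistinct;
    }

    /** True only when a distinct count was actually recorded. */
    boolean hasDistinctCount() {
        return numDistinct != UNKNOWN;
    }
}
```

Keeping the distinct count behind an explicit "unknown" marker lets index readers tell a missing statistic apart from a genuine zero.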
--
This message was sent by Atlassian Jira
(v8.20.1#820001)