Posted to issues@flink.apache.org by "Kurt Young (Jira)" <ji...@apache.org> on 2019/11/08 01:57:00 UTC
[jira] [Created] (FLINK-14663) Distinguish unknown column stats and zero
Kurt Young created FLINK-14663:
----------------------------------
Summary: Distinguish unknown column stats and zero
Key: FLINK-14663
URL: https://issues.apache.org/jira/browse/FLINK-14663
Project: Flink
Issue Type: Improvement
Components: Connectors / Hive, Table SQL / API
Reporter: Kurt Young
When converting Hive stats to Flink's column stats, we do not check whether a column stat was actually set or is just an initial value. For example:
{code:java}
// Values are copied as-is, without checking whether each stat was actually set.
LongColumnStatsData longColStats = stats.getLongStats();
return new CatalogColumnStatisticsDataLong(
        longColStats.getLowValue(),
        longColStats.getHighValue(),
        longColStats.getNumDVs(),
        longColStats.getNumNulls());
{code}
Hive's `LongColumnStatsData` actually records whether each stat has been set, and exposes this through APIs like `isSetNumDVs()`. Because the initial values are all 0, we cannot tell whether a value is really 0 or just an unset initial value.
We can use -1 to represent an UNKNOWN value for column stats.
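A minimal sketch of the idea, not the actual fix: `FakeLongStats` is a hypothetical stand-in that models only the `getNumDVs()` / `isSetNumDVs()` pair from Hive's `LongColumnStatsData`, so the example runs without Hive on the classpath. The `UNKNOWN` constant and the helper name `numDVsOrUnknown` are also assumptions made for illustration.

```java
public class UnknownStatsSketch {
    /** Sentinel proposed in this issue for "stat was never set". */
    static final long UNKNOWN = -1L;

    /**
     * Hypothetical stand-in for Hive's LongColumnStatsData.
     * Only the NumDVs accessors are modeled here.
     */
    static final class FakeLongStats {
        private long numDVs;        // initial value is 0, as described above
        private boolean numDVsSet;  // flips to true once the setter is called

        void setNumDVs(long value) {
            this.numDVs = value;
            this.numDVsSet = true;
        }

        long getNumDVs() {
            return numDVs;
        }

        boolean isSetNumDVs() {
            return numDVsSet;
        }
    }

    /** Returns the stat if it was set, otherwise the UNKNOWN sentinel. */
    static long numDVsOrUnknown(FakeLongStats stats) {
        return stats.isSetNumDVs() ? stats.getNumDVs() : UNKNOWN;
    }

    public static void main(String[] args) {
        FakeLongStats unset = new FakeLongStats();

        FakeLongStats explicitZero = new FakeLongStats();
        explicitZero.setNumDVs(0L);

        System.out.println(numDVsOrUnknown(unset));        // prints -1
        System.out.println(numDVsOrUnknown(explicitZero)); // prints 0
    }
}
```

The same check would presumably apply to the low value, high value, and null count before they are passed to `CatalogColumnStatisticsDataLong`.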