Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/12/12 19:59:35 UTC

[GitHub] [drill] paul-rogers edited a comment on issue #1923: DRILL-7479: Partial fixes for metadata parameterized type issues

URL: https://github.com/apache/drill/pull/1923#issuecomment-565141768
 
 
   Tried making `ColumnStatistics` generic. However, there are several places where the type of the column is required (min/max values, for example).
   
   The issue is that each column has multiple stats: some are of the same type as the column (min, max, histogram), others are not (HLL, NDV.)
   
   Even the stats that are nominally the "same" type as the column are not: the column is of type `VARCHAR`, say, but the `ColumnStatistics` is of type `String`. There is no explicit mapping from Drill types to Java types; this knowledge appears to be implicit in the code.
   
   Further, the `ColumnStatisticsKind` is not what you would think: it has a compile-time type (the type parameter), but, because of type erasure, no runtime type (I cannot ask what kind of value the kind holds.)
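
A minimal sketch of the erasure problem described above. The classes `Kind` and `TypedKind` are hypothetical stand-ins, not Drill's actual `ColumnStatisticsKind`; the second one shows one common workaround, which is to carry an explicit `Class` token:

```java
public class ErasureSketch {
  // Hypothetical stand-in for a statistics kind with only a compile-time type.
  static class Kind<T> {
    final String name;

    Kind(String name) {
      this.name = name;
    }
  }

  // Same idea, but with an explicit Class token so the value type survives to run time.
  static class TypedKind<T> {
    final String name;
    final Class<T> valueType;

    TypedKind(String name, Class<T> valueType) {
      this.name = name;
      this.valueType = valueType;
    }
  }

  public static void main(String[] args) {
    Kind<String> min = new Kind<>("min");
    Kind<Long> ndv = new Kind<>("ndv");

    // The type argument is erased: both objects have the same runtime class,
    // so there is nothing to inspect to learn what kind of value each holds.
    System.out.println(min.getClass() == ndv.getClass()); // prints "true"

    // With a class token, the value type can be queried at run time.
    TypedKind<Long> typedNdv = new TypedKind<>("ndv", Long.class);
    System.out.println(typedNdv.valueType.getSimpleName()); // prints "Long"
  }
}
```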
   
   I think we need to rethink this a bit:
   
   * A type definition holds the type-specific logic (such as merging stats) as well as giving both the Java and SQL type (if any) for the stat.
   * A mapping tells us the Java stats type for the SQL type.
   * A stats definition holds a name and a type definition. (Where the type definition may be "same as column type.")
   * A Statistic has a definition, a (possibly implied) type and a value.
   * A column is generic: it has a name, a SQL type and a list of statistics.
   * It may be worthwhile having a column "meta-meta" definition: the list of statistics valid for a column of a given SQL type. (There is no min or max for a VarBinary, say.)
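
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (`SqlType`, `StatType`, `StatDef`, `Statistic`, `Column`, `JAVA_TYPE`, and the `nullCount` stat) is hypothetical and not taken from Drill, and the "meta-meta" list of valid stats per SQL type is left out for brevity.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class StatsModelSketch {
  // Illustrative subset of SQL types; not Drill's actual type enum.
  enum SqlType { BIGINT, VARCHAR, VARBINARY }

  // Type definition: Java type, SQL type (null if none) and type-specific merge logic.
  static final class StatType<T> {
    final Class<T> javaType;
    final SqlType sqlType;
    final BinaryOperator<T> merger;

    StatType(Class<T> javaType, SqlType sqlType, BinaryOperator<T> merger) {
      this.javaType = javaType;
      this.sqlType = sqlType;
      this.merger = merger;
    }

    T merge(T a, T b) {
      return merger.apply(a, b);
    }
  }

  // Mapping from a column's SQL type to the Java type used for "same as column" stats.
  static final Map<SqlType, Class<?>> JAVA_TYPE = new EnumMap<>(SqlType.class);
  static {
    JAVA_TYPE.put(SqlType.BIGINT, Long.class);
    JAVA_TYPE.put(SqlType.VARCHAR, String.class);
    JAVA_TYPE.put(SqlType.VARBINARY, byte[].class);
  }

  // Stats definition: a name plus a type definition; a null type means "same as column type".
  static final class StatDef {
    final String name;
    final StatType<?> type;

    StatDef(String name, StatType<?> type) {
      this.name = name;
      this.type = type;
    }

    Class<?> javaTypeFor(SqlType columnType) {
      return type != null ? type.javaType : JAVA_TYPE.get(columnType);
    }
  }

  // Statistic: a definition and a value; its type is implied by the definition and the column.
  static final class Statistic {
    final StatDef def;
    final Object value;

    Statistic(StatDef def, Object value) {
      this.def = def;
      this.value = value;
    }
  }

  // Column: a name, a SQL type and a list of statistics, type-checked at run time.
  static final class Column {
    final String name;
    final SqlType type;
    final List<Statistic> stats = new ArrayList<>();

    Column(String name, SqlType type) {
      this.name = name;
      this.type = type;
    }

    void add(Statistic stat) {
      Class<?> expected = stat.def.javaTypeFor(type);
      if (!expected.isInstance(stat.value)) {
        throw new IllegalArgumentException(stat.def.name + " expects " + expected.getSimpleName());
      }
      stats.add(stat);
    }
  }

  public static void main(String[] args) {
    StatType<Long> count = new StatType<>(Long.class, null, Long::sum);
    StatDef nullCount = new StatDef("nullCount", count); // fixed type, unrelated to the column
    StatDef min = new StatDef("min", null);              // same type as the column

    Column col = new Column("last_name", SqlType.VARCHAR);
    col.add(new Statistic(min, "Aaronson")); // VARCHAR maps to String, so this is accepted
    col.add(new Statistic(nullCount, 7L));
    System.out.println(col.stats.size());      // prints "2"
    System.out.println(count.merge(40L, 2L));  // prints "42"
  }
}
```

Because the column itself is not generic in this sketch, the type check moves from compile time to `Column.add`, which is the trade-off such a design would accept in exchange for having the value type available at run time.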
   
   This PR at least removes many warnings and an obscure compile error from the present design.
