You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2021/11/03 19:56:46 UTC

[GitHub] [calcite] jamesstarr commented on a change in pull request #2378: [CALCITE-4539] Customization of Metadata Handlers

jamesstarr commented on a change in pull request #2378:
URL: https://github.com/apache/calcite/pull/2378#discussion_r742284367



##########
File path: core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java
##########
@@ -107,47 +106,69 @@
   private BuiltInMetadata.LowerBoundCost.Handler lowerBoundCostHandler;
 
   /**
-   * Creates the instance with {@link JaninoRelMetadataProvider} instance
-   * from {@link #THREAD_PROVIDERS} and {@link #EMPTY} as a prototype.
+   * Creates the instance using the Handler.classualt prototype.
    */
   protected RelMetadataQuery() {
-    this(castNonNull(THREAD_PROVIDERS.get()), EMPTY);
+    this(PROTOTYPE);
   }
 
   /** Creates and initializes the instance that will serve as a prototype for
    * all other instances. */
-  private RelMetadataQuery(@SuppressWarnings("unused") boolean dummy) {
-    super(null);
-    this.collationHandler = initialHandler(BuiltInMetadata.Collation.Handler.class);
-    this.columnOriginHandler = initialHandler(BuiltInMetadata.ColumnOrigin.Handler.class);
-    this.expressionLineageHandler = initialHandler(BuiltInMetadata.ExpressionLineage.Handler.class);
-    this.tableReferencesHandler = initialHandler(BuiltInMetadata.TableReferences.Handler.class);
-    this.columnUniquenessHandler = initialHandler(BuiltInMetadata.ColumnUniqueness.Handler.class);
-    this.cumulativeCostHandler = initialHandler(BuiltInMetadata.CumulativeCost.Handler.class);
-    this.distinctRowCountHandler = initialHandler(BuiltInMetadata.DistinctRowCount.Handler.class);
-    this.distributionHandler = initialHandler(BuiltInMetadata.Distribution.Handler.class);
-    this.explainVisibilityHandler = initialHandler(BuiltInMetadata.ExplainVisibility.Handler.class);
-    this.maxRowCountHandler = initialHandler(BuiltInMetadata.MaxRowCount.Handler.class);
-    this.minRowCountHandler = initialHandler(BuiltInMetadata.MinRowCount.Handler.class);
-    this.memoryHandler = initialHandler(BuiltInMetadata.Memory.Handler.class);
-    this.nonCumulativeCostHandler = initialHandler(BuiltInMetadata.NonCumulativeCost.Handler.class);
-    this.parallelismHandler = initialHandler(BuiltInMetadata.Parallelism.Handler.class);
+  protected RelMetadataQuery(MetadataHandlerProvider metadataHandlerProvider) {

Review comment:
       There are several requirements metadata in calcite:
   Metadata retrieval - the api for getting a metdata value for a relnode and give set of arguments.  This is currently the RelMetadataQuery.
   Metadata implementation - The code that is called to for a relnode to get a metadata value.  This is currently sub-classes of RelMetdataProvider.
   Defaulting and Validation - There is defaulting, canonization, validation, etc after a metadata value is retrieved in RelMetadataQuery.
   Dispatch - How the implementation is called for given arguments.  This is currently done in Janino compiled code.
   Cycle Detection - This is currently a **hard** requirement of volcano planner.  This is currently done in Janino compiled code.
   Caching Invalidation - This is required for volcano planner.  This is currently done through RelMetadataQuery.clearCache.
   Caching Implementation - This is coupled with cycle detection, but is currently accomplished through RelMetadataQuery.map.
   Caching Values - The caches need to be checked and updated.  This is currently done in the generated code.
   
   Metadata retrieval and Metadata implementation are hard requirements for external facing APIs.  They are currently exist as external facing APIs.  Many downstream projects depend on the ability to either customize existing metadata types or add their own metadata types.  When a down stream project adds their own metadata type, they need to be able to expose it to the larger system aka through metadata retrieval.  This is currently done by subclassing RelMetadataQuery and calling methods RelMetadataQueryBase.  An example of this can be seen in RelMetadataTest.MyRelMetadataQuery.
   
   So the metadata retrieval api is the only api need to for the rest of calcite core, however, downstream users of calcite need to be able to configure and extend the metadata implementation.  So some of RelMetadataQuery implementation must be exposed with the current setup.
   
   I support making RelMetadataQuery an interface so the metadata retrieval is clearly defined.  However, this is not the only public API necessary to expose.  Also, it is breaking change which does not separate the defaulting and validation from the RelMetadataQueryImpl.  If RelMetadataQuery was split into RelMetadataQuery(an interface), DefaultingAndValidatingRelMetadataQuery and HandlerBackedRelMetadataQuery, then down stream projects would have to implement 2 different RelMetadataQueries to add a custom metadata.  This api seems a bit convoluted and cumbersome.  Alternatively, breaking it up into class hierarchy of RelMetadataQuery -> DefaultingAndValidatingRelMetadataQuery ->  HandlerBackedRelMetadataQuery also feels a bit convoluted and is a breaking change.
   
   I also support removing the leaky abstraction of the metadata implementation to the rest of calcite core.  But changing this does not actually help things much other than remove a thread local that can cause odd behaviors in uncommon nesting scenarios. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@calcite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org