You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/13 21:43:53 UTC

[GitHub] [druid] himanshug opened a new pull request #10277: make dimension column extensible with COMPLEX type

himanshug opened a new pull request #10277:
URL: https://github.com/apache/druid/pull/10277


   `WIP` tag is added because I am still in the process of testing it more thoroughly, However code in current patch is working in preliminary testing. I will also try and some UTs for the changes. That said, code is totally reviewable.
    
   ### Description
   
   This patch adds support for adding COMPLEX type Dimension column via Extensions, similar support for Metric columns exist.
   
   Main changes are to add `String getTypeName()` in `ColumnCapabilities` interface and to add `DimensionHandlerUtils.registerDimensionHandlerProvider(String typeName, DimensionHandlerProvider)` to enable extensions to register custom type specific `DimensionHandler`, very similar to existing `ComplexMetrics.registerSerde(..)`
   
   After this patch, Users can have extensions with custom implementations of `DimensionHandler`, `ComplexMetricSerde` 
    and related interfaces to add COMPLEX type Dimension Column.
   In addition to above, extension would also make `ComplexMetrics.registerSerde(typeName,..)` and `DimensionHandlerUtils.registerDimensionHandlerProvider(typeName,..)` to register custom type specific implementations.
   
   Side-note: I am using this extensibility to add a `Map<String, String>` type Dimension column for various reasons which will later be added in a separate PR as contrib extension.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `ColumnCapabilities`
    * `DimensionHandlerUtils`
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r532332864



##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java
##########
@@ -37,6 +37,12 @@
    */
   ValueType getType();
 
+  /**
+   *
+   * If ValueType is COMPLEX, then the typeName associated with it.
+   */
+  String getTypeName();

Review comment:
       changed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-724221388


   This issue is no longer marked as stale.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r532525267



##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnBuilder.java
##########
@@ -58,6 +58,12 @@ public ColumnBuilder setType(ValueType type)
     return this;
   }
 
+  public ColumnBuilder setTypeName(String typeName)

Review comment:
       nit: this should be `setComplexTypeName` probably to match capabilities




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r534588477



##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnBuilder.java
##########
@@ -58,6 +58,12 @@ public ColumnBuilder setType(ValueType type)
     return this;
   }
 
+  public ColumnBuilder setTypeName(String typeName)

Review comment:
       thanks, yeah updated, I missed it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-724221377


   not stale


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug merged pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug merged pull request #10277:
URL: https://github.com/apache/druid/pull/10277


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-673728033


   Cool :+1:
   
   What do you think about deprecating the `ComplexMetric` terminology and replacing it with `ComplexType` (e.g. `ComplexMetricSerde` -> `ComplexTypeSerde`, etc) since it isn't strictly associated to metrics? (I have been wanting to do this for a while, just haven't got to it).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r472515057



##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java
##########
@@ -37,6 +37,12 @@
    */
   ValueType getType();
 
+  /**
+   *
+   * If ValueType is COMPLEX, then the typeName associated with it.
+   */
+  String getTypeName();

Review comment:
       sgtm
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r472515583



##########
File path: processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java
##########
@@ -314,7 +315,12 @@ protected IncrementalIndex(
     for (DimensionSchema dimSchema : dimensionsSpec.getDimensions()) {
       ValueType type = TYPE_MAP.get(dimSchema.getValueType());
       String dimName = dimSchema.getName();
-      ColumnCapabilitiesImpl capabilities = makeDefaultCapabilitiesFromValueType(type);
+
+      // Note: Things might be simpler if DimensionSchema had a method "getColumnCapabilities()" which could return
+      // type specific capabilities by itself. However, for various reasons, DimensionSchema currently lives in druid-core
+      // while ColumnCapabilities lives in druid-processing which makes that approach difficult.
+      ColumnCapabilitiesImpl capabilities = makeDefaultCapabilitiesFromValueType(type, dimSchema.getTypeName());

Review comment:
       it should be alright, I will update this PR later when I have my MapStringString extension ready/deployed and tested with changes in this patch. thanks for the heads up.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-735527755


   this is ready to review/merge, no longer wip.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
himanshug commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-674192153


   @clintropolis thanks, yes they could all be renamed , deserialization is only based on column type , nothing to do with dimension vs metric. however those renaming would be backward incompatible so better done in an independent PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #10277:
URL: https://github.com/apache/druid/pull/10277#discussion_r472433647



##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java
##########
@@ -37,6 +37,12 @@
    */
   ValueType getType();
 
+  /**
+   *
+   * If ValueType is COMPLEX, then the typeName associated with it.
+   */
+  String getTypeName();

Review comment:
       could this be `getComplexTypeName` to tie it more strongly to complex types? It would also match the aggregator factory changes in #9638 which I've started working on again recently. If you recall this discussion https://github.com/apache/druid/pull/9638#discussion_r410995227, I decided to go ahead and rename it to `getComplexTypeName` there because there are already breaking changes with the new abstract methods, so might as well make it intuitive.

##########
File path: processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java
##########
@@ -314,7 +315,12 @@ protected IncrementalIndex(
     for (DimensionSchema dimSchema : dimensionsSpec.getDimensions()) {
       ValueType type = TYPE_MAP.get(dimSchema.getValueType());
       String dimName = dimSchema.getName();
-      ColumnCapabilitiesImpl capabilities = makeDefaultCapabilitiesFromValueType(type);
+
+      // Note: Things might be simpler if DimensionSchema had a method "getColumnCapabilities()" which could return
+      // type specific capabilities by itself. However, for various reasons, DimensionSchema currently lives in druid-core
+      // while ColumnCapabilities lives in druid-processing which makes that approach difficult.
+      ColumnCapabilitiesImpl capabilities = makeDefaultCapabilitiesFromValueType(type, dimSchema.getTypeName());

Review comment:
       Apologies for the conflicts here, I made some semi disruptive changes to push `ColumnCapabilities` into the `DimensionIndexer` implementations so they can be more accurate. I think the changes should still be workable with your addition of `DimensionHandlerProvider`, just the dimension indexer it provides will need to provide the complex column capabilities.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10277: make dimension column extensible with COMPLEX type

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10277:
URL: https://github.com/apache/druid/pull/10277#issuecomment-723501433


   This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org