Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/29 23:57:43 UTC

[GitHub] [incubator-druid] himanshug opened a new issue #7574: making VirtualColumn more powerful

himanshug opened a new issue #7574: making VirtualColumn more powerful
URL: https://github.com/apache/incubator-druid/issues/7574
 
 
   **Motivation:**
   
   The current VirtualColumn interface is designed for simple use cases in which a virtual column just maps the row value through a function, e.g. the ExpressionVirtualColumn. I have a custom complex column that persists a nested data structure in a Druid column. Since Druid gives full control over the serialization of complex columns, I am able to have an efficient storage layout and indexes (multiple bitmaps and dictionary encodings). The VirtualColumn feature lets me provide many different "views" of this complex column, and that works.
   However, the current VirtualColumn interface doesn't let me exploit the storage layout or the bitmaps in the column, which leads to very suboptimal processing.
   
   **Description:**
   
   Here are the proposed changes (I might rethink these, but this is the first cut).
   
   The current VirtualColumn interface has the following methods to get values out of it:
   
   ```
   ColumnValueSelector<?> makeColumnValueSelector(String columnName, ColumnSelectorFactory factory);
   DimensionSelector makeDimensionSelector(DimensionSpec dimensionSpec, ColumnSelectorFactory factory);
   
   ```
   
   Both receive a `ColumnSelectorFactory` that gives access to row values from other base columns. However, it cannot give access to the storage structure or bitmap indexes of those columns.
   
   The proposal is to add the following additional methods to the VirtualColumn interface (all of which are optional for the implementor).
   
   ```
     default BitmapIndex getBitmapIndex(String columnName, ColumnSelector selector)
     {
       return null;
     }
   
     default DictionaryEncodedColumn<?> asDictionaryEncodedColumn(String columnName, ColumnSelector columnSelector)
     {
       return null;
     }
   
     default DimensionSelector makeDimensionSelector(String columnName, ColumnSelector columnSelector, ReadableOffset offset)
     {
       return null;
     }
   
     default ColumnValueSelector<?> makeColumnValueSelector(String columnName, ColumnSelector columnSelector, ReadableOffset offset)
     {
       return null;
     }
   ```
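
To illustrate what an implementor might do with one of these optional methods, here is a minimal sketch of a virtual column "view" that exposes a per-value bitmap through a `getBitmapIndex`-style method while leaving every other method at its `null` default. All types here are simplified stand-ins and not Druid's real classes: `ValueBitmapIndex`, `VirtualColumnLike`, and `NestedFieldView` are hypothetical names, `java.util.BitSet` stands in for Druid's bitmaps, and the `ColumnSelector` parameter of the proposed signature is omitted for brevity.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: simplified stand-ins, not Druid's real BitmapIndex or VirtualColumn. */
final class BitmapVirtualColumnSketch
{
  /** Stand-in for BitmapIndex: maps a value to the set of row numbers that contain it. */
  interface ValueBitmapIndex
  {
    BitSet getBitmap(String value);
  }

  /** Stand-in for VirtualColumn, reduced to the proposed optional index method. */
  interface VirtualColumnLike
  {
    default ValueBitmapIndex getBitmapIndex(String columnName)
    {
      return null; // default: this virtual column exposes no index
    }
  }

  /** Hypothetical view over one field of a nested column that keeps per-value bitmaps. */
  static final class NestedFieldView implements VirtualColumnLike
  {
    private final Map<String, BitSet> bitmapsByValue = new HashMap<>();

    NestedFieldView(String[] fieldValuesByRow)
    {
      // Build one bitmap per distinct value; bit N is set when row N holds that value.
      for (int row = 0; row < fieldValuesByRow.length; row++) {
        bitmapsByValue.computeIfAbsent(fieldValuesByRow[row], v -> new BitSet()).set(row);
      }
    }

    @Override
    public ValueBitmapIndex getBitmapIndex(String columnName)
    {
      // Hand out copies (or an empty bitmap) so callers cannot mutate the stored bitmaps.
      return value -> {
        BitSet stored = bitmapsByValue.get(value);
        return stored == null ? new BitSet() : (BitSet) stored.clone();
      };
    }
  }
}
```

In this sketch, a caller that gets a non-null index back can find the matching rows directly from the bitmap instead of reading every row value through a selector.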
   
   Wherever Druid code that calls `virtualColumn.makeColumnValueSelector(..)` or `virtualColumn.makeDimensionSelector(...)` has access to a `ColumnSelector`, it would first call the new versions of those methods shown above to get the ColumnValueSelector/DimensionSelector.
   The filtering code would likewise be adjusted to exploit bitmaps from a virtual column whenever it can provide them.
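
The "try the new method first" behavior described above could look roughly like the sketch below. It assumes that a `null` return from the new overload means "not supported" and that the caller then falls back to the existing `ColumnSelectorFactory`-based method. The types are simplified stand-ins, not Druid's real interfaces, and `SelectorDispatchSketch`, `select`, and `StorageAwareColumn` are hypothetical names.

```java
/** Sketch only: the types below are simplified stand-ins, not Druid's real classes. */
final class SelectorDispatchSketch
{
  /** Stand-in for ColumnValueSelector, reduced to a single accessor. */
  interface ValueSelector
  {
    Object getObject();
  }

  /** Stand-in for ColumnSelectorFactory: row-value access only. */
  interface RowFactory {}

  /** Stand-in for ColumnSelector: access to column storage and indexes. */
  interface StorageSelector {}

  /** Stand-in for ReadableOffset: the current row position. */
  interface Offset
  {
    int getOffset();
  }

  /** Stand-in for VirtualColumn with the existing method and the proposed optional one. */
  interface VirtualColumnLike
  {
    ValueSelector makeColumnValueSelector(String columnName, RowFactory factory);

    default ValueSelector makeColumnValueSelector(String columnName, StorageSelector selector, Offset offset)
    {
      return null; // assumption: null means "no storage-aware implementation"
    }
  }

  /** Hypothetical virtual column that implements both the old and the new method. */
  static final class StorageAwareColumn implements VirtualColumnLike
  {
    @Override
    public ValueSelector makeColumnValueSelector(String columnName, RowFactory factory)
    {
      return () -> "row-based";
    }

    @Override
    public ValueSelector makeColumnValueSelector(String columnName, StorageSelector selector, Offset offset)
    {
      return () -> "storage-aware";
    }
  }

  /** Prefer the storage-aware selector when a StorageSelector is available; otherwise fall back. */
  static ValueSelector select(
      VirtualColumnLike virtualColumn,
      String columnName,
      RowFactory factory,
      StorageSelector storage,
      Offset offset
  )
  {
    if (storage != null) {
      ValueSelector storageAware = virtualColumn.makeColumnValueSelector(columnName, storage, offset);
      if (storageAware != null) {
        return storageAware;
      }
    }
    return virtualColumn.makeColumnValueSelector(columnName, factory);
  }
}
```

Because the new methods default to `null`, existing VirtualColumn implementations would keep working unchanged under this scheme; only implementations that override them would take the storage-aware path.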
   
   
   
   
   
