Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2018/06/28 09:22:57 UTC

[GitHub] vvysotskyi opened a new pull request #1346: DRILL-6546: Allow unnest function with nested columns and complex expressions

URL: https://github.com/apache/drill/pull/1346
 
 
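   - Added a new rule, `ProjectComplexRexNodeCorrelateTransposeRule`, which takes a complex expression from the `Project` below the `Uncollect` rel node and creates a new `Project` that contains the expressions from the left side of `Correlate` together with this complex expression. The `Project` below `Uncollect` then only references the new column through the correlation variable.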
   For example, part of the plan before applying the rule:
   ```
   LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}]): rowcount = 1.0, cumulative cost = {inf}, id = 100
     EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7
     Uncollect(subset=[rel#99:Subset#3.NONE.ANY([]).[]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 98
       LogicalProject(subset=[rel#97:Subset#2.NONE.ANY([]).[]], EXPR$0=[ITEM($cor0.orders, 'items')]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 96
         LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 0 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 8
   ```
   Plan after applying the rule:
   ```
   LogicalProject(**=[$0], orders=[$1], $complexRexNode0=[$3]): rowcount = 1.0, cumulative cost = {inf}, id = 116
     LogicalCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{2}]): rowcount = 1.0, cumulative cost = {inf}, id = 115
       LogicalProject(**=[$0], orders=[$1], $complexRexNode=[ITEM($1, 'items')]): rowcount = 100.0, cumulative cost = {100.0 rows, 300.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 112
         EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7
       Uncollect: rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 114
         LogicalProject($complexRexNode=[$cor1.$complexRexNode]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 113
           LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 0 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 8
   ```
   - Changed the handling of `DrillCompoundIdentifier` inside `unnest` so that it is converted to an item call, which avoids a "column not found" error when a nested column is used.
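
The rewrite shown in the two plans above can be modeled with a small, self-contained sketch. This is a toy illustration only, not the code of `ProjectComplexRexNodeCorrelateTransposeRule`: the class `ComplexRexNodeTransposeSketch`, the `transpose` method, and the `Rewritten` holder are hypothetical names, and field lists are plain strings rather than Calcite `RelNode`/`RexNode` objects. It assumes the complex expression is a single item access on one left-side field, as in the example plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: field lists are plain strings, not Calcite RelNode/RexNode objects.
class ComplexRexNodeTransposeSketch {

  /** The parts of the plan that differ after the rewrite (hypothetical holder type). */
  static final class Rewritten {
    final List<String> leftProject; // Project inserted as the left input of Correlate
    final String rightProject;      // Project that remains below Uncollect
    final int requiredColumn;       // left-side column index the right side now reads

    Rewritten(List<String> leftProject, String rightProject, int requiredColumn) {
      this.leftProject = leftProject;
      this.rightProject = rightProject;
      this.requiredColumn = requiredColumn;
    }
  }

  /**
   * Moves the item access on a correlated field (for example orders['items'])
   * from the right side of the correlate into a new project on the left side.
   */
  static Rewritten transpose(List<String> leftFields, String nestedField,
                             String key, String corVar) {
    int fieldIndex = leftFields.indexOf(nestedField);
    if (fieldIndex < 0) {
      throw new IllegalArgumentException("unknown field: " + nestedField);
    }

    // Keep every existing left-side field as a plain input reference.
    List<String> leftProject = new ArrayList<>();
    for (int i = 0; i < leftFields.size(); i++) {
      leftProject.add(leftFields.get(i) + "=[$" + i + "]");
    }

    // Append the complex expression as one extra column; its index becomes
    // the only column the right side of the correlate requires.
    int requiredColumn = leftFields.size();
    leftProject.add("$complexRexNode=[ITEM($" + fieldIndex + ", '" + key + "')]");

    // The project below Uncollect now just reads that column via the correlation variable.
    String rightProject = "$complexRexNode=[" + corVar + ".$complexRexNode]";
    return new Rewritten(leftProject, rightProject, requiredColumn);
  }

  public static void main(String[] args) {
    Rewritten r = transpose(Arrays.asList("**", "orders"), "orders", "items", "$cor1");
    System.out.println("LogicalCorrelate(requiredColumns=[{" + r.requiredColumn + "}])");
    System.out.println("  LogicalProject(" + String.join(", ", r.leftProject) + ")");
    System.out.println("  Uncollect");
    System.out.println("    LogicalProject(" + r.rightProject + ")");
  }
}
```

Called with the left-side fields `**` and `orders`, the sketch yields the same field lists and required column index as the rewritten plan above. The top-level `LogicalProject` that the rule places above the `Correlate` is left out of the model.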
