Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2018/06/28 09:22:57 UTC
[GitHub] vvysotskyi opened a new pull request #1346: DRILL-6546: Allow unnest function with nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346
- Added a new rule, `ProjectComplexRexNodeCorrelateTransposeRule`, which takes the complex expression from the `Project` below the `Uncollect` rel node and creates a new `Project` on the left side of `Correlate`. The new `Project` contains the expressions from the left side of `Correlate` plus this complex expression.
For example, part of the plan before applying the rule:
```
LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}]): rowcount = 1.0, cumulative cost = {inf}, id = 100
  EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7
  Uncollect(subset=[rel#99:Subset#3.NONE.ANY([]).[]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 98
    LogicalProject(subset=[rel#97:Subset#2.NONE.ANY([]).[]], EXPR$0=[ITEM($cor0.orders, 'items')]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 96
      LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 0 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 8
```
Plan after applying the rule:
```
LogicalProject(**=[$0], orders=[$1], $complexRexNode0=[$3]): rowcount = 1.0, cumulative cost = {inf}, id = 116
  LogicalCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{2}]): rowcount = 1.0, cumulative cost = {inf}, id = 115
    LogicalProject(**=[$0], orders=[$1], $complexRexNode=[ITEM($1, 'items')]): rowcount = 100.0, cumulative cost = {100.0 rows, 300.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 112
      EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7
    Uncollect: rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 114
      LogicalProject($complexRexNode=[$cor1.$complexRexNode]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 113
        LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 0 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 8
```
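The difference between the two plans above can be mimicked with a small string-based model. This is only an illustrative sketch, not the rule's implementation: it uses no Calcite or Drill classes, and every name in it (`CorrelateTransposeSketch`, `Rewritten`, `rewrite`) is hypothetical. The output index of the unnested column is inferred from this one example plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Toy model of the rewrite, using strings instead of RelNode/RexNode objects.
 * Every name in this class is hypothetical and exists only for illustration.
 */
final class CorrelateTransposeSketch {

    /** The parts of the plan that change when the rewrite is applied. */
    static final class Rewritten {
        final List<String> leftProject; // new Project on the left side of Correlate
        final int requiredColumn;       // left column that Correlate passes to the right side
        final String rightProject;      // Project that remains below Uncollect
        final List<String> topProject;  // Project above Correlate that drops the helper column

        Rewritten(List<String> leftProject, int requiredColumn,
                  String rightProject, List<String> topProject) {
            this.leftProject = leftProject;
            this.requiredColumn = requiredColumn;
            this.rightProject = rightProject;
            this.topProject = topProject;
        }
    }

    static Rewritten rewrite(List<String> leftFields, String oldCorVar,
                             String complexExpr, String newCorVar) {
        // Turn correlated field accesses ($cor0.orders) into input references ($1).
        String pushed = complexExpr;
        for (int i = 0; i < leftFields.size(); i++) {
            pushed = pushed.replace(oldCorVar + "." + leftFields.get(i), "$" + i);
        }

        // Left side: the original fields plus the complex expression as an extra column.
        List<String> left = new ArrayList<>();
        List<String> top = new ArrayList<>();
        for (int i = 0; i < leftFields.size(); i++) {
            String ref = leftFields.get(i) + "=[$" + i + "]";
            left.add(ref);
            top.add(ref);
        }
        left.add("$complexRexNode=[" + pushed + "]");
        int requiredColumn = leftFields.size();

        // Right side: only a plain reference to the new column is left below Uncollect.
        String right = "$complexRexNode=[" + newCorVar + ".$complexRexNode]";

        // Top: the Uncollect output follows the helper column in the Correlate row.
        top.add("$complexRexNode0=[$" + (leftFields.size() + 1) + "]");
        return new Rewritten(left, requiredColumn, right, top);
    }

    public static void main(String[] args) {
        Rewritten r = rewrite(Arrays.asList("**", "orders"), "$cor0",
                "ITEM($cor0.orders, 'items')", "$cor1");
        System.out.println("top:      " + String.join(", ", r.topProject));
        System.out.println("required: " + r.requiredColumn);
        System.out.println("left:     " + String.join(", ", r.leftProject));
        System.out.println("right:    " + r.rightProject);
    }
}
```

With the inputs taken from the plan before the rule, the sketch yields the same projection lists and required column as the plan after the rule.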
- Changed the handling of `DrillCompoundIdentifier` inside `unnest` so that it is converted to an `ITEM` call, which avoids a "column not found" error when a nested column is used.
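To make the identifier conversion concrete, here is a minimal sketch of turning a dotted name into nested `ITEM` calls. It is not Drill's code: `CompoundIdentifierSketch` and `toItemCall` are hypothetical names, and the sketch assumes the first two name parts are the table alias and the top-level column, which the description above does not state.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; this is not part of Drill. */
final class CompoundIdentifierSketch {

    /**
     * Converts name parts such as [t, orders, items] into ITEM(t.orders, 'items').
     * Assumption: parts 0 and 1 are the table alias and the top-level column;
     * every further part is a nested field accessed through ITEM.
     */
    static String toItemCall(List<String> names) {
        if (names.size() < 2) {
            throw new IllegalArgumentException("expected at least alias and column");
        }
        String expr = names.get(0) + "." + names.get(1);
        for (int i = 2; i < names.size(); i++) {
            // Wrap the expression built so far in one more ITEM call per nested field.
            expr = "ITEM(" + expr + ", '" + names.get(i) + "')";
        }
        return expr;
    }

    public static void main(String[] args) {
        System.out.println(toItemCall(Arrays.asList("t", "orders", "items")));
    }
}
```

The result has the same shape as the `ITEM($cor0.orders, 'items')` expression in the plan before the rule is applied.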