You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "godfrey he (JIRA)" <ji...@apache.org> on 2018/05/08 07:59:00 UTC
[jira] [Comment Edited] (CALCITE-2298) Correlated SubQuery in
Having generates error plan when correlated fields does not exist
[ https://issues.apache.org/jira/browse/CALCITE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467037#comment-16467037 ]
godfrey he edited comment on CALCITE-2298 at 5/8/18 7:58 AM:
-------------------------------------------------------------
Thanks for the suggestions [~julianhyde].
yes, there are a lot of code to handle the following cases:
# correlated fields in SubQuery are in parent's grouping fields. (we can link the scope for HAVING to the aggregating scope as your suggestion, and call _checkAggregateExpr_ to validate the expr)
# correlated fields in SubQuery are in an aggregator. detects this pattern, no validation for this case)
# nested SubQuery, e.g.
{code:java}
check("select ename from sales.emp group by ename, deptno\n" + "having max(sal) <= \n" + " (select max(sal) from sales.emp_b e1 group by e1.ename\n" + " having min(sal) < \n" + " (select min(sal) from sales.emp_b e2 where emp.deptno = e2.deptno group by ename))");
{code}
(we need find the correct aggregating scope which outputs the correlated fields. for the above case, the relationship of scopes is:
{noformat}
EmptyScope
|__ CatalogScope (table emp)
|__ SelectScope
|__ AggregatingSelectScope (group by ename, deptno)
|__ SelectScope (SubQuery1, emp_b e1)
|__ AggregatingSelectScope (group by e1.name)
|__ SelectScope (SubQuery2, emp_b e2)
{noformat}
_emp.deptno = e2.deptno_ is in _SelectScope (SubQuery2, emp_b e2)_ and _emp.deptno_ is from _AggregatingSelectScope (group by ename, deptno),_ I don't know how to build the link for this case. )
4. grouping expr and correlated expr are SqlCall, e.g.
{code:java}
check("select deptno + 1, sum(sal) from emp group by deptno + 1\n"
+ "having max(sal) > \n"
+ " (select max(sal) from emp e where emp.deptno + 1 = e.deptno group by ename)"); {code}
5. correlated SubQuery is in an aggregator(detects this pattern, no validation for this case)
{code:java}
check("SELECT SUM(\n"
+ " 10 + (select char_length(name) from dept\n"
+ " where dept.deptno = emp.empno))\n"
+ "FROM emp");
{code}
Looking forward to your advice, many thanks!
was (Author: godfreyhe):
Thanks for the suggestions [~julianhyde].
yes, there are a lot of code to handle the following cases:
# correlated fields in SubQuery are in parent's grouping fields. (we can link the scope for HAVING to the aggregating scope as your suggestion, and call _checkAggregateExpr_ to validate the expr)
# correlated fields in SubQuery are in an aggregator. detects this pattern, no validation for this case)
# nested SubQuery, e.g.
{code:java}
check("select ename from sales.emp group by ename, deptno\n" + "having max(sal) <= \n" + " (select max(sal) from sales.emp_b e1 group by e1.ename\n" + " having min(sal) < \n" + " (select min(sal) from sales.emp_b e2 where emp.deptno = e2.deptno group by ename))");
{code}
(we need find the correct aggregating scope which outputs the correlated fields. for the above case, the relationship of scopes is:
{noformat}
EmptyScope
|__ CatalogScope (table emp)
|__ SelectScope
|__ AggregatingSelectScope (group by ename, deptno)
|__ SelectScope (SubQuery1, emp_b e1)
|__ AggregatingSelectScope (group by e1.name)
|__ SelectScope (SubQuery2, emp_b e2)
{noformat}
_emp.deptno = e2.deptno_ is in _SelectScope (SubQuery2, emp_b e2)_ and _emp.deptno_ is from _AggregatingSelectScope (group by ename, deptno),_ I don't know how to build the link for this case. __ )
4. grouping expr and correlated expr are SqlCall, e.g.
{code:java}
check("select deptno + 1, sum(sal) from emp group by deptno + 1\n"
+ "having max(sal) > \n"
+ " (select max(sal) from emp e where emp.deptno + 1 = e.deptno group by ename)"); {code}
5. correlated SubQuery is in an aggregator(detects this pattern, no validation for this case)
{code:java}
check("SELECT SUM(\n"
+ " 10 + (select char_length(name) from dept\n"
+ " where dept.deptno = emp.empno))\n"
+ "FROM emp");
{code}
Looking forward to your advice, many thanks!
> Correlated SubQuery in Having generates error plan when correlated fields does not exist
> ----------------------------------------------------------------------------------------
>
> Key: CALCITE-2298
> URL: https://issues.apache.org/jira/browse/CALCITE-2298
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.15.0
> Reporter: godfrey he
> Assignee: Julian Hyde
> Priority: Major
>
> {code}
> @Test public void testDecorrelateWithUnresolvedField() throws Exception {
> final String sql = "select ename\n"
> + "from sales.emp\n"
> + "group by ename, deptno\n"
> + "having max(sal) <= \n"
> + " (select max(sal) from sales.emp_b where emp.job = emp_b.job group by ename)";
> checkSubQuery(sql).withLateDecorrelation(true).check();
> }
> {code}
> for now, we will get the following plan:
> {code}
> LogicalProject(ENAME=[$0])
> LogicalFilter(condition=[<=($2, $SCALAR_QUERY({
> LogicalProject(EXPR$0=[$1])
> LogicalAggregate(group=[{0}], EXPR$0=[MAX($1)])
> LogicalProject(ENAME=[$1], SAL=[$5])
> LogicalFilter(condition=[=($cor0.JOB, $2)])
> LogicalTableScan(table=[[CATALOG, SALES, EMP_B]])
> }))])
> LogicalAggregate(group=[{0, 1}], agg#0=[MAX($2)])
> LogicalProject(ENAME=[$1], DEPTNO=[$7], SAL=[$5])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> However emp.job is not the grouping fields or the aggCall result.
> the expected result is throwing an Exception, like:
> {code}
> org.apache.calcite.runtime.CalciteContextException: From line 5, column 47 to line 5, column 50: Column 'JOB' not found in 'LogicalAggregate(group=[{0, 1}], agg#0=[MAX($2)])'
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)