Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/01/06 20:07:34 UTC

[jira] [Updated] (DRILL-1936) Throw an error if subquery in the where clause does not return scalar result

     [ https://issues.apache.org/jira/browse/DRILL-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman updated DRILL-1936:
------------------------------------
    Component/s: Query Planning & Optimization
    Description: 
{code}
#Fri Jan 02 21:20:47 EST 2015
git.commit.id.abbrev=b491cdb
{code}

When the result of a subquery is not scalar (regardless of whether the subquery is correlated), we should throw an error, either at planning time or at runtime once the cardinality of the result set is known.
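
The requested check can be sketched outside of Drill. The following is illustrative Python, not Drill code: `single_value` and `NonScalarSubqueryError` are hypothetical names, and the sketch assumes the usual SQL rule that a scalar subquery yields NULL for an empty result and is an error when it yields more than one row.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


class NonScalarSubqueryError(Exception):
    """Hypothetical error for a scalar subquery that yields several rows."""


def single_value(rows: Iterable[T]) -> Optional[T]:
    """Return the only value in rows, None if rows is empty, else raise."""
    iterator = iter(rows)
    sentinel = object()
    first = next(iterator, sentinel)
    if first is sentinel:
        # Assumption: an empty subquery result is treated as SQL NULL.
        return None
    if next(iterator, sentinel) is not sentinel:
        # A second row exists, so the result is not scalar.
        raise NonScalarSubqueryError(
            "scalar subquery returned more than one row"
        )
    return first
```

The check only needs to look at the first two rows, so it could run as soon as a second row arrives rather than after the whole result set has been read.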

Currently, queries either fail to plan:

{code}
0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet`  where n_name = ( select r_name from cp.`tpch/region.parquet` where n_regionkey = r_regionkey);
Query failed: Query failed: Unexpected exception during fragment initialization: Node [rel#24659:Subset#7.LOGICAL.ANY([]).[]] could not be implemented; planner state:

Root: rel#24659:Subset#7.LOGICAL.ANY([]).[]
Original rel:
AbstractConverter(subset=[rel#24659:Subset#7.LOGICAL.ANY([]).[]], convention=[LOGICAL], DrillDistributionTraitDef=[ANY([])], sort=[[]]): rowcount = 1.7976931348623157E308, cumulative cost = {inf}, id = 24660
  ProjectRel(subset=[rel#24658:Subset#7.NONE.ANY([]).[]], *=[$0]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24657
    FilterRel(subset=[rel#24656:Subset#6.NONE.ANY([]).[]], condition=[=($1, $2)]): rowcount = 2.6965397022934733E307, cumulative cost = {2.6965397022934733E307 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24655
      JoinRel(subset=[rel#24654:Subset#5.NONE.ANY([]).[]], condition=[true], joinType=[left]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24653
        EnumerableTableAccessRel(subset=[rel#24645:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24623
        AggregateRel(subset=[rel#24652:Subset#4.NONE.ANY([]).[]], group=[{}], agg#0=[SINGLE_VALUE($0)]): rowcount = 1.7976931348623158E307, cumulative cost = {1.7976931348623158E307 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24651
          ProjectRel(subset=[rel#24650:Subset#3.NONE.ANY([]).[]], r_name=[$3]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24649
            FilterRel(subset=[rel#24648:Subset#2.NONE.ANY([]).[]], condition=[=($1, $2)]): rowcount = 15.0, cumulative cost = {15.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24647
              EnumerableTableAccessRel(subset=[rel#24646:Subset#1.ENUMERABLE.ANY([]).[]], table=[[cp, tpch/region.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24624
{code}

or fail at runtime with an error message that is difficult to decipher:

{code}
0: jdbc:drill:schema=dfs> select  a.emp_num,
. . . . . . . . . . . . >         a.emp_name
. . . . . . . . . . . . > from    `emp1.json` as a
. . . . . . . . . . . . > where   a.salary > (
. . . . . . . . . . . . >                 select  b.salary
. . . . . . . . . . . . >                 from    `emp1.json` b
. . . . . . . . . . . . >                 where   b.dept = a.dept)
. . . . . . . . . . . . > order by 1;
Query failed: Query failed: Failure while running fragment., Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema. [ d800ab5d-aa5b-4371-8cb6-819dccca40aa on atsqa4-134.qa.lab:31010 ]
[ d800ab5d-aa5b-4371-8cb6-819dccca40aa on atsqa4-134.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)
{code}
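
Until such an error exists, one way to confirm that a correlated subquery like the one above is non-scalar is to count rows per correlation key. The sketch below uses Python's built-in SQLite module rather than Drill, and the rows in `emp1` are hypothetical sample data invented for illustration; only the column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp1 (emp_num INTEGER, emp_name TEXT, dept TEXT, salary INTEGER)"
)
# Hypothetical rows: two employees share the 'eng' department.
conn.executemany(
    "INSERT INTO emp1 VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "eng", 100),
        (2, "bob", "eng", 120),
        (3, "carol", "ops", 90),
    ],
)


def non_scalar_keys(connection):
    """Return (dept, row_count) pairs where the subquery would yield more than one row."""
    cursor = connection.execute(
        "SELECT dept, COUNT(*) FROM emp1 "
        "GROUP BY dept HAVING COUNT(*) > 1 ORDER BY dept"
    )
    return cursor.fetchall()


offenders = non_scalar_keys(conn)
print(offenders)
```

Any department listed in `offenders` makes `select b.salary ... where b.dept = a.dept` return several rows for a single outer row, which is the case this issue asks Drill to report clearly.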



> Throw an error if subquery in the where clause does not return scalar result
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-1936
>                 URL: https://issues.apache.org/jira/browse/DRILL-1936
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Victoria Markman


