Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/03/27 20:53:52 UTC

[jira] [Commented] (DRILL-2608) Union all query fails when json.all_text_mode=false

    [ https://issues.apache.org/jira/browse/DRILL-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384497#comment-14384497 ] 

Khurram Faraaz commented on DRILL-2608:
---------------------------------------

Physical plan for the failing query:

00-00    Screen : rowType = RecordType(ANY key): rowcount = 343.0, cumulative cost = {1122.3 rows, 1122.3 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12812
00-01      UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 343.0, cumulative cost = {1088.0 rows, 1088.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12811
00-03        UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 138.0, cumulative cost = {540.0 rows, 540.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12809
00-05          UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 105.0, cumulative cost = {369.0 rows, 369.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12807
00-07            UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 101.0, cumulative cost = {260.0 rows, 260.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12805
00-09              UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 58.0, cumulative cost = {116.0 rows, 116.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12803
00-11                Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/charData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/charData.json]]]) : rowType = RecordType(ANY key): rowcount = 18.0, cumulative cost = {18.0 rows, 18.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12801
00-10                Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/dateData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/dateData.json]]]) : rowType = RecordType(ANY key): rowcount = 40.0, cumulative cost = {40.0 rows, 40.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12802
00-08              Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/doubleData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/doubleData.json]]]) : rowType = RecordType(ANY key): rowcount = 43.0, cumulative cost = {43.0 rows, 43.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12804
00-06            Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/intData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/intData.json]]]) : rowType = RecordType(ANY key): rowcount = 4.0, cumulative cost = {4.0 rows, 4.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12806
00-04          Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/timeStmpData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/timeStmpData.json]]]) : rowType = RecordType(ANY key): rowcount = 33.0, cumulative cost = {33.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12808
00-02        Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/vrChrData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/vrChrData.json]]]) : rowType = RecordType(ANY key): rowcount = 205.0, cumulative cost = {205.0 rows, 205.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12810

> Union all query fails when json.all_text_mode=false
> ---------------------------------------------------
>
>                 Key: DRILL-2608
>                 URL: https://issues.apache.org/jira/browse/DRILL-2608
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT | Unknown     | 26.03.2015 @ 16:53:21 EDT |
>            Reporter: Khurram Faraaz
>            Assignee: Sean Hsuan-Yi Chu
>
> A union all query over JSON data files fails when store.json.all_text_mode is set to false; the same query returns correct results when store.json.all_text_mode is set to true. Each JSON data file contained only one type of object, {"key":<value>}, and within each file all values were of the same datatype. The test was executed on a 4-node cluster.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json` union all select key from `doubleData.json` union all select key from `intData.json` union all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ f1f81073-161c-4f24-89e5-37379413b01b on centos-04.qa.lab:31010 ]
> [ f1f81073-161c-4f24-89e5-37379413b01b on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> Then I ran alter session set `store.json.all_text_mode`=true;
> After setting store.json.all_text_mode to true, the union all query returned correct results.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json` union all select key from `doubleData.json` union all select key from `intData.json` union all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> ...
> +------------+
> 7,194 rows selected (0.462 seconds)
> {code}
> Resetting it to false gives the same exception:
> {code}
> 0: jdbc:drill:> alter session set `store.json.all_text_mode`=false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | store.json.all_text_mode updated. |
> +------------+------------+
> 1 row selected (0.049 seconds)
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json` union all select key from `doubleData.json` union all select key from `intData.json` union all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-03-27 18:30:56,620 [2aea5e1e-88b9-3e4e-07b5-d7e46b29756f:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman - Error b9cb90bd-7d89-4061-8595-4c5ad983f3f3: RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:182) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}
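The issue description above says each file holds only {"key":<value>} objects of a single datatype, but the data itself is not attached. A minimal sketch for generating comparable files and rebuilding the same query follows. The file names come from the report; the sample values, the one-object-per-line layout, and the helper names write_repro_files and build_union_all_query are assumptions made for illustration, not the reporter's actual test data or tooling.

```python
import json
import os
import tempfile

# Hypothetical sample values: the real test data is not shown in the report.
# JSON has no date or timestamp type, so those files hold strings here.
SAMPLE_VALUES = {
    "charData.json": ["a", "b", "c"],
    "dateData.json": ["2015-03-26", "2015-03-27"],
    "doubleData.json": [1.5, 2.25, 3.75],
    "intData.json": [1, 2, 3],
    "timeStmpData.json": ["2015-03-27 18:30:56", "2015-03-27 20:53:52"],
    "vrChrData.json": ["abc def", "some longer text"],
}


def write_repro_files(directory):
    """Write one file per entry, one {"key": value} object per line."""
    paths = []
    for name, values in SAMPLE_VALUES.items():
        path = os.path.join(directory, name)
        with open(path, "w") as handle:
            for value in values:
                handle.write(json.dumps({"key": value}) + "\n")
        paths.append(path)
    return paths


def build_union_all_query(file_names):
    """Chain one 'select key' per file with union all, as in the report."""
    selects = ["select key from `%s`" % name for name in file_names]
    return " union all ".join(selects) + ";"


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    for written in write_repro_files(out_dir):
        print(written)
    print(build_union_all_query(list(SAMPLE_VALUES)))
```

The generated files would still need to be copied to the workspace the query runs against (the plan shows /tmp on maprfs). Whether these particular values trigger the same failure is not confirmed here.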


