You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/04/13 23:37:12 UTC

[jira] [Commented] (DRILL-2727) CTAS select * from CSV file results in Exception

    [ https://issues.apache.org/jira/browse/DRILL-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493145#comment-14493145 ] 

Khurram Faraaz commented on DRILL-2727:
---------------------------------------

Adding one more observation here, when we set `store.format`='json' or to 'parquet' CTAS writes the results to a new file. However, when I set `store.format` to 'csv' we see an Exception.

{code}
0: jdbc:drill:> alter system set `store.format`='json';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.064 seconds)
0: jdbc:drill:> create table crtTblFrmGzFile03 as select * from `airports.csv.gz`;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 46315                     |
+------------+---------------------------+
1 row selected (0.771 seconds)
0: jdbc:drill:> alter system set `store.format`='csv';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.066 seconds)
0: jdbc:drill:> create table crtTblFrmGzFile04 as select * from `airports.csv.gz`;
Query failed: Query stopped., Repeated types are not supported. [ 7d38def4-e99b-4ebb-aeee-89ce41c48819 on centos-01.qa.lab:31010 ]

Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:> alter system set `store.format`='parquet';
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | store.format updated. |
+------------+------------+
1 row selected (0.061 seconds)
0: jdbc:drill:> create table crtTblFrmGzFile06 as select * from `airports.csv`;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 46315                     |
+------------+---------------------------+
1 row selected (1.59 seconds)
0: jdbc:drill:> create table crtTblFrmGzFile07 as select * from `airports.csv.gz`;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 46315                     |
+------------+---------------------------+
1 row selected (0.953 seconds)
{code}

> CTAS select * from CSV file results in Exception
> ------------------------------------------------
>
>                 Key: DRILL-2727
>                 URL: https://issues.apache.org/jira/browse/DRILL-2727
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Other
>    Affects Versions: 0.9.0
>         Environment: 4 node cluster on CentOS
> | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT 
>            Reporter: Khurram Faraaz
>            Assignee: Jacques Nadeau
>
> CREATE TABLE csv_tbl as SELECT * FROM `input.csv`
> results in Exception, with message Repeated types are not supported
> {code}
> 0: jdbc:drill:> create table newCSV_Int_tbl12 as select * from `int_f.csv`;
> Query failed: Query stopped., Repeated types are not supported. [ 84ab8d15-6aac-42bb-908a-d663e7453abf on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> 0: jdbc:drill:> create table newCSV_Int_tbl12 as select * from `int_f.csv`;
> Query failed: Query stopped., Repeated types are not supported. [ 84ab8d15-6aac-42bb-908a-d663e7453abf on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> 0: jdbc:drill:> create table newCSV_Int_tbl13 as select columns[0] from `int_f.csv`;
> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 0_0        | 12                        |
> +------------+---------------------------+
> 1 row selected (0.146 seconds)
> 0: jdbc:drill:> select * from newCSV_Int_tbl13;
> +------------+
> |  columns   |
> +------------+
> | ["EXPR$0"] |
> | ["1"]      |
> | ["0"]      |
> | ["-1"]     |
> | ["65535"]  |
> | ["1234567"] |
> | ["1000000"] |
> | ["101010"] |
> | ["11111"]  |
> | ["100"]    |
> | ["13"]     |
> | ["19"]     |
> | ["17"]     |
> +------------+
> 13 rows selected (0.117 seconds)
> Stack trace from drillbit.log
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., Repeated types are not supported. [ 0b36b8c1-3d1e-40a4-9937-aade9ead445b on centos-02.qa.lab:31010 ]
> [ 0b36b8c1-3d1e-40a4-9937-aade9ead445b on centos-02.qa.lab:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:165) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-04-09 00:37:48,977 [2ada3623-63c8-3c52-31cd-45c219efd25f:frag:0:0] WARN  o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing fragment
> java.lang.RuntimeException: Error closing fragment context.
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:224) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:166) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> Caused by: java.lang.UnsupportedOperationException: Repeated types are not supported.
>         at org.apache.drill.exec.store.StringOutputRecordWriter$RepeatedVarCharStringFieldConverter.writeField(StringOutputRecordWriter.java:1556) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.store.EventBasedRecordWriter.write(EventBasedRecordWriter.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:116) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         ... 4 common frames omitted
> 2015-04-09 00:37:48,978 [2ada3623-63c8-3c52-31cd-45c219efd25f:frag:0:0] ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received failure -- Fragment: 0:0
> java.lang.RuntimeException: Error closing fragment context.
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:224) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:166) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> Caused by: java.lang.UnsupportedOperationException: Repeated types are not supported.
>         at org.apache.drill.exec.store.StringOutputRecordWriter$RepeatedVarCharStringFieldConverter.writeField(StringOutputRecordWriter.java:1556) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.store.EventBasedRecordWriter.write(EventBasedRecordWriter.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:116) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         ... 4 common frames omitted
> data used in the CSV file was
> [root@centos-01 csv_dir]# cat int_f.csv 
> 1
> 0
> -1
> 65535
> 1234567
> 1000000
> 101010
> 11111
> 100
> 13
> 19
> 17
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)