Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/02/06 12:48:41 UTC

[jira] [Comment Edited] (DRILL-5040) Interrupted CTAS should not succeed & should not create physical file on disk

    [ https://issues.apache.org/jira/browse/DRILL-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853926#comment-15853926 ] 

Khurram Faraaz edited comment on DRILL-5040 at 2/6/17 12:48 PM:
----------------------------------------------------------------

Issue a CTTAS over ~26M JSON records and press Ctrl-C in sqlline while the CTTAS is executing. A subsequent select on the temporary table, whose physical file is left on disk, fails with the RuntimeException below.

{noformat}
0: jdbc:drill:schema=dfs.tmp> CREATE TEMPORARY TABLE temp_tbl_1 AS select * from `twoKeyJsn.json`;
No rows affected (1.607 seconds)
0: jdbc:drill:schema=dfs.tmp> select count(*) from temp_tbl_1;
Error: SYSTEM ERROR: RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)


[Error Id: ea0d1086-13d1-416b-a1a7-165050ac209f on centos-01.qa.lab:31010] (state=,code=0)
{noformat}

Stack trace from drillbit.log:

{noformat}
2017-02-06 12:39:07,056 [27679194-d169-4e11-263b-7aeae80e49cc:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)


[Error Id: ea0d1086-13d1-416b-a1a7-165050ac209f on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)


[Error Id: ea0d1086-13d1-416b-a1a7-165050ac209f on centos-01.qa.lab:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:825) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:945) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#35532:LogicalProject.NONE.ANY([]).[](input=rel#35531:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), rel#35516:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[dfs.tmp, bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e])]
        ... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#35532:LogicalProject.NONE.ANY([]).[](input=rel#35531:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), rel#35516:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[dfs.tmp, bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e])]
        at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251) ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
        at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
        at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:404) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:290) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:117) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:91) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1018) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        ... 3 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: java.io.IOException: java.lang.RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)
        at org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:95) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
        ... 14 common frames omitted
Caused by: java.io.IOException: java.lang.RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)
        at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.convertToIOException(Metadata.java:325) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.TimedRunnable.getValue(TimedRunnable.java:79) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:173) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:279) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:258) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:243) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:119) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:733) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:230) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:190) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:169) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:67) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:145) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:103) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:63) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        ... 15 common frames omitted
Caused by: java.lang.RuntimeException: maprfs:///tmp/bfe2dad0-921a-4f06-9799-494ab8a7246d/851a124c-80a1-45e3-9496-d2562007911e/0_0_0.parquet is not a Parquet file (too small)
        at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:433) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
        at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:412) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
        at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:382) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.access$000(Metadata.java:83) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:317) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:305) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:56) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:122) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
        ... 28 common frames omitted
{noformat}
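
The innermost frames above show ParquetFileReader.readFooter rejecting the file as "too small". A Parquet file starts and ends with the 4-byte magic "PAR1", with a 4-byte footer length stored just before the trailing magic, so any file shorter than 12 bytes cannot be valid. The following is a minimal sketch of that kind of check, not the parquet-hadoop implementation itself. The class and method names (ParquetSanityCheck, looksLikeParquet) are hypothetical, and it reads from the local filesystem through java.nio rather than from maprfs through the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Drill or parquet-hadoop.
final class ParquetSanityCheck {
    // Parquet files begin and end with the 4-byte magic "PAR1".
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    // The footer length is a 4-byte integer stored just before the trailing magic.
    private static final int FOOTER_LENGTH_SIZE = 4;
    // Smallest size a structurally valid file can have: magic + footer length + magic.
    static final int MIN_SIZE = MAGIC.length + FOOTER_LENGTH_SIZE + MAGIC.length;

    /** Returns true if the file is large enough and carries the magic at both ends. */
    static boolean looksLikeParquet(Path file) throws IOException {
        long size = Files.size(file);
        if (size < MIN_SIZE) {
            return false; // corresponds to the "too small" case in the trace
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] head = new byte[MAGIC.length];
            byte[] tail = new byte[MAGIC.length];
            raf.readFully(head);
            raf.seek(size - MAGIC.length);
            raf.readFully(tail);
            return Arrays.equals(head, MAGIC) && Arrays.equals(tail, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a writer that was interrupted after flushing only the leading magic.
        Path truncated = Files.createTempFile("truncated", ".parquet");
        Files.write(truncated, MAGIC);
        System.out.println("looks like Parquet: " + looksLikeParquet(truncated));
        Files.delete(truncated);
    }
}
```

A file that passes this check can still have a corrupt footer, so it only rules out the truncated-file case seen here.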


> Interrupted CTAS should not succeed & should not create physical file on disk
> -----------------------------------------------------------------------------
>
>                 Key: DRILL-5040
>                 URL: https://issues.apache.org/jira/browse/DRILL-5040
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.9.0
>            Reporter: Khurram Faraaz
>            Assignee: Arina Ielchiieva
>             Fix For: 1.10.0
>
>
> We should not allow CTAS to succeed (i.e., create a physical file on disk) when it is interrupted (via Ctrl-C).
> Drill 1.9.0
> git commit ID : db30854
> Consider the below CTAS that was interrupted using Ctrl-C
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> create table temp_t1 as select * from `twoKeyJsn.json`; 
> [ issue Ctrl-C while the above CTAS is running ]
> No rows affected (7.694 seconds)
> {noformat}
> I verified that a physical file was created on disk, even though the above CTAS was canceled:
> {noformat}
> [root@centos-01 ~]# hadoop fs -ls /tmp/temp_t1*
> -rwxr-xr-x   3 root root   36713198 2016-11-14 10:51 /tmp/temp_t1/0_0_0.parquet
> {noformat}
> We are able to run a select on the above CTAS table even though the CTAS was canceled:
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from temp_t1;
> +----------+
> |  EXPR$0  |
> +----------+
> | 3747840  |
> +----------+
> 1 row selected (0.183 seconds)
> {noformat}
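
One common way to get the behavior requested in the issue (an interrupted CTAS leaves nothing queryable on disk) is to write into a staging directory and publish it under the table name only after the writer finishes. The sketch below illustrates that general pattern under stated assumptions; it is not the fix applied in Drill. The names StagedTableWrite, TableWriter, and createTable are hypothetical, and it uses java.nio on a local filesystem instead of the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration of "stage, then publish"; not Drill code.
final class StagedTableWrite {

    /** Hypothetical writer callback; it may throw to signal cancellation or failure. */
    interface TableWriter {
        void writeInto(Path dir) throws Exception;
    }

    /** Writes into a staging directory and renames it into place only on success. */
    static boolean createTable(Path finalDir, TableWriter writer) throws IOException {
        Path parent = finalDir.toAbsolutePath().getParent();
        Path staging = Files.createTempDirectory(parent, ".staging-");
        try {
            writer.writeInto(staging);
            // Publish the table only once all files are completely written.
            Files.move(staging, finalDir, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (Exception e) {
            // Cancelled or failed: remove partial output so no table appears.
            deleteRecursively(staging);
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
            return false;
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> entries;
        try (Stream<Path> paths = Files.walk(root)) {
            entries = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : entries) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("ctas-demo");
        Path table = base.resolve("temp_t1");
        // Simulate a cancellation after a partial file has been written.
        boolean created = createTable(table, dir -> {
            Files.write(dir.resolve("0_0_0.parquet"), new byte[] {'P', 'A', 'R', '1'});
            throw new InterruptedException("simulated Ctrl-C");
        });
        System.out.println("created=" + created + ", exists=" + Files.exists(table));
        Thread.interrupted(); // clear the simulated interrupt before exiting
        deleteRecursively(base);
    }
}
```

The rename keeps readers from ever seeing a half-written table directory; whether a rename is atomic depends on the underlying filesystem, so that property is an assumption here.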


