Posted to issues@tajo.apache.org by "Jaehwa Jung (JIRA)" <ji...@apache.org> on 2016/02/02 00:56:39 UTC

[jira] [Commented] (TAJO-1905) Insert clause to partitioned table fails on S3

    [ https://issues.apache.org/jira/browse/TAJO-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127280#comment-15127280 ] 

Jaehwa Jung commented on TAJO-1905:
-----------------------------------

If we commit TAJO-2063 and TAJO-2069, this problem will not occur. I confirmed that the following queries for creating a partitioned table ran successfully on AWS:
* Dataset: TPC-H 1G
* EC2-type: c3.xlarge
* Queries
{code:sql}
-- Create the partitioned table and populate it in one step (CREATE TABLE ... AS SELECT).
CREATE TABLE LINEITEM_P (l_orderkey INT8, l_partkey INT8, l_suppkey INT8, l_linenumber INT8,
l_quantity FLOAT8, l_extendedprice FLOAT8, l_discount FLOAT8, l_tax FLOAT8, l_returnflag TEXT, l_linestatus TEXT,
 l_commitdate TEXT, l_receiptdate TEXT, l_shipinstruct TEXT, l_shipmode TEXT, l_comment TEXT)
  USING TEXT WITH ('text.delimiter'='|')
  PARTITION BY COLUMN(l_shipdate TEXT) AS
SELECT L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT, L_SHIPDATE FROM LINEITEM;

-- Create the partitioned table first, then populate it with INSERT OVERWRITE.
CREATE TABLE LINEITEM_P (l_orderkey INT8, l_partkey INT8, l_suppkey INT8, l_linenumber INT8,
l_quantity FLOAT8, l_extendedprice FLOAT8, l_discount FLOAT8, l_tax FLOAT8, l_returnflag TEXT, l_linestatus TEXT,
 l_commitdate DATE, l_receiptdate DATE, l_shipinstruct TEXT, l_shipmode TEXT, l_comment TEXT)
  USING TEXT WITH ('text.delimiter'='|')
  PARTITION BY COLUMN(l_shipdate TEXT);

INSERT OVERWRITE INTO LINEITEM_P
SELECT L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT, L_SHIPDATE FROM LINEITEM;
{code}
* The resulting number of partitions is as expected.
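
One way to sanity-check the partition count is to compare the number of distinct partition-key values in the source table and in the partitioned table. This is only a sketch and was not part of the test run above; it assumes that each distinct non-null l_shipdate value maps to exactly one partition.
{code:sql}
-- Assumption: one partition per distinct non-null l_shipdate value.
-- Distinct partition-key values in the source table:
SELECT COUNT(DISTINCT l_shipdate) FROM LINEITEM;

-- Distinct partition-key values visible through the partitioned table:
SELECT COUNT(DISTINCT l_shipdate) FROM LINEITEM_P;
{code}
If the load succeeded, the two counts should match.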


> Insert clause to partitioned table fails on S3
> ----------------------------------------------
>
>                 Key: TAJO-1905
>                 URL: https://issues.apache.org/jira/browse/TAJO-1905
>             Project: Tajo
>          Issue Type: Sub-task
>          Components: QueryMaster, S3
>            Reporter: Jinho Kim
>            Assignee: Jaehwa Jung
>             Fix For: 0.12.0
>
>
> Here is the error log
> {noformat}
> 2015-10-02 18:54:40,399 ERROR org.apache.hadoop.fs.s3a.S3AFileSystem: rename: src not found s3a://bucket/tpch-1g-p/lineitem/.staging/q_1443779192380_0001/RESULT/l_shipdate=1996-01-30
> 2015-10-02 18:54:51,357 ERROR org.apache.hadoop.fs.s3a.S3AFileSystem: rename: src not found s3a://bucket/tpch-1g-p/lineitem/.staging/q_1443779192380_0001/RESULT/l_shipdate=1993-11-09
> 2015-10-02 18:55:03,955 ERROR org.apache.tajo.querymaster.Query: No such file or directory: s3a://bucket/lineitem/l_shipdate=1994-02-02
> java.io.FileNotFoundException: No such file or directory: s3a://bucket/tpch-1g-p/lineitem/l_shipdate=1994-02-02
> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:996)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
> at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1467)
> at org.apache.tajo.querymaster.Query$QueryCompletedTransition.getPartitionsWithContentsSummary(Query.java:550)
> at org.apache.tajo.querymaster.Query$QueryCompletedTransition.finalizeQuery(Query.java:512)
> at org.apache.tajo.querymaster.Query$QueryCompletedTransition.transition(Query.java:446)
> at org.apache.tajo.querymaster.Query$QueryCompletedTransition.transition(Query.java:435)
> at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at org.apache.tajo.querymaster.Query.handle(Query.java:874)
> at org.apache.tajo.querymaster.Query.handle(Query.java:63)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
> at java.lang.Thread.run(Thread.java:745)
> 2015-10-02 18:55:03,958 INFO org.apache.tajo.querymaster.Query: q_1443779192380_0001 Query Transitioned from QUERY_RUNNING to QUERY_ERROR
> 2015-10-02 18:55:03,958 INFO org.apache.tajo.querymaster.Query: Processing q_1443779192380_0001 of type DIAGNOSTIC_UPDATE
> 2015-10-02 18:55:03,958 INFO org.apache.tajo.querymaster.QueryMasterTask: Query completion notified from q_1443779192380_0001 final state: QUERY_ERROR
> 2015-10-02 18:55:03,960 INFO org.apache.tajo.querymaster.QueryMasterTask: Stopping QueryMasterTask:q_1443779192380_0001
> 2015-10-02 18:55:03,960 INFO org.apache.tajo.querymaster.QueryMasterTask: Cleanup resources of all workers. Query: q_1443779192380_0001, workers: 1
> 2015-10-02 18:55:03,962 INFO org.apache.tajo.querymaster.QueryMasterTask: Stopped QueryMasterTask:q_1443779192380_0001
> {noformat}


