Posted to issues@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2015/08/06 03:34:04 UTC

[jira] [Updated] (HIVE-7476) CTAS does not work properly for s3

     [ https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-7476:
----------------------------
    Attachment: HIVE-7476.patch

The code has changed quite a bit since this issue was filed, and it no longer fails at the originally reported spot, although CTAS into s3 still fails.

I submitted an updated patch that fixes the error where it happens now: in MoveTask, during the last stage of CTAS.  Namely, MoveTask uses the wrong FileSystem object (the destination's) to run operations on the source path, which breaks when the source and destination are on different filesystems, as they are for CTAS into s3.
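
For illustration, here is a minimal sketch of the pattern the fix follows (not the actual MoveTask code; the class and method names below are hypothetical): each path's operations go through the FileSystem resolved from that path, and a move between different filesystems falls back to copy-and-delete.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class MoveSketch {
      // Hypothetical helper: move data from a staging directory to the final
      // table location.
      static void move(Path src, Path dst, Configuration conf) throws Exception {
        // Resolve each FileSystem from its own path; running source operations
        // through the destination's FileSystem is exactly the mistake being fixed.
        FileSystem srcFs = src.getFileSystem(conf);
        FileSystem dstFs = dst.getFileSystem(conf);

        if (srcFs.getUri().equals(dstFs.getUri())) {
          // Same filesystem: a rename is enough.
          if (!srcFs.rename(src, dst)) {
            throw new RuntimeException("rename failed: " + src + " -> " + dst);
          }
        } else {
          // Different filesystems (e.g. an hdfs staging dir and an s3 table
          // location): copy across and delete the source.
          FileUtil.copy(srcFs, src, dstFs, dst, true /* deleteSource */, conf);
        }
      }
    }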

> CTAS does not work properly for s3
> ----------------------------------
>
>                 Key: HIVE-7476
>                 URL: https://issues.apache.org/jira/browse/HIVE-7476
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.1
>         Environment: Linux
>            Reporter: Jian Fang
>         Attachments: HIVE-7476.1.patch, HIVE-7476.patch
>
>
> When we use CTAS to create a new table in s3, the table location is not set correctly. As a result, the data from the existing table cannot be inserted into the newly created table.
> We can use the following example to reproduce this issue.
> set hive.metastore.warehouse.dir=OUTPUT_PATH;
> drop table s3_dir_test;
> drop table s3_1;
> drop table s3_2;
> create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
> row format delimited
> fields terminated by '\t'
> collection items terminated by ' '
> location 'INPUT_PATH';
> create table s3_1(strct struct<a:int, b:string, c:string>)
> row format delimited
> fields terminated by '\t'
> collection items terminated by ' ';
> insert overwrite table s3_1 select * from s3_dir_test;
> select * from s3_1;
> create table s3_2 as select * from s3_1;
> select * from s3_1;
> select * from s3_2;
> The data could be as follows.
> 1 abc 10.5
> 2 def 11.5
> 3 ajss 90.23232
> 4 djns 89.02002
> 5 random 2.99
> 6 data 3.002
> 7 ne 71.9084
> The root cause is that the SemanticAnalyzer class does not handle s3 locations properly for CTAS.
> A patch will be provided shortly.
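
For context on the root cause described above, a minimal sketch of the general pattern for resolving a CTAS target location so that an s3 scheme is preserved (the helper and variable names are hypothetical, not the actual SemanticAnalyzer code):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CtasLocationSketch {
      // Hypothetical helper: derive a fully qualified location for a CTAS
      // target table under the configured warehouse directory.
      static Path qualifiedTableLocation(String warehouseDir, String tableName,
          Configuration conf) throws Exception {
        Path loc = new Path(warehouseDir, tableName);
        // Qualify against the filesystem that owns the path (s3 when the
        // warehouse dir is an s3 URI), not the cluster's default filesystem,
        // so the scheme and authority are not silently dropped.
        FileSystem fs = loc.getFileSystem(conf);
        return fs.makeQualified(loc);
      }
    }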


