Posted to issues@hive.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2016/11/15 23:38:59 UTC

[jira] [Comment Edited] (HIVE-15199) INSERT INTO data on S3 is replacing the old rows with the new ones

    [ https://issues.apache.org/jira/browse/HIVE-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668755#comment-15668755 ] 

Sahil Takiar edited comment on HIVE-15199 at 11/15/16 11:38 PM:
----------------------------------------------------------------

[~spena] this actually affects any {{INSERT INTO}} query that needs to insert multiple files into the target table location: such a query will lose data. Each rename operation will basically overwrite the same file again and again.

Note this seems to be a regression of HIVE-12988, which first checked if the destination file existed before renaming it.
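
To make the failure mode concrete, here is a minimal sketch in Python. It is not Hive code: a plain dict stands in for the S3 bucket, and every function name, path, and the {{_copy_N}} suffix scheme are assumptions chosen for illustration. It contrasts a rename that blindly overwrites the destination with one that first checks whether the destination exists.

```python
# Hypothetical sketch: a dict stands in for a blob store such as S3.
# None of these names are taken from the Hive code base.

def rename_overwriting(store, src, dst):
    """Move src to dst, silently replacing whatever is already at dst."""
    store[dst] = store.pop(src)
    return dst


def rename_checking_destination(store, src, dst):
    """Move src to dst, picking a '_copy_N' name if dst already exists."""
    target = dst
    suffix = 1
    while target in store:
        target = f"{dst}_copy_{suffix}"
        suffix += 1
    store[target] = store.pop(src)
    return target


def run_inserts(rename):
    """Simulate two inserts that each stage a file named 000000_0."""
    store = {}
    for row in ("1\tname1", "2\tname2"):
        store["scratch/000000_0"] = row
        rename(store, "scratch/000000_0", "t1/000000_0")
    # Return the rows that ended up under the table location.
    return sorted(v for k, v in store.items() if k.startswith("t1/"))
```

With {{rename_overwriting}} only the second row survives under {{t1/}}, matching the output in the description; with {{rename_checking_destination}} both rows are kept.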


was (Author: stakiar):
[~spena] this is actually much worse than I thought. Any {{INSERT INTO}} query that needs to insert multiple files into the target table location will lose data; each rename operation will basically overwrite the same file again and again.

Note this seems to be a regression of HIVE-12988, which first checked if the destination file existed before renaming it.

> INSERT INTO data on S3 is replacing the old rows with the new ones
> ------------------------------------------------------------------
>
>                 Key: HIVE-15199
>                 URL: https://issues.apache.org/jira/browse/HIVE-15199
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>            Priority: Critical
>
> Any INSERT INTO statement run on an S3 table while the scratch directory is also stored on S3 deletes the old rows of the table.
> {noformat}
> hive> set hive.blobstore.use.blobstore.as.scratchdir=true;
> hive> create table t1 (id int, name string) location 's3a://spena-bucket/t1';
> hive> insert into table t1 values (1,'name1');
> hive> select * from t1;
> 1       name1
> hive> insert into table t1 values (2,'name2');
> hive> select * from t1;
> 2       name2
> {noformat}


