Posted to issues@hive.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/03/05 15:17:00 UTC

[jira] [Commented] (HIVE-1620) Patch to write directly to S3 from Hive

    [ https://issues.apache.org/jira/browse/HIVE-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386192#comment-16386192 ] 

Steve Loughran commented on HIVE-1620:
--------------------------------------

This is the wrong way to handle variations in FS semantics; once we add the ability to query FS capabilities (Hadoop 3.2?), all filesystems could be probed for their semantics. Even so, I don't think this is correct. What we've done in HADOOP-13786 gives you atomic task commit and fast job-commit semantics without playing any rename games at all.

I'd recommend closing this as a WONTFIX, but I'd reemphasise that the underlying problem, "how to commit work to a store with neither consistency nor O(1) atomic renames", remains, at least for S3 and OpenStack Swift.
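
As a rough illustration of what probing a filesystem for its semantics could look like, here is a minimal, self-contained sketch. Everything in it is hypothetical: the Store interface, the capability names and the CommitStrategy enum are invented for this example and are not the Hadoop API or the HADOOP-13786 implementation. It only shows the idea of choosing a commit strategy from declared capabilities rather than hard-coding behaviour per filesystem.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CommitStrategyDemo {

    // Hypothetical capability names, invented for this sketch.
    public static final String ATOMIC_RENAME = "atomic-rename";
    public static final String CONSISTENT_LISTING = "consistent-listing";

    /** Hypothetical stand-in for a filesystem that can report its semantics. */
    public interface Store {
        boolean hasCapability(String capability);
    }

    public enum CommitStrategy { RENAME_BASED, STORE_SPECIFIC_COMMITTER }

    /**
     * Rename-based commit is only chosen when the store declares both
     * atomic renames and consistent listings; otherwise fall back to a
     * committer written for that store.
     */
    public static CommitStrategy choose(Store store) {
        boolean renameIsSafe = store.hasCapability(ATOMIC_RENAME)
                && store.hasCapability(CONSISTENT_LISTING);
        return renameIsSafe
                ? CommitStrategy.RENAME_BASED
                : CommitStrategy.STORE_SPECIFIC_COMMITTER;
    }

    /** Builds a fake store that declares exactly the given capabilities. */
    public static Store storeWith(String... capabilities) {
        Set<String> declared = new HashSet<>(Arrays.asList(capabilities));
        return declared::contains;
    }

    public static void main(String[] args) {
        Store hdfsLike = storeWith(ATOMIC_RENAME, CONSISTENT_LISTING);
        Store objectStoreLike = storeWith(); // declares nothing
        System.out.println(choose(hdfsLike));
        System.out.println(choose(objectStoreLike));
    }
}
```

A store that declares neither capability, which is the S3 and Swift case described above, ends up on the store-specific path instead of a rename-based commit.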

> Patch to write directly to S3 from Hive
> ---------------------------------------
>
>                 Key: HIVE-1620
>                 URL: https://issues.apache.org/jira/browse/HIVE-1620
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Vaibhav Aggarwal
>            Assignee: Vaibhav Aggarwal
>            Priority: Major
>         Attachments: HIVE-1620.patch
>
>
> We want to submit a patch to Hive which allows users to write files directly to S3.
> This patch allows users to specify an S3 location as the table output location and hence eliminates the need to copy data from HDFS to S3.
> Users can run Hive queries directly over the data stored in S3.
> This patch helps integrate Hive with S3 better and more quickly.


