Posted to dev@hive.apache.org by "Sergio Peña (JIRA)" <ji...@apache.org> on 2016/09/16 15:16:20 UTC

[jira] [Created] (HIVE-14776) Skip 'distcp' call when copying data from HDFS to S3

Sergio Peña created HIVE-14776:
----------------------------------

             Summary: Skip 'distcp' call when copying data from HDFS to S3
                 Key: HIVE-14776
                 URL: https://issues.apache.org/jira/browse/HIVE-14776
             Project: Hive
          Issue Type: Sub-task
          Components: Hive
            Reporter: Sergio Peña
            Assignee: Sergio Peña


Hive uses 'distcp' to copy files in parallel between HDFS encryption zones when a file is larger than the {{hive.exec.copyfile.maxsize}} threshold. The same 'distcp' path is also taken when copying to S3, where it makes copies slower.

We should not invoke distcp when copying to blobstore systems.
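
A minimal sketch of the proposed check, not Hive's actual implementation. The class name CopyDecision, the helpers isBlobStore and shouldUseDistCp, and the list of schemes treated as blobstores are all hypothetical and chosen for illustration. The size comparison mirrors the {{hive.exec.copyfile.maxsize}} rule described above.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CopyDecision {
    // Assumption: URI schemes treated as blobstores. Illustrative only.
    private static final Set<String> BLOBSTORE_SCHEMES =
        new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

    // Returns true when the destination URI uses one of the assumed blobstore schemes.
    static boolean isBlobStore(URI dest) {
        String scheme = dest.getScheme();
        return scheme != null
            && BLOBSTORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // Decides whether a copy should go through distcp.
    // maxCopyFileSizeBytes stands in for the hive.exec.copyfile.maxsize threshold.
    static boolean shouldUseDistCp(URI dest, long fileSizeBytes, long maxCopyFileSizeBytes) {
        if (isBlobStore(dest)) {
            // Proposed change: never use distcp for blobstore destinations.
            return false;
        }
        // Existing rule: use distcp only for files larger than the threshold.
        return fileSizeBytes > maxCopyFileSizeBytes;
    }
}
```

With this check, a large file bound for an s3a:// path would be copied directly, while the same file bound for an hdfs:// path would still go through distcp.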


