Posted to common-issues@hadoop.apache.org by "Ayush Saxena (Jira)" <ji...@apache.org> on 2019/12/29 20:31:00 UTC

[jira] [Commented] (HADOOP-15887) Add an option to avoid writing data locally in Distcp

    [ https://issues.apache.org/jira/browse/HADOOP-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004898#comment-17004898 ] 

Ayush Saxena commented on HADOOP-15887:
---------------------------------------

Thanks [~Tao Jie]. Overall LGTM.

Can you fix the checkstyle complaints?

{code:java}
+    } catch (IOException e) {
+      LOG.error("Exception encountered while testing distcp", e);
+      Assert.fail("distcp failure");
{code}

In the test, I don't think we need to catch the exception and hide it behind a generic failure; we can let the exception surface.
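
To illustrate the difference, here is a minimal sketch in plain Java (no JUnit, so it stands alone). The names {{runDistCp}}, {{testCatchAndFail}}, {{testLetItSurface}} and {{report}} are hypothetical and not taken from the patch; {{report}} is only a rough stand-in for what a test runner would show.

```java
import java.io.IOException;

class SurfaceExceptionSketch {

  /** Minimal functional interface for a test body that may throw. */
  interface ThrowingTest {
    void run() throws Exception;
  }

  // Hypothetical stand-in for the distcp call under test; it always fails here.
  static void runDistCp() throws IOException {
    throw new IOException("target path not writable");
  }

  // Style used in the patch: catch, log, then fail with a generic message.
  // The AssertionError carries no cause, so the test report loses the root cause.
  static void testCatchAndFail() {
    try {
      runDistCp();
    } catch (IOException e) {
      System.err.println("Exception encountered while testing distcp: " + e);
      // Roughly what JUnit 4's Assert.fail(String) does.
      throw new AssertionError("distcp failure");
    }
  }

  // Suggested style: declare the exception and let it propagate unchanged.
  static void testLetItSurface() throws Exception {
    runDistCp();
  }

  // Rough stand-in for a test runner: returns the failure a report would show.
  static String report(ThrowingTest test) {
    try {
      test.run();
      return "passed";
    } catch (Throwable t) {
      return t.getClass().getSimpleName() + ": " + t.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(report(SurfaceExceptionSketch::testCatchAndFail));
    System.out.println(report(SurfaceExceptionSketch::testLetItSurface));
  }
}
```

In this sketch the first variant reports only the generic "distcp failure", while the second reports the original IOException and its message, which is usually what you want when debugging a failed run.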


> Add an option to avoid writing data locally in Distcp
> -----------------------------------------------------
>
>                 Key: HADOOP-15887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15887
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 2.8.2, 3.0.0
>            Reporter: Tao Jie
>            Assignee: Tao Jie
>            Priority: Major
>         Attachments: HADOOP-15887.001.patch, HADOOP-15887.002.patch
>
>
> When copying a large amount of data from one cluster to another via Distcp, with the Distcp jobs running in the target cluster, local disk usage across the datanodes becomes imbalanced, because the default placement policy chooses the local node to store the first replica.
> In https://issues.apache.org/jira/browse/HDFS-3702 we added a flag in DFSClient to avoid replicating to the local datanode. We can make use of this flag in Distcp.


