Posted to common-issues@hadoop.apache.org by "Chengwei Wang (Jira)" <ji...@apache.org> on 2021/11/25 04:08:00 UTC
[jira] [Commented] (HADOOP-18023) Allow cp command to run with multi threads.
[ https://issues.apache.org/jira/browse/HADOOP-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448922#comment-17448922 ]
Chengwei Wang commented on HADOOP-18023:
----------------------------------------
It would be useful to allow the _*-cp*_ command to run with multiple threads, like the improvement we made for the *_-put/-get_* commands. In my test case, running with 10 threads cut the elapsed time by about 90%.
*Source dir: 1 directory, 401 files, 2.3G*
*Test 1: run with single thread*
{code:java}
time hadoop fs -cp /tmp/data/test /tmp/data/t1
real 1m9.394s
user 0m16.688s
sys 0m5.331s
{code}
*Test 2: run with 10 threads*
{code:java}
time hadoop fs -cp -t 10 /tmp/data/test /tmp/data/t2
real 0m8.217s
user 0m19.864s
sys 0m8.776s
{code}
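The two timings above can be compared with a little arithmetic. The following is only an illustrative sketch: the helper names `to_seconds` and `summarize` are made up for this example and are not part of Hadoop. It converts the `time` output to seconds and derives the wall-clock speedup, the percentage reduction, and the total CPU time (user + sys) of each run.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '1m9.394s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Timings copied from Test 1 (single thread) and Test 2 (10 threads).
single = {"real": "1m9.394s", "user": "0m16.688s", "sys": "0m5.331s"}
multi = {"real": "0m8.217s", "user": "0m19.864s", "sys": "0m8.776s"}


def summarize(single_run: dict, multi_run: dict) -> dict:
    """Derive speedup, wall-clock reduction and CPU totals from two runs."""
    real_1 = to_seconds(single_run["real"])
    real_n = to_seconds(multi_run["real"])
    cpu_1 = to_seconds(single_run["user"]) + to_seconds(single_run["sys"])
    cpu_n = to_seconds(multi_run["user"]) + to_seconds(multi_run["sys"])
    return {
        "speedup": real_1 / real_n,
        "wall_clock_reduction_pct": 100 * (1 - real_n / real_1),
        "cpu_seconds_single": cpu_1,
        "cpu_seconds_multi": cpu_n,
    }


if __name__ == "__main__":
    for name, value in summarize(single, multi).items():
        print(f"{name}: {value:.2f}")
```

With these numbers the elapsed time drops from 69.394s to 8.217s, a speedup of roughly 8.4x (a reduction of roughly 88%), which is in line with the "about 90%" estimate. Total CPU time rises from about 22.0s to about 28.6s, so the gain comes from overlapping the per-file copies rather than from doing less work.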
> Allow cp command to run with multi threads.
> -------------------------------------------
>
> Key: HADOOP-18023
> URL: https://issues.apache.org/jira/browse/HADOOP-18023
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Reporter: Chengwei Wang
> Assignee: Chengwei Wang
> Priority: Major
>
> Allow the _*hadoop fs -cp*_ command to run with multiple threads to improve copy speed.