You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/11/11 04:14:37 UTC

[GitHub] [hadoop] smarthanwang opened a new pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

smarthanwang opened a new pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645


   ### Description
   CopyFromLocal/Put is enabled to run with multi-thread with HDFS-11786 and HADOOP-14698, and make put dirs or multiple files faster.So, It's necessary to enable get and copyToLocal command run with multi-thread.
   
   ### Tests
   Test case: 1 dir 240 files  13G
   
   **1. Get with single thread.**
   
   ```
   time hadoop fs -get /tmp/data/20211101 .
   real    6m28.785s
   user    0m18.844s
   sys    0m44.953s
   ```
   **2. Get with 10 threads.**
   ```
   time hadoop fs -get -t 10 /tmp/data/20211101 .
   real    0m58.721s
   user    0m21.386s
   sys    0m54.066s
   ```
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972603877


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 26s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 54s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m  9s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  spotbugs  |   2m 28s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.  |
   | +1 :green_heart: |  shadedclient  |  22m  8s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 18s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m  9s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 37s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 10s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  17m 18s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 194m 25s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestCLI |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint |
   | uname | Linux 4e050ed689b3 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 152b85a42789eb3447b6bd7a1d56142819edbe38 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/testReport/ |
   | Max. process+thread count | 1267 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-971410375


   This seems like a good change. I will to review in more detail in the next few days.
   
   Could you add the new -t parameter to the docs in `./hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md` as part of this PR please? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752825112



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##########
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
       
   >  I guess it is best to avoid having no limit
   
   I agree with you, no limit and decided by users is the best way. Set the limit just for keep same  with the original. I will cancel it.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-966055357


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 47s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 23s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 57s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 53s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 33s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 59s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 16s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  6s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 49s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  17m 24s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  0s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 193m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestCLI |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux eb09f5500377 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ee47b1a001a3f68d7d9bb0f6ac1878315a14cbce |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/testReport/ |
   | Max. process+thread count | 1267 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-973903891


   @sodonnel  thanks for your detailed review, I have removed the limit of thread count, and fixed the problems in unit tests. Please help review again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-974047752


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 4 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 37s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  20m 13s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  spotbugs  |   2m 27s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.  |
   | +1 :green_heart: |  shadedclient  |  24m 20s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 56s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  22m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  7s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  20m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | -1 :x: |  javadoc  |   1m  4s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m  8s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 20s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 206m 34s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint xml |
   | uname | Linux 7ad858b6dcca 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4d764c10dd5f5cc63006c65f596a3a7f9376cf80 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/testReport/ |
   | Max. process+thread count | 2847 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752825112



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##########
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
       I agree with you, no limit is the best way. I set the limit just for keep same  with the original limit. I will cancel it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-975017718


   @sodonnel thanks for review, fixed the java docs errors.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel merged pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel merged pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel edited a comment on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel edited a comment on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-971410375


   This seems like a good change. I will try to review in more detail in the next few days.
   
   Could you add the new -t parameter to the docs in `./hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md` as part of this PR please? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753016551



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##########
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;
+
+  public static final Logger LOG =
+      LoggerFactory.getLogger(CopyCommandWithMultiThread.class);
+
+  protected void setThreadCount(String optValue) {
+    if (optValue != null) {
+      int count = Integer.parseInt(optValue);
+      threadCount = count < 1 ? 1 : Math.min(count, MAX_THREAD_COUNT);

Review comment:
       Fixed By add docs to these methods.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753014895



##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##########
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)

Review comment:
       Fixed 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-975148937


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 4 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 50s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m  5s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 13s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 27s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  2s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  9s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m  8s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 51 unchanged - 8 fixed = 51 total (was 59)  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m  1s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 23s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 194m 12s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint xml |
   | uname | Linux b83ce577d780 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 320d5572c890768fbc8a9cf63b18a0f8c9488998 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/4/testReport/ |
   | Max. process+thread count | 3152 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-974082796


   There are 3 errors coming up in the Java Docs build job related to the changes here:
   
   ```
   [ERROR] /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java:42: error: malformed HTML
   [ERROR]    * set thread count by option value, if the value <= 1, use 1 instead.
   [ERROR]                                                     ^
   [ERROR] /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java:53: error: malformed HTML
   [ERROR]    * set thread pool queue size by option value, if the value < 1,
   [ERROR]                                                               ^
   [ERROR] /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java:400: error: unknown tag: target
   [ERROR]    * file "<target>._COPYING_". If the copy is
   ```
   
   I guess it does not like the `<` symbol or the `<target>` as it thinks it should be valid HTML.
   
   Could you try to fix those by changing the Java Doc comments and then I think this change is good to commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752220174



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##########
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
       I wonder if we should limit the number of threads like this. Its hard to say if the copy will be CPU bound, or disk / IO bound overall. I guess it is best to avoid having no limit, but I wonder if having 2 * cores would be enough for a small VM trying to put a large dir into the cluster. What do you think? Maybe setting the limit to 4 or 8 * cores would be more flexible for users and they can experiment with their own hardware to find the best setting?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752221314



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##########
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;
+
+  public static final Logger LOG =
+      LoggerFactory.getLogger(CopyCommandWithMultiThread.class);
+
+  protected void setThreadCount(String optValue) {
+    if (optValue != null) {
+      int count = Integer.parseInt(optValue);
+      threadCount = count < 1 ? 1 : Math.min(count, MAX_THREAD_COUNT);

Review comment:
       Should we warn here if the thread count is being reduced due to the MAX_THREAD_COUNT in a similar way to `setThreadPoolQueueSize`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-973909341


   > There is also a failing test in `hadoop.cli.TestCLI`. I have not looked into it, but it may be related to these changes as we are changing the CLI here. Can you check it too please?
   
   The failure was caused by the modification of command's usage and description,  fixed now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752215572



##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##########
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)
+  public void testCopy() throws Exception {
+    Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
+    initialize(dir);
+    MultiThreadedCopy copy = new MultiThreadedCopy(1, DEFAULT_QUEUE_SIZE, 0);

Review comment:
       Every test starts with these two lines:
   
   ```
   Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
       initialize(dir);
   ```
   
   Do you think it would be better to create a `@Before` method to run before each test?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752213829



##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##########
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)

Review comment:
       Nit: There is a mixture of `@org.junit.Test` and `@Test` annotations in this class - can you change them all to just `@Test`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972522580


   @sodonnel thanks for your comment,  I have added -t and -q parameters to the docs, and fixed some mistakes in docs meanwhile.  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] sodonnel commented on pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972850753


   Thanks for working on this @smarthanwang. The change looks mostly good to me. I have just a few minor comments.
   
   There is also a failing test in `hadoop.cli.TestCLI`. I have not looked into it, but it may be related to these changes as we are changing the CLI here. Can you check it too please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] smarthanwang commented on a change in pull request #3645: HADOOP-17998. Enable get command run with multi-thread.

Posted by GitBox <gi...@apache.org>.
smarthanwang commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753015750



##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##########
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)
+  public void testCopy() throws Exception {
+    Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
+    initialize(dir);
+    MultiThreadedCopy copy = new MultiThreadedCopy(1, DEFAULT_QUEUE_SIZE, 0);

Review comment:
       Fixed by created  `@Before `  method




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org