You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "zhuyaogai (via GitHub)" <gi...@apache.org> on 2023/02/14 08:00:44 UTC

[GitHub] [hadoop] zhuyaogai opened a new pull request, #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

zhuyaogai opened a new pull request, #5391:
URL: https://github.com/apache/hadoop/pull/5391

   ### Description of PR
   Hadoop DistCp supports specifying favoredNodes for data copying.
   
   ### How was this patch tested?
   Add new UT.
   
   ### For code changes:
   
   - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1125843372


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -223,9 +233,25 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
       FSDataOutputStream out;
       ChecksumOpt checksumOpt = getChecksumOpt(fileAttributes, sourceChecksum);
       if (!preserveEC || ecPolicy == null) {
-        out = targetFS.create(targetPath, permission,
-            EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE), copyBufferSize,
-            repl, blockSize, context, checksumOpt);
+        if (targetFS instanceof DistributedFileSystem

Review Comment:
   If I only change else branch, some UT will be fail when I have set `favoredNodes`, because in my UT it just goes into the if branch (ec settings if not set). Is there a better solution?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1442901997

   @steveloughran @toddlipcon @phunt 
   hi, could you give me some advice? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1125847429


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -223,9 +233,25 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
       FSDataOutputStream out;
       ChecksumOpt checksumOpt = getChecksumOpt(fileAttributes, sourceChecksum);
       if (!preserveEC || ecPolicy == null) {
-        out = targetFS.create(targetPath, permission,
-            EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE), copyBufferSize,
-            repl, blockSize, context, checksumOpt);
+        if (targetFS instanceof DistributedFileSystem

Review Comment:
   Thank you for your code review and so many valuable comments!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1430405499

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  17m 17s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m 37s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m 53s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  22m 10s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 55s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 58s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 25s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 25s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 25s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 36s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m  3s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m  2s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 45s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  |  14m 46s | [/patch-unit-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/3/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 240m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.tools.contract.TestLocalContractDistCp |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d770bec9be9c 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1182895b78d4d92012d09f1fa11d88abd50ef6b7 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/3/testReport/ |
   | Max. process+thread count | 667 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1430118718

   > 
   @steveloughran hi, I have fixed some problems, could you please review it again? and please correct me if I'm wrong. Thank you :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1125831678


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -247,6 +276,22 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
         context);
   }
 
+  private InetSocketAddress[] toFavoredNodes(String favoredNodesStr) throws UnknownHostException {
+    List<InetSocketAddress> result = new ArrayList<>();
+    for (String hostAndPort : favoredNodesStr.split(",")) {

Review Comment:
   I have checked if `favoredNodesStr ` is an empty string above, and will skip `favoredNodesStr` if empty.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1125837868


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java:
##########
@@ -239,6 +239,21 @@ public static DistCpOptions parse(String[] args)
       }
     }
 
+    if (command.hasOption(DistCpOptionSwitch.FAVORED_NODES.getSwitch())) {
+      String favoredNodesStr = getVal(command, DistCpOptionSwitch.FAVORED_NODES.getSwitch().trim());
+      if (StringUtils.isEmpty(favoredNodesStr)) {

Review Comment:
   Add new UT in `TestOptionsParser`, and I think it's ok that YARN cluster can resolve favored nodes. BTW, it seems that DN does not care about `port` but `hostname`?
   https://github.com/apache/hadoop/blob/2a0dc2ab2f5fb46dc540ed440d6c8b2896dd195b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L703
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1456058542

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 13s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 59s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |  20m 26s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 59s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 50s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 24s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |  22m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |  20m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   3m 38s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/6/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 1 new + 62 unchanged - 0 fixed = 63 total (was 62)  |
   | +1 :green_heart: |  mvnsite  |   2m  3s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   4m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 51s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 42s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  |  13m 56s |  |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 221m 26s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux fecad139efb7 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c906c78e3e7aecd12a818fcb76d281f713387323 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/6/testReport/ |
   | Max. process+thread count | 731 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/6/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1432392281

   @steveloughran hi, thanks for your suggestion:) I know what you mean, but I find that it also uses  the hdfs public/stable API in source code. 
   https://github.com/apache/hadoop/blob/723535b788070f6b103be3bae621fefe3b753081/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java#L230
   I just refer to its practice in the if branch, and if you think my code changes affect too much, can I just change the else branch code and add favoredNodes option in it? Please correct me if I'm wrong. Thank you :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1429650998

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  17m 32s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 39s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 40s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 46s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  6s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 58s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 41s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 43s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 43s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   3m 37s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/1/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 9 new + 54 unchanged - 1 fixed = 63 total (was 55)  |
   | +1 :green_heart: |  mvnsite  |   2m  4s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m  1s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 44s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  |  27m 55s | [/patch-unit-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/1/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 251m 27s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.tools.TestDistCpSync |
   |   | hadoop.tools.TestDistCpWithRawXAttrs |
   |   | hadoop.tools.TestDistCpSystem |
   |   | hadoop.tools.TestDistCpWithAcls |
   |   | hadoop.tools.TestDistCpWithXAttrs |
   |   | hadoop.tools.contract.TestLocalContractDistCp |
   |   | hadoop.tools.contract.TestHDFSContractDistCp |
   |   | hadoop.tools.TestDistCpSyncReverseFromTarget |
   |   | hadoop.tools.TestDistCpSyncReverseFromSource |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 060e891702b0 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / be67a5984771d510d7cfd96439215bccc3d5baca |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/1/testReport/ |
   | Max. process+thread count | 566 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "steveloughran (via GitHub)" <gi...@apache.org>.
steveloughran commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1429555546

   -1 to anything exposing internal hdfs implementation methods. Sorry
   
    People start using them and expect them to be stable and maintained. There is also the little detail that in cloud deployments do not always have hdfs jars on the class path; this PR would break those deployments.
   
   What would make sense would be to use createFile() and for hdfs to add a .opt() option for those favoured nodes, createFile() is the public api, .opt() options can be ignorred by other filesystems, *or reimplemented*. There is a lot more in terms of design and wiring up but the benefit is that portability and maintainability.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "steveloughran (via GitHub)" <gi...@apache.org>.
steveloughran commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1117059919


##########
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSystem.java:
##########
@@ -594,4 +595,49 @@ public void testUpdateRoot() throws Exception {
     assertEquals(srcStatus.getModificationTime(),
         destStatus2.getModificationTime());
   }
+
+  @Test
+  public void testFavoredNodes() throws Exception {
+    final String testRoot = "/testdir";
+    final String testSrc = testRoot + "/" + SRCDAT;
+    final String testDst = testRoot + "/" + DSTDAT;
+
+    String nnUri = FileSystem.getDefaultUri(conf).toString();
+    String rootStr = nnUri + testSrc;
+    String tgtStr = nnUri + testDst;
+
+    FileEntry[] srcFiles = {
+        new FileEntry(SRCDAT, true),
+        new FileEntry(SRCDAT + "/file10", false)
+    };
+
+    DistributedFileSystem fs = (DistributedFileSystem) FileSystem.get(URI.create(nnUri), conf);
+    createFiles(fs, testRoot, srcFiles, -1);
+
+    // sad path
+    assertNotEquals(ToolRunner.run(conf, new DistCp(),

Review Comment:
   the args are the wrong way round fro these assertions.
   
   switch to AssertJ assertions and add good .descriptions, so if a jenkins runs fails we know what went wrong, rather than just what line. Right



##########
hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm:
##########
@@ -364,6 +364,7 @@ Command Line Options
 | `-direct` | Write directly to destination paths | Useful for avoiding potentially very expensive temporary file rename operations when the destination is an object store |
 | `-useiterator` | Uses single threaded listStatusIterator to build listing | Useful for saving memory at the client side. Using this option will ignore the numListstatusThreads option |
 | `-updateRoot` | Update root directory attributes (eg permissions, ownership ...) | Useful if you need to enforce root directory attributes update when using distcp |
+| `-favoredNodes` | Specify favored nodes (Desired option input format: host1:port1,host2:port2,...) | Useful if you need to specify favored nodes when using distcp |

Review Comment:
   +. Requires the destination to be an hdfs filesystem



##########
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSystem.java:
##########
@@ -32,6 +32,7 @@
 import java.util.List;
 import java.util.Random;
 
+import org.apache.hadoop.hdfs.server.datanode.DataNode;

Review Comment:
   nit: must go in the right place in the org.apache.hadoop imports.



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -247,6 +276,22 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
         context);
   }
 
+  private InetSocketAddress[] toFavoredNodes(String favoredNodesStr) throws UnknownHostException {
+    List<InetSocketAddress> result = new ArrayList<>();
+    for (String hostAndPort : favoredNodesStr.split(",")) {
+      String[] split = hostAndPort.split(":");

Review Comment:
   log at debug, or maybe even at info.



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java:
##########
@@ -239,6 +239,21 @@ public static DistCpOptions parse(String[] args)
       }
     }
 
+    if (command.hasOption(DistCpOptionSwitch.FAVORED_NODES.getSwitch())) {
+      String favoredNodesStr = getVal(command, DistCpOptionSwitch.FAVORED_NODES.getSwitch().trim());
+      if (StringUtils.isEmpty(favoredNodesStr)) {

Review Comment:
   if you pull this out to a @VisibleForTesting package scoped method then unit tests could to try to break it through invalid args, e.g
   * trailing ,
   * empty string
    invalid node/valid node bad port.



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -247,6 +276,22 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
         context);
   }
 
+  private InetSocketAddress[] toFavoredNodes(String favoredNodesStr) throws UnknownHostException {
+    List<InetSocketAddress> result = new ArrayList<>();
+    for (String hostAndPort : favoredNodesStr.split(",")) {
+      String[] split = hostAndPort.split(":");
+      if (split.length != 2) {
+        throw new IllegalArgumentException("Illegal favoredNodes parameter: " + hostAndPort);

Review Comment:
   prefer `org.apache.hadoop.util.Preconditions` here, e.g 
   ```
   checkArgument(split.length == 2, ""Illegal favoredNodes parameter: %s", hostAndPort)
   ```
    
   



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java:
##########
@@ -164,6 +164,8 @@ public final class DistCpOptions {
 
   private final boolean updateRoot;
 
+  private final String favoredNodes;
+

Review Comment:
   nit: add javadocs.



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -247,6 +276,22 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
         context);
   }
 
+  private InetSocketAddress[] toFavoredNodes(String favoredNodesStr) throws UnknownHostException {

Review Comment:
   javadocs to explain what happens, when exceptions are raised.



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -247,6 +276,22 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
         context);
   }
 
+  private InetSocketAddress[] toFavoredNodes(String favoredNodesStr) throws UnknownHostException {
+    List<InetSocketAddress> result = new ArrayList<>();
+    for (String hostAndPort : favoredNodesStr.split(",")) {

Review Comment:
   what happens if an empty string is passed in? it should be an error, right? so add a test



##########
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpOptions.java:
##########
@@ -574,4 +574,15 @@ public void testUpdateRoot() {
         .build();
     Assert.assertTrue(options.shouldUpdateRoot());
   }
+
+  @Test
+  public void testFavoredNodes() {
+    final DistCpOptions options = new DistCpOptions.Builder(
+        Collections.singletonList(
+            new Path("hdfs://localhost:8020/source")),
+        new Path("hdfs://localhost:8020/target/"))
+        .withFavoredNodes("localhost:50010")
+        .build();
+    Assert.assertNotNull(options.getFavoredNodes());

Review Comment:
   prefer assertJ for new tests, with a message, and ideally verification the node value went all the way through



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -223,9 +233,25 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
       FSDataOutputStream out;
       ChecksumOpt checksumOpt = getChecksumOpt(fileAttributes, sourceChecksum);
       if (!preserveEC || ecPolicy == null) {
-        out = targetFS.create(targetPath, permission,
-            EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE), copyBufferSize,
-            repl, blockSize, context, checksumOpt);
+        if (targetFS instanceof DistributedFileSystem

Review Comment:
   There should be one path if preserving ec or favoredNodes is set, so switch to hdfsbuilder, leaving the other path as create()
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "steveloughran (via GitHub)" <gi...@apache.org>.
steveloughran commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1443728823

   This is interesting. I know there are deployments without hdfs around (e.g. azure clusters), but do see that the import is there for snapshot updates (hdfs only) with explicit imports of SnapshotDiffReport. tracing it back, if you use "-diff" on the command line then hdfs *must* be on the classpath.
   
   your link flags up that it is already in copymapper; looking at that I don't see it in branch-3.3; it came in with HADOOP-14254. 
   
   Given it is in there on a codepath hit with the option to copy erasure policy (and skipped if not), then again, provided the change goes in such that it is optional, your patch isn't going to force in a new run-time dependency, is it?
   
   let me look at the new patch some more without worrying about that detail...the builder API is public. 
   
   Be aware that we are always very nervous about touching distcp because it is fairly old and brittle code that is used incredibly broadly -not just on the command line but actually at the Java API from applications like hive. I think this is fairly low risk but will highlight the JIRA on the HDFS mailing list so they can review it to.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1455526533

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  17m  9s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 36s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  0s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |  20m 26s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   4m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 46s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 15s | [/patch-mvninstall-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-mvninstall-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch failed.  |
   | -1 :x: |  compile  |  12m 30s | [/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt) |  root in the patch failed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.  |
   | -1 :x: |  javac  |  12m 30s | [/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt) |  root in the patch failed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.  |
   | -1 :x: |  compile  |  11m 14s | [/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt) |  root in the patch failed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.  |
   | -1 :x: |  javac  |  11m 14s | [/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt) |  root in the patch failed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   3m 18s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 1 new + 63 unchanged - 0 fixed = 64 total (was 63)  |
   | -1 :x: |  mvnsite  |   0m 22s | [/patch-mvnsite-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-mvnsite-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch failed.  |
   | -1 :x: |  javadoc  |   0m 21s | [/patch-javadoc-hadoop-tools_hadoop-distcp-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-javadoc-hadoop-tools_hadoop-distcp-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt) |  hadoop-distcp in the patch failed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.  |
   | -1 :x: |  javadoc  |   0m 21s | [/patch-javadoc-hadoop-tools_hadoop-distcp-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-javadoc-hadoop-tools_hadoop-distcp-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt) |  hadoop-distcp in the patch failed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.  |
   | -1 :x: |  spotbugs  |   0m 23s | [/patch-spotbugs-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-spotbugs-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch failed.  |
   | +1 :green_heart: |  shadedclient  |  21m 52s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 33s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  |   0m 21s | [/patch-unit-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 40s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 198m 42s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6b1c4e22c346 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b20d228c6939cded9fb2ca92ef667feb84b5397d |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/testReport/ |
   | Max. process+thread count | 648 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1442901558

   @steveloughran @toddlipcon @phunt 
   hi, could you give me some advice? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] zhuyaogai commented on a diff in pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "zhuyaogai (via GitHub)" <gi...@apache.org>.
zhuyaogai commented on code in PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#discussion_r1125837868


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java:
##########
@@ -239,6 +239,21 @@ public static DistCpOptions parse(String[] args)
       }
     }
 
+    if (command.hasOption(DistCpOptionSwitch.FAVORED_NODES.getSwitch())) {
+      String favoredNodesStr = getVal(command, DistCpOptionSwitch.FAVORED_NODES.getSwitch().trim());
+      if (StringUtils.isEmpty(favoredNodesStr)) {

Review Comment:
   Add new UT in `TestOptionsParser`, and I think it's ok that YARN cluster can resolve favored nodes. BTW, it seems that DN does not care about `port` but `hostname`?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1455156173

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 28s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 12s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |  20m 37s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   3m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  5s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 58s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 33s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 29s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |  22m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |  20m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 41s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m  1s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   4m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 10s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 44s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  |  14m  7s |  |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  3s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 222m 28s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux e50831db0918 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 662045ad8a6c5d2c5dff9907a0ac7e6842844153 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/4/testReport/ |
   | Max. process+thread count | 686 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5391: HADOOP-18629. Hadoop DistCp supports specifying favoredNodes for data copying

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1429680256

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   7m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  37m 13s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 10s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 29s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  6s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 53s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 26s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 28s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   3m 37s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/2/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 8 new + 55 unchanged - 1 fixed = 63 total (was 56)  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 53s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 44s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  |  29m 24s | [/patch-unit-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/2/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt) |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 247m 24s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.tools.TestDistCpSync |
   |   | hadoop.tools.TestDistCpWithRawXAttrs |
   |   | hadoop.tools.TestDistCpSystem |
   |   | hadoop.tools.TestDistCpSyncReverseFromTarget |
   |   | hadoop.tools.TestDistCpWithAcls |
   |   | hadoop.tools.contract.TestHDFSContractDistCp |
   |   | hadoop.tools.TestDistCpSyncReverseFromSource |
   |   | hadoop.tools.contract.TestLocalContractDistCp |
   |   | hadoop.tools.TestDistCpWithXAttrs |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5391 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux a25803cbcfde 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 08c0cafef93ba76b6b69613bdfcc2542467ca5d3 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/2/testReport/ |
   | Max. process+thread count | 735 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-tools/hadoop-distcp U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5391/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org