Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2022/04/29 12:47:16 UTC

[GitHub] [hadoop] ashutoshcipher opened a new pull request, #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

ashutoshcipher opened a new pull request, #4248:
URL: https://github.com/apache/hadoop/pull/4248

   ### Description of PR
   Parallelize MultipleOutputs#close call
   
   * JIRA: MAPREDUCE-7370
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1216870357

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 53s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 28s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 35s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m  2s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 41s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |   2m 41s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 4 new + 286 unchanged - 0 fixed = 290 total (was 286)  |
   | +1 :green_heart: |  compile  |   2m 17s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |   2m 17s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 4 new + 269 unchanged - 0 fixed = 273 total (was 269)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 153 unchanged - 1 fixed = 153 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   1m 41s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  20m 32s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 12s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 134m 27s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  8s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 257m  9s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | SpotBugs | module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412] |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 360] |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 388b3d5924ae 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0ad4e42c36c1ff2086c43f9d0ff0bd652d212263 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/testReport/ |
   | Max. process+thread count | 1433 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/8/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1264438518

   Thanks @aajisaka for reviewing. I have addressed your comments.




[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r956291996


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +558,41 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {
+    int nThreads = conf.getInt(MRConfig.MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT,
+        MRConfig.DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT);
+    AtomicBoolean encounteredException = new AtomicBoolean(false);
+    ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("MultipleOutputs-close")
+        .setUncaughtExceptionHandler(

Review Comment:
   Yes, makes sense. Thanks!





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1228778495

   Thanks @cnauroth for reviewing the PR again and sharing your comments and discussion. I will address your latest comments in my next commit.




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r946109405


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();

Review Comment:
   Sorry, that was left over from testing. I have removed it now.





[GitHub] [hadoop] PrabhuJoseph commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
PrabhuJoseph commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1159995763

   Thanks @ashutoshcipher for the patch. 
   
   1. Can you also apply the same improvement to the new API, org.apache.hadoop.mapreduce.lib.output.MultipleOutputs?
   2. Can you add a test case if possible? If not, please provide a reason.




[GitHub] [hadoop] cnauroth merged pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth merged PR #4248:
URL: https://github.com/apache/hadoop/pull/4248




[GitHub] [hadoop] steveloughran commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1225502343

   Make the method synchronized. It doesn't matter that it is only for testing; it may end up being used.
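
For readers following the synchronization point, here is a minimal sketch of what consistent locking on a shared map can look like. It is not the code from the PR: the class `SynchronizedAccessSketch`, its `writers` field, and the test-only accessor `writerCount()` are hypothetical names chosen for illustration, and the map values are plain strings rather than record writers.

```java
import java.util.HashMap;
import java.util.Map;

public class SynchronizedAccessSketch {

  // Guarded by "this": every read and write goes through a synchronized method.
  private final Map<String, String> writers = new HashMap<>();

  public synchronized void put(String name, String writer) {
    writers.put(name, writer);
  }

  public synchronized void clear() {
    writers.clear();
  }

  // Hypothetical test-only accessor. It is synchronized like the others,
  // so the field is never touched without holding the lock.
  synchronized int writerCount() {
    return writers.size();
  }

  public static void main(String[] args) throws InterruptedException {
    SynchronizedAccessSketch sketch = new SynchronizedAccessSketch();
    Thread[] threads = new Thread[4];
    for (int t = 0; t < threads.length; t++) {
      final int id = t;
      threads[t] = new Thread(() -> {
        for (int i = 0; i < 100; i++) {
          sketch.put("w-" + id + "-" + i, "writer");
        }
      });
      threads[t].start();
    }
    for (Thread thread : threads) {
      thread.join();
    }
    System.out.println(sketch.writerCount()); // prints 400
  }
}
```

Leaving even one accessor unsynchronized is the kind of pattern that static analyzers tend to report as inconsistent synchronization, which is why the test-only method takes the lock as well.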




[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1233747614

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 37s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 42s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 20s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 53s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 47s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  21m 15s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 34s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   2m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 11s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   2m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 153 unchanged - 1 fixed = 153 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 52s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m  2s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 137m 29s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 257m 44s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/13/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 15433bfd90be 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d746bf07d47e9f9cda713967ac1588d3bce8f73e |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/13/testReport/ |
   | Max. process+thread count | 1635 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/13/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1233602950

   > This looks good overall. I see there is a compilation error to address in the test code. Thanks!
   
   Thanks for reviewing and sharing your comments on the PR. I have addressed this in my latest commit.




[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1264495664

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  17m 12s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 47s |  |  trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   2m 22s |  |  trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 16s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  0s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 33s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  20m 56s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 31s |  |  the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   2m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 11s |  |  the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   2m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 153 unchanged - 1 fixed = 153 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 27s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 10s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 10s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 135m 39s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 257m 18s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/14/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6a2482856d49 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / fc2855f24e03e0555d6d81eb44840d193211f6f7 |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/14/testReport/ |
   | Max. process+thread count | 1473 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/14/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r955436533


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/lib/TestMultipleOutputs.java:
##########
@@ -70,6 +76,20 @@ public void testWithCounters() throws Exception {
     _testMOWithJavaSerialization(true);
   }
 
+  @Test(expected=IOException.class)
+  public void testParallelClose() throws IOException, InterruptedException {

Review Comment:
   I suggest naming this `testParallelCloseIOException` to make it clear that we are testing an error case.



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +558,41 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {
+    int nThreads = conf.getInt(MRConfig.MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT,
+        MRConfig.DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT);
+    AtomicBoolean encounteredException = new AtomicBoolean(false);
+    ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("MultipleOutputs-close")
+        .setUncaughtExceptionHandler(

Review Comment:
   `IOException` is now being propagated back to the caller of `close()`, but any unexpected (unchecked) exceptions would be swallowed in this uncaught exception handler. This is different from the existing code, where the caller would receive the unchecked exception.
   
   I think the best we can do here is to set `encounteredException` from within the uncaught exception handler, resulting in throwing the `IOException` at the bottom of the method. The message would need to be generalized to "encountered exception during close", not mentioning `IOException`, because it might also have been some other unchecked exception.
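
A minimal sketch of the behaviour described above, under stated assumptions: `ParallelCloseSketch` and `closeAll` are hypothetical names, and plain `Closeable` objects stand in for record writers. It is not the PR's code. Tasks run through `invokeAll` have their exceptions captured by the returned `Future`, so an uncaught exception handler would not see them. This variant therefore records the failure inside each task instead of in the handler, and raises one generalized `IOException` at the end.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper for illustration only; not taken from the patch.
class ParallelCloseSketch {

  static void closeAll(List<? extends Closeable> resources, int nThreads)
      throws IOException {
    AtomicBoolean encounteredException = new AtomicBoolean(false);

    List<Callable<Void>> tasks = new ArrayList<>();
    for (Closeable resource : resources) {
      tasks.add(() -> {
        try {
          resource.close();
        } catch (IOException | RuntimeException e) {
          // Checked and unchecked failures are both recorded, so neither
          // kind is silently swallowed.
          encounteredException.set(true);
        }
        return null;
      });
    }

    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      // Blocks until every close task has finished.
      pool.invokeAll(tasks);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      encounteredException.set(true);
    } finally {
      pool.shutdown();
    }

    if (encounteredException.get()) {
      // Generalized message: the cause may not have been an IOException.
      throw new IOException("encountered exception during close");
    }
  }
}
```

The trade-off is that the caller learns only that something failed, not what; keeping the first throwable and attaching it as the cause would be a natural extension.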



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();

Review Comment:
   Sorry, I think I gave some bad advice here. I see now that you're using `invokeAll`, and that method only returns after all invocations complete. Therefore, we know the work is all done, and we can proceed to `shutdown`.
   
   Calling `awaitTermination` opens up a new problem: how to decide on the timeout, here arbitrarily chosen as 50 seconds. Since we don't really need `awaitTermination`, we might as well remove it and avoid that problem.





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1160600109

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 26s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 58s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  2s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 55s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 58s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 13s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 12s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   6m  4s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 40s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 101m 57s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 685dea8cd11d 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 19308c8fbda266bcc7bb6a5c739568b7cecea333 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/2/testReport/ |
   | Max. process+thread count | 1609 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1224058544

   Sorry, I didn't mean to remove @cnauroth from the review request. I clicked to request a review from @steveloughran as well, and I didn't know that doing so would remove the other reviewer.
   
   




[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1216015858

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m  0s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 19s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 50s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 30s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  0s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 52s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 49s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  21m 16s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 32s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |   2m 32s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 4 new + 289 unchanged - 0 fixed = 293 total (was 289)  |
   | +1 :green_heart: |  compile  |   2m 14s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |   2m 14s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 4 new + 272 unchanged - 0 fixed = 276 total (was 272)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m  1s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 12 new + 144 unchanged - 10 fixed = 156 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   1m 34s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  19m 58s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 15s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 135m 27s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 256m 13s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | SpotBugs | module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412] |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 380] |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 90bd650e316e 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 58286eb9783f4db38e48c8f9092669462c852677 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/testReport/ |
   | Max. process+thread count | 1571 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] aajisaka commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
aajisaka commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r985125626


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +557,42 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {

Review Comment:
   `InterruptedException` is not thrown in this method and should be removed from the signature. This class is annotated `@Public`, so adding a checked exception may cause compile errors in existing callers. Also, we can remove the following try-catch clause from the test code.
   ```java
         try {
           mos.close();
         } catch (InterruptedException e) {
           throw new RuntimeException(e);
         }
   ```
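
A sketch of how a public `close()` can keep its `throws IOException` signature while still running work on a pool. `CompatibleCloseSketch` is a hypothetical name, and mapping the interruption to `java.io.InterruptedIOException` is an assumption made for illustration; the patch itself only restores the interrupt flag.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration only; not taken from the patch.
class CompatibleCloseSketch {

  // Signature declares only IOException, so callers written against the
  // old API keep compiling.
  static void close(List<Callable<Void>> closeTasks, int nThreads)
      throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      pool.invokeAll(closeTasks);
    } catch (InterruptedException e) {
      // Restore the flag, then surface the interruption as an IOException
      // subtype instead of widening the method signature.
      Thread.currentThread().interrupt();
      InterruptedIOException wrapped =
          new InterruptedIOException("interrupted during close");
      wrapped.initCause(e);
      throw wrapped;
    } finally {
      pool.shutdown();
    }
  }
}
```

With this shape, test code can call `close()` without catching `InterruptedException` at all.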





[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r955418817


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();

Review Comment:
   Sorry, I think I gave some bad advice here. I see now that you're using `invokeAll`, and that method only returns after all invocations complete. Therefore, we know the work is all done, and we can proceed to `shutdown`.
   
   Calling `awaitTermination` opens up a new problem: how to decide on the timeout, here arbitrarily chosen as 50 seconds. Since we don't really need `awaitTermination`, we might as well remove it and avoid that problem.
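
The point about `invokeAll` can be sketched as follows. `InvokeAllSketch` and `runAll` are hypothetical names, and the sleeping task is only a stand-in for a slow `writer.close(null)`. Every `Future` returned by `invokeAll` is already done, so `shutdown()` can follow directly with no timeout to choose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not taken from the patch.
class InvokeAllSketch {

  // Runs taskCount slow tasks and returns how many had completed by the
  // time invokeAll returned.
  static int runAll(int taskCount, int nThreads) throws InterruptedException {
    AtomicInteger completed = new AtomicInteger();
    List<Callable<Void>> tasks = new ArrayList<>();
    for (int i = 0; i < taskCount; i++) {
      tasks.add(() -> {
        Thread.sleep(10); // stand-in for a slow close
        completed.incrementAndGet();
        return null;
      });
    }

    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      List<Future<Void>> futures = pool.invokeAll(tasks);
      for (Future<Void> future : futures) {
        // invokeAll guarantees isDone() is true for each returned Future.
        if (!future.isDone()) {
          throw new IllegalStateException("task still running after invokeAll");
        }
      }
    } finally {
      // No awaitTermination needed: nothing is left running.
      pool.shutdown();
    }
    return completed.get();
  }
}
```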





[GitHub] [hadoop] steveloughran commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1224040820

   spotbugs?




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r985127692


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +557,42 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {

Review Comment:
   Makes sense. Addressed.





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1179776546

   @cnauroth - I have made the changes. Can you please have a look at the latest commit?




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r905654385


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -570,8 +570,14 @@ public void setStatus(String status) {
    */
   @SuppressWarnings("unchecked")
   public void close() throws IOException, InterruptedException {
-    for (RecordWriter writer : recordWriters.values()) {
-      writer.close(context);
-    }
+    recordWriters.values().parallelStream().forEach(writer -> {

Review Comment:
   Thanks @cnauroth and @steveloughran for the comments. I will make the required changes.





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1224052518

   Thanks @steveloughran for checking.
   
   > Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412]
   
   This warning is caused by the following setter, which is visible only for testing and is not meant to be called from production code.
   
   ```
     @VisibleForTesting
     public void setRecordWriters(Map<String, RecordWriter> recordWriters) {
       this.recordWriters = recordWriters;
     }
   ```
   
   > Inconsistent synchronization of org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 360]
   
   This one comes from the equivalent setter in the `mapreduce` class, which is likewise intended only for tests. I missed annotating it with `@VisibleForTesting`; I will add that in the next commit.
   
   ```
     public void setRecordWriters(Map<String, RecordWriter<?, ?>> recordWriters) {
       this.recordWriters = recordWriters;
     }
   ```
   
   Let me know if I need to address anything else.
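
For context, SpotBugs raises this kind of warning when a field is accessed under a lock in most places but not in all of them. One possible alternative to treating the setter as a test-only exception is to take the same lock there too. The sketch below assumes a hypothetical `WriterRegistrySketch` class with `AutoCloseable` standing in for `RecordWriter`; it is not the code in the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class that guards a writer map with its
// own monitor; for illustration only.
class WriterRegistrySketch {

  private Map<String, AutoCloseable> recordWriters = new HashMap<>();

  synchronized AutoCloseable getWriter(String name) {
    return recordWriters.get(name);
  }

  synchronized void putWriter(String name, AutoCloseable writer) {
    recordWriters.put(name, writer);
  }

  synchronized int size() {
    return recordWriters.size();
  }

  // Test-only setter. Declaring it synchronized means every access to the
  // field holds the same lock, which is the consistency the warning asks for.
  synchronized void setRecordWriters(Map<String, AutoCloseable> recordWriters) {
    this.recordWriters = recordWriters;
  }
}
```

Whether the extra lock is worth it for a test hook is a judgment call; annotating the method and suppressing the warning is the other common route.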
   
   
   
   
   
   
   




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r956070828


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/lib/TestMultipleOutputs.java:
##########
@@ -70,6 +76,20 @@ public void testWithCounters() throws Exception {
     _testMOWithJavaSerialization(true);
   }
 
+  @Test(expected=IOException.class)
+  public void testParallelClose() throws IOException, InterruptedException {

Review Comment:
   Makes sense. I will change it in my next commit.





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1232467928

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 45s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 43s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 49s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 31s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 15s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 23s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  20m 48s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 33s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |   2m 33s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/12/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 6 new + 286 unchanged - 0 fixed = 292 total (was 286)  |
   | +1 :green_heart: |  compile  |   2m 11s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |   2m 11s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/12/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 6 new + 269 unchanged - 0 fixed = 275 total (was 269)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 153 unchanged - 1 fixed = 153 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 10s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 11s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 136m 59s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 257m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/12/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 5613163b492d 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 57c564211aacd569facee1b0f7ad2594361bc8e0 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/12/testReport/ |
   | Max. process+thread count | 1362 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/12/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1215777311

   Sorry @cnauroth, I got occupied with some other major work, so it took a while for me to get back to this. I have tried to address all your comments. Can you please review it again?
   
   Thanks.




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1239962687

   @cnauroth, please help in reviewing the PR. Thanks. I have addressed your last comments.




[GitHub] [hadoop] PrabhuJoseph commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
PrabhuJoseph commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1162620418

   The patch looks good. @steveloughran, do you want to take a quick look at the latest patch to make sure it meets the requirements of the Jira you reported?




[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r920548705


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();

Review Comment:
   `shutdown` does not wait for the submitted tasks to finish, so when the `close()` method returns, it won't really be guaranteed that closing has completed. We'll need a call to `awaitTermination` to make sure all tasks have finished running.
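
A minimal sketch of the shutdown-then-wait pattern described in this comment, using only the JDK. The class name, the pool size of 4, and the `Thread.sleep` stand-in for `writer.close(...)` are assumptions made for illustration; this is not the code from the patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AwaitTerminationSketch {

  /**
   * Submits taskCount short tasks, shuts the pool down, and blocks until
   * the tasks have finished or the timeout elapses.
   * Returns the number of tasks that ran to completion.
   */
  public static int runAndWait(int taskCount, long timeoutSeconds) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    AtomicInteger completed = new AtomicInteger();
    for (int i = 0; i < taskCount; i++) {
      pool.execute(() -> {
        try {
          Thread.sleep(20); // hypothetical stand-in for writer.close(...)
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        completed.incrementAndGet();
      });
    }
    pool.shutdown(); // no new tasks accepted; already-submitted tasks keep running
    try {
      if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
        pool.shutdownNow(); // timed out: interrupt whatever is still running
      }
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
    }
    return completed.get();
  }

  public static void main(String[] args) {
    System.out.println("completed = " + runAndWait(8, 30));
  }
}
```

The timeout value is a judgment call; a real `close()` would need to decide whether hitting the timeout should be treated as a failure.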



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();

Review Comment:
   Is this line left over from some manual testing?



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;

Review Comment:
   I suggest making this configurable, with 10 as the default.
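
One way this could look, sketched with `java.util.Properties` as a stand-in so the example runs without Hadoop on the classpath. The key name and class name are hypothetical and not taken from the patch; in Hadoop code the lookup would presumably go through `Configuration.getInt(name, defaultValue)` instead.

```java
import java.util.Properties;

public class CloseThreadsConfigSketch {

  // Hypothetical key name, invented for this sketch; not taken from the patch.
  public static final String CLOSE_THREADS_KEY = "example.multipleoutputs.close.threads";
  public static final int DEFAULT_CLOSE_THREADS = 10;

  /** Returns the configured thread count, falling back to the default of 10. */
  public static int closeThreads(Properties conf) {
    String raw = conf.getProperty(CLOSE_THREADS_KEY);
    if (raw == null) {
      return DEFAULT_CLOSE_THREADS;
    }
    int n = Integer.parseInt(raw.trim());
    // Executors.newFixedThreadPool rejects values <= 0, so fall back to the default.
    return n > 0 ? n : DEFAULT_CLOSE_THREADS;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println("threads = " + closeThreads(conf));
  }
}
```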



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();

Review Comment:
   We know that we will generate exactly one callable for each `RecordWriter`. We can create the `ArrayList` pre-allocated to exactly the correct size, potentially avoiding reallocation inefficiencies: `new ArrayList<>(recordWriters.size())`



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();

Review Comment:
   You can log a warning here that closing was interrupted.



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);

Review Comment:
   I recommend using the version of this method that accepts a `ThreadFactory`, and probably use `ThreadFactoryBuilder`. The factory should generate threads that 1) use a naming format that makes it clear these threads are related to the closing process (e.g. "MultipleOutputs-close"), and 2) set an `UncaughtExceptionHandler` that logs the exception, which would make visible unexpected errors like unchecked exceptions.
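
A rough sketch of such a factory, written against the plain JDK `ThreadFactory` interface rather than `ThreadFactoryBuilder` so it has no external dependencies. The class and method names are assumptions for illustration, and the handler prints to `System.err` where real code would use a logger.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class NamedThreadFactorySketch {

  /** Builds daemon threads named "<prefix>-<n>" that report uncaught exceptions. */
  public static ThreadFactory namedFactory(String prefix,
      Thread.UncaughtExceptionHandler handler) {
    AtomicInteger counter = new AtomicInteger();
    return runnable -> {
      Thread t = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
      t.setDaemon(true);
      t.setUncaughtExceptionHandler(handler);
      return t;
    };
  }

  /** Runs one task on a pool built from the factory; returns the worker thread's name. */
  public static String workerName(String prefix) {
    AtomicReference<String> seen = new AtomicReference<>();
    ExecutorService pool = Executors.newFixedThreadPool(2,
        namedFactory(prefix, (thread, error) ->
            System.err.println("Uncaught exception in " + thread.getName() + ": " + error)));
    pool.execute(() -> seen.set(Thread.currentThread().getName()));
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return seen.get();
  }

  public static void main(String[] args) {
    System.out.println(workerName("MultipleOutputs-close"));
  }
}
```

One caveat worth keeping in mind: the handler only fires for tasks started with `execute`. Tasks run through `submit` or `invokeAll` have their exceptions captured in the returned `Future`, so those would need to be inspected via `Future.get()`.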



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();
+    }
+
+    if (ioException.get() != null) {
+      throw new IOException(ioException.get());

Review Comment:
   With this approach, if multiple record writers throw an exception during close, we'll only get visibility into one of them. I'd like to suggest a slightly different approach. Within the callable, catch the exception, log it immediately and flag an `AtomicBoolean`. Then, on this line, if that `AtomicBoolean` is set, throw an `IOException` from the overall method, with a message like "One or more threads encountered IOException during close. See prior errors."
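
The suggested approach could be sketched roughly as follows. `Closer` is a hypothetical stand-in for `RecordWriter`, the class and method names are invented for the example, and `System.err` takes the place of the project's logger; none of this is the patch's actual code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

public class ParallelCloseSketch {

  /** Hypothetical stand-in for RecordWriter: anything whose close may fail. */
  public interface Closer {
    void close() throws IOException;
  }

  public static void closeAll(List<Closer> closers, int nThreads) throws IOException {
    AtomicBoolean failed = new AtomicBoolean(false);
    // Exactly one callable per closer, so the list can be pre-sized.
    List<Callable<Object>> tasks = new ArrayList<>(closers.size());
    for (Closer closer : closers) {
      tasks.add(() -> {
        try {
          closer.close();
        } catch (IOException e) {
          // Log each failure immediately so none of them is lost.
          System.err.println("Error while closing: " + e);
          failed.set(true);
        }
        return null;
      });
    }
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      pool.invokeAll(tasks);
    } catch (InterruptedException e) {
      System.err.println("Closing was interrupted");
      Thread.currentThread().interrupt();
    } finally {
      pool.shutdown();
    }
    if (failed.get()) {
      throw new IOException(
          "One or more threads encountered IOException during close. See prior errors.");
    }
  }

  public static void main(String[] args) throws IOException {
    List<Closer> closers = new ArrayList<>();
    closers.add(() -> { });
    closers.add(() -> { });
    closeAll(closers, 2);
    System.out.println("closed all");
  }
}
```

The trade-off is that the thrown exception no longer carries a cause, so the log becomes the only place to find the individual failures.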





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1215932491

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 34s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   3m  5s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 41s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 58s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  20m 41s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 32s | [/patch-mvninstall-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-mvninstall-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | -1 :x: |  compile  |   1m 56s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   1m 56s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   1m 35s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   1m 35s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 59s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 2 new + 153 unchanged - 1 fixed = 155 total (was 154)  |
   | -1 :x: |  mvnsite  |   0m 35s | [/patch-mvnsite-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-mvnsite-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   1m 43s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   0m 36s | [/patch-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | -1 :x: |  shadedclient  |  10m 40s |  |  patch has errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 23s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | -1 :x: |  unit  |   0m 47s | [/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 109m 45s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | SpotBugs | module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412] |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 368017d5885c 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / aeaddf3c09256553e3840026800ca3bae3c99001 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/testReport/ |
   | Max. process+thread count | 1617 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/7/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r946108671


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;

Review Comment:
   Addressed.





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1245505808

   @cnauroth - Please help in reviewing the PR.




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r985127639


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -345,6 +356,11 @@ public static boolean getCountersEnabled(JobContext job) {
     return job.getConfiguration().getBoolean(COUNTERS_ENABLED, false);
   }
 
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter<?, ?>> recordWriters) {

Review Comment:
   yes



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -381,6 +406,11 @@ public static boolean getCountersEnabled(JobConf conf) {
   private Map<String, RecordWriter> recordWriters;
   private boolean countersEnabled;
 
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter> recordWriters) {

Review Comment:
   yes





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1229585365

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 13s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 43s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 25s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 22s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  20m 46s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 32s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |   2m 32s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 4 new + 286 unchanged - 0 fixed = 290 total (was 286)  |
   | +1 :green_heart: |  compile  |   2m 10s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |   2m 10s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 4 new + 269 unchanged - 0 fixed = 273 total (was 269)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 56s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 3 new + 153 unchanged - 1 fixed = 156 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m  3s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 12s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 135m 32s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  0s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 254m 14s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 5d89ad53c652 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6f804e1f3934e54440c1aeff4c08fd9f921e769b |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/testReport/ |
   | Max. process+thread count | 1573 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/10/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r946109580


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();

Review Comment:
   Addressed. 



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);

Review Comment:
   Done



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();

Review Comment:
   Done





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1113391592

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 57s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  4s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 25s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   6m 22s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 100m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 19fab0528717 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4646a36b1fcf62b39ab182741acecb2cf622ad7e |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/1/testReport/ |
   | Max. process+thread count | 1545 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r904363794


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -570,8 +570,14 @@ public void setStatus(String status) {
    */
   @SuppressWarnings("unchecked")
   public void close() throws IOException, InterruptedException {
-    for (RecordWriter writer : recordWriters.values()) {
-      writer.close(context);
-    }
+    recordWriters.values().parallelStream().forEach(writer -> {

Review Comment:
   I'm concerned that this could have unintended side effects for callers, because it changes the error contract. Errors during `close()` that were formerly visible as a checked `IOException` or `InterruptedException` now become an unchecked `RuntimeException`. In the case of thread interruption, the interrupt now occurs on the background thread with no propagation of interrupted status back up to the coordinating thread.
   
   Unfortunately, `parallelStream()` with a lambda puts us down this path. It would be more code, but directly managing a `ThreadPoolExecutor` would give you the chance to preserve the contract by unwrapping checked exceptions from the `Future` and propagating.
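
The executor-based approach suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: `ParallelCloser` and `closeAll` are hypothetical names, a plain `java.io.Closeable` stands in for `RecordWriter`, and wrapping non-`IOException` failures in an `IOException` is an illustrative choice.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; not part of Hadoop or of this PR.
final class ParallelCloser {
  private ParallelCloser() {
  }

  // Closes every resource on a fixed-size pool and rethrows failures as the
  // checked IOException that a sequential close loop would have produced.
  static void closeAll(Collection<? extends Closeable> resources, int nThreads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nThreads));
    try {
      List<Future<Void>> futures = new ArrayList<>();
      for (Closeable resource : resources) {
        Callable<Void> task = () -> {
          resource.close();
          return null;
        };
        futures.add(pool.submit(task));
      }
      IOException first = null;
      for (Future<Void> future : futures) {
        try {
          // If the coordinating thread is interrupted, get() throws a checked
          // InterruptedException, which propagates to the caller.
          future.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          // Assumption: anything that is not an IOException is wrapped, so the
          // caller still sees an IOException.
          IOException ioe = cause instanceof IOException
              ? (IOException) cause
              : new IOException("close failed", cause);
          if (first == null) {
            first = ioe;
          } else {
            first.addSuppressed(ioe);
          }
        }
      }
      if (first != null) {
        throw first;
      }
    } finally {
      // Also stops tasks that are still running when we leave early,
      // for example after an interrupt.
      pool.shutdownNow();
    }
  }
}
```

A call such as `ParallelCloser.closeAll(writers, 10)` waits for every close to finish, then throws the first `IOException` with any later ones attached as suppressed exceptions.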





[GitHub] [hadoop] cnauroth commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r904365716


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -570,8 +570,14 @@ public void setStatus(String status) {
    */
   @SuppressWarnings("unchecked")
   public void close() throws IOException, InterruptedException {
-    for (RecordWriter writer : recordWriters.values()) {
-      writer.close(context);
-    }
+    recordWriters.values().parallelStream().forEach(writer -> {

Review Comment:
   Additionally, if we agree with my assertion that it's important to preserve the error contract, then it would be good to have a unit test that fakes an `IOException` during `close()` and asserts that it propagates.
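
The kind of test described above might be shaped like this. It is a sketch only: it uses plain Java rather than JUnit so that it runs without dependencies, `ClosePropagationCheck` and its `CloseAll` interface are hypothetical names, and a `java.io.Closeable` lambda stands in for a `RecordWriter` whose `close()` fails.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test helper for illustration; not part of Hadoop or of this PR.
final class ClosePropagationCheck {
  private ClosePropagationCheck() {
  }

  // Hypothetical seam: whichever close-all strategy is under test.
  interface CloseAll {
    void closeAll(List<Closeable> writers) throws IOException, InterruptedException;
  }

  // Returns true only when a faked IOException from one writer reaches the
  // caller as an IOException (not swallowed, not turned into an unchecked one).
  static boolean ioExceptionPropagates(CloseAll strategy) throws InterruptedException {
    List<Closeable> writers = new ArrayList<>();
    writers.add(() -> { });
    writers.add(() -> {
      throw new IOException("simulated close failure");
    });
    try {
      strategy.closeAll(writers);
      return false; // the failure was swallowed
    } catch (IOException e) {
      return true; // the checked contract was preserved
    } catch (RuntimeException e) {
      return false; // the failure changed type on the way up
    }
  }
}
```

In a JUnit test the same check would typically be an assertion on the return value, or a test method that expects `IOException`.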





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1160488616

   > Thanks @ashutoshcipher for the patch.
   > 
   >     1. Can you include the same improvement to new api org.apache.hadoop.mapreduce.lib.output.MultipleOutputs as well.
   > 
   >     2. Can you add a test case if possible, else pls provide a reason.
   
   Thanks @PrabhuJoseph for the review and suggestions.
   
   1. I have updated `org.apache.hadoop.mapreduce.lib.output.MultipleOutputs` as well.
   2. I don't think we need a specific unit test in this case, as we are just parallelizing the existing calls and the current unit tests already exercise the overall features/cases.
   




[GitHub] [hadoop] steveloughran commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
steveloughran commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r905241976


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -570,8 +570,14 @@ public void setStatus(String status) {
    */
   @SuppressWarnings("unchecked")
   public void close() throws IOException, InterruptedException {
-    for (RecordWriter writer : recordWriters.values()) {
-      writer.close(context);
-    }
+    recordWriters.values().parallelStream().forEach(writer -> {

Review Comment:
   This is probably true. Looking at IOUtils, our own cleanupWithLogger() method catches all throwables, but closeSocket() only swallows IOEs. We must assume a lot of other code is similar, so raised IOEs must stay as IOEs.
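
To make the concern concrete, here is a small sketch of a caller that, like the code described above, only handles `IOException`. All names are hypothetical, and `java.io.Closeable` stands in for `RecordWriter`. The `parallelStream()` variant has to wrap the checked exception, so the caller's handler never runs.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Hypothetical demo for illustration; not part of Hadoop or of this PR.
final class ErrorContractDemo {
  private ErrorContractDemo() {
  }

  // A caller written against the original contract: it handles IOException only.
  static String callerThatExpectsIOException(List<Closeable> writers) {
    try {
      closeWithParallelStream(writers);
      return "closed";
    } catch (IOException e) {
      return "handled IOException";
    }
  }

  // Still declares IOException, but a failure can only leave the lambda as an
  // unchecked exception, so no IOException is ever thrown from here.
  static void closeWithParallelStream(List<Closeable> writers) throws IOException {
    writers.parallelStream().forEach(writer -> {
      try {
        writer.close();
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    });
  }
}
```

With a writer whose `close()` throws `IOException`, `callerThatExpectsIOException` does not return "handled IOException"; a `RuntimeException` escapes past the `catch` block instead.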





[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r956070096


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +558,41 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {
+    int nThreads = conf.getInt(MRConfig.MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT,
+        MRConfig.DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT);
+    AtomicBoolean encounteredException = new AtomicBoolean(false);
+    ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("MultipleOutputs-close")
+        .setUncaughtExceptionHandler(

Review Comment:
   I understand your concern. I hope the following makes sense.
   
   ```
       ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("MultipleOutputs-close")
           .setUncaughtExceptionHandler(((t, e) -> {
             LOG.error("Thread " + t + " failed unexpectedly", e);
             encounteredException.set(true);
           })).build();
   ```
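
One point worth checking with a handler like this is how the tasks are handed to the pool. With the standard `java.util.concurrent` executors, a task passed to `execute()` that throws reaches the thread's `UncaughtExceptionHandler`, whereas a task passed to `submit()` or `invokeAll()` has its exception captured in the returned `Future`, and the handler is not called. The sketch below illustrates this with a plain `ThreadFactory` lambda in place of `ThreadFactoryBuilder`; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo for illustration; not part of Hadoop or of this PR.
final class HandlerVisibilityDemo {
  private HandlerVisibilityDemo() {
  }

  // Returns true if the thread's UncaughtExceptionHandler observed the failure.
  static boolean handlerSeesFailure(boolean useSubmit) throws InterruptedException {
    AtomicBoolean encounteredException = new AtomicBoolean(false);
    CountDownLatch handlerRan = new CountDownLatch(1);
    ThreadFactory factory = runnable -> {
      Thread thread = new Thread(runnable, "close-demo");
      thread.setUncaughtExceptionHandler((t, e) -> {
        encounteredException.set(true);
        handlerRan.countDown();
      });
      return thread;
    };
    ExecutorService pool = Executors.newFixedThreadPool(1, factory);
    try {
      Runnable failing = () -> {
        throw new IllegalStateException("simulated close failure");
      };
      if (useSubmit) {
        Future<?> future = pool.submit(failing);
        try {
          future.get();
        } catch (ExecutionException expected) {
          // The failure is delivered here, wrapped in the Future.
        }
      } else {
        pool.execute(failing);
        // Give the worker thread time to invoke the handler.
        handlerRan.await(5, TimeUnit.SECONDS);
      }
    } finally {
      pool.shutdown();
      pool.awaitTermination(5, TimeUnit.SECONDS);
    }
    return encounteredException.get();
  }
}
```

Here `handlerSeesFailure(false)` reports that the handler fired, while `handlerSeesFailure(true)` reports that it did not, so a flag set only in the handler would miss failures from submitted tasks.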





[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1229747355

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 37s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 45s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 21s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 21s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  20m 45s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 34s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |   2m 34s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/11/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 4 new + 286 unchanged - 0 fixed = 290 total (was 286)  |
   | +1 :green_heart: |  compile  |   2m 12s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |   2m 12s | [/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/11/artifact/out/results-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 4 new + 269 unchanged - 0 fixed = 273 total (was 269)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 0 new + 153 unchanged - 1 fixed = 153 total (was 154)  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 49s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m  7s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  unit  | 134m 14s |  |  hadoop-mapreduce-client-jobclient in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 253m  7s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 63970ad25e77 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 8aaeb37146be73544849ea945ddfee3f256913e9 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/11/testReport/ |
   | Max. process+thread count | 1659 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/11/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r957377566


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +558,41 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {
+    int nThreads = conf.getInt(MRConfig.MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT,
+        MRConfig.DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT);
+    AtomicBoolean encounteredException = new AtomicBoolean(false);
+    ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("MultipleOutputs-close")
+        .setUncaughtExceptionHandler(

Review Comment:
   Thanks. Addressed this in my next commits.



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/lib/TestMultipleOutputs.java:
##########
@@ -70,6 +76,20 @@ public void testWithCounters() throws Exception {
     _testMOWithJavaSerialization(true);
   }
 
+  @Test(expected=IOException.class)
+  public void testParallelClose() throws IOException, InterruptedException {

Review Comment:
   Addressed





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1256085726

   @cnauroth @steveloughran - I have addressed the previous comments. Please help review this PR when you have time. Thanks.




[GitHub] [hadoop] aajisaka commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
aajisaka commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r985124236


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -345,6 +356,11 @@ public static boolean getCountersEnabled(JobContext job) {
     return job.getConfiguration().getBoolean(COUNTERS_ENABLED, false);
   }
 
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter<?, ?>> recordWriters) {

Review Comment:
   Would you make it package-private?



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +557,42 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    *                             could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {

Review Comment:
   `InterruptedException` is not thrown in this method and should be removed. This class is annotated `@Public`, so adding a checked exception to the signature may cause compile errors in existing callers. Also, we can remove the code below from the test.
   ```
         try {
           mos.close();
         } catch (InterruptedException e) {
           throw new RuntimeException(e);
         }
   ```
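
One way to keep the public signature at `throws IOException` while still closing writers on a thread pool is sketched below. This is a minimal illustration, not the code merged in this PR: `ParallelCloser` and `closeAll` are hypothetical names, and plain `java.io.Closeable` stands in for `RecordWriter`. An interruption is handled inside the method by restoring the interrupt flag and rethrowing as `InterruptedIOException`, which is a subclass of `IOException`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration only; not part of Hadoop.
final class ParallelCloser {
  private ParallelCloser() {
  }

  /** Closes every resource on a fixed-size pool; declares only IOException. */
  static void closeAll(List<? extends Closeable> resources, int nThreads)
      throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nThreads));
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (Closeable c : resources) {
        tasks.add(() -> {
          c.close();
          return null;
        });
      }
      IOException first = null;
      // invokeAll blocks until all tasks have finished, so get() does not wait.
      for (Future<Void> f : pool.invokeAll(tasks)) {
        try {
          f.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          IOException ioe = cause instanceof IOException
              ? (IOException) cause : new IOException(cause);
          if (first == null) {
            first = ioe;               // report the first failure...
          } else {
            first.addSuppressed(ioe);  // ...and keep the rest as suppressed
          }
        }
      }
      if (first != null) {
        throw first;
      }
    } catch (InterruptedException e) {
      // Restore the flag and surface the interruption as an IOException subtype,
      // so callers compiled against "throws IOException" keep compiling.
      Thread.currentThread().interrupt();
      InterruptedIOException iioe =
          new InterruptedIOException("Interrupted while closing outputs");
      iioe.initCause(e);
      throw iioe;
    } finally {
      pool.shutdown();
    }
  }
}
```

With this shape, a caller needs no `catch (InterruptedException e)` block around `close()`, which is why the test snippet quoted above becomes unnecessary.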



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -381,6 +406,11 @@ public static boolean getCountersEnabled(JobConf conf) {
   private Map<String, RecordWriter> recordWriters;
   private boolean countersEnabled;
 
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter> recordWriters) {

Review Comment:
   Would you make this method package-private?





[GitHub] [hadoop] cnauroth commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
cnauroth commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1270833272

   I have merged this to trunk and branch-3.3 (after resolving a trivial merge conflict).
   
   @ashutoshcipher , thank you for the contribution.
   
   @steveloughran and @aajisaka , thank you for the code reviews.




[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1176810936

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 59s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  3s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 40s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   6m 58s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 102m  3s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux de46802f0374 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / de155c80c5e9d482c4cf2a2aa2ba7c572a0dabf8 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/3/testReport/ |
   | Max. process+thread count | 1274 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1183743712

   Thanks @cnauroth for your comments. I will look into them.
   
   About the unit test: I planned to complete the main function first (making any changes required by new comments) and then write the unit test as well :)




[GitHub] [hadoop] hadoop-yetus commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1215932490

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 54s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 37s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   3m  4s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   2m 42s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 54s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m  0s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 43s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  21m  7s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 32s | [/patch-mvninstall-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-mvninstall-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | -1 :x: |  compile  |   1m 57s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   1m 56s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   1m 39s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   1m 39s | [/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapreduce-client-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-mapreduce-client in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m  2s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 2 new + 153 unchanged - 1 fixed = 155 total (was 154)  |
   | -1 :x: |  mvnsite  |   0m 37s | [/patch-mvnsite-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-mvnsite-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   1m 47s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   0m 35s | [/patch-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | -1 :x: |  shadedclient  |  10m 51s |  |  patch has errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 19s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | -1 :x: |  unit  |   0m 45s | [/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt) |  hadoop-mapreduce-client-jobclient in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 111m 54s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | SpotBugs | module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   |  |  Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412] |
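
The inconsistent-synchronization finding above typically appears when a field is accessed under a lock in some methods and without one in others. A minimal sketch of the usual remedy, guarding every access with the same lock, follows. `WriterRegistry` and its methods are hypothetical names used only for illustration, with `String` standing in for the writer type; this is not the change made in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class that owns a shared writer map.
final class WriterRegistry {
  private Map<String, String> writers = new HashMap<>();

  // Package-private test hook; takes the same intrinsic lock as all other access.
  synchronized void setWriters(Map<String, String> replacement) {
    this.writers = new HashMap<>(replacement);
  }

  synchronized void put(String name, String writer) {
    writers.put(name, writer);
  }

  synchronized int size() {
    return writers.size();
  }

  // Hands out a copy so callers never touch the guarded map outside the lock.
  synchronized Map<String, String> snapshot() {
    return new HashMap<>(writers);
  }
}
```

Because every method that reads or writes `writers` is `synchronized` on the same instance, no access path is left unguarded.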
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4248 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 4b72e3ebaef3 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / aeaddf3c09256553e3840026800ca3bae3c99001 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/testReport/ |
   | Max. process+thread count | 1466 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4248/6/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r946109977


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();
+
     for (RecordWriter writer : recordWriters.values()) {
-      writer.close(null);
+      callableList.add(() -> {
+        try {
+          writer.close(null);
+          throw new IOException();
+        } catch (IOException e) {
+          ioException.set(e);
+        }
+        return null;
+      });
+    }
+    try {
+      executorService.invokeAll(callableList);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt();
+    } finally {
+      executorService.shutdown();
+    }
+
+    if (ioException.get() != null) {
+      throw new IOException(ioException.get());

Review Comment:
   Addressed





[GitHub] [hadoop] ashutoshcipher commented on a diff in pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r946108915


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -528,8 +545,33 @@ public void collect(Object key, Object value) throws IOException {
    *                             could not be closed properly.
    */
   public void close() throws IOException {
+    int nThreads = 10;
+    AtomicReference<IOException> ioException = new AtomicReference<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
+
+    List<Callable<Object>> callableList = new ArrayList<>();

Review Comment:
   Addressed.





[GitHub] [hadoop] ashutoshcipher commented on pull request #4248: MAPREDUCE-7370. Parallelize MultipleOutputs#close call

Posted by GitBox <gi...@apache.org>.
ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1219961390

   Hi @cnauroth - I have tried to address all your comments. Can you please review it again?
   
   Thanks.

