You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2020/04/17 17:41:00 UTC

[GitHub] [hadoop] steveloughran opened a new pull request #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.

steveloughran opened a new pull request #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.
URL: https://github.com/apache/hadoop/pull/1963
 
 
   
   Contributed by Steve Loughran.
   
   Fixes a condition which can cause job commit to fail if a task was
   aborted < 60s before the job commit commenced: the task abort
   will shut down the thread pool with a hard exit after 60s; the
   job commit POST requests would be scheduled through the same pool,
   so be interrupted and fail. At present the access is synchronized,
   but presumably the executor shutdown code is calling wait() and releasing
   locks.
   
   Task abort is triggered from the AM when task attempts succeed but
   there are still active speculative task attempts running. Thus it
   only surfaces when speculation is enabled and the final tasks are
   speculating, which, given they are the stragglers, is not
   unheard of.
   
   The fix copies and clears the threadPool field in a synchronized block,
   then shuts it down; job commit will encounter the empty field and
   demand-create a new one. As would a sequence of task aborts.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on issue #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.

Posted by GitBox <gi...@apache.org>.
steveloughran commented on issue #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.
URL: https://github.com/apache/hadoop/pull/1963#issuecomment-615377723
 
 
   Test run against london with exactly the same flags which triggered the failure -you have to have an overloaded test system to cause the execution delays which trigger speculative task execution and then this failure. All good, at least on this run. 
   
   ```
   -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dmarkers=keep -Dauth
   ```
   
   Rerunning with 12 threads to see if that can trigger it
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on issue #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on issue #1963: HADOOP-16798. S3A Committer thread pool shutdown problems.
URL: https://github.com/apache/hadoop/pull/1963#issuecomment-615403186
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 39s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  19m 53s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m  5s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   1m  0s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   0m 57s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 28s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 33s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 51s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   1m  3s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 27s |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  The patch does not generate ASF License warnings.  |
   |  |   |  59m 24s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1963/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1963 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux f3bf80f4b706 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3601054 |
   | Default Java | 1.8.0_242 |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1963/1/testReport/ |
   | Max. process+thread count | 415 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1963/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org