Posted to mapreduce-issues@hadoop.apache.org by "Jason Lowe (Created) (JIRA)" <ji...@apache.org> on 2012/02/15 18:32:59 UTC

[jira] [Created] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
-----------------------------------------------------------------------------------

                 Key: MAPREDUCE-3862
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2, nodemanager
    Affects Versions: 0.23.1
            Reporter: Jason Lowe


When a nodemanager attempts to shut down cleanly, it can appear to hang because of lingering DeletionService threads.  This can occur when yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value and one or more containers execute on the node shortly before the shutdown.

The DeletionService never calls {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on its ScheduledThreadPoolExecutor, so the executor keeps its default policy: after shutdown() it still waits for every already-scheduled delayed task to run before it terminates.
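
The behavior described above can be reproduced outside of Hadoop with a small standalone sketch. This is not the DeletionService code; the class and method names (DelayedTaskShutdownDemo, terminatesPromptly) are hypothetical, and the 60-second no-op task merely stands in for a deletion delayed by a large debug-delay-sec value.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DelayedTaskShutdownDemo {

    // Schedules one long-delayed task, calls shutdown(), and reports whether
    // the executor reached the terminated state within waitMillis.
    static boolean terminatesPromptly(boolean runDelayedTasksAfterShutdown,
                                      long waitMillis) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(
            runDelayedTasksAfterShutdown);

        // Hypothetical stand-in for a delayed deletion task.
        Runnable noop = () -> { };
        sched.schedule(noop, 60, TimeUnit.SECONDS);

        sched.shutdown();
        boolean terminated = false;
        try {
            terminated = sched.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Drop anything still queued so the demo itself never lingers.
        sched.shutdownNow();
        return terminated;
    }

    public static void main(String[] args) {
        System.out.println("policy true (default): terminated = "
            + terminatesPromptly(true, 500));
        System.out.println("policy false:          terminated = "
            + terminatesPromptly(false, 500));
    }
}
```

With the default policy the executor should still be waiting on the delayed task when the 500 ms wait expires; with the policy set to false, shutdown() cancels the pending delayed task and the executor should terminate almost immediately.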

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Robert Joseph Evans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212756#comment-13212756 ] 

Robert Joseph Evans commented on MAPREDUCE-3862:
------------------------------------------------

I do not know of an open ticket for this.  I assumed that most components should be doing this anyways, because if the process crashes badly there is no guarantee that the files would be deleted, so I assumed that it should have been part of their initial design.  If that is not the case then yes we should file a JIRA for it.
                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210606#comment-13210606 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #1820 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1820/])
    MAPREDUCE-3862 Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245781)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245781
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210954#comment-13210954 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/])
    MAPREDUCE-3862 Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245781)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245781
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-3862:
----------------------------------

    Attachment: MAPREDUCE-3862.patch

Patch updated for unit test failures.
                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208806#comment-13208806 ] 

Hadoop QA commented on MAPREDUCE-3862:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12514708/MAPREDUCE-3862.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.yarn.server.nodemanager.TestDeletionService

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1869//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1869//console

This message is automatically generated.
                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210907#comment-13210907 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk #959 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/])
    MAPREDUCE-3862 Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245781)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245781
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210608#comment-13210608 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Common-trunk-Commit #1746 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1746/])
    MAPREDUCE-3862 Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245781)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245781
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210917#comment-13210917 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Hdfs-0.23-Build #172 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/172/])
    svn merge -c 1245781 from trunk to branch 0.23 FIXES MAPREDUCE-3862. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245794)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245794
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-3862:
----------------------------------

            Assignee: Jason Lowe
    Target Version/s: 0.24.0, 0.23.2
              Status: Patch Available  (was: Open)
    

        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-3862:
----------------------------------

    Target Version/s: 0.24.0, 0.23.2  (was: 0.23.2, 0.24.0)
              Status: Open  (was: Patch Available)

Canceling patch to investigate test failures.
                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210635#comment-13210635 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #1758 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1758/])
    MAPREDUCE-3862 Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245781)

     Result = ABORTED
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245781
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                

        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-3862:
----------------------------------

    Attachment: MAPREDUCE-3862.patch

Patch to call setExecuteExistingDelayedTasksAfterShutdownPolicy() on init and fall back to shutdownNow() if ScheduledThreadPoolExecutor.awaitTermination() fails.
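
For readers without the attachment, the shutdown pattern described in this update can be sketched roughly as follows. This is an illustration only, not the contents of MAPREDUCE-3862.patch; the class name GracefulStopSketch, the stop() helper, and the timeout values are assumptions made for the example.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class GracefulStopSketch {

    // Tries an orderly shutdown first and falls back to shutdownNow() when
    // the executor has not terminated within timeoutMillis.
    // Returns true if the orderly shutdown was enough.
    static boolean stop(ScheduledThreadPoolExecutor sched, long timeoutMillis) {
        sched.shutdown();
        boolean terminated = false;
        try {
            terminated = sched.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt and treat it like a timeout.
            Thread.currentThread().interrupt();
        }
        if (!terminated) {
            // Discards queued tasks and interrupts running ones.
            sched.shutdownNow();
        }
        return terminated;
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        // Set once at init time so pending delayed tasks do not block shutdown().
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);

        Runnable noop = () -> { };
        sched.schedule(noop, 60, TimeUnit.SECONDS);

        System.out.println("orderly shutdown sufficed: " + stop(sched, 1000));
    }
}
```

The fallback matters for tasks that are already running when shutdown() is called: the delayed-task policy only affects tasks still waiting in the queue, so a slow in-flight task can still outlast the awaitTermination() timeout.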
                

        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210657#comment-13210657 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Common-0.23-Commit #568 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/568/])
    svn merge -c 1245781 from trunk to branch 0.23 FIXES MAPREDUCE-3862. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245794)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245794
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                
> Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3862
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>             Fix For: 0.23.2
>
>         Attachments: MAPREDUCE-3862.patch, MAPREDUCE-3862.patch
>
>
> When a nodemanager attempts to shutdown cleanly, it's possible for it to appear to hang due to lingering DeletionService threads.  This can occur when yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value and one or more containers executes on the node shortly before the shutdown.
> The DeletionService is never calling {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on the ScheduledThreadPoolExecutor, and it defaults to waiting for all scheduled tasks to complete before exiting.


        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Robert Joseph Evans (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3862:
-------------------------------------------

          Resolution: Fixed
       Fix Version/s: 0.23.2
    Target Version/s: 0.24.0, 0.23.2  (was: 0.23.2, 0.24.0)
              Status: Resolved  (was: Patch Available)

Thanks, Jason. I just committed this to trunk and branch 0.23.
                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Vinod Kumar Vavilapalli (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211209#comment-13211209 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-3862:
----------------------------------------------------

bq. we have to be able to handle starting up from an unclean shutdown, so files need to be deleted on startup as well. +1 
Yes, that is what happens in 1.0.* too. Do you know of an open ticket for this?
                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210675#comment-13210675 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Commit #571 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/571/])
    svn merge -c 1245781 from trunk to branch 0.23 FIXES MAPREDUCE-3862. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245794)

     Result = ABORTED
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245794
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210656#comment-13210656 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Hdfs-0.23-Commit #555 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/555/])
    svn merge -c 1245781 from trunk to branch 0.23 FIXES MAPREDUCE-3862. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245794)

     Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245794
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Robert Joseph Evans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210546#comment-13210546 ] 

Robert Joseph Evans commented on MAPREDUCE-3862:
------------------------------------------------

I like the approach that this is taking.  We have to be able to handle starting up from an unclean shutdown, so files need to be deleted on startup as well.  +1 for the patch.
                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208985#comment-13208985 ] 

Jason Lowe commented on MAPREDUCE-3862:
---------------------------------------

Note that with this patch the DeletionService can leave some scheduled files undeleted to avoid long hangs at shutdown.  A couple of alternatives:

* Implement a customized ScheduledThreadPoolExecutor that executes all scheduled tasks immediately upon shutdown rather than waiting the specified delays for each scheduled task.  This could still lead to long shutdown times if there are directories scheduled to be deleted with tons of files.
* Declare the existing behavior "as-intended" and note that NMs can take up to {{yarn.nodemanager.delete.debug-delay-sec}} seconds to finish shutting down.  It would be helpful to log a message while waiting.
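
The first alternative could be sketched roughly as follows. This is only an illustration under assumptions, not the attached patch: rather than subclassing ScheduledThreadPoolExecutor it wraps one, and the class name FlushOnStopScheduler and its methods are hypothetical. Pending tasks are tracked separately so that stop() can run them immediately on the calling thread instead of waiting out their delays.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper for illustration; not part of Hadoop.
// Sketch only: it does not coordinate concurrent schedule() and stop() calls.
class FlushOnStopScheduler {
  private final ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
  private final Set<OnceTask> pending = ConcurrentHashMap.newKeySet();

  // Ensures the body runs at most once, whether the pool or stop() reaches it first.
  private final class OnceTask implements Runnable {
    private final Runnable body;
    private final AtomicBoolean claimed = new AtomicBoolean(false);

    OnceTask(Runnable body) {
      this.body = body;
    }

    @Override
    public void run() {
      if (claimed.compareAndSet(false, true)) {
        try {
          body.run();
        } finally {
          pending.remove(this);
        }
      }
    }
  }

  FlushOnStopScheduler() {
    // Have the pool drop still-delayed tasks at shutdown; stop() runs them itself.
    sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
  }

  void schedule(Runnable body, long delay, TimeUnit unit) {
    OnceTask task = new OnceTask(body);
    pending.add(task);
    try {
      sched.schedule(task, delay, unit);
    } catch (RejectedExecutionException e) {
      pending.remove(task);
      throw e;
    }
  }

  void stop() {
    sched.shutdown();
    // Run whatever is still pending right now, ignoring the remaining delays.
    for (OnceTask task : pending) {
      task.run();
    }
    try {
      if (!sched.awaitTermination(10, TimeUnit.SECONDS)) {
        sched.shutdownNow();
      }
    } catch (InterruptedException e) {
      sched.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }
}
```

As noted in the first bullet, stop() here still takes as long as the pending deletions themselves take, so a directory with a very large number of files would still delay shutdown.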
                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210946#comment-13210946 ] 

Hudson commented on MAPREDUCE-3862:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Build #200 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/200/])
    svn merge -c 1245781 from trunk to branch 0.23 FIXES MAPREDUCE-3862. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads (Jason Lowe via bobby) (Revision 1245794)

     Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245794
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDeletionService.java

                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208955#comment-13208955 ] 

Hadoop QA commented on MAPREDUCE-3862:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12514725/MAPREDUCE-3862.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1870//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1870//console

This message is automatically generated.
                


        

[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-3862:
----------------------------------

    Target Version/s: 0.24.0, 0.23.2  (was: 0.23.2, 0.24.0)
              Status: Patch Available  (was: Open)
    


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Jason Lowe (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208601#comment-13208601 ] 

Jason Lowe commented on MAPREDUCE-3862:
---------------------------------------

DeletionService has the following code, which implies we don't want to wait too long for the shutdown to complete:

{code}
  public void stop() {
    sched.shutdown();
    try {
      sched.awaitTermination(10, SECONDS);
    } catch (InterruptedException e) {
      sched.shutdownNow();
    }
    super.stop();
  }
{code}

However, the code never checks the result of {{awaitTermination()}}, so we can end up continuing the shutdown process with the thread pool still active.
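
A minimal sketch of one way to bound the shutdown is shown below. It is an illustration under assumptions, not the attached patch: the BoundedStop class and its stop() helper are hypothetical names. It tells the executor not to run still-delayed tasks after shutdown and forces termination when {{awaitTermination()}} reports a timeout.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Hadoop.
class BoundedStop {
  /**
   * Shuts the pool down and waits up to the given timeout.
   * Returns true if the pool terminated on its own within the timeout,
   * false if shutdownNow() had to be called.
   */
  static boolean stop(ScheduledThreadPoolExecutor sched, long timeout, TimeUnit unit) {
    // The default policy is true, which keeps delayed tasks alive after
    // shutdown(); false makes the pool discard them instead.
    sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
    sched.shutdown();

    boolean terminated = false;
    try {
      terminated = sched.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      // Preserve the interrupt for the caller; the pool is forced down below.
      Thread.currentThread().interrupt();
    }
    if (!terminated) {
      // Timed out or interrupted: cancel queued tasks and interrupt workers.
      sched.shutdownNow();
    }
    return terminated;
  }
}
```

With this approach, deletions that are still waiting on their delay are dropped at shutdown rather than executed.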
                


        

[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads

Posted by "Vinod Kumar Vavilapalli (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212816#comment-13212816 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-3862:
----------------------------------------------------

Thanks, created MAPREDUCE-3888.
                
