Posted to yarn-issues@hadoop.apache.org by "Derek Dagit (JIRA)" <ji...@apache.org> on 2012/12/03 22:17:58 UTC

[jira] [Created] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Derek Dagit created YARN-256:
--------------------------------

             Summary: Increase retention of applications in AppManager without sending more applications to the UI
                 Key: YARN-256
                 URL: https://issues.apache.org/jira/browse/YARN-256
             Project: Hadoop YARN
          Issue Type: Improvement
    Affects Versions: 0.23.5
            Reporter: Derek Dagit


In very busy clusters we would like to retain applications longer so that users' links will not expire too soon.  Very often links to application history expire before they can be followed.

Simply increasing max-completed applications has an adverse performance impact on the applications list in the web UI because it presents the entire list of applications with a request.

Therefore, we would like some way to be able to increase the retention of applications without increasing the number of applications sent to the Web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Derek Dagit (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Derek Dagit reassigned YARN-256:
--------------------------------

    Assignee: Derek Dagit
    

[jira] [Commented] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509526#comment-13509526 ] 

Vinod Kumar Vavilapalli commented on YARN-256:
----------------------------------------------

I totally understand the issue at hand, but did we investigate how hard it is to do the UI change to not load all the applications? Maybe that is the first issue to fix? Once we can fix that, increasing max-completed apps is an obvious solution.

Completed applications are explicitly managed by RMAppManager. That was always a short-term solution, the long-term one being MAPREDUCE-3061.
                

[jira] [Commented] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Derek Dagit (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510007#comment-13510007 ] 

Derek Dagit commented on YARN-256:
----------------------------------

What I meant was that the map does not provide ordering of elements.
                

[jira] [Updated] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Derek Dagit (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Derek Dagit updated YARN-256:
-----------------------------

    Attachment: YARN-256-branch-0.23-wip.patch

Uploading a work-in-progress patch for discussion purposes.
                

[jira] [Commented] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Derek Dagit (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509348#comment-13509348 ] 

Derek Dagit commented on YARN-256:
----------------------------------

One approach would be to add a new configuration variable to limit the maximum number of applications used in the web UI presentation.  With such a configuration variable set to N applications, the user would see, more or less, the last N applications that were submitted when browsing.


Implementing the use of such a configuration has challenges:

The RMAppManager has a member that implements RMContext for the purpose of concurrent access to pieces of the RM state.  The RMContext interface defines a method getRMApps() that returns a java.util.concurrent.ConcurrentMap (with ConcurrentHashMap as its implementation) holding the mapping of ApplicationId to RMApps.

Given that we want to return the newest "N" RMApps, we would need to walk the entire map since there is no ordering of keys.
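
To make the cost concrete, here is a minimal sketch of what "walk the entire map" looks like. It is not the RM code: a Long stands in for ApplicationId and a String for RMApp, larger ids are assumed to be newer, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-ins: Long plays the role of ApplicationId and
// String plays the role of RMApp. Larger ids are assumed to be newer.
class NewestAppsSketch {

    // Returns the values for the n largest keys, newest first.
    // Every key is visited because the hash map keeps no ordering.
    static List<String> newest(ConcurrentMap<Long, String> apps, int n) {
        List<Long> ids = new ArrayList<Long>(apps.keySet()); // copies every key
        Collections.sort(ids, Collections.reverseOrder());   // sorts all of them
        List<String> result = new ArrayList<String>();
        for (Long id : ids) {
            if (result.size() >= n) {
                break;
            }
            String app = apps.get(id);
            if (app != null) { // the entry may have been removed concurrently
                result.add(app);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ConcurrentMap<Long, String> apps = new ConcurrentHashMap<Long, String>();
        for (long i = 1; i <= 5; i++) {
            apps.put(i, "app_" + i);
        }
        System.out.println(newest(apps, 2)); // prints [app_5, app_4]
    }
}
```

The work done per request grows with the total number of retained applications, not with N, which is the problem described above.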


A couple of strategies and some cons:


1) Implement a second data structure that maintains the order the ConcurrentHashMap lacks.

- Maintenance of the two separate structures in a concurrent environment could be nasty.


2) Change the map to a data structure that supports fast deletions, updates, and retrieval while maintaining ordering.

- No provided concurrent structure exists that has these qualities, so more work.
- Locking would need to be done by the caller.


3) Encapsulate the map behind a set of method calls

- Large scope of code change.


One other thing to note: With the current implementation, there are cases in which we walk the elements in the map without locking.  ConcurrentHashMap does not guarantee that iterators remain consistent with changes in structure after the iterators are created.  Practically, this means that if an RMApp is removed from the ConcurrentHashMap while we are walking it, then there is a possibility we may crash.  (ConcurrentHashMap will not throw a ConcurrentModificationException.)

http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html
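
A small sketch of the iterator behavior described in that Javadoc, using plain Integer and String stand-ins rather than the RM types (the class and method names are hypothetical). It clears the map in the middle of a walk over values(): a plain HashMap detects the change and throws, while a ConcurrentHashMap finishes the walk without an exception, and its iterator may or may not reflect the removal.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WeakIterationSketch {

    // Walks the values and clears the map on the first element visited.
    // Returns true if the walk failed with ConcurrentModificationException.
    static boolean clearDuringWalkThrows(Map<Integer, String> apps) {
        try {
            for (String app : apps.values()) {
                apps.clear(); // structural change in the middle of the walk
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Helper for the demo: a map with three entries.
    static Map<Integer, String> fill(Map<Integer, String> apps) {
        apps.put(1, "app_1");
        apps.put(2, "app_2");
        apps.put(3, "app_3");
        return apps;
    }

    public static void main(String[] args) {
        System.out.println(clearDuringWalkThrows(fill(new HashMap<Integer, String>())));
        System.out.println(clearDuringWalkThrows(fill(new ConcurrentHashMap<Integer, String>())));
    }
}
```

The absence of an exception does not mean the walk saw a consistent snapshot, so code that assumes every visited RMApp is still present in the map can still misbehave.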

When collections are returned from .values(), these Collections are backed by the map.  We do such a thing in several places currently.

- GetAllApplicationsResponse#getAllApplications()
- RMWebServices#getApps()
- ClientRMService#getQueueInfo()
- AppsList#toDataTableArrays()
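
As an illustration of what "backed by" means here, the following sketch (hypothetical names, Integer and String in place of the RM types) compares the live view returned by values() with a copy taken at one point in time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ValuesViewSketch {

    // Returns { size of the live view, size of the copy } after one removal.
    static int[] sizesAfterRemoval() {
        ConcurrentMap<Integer, String> apps = new ConcurrentHashMap<Integer, String>();
        apps.put(1, "app_1");
        apps.put(2, "app_2");

        Collection<String> view = apps.values();                  // live view of the map
        List<String> snapshot = new ArrayList<String>(apps.values()); // independent copy

        apps.remove(1); // later change to the map

        return new int[] { view.size(), snapshot.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterRemoval();
        System.out.println("view=" + sizes[0] + " snapshot=" + sizes[1]); // prints view=1 snapshot=2
    }
}
```

Callers holding the view see later removals; callers that need a stable list have to pay for the copy.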




I am leaning toward option 3) above, and what I would want to do is something like the following:


- Remove the map from the RMContext

- Add a LinkedHashMap private to the RMContext

- Remove getRMApps() from RMContext

- Add to RMContext getRMAppForAppId(ApplicationId) -> returns the desired app


Use a visitor pattern to do proper locking and hide the underlying data structure:

- Add to RMContext acceptVisitor(RMAppsVisitor) -> execute logic on each RMApp with proper locking

- Provide the RMAppsVisitor interface.

- Change all getRMApps().get(ApplicationId) -> getRMAppForAppId(ApplicationId)

- Where there are walks over map values, pass in loop logic as an RMAppsVisitor to acceptVisitor()
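
A rough sketch of how this proposal might look, under heavy simplification: Strings stand in for ApplicationId and RMApp, locking is a single intrinsic lock, and addRMApp() and the class name RMContextSketch are hypothetical. Only getRMAppForAppId(), acceptVisitor() and RMAppsVisitor come from the list above, and their signatures here are guesses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Callback that receives each application during a locked walk.
interface RMAppsVisitor {
    void visit(String appId, String app);
}

// Simplified stand-in for RMContext; the map is private and never exposed.
class RMContextSketch {
    // LinkedHashMap keeps insertion order, so iteration is oldest first.
    private final Map<String, String> apps = new LinkedHashMap<String, String>();

    // Hypothetical helper so the sketch can be populated.
    synchronized void addRMApp(String appId, String app) {
        apps.put(appId, app);
    }

    // Replaces getRMApps().get(appId).
    synchronized String getRMAppForAppId(String appId) {
        return apps.get(appId);
    }

    // Runs the visitor on every app while holding the lock.
    synchronized void acceptVisitor(RMAppsVisitor visitor) {
        for (Map.Entry<String, String> e : apps.entrySet()) {
            visitor.visit(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        RMContextSketch context = new RMContextSketch();
        context.addRMApp("application_1", "first");
        context.addRMApp("application_2", "second");

        final List<String> seen = new ArrayList<String>();
        context.acceptVisitor(new RMAppsVisitor() {
            public void visit(String appId, String app) {
                seen.add(appId);
            }
        });
        System.out.println(seen); // prints [application_1, application_2]
    }
}
```

In this sketch a visitor must not add or remove applications on the same context during the walk, because the lock is reentrant and the LinkedHashMap iterator would fail on the structural change.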



At this point I am looking for input, and I would appreciate any comments.

                

[jira] [Commented] (YARN-256) Increase retention of applications in AppManager without sending more applications to the UI

Posted by "Derek Dagit (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510004#comment-13510004 ] 

Derek Dagit commented on YARN-256:
----------------------------------

It would not be beneficial to change the UI to be selective, because we use a map to store each RMApp, which does not provide ordering.  We must send all of the RMApps to the UI each time to give a consistent view.

I think something must be done with this map in order to avoid the problem, and that's why I'm thinking about the alternatives above.
                