Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2009/01/06 06:49:44 UTC

[jira] Commented: (HADOOP-4984) Code to create the UI display string for queues in the Capacity Scheduler needs to be synchronized, and needs to better update its information

    [ https://issues.apache.org/jira/browse/HADOOP-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661050#action_12661050 ] 

Vivek Ratan commented on HADOOP-4984:
-------------------------------------

The refactoring work in HADOOP-4980 fixes this problem. While I realize that we don't usually want to clump together more than one fix in a patch, the work in HADOOP-4980 went a long way toward simplifying the fix for this issue, so I didn't create a separate patch here. By making the _SchedulingInfo_ class functionally 'outside' the scheduler class, and thus unaware of the latter's data structures, and by moving the generation of the display strings to the relevant _QueueSchedulingInfo_ and _TaskSchedulingInfo_ objects, the synchronization problem is easily addressed. We also don't update the QSI objects when building the display string, preferring to show potentially slightly stale information without paying a performance penalty.
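
As a rough illustration of the approach described above (this is not the actual patch; every class, field, and method name below is hypothetical and only loosely modeled on _SchedulingInfo_ and _QueueSchedulingInfo_), the display string can be built by the per-queue object itself under its own lock, while the outer class simply delegates and never touches scheduler data structures:

```java
// Hypothetical sketch only: names and fields are assumptions, not the Hadoop source.
final class QueueSchedulingInfoSketch {
    private final String queueName;
    private int runningMaps;
    private int runningReduces;

    QueueSchedulingInfoSketch(String queueName) {
        this.queueName = queueName;
    }

    // Writers (e.g. a heartbeat handler or a reclaim-capacity thread)
    // update the counts while holding this object's lock.
    synchronized void setRunningTasks(int maps, int reduces) {
        this.runningMaps = maps;
        this.runningReduces = reduces;
    }

    // The display string is built from this object's own fields under the
    // same lock, so it is a consistent snapshot. It does not recompute the
    // counts, so the values may be slightly stale.
    @Override
    public synchronized String toString() {
        return "Queue: " + queueName + "\n"
             + "Running maps: " + runningMaps + "\n"
             + "Running reduces: " + runningReduces + "\n";
    }
}

// Stands in for a SchedulingInfo-like class: it holds only a reference to
// the per-queue object and knows nothing about the scheduler's structures.
final class SchedulingInfoSketch {
    private final QueueSchedulingInfoSketch qsi;

    SchedulingInfoSketch(QueueSchedulingInfoSketch qsi) {
        this.qsi = qsi;
    }

    @Override
    public String toString() {
        return qsi.toString();
    }
}
```

The trade-off in this sketch mirrors the one above: the reader takes a short lock on a single object rather than forcing a refresh of the scheduler's structures, so the UI may lag the true counts until the next update.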

> Code to create the UI display string for queues in the Capacity Scheduler needs to be synchronized, and needs to better update its information
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4984
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>
> There are a couple of problems with _SchedulingInfo.toString()_, the code which creates the UI display string for a queue: 
> * it needs synchronized access to the _QueueSchedulingInfo_ object, as this same object can be updated by the reclaim-capacity thread, and during a heartbeat.
> * the code directly updates its count of running map/reduce tasks. This should be done in a better way, perhaps by calling updateQSIObjects(), rather than walking through the data structures directly. It's also not clear that we want to pay the performance penalty of updating the structures. It may be OK to provide slightly stale info (the 'staleness' is tiny in a large, steady-state system where heartbeats are coming in frequently).
