You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Arpit Agarwal (JIRA)" <ji...@apache.org> on 2014/06/18 10:10:03 UTC

[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

    [ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035080#comment-14035080 ] 

Arpit Agarwal commented on HADOOP-10281:
----------------------------------------

Hi Chris, I assume this is also optional like FairCallQueue.

High level concerns:
# Every call results in history mutation which seems to be overkill. I think we can get away with a sampling approach, where a given call is fed as input to scheduler with a small probability.
# Do you guys have any numbers from performance analysis at scale? It seems like the combined changes are adding a few interlocked operations in the call queue path, and indirect locks via use of ConcurrentHashMap. IIUC there are 4 synchronized state updates per call in HistoryRpcScheduler itself. two updates to {{callHistoryCounts}} and two updates to {{callHistory}}.
# There should be a way to age state after a period of inactivity.
# I also think we are providing too many configuration knobs with this feature. Hadoop performance tuning is quite complex already and additional settings add administrator and debugging overhead. Perhaps it is better to leave some of these settings non-configurable.

What do you think?

If this feature is optional, we can proceed and tune the implementation later if we file a Jira to address the high level issues later.

# This comment confused me _If there is not enough traffic to flush the callHistory, sporadic users could end up with counts greater than the heartbeats_. Which heartbeats does it refer to?
# {{historyLength}}, {{serviceUsernames}}, {{thresholds}}, {{numQueues}} can be final.
# Why do we support multiple identity providers if we discard all but the first?
# I agree that service user seems like an unnecessary complication for now. Perhaps we can add it later?
# {{#getSchedulingDecisions}} - curious why you make a copy of the {{callHistoryCounts}} reference.
# {{getDefaultThresholds}} - is there any analysis behind choosing this bisection approach?
# {{getAndIncrement}} - I think we do care about the return value of {{putIfAbsent}}, because we want to update the local variable {{value}}.
# {{popLast}} - check for return value of poll being null?
# Nitpick: Spaces around equals sign in {{i=retval.length-1}}. Could you also fix other instances to be inline with our coding convention e.g. {{(aNumQueues-1)}}

> Create a scheduler, which assigns schedulables a priority level
> ---------------------------------------------------------------
>
>                 Key: HADOOP-10281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10281
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Chris Li
>            Assignee: Chris Li
>         Attachments: HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a single method getPriorityLevel(Schedulable call) which returns an integer corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of each call and determines what % of calls in recent history were made by this user.
> It is configured with a historyLength (how many calls to track) and a list of integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the history
> Also, some versions of this patch include the concept of a 'service user', which is a user that is always scheduled high-priority. Currently this seems redundant and will probably be removed in later patches, since its not too useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)