Posted to common-issues@hadoop.apache.org by "Janus Chow (Jira)" <ji...@apache.org> on 2020/12/09 07:59:00 UTC

[jira] [Updated] (HADOOP-17421) Specify user's queue via configuration in FairCallQueue

     [ https://issues.apache.org/jira/browse/HADOOP-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Janus Chow updated HADOOP-17421:
--------------------------------
    Attachment: HADOOP-17421.001.patch

> Specify user's queue via configuration in FairCallQueue 
> --------------------------------------------------------
>
>                 Key: HADOOP-17421
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17421
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Priority: Major
>         Attachments: HADOOP-17421.001.patch
>
>
> FairCallQueue helps a lot in keeping service fair in a multi-tenant cluster: each user is assigned to a queue with a different priority. In production, however, we ran into cases that the automatic assignment does not handle well:
>  # We have a service account that sends more NN requests than other users. For various reasons, we want to keep this user and let it continue at this volume of operations. After we deployed FairCallQueue, this service user was treated as a bad user and assigned to a lower-priority queue, which slowed down the service account.
>  # We have a growing number of Flink jobs writing checkpoints to our NN. Checkpoint operations put a periodically high cost on the NN, at intervals of several minutes. FairCallQueue (with the cost-based option enabled) does not control this kind of operation well: when the operations start, the user's cost in the decay window is still quite low, so the user is assigned to queue 0. A few windows later, by the time the high cost has been noticed and the user has been assigned to a lower-priority queue, the user's operations have already finished.
> For problem 1, we noticed that HADOOP-17165 already provides an option, but in our case the service account is not important enough for us to let it always be assigned to queue 0.
> To solve these problems, we propose specifying the queue for certain static users via configuration. The basic design is as follows:
>  * Specify the static users in config for each queue.
>  * Load the mapping from the config while initializing the callqueue.
>  * Check the configured queue for each user when assigning the queue.
>  * The cost of the static users' calls is not counted in the decay calculation, to limit the impact on the costs of other, normal users.
>  
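
The four design points above can be sketched roughly as follows. This is a minimal illustration only, not the code in HADOOP-17421.001.patch: the class name, the method names, the shape of the per-queue user configuration, and the fixed cost threshold that stands in for the real decay-based decision are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch only. Every name here is hypothetical and does not
 * come from the Hadoop code base or from the attached patch.
 */
class StaticUserQueueSketch {
    private final int numQueues;
    private final long costThreshold;
    // user name -> statically configured queue index
    private final Map<String, Integer> staticUserQueue = new HashMap<>();
    // user name -> accumulated cost, tracked for non-static users only
    private final Map<String, Long> dynamicCost = new HashMap<>();

    /**
     * @param numQueues     number of priority queues (0 is the highest priority)
     * @param usersPerQueue assumed config shape: queue index -> comma-separated users
     * @param costThreshold assumed stand-in for the scheduler's real thresholds
     */
    StaticUserQueueSketch(int numQueues, Map<Integer, String> usersPerQueue,
                          long costThreshold) {
        this.numQueues = numQueues;
        this.costThreshold = costThreshold;
        // Load the user-to-queue mapping once, when the call queue is initialized.
        for (Map.Entry<Integer, String> entry : usersPerQueue.entrySet()) {
            int queue = entry.getKey();
            if (queue < 0 || queue >= numQueues) {
                throw new IllegalArgumentException("Queue index out of range: " + queue);
            }
            for (String user : entry.getValue().split(",")) {
                String name = user.trim();
                if (!name.isEmpty()) {
                    staticUserQueue.put(name, queue);
                }
            }
        }
    }

    /** Returns the queue for a call; cost is recorded only for non-static users. */
    int assignQueue(String user, long cost) {
        Integer configured = staticUserQueue.get(user);
        if (configured != null) {
            // Static user: fixed queue, and its cost is kept out of the accounting.
            return configured;
        }
        long total = dynamicCost.merge(user, cost, Long::sum);
        return queueFromCost(total);
    }

    /** Crude placeholder for the cost-based decision: lowest queue once over the threshold. */
    private int queueFromCost(long userCost) {
        return userCost > costThreshold ? numQueues - 1 : 0;
    }

    /** Cost recorded so far for a user (always 0 for static users). */
    long recordedCost(String user) {
        return dynamicCost.getOrDefault(user, 0L);
    }
}
```

In this sketch a user listed under queue 1 always lands in queue 1 regardless of request volume, and none of its cost is recorded, so it cannot shift the accounting used for other users.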



--
This message was sent by Atlassian Jira
(v8.3.4#803005)