Posted to issues@hive.apache.org by "Pritha Dawn (Jira)" <ji...@apache.org> on 2021/03/12 01:04:00 UTC
[jira] [Updated] (HIVE-24201) WorkloadManager kills query being moved to different pool if destination pool does not have enough sessions

     [ https://issues.apache.org/jira/browse/HIVE-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritha Dawn updated HIVE-24201:
-------------------------------
    Description: 
To reproduce, create a resource plan with move trigger, like below:
{code:java}
+----------------------------------------------------+
|                        line                        |
+----------------------------------------------------+
| experiment[status=DISABLED,parallelism=null,defaultPool=default] |
|  +  default[allocFraction=0.888,schedulingPolicy=null,parallelism=1] |
|      |  mapped for default                         |
|  +  pool2[allocFraction=0.1,schedulingPolicy=fair,parallelism=1] |
|      |  trigger t1: if (ELAPSED_TIME > 20) { MOVE TO pool1 } |
|      |  mapped for users: abcd                   |
|  +  pool1[allocFraction=0.012,schedulingPolicy=null,parallelism=1] |
|      |  mapped for users: efgh                   |
 
{code}
Now, run two queries, one in pool1 and one in pool2, as different users. After 20 units of elapsed time, trigger t1 will try to move the query running in pool2 to pool1, and the query will be killed because pool1 (parallelism=1) has no free session to handle it.

Currently, the workload management move trigger kills the query being moved to a different pool if the destination pool does not have enough capacity. We could have a "delayed move" configuration that lets the query keep running in the source pool as long as possible when the destination pool is full. The move to the destination pool is attempted only when there is a claim on the source pool. If the destination pool is not full, a delayed move behaves like a normal move, i.e. the move happens immediately.
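
The proposed behavior can be summarized as a small decision function. The following is only an illustrative sketch, not the actual WorkloadManager code: the class, enum, and parameter names are hypothetical, and the outcome when the source pool is claimed while the destination is still full is an assumption (here the query is killed, as in the current behavior), since the description above does not specify it.

```java
// Hypothetical sketch; none of these names come from Hive's WorkloadManager.
final class DelayedMoveSketch {

    enum Action { MOVE, KILL, STAY_IN_SOURCE }

    /**
     * Decides what a move trigger does with a query.
     *
     * @param delayedMove     whether the proposed "delayed move" option is enabled
     * @param destRunning     queries currently running in the destination pool
     * @param destParallelism parallelism (session count) of the destination pool
     * @param sourceClaimed   whether another query has a claim on the source pool
     */
    static Action decide(boolean delayedMove, int destRunning,
                         int destParallelism, boolean sourceClaimed) {
        boolean destFull = destRunning >= destParallelism;
        if (!destFull) {
            return Action.MOVE;            // normal move: happens immediately
        }
        if (!delayedMove) {
            return Action.KILL;            // current behavior when destination is full
        }
        if (!sourceClaimed) {
            return Action.STAY_IN_SOURCE;  // keep running in the source pool
        }
        // ASSUMPTION: the move is attempted now; with the destination still
        // full, this sketch falls back to killing the query.
        return Action.KILL;
    }

    public static void main(String[] args) {
        // pool1 has parallelism=1 and is busy, as in the reproduction above.
        System.out.println(decide(false, 1, 1, false)); // KILL
        System.out.println(decide(true, 1, 1, false));  // STAY_IN_SOURCE
        System.out.println(decide(true, 0, 1, false));  // MOVE
    }
}
```

With the resource plan above, the first call corresponds to the reported problem: pool1 is full, so the query moved from pool2 is killed. With delayed move enabled, the same query would keep running in pool2 until something claims pool2.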

  was:
To reproduce, create a resource plan with move trigger, like below:
{code:java}
+----------------------------------------------------+
|                        line                        |
+----------------------------------------------------+
| experiment[status=DISABLED,parallelism=null,defaultPool=default] |
|  +  default[allocFraction=0.888,schedulingPolicy=null,parallelism=1] |
|      |  mapped for default                         |
|  +  pool2[allocFraction=0.1,schedulingPolicy=fair,parallelism=1] |
|      |  trigger t1: if (ELAPSED_TIME > 20) { MOVE TO pool1 } |
|      |  mapped for users: abcd                   |
|  +  pool1[allocFraction=0.012,schedulingPolicy=null,parallelism=1] |
|      |  mapped for users: efgh                   |
 
{code}
Now, run two queries, one in pool1 and one in pool2, as different users. The query running in pool2 will try to move to pool1 and will be killed because pool1 has no free session to handle it.

Once killed, this query needs to be re-run externally. This could be optimized: the query should instead be retried directly in the destination pool (it will get queued and run once a session is available).


> WorkloadManager kills query being moved to different pool if destination pool does not have enough sessions
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24201
>                 URL: https://issues.apache.org/jira/browse/HIVE-24201
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, llap
>    Affects Versions: 4.0.0
>            Reporter: Adesh Kumar Rao
>            Assignee: Pritha Dawn
>            Priority: Minor


