You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/04/23 16:56:00 UTC

[jira] [Work logged] (HIVE-25049) LlapDaemon preemption should not be triggered for same Vertex tasks

     [ https://issues.apache.org/jira/browse/HIVE-25049?focusedWorklogId=587993&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-587993 ]

ASF GitHub Bot logged work on HIVE-25049:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Apr/21 16:55
            Start Date: 23/Apr/21 16:55
    Worklog Time Spent: 10m 
      Work Description: pgaref opened a new pull request #2215:
URL: https://github.com/apache/hive/pull/2215


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://cwiki.apache.org/confluence/display/Hive/HowToContribute
     2. Ensure that you have created an issue on the Hive project JIRA: https://issues.apache.org/jira/projects/HIVE/summary
     3. Ensure you have added or run the appropriate tests for your PR: 
     4. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]HIVE-XXXXX:  Your PR title ...'.
     5. Be sure to keep the PR description updated to reflect all changes.
     6. Please write your PR title to summarize what this PR proposes.
     7. If possible, provide a concise example to reproduce the issue for a faster review.
   
   -->
   
   ### What changes were proposed in this pull request?
   Victim Task (A) should be preempted in favor of a candidate Task (B) when:
       1. A is NOT on the same DAG and Vertex as B  AND
        1.1. B is a Guaranteed Task while A is not OR
         1.2. Both are guaranteed but A is not finishable and B is
   To make sure that Victim task is not behind some upstream updates (asynchronous),  we check its sources' state (by QueryFragmentInfo.canFinish method)
   
   
   ### Why are the changes needed?
   Non-useful Task preemptions can increase query latency
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Updated TestTaskExecutorService.testFinishablePreemptsNonFinishable()
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 587993)
    Remaining Estimate: 0h
            Time Spent: 10m

> LlapDaemon preemption should not be triggered for same Vertex tasks
> -------------------------------------------------------------------
>
>                 Key: HIVE-25049
>                 URL: https://issues.apache.org/jira/browse/HIVE-25049
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>         Attachments: lipwig-output3799058789917393195.svg
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Due to the asynchronous nature of QueryInfo$FinishableState notification, we usually end up receiving finishable state updates across tasks/queryfragments with some time difference.
> Imagine vertex Map8 with dependency on Map7.
> If Map8 vertex is already running with some tasks still pending, we can end-up in a situation where on Map7 completion, some of the pending Map8 tasks are getting the finishable state update BEFORE the already running Map8 tasks, ending up preempting tasks for no reason!
> {code:java}
> 2021-04-22T15:30:45.124Z source:Map 7 updated, notifying: [Map 8] 
> 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000014_0, state=true
> 2021-04-22T15:30:45.125Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8
> 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to execute TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in
>  2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Task TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in queue)=true, isGuaranteed=false, firstAttemptStartTime=1619105437749, dagStartTime=1619105422608, withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism= 256, selfAndUpstreamComplete= 17} managed to preempt task TaskWrapper{task=attempt_1619105382691_0001_1_05_000006_0, Vertex=Map 8, inWaitQueue=false, inPreemptionQueue=true, registeredForNotifications=true, canFinish=true, canFinish(in queue)=false, isGuaranteed=false, firstAttemptStartTime=1619105437737, dagStartTime=1619105422608, withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism= 256, selfAndUpstreamComplete= 15}
>  2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Invoking kill task for attempt_1619105382691_0001_1_05_000006_0 due to pre-emption to run attempt_1619105382691_0001_1_05_000014_0
>  2021-04-22T15:30:45.126Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8
>  2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000011_0, state=true
>  2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Kill task requested for id=attempt_1619105382691_0001_1_05_000006_0, taskRunnerSetup=true
>  2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Issuing kill to task attempt_1619105382691_0001_1_05_000006_0
>  2021-04-22T15:30:45.127Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8
>  2021-04-22T15:30:45.127Z query-executor-0 class="task.TezTaskRunner2" level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to abort attempt_1619105382691_0001_1_05_000006_0 due to an invocation of killTask
>  2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort
>  2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to RecordProcessor
>  2021-04-22T15:30:45.128Z query-executor-0 class="runtime.LogicalIOProcessorRuntimeTask" dagId="dag_1619105382691_0001_1" fragmentId="1619105382691_0001_1_05_000011_0" level="INFO" queryId="hive_20210422153013_397b96bf-d5a6-493a-9c51-9446f64eeed4" thread="TezTR-382691_1_1_5_11_0"] Waiting for 1 initializers to finish
>  2021-04-22T15:30:45.128Z query-executor-0 class="tez.MapRecordProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to mapOp: {} MAP
>  2021-04-22T15:30:45.128Z query-executor-0 class="vector.VectorMapOperator" level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort in operator: MAP
>  2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Kill request for task attempt_1619105382691_0001_1_05_000006_0 completed. Informing AM
>  2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000006_0, state=true
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)