Posted to issues@tez.apache.org by "Prabhu Joseph (Jira)" <ji...@apache.org> on 2021/03/04 12:08:00 UTC

[jira] [Commented] (TEZ-3820) Plugin to write history events to ATSv2

    [ https://issues.apache.org/jira/browse/TEZ-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295237#comment-17295237 ] 

Prabhu Joseph commented on TEZ-3820:
------------------------------------

Thanks [~rohithsharma] for the patch. I have rebased it and tested it on tez-0.10 + hadoop-3.2.1. The job hangs with task containers stuck in the ACQUIRED state because the CallbackHandler is null in the Tez DAGAppMaster (AM). ATSV2HistoryLoggingService creates the TezAMRMClientAsync instance before YarnTaskSchedulerService does, and it creates it with the CallbackHandler set to null.

The following assumption in the patch does not hold:

+      // assumption is at this point AMRMClient is created! So callback handler is set to null
+      TezAMRMClientAsyncProvider.createAMRMClientAsync(1000, null)

Below are some ways to handle this:

1. YarnTaskSchedulerService creates the instance, and ATSV2HistoryLoggingService obtains it once it is available.

2. Move the CallbackHandler code out of YarnTaskSchedulerService into a separate class, so that either YarnTaskSchedulerService or ATSV2HistoryLoggingService can create a properly initialized instance, instead of ATSV2HistoryLoggingService creating one with a null CallbackHandler.
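
A minimal sketch of the first option, assuming a small registry that the scheduler publishes to and the history logger waits on. SharedClient and SharedClientRegistry are hypothetical names used only for illustration; they are not part of Tez or of the patch, and SharedClient merely stands in for TezAMRMClientAsync.

```java
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the AM-RM client; it refuses a null handler. */
final class SharedClient {
    private final Object callbackHandler;

    SharedClient(Object callbackHandler) {
        this.callbackHandler =
            Objects.requireNonNull(callbackHandler, "callbackHandler must not be null");
    }

    Object callbackHandler() {
        return callbackHandler;
    }
}

/** Hypothetical registry: one owner publishes the client, other services wait for it. */
final class SharedClientRegistry {
    private final CompletableFuture<SharedClient> future = new CompletableFuture<>();

    /** Called by the owner (the scheduler). Returns false if a client was already published. */
    boolean publish(SharedClient client) {
        return future.complete(Objects.requireNonNull(client, "client"));
    }

    /** True once the owner has published the client. */
    boolean isAvailable() {
        return future.isDone();
    }

    /** Called by consumers (the history logger). Blocks until published or the timeout expires. */
    SharedClient await(long timeout, TimeUnit unit) {
        try {
            return future.get(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for client", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("client not available", e);
        }
    }
}
```

With this shape the history logging service never constructs the client itself, so it cannot install a null handler ahead of the scheduler; it only blocks (or fails after the timeout) until the scheduler has published a fully initialized instance.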



> Plugin to write history events to ATSv2
> ---------------------------------------
>
>                 Key: TEZ-3820
>                 URL: https://issues.apache.org/jira/browse/TEZ-3820
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Major
>         Attachments: TEZ-3884.001.patch
>
>
> YARN Timeline Service v.2 is the next major iteration of Timeline Server, following v.1 and v.1.5. ATSv2 was created to address two major challenges of v.1, i.e. scalability and usability. Refer to [ATSv2-doc|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html#Timeline_Service_v.2_REST_API]
> It would be nice to use ATSv2 for Tez, since it solves these scalability problems.


