Posted to yarn-issues@hadoop.apache.org by "Sangjin Lee (JIRA)" <ji...@apache.org> on 2015/07/27 18:47:05 UTC

[jira] [Commented] (YARN-3981) support timeline clients not associated with an application

    [ https://issues.apache.org/jira/browse/YARN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643006#comment-14643006 ] 

Sangjin Lee commented on YARN-3981:
-----------------------------------

Some of us had an offline discussion on this. There are some major challenges in supporting this in the v.2 design. First, these clients may lack an application-specific context, as they can span multiple YARN apps. Second, even if we solved the problem of the context, these clients are likely off-cluster, and they need a way to write to the cluster. Ideas such as a separate dedicated timeline writer just for these clients have been discussed, but scaling such a writer is problematic at best.

One idea that was suggested involves creating a specialized YARN application that can act as a proxy for these off-cluster clients. For example, suppose you started a Tez client that can start multiple YARN apps. It can also start a special dedicated "(flow-level) timeline client". Under the covers, this client would launch a special YARN app whose app master and associated timeline writer serve as the proxy for any timeline data the client writes. When this special timeline client shuts down, it would also tear down the associated YARN app.

If we go this route, we would write the YARN app itself so that the app master listens for requests coming from the client and proxies them to the timeline writer. We would also write the timeline client piece so that it manages the lifecycle of the YARN app as well as sending the write requests to the app master.
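
To make the division of responsibilities concrete, here is a rough sketch of how the two pieces might fit together. Every class and method name in it (FlowTimelineClient, ProxyAppMaster, TimelineWriter, putFlowEntity, and so on) is hypothetical and not an existing YARN or timeline service API. The YARN app launch and the client-to-app-master transport are simulated in process, so this only illustrates the shape of the proposal, not an implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class FlowTimelineProxySketch {

    // Hypothetical stand-in for the timeline writer associated with the app master.
    interface TimelineWriter {
        void write(String entity);
    }

    // Test double that records writes in memory instead of hitting real storage.
    static class InMemoryWriter implements TimelineWriter {
        final List<String> written = new ArrayList<>();

        @Override
        public void write(String entity) {
            written.add(entity);
        }
    }

    // Hypothetical app master of the special YARN app: it accepts client
    // requests and proxies them to the timeline writer.
    static class ProxyAppMaster {
        private final TimelineWriter writer;
        private boolean running = true;

        ProxyAppMaster(TimelineWriter writer) {
            this.writer = writer;
        }

        void handleWrite(String flowEntity) {
            if (!running) {
                throw new IllegalStateException("proxy app has been torn down");
            }
            writer.write(flowEntity);
        }

        void shutdown() {
            running = false;
        }

        boolean isRunning() {
            return running;
        }
    }

    // Hypothetical flow-level timeline client: it manages the proxy app's
    // lifecycle and forwards write requests to its app master.
    static class FlowTimelineClient implements AutoCloseable {
        private final ProxyAppMaster proxy;

        FlowTimelineClient(TimelineWriter writer) {
            // Assumption: a real client would submit a YARN app here and
            // discover the app master's address; this sketch builds it directly.
            this.proxy = new ProxyAppMaster(writer);
        }

        void putFlowEntity(String flowEntity) {
            // Assumption: a real client would send this over the network.
            proxy.handleWrite(flowEntity);
        }

        boolean isProxyRunning() {
            return proxy.isRunning();
        }

        @Override
        public void close() {
            // Shutting down the client tears down the associated app.
            proxy.shutdown();
        }
    }

    public static void main(String[] args) {
        InMemoryWriter writer = new InMemoryWriter();
        try (FlowTimelineClient client = new FlowTimelineClient(writer)) {
            client.putFlowEntity("flow-level event spanning multiple apps");
        }
        System.out.println("entities written: " + writer.written.size());
    }
}
```

The client implements AutoCloseable so that the teardown of the proxy app is tied to the client's own shutdown, which is the lifecycle coupling described above.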

> support timeline clients not associated with an application
> -----------------------------------------------------------
>
>                 Key: YARN-3981
>                 URL: https://issues.apache.org/jira/browse/YARN-3981
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>
> In the current v.2 design, all timeline writes must belong in a flow/application context (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an application. One such example is a higher level client (e.g. tez client or hive/oozie/cascading client) writing flow-level data that spans multiple applications. We need to find a way to support them.


