Posted to yarn-issues@hadoop.apache.org by "Vrushali C (JIRA)" <ji...@apache.org> on 2017/05/11 01:41:04 UTC
[jira] [Commented] (YARN-3981) offline collector: support timeline
clients not associated with an application
[ https://issues.apache.org/jira/browse/YARN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005775#comment-16005775 ]
Vrushali C commented on YARN-3981:
----------------------------------
Thanks for the design draft, Rohith. I have some preliminary questions, meant more as discussion points.
- Do I understand it correctly that flow collectors will run on each node that runs an NM in the cluster?
- How much traffic do we think might come in? Would it be similar to app table writes? If not, could we run this on a head node of the cluster, like the ones where the RM or NNs run? Not on the same node as the RM, but on a similar node, so that it's "outside" the cluster. We have fairly large clusters, and having each node run a collector may not be optimal.
- Aggregation is not relevant for a flow collector, I think. Or do we want to support it? If not, we don't need to mention it under challenges; it is a non-issue.
- We surely want to think about optimizing connections to HBase.
I may have more questions as I think this over further.
> offline collector: support timeline clients not associated with an application
> ------------------------------------------------------------------------------
>
> Key: YARN-3981
> URL: https://issues.apache.org/jira/browse/YARN-3981
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Rohith Sharma K S
> Labels: YARN-5355
> Attachments: YARN-3981- offline-collector-draft.pdf
>
>
> In the current v.2 design, all timeline writes must belong in a flow/application context (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an application. One such example is a higher level client (e.g. tez client or hive/oozie/cascading client) writing flow-level data that spans multiple applications. We need to find a way to support them.