Posted to yarn-issues@hadoop.apache.org by "Joep Rottinghuis (JIRA)" <ji...@apache.org> on 2016/07/14 07:35:20 UTC

[jira] [Commented] (YARN-5378) Accomodate app-id->cluster mapping

    [ https://issues.apache.org/jira/browse/YARN-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376499#comment-15376499 ] 

Joep Rottinghuis commented on YARN-5378:
----------------------------------------

Our current AppToFlow table has the cluster in the rowkey prefix:
{code}
 * |--------------------------------------|
 * |  Row       | Column Family           |
 * |  key       | info                    |
 * |--------------------------------------|
 * | clusterId! | flowName:               |
 * | AppId      | foo@daily_hive_report   |
 * |            |                         |
 * |            | flowRunId:              |
 * |            | 1452828720457           |
 * |            |                         |
 * |            | user_id:                |
 * |            | admin                   |
 * |            |                         |
 * |            |                         |
 * |            |                         |
 * |--------------------------------------|
{code}

This works fine when we know the cluster, but it does not fit this new use case, where the cluster is exactly what we need to look up.
The application-to-cluster mapping is usually unique, but it doesn't _have_ to be. If two clusters start with the same epoch (unlikely but possible), or worse, in the YARN federated case they are accidentally configured with the same starting epoch, we'd overwrite data.
Therefore, in conversation with [~vrushalic], we considered a different approach: what if we eliminate the cluster from the row key prefix and store flowName, flowRunId, and user as column name prefixes?
That way we can still accommodate multiple clusters if needed and still efficiently query for an (app-id, cluster) combination, yet easily retrieve the list of clusters for an application (mostly just one).
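
To make the proposal concrete, here is a minimal in-memory sketch of how such a layout might behave. It is not the HBase client API and not the actual Timeline Service schema. The class name, the method names, the cluster ids, and the choice to append the cluster id after the column prefix (separated by "!", borrowed from the existing row key) are all assumptions made for illustration.

```python
# Illustrative in-memory model only; this is NOT the HBase client API.
# Assumption: each column qualifier is "<prefix>!<clusterId>", where the
# prefix is one of flowName, flowRunId, user_id from the table above.
SEP = "!"
PREFIXES = ("flowName", "flowRunId", "user_id")


class AppToFlowSketch:
    """Hypothetical stand-in for an AppToFlow table keyed by app id alone."""

    def __init__(self):
        # row key (app id) -> {column qualifier -> value}
        self.rows = {}

    def put(self, app_id, cluster_id, flow_name, flow_run_id, user_id):
        # One row per app id; each cluster gets its own set of columns,
        # so a second cluster with the same app id does not overwrite data.
        row = self.rows.setdefault(app_id, {})
        values = (flow_name, flow_run_id, user_id)
        for prefix, value in zip(PREFIXES, values):
            row[prefix + SEP + cluster_id] = value

    def get(self, app_id, cluster_id):
        # Point lookup for an (app id, cluster) combination.
        row = self.rows.get(app_id, {})
        if PREFIXES[0] + SEP + cluster_id not in row:
            return None
        return {p: row[p + SEP + cluster_id] for p in PREFIXES}

    def clusters(self, app_id):
        # List every cluster recorded for an app id (usually just one).
        row = self.rows.get(app_id, {})
        head = PREFIXES[0] + SEP
        return sorted(q[len(head):] for q in row if q.startswith(head))


table = AppToFlowSketch()
app_id = "application_1452828720457_0001"
# Two made-up clusters that happen to produce the same application id.
table.put(app_id, "cluster-a", "foo@daily_hive_report", 1452828720457, "admin")
table.put(app_id, "cluster-b", "bar@hourly_report", 1452828729999, "etl")
```

With the app id alone as the row key, both entries live in one row, `table.clusters(app_id)` recovers the cluster list without knowing a cluster up front, and `table.get(app_id, "cluster-a")` still answers the targeted query.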

> Accomodate app-id->cluster mapping
> ----------------------------------
>
>                 Key: YARN-5378
>                 URL: https://issues.apache.org/jira/browse/YARN-5378
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Joep Rottinghuis
>            Assignee: Joep Rottinghuis
>
> In discussion with [~sjlee0], [~vrushalic], [~subru], and [~curino], a use case came up for mapping from application-id to cluster-id in the context of YARN federation.
> What happens is that a "random" cluster in the federation is asked to generate an app-id and then potentially a different cluster can be the "home" cluster for the AM. Furthermore, tasks can then run in yet other clusters.
> In order to be able to pull up the logical home cluster on which the application ran, there needs to be a mapping from application-id to cluster-id. In the federated YARN case, this mapping is available only during the active life of the application.
> A similar situation is common in our larger production environment. Somebody will complain about a slow job, some failure or whatever. If we're lucky we have an application-id. When we ask the user which cluster they ran on, they'll typically answer with the machine from which they launched the job (many users are unaware of the underlying physical clusters). This leaves us to spelunk through various RM UIs to find a matching epoch in the application ID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
