Posted to yarn-issues@hadoop.apache.org by "Sangjin Lee (JIRA)" <ji...@apache.org> on 2016/10/10 22:41:20 UTC

[jira] [Commented] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery

    [ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563769#comment-15563769 ] 

Sangjin Lee commented on YARN-5638:
-----------------------------------

Thanks [~gtCarrera9] for the patch! I agree with the general approach of adding the RM timestamp and an app version timestamp.

I notice that the patch is based on the trunk. Do we want to commit this to trunk as well as the feature branch? Is that because of the dependency on the new YARN web UI?

One meta-level question: I see that the RM timestamp is currently the RM's cluster timestamp, which is really the start time of the RM process. This is *not* the same as the time of the active state transition. Shouldn't we define the timestamp as when the RM became active, rather than when the process started? For example, it is perfectly possible that the standby RM started before the currently active RM, or at least there is no strong order guarantee between those two RM start times, right? If a failover occurs, the newly active RM (the previous standby RM, which has an earlier start timestamp) will start stamping with that earlier start timestamp, and its stamps will *not* be considered more recent. And if you think of situations where multiple failovers occur without restarting either RM, the start timestamps never change, so you can see why the cluster timestamp would be problematic.

In light of this, I am not sure the cluster timestamp is the right RM timestamp to use. Shouldn't we use something more like the active state transition timestamp?
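
To make the failover concern concrete, here is a minimal sketch. The class name, the method, and all timestamp values are hypothetical and do not come from the patch; it only assumes that a stamp counts as more recent when its timestamp is strictly larger.

```java
/** Hypothetical illustration only; none of these names come from the patch. */
final class RmStampSketch {

  /** Assumption: a stamp is treated as more recent when its timestamp is strictly larger. */
  static boolean isMoreRecent(long candidate, long existing) {
    return candidate > existing;
  }

  public static void main(String[] args) {
    // Assumed values: the standby process (rm2) was started before the active one (rm1).
    long rm1ProcessStart = 2_000L;
    long rm2ProcessStart = 1_000L;
    // Assumed times at which each RM became active; rm2 takes over after a failover.
    long rm1BecameActive = 2_100L;
    long rm2BecameActive = 5_000L;

    // Stamping with the process start time: rm2's post-failover stamp looks older.
    System.out.println(isMoreRecent(rm2ProcessStart, rm1ProcessStart)); // prints false
    // Stamping with the active transition time: rm2's stamp is ordered after rm1's.
    System.out.println(isMoreRecent(rm2BecameActive, rm1BecameActive)); // prints true
  }
}
```

With process start times, data stamped by the newly active RM would lose to stale data from the previous active RM; with active transition times, the order matches the failover order.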

(AppCollectorData.java)
- I am fine with the current {{happensBefore()}} method. Do you think it might be OK to add a {{Comparable}} interface and {{compareTo()}} method in addition? I don't have a strong preference, but having that wouldn't be bad.
- l.83-84: the javadoc and the check being done here seem slightly inconsistent; the javadoc states that we’re relying on the RM timestamp but the method checks either the RM timestamp or the version timestamp; if both are set or neither is set, then there may not be a real difference anyway, but then can we update the javadoc to be consistent with the implementation at least?
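
For the {{Comparable}} suggestion, something along these lines is what I have in mind. This is only a sketch: {{CollectorStamp}} and its fields are hypothetical stand-ins rather than the actual {{AppCollectorData}} from the patch, and ordering by the RM timestamp first and the version second is an assumption based on the <rm_timestamp, logical_version_number> pair in the issue description.

```java
import java.util.Objects;

/** Hypothetical stand-in for the ordering part of the collector data. */
final class CollectorStamp implements Comparable<CollectorStamp> {
  private final long rmTimestamp; // assumed: timestamp of the RM that issued the stamp
  private final long version;     // assumed: per-application logical version number

  CollectorStamp(long rmTimestamp, long version) {
    this.rmTimestamp = rmTimestamp;
    this.version = version;
  }

  /** Orders by RM timestamp first, then by version within the same RM timestamp. */
  @Override
  public int compareTo(CollectorStamp other) {
    int byRm = Long.compare(rmTimestamp, other.rmTimestamp);
    return byRm != 0 ? byRm : Long.compare(version, other.version);
  }

  /** Kept alongside compareTo() so existing callers can stay unchanged. */
  boolean happensBefore(CollectorStamp other) {
    return compareTo(other) < 0;
  }

  // equals() and hashCode() are kept consistent with compareTo().
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof CollectorStamp)) {
      return false;
    }
    CollectorStamp that = (CollectorStamp) o;
    return rmTimestamp == that.rmTimestamp && version == that.version;
  }

  @Override
  public int hashCode() {
    return Objects.hash(rmTimestamp, version);
  }
}
```

Having {{happensBefore()}} delegate to {{compareTo()}} keeps the two from drifting apart, and the class can then be used directly with sorted collections.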

(Context.java)
- l.84: javadoc for {{getKnownCollectors()}} would be helpful

(NodeStatusUpdaterImpl.java)
- l.947: so does this mean that it would clear out the registering collectors on every node heartbeat? Normally the registering collectors shouldn't make up most of the known collectors, right?


> Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
> -----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-5638
>                 URL: https://issues.apache.org/jira/browse/YARN-5638
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-5638-trunk.v1.patch, YARN-5638-trunk.v2.patch, YARN-5638-trunk.v3.patch
>
>
> As discussed in YARN-3359, we need to further identify timeline collectors' creation order to rebuild collector discovery data in the RM. This JIRA proposes to use <rm_timestamp, logical_version_number> to order collectors for each application in the RM. This timestamp can then be used when a standby RM becomes active and rebuilds collector discovery data.


