Posted to yarn-issues@hadoop.apache.org by "Vrushali C (JIRA)" <ji...@apache.org> on 2017/07/21 02:43:00 UTC

[jira] [Comment Edited] (YARN-6850) Ensure that supplemented timestamp is stored only for flow run metrics

    [ https://issues.apache.org/jira/browse/YARN-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095696#comment-16095696 ] 

Vrushali C edited comment on YARN-6850 at 7/21/17 2:42 AM:
-----------------------------------------------------------

Thanks [~varun_saxena] for the patch. Yes, I think we should add a note to the documentation, perhaps as part of this jira, mentioning that when we move from alpha1 to beta, the existing timeseries metrics may not be retrievable.

I have a question that is unrelated to this patch, but I am only noticing it now.
Why do we have this check?

{code}
if (tsBegin != 0 || tsEnd != Long.MAX_VALUE) {
{code}

In case someone wants all versions of this metric, how would they do it without knowing the boundaries? It's not much of a problem for users querying manually, but when scripts issue such queries, they sometimes pass 0 as the min and Long.MAX_VALUE as the max in order to fetch everything. Do we want to disallow this? (Just wondering what the thought process was.)
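
To make the question concrete, here is a minimal Java sketch of what the quoted condition evaluates to for a few inputs. The class and method names (TimeRangeGuard, isRestricted) are hypothetical and only the boolean expression is taken from the check above. What the surrounding code does in each branch is not shown in this comment, so the sketch makes no claim about it.

```java
// Illustrative only: TimeRangeGuard and isRestricted are hypothetical names,
// not part of the YARN code base. Only the boolean expression is from the thread.
final class TimeRangeGuard {
    private TimeRangeGuard() { }

    /** Returns true when either bound differs from its default (0 / Long.MAX_VALUE). */
    static boolean isRestricted(long tsBegin, long tsEnd) {
        return tsBegin != 0 || tsEnd != Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        // A script asking for "everything" by passing the widest possible bounds.
        System.out.println(isRestricted(0L, Long.MAX_VALUE));             // false
        // Only a lower bound supplied.
        System.out.println(isRestricted(1500000000000L, Long.MAX_VALUE)); // true
        // Only an upper bound supplied.
        System.out.println(isRestricted(0L, 1500600000000L));             // true
    }
}
```

The first call is the case raised above: a caller that passes 0 and Long.MAX_VALUE explicitly is indistinguishable, as far as this condition goes, from a caller that supplied no bounds at all.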




> Ensure that supplemented timestamp is stored only for flow run metrics
> ----------------------------------------------------------------------
>
>                 Key: YARN-6850
>                 URL: https://issues.apache.org/jira/browse/YARN-6850
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Varun Saxena
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6850-YARN-5355.01.patch
>
>
> In timeline service v2, ColumnHelper#getPutTimestamp supplements the timestamp and is called by ColumnHelper#store. This call is unconditional and happens for every put.
> We need to ensure that the cell timestamps for metrics in the entity and application (and sub-application) tables are "correct" timestamps, since we will be enabling TTLs for these cells.
> The supplemented timestamp is meant to be used only in the flow run table, by the coprocessor that intercepts all reads and writes to cells in this table. The coprocessor looks at the supplemented timestamp to figure out which app id a particular cell belongs to. This ensures that no collision occurs when two apps belonging to the same flow run write the same metric at the same timestamp.
> This was discovered during the discussion in YARN-4455.
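
As a rough illustration of the collision problem described in the issue, the Java sketch below shows one way a timestamp could be supplemented with an application sequence number so that two apps writing at the same millisecond end up in distinct cells. The class name, the MULTIPLIER value, and the method names are all assumptions made for this example; they are not taken from ColumnHelper or any other timeline service v2 class.

```java
// Illustrative only: the encoding, the multiplier, and all names here are
// assumptions for this sketch, not the timeline service v2 implementation.
final class SupplementedTimestampSketch {
    /** Assumed space reserved in the low-order digits for an app sequence number. */
    static final long MULTIPLIER = 1_000_000L;

    private SupplementedTimestampSketch() { }

    /** Combines a millisecond timestamp with an app sequence number (both non-negative). */
    static long supplement(long timestampMs, long appSequence) {
        return timestampMs * MULTIPLIER + (appSequence % MULTIPLIER);
    }

    /** Recovers the original millisecond timestamp. */
    static long originalTimestamp(long supplemented) {
        return supplemented / MULTIPLIER;
    }

    /** Recovers the app sequence number that was folded in. */
    static long appSequence(long supplemented) {
        return supplemented % MULTIPLIER;
    }

    public static void main(String[] args) {
        long ts = 1500600000000L;          // same write time for both apps
        long cellA = supplement(ts, 17L);  // app with sequence number 17
        long cellB = supplement(ts, 42L);  // app with sequence number 42

        System.out.println(cellA != cellB);                 // distinct cell timestamps
        System.out.println(originalTimestamp(cellA) == ts); // original time is recoverable
        System.out.println(appSequence(cellB));             // owning app is recoverable
    }
}
```

Under this assumed scheme the stored value is no longer a plain wall-clock millisecond value, which is one way to see why the issue wants such timestamps kept out of tables whose cells will have TTLs enabled.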


