Posted to yarn-issues@hadoop.apache.org by "Li Lu (JIRA)" <ji...@apache.org> on 2015/06/02 00:18:19 UTC
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

    [ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568128#comment-14568128 ] 

Li Lu commented on YARN-3051:
-----------------------------

Hi [~varun_saxena], thanks for the work! Not sure if you've already made progress since the latest patch, but I'm posting some of my comments and questions w.r.t. the reader API design in the 003 patch. I may have more comments in the near future, but I wouldn't mind seeing a new patch before posting them.

# I noticed there is a _readerLimit_ for read operations, which works for ATS v1. I'm wondering if it's fine to use -1 to indicate there's no such limit? Not sure if this feature is already there. 
# For the {{fromId}} parameter, we may need to be careful about the concept of "id". In timeline v2 we need context information to identify each entity, such as cluster, user, flow, and run. When querying with {{fromId}}, what assumptions should we make about the "id" here? Are we assuming all entities belong to the same cluster, user, and/or flow, or is the "id" a concatenation of all this information, or is it something else?
# For all filter-related parameters, I'm not sure the current object model and storage implementation support a trivial solution. I'd certainly welcome any comments/suggestions on this problem.
# Based on the previous two issues, a more general question is: shall we focus on an evolution of the v1 API here, or shall we start a v2 reader API design from scratch and then try to make it compatible with the v1 APIs? The current patch looks to be pursuing the evolution approach.
# In some APIs, we're requiring clusterID and appID, but not having flow/run information. In the current writer implementations, this indicates a full table scan. Maybe we can have flow and run information as optional parameters so that we can avoid full table scans when the caller does have flow and run information?
# The current APIs require a pretty long list of parameters. For most use cases, I think we can abstract something much simpler. Do we plan to add those "simple APIs" in a higher layer? Passing a lot of nulls when calling the reader API looks suboptimal, but with only these few APIs, callers may need to do that frequently.
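
To make the points about the -1 limit, the optional flow/run information, and the long parameter lists more concrete, here is a minimal sketch of what a simpler query object could look like. All names in it ({{ReaderQuery}}, {{NO_LIMIT}}, {{forApp}}, {{accessPath}}) are hypothetical and not taken from the patch, and the "keyed lookup vs. full table scan" distinction is only an illustration of the concern above, not a description of what any storage implementation actually does.

```java
// Hypothetical sketch; none of these names come from the YARN-3051 patch.
final class ReaderQuery {
    /** Sentinel meaning "no limit on the number of entities returned". */
    static final long NO_LIMIT = -1L;

    private final String clusterId;  // required
    private final String appId;      // required
    private final String flowId;     // optional, null when the caller does not know it
    private final Long flowRunId;    // optional, null when the caller does not know it
    private final long limit;

    private ReaderQuery(Builder b) {
        this.clusterId = b.clusterId;
        this.appId = b.appId;
        this.flowId = b.flowId;
        this.flowRunId = b.flowRunId;
        this.limit = b.limit;
    }

    /** Entry point: only the two mandatory pieces of context are positional. */
    static Builder forApp(String clusterId, String appId) {
        return new Builder(clusterId, appId);
    }

    String getClusterId() { return clusterId; }
    String getAppId() { return appId; }
    long getLimit() { return limit; }

    boolean isUnlimited() { return limit == NO_LIMIT; }

    boolean hasFlowContext() { return flowId != null && flowRunId != null; }

    /** Illustrative only: which access path a storage backend might choose. */
    String accessPath() {
        return hasFlowContext() ? "KEYED_LOOKUP" : "FULL_TABLE_SCAN";
    }

    static final class Builder {
        private final String clusterId;
        private final String appId;
        private String flowId;
        private Long flowRunId;
        private long limit = NO_LIMIT;  // assumption: unlimited unless the caller says otherwise

        private Builder(String clusterId, String appId) {
            this.clusterId = java.util.Objects.requireNonNull(clusterId, "clusterId");
            this.appId = java.util.Objects.requireNonNull(appId, "appId");
        }

        /** Optional flow and run context, supplied together when the caller has it. */
        Builder flow(String flowId, long flowRunId) {
            this.flowId = java.util.Objects.requireNonNull(flowId, "flowId");
            this.flowRunId = flowRunId;
            return this;
        }

        /** Accepts a positive limit, or NO_LIMIT (-1) for "no limit". */
        Builder limit(long limit) {
            if (limit != NO_LIMIT && limit <= 0) {
                throw new IllegalArgumentException("limit must be positive or -1, got " + limit);
            }
            this.limit = limit;
            return this;
        }

        ReaderQuery build() {
            return new ReaderQuery(this);
        }
    }
}
```

With something like this, a caller that only knows the cluster and application would write {{ReaderQuery.forApp("cluster1", "application_1").build()}}, while a caller that also has flow and run information would add {{.flow("flowA", 42L).limit(100)}} instead of passing a row of nulls. Whether such a wrapper belongs in the reader interface itself or in a higher layer is exactly the open question above.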

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)