Posted to yarn-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2015/11/11 22:25:11 UTC

[jira] [Commented] (YARN-4265) Provide new timeline plugin storage to support fine-grained entity caching

    [ https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001126#comment-15001126 ] 

Jason Lowe commented on YARN-4265:
----------------------------------

Thanks for the patch, [~gtCarrera9]!  It looks like most of the patch is a copy of the entity timeline store from YARN-3942 with a few edits, so I'm sorta reviewing my own code here.  As such I diffed the patch from this JIRA against the one from YARN-3942 so I could focus on what's changed.  I'll defer to others to review the parts that are identical to YARN-3942.  Eventually I can see this being a superset of YARN-3942, since it can cache to memory and either cache everything or a subset based on what the plugins decide.

TIMELINE_SERVICE_PLUGIN_ENABLED and DEFAULT_TIMELINE_SERVICE_PLUGIN_ENABLED are not needed.

Is there a reason we're not using the Configuration.getInstances method or the ReflectionUtils methods to handle plugin loading?
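
To illustrate the kind of loading I mean, here is a minimal stdlib-only sketch of what instantiating plugins from a comma-separated class list looks like.  It is not the patch's code and not Hadoop's implementation; CachePlugin, FooPlugin, BarPlugin and loadInstances are hypothetical names made up for the example.  In the real code I'd expect Configuration.getInstances(name, xface) to do this work for us.

```java
import java.util.ArrayList;
import java.util.List;

public class PluginLoadingSketch {

  /** Hypothetical plugin interface; stands in for whatever base type the patch defines. */
  public interface CachePlugin {
    String name();
  }

  /** Toy plugins so the sketch runs on its own. */
  public static class FooPlugin implements CachePlugin {
    @Override public String name() { return "foo"; }
  }

  public static class BarPlugin implements CachePlugin {
    @Override public String name() { return "bar"; }
  }

  /**
   * Instantiates every class named in a comma-separated list, verifying that
   * each one implements the expected interface. Returns an empty list when
   * nothing is configured.
   */
  public static <T> List<T> loadInstances(String classNames, Class<T> xface) {
    List<T> instances = new ArrayList<>();
    if (classNames == null || classNames.trim().isEmpty()) {
      return instances;
    }
    for (String raw : classNames.split(",")) {
      String className = raw.trim();
      if (className.isEmpty()) {
        continue; // tolerate stray commas
      }
      try {
        Class<?> cls = Class.forName(className);
        if (!xface.isAssignableFrom(cls)) {
          throw new IllegalArgumentException(
              className + " does not implement " + xface.getName());
        }
        instances.add(xface.cast(cls.getDeclaredConstructor().newInstance()));
      } catch (ReflectiveOperationException e) {
        throw new IllegalArgumentException("Cannot load plugin " + className, e);
      }
    }
    return instances;
  }

  public static void main(String[] args) {
    // The configured value would normally come from the configuration file.
    String configured = FooPlugin.class.getName() + ", " + BarPlugin.class.getName();
    for (CachePlugin plugin : loadInstances(configured, CachePlugin.class)) {
      System.out.println(plugin.name());
    }
  }
}
```

The point is that the parse, type-check and instantiate steps are generic, so reusing the existing helper avoids maintaining a hand-rolled copy in the store.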

If no plugins are configured (which is the default behavior), do we want a fallback plugin that emulates what YARN-3942 is doing?

Are there plans to support the leveldb store as an alternative to the memory store for the detail timeline store?  There was concern that a single dag could overwhelm the server, and storing it to leveldb instead of the memory store would be one way to try to mitigate that.  I'm wondering if the class to use for the detail log timeline store should be configurable with MemoryTimelineStore as the default.  Could also do this as a followup JIRA if necessary.
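
As a rough sketch of the configurable-class idea, assuming a made-up property key and toy store classes (DetailStore, MemoryStore, LevelDbStore and STORE_CLASS_KEY are all hypothetical, and java.util.Properties stands in for the Hadoop Configuration), the lookup could fall back to the memory store when nothing is set.  In Hadoop itself, Configuration.getClass(name, defaultValue, xface) would likely cover the lookup part.

```java
import java.util.Properties;

public class DetailStoreConfigSketch {

  /** Hypothetical stand-in for the timeline store interface. */
  public interface DetailStore {
    String describe();
  }

  /** Toy in-memory store; plays the role of the default. */
  public static class MemoryStore implements DetailStore {
    @Override public String describe() { return "memory"; }
  }

  /** Toy alternative; plays the role of a leveldb-backed store. */
  public static class LevelDbStore implements DetailStore {
    @Override public String describe() { return "leveldb"; }
  }

  /** Hypothetical property key, not one defined by the patch. */
  public static final String STORE_CLASS_KEY = "example.detail-store.class";

  /** Creates the configured store, defaulting to MemoryStore when the key is unset. */
  public static DetailStore createStore(Properties conf) {
    String className = conf.getProperty(STORE_CLASS_KEY, MemoryStore.class.getName());
    try {
      Class<? extends DetailStore> cls =
          Class.forName(className).asSubclass(DetailStore.class);
      return cls.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Invalid detail store class: " + className, e);
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(createStore(conf).describe()); // default store

    conf.setProperty(STORE_CLASS_KEY, LevelDbStore.class.getName());
    System.out.println(createStore(conf).describe()); // overridden store
  }
}
```

With something like this, deployments worried about a single large DAG could switch stores through configuration alone, and the default behavior would stay unchanged.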

Should we add entries to yarn-default.xml for the new properties?

Do we want to log at the info level that a path is being skipped during the scan?  The store can end up scanning fairly often in practice, so this could end up logging a lot for just one path per scan.  I'm wondering if making it a debug log is more appropriate.

Comment in getDoneAppPath mentions a cache ID but it's using an app ID.

Nit: Indentation is off at the start of the YarnConfiguration patch hunk.

> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>
>                 Key: YARN-4265
>                 URL: https://issues.apache.org/jira/browse/YARN-4265
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-4265-trunk.poc_001.patch
>
>
> To support the newly proposed APIs in YARN-4234, we need to create a new plugin timeline store. The store may have behavior similar to the EntityFileTimelineStore proposed in YARN-3942, but cache data at cache ID granularity instead of application ID granularity. Let's make this a standalone store, rather than updating EntityFileTimelineStore, to keep the existing store (EntityFileTimelineStore) stable.


