Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2015/05/26 20:33:17 UTC

[jira] [Commented] (TEZ-2485) Reduce the Resource Load on the Timeline Server

    [ https://issues.apache.org/jira/browse/TEZ-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559573#comment-14559573 ] 

Jonathan Eagles commented on TEZ-2485:
--------------------------------------

I'll try to post a breakdown of what is taking up the most space soon so we can start brainstorming.

> Reduce the Resource Load on the Timeline Server
> -----------------------------------------------
>
>                 Key: TEZ-2485
>                 URL: https://issues.apache.org/jira/browse/TEZ-2485
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jonathan Eagles
>
> The disk, network, and memory resources needed by the timeline server are many times higher than those needed for the equivalent MapReduce job.
> Based on the storage improvements in YARN-3448, the timeline server may support up to 30,000 jobs / 10,000,000 tasks a day.
> While I understand there is community effort on timeline server v2, it
> will be good if Tez can reduce its pressure on the timeline server by
> auditing both the number of events and size of events.
> Here are some observations based on my understanding of the design of
> timeline stores:
> Each timeline entity pushed explodes into many records in the database
> 1 marker record
> 1 domain record
> 1 record per event
> 2 records per related entity
> 2 records per primary filter (2 records per primary filter in
> RollingLevelDBTimelineStore; in leveldb it rewrites the entire entity
> record per primary filter)
> 1 record per other info
> For example
> Task Attempt Start
> 1 marker
> 1 domain
> 1 task attempt start event
> 1 related entity X 2
> 7 other info entries
> 4 primary filters X 2
> 20 records written in the database for task attempt start
> Task Attempt Finish
> 1 marker
> 1 domain
> 1 task attempt finish event
> 1 related entity X 2
> 5 other info entries
> 5 primary filters X 2
> 20 records written in the database for task attempt finish
> =====================================================
> QUESTION:
> =====================================================
> Is there any data we are publishing to the timeline server that is not
> in the UI?
> Do we use all the entities (TEZ_CONTAINER_ID for example)?
> Do we use all the primary filters?
> Do we use all the related entities specified?
> Are there any fields we don't use?
> Are there other approaches to consider to reduce entity count/size?
> Is there a way to store the same information in less space?
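
The per-entity record counts quoted above can be checked with a small calculation. This is only a sketch of the arithmetic in the description, not code from Tez or the timeline server: the function name and parameters are hypothetical, and the daily figure assumes exactly one attempt per task and counts only the start and finish entities.

```python
def records_per_entity(events, related_entities, primary_filters, other_info):
    """Estimate database records written for one timeline entity put.

    Follows the breakdown in the issue description (RollingLevelDBTimelineStore
    case): 1 marker, 1 domain, 1 per event, 2 per related entity,
    2 per primary filter, 1 per other info entry.
    """
    marker = 1
    domain = 1
    return (marker + domain + events
            + 2 * related_entities
            + 2 * primary_filters
            + other_info)


# Task attempt start: 1 event, 1 related entity, 4 primary filters, 7 other info
start = records_per_entity(events=1, related_entities=1,
                           primary_filters=4, other_info=7)

# Task attempt finish: 1 event, 1 related entity, 5 primary filters, 5 other info
finish = records_per_entity(events=1, related_entities=1,
                            primary_filters=5, other_info=5)

print(start, finish)  # 20 20

# Rough daily load at the 10,000,000 tasks/day figure, ASSUMING one attempt
# per task and ignoring every other entity type (DAG, vertex, task, container).
tasks_per_day = 10_000_000
attempt_records_per_day = tasks_per_day * (start + finish)
print(attempt_records_per_day)  # 400000000
```

Dropping a single primary filter from both entities would save 4 records per attempt under this model, which is why auditing filters and related entities looks like a cheap place to start.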



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)