Posted to yarn-issues@hadoop.apache.org by "Amit Tiwari (JIRA)" <ji...@apache.org> on 2015/02/20 00:03:16 UTC

[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

     [ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Tiwari updated YARN-2556:
------------------------------
    Attachment: YARN-2556.patch

Hi guys,
I've made the following enhancements to the previously posted patches:
1) Earlier, the payload was being set as the entityId. Since LevelDB uses the entityId as a key, it was crashing under moderate loads because each key was ~2MB. I've changed the tool to send the payload as part of OtherInfo instead, and this is handled well.
2) Instead of posting a string of repeated 'a's as the payload, I now build it from a set of characters. This ensures that LevelDB does not get away easily with compression (a string made up of a single repeated character compresses trivially).
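
To illustrate the two points above, here is a minimal sketch in Python (not the code in the patch). The character set, the entity id format, the "PERF_TEST" type and the dictionary field names are all assumptions made for the example, and zlib is only a stand-in for whatever compression LevelDB is configured with.

```python
import random
import string
import zlib

# Assumed character set; the patch may draw from a different one.
CHARSET = string.ascii_letters + string.digits


def repeated_payload(size):
    """Old-style payload: a single character repeated `size` times."""
    return "a" * size


def mixed_payload(size, seed=0):
    """New-style payload: `size` characters drawn from CHARSET."""
    rng = random.Random(seed)
    return "".join(rng.choice(CHARSET) for _ in range(size))


def compressed_ratio(payload):
    """Compressed size divided by raw size (smaller means more compressible)."""
    raw = payload.encode("ascii")
    return len(zlib.compress(raw)) / len(raw)


def build_entity(index, payload):
    """Hypothetical entity layout: short id as the key, payload in other info."""
    return {
        "entity": f"perf_entity_{index}",   # stays small, safe to use as a key
        "entitytype": "PERF_TEST",          # made-up type name
        "otherinfo": {"payload": payload},  # bulk data lives here, not in the id
    }


size = 10 * 1024  # 10KB, matching -s 10
ratio_repeated = compressed_ratio(repeated_payload(size))
ratio_mixed = compressed_ratio(mixed_payload(size))
entity = build_entity(0, mixed_payload(size))

print(f"repeated 'a': {ratio_repeated:.3f} of original size")
print(f"mixed chars:  {ratio_mixed:.3f} of original size")
print(f"entity id length: {len(entity['entity'])} chars")
```

The repeated payload shrinks to a tiny fraction of its size, while the mixed payload stays close to its raw size, so the store has to write something close to the full 10KB per post.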

Here are some of the performance numbers that I've got:
I ran 20 concurrent jobs with the arguments -m 300 -s 10 -t 20.
On a 36-node cluster, this results in ~830 concurrent containers (e.g. maps), each firing 10KB of payload, 20 times.
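
As a rough back-of-the-envelope check of what this load amounts to, the sketch below works through the arithmetic. It assumes that -m is the number of maps per job, -s the payload size in KB and -t the number of posts per map; only the last two are implied by the description above, so treat the totals as estimates.

```python
# Figures taken from the run described above.
JOBS = 20
MAPS_PER_JOB = 300          # assumed meaning of -m
PAYLOAD_KB = 10             # -s: payload size per post, in KB
POSTS_PER_MAP = 20          # -t: posts per map
CONCURRENT_CONTAINERS = 830

kb_per_map = PAYLOAD_KB * POSTS_PER_MAP                # data written by one map
total_maps = JOBS * MAPS_PER_JOB                       # maps across all jobs
total_posts = total_maps * POSTS_PER_MAP               # timeline posts overall
total_mb = total_maps * kb_per_map / 1024              # data written overall
wave_mb = CONCURRENT_CONTAINERS * kb_per_map / 1024    # one wave of containers

print(f"per map:      {kb_per_map} KB")
print(f"total posts:  {total_posts}")
print(f"total data:   {total_mb:.1f} MB")
print(f"per wave:     {wave_mb:.1f} MB across {CONCURRENT_CONTAINERS} containers")
```

Under these assumptions the whole run pushes a little over 1GB through the timeline server, with roughly 160MB coming from each wave of concurrent containers.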

LevelDB seems to hold up fine.

Are there other ways I could stress/load the system even more?
Thanks
--amit

> Tool to measure the performance of the timeline server
> ------------------------------------------------------
>
>                 Key: YARN-2556
>                 URL: https://issues.apache.org/jira/browse/YARN-2556
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Jonathan Eagles
>            Assignee: Chang Li
>         Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch
>
>
> We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity.
> I propose we create a MapReduce job that can measure timeline server write and read performance. Transactions per second and I/O for both reads and writes would be a good start.
> This could be done as an example or test job that could be tied into gridmix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)