Posted to yarn-issues@hadoop.apache.org by "Amit Tiwari (JIRA)" <ji...@apache.org> on 2015/02/20 00:03:16 UTC
[jira] [Updated] (YARN-2556) Tool to measure the performance of the
timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amit Tiwari updated YARN-2556:
------------------------------
Attachment: YARN-2556.patch
Hi guys,
I've made the following enhancements to the previously posted patches:
1) Earlier, the payload was being set as the entityId. Since LevelDB uses the entityId as a key, it was crashing under moderate loads because each key was ~2MB. I've changed it to send the payload as part of OtherInfo instead. This is handled well.
2) Instead of posting a string of repeated 'a's as the payload, I now choose characters from a set. This ensures that LevelDB does not get away easily with compression (compression algorithms can easily shrink a string that consists of a single repeated character).
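To illustrate point 2, here is a minimal sketch (not the code from the patch) that compares how well a repeated-character payload and a mixed-character payload compress. The class name, the character set, and the fixed seed are all hypothetical, and java.util.zip.Deflater is only a stand-in for whatever compression LevelDB applies in a given deployment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

public class PayloadSketch {
    // Hypothetical character set; the set used by the patch is not shown in this message.
    static final String CHARSET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Builds a payload by drawing each character at random from CHARSET.
    static String mixedPayload(int length, long seed) {
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(CHARSET.charAt(random.nextInt(CHARSET.length())));
        }
        return sb.toString();
    }

    // Builds the old-style payload: a single character repeated.
    static String repeatedPayload(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    // Returns the number of bytes the string occupies after DEFLATE compression.
    static int deflatedSize(String s) {
        Deflater deflater = new Deflater();
        deflater.setInput(s.getBytes(StandardCharsets.US_ASCII));
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        int size = 10 * 1024; // 10KB, matching the payload size used in the runs
        System.out.println("repeated 'a' compressed bytes: "
            + deflatedSize(repeatedPayload(size)));
        System.out.println("mixed chars compressed bytes:  "
            + deflatedSize(mixedPayload(size, 42L)));
    }
}
```

The repeated payload should shrink to a tiny fraction of its size, while the mixed payload stays close to its original size, so the store has to write far more real data per entity.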
Here are some of the performance numbers that I've got:
I ran 20 concurrent jobs, each with the arguments -m 300 -s 10 -t 20.
On a 36-node cluster, this results in ~830 concurrent containers (e.g. maps), each firing 10KB of payload, 20 times.
LevelDB seems to hold up fine.
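For a sense of scale, the figures above can be turned into a rough volume estimate. This is back-of-the-envelope arithmetic only, assuming 1KB = 1024 bytes and ignoring entity metadata and HTTP overhead; the class and method names are hypothetical.

```java
public class LoadEstimate {
    // Bytes posted by one container: payload size times number of posts.
    static long bytesPerContainer(long payloadKb, int posts) {
        return payloadKb * 1024L * posts;
    }

    // Bytes posted by one wave of concurrently running containers.
    static long bytesPerWave(int containers, long payloadKb, int posts) {
        return containers * bytesPerContainer(payloadKb, posts);
    }

    public static void main(String[] args) {
        int containers = 830; // approximate concurrent containers reported above
        long payloadKb = 10;  // 10KB per post
        int posts = 20;       // each container posts 20 times

        long perContainer = bytesPerContainer(payloadKb, posts);
        long perWave = bytesPerWave(containers, payloadKb, posts);

        System.out.println("per container: " + perContainer / 1024 + " KB");
        System.out.println("per wave:      " + perWave / (1024.0 * 1024.0) + " MiB");
    }
}
```

Under these assumptions each container writes about 200KB, and one wave of ~830 containers writes roughly 160 MiB of payload to the timeline server.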
Would you have other ways that I could stress/load the system even more?
thanks
--amit
> Tool to measure the performance of the timeline server
> ------------------------------------------------------
>
> Key: YARN-2556
> URL: https://issues.apache.org/jira/browse/YARN-2556
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Jonathan Eagles
> Assignee: Chang Li
> Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch
>
>
> We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity.
> I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start.
> This could be done as an example or test job that could be tied into gridmix.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)