Posted to mapreduce-issues@hadoop.apache.org by "Ray Chiang (JIRA)" <ji...@apache.org> on 2015/06/09 23:58:00 UTC

[jira] [Commented] (MAPREDUCE-6376) Fix long load times of .jhist file in JobHistoryServer

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579634#comment-14579634 ] 

Ray Chiang commented on MAPREDUCE-6376:
---------------------------------------

A few comments:

1) It turns out that Avro parsing is anywhere from 70% to 90% of the .jhist processing time.  Some data points for the json .jhist file:

- 50k mappers
-- 20 seconds overall read time
-- 16.6 seconds Avro parsing/reading
- 404k mappers
-- 68 seconds overall read time
-- 49 seconds Avro parsing/reading
- 751k mappers
-- 300 seconds overall read time
-- 280 seconds Avro parsing/reading
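
As a quick check on the percentages in 1), the Avro share of read time can be recomputed from the three data points above. This is only a sketch of the arithmetic; the name avro_share is made up for illustration and is not part of the patch.

```python
# (mappers, overall read seconds, Avro parsing/reading seconds),
# copied from the data points in the comment above.
data_points = [
    ("50k", 20.0, 16.6),
    ("404k", 68.0, 49.0),
    ("751k", 300.0, 280.0),
]


def avro_share(overall_s, avro_s):
    """Fraction of the overall .jhist read time spent in Avro parsing/reading."""
    return avro_s / overall_s


for mappers, overall_s, avro_s in data_points:
    share = avro_share(overall_s, avro_s)
    print(f"{mappers} mappers: {share:.0%} of read time in Avro")
```

The three shares come out at roughly 83%, 72%, and 93%, which is in line with the "70% to 90%" estimate.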

2) I couldn't get access to a machine to generate jobs with more than 50k mappers, but my rough experiments show about a 4x to 5x speedup in Avro parsing/reading.  For the worst-case improvement on 751k mappers, I would expect the 300 seconds of processing time to get down to about 90 seconds.  There is room to shave down the processing time by a few seconds here and there, but that's probably better left to subsequent JIRAs.
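
The 90 second estimate can be reproduced with a short back-of-the-envelope calculation. The sketch below assumes that the speedup applies only to the Avro parsing/reading portion and that the remaining 20 seconds of non-Avro time stays unchanged; that assumption and the name projected_total are illustrative, not something measured.

```python
def projected_total(overall_s, avro_s, speedup):
    """Projected read time if only the Avro portion gets `speedup` times faster.

    Assumption: the non-Avro part of the read time is unaffected.
    """
    other_s = overall_s - avro_s
    return other_s + avro_s / speedup


# 751k mapper data point: 300 seconds overall, 280 seconds in Avro.
for speedup in (4.0, 5.0):
    total = projected_total(300.0, 280.0, speedup)
    print(f"{speedup:.0f}x Avro speedup: about {total:.0f} seconds overall")
```

Under that assumption, a 4x speedup gives about 90 seconds and a 5x speedup gives about 76 seconds, so the 90 second figure corresponds to the low end of the range.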

3) The .jhist file output format is now a configuration option, with the default set to json.


> Fix long load times of .jhist file in JobHistoryServer
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-6376
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6376
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 2.7.0
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>         Attachments: MAPREDUCE-6376.001.patch
>
>
> When you click on a Job link in the JHS Web UI, it loads the .jhist file.  For jobs which have a large number of tasks, the load time can break UI responsiveness.


