You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Dick King (JIRA)" <ji...@apache.org> on 2010/08/02 19:58:20 UTC

[jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed

    [ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894624#action_12894624 ] 

Dick King commented on MAPREDUCE-323:
-------------------------------------

Unfortunately, since the job sequence numbers will be assigned at job start time but the DONE directory is built as jobs complete, it's not the case that we'll only be filling one serial number block at once.  They will, however, be closed when the day is complete.

> Improve the way job history files are managed
> ---------------------------------------------
>
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This can cause problems when there is a need to search the history folder (job-recovery etc). It would be nice if we group all the jobs under a _user_ folder. So all the jobs for user _amar_ will go in _history-folder/amar/_. Jobs can be categorized using various features like _jobid, date, jobname_ etc but using _username_ will make the search much more efficient and also will not result into namespace explosion. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.