Posted to mapreduce-dev@hadoop.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2010/05/20 03:01:55 UTC

[jira] Created: (MAPREDUCE-1802) allow outputcommitters to skip setup/cleanup

allow outputcommitters to skip setup/cleanup
--------------------------------------------

                 Key: MAPREDUCE-1802
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1802
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Joydeep Sen Sarma
            Assignee: Joydeep Sen Sarma


Job setup and cleanup overheads in our (larger) clusters are very significant and add to latency for small jobs. It turns out that Hive does not require job setup and cleanup at all, since the Hive client itself manages all output and temporary files. So it would be a big win for our environment (and for Hive users in general) if we could skip job setup/cleanup altogether.

The proposal is to add new calls to the OutputCommitter interface (along the lines of needsTaskCommit()) that optionally allow setup/cleanup to be skipped, and for the JT (JobTracker) to take these into account when scheduling setup/cleanup. NullOutputFormat, for example, should not need setup/cleanup.
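
To make the shape of the proposal concrete, here is a rough sketch. It is only an illustration under stated assumptions: SketchCommitter is a simplified stand-in rather than Hadoop's actual OutputCommitter, the method names needsJobSetup() and needsJobCleanup() are hypothetical, and SketchScheduler only mimics the decision the JT would make.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Hadoop's OutputCommitter. NOT the real class.
abstract class SketchCommitter {
    abstract void setupJob();
    abstract void cleanupJob();

    // Hypothetical hooks, modeled on needsTaskCommit(). Defaulting to true
    // keeps today's behavior for committers that do not override them.
    boolean needsJobSetup() { return true; }
    boolean needsJobCleanup() { return true; }
}

// Committer with no job-level work to do (the NullOutputFormat case).
class NoOpCommitter extends SketchCommitter {
    @Override void setupJob() {}
    @Override void cleanupJob() {}
    @Override boolean needsJobSetup() { return false; }
    @Override boolean needsJobCleanup() { return false; }
}

// Committer that keeps the default behavior: setup and cleanup both run.
class DefaultCommitter extends SketchCommitter {
    @Override void setupJob() {}
    @Override void cleanupJob() {}
}

// Hypothetical stand-in for the scheduling decision made by the JT.
class SketchScheduler {
    static List<String> planTasks(SketchCommitter committer) {
        List<String> plan = new ArrayList<>();
        if (committer.needsJobSetup()) {
            plan.add("JOB_SETUP");
        }
        plan.add("MAP_REDUCE");
        if (committer.needsJobCleanup()) {
            plan.add("JOB_CLEANUP");
        }
        return plan;
    }
}
```

With this shape, existing committers keep scheduling both extra tasks, while a committer that opts out leaves only the map/reduce work to schedule.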

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (MAPREDUCE-1802) allow outputcommitters to skip setup/cleanup

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma resolved MAPREDUCE-1802.
------------------------------------------

    Resolution: Duplicate

