Posted to mapreduce-issues@hadoop.apache.org by "Azuryy(Chijiong) (Updated) (JIRA)" <ji...@apache.org> on 2011/11/01 09:34:32 UTC

[jira] [Updated] (MAPREDUCE-3323) Distributed Cache for Map or Reduce or Both

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Azuryy(Chijiong) updated MAPREDUCE-3323:
----------------------------------------

    Description: 
We put some files into the Distributed Cache, but sometimes only the map tasks or only the reduce tasks use these cached files, not both. However, the TaskTracker always downloads the cached files from HDFS, so if the cache contains fairly large files, this costs a lot of time.

So this patch adds some new APIs to DistributedCache.java, as follows:

addArchiveToClassPathForMap
addArchiveToClassPathForReduce

addFileToClassPathForMap
addFileToClassPathForReduce

addCacheFileForMap
addCacheFileForReduce

addCacheArchiveForMap
addCacheArchiveForReduce


The new APIs do not affect the original interface, but each one applies to only map or only reduce, not both.

If you do need a cached file during both map and reduce, use the original interface.
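
To illustrate the intended effect, here is a minimal, self-contained sketch of phase-scoped registration. It is not Hadoop code and not the patch itself: the PhaseScopedCache class, its internal lists, the filesToLocalize helper, and the HDFS paths are all hypothetical. Only the method names addCacheFile, addCacheFileForMap, and addCacheFileForReduce are taken from the description above.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of phase-scoped cache registration (hypothetical, not Hadoop).
 * It only shows which files a task of each phase would need to download.
 */
final class PhaseScopedCache {
    enum Phase { MAP, REDUCE }

    private final List<URI> shared = new ArrayList<>();     // needed by both phases
    private final List<URI> mapOnly = new ArrayList<>();    // needed by map tasks only
    private final List<URI> reduceOnly = new ArrayList<>(); // needed by reduce tasks only

    /** Original-style registration: the file is needed by both phases. */
    void addCacheFile(URI uri) { shared.add(uri); }

    /** Proposed-style registration: the file is needed by map tasks only. */
    void addCacheFileForMap(URI uri) { mapOnly.add(uri); }

    /** Proposed-style registration: the file is needed by reduce tasks only. */
    void addCacheFileForReduce(URI uri) { reduceOnly.add(uri); }

    /** Returns the shared files plus the files scoped to the given phase. */
    List<URI> filesToLocalize(Phase phase) {
        List<URI> result = new ArrayList<>(shared);
        result.addAll(phase == Phase.MAP ? mapOnly : reduceOnly);
        return result;
    }

    public static void main(String[] args) {
        PhaseScopedCache cache = new PhaseScopedCache();
        // Hypothetical paths, for illustration only.
        cache.addCacheFile(URI.create("hdfs:///cache/shared.dat"));
        cache.addCacheFileForMap(URI.create("hdfs:///cache/map-lookup.dat"));
        cache.addCacheFileForReduce(URI.create("hdfs:///cache/reduce-model.bin"));

        System.out.println("map:    " + cache.filesToLocalize(Phase.MAP));
        System.out.println("reduce: " + cache.filesToLocalize(Phase.REDUCE));
    }
}
```

In this sketch a map task skips the reduce-only file and a reduce task skips the map-only file, which is the download saving the issue is after. How the actual patch records the scope in the job configuration is not stated in this message.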

    
> Distributed Cache for Map or Reduce or Both
> -------------------------------------------
>
>                 Key: MAPREDUCE-3323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3323
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Azuryy(Chijiong)
>
> We put some files into the Distributed Cache, but sometimes only the map tasks or only the reduce tasks use these cached files, not both. However, the TaskTracker always downloads the cached files from HDFS, so if the cache contains fairly large files, this costs a lot of time.
> So this patch adds some new APIs to DistributedCache.java, as follows:
> addArchiveToClassPathForMap
> addArchiveToClassPathForReduce
> addFileToClassPathForMap
> addFileToClassPathForReduce
> addCacheFileForMap
> addCacheFileForReduce
> addCacheArchiveForMap
> addCacheArchiveForReduce
> The new APIs do not affect the original interface, but each one applies to only map or only reduce, not both.
> If you do need a cached file during both map and reduce, use the original interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira