Posted to mapreduce-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2010/08/12 20:03:17 UTC

[jira] Resolved: (MAPREDUCE-2007) Is it possible that use ArrayList or other type instead Iterable when use reduce(Object, Iterable, Context)?

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved MAPREDUCE-2007.
--------------------------------------

    Resolution: Won't Fix

The framework can't assume that all of the values fit into memory, so it is not possible to make the API require a List object.

If you are just counting values, consider replacing each value with an integer count and implementing a combiner that adds the counts together. It will be much more efficient, because the combiner collapses each map task's output into partial sums before the data is sent to the reducer. The word count example shows how to do this.
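
The counting approach above can be sketched in plain Java, without the Hadoop classes, as follows. The class name CountSketch, the method sumCounts, and the sample data are hypothetical, chosen only for illustration; in a real job the same summing logic would sit in a Reducer subclass that is also registered as the combiner.

```java
import java.util.Arrays;
import java.util.List;

// Hadoop-free sketch: the same summing logic serves as combiner and reducer.
class CountSketch {
    // Sums the counts while streaming over them; nothing is buffered in memory.
    static long sumCounts(Iterable<Long> counts) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical data: each map task emitted a 1 per record for one key.
        List<Long> mapTaskA = Arrays.asList(1L, 1L, 1L);
        List<Long> mapTaskB = Arrays.asList(1L, 1L);

        // The combiner shrinks each map task's output to one partial count.
        long partialA = sumCounts(mapTaskA);
        long partialB = sumCounts(mapTaskB);

        // The reducer then only has to add up the partial counts.
        long total = sumCounts(Arrays.asList(partialA, partialB));
        System.out.println(total); // prints 5
    }
}
```

Addition is associative and commutative, so the result is the same however many times the combiner runs, which matters because the framework does not guarantee how often it is invoked.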

If you only need the first N values, iterate through the values you need and then return from the reduce method. There is no need to exhaust the iterator.
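
A minimal plain-Java sketch of stopping early, again without the Hadoop classes. FirstNSketch, firstN, stream, and the consumed counter are hypothetical names; stream stands in for the values the framework would hand to reduce, and the counter exists only to show how many values were actually pulled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class FirstNSketch {
    // Counts how many values the stand-in source has handed out.
    static int consumed = 0;

    // Hypothetical stand-in for the streamed reduce values: yields 0, 1, 2, ... below limit.
    static Iterable<Integer> stream(final int limit) {
        return () -> new Iterator<Integer>() {
            private int i = 0;
            public boolean hasNext() { return i < limit; }
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                consumed++;
                return i++;
            }
        };
    }

    // Collects at most n values and stops iterating as soon as it has them.
    static <T> List<T> firstN(Iterable<T> values, int n) {
        List<T> out = new ArrayList<>();
        if (n <= 0) {
            return out;
        }
        for (T v : values) {
            out.add(v);
            if (out.size() == n) {
                break; // the rest of the iterator is never touched
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> first = firstN(stream(1_000_000), 3);
        System.out.println(first + " consumed=" + consumed); // prints [0, 1, 2] consumed=3
    }
}
```

Only the N collected values are ever held in memory. In Hadoop itself the framework reuses the value object between iterations, so any value kept past the loop would need to be copied first.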

> Is it possible that use ArrayList or other type  instead Iterable  when use reduce(Object, Iterable, Context)?
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2007
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2007
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.20.2
>            Reporter: Hui Wen Han
>             Fix For: 0.20.2
>
>
> 1) Sometimes we only need the number of input values passed to a Reducer task,
> but we have to iterate over all of the input values to compute it.
> 2) Sometimes we only need a few elements (for example the top n, the last n, or random ones) from the input values of a Reducer task.
> If reduce(Object, Iterable, Context) could take an ArrayList or another type instead of Iterable, it would be more convenient.
