Posted to common-user@hadoop.apache.org by Stan Rosenberg <sr...@proclivitysystems.com> on 2011/08/11 05:45:37 UTC

Deserialization in ReduceContext

Hi,

Would someone please explain why ReduceContext.nextKeyValue() creates only a
single instance of the deserialized class and reuses it for every record?
These are rather non-standard semantics for deserialization, and they drove
me insane.

E.g., the following code is rather intuitive but patently wrong; the set
always ends up holding a single instance of V, because every iteration
returns the same object.

void reduce(K key, Iterable<V> values, Context context) {
    TreeSet<V> union = new TreeSet<V>();
    for (V v : values) {
        union.add(v);
    }
}

Thanks,

stan
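
The symptom described in the question can be reproduced without Hadoop. The
sketch below is only a simulation: MutableInt and ReusingValues are
hypothetical stand-ins (not Hadoop classes) for a mutable value type and for
an iterable that refills one shared object per record, which is the behavior
the question attributes to ReduceContext.nextKeyValue().

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeSet;

class ReuseDemo {
    // Hypothetical stand-in for a mutable, comparable value type.
    static final class MutableInt implements Comparable<MutableInt> {
        int value;

        @Override
        public int compareTo(MutableInt other) {
            return Integer.compare(value, other.value);
        }
    }

    // Hypothetical stand-in for the reducer's values: every call to next()
    // overwrites and returns the same object instead of allocating a new one.
    static final class ReusingValues implements Iterable<MutableInt> {
        private final int[] records;
        private final MutableInt shared = new MutableInt();

        ReusingValues(int... records) {
            this.records = records;
        }

        @Override
        public Iterator<MutableInt> iterator() {
            return new Iterator<MutableInt>() {
                private int next = 0;

                @Override
                public boolean hasNext() {
                    return next < records.length;
                }

                @Override
                public MutableInt next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    shared.value = records[next++]; // "deserialize" in place
                    return shared;
                }
            };
        }
    }

    // Same shape as the loop in the question: stores the reference as-is.
    static TreeSet<MutableInt> naiveUnion(Iterable<MutableInt> values) {
        TreeSet<MutableInt> union = new TreeSet<MutableInt>();
        for (MutableInt v : values) {
            union.add(v);
        }
        return union;
    }

    public static void main(String[] args) {
        TreeSet<MutableInt> union = naiveUnion(new ReusingValues(3, 1, 2));
        System.out.println(union.size());        // prints 1
        System.out.println(union.first().value); // prints 2, the last record
    }
}
```

Three records go in, but the set holds one element whose contents are those
of the last record read, since every add() received the same reference.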

Re: Deserialization in ReduceContext

Posted by Stan Rosenberg <sr...@proclivitysystems.com>.
On Thu, Aug 11, 2011 at 1:04 AM, Harsh J <ha...@cloudera.com> wrote:

> Stan,
>
> Welcome to the reference/reuse hell. It's all part of the learning
> process; it's documented here and there, but everyone learns this by
> tearing their hair out, I've noticed :-)
>
> Have a look at the previous discussion on this:
> http://search-hadoop.com/m/p0VEl1ywwrs1
>
> Also, we had some pointers on how to go about storing values in memory
> within mappers/reducers: http://search-hadoop.com/m/ycUTimJIEg and
> http://search-hadoop.com/m/uXuH41J1lbA
>
> Hope these help!


Many thanks!!

Re: Deserialization in ReduceContext

Posted by Harsh J <ha...@cloudera.com>.
Stan,

Welcome to the reference/reuse hell. It's all part of the learning
process; it's documented here and there, but everyone learns this by
tearing their hair out, I've noticed :-)

Have a look at the previous discussion on this:
http://search-hadoop.com/m/p0VEl1ywwrs1

Also, we had some pointers on how to go about storing values in memory
within mappers/reducers: http://search-hadoop.com/m/ycUTimJIEg and
http://search-hadoop.com/m/uXuH41J1lbA

Hope these help!


-- 
Harsh J
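
The reply above points to earlier threads on storing values in memory inside
mappers and reducers. The usual remedy for object reuse is to store a copy of
each value rather than the reference handed out by the iterator. Below is a
minimal, Hadoop-free sketch of that idea: CopyOnStore, reusing() and
unionOfCopies() are hypothetical names, and a shared StringBuilder stands in
for a reused value object. In a real reducer the copier would depend on the
value type; for Text, for instance, the copy constructor new Text(v) is one
option.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Function;

class CopyOnStore {
    // Hypothetical stand-in for the reducer's values: one shared
    // StringBuilder is cleared and refilled for every record.
    static Iterable<StringBuilder> reusing(final List<String> records) {
        final StringBuilder shared = new StringBuilder();
        return () -> {
            final Iterator<String> it = records.iterator();
            return new Iterator<StringBuilder>() {
                @Override
                public boolean hasNext() {
                    return it.hasNext();
                }

                @Override
                public StringBuilder next() {
                    String record = it.next(); // throws if exhausted
                    shared.setLength(0);
                    shared.append(record);
                    return shared;
                }
            };
        };
    }

    // Applies the copier to each value and stores the result, so later
    // iterations cannot overwrite what is already in the set.
    static <V, C extends Comparable<C>> TreeSet<C> unionOfCopies(
            Iterable<V> values, Function<V, C> copier) {
        TreeSet<C> union = new TreeSet<C>();
        for (V v : values) {
            union.add(copier.apply(v));
        }
        return union;
    }

    public static void main(String[] args) {
        Iterable<StringBuilder> values = reusing(Arrays.asList("b", "a", "b", "c"));
        TreeSet<String> union = unionOfCopies(values, StringBuilder::toString);
        System.out.println(union); // prints [a, b, c]
    }
}
```

Copying costs one allocation per stored value, which is the cost the reuse
was meant to avoid, so it is worth copying only the values that actually
need to outlive the current iteration.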