Posted to common-user@hadoop.apache.org by Stan Rosenberg <sr...@proclivitysystems.com> on 2011/08/11 05:45:37 UTC
Deserialization in ReduceContext
Hi,
Would someone please explain why ReduceContext.nextKeyValue() creates only a
single instance of the deserialized value class?
These are rather non-standard semantics for deserialization, and they drove me
insane.
E.g., the following code is rather intuitive but patently wrong; it always
adds a single instance of V to the set.
void reduce(K key, Iterable<V> values, Context context) {
    TreeSet<V> union = new TreeSet<V>();
    for (V v : values) {
        union.add(v);
    }
}
Thanks,
stan
Re: Deserialization in ReduceContext
Posted by Stan Rosenberg <sr...@proclivitysystems.com>.
On Thu, Aug 11, 2011 at 1:04 AM, Harsh J <ha...@cloudera.com> wrote:
> Stan,
>
> Welcome to the reference/reuse hell. It's all part of the learning
> process; it's documented here and there, but everyone learns this by
> tearing their hair out, I noticed :-)
>
> Have a look at the previous discussion of this:
> http://search-hadoop.com/m/p0VEl1ywwrs1
>
> Also, we had some pointers on how to go about storing values in memory
> within mappers/reducers: http://search-hadoop.com/m/ycUTimJIEg and
> http://search-hadoop.com/m/uXuH41J1lbA
>
> Hope these help!
Many thanks!!
Re: Deserialization in ReduceContext
Posted by Harsh J <ha...@cloudera.com>.
Stan,
Welcome to the reference/reuse hell. It's all part of the learning
process; it's documented here and there, but everyone learns this by
tearing their hair out, I noticed :-)
Have a look at the previous discussion of this:
http://search-hadoop.com/m/p0VEl1ywwrs1
Also, we had some pointers on how to go about storing values in memory
within mappers/reducers: http://search-hadoop.com/m/ycUTimJIEg and
http://search-hadoop.com/m/uXuH41J1lbA
Hope these help!
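For anyone landing here later, the effect can be reproduced without Hadoop. The sketch below is only a plain-Java simulation: Box, reusing(), collectByReference() and collectByCopy() are hypothetical names invented for illustration, and the iterable merely imitates the single reused instance described above. It shows why storing the reference leaves one element in the TreeSet, and how copying each value before storing it avoids that.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java simulation of value reuse; no Hadoop classes are involved. */
final class ReuseDemo {

    /** Hypothetical mutable value type, standing in for a reducer value. */
    static final class Box implements Comparable<Box> {
        int value;

        Box() {}

        /** Copy constructor: takes a snapshot of the other Box's current state. */
        Box(Box other) {
            this.value = other.value;
        }

        @Override
        public int compareTo(Box o) {
            return Integer.compare(value, o.value);
        }
    }

    /** Iterable that refills and returns ONE shared Box, imitating instance reuse. */
    static Iterable<Box> reusing(final List<Integer> raw) {
        final Box shared = new Box();
        return () -> new Iterator<Box>() {
            private final Iterator<Integer> it = raw.iterator();

            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public Box next() {
                shared.value = it.next(); // overwrite the same object in place
                return shared;
            }
        };
    }

    /** Buggy: every add() receives the same reference, so only one element is kept. */
    static TreeSet<Box> collectByReference(Iterable<Box> values) {
        TreeSet<Box> union = new TreeSet<>();
        for (Box v : values) {
            union.add(v);
        }
        return union;
    }

    /** Fixed: a fresh copy is stored for each value. */
    static TreeSet<Box> collectByCopy(Iterable<Box> values) {
        TreeSet<Box> union = new TreeSet<>();
        for (Box v : values) {
            union.add(new Box(v));
        }
        return union;
    }

    public static void main(String[] args) {
        List<Integer> raw = Arrays.asList(3, 1, 2, 3);
        System.out.println(collectByReference(reusing(raw)).size()); // prints 1
        System.out.println(collectByCopy(reusing(raw)).size());      // prints 3
    }
}
```

In a real reducer the copy would have to come from whatever the value type offers. For Writable values, WritableUtils.clone(value, conf) is one option, at the cost of one allocation per value, so it only makes sense when the values for a key fit in memory.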
--
Harsh J