Posted to common-user@hadoop.apache.org by Tarandeep Singh <ta...@gmail.com> on 2008/05/28 19:30:21 UTC

behavior of MapWritable as Key in Map Reduce

Hi,

I want to understand the behavior of MapWritable when it is used as an
intermediate key in Mappers and Reducers.

Suppose I create a MapWritable object containing the following key-value pairs:
(K1, V1), (K2, V2), (K3, V3)
How will the Map Reduce framework group and sort the keys (MapWritable
objects) emitted by the Mapper function? Will it group on the keys of the
MapWritable (K1, K2, K3 in this case), on its values (V1, V2, V3 in this
case), or on some combination of both?

Similarly, if I use an ArrayWritable (of Text) as the key, with say V1, V2, V3
as its members, will the framework group and sort based on the string
"V1 concat V2 concat V3"? If so, what advantage does ArrayWritable offer over
a Text key (the concatenation of V1, V2, V3 in the above case), besides the
fact that the reducer need not split the key back into its individual
components? (I am just curious to know the best practice: which one should be
used in such a case, ArrayWritable or Text?)
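
To make the concatenation alternative concrete, here is a rough sketch of what
I mean by packing V1, V2, V3 into a single string key and separating them
again at the reducer. It is plain Java with no Hadoop classes; the class name,
the method names, and the tab delimiter are all assumptions I made up for
illustration, not anything the framework provides.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: ConcatKeySketch, join, split and DELIM are
// hypothetical names, and the tab delimiter is an arbitrary choice.
class ConcatKeySketch {
    static final String DELIM = "\t";

    // Mapper side: pack the components into one string to emit as the key.
    static String join(List<String> parts) {
        return String.join(DELIM, parts);
    }

    // Reducer side: recover the individual components from the key.
    // The -1 limit keeps trailing empty components instead of dropping them.
    static List<String> split(String key) {
        return Arrays.asList(key.split(DELIM, -1));
    }

    public static void main(String[] args) {
        String key = join(Arrays.asList("V1", "V2", "V3"));
        System.out.println(split(key)); // prints [V1, V2, V3]
    }
}
```

This only round-trips correctly if no component contains the delimiter itself,
which is part of why I am asking whether ArrayWritable is the better choice.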

Thanks,
Taran