Posted to mapreduce-user@hadoop.apache.org by Kevin <ke...@gmail.com> on 2012/06/18 14:47:36 UTC

TableReducer keyout

Hi,

I am going through some samples of using MapReduce with HBase. My question
is concerning the importance of the KEYOUT type of a TableReducer. Does the
output key really matter if the output value must always be a Put or a
Delete instance, in which the row key for the sink table is always
specified? Can I just use null when writing the output key in the reducer
class (e.g., context.write(null, MyPut))? It seems like in this usage of
MapReduce the keyout would be only used when chaining jobs.

-Kevin

Re: TableReducer keyout

Posted by Harsh J <ha...@cloudera.com>.

Hey Kevin,

(I moved this thread to the HBase user list, which is the more appropriate
place given the libraries your question involves. I BCC'd mapreduce-user
and CC'd you in case you aren't subscribed to the HBase user list.)

TableOutputFormat ignores the output key and writes only the value (the
Put or Delete), so it is safe to pass null as the key. This is also
documented at
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html
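
To make the point concrete, here is a minimal sketch in plain Java of a sink that takes a key and a value but only ever reads the value. It is a toy model, not HBase code: ToyPut, ToyTableWriter and KeyoutDemo are hypothetical names invented for illustration, and the in-memory map merely stands in for the sink table. The only behaviour it borrows from the thread is that the key is ignored and the row key travels inside the value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an HBase Put: it carries its own row key.
final class ToyPut {
    final String row;
    final String value;

    ToyPut(String row, String value) {
        this.row = row;
        this.value = value;
    }
}

// Hypothetical stand-in for a writer that ignores keys.
// The KEY type parameter exists only to satisfy the generic signature.
final class ToyTableWriter<KEY> {
    // In-memory map used here in place of a real table.
    private final Map<String, String> table = new LinkedHashMap<>();

    // The key argument is never read; the row comes from the value itself.
    void write(KEY ignoredKey, ToyPut put) {
        table.put(put.row, put.value);
    }

    String get(String row) {
        return table.get(row);
    }

    int size() {
        return table.size();
    }
}

final class KeyoutDemo {
    public static void main(String[] args) {
        ToyTableWriter<String> writer = new ToyTableWriter<>();
        writer.write(null, new ToyPut("row1", "a"));        // null key
        writer.write("some-key", new ToyPut("row2", "b"));  // non-null key
        // Both rows land in the table regardless of the key.
        System.out.println(writer.get("row1") + " " + writer.get("row2")); // prints "a b"
    }
}
```

In an actual job the corresponding call is the one from the question, context.write(null, MyPut), made from the reducer.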

On Mon, Jun 18, 2012 at 6:17 PM, Kevin <ke...@gmail.com> wrote:
> Hi,
>
> I am going through some samples of using MapReduce with HBase. My question
> is concerning the importance of the KEYOUT type of a TableReducer. Does the
> output key really matter if the output value must always be a Put or a
> Delete instance, in which the row key for the sink table is always
> specified? Can I just use null when writing the output key in the reducer
> class (e.g., context.write(null, MyPut))? It seems like in this usage of
> MapReduce the keyout would be only used when chaining jobs.
>
> -Kevin

-- 
Harsh J
