Posted to user@hbase.apache.org by Ioakim Perros <im...@gmail.com> on 2012/08/04 14:22:43 UTC

Bulk import - key, value ambiguity

Hi,

Does anyone know why the HFileOutputFormat API
(http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.html#configureIncrementalLoad%28org.apache.hadoop.mapreduce.Job,%20org.apache.hadoop.hbase.client.HTable%29)
suggests using an ImmutableBytesWritable object as the key and a
KeyValue object as the value, when each KeyValue object already
contains, as one of its fields, the row it belongs to? In my
experience, it is this row field that ends up being used as the
table's row key.

Thanks in advance!

Re: Bulk import - key, value ambiguity

Posted by Paul Mackles <pm...@adobe.com>.

Probably because M/R requires a key, and because you want M/R to sort on
that key: HFiles have to be written in sorted row order.
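
As a rough illustration of that point, here is a sketch in plain Java. It is not HBase or Hadoop code: Cell and Emitted are hypothetical stand-ins for KeyValue and for a mapper's (key, value) output, and shuffleSort only models the framework sorting on the emitted key. The assumption is that the mapper emits each cell keyed by the cell's own row.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the shuffle; none of these types are HBase or Hadoop classes.
class ShuffleSortSketch {

    // Hypothetical stand-in for a KeyValue: the cell carries its own row.
    static final class Cell {
        final byte[] row;
        final String value;
        Cell(byte[] row, String value) { this.row = row; this.value = value; }
    }

    // Hypothetical stand-in for one (key, value) pair emitted by a mapper.
    static final class Emitted {
        final byte[] key;
        final Cell cell;
        Emitted(byte[] key, Cell cell) { this.key = key; this.cell = cell; }
    }

    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    // Emits a cell keyed by its own row, so the key and the cell's row agree.
    static Emitted emit(String row, String value) {
        byte[] rowBytes = row.getBytes(StandardCharsets.UTF_8);
        return new Emitted(rowBytes, new Cell(rowBytes, value));
    }

    // Models the shuffle: pairs are ordered by the emitted key only.
    static List<Emitted> shuffleSort(List<Emitted> mapOutput) {
        List<Emitted> sorted = new ArrayList<>(mapOutput);
        sorted.sort((x, y) -> compareBytes(x.key, y.key));
        return sorted;
    }

    // Rows in the order a writer would receive them, read from the cells.
    static List<String> rowsInWriteOrder(List<Emitted> sorted) {
        List<String> rows = new ArrayList<>();
        for (Emitted e : sorted) {
            rows.add(new String(e.cell.row, StandardCharsets.UTF_8));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Emitted> mapOutput = new ArrayList<>();
        mapOutput.add(emit("row-c", "3"));
        mapOutput.add(emit("row-a", "1"));
        mapOutput.add(emit("row-b", "2"));
        System.out.println(rowsInWriteOrder(shuffleSort(mapOutput)));
    }
}
```

In this model the key exists only so the pairs can be sorted, while the row that gets written comes from the cell itself. If the two disagreed, the sort order and the rows actually written could diverge, which is presumably why the key is expected to match the row inside the KeyValue.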
