Posted to user@hbase.apache.org by Shrijeet Paliwal <sh...@rocketfuel.com> on 2011/09/29 20:58:09 UTC

How to retrieve all columns of a CF and adding it in a put call

Hello,

I am trying to create a new table with exactly the same data as an
existing table, BUT with the row key in the new table set to a column
value from the old table.

Following is my map method (using a map-only MR job):

    public void map(ImmutableBytesWritable key, Result value, Context context)
            throws IOException, InterruptedException {
        // The new row key is the value of one column in the old row.
        byte[] mod_key = value.getValue(INPUT_FAMILY, INPUT_COLUMN);
        if (mod_key != null) {
            // family -> (qualifier -> latest value)
            Map<byte[], NavigableMap<byte[], byte[]>> cf = value.getNoVersionMap();
            Put put = new Put(mod_key);
            for (byte[] c : cf.keySet()) {
                for (byte[] q : cf.get(c).keySet()) {
                    put.add(c, q, cf.get(c).get(q));
                }
            }
            context.write(key, put);
        }
    }

I am inclined to think there has to be a more efficient way to do
this. By that I mean a way that does not iterate through all the
columns. Thoughts?

Browsing the code, I found some usages like this:

    outval.add(INPUT_FAMILY, null, value.getValue(INPUT_FAMILY, null));

What does the above mean? Does it get the bytes representing all
columns of INPUT_FAMILY and add them to the Put object?

Re: How to retrieve all columns of a CF and adding it in a put call

Posted by Shrijeet Paliwal <sh...@rocketfuel.com>.
Understood. Thank you J-D.

On Thu, Oct 6, 2011 at 11:48 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Well, you need to insert all the columns, so yes, you need to iterate
> over them all. There's a shorter way to do it though: look at the
> Import class in the HBase code:
>
>    private static Put resultToPut(ImmutableBytesWritable key, Result result)
>    throws IOException {
>      Put put = new Put(key.get());
>      for (KeyValue kv : result.raw()) {
>        put.add(kv);
>      }
>      return put;
>    }
>
> Regarding your last question, that line just sets the value of the
> column in the input family that has an empty qualifier. It does not
> copy the whole family.
>
> J-D

Re: How to retrieve all columns of a CF and adding it in a put call

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Well, you need to insert all the columns, so yes, you need to iterate
over them all. There's a shorter way to do it though: look at the
Import class in the HBase code:

    private static Put resultToPut(ImmutableBytesWritable key, Result result)
    throws IOException {
      Put put = new Put(key.get());
      for (KeyValue kv : result.raw()) {
        put.add(kv);
      }
      return put;
    }
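
As a rough illustration of why one flat loop is enough, here is a plain-Java sketch of the same re-keying idea. It does not use HBase at all: `Cell`, `Row`, `rekey`, and `b` are hypothetical stand-ins (for KeyValue, Result/Put, the mapper body, and a string-to-bytes helper), so treat it as a model of the data flow under those assumptions, not as the Import class or any HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model only: Cell and Row are hypothetical stand-ins for
// HBase's KeyValue and Result/Put, NOT the real HBase classes.
final class RekeySketch {

    /** One stored cell: (family, qualifier, value). */
    static final class Cell {
        final byte[] family, qualifier, value;
        Cell(byte[] family, byte[] qualifier, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** A row key plus its cells, kept as one flat list. */
    static final class Row {
        final byte[] key;
        final List<Cell> cells = new ArrayList<>();
        Row(byte[] key) { this.key = key; }
    }

    /**
     * Copies every cell of `old` into a new row whose key is the value
     * stored at (family, qualifier). Returns null when that column is
     * absent, mirroring the `mod_key != null` check in the mapper.
     */
    static Row rekey(Row old, byte[] family, byte[] qualifier) {
        byte[] newKey = null;
        for (Cell c : old.cells) {
            if (Arrays.equals(c.family, family) && Arrays.equals(c.qualifier, qualifier)) {
                newKey = c.value;
                break;
            }
        }
        if (newKey == null) {
            return null;
        }
        Row out = new Row(newKey);
        out.cells.addAll(old.cells); // one flat pass over the cells, no nested maps
        return out;
    }

    /** Helper: UTF-8 bytes of a string. */
    static byte[] b(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    public static void main(String[] args) {
        Row old = new Row(b("row-1"));
        old.cells.add(new Cell(b("cf"), b("id"), b("user-42")));
        old.cells.add(new Cell(b("cf"), b("name"), b("alice")));
        Row fresh = rekey(old, b("cf"), b("id"));
        // prints "user-42 2"
        System.out.println(new String(fresh.key, StandardCharsets.UTF_8) + " " + fresh.cells.size());
    }
}
```

One caveat worth checking against your HBase version: Put.add(KeyValue) appears to verify that the KeyValue's row matches the Put's row, so when the row key changes, the family, qualifier, timestamp, and value probably have to be added individually rather than by passing the KeyValue itself.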

Regarding your last question, that line just sets the value of the
column in the input family that has an empty qualifier. It does not
copy the whole family.
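
To make the empty-qualifier point concrete, a small plain-Java model may help. It imitates the family -> qualifier -> value layout with nested sorted maps. `EmptyQualifierSketch`, `add`, and `getValue` are illustrative names that only mimic the shape of the calls in the question, and treating a null qualifier as a zero-length one is an assumption of the model based on the explanation above, not a statement about HBase internals.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model only: the nested maps imitate the
// family -> qualifier -> value layout; none of this is the HBase API.
final class EmptyQualifierSketch {
    // byte[] uses identity-based equals/hashCode, so maps keyed by
    // byte[] need an explicit content comparator.
    static final Comparator<byte[]> BYTES = Arrays::compareUnsigned;
    static final byte[] EMPTY = new byte[0];

    final NavigableMap<byte[], NavigableMap<byte[], byte[]>> families =
            new TreeMap<byte[], NavigableMap<byte[], byte[]>>(BYTES);

    /** Stores one cell; a null qualifier is modelled as the empty qualifier. */
    void add(byte[] family, byte[] qualifier, byte[] value) {
        byte[] q = (qualifier == null) ? EMPTY : qualifier;
        families.computeIfAbsent(family, f -> new TreeMap<byte[], byte[]>(BYTES)).put(q, value);
    }

    /** Looks up exactly one cell; it never returns a whole family. */
    byte[] getValue(byte[] family, byte[] qualifier) {
        NavigableMap<byte[], byte[]> columns = families.get(family);
        if (columns == null) {
            return null;
        }
        return columns.get(qualifier == null ? EMPTY : qualifier);
    }

    static byte[] b(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    public static void main(String[] args) {
        EmptyQualifierSketch row = new EmptyQualifierSketch();
        row.add(b("cf"), b("a"), b("1"));
        row.add(b("cf"), b("b"), b("2"));
        row.add(b("cf"), null, b("marker")); // the one cell with an empty qualifier
        byte[] v = row.getValue(b("cf"), null);
        // prints "marker 3": one cell is returned although the family holds three
        System.out.println(new String(v, StandardCharsets.UTF_8) + " " + row.families.get(b("cf")).size());
    }
}
```

In the model, a lookup with a null qualifier returns null whenever the family has no empty-qualifier cell, no matter how many other columns it holds, which is why that one-liner cannot stand in for copying a family.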

J-D
