Posted to user@hbase.apache.org by Shrijeet Paliwal <sh...@rocketfuel.com> on 2011/09/29 20:58:09 UTC
How to retrieve all columns of a CF and adding it in a put call
Hello,
I am trying to create a new table with exactly the same data as the old
table, BUT with the row key in the new table set to a column value from
the old table.
Following is my map method (using a map-only MR job):
public void map(ImmutableBytesWritable key, Result value, Context context)
    throws IOException, InterruptedException {
  // The new row key is the value of one column in the old row.
  byte[] mod_key = value.getValue(INPUT_FAMILY, INPUT_COLUMN);
  if (mod_key != null) {
    // family -> qualifier -> latest value
    Map<byte[], NavigableMap<byte[], byte[]>> cf =
        value.getNoVersionMap();
    Put put = new Put(mod_key);
    for (byte[] c : cf.keySet()) {
      for (byte[] q : cf.get(c).keySet()) {
        put.add(c, q, cf.get(c).get(q));
      }
    }
    context.write(key, put);
  }
}
I am inclined to think there has to be a more efficient way to do
this. By that I mean, not having to iterate through all the columns.
Thoughts?
Browsing the code I found some usages like this:
outval.add(INPUT_FAMILY, null, value.getValue(INPUT_FAMILY, null));
What does the above mean? Does it mean: get the bytes representing all
columns for INPUT_FAMILY and add them to the put object?
Re: How to retrieve all columns of a CF and adding it in a put call
Posted by Shrijeet Paliwal <sh...@rocketfuel.com>.
Understood. Thank you J-D.
On Thu, Oct 6, 2011 at 11:48 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Well you need to insert all the columns so yes you need to iterate
> them all. There's a shorter way to do it tho, look at the Import class
> in the HBase code:
>
> private static Put resultToPut(ImmutableBytesWritable key, Result result)
>     throws IOException {
>   Put put = new Put(key.get());
>   for (KeyValue kv : result.raw()) {
>     put.add(kv);
>   }
>   return put;
> }
>
> Regarding your last question, what that line does is just setting the
> value of the input family with an empty qualifier. Not the whole
> family.
>
> J-D
Re: How to retrieve all columns of a CF and adding it in a put call
Posted by Jean-Daniel Cryans <jd...@apache.org>.
Well you need to insert all the columns so yes you need to iterate
them all. There's a shorter way to do it tho, look at the Import class
in the HBase code:
private static Put resultToPut(ImmutableBytesWritable key, Result result)
    throws IOException {
  Put put = new Put(key.get());
  for (KeyValue kv : result.raw()) {
    put.add(kv);
  }
  return put;
}
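For the re-keying case in the original question, the same single loop over cells still applies, but each cell probably has to be copied under the new row key rather than added unchanged. As far as I know, put.add(kv) in some HBase versions rejects a KeyValue whose row differs from the Put's row, so treat that as an assumption to verify against the version you run. The following is a minimal plain-Java sketch of the idea, not HBase code: Cell, RekeySketch, getValue, rekey and b are hypothetical stand-ins for KeyValue, Result.getValue and the loop in resultToPut, and the sample row is made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of re-keying a row. Cell and RekeySketch are hypothetical
// stand-ins for illustration; they are not HBase classes.
class RekeySketch {
    // One stored cell: row key, family, qualifier, timestamp, value.
    static final class Cell {
        final byte[] row, family, qualifier, value;
        final long timestamp;

        Cell(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte[] value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Returns the value of the first cell matching family and qualifier, or null.
    static byte[] getValue(List<Cell> cells, byte[] family, byte[] qualifier) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family) && Arrays.equals(c.qualifier, qualifier)) {
                return c.value;
            }
        }
        return null;
    }

    // Copies every cell under newRow in one flat loop, keeping family,
    // qualifier, timestamp and value unchanged.
    static List<Cell> rekey(List<Cell> cells, byte[] newRow) {
        List<Cell> out = new ArrayList<>(cells.size());
        for (Cell c : cells) {
            out.add(new Cell(newRow, c.family, c.qualifier, c.timestamp, c.value));
        }
        return out;
    }

    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Cell> row = Arrays.asList(
            new Cell(b("old-1"), b("cf"), b("id"), 1L, b("user-42")),
            new Cell(b("old-1"), b("cf"), b("name"), 2L, b("alice")));
        // The new row key is the value of one column of the old row.
        byte[] newKey = getValue(row, b("cf"), b("id"));
        if (newKey != null) { // skip rows that lack the key column
            for (Cell c : rekey(row, newKey)) {
                System.out.println(new String(c.row, StandardCharsets.UTF_8) + " "
                    + new String(c.family, StandardCharsets.UTF_8) + ":"
                    + new String(c.qualifier, StandardCharsets.UTF_8) + "="
                    + new String(c.value, StandardCharsets.UTF_8));
            }
        }
    }
}
```

Every cell is still visited once, so this does not avoid the iteration; it only replaces the nested family/qualifier map walk with one flat loop and keeps each cell's timestamp.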
Regarding your last question, what that line does is just setting the
value of the input family with an empty qualifier. Not the whole
family.
J-D
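To make the point about the empty qualifier concrete, here is a small plain-Java model, again not HBase code. It assumes, as described above, that a null qualifier is treated as the empty qualifier and therefore names exactly one column of the family. QualifierSketch, add, getValue, columnCount and b are hypothetical names, and the values are made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a single column family: qualifier bytes -> value.
class QualifierSketch {
    private static final byte[] EMPTY = new byte[0];

    // ByteBuffer compares by content, so it can serve as a map key for byte[].
    private final Map<ByteBuffer, byte[]> columns = new HashMap<>();

    // Assumption being modelled: a null qualifier means the empty qualifier.
    private static ByteBuffer key(byte[] qualifier) {
        return ByteBuffer.wrap(qualifier == null ? EMPTY : qualifier);
    }

    void add(byte[] qualifier, byte[] value) {
        columns.put(key(qualifier), value);
    }

    // Looks up exactly one column; returns null if that column is absent.
    byte[] getValue(byte[] qualifier) {
        return columns.get(key(qualifier));
    }

    int columnCount() {
        return columns.size();
    }

    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        QualifierSketch family = new QualifierSketch();
        family.add(null, b("v0"));   // the column with the empty qualifier
        family.add(b("a"), b("v1"));
        family.add(b("b"), b("v2"));

        // A null qualifier selects only the empty-qualifier column.
        System.out.println(new String(family.getValue(null), StandardCharsets.UTF_8)); // prints "v0"
        System.out.println(family.columnCount()); // prints 3
    }
}
```

In this model the columns "a" and "b" are untouched by a read or write with a null qualifier, which is why copying a whole family still needs a loop over its cells.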