Posted to user@hbase.apache.org by T Vinod Gupta <tv...@readypulse.com> on 2012/01/12 12:37:15 UTC

is there any way to copy data from one table to another while updating rowKey??

I am badly stuck and can't find a way out. I want to change my rowkey
schema while copying data from one table to another, but a MapReduce job to
do this won't work because of large row sizes (responseTooLarge errors). So
I am left with a two-step process of exporting to HDFS files and
importing from them into the second table. So I wrote a custom exporter that
changes the rowkey to newRowKey when doing context.write(newRowKey,
result). But when I import these new files into the new table, it doesn't work
due to this exception in put - "The row in the recently added ... doesn't
match the original one ....".

Is there no way out for me? Please help.

thanks

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by Asher <as...@gmail.com>.
T Vinod Gupta <tv...@...> writes:

> 
> I am badly stuck and can't find a way out. i want to change my rowkey
> schema while copying data from 1 table to another. but a map reduce job to
> do this won't work because of large row sizes (responseTooLarge errors). 
so
> i am left with a 2 steps processing of exporting to hdfs files and
> importing from them to the 2nd table. so i wrote a custom exporter that
> changes the rowkey to newRowKey when doing context.write(newRowKey,
> result). but when i import these new files into new table, it doesnt work
> due to this exception in put - "The row in the recently added ... doesn't
> match the original one ....".
> 
> is there no way out for me? please help
> 
> thanks
> 

I know this is old, but here is a solution:

You need to pass the new key in the Put constructor as well as rewrite the
key values with the new key.  Here is a helper method I use to do this...

    import java.io.IOException;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;

    /** Rebuilds every cell of a Result under a new row key. */
    public static Put resultToPut(byte[] newKey, Result result) throws IOException {
        Put put = new Put(newKey);
        for (KeyValue kv : result.raw()) {
            // The cell must carry the new row, or Put.add() will reject it.
            // Pass kv.getTimestamp() as an extra argument before getValue()
            // if you need to keep the original cell timestamps.
            KeyValue kv2 = new KeyValue(newKey, kv.getFamily(),
                    kv.getQualifier(), kv.getValue());
            put.add(kv2);
        }
        return put;
    }

--Asher
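
A minimal sketch (not from the original reply) of how a helper like the one
above can be wired into a direct table-to-table copy that rewrites the row
key, which is what the rest of this thread suggests instead of the
export/import detour. The table names and transformKey() are hypothetical
placeholders; the Scan settings (setCaching(1), setBatch) are aimed at the
responseTooLarge problem from very wide rows, by capping what one next() RPC
may return. Written against the 0.90-era client API mentioned in the thread.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;

    public class RekeyCopy {

        static class RekeyMapper extends TableMapper<ImmutableBytesWritable, Put> {
            @Override
            protected void map(ImmutableBytesWritable key, Result value, Context context)
                    throws IOException, InterruptedException {
                byte[] newKey = transformKey(value.getRow());
                context.write(new ImmutableBytesWritable(newKey), resultToPut(newKey, value));
            }
        }

        // Hypothetical placeholder for whatever the new rowkey schema is.
        static byte[] transformKey(byte[] oldKey) {
            return oldKey;
        }

        // Asher's helper from the post above, reproduced so the class stands alone.
        static Put resultToPut(byte[] newKey, Result result) throws IOException {
            Put put = new Put(newKey);
            for (KeyValue kv : result.raw()) {
                put.add(new KeyValue(newKey, kv.getFamily(), kv.getQualifier(), kv.getValue()));
            }
            return put;
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = new Job(conf, "rekey-copy");
            job.setJarByClass(RekeyCopy.class);

            Scan scan = new Scan();
            scan.setCaching(1);        // one row per next() RPC ...
            scan.setBatch(1000);       // ... and at most 1000 cells of it at a time
            scan.setCacheBlocks(false);

            TableMapReduceUtil.initTableMapperJob("source_table", scan, RekeyMapper.class,
                    ImmutableBytesWritable.class, Put.class, job);
            TableMapReduceUtil.initTableReducerJob("dest_table", null, job);
            job.setNumReduceTasks(0);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

With setBatch in play, a very wide row arrives as several partial Results
with the same key; each one simply becomes another Put to the same new row,
which is harmless for a copy.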


Re: is there any way to copy data from one table to another while updating rowKey??

Posted by Stack <st...@duboce.net>.
On Thu, Jan 12, 2012 at 9:47 PM, T Vinod Gupta <tv...@readypulse.com> wrote:
> i wrote an app to delete bunch of old data which we dont need
> any more.. so that app is doing scans and deletes (specific columns of rows
> based on some custom logic).
>

You understand that you are writing a new entry per item you are deleting?
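
A point worth spelling out here: in HBase a Delete does not remove data in
place; it writes a tombstone marker that masks the cells until the next
major compaction, so a bulk-cleanup pass is itself a heavy write workload.
A minimal sketch of such a scan-and-delete loop with batched delete RPCs
(the table, family and qualifier names are hypothetical, and shouldPurge()
stands in for the custom logic mentioned above):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class OldDataPurger {
        // Stand-in for the "custom logic" deciding what to drop.
        static boolean shouldPurge(Result r) {
            return true;
        }

        public static void purge(Configuration conf) throws IOException {
            HTable table = new HTable(conf, "my_table");      // hypothetical table
            Scan scan = new Scan();
            scan.setCaching(100);                             // modest rows per RPC
            ResultScanner scanner = table.getScanner(scan);
            List<Delete> batch = new ArrayList<Delete>();
            try {
                for (Result r : scanner) {
                    if (!shouldPurge(r)) continue;
                    Delete d = new Delete(r.getRow());
                    // Each deleted cell still costs a tombstone write on the server.
                    d.deleteColumns(Bytes.toBytes("cf"), Bytes.toBytes("old_counter"));
                    batch.add(d);
                    if (batch.size() >= 1000) {               // fewer, larger RPCs
                        table.delete(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    table.delete(batch);
                }
            } finally {
                scanner.close();
                table.close();
            }
        }
    }

Batching only cuts RPC overhead; the tombstone volume is the same, so the
regionservers stay busy until compaction catches up.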


> 2012-01-13 05:42:21,201 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60020 caught: java.nio.channels.ClosedChannelException
>        at
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
>        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>        at


Looks like the client went away before the server had time to respond.
Are you seeing hard-working servers?



> 2012-01-13 04:48:37,301 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> java.net.SocketTimeoutException: Call to /10.68.145.124:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting ...


This looks like why the client went away... it didn't get a response within 60 seconds.

St.Ack
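
For reference, the 60000 ms in the trace is a client-side RPC timeout: a
heavily loaded regionserver that cannot answer within it produces exactly
this pattern (the caller times out and closes the socket, and the server
then hits ClosedChannelException when it finally tries to respond). Reducing
the load per call is the real fix, but as a stopgap the timeout can be
raised on the client. A minimal sketch, assuming this 0.90.x client honors
the hbase.rpc.timeout property (the table name is hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class PatientClient {
        public static HTable open(String tableName) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            // Give slow multi()/next() calls more headroom than the 60 s default.
            // Assumption: this build reads hbase.rpc.timeout on the client side.
            conf.setInt("hbase.rpc.timeout", 180000);
            return new HTable(conf, tableName);
        }
    }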

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by T Vinod Gupta <tv...@readypulse.com>.
Stack,
Here are some of the failures I'm getting now. I don't know what's wrong with
my HBase right now.. I literally stopped my main processes that write to
the store. I wrote an app to delete a bunch of old data which we don't need
any more.. so that app is doing scans and deletes (specific columns of rows
based on some custom logic).

2012-01-13 05:42:21,201 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020 caught: java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

2012-01-13 05:42:22,812 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@444ea383) from 10.68.145.124:35132: output error
2012-01-13 05:42:22,812 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020 caught: java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1389)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

On my master server I see these happening -


2012-01-13 04:48:37,301 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
java.net.SocketTimeoutException: Call to /10.68.145.124:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.68.145.124:40155 remote=/10.68.145.124:60020]
        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:424)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:272)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
        at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:116)
        at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:85)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)

I did see some archive threads on this, but I don't know what exactly is
causing it or how to get out of it.

thanks

On Thu, Jan 12, 2012 at 2:39 PM, Stack <st...@duboce.net> wrote:

> And what is happening on the server
> ip-10-68-145-124.ec2.internal:60020 such that 14 attempts at getting a
> region failed.  Is that region on line during this time or being
> moved?  If not online, why not?  Was server opening the region taking
> too long (because of high-load?).  Grep around the region name in
> master log to see what was happening with it at the time of the below
> fails.
>
> Folks copy from one table to the other all the time w/o need of an
> hdfs intermediary resting stop.
>
> St.Ack
>
> On Thu, Jan 12, 2012 at 9:46 AM, Ted Yu <yu...@gmail.com> wrote:
> > I think you need to manipulate the keyvalue to match the new row.
> > Take a look at the check:
> >
> >    //Checking that the row of the kv is the same as the put
> >    int res = Bytes.compareTo(this.row, 0, row.length,
> >        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength());
> >    if(res != 0) {
> >      throw new IOException("The row in the recently added KeyValue " +
> >
> > Cheers
> >
> > On Thu, Jan 12, 2012 at 9:12 AM, T Vinod Gupta <tvinod@readypulse.com
> >wrote:
> >
> >> hbase version -
> >> hbase(main):001:0> version
> >> 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011
> >>
> >> here are the different exceptions -
> >>
> >> when copying table to another table -
> >> 12/01/12 11:06:41 INFO mapred.JobClient: Task Id :
> >> attempt_201201120656_0012_m_000001_0, Status : FAILED
> >> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> Failed
> >> 14 actions: NotServingRegionException: 14 times, servers with issues:
> >> ip-10-68-145-124.ec2.internal:60020,
> >>        at
> >>
> >>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
> >>        at
> >>
> >>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
> >>        at
> >> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
> >>        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
> >>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
> >>        at
> >>
> >>
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
> >>        at
> >>
> >>
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
> >>        at
> >>
> >>
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
> >>        at
> >>
> >>
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> >>        at
> >>
> >>
> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:62)
> >>        at
> >>
> >>
> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:31)
> >>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >>        at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >>        at java.security.AccessController.doPrivileged(Native Method)
> >>        at javax.security.auth.Subject.doAs(Subject.java:416)
> >>        at
> >>
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >>
> >> region server logs say this -
> >> 2012-01-10 00:00:52,545 WARN org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> >> handl
> >> er 9 on 60020, responseTooLarge for: next(-5685114053145855194, 50) from
> >> 10.68.1
> >> 45.124:44423: Size: 121.7m
> >>
> >> when doing special export and then import, here is the stack trace -
> >> java.io.IOException: The row in the recently added KeyValue
> >> 84784841:1319846400:daily:PotentialReach doesn't match the original one
> >> 84784841:PotentialReach:daily:1319846400
> >>        at org.apache.hadoop.hbase.client.Put.add(Put.java:168)
> >>        at
> >>
> >>
> org.apache.hadoop.hbase.mapreduce.Import$Importer.resultToPut(Import.java:70)
> >>        at
> >> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:60)
> >>        at
> >> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:45)
> >>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >>        at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >>        at java.security.AccessController.doPrivileged(Native Method)
> >>        at javax.security.auth.Subject.doAs(Subject.java:416)
> >>        at
> >>
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >>
> >>
> >> On Thu, Jan 12, 2012 at 5:13 AM, <yu...@gmail.com> wrote:
> >>
> >> > What version of hbase did you use ?
> >> >
> >> > Can you post the stack trace for the exception ?
> >> >
> >> > Thanks
> >> >
> >> >
> >> >
> >> > On Jan 12, 2012, at 3:37 AM, T Vinod Gupta <tv...@readypulse.com>
> >> wrote:
> >> >
> >> > > I am badly stuck and can't find a way out. i want to change my
> rowkey
> >> > > schema while copying data from 1 table to another. but a map reduce
> job
> >> > to
> >> > > do this won't work because of large row sizes (responseTooLarge
> >> errors).
> >> > so
> >> > > i am left with a 2 steps processing of exporting to hdfs files and
> >> > > importing from them to the 2nd table. so i wrote a custom exporter
> that
> >> > > changes the rowkey to newRowKey when doing context.write(newRowKey,
> >> > > result). but when i import these new files into new table, it doesnt
> >> work
> >> > > due to this exception in put - "The row in the recently added ...
> >> doesn't
> >> > > match the original one ....".
> >> > >
> >> > > is there no way out for me? please help
> >> > >
> >> > > thanks
> >> >
> >>
>

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by Stack <st...@duboce.net>.
And what is happening on the server
ip-10-68-145-124.ec2.internal:60020 such that 14 attempts at getting a
region failed?  Is that region online during this time or being
moved?  If not online, why not?  Was the server opening the region taking
too long (because of high load)?  Grep around the region name in the
master log to see what was happening with it at the time of the below
fails.

Folks copy from one table to the other all the time without need of an
HDFS intermediary resting stop.

St.Ack

On Thu, Jan 12, 2012 at 9:46 AM, Ted Yu <yu...@gmail.com> wrote:
> I think you need to manipulate the keyvalue to match the new row.
> Take a look at the check:
>
>    //Checking that the row of the kv is the same as the put
>    int res = Bytes.compareTo(this.row, 0, row.length,
>        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength());
>    if(res != 0) {
>      throw new IOException("The row in the recently added KeyValue " +
>
> Cheers
>
> On Thu, Jan 12, 2012 at 9:12 AM, T Vinod Gupta <tv...@readypulse.com>wrote:
>
>> hbase version -
>> hbase(main):001:0> version
>> 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011
>>
>> here are the different exceptions -
>>
>> when copying table to another table -
>> 12/01/12 11:06:41 INFO mapred.JobClient: Task Id :
>> attempt_201201120656_0012_m_000001_0, Status : FAILED
>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
>> 14 actions: NotServingRegionException: 14 times, servers with issues:
>> ip-10-68-145-124.ec2.internal:60020,
>>        at
>>
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
>>        at
>>
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
>>        at
>> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
>>        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
>>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>>        at
>>
>> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>>        at
>>
>> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>>        at
>>
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
>>        at
>>
>> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>        at
>>
>> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:62)
>>        at
>>
>> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:31)
>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:416)
>>        at
>>
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>
>> region server logs say this -
>> 2012-01-10 00:00:52,545 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handl
>> er 9 on 60020, responseTooLarge for: next(-5685114053145855194, 50) from
>> 10.68.1
>> 45.124:44423: Size: 121.7m
>>
>> when doing special export and then import, here is the stack trace -
>> java.io.IOException: The row in the recently added KeyValue
>> 84784841:1319846400:daily:PotentialReach doesn't match the original one
>> 84784841:PotentialReach:daily:1319846400
>>        at org.apache.hadoop.hbase.client.Put.add(Put.java:168)
>>        at
>>
>> org.apache.hadoop.hbase.mapreduce.Import$Importer.resultToPut(Import.java:70)
>>        at
>> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:60)
>>        at
>> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:45)
>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:416)
>>        at
>>
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>
>>
>> On Thu, Jan 12, 2012 at 5:13 AM, <yu...@gmail.com> wrote:
>>
>> > What version of hbase did you use ?
>> >
>> > Can you post the stack trace for the exception ?
>> >
>> > Thanks
>> >
>> >
>> >
>> > On Jan 12, 2012, at 3:37 AM, T Vinod Gupta <tv...@readypulse.com>
>> wrote:
>> >
>> > > I am badly stuck and can't find a way out. i want to change my rowkey
>> > > schema while copying data from 1 table to another. but a map reduce job
>> > to
>> > > do this won't work because of large row sizes (responseTooLarge
>> errors).
>> > so
>> > > i am left with a 2 steps processing of exporting to hdfs files and
>> > > importing from them to the 2nd table. so i wrote a custom exporter that
>> > > changes the rowkey to newRowKey when doing context.write(newRowKey,
>> > > result). but when i import these new files into new table, it doesnt
>> work
>> > > due to this exception in put - "The row in the recently added ...
>> doesn't
>> > > match the original one ....".
>> > >
>> > > is there no way out for me? please help
>> > >
>> > > thanks
>> >
>>

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by Ted Yu <yu...@gmail.com>.
I think you need to manipulate the keyvalue to match the new row.
Take a look at the check:

    //Checking that the row of the kv is the same as the put
    int res = Bytes.compareTo(this.row, 0, row.length,
        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength());
    if(res != 0) {
      throw new IOException("The row in the recently added KeyValue " +

Cheers
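
In other words, each KeyValue has the row baked into it, so building the Put
with the new key is not enough: every cell has to be rebuilt under the new
row before put.add() will accept it. A minimal sketch of that (essentially
the approach Asher posts at the top of this archive, here also carrying the
original cell timestamps across):

    import java.io.IOException;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;

    public static Put rekey(byte[] newRow, Result result) throws IOException {
        Put put = new Put(newRow);
        for (KeyValue kv : result.raw()) {
            // Rebuild the cell under the new row so the check above passes;
            // keep the old timestamp unchanged.
            put.add(new KeyValue(newRow, kv.getFamily(), kv.getQualifier(),
                    kv.getTimestamp(), kv.getValue()));
        }
        return put;
    }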

On Thu, Jan 12, 2012 at 9:12 AM, T Vinod Gupta <tv...@readypulse.com>wrote:

> hbase version -
> hbase(main):001:0> version
> 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011
>
> here are the different exceptions -
>
> when copying table to another table -
> 12/01/12 11:06:41 INFO mapred.JobClient: Task Id :
> attempt_201201120656_0012_m_000001_0, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 14 actions: NotServingRegionException: 14 times, servers with issues:
> ip-10-68-145-124.ec2.internal:60020,
>        at
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
>        at
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
>        at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
>        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>        at
>
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>        at
>
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>        at
>
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
>        at
>
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>        at
>
> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:62)
>        at
>
> com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:31)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:416)
>        at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> region server logs say this -
> 2012-01-10 00:00:52,545 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handl
> er 9 on 60020, responseTooLarge for: next(-5685114053145855194, 50) from
> 10.68.1
> 45.124:44423: Size: 121.7m
>
> when doing special export and then import, here is the stack trace -
> java.io.IOException: The row in the recently added KeyValue
> 84784841:1319846400:daily:PotentialReach doesn't match the original one
> 84784841:PotentialReach:daily:1319846400
>        at org.apache.hadoop.hbase.client.Put.add(Put.java:168)
>        at
>
> org.apache.hadoop.hbase.mapreduce.Import$Importer.resultToPut(Import.java:70)
>        at
> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:60)
>        at
> org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:45)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:416)
>        at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
>
> On Thu, Jan 12, 2012 at 5:13 AM, <yu...@gmail.com> wrote:
>
> > What version of hbase did you use ?
> >
> > Can you post the stack trace for the exception ?
> >
> > Thanks
> >
> >
> >
> > On Jan 12, 2012, at 3:37 AM, T Vinod Gupta <tv...@readypulse.com>
> wrote:
> >
> > > I am badly stuck and can't find a way out. i want to change my rowkey
> > > schema while copying data from 1 table to another. but a map reduce job
> > to
> > > do this won't work because of large row sizes (responseTooLarge
> errors).
> > so
> > > i am left with a 2 steps processing of exporting to hdfs files and
> > > importing from them to the 2nd table. so i wrote a custom exporter that
> > > changes the rowkey to newRowKey when doing context.write(newRowKey,
> > > result). but when i import these new files into new table, it doesnt
> work
> > > due to this exception in put - "The row in the recently added ...
> doesn't
> > > match the original one ....".
> > >
> > > is there no way out for me? please help
> > >
> > > thanks
> >
>

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by T Vinod Gupta <tv...@readypulse.com>.
hbase version -
hbase(main):001:0> version
0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011

Here are the different exceptions -

When copying one table to another -
12/01/12 11:06:41 INFO mapred.JobClient: Task Id : attempt_201201120656_0012_m_000001_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 14 actions: NotServingRegionException: 14 times, servers with issues: ip-10-68-145-124.ec2.internal:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:62)
        at com.akanksh.information.hbasetest.HBaseTimestampSwapper$SwapperMapper.map(HBaseTimestampSwapper.java:31)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)

region server logs say this -
2012-01-10 00:00:52,545 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(-5685114053145855194, 50) from 10.68.145.124:44423: Size: 121.7m
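
The responseTooLarge warning above is about a single next(scannerId, 50)
call: 50 cached rows of very wide data came back as roughly 121 MB in one
RPC. A minimal sketch of a Scan tuned so that each next() RPC stays small
(the 500-cell batch size is an arbitrary example value):

    import org.apache.hadoop.hbase.client.Scan;

    public static Scan smallResponseScan() {
        Scan scan = new Scan();
        scan.setCaching(1);        // rows per next() RPC (the "50" in the warning above)
        scan.setBatch(500);        // at most 500 cells of a row per next() RPC
        scan.setCacheBlocks(false);
        return scan;
    }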

when doing special export and then import, here is the stack trace -
java.io.IOException: The row in the recently added KeyValue 84784841:1319846400:daily:PotentialReach doesn't match the original one 84784841:PotentialReach:daily:1319846400
        at org.apache.hadoop.hbase.client.Put.add(Put.java:168)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.resultToPut(Import.java:70)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:60)
        at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)


On Thu, Jan 12, 2012 at 5:13 AM, <yu...@gmail.com> wrote:

> What version of hbase did you use ?
>
> Can you post the stack trace for the exception ?
>
> Thanks
>
>
>
> On Jan 12, 2012, at 3:37 AM, T Vinod Gupta <tv...@readypulse.com> wrote:
>
> > I am badly stuck and can't find a way out. i want to change my rowkey
> > schema while copying data from 1 table to another. but a map reduce job
> to
> > do this won't work because of large row sizes (responseTooLarge errors).
> so
> > i am left with a 2 steps processing of exporting to hdfs files and
> > importing from them to the 2nd table. so i wrote a custom exporter that
> > changes the rowkey to newRowKey when doing context.write(newRowKey,
> > result). but when i import these new files into new table, it doesnt work
> > due to this exception in put - "The row in the recently added ... doesn't
> > match the original one ....".
> >
> > is there no way out for me? please help
> >
> > thanks
>

Re: is there any way to copy data from one table to another while updating rowKey??

Posted by yu...@gmail.com.
What version of hbase did you use ?

Can you post the stack trace for the exception ?

Thanks



On Jan 12, 2012, at 3:37 AM, T Vinod Gupta <tv...@readypulse.com> wrote:

> I am badly stuck and can't find a way out. i want to change my rowkey
> schema while copying data from 1 table to another. but a map reduce job to
> do this won't work because of large row sizes (responseTooLarge errors). so
> i am left with a 2 steps processing of exporting to hdfs files and
> importing from them to the 2nd table. so i wrote a custom exporter that
> changes the rowkey to newRowKey when doing context.write(newRowKey,
> result). but when i import these new files into new table, it doesnt work
> due to this exception in put - "The row in the recently added ... doesn't
> match the original one ....".
> 
> is there no way out for me? please help
> 
> thanks