Posted to dev@hbase.apache.org by Sergey Belousov <se...@gmail.com> on 2015/04/17 18:41:08 UTC

Splitting table from HBase shell using script

Hi all

I am looking into splitting a table (it is more or less in production, so I
cannot use SPLITS at create time) using the 'split' command from the HBase
shell (0.98.9-hadoop2).
I have a simple JRuby script that just runs the split 'table_name',
'split_key' command, and I execute it with hbase shell /tmp/split_table.rb

From time to time I get this error:

ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region [****************eaten by mouse***************],1429257507107.82bfbd974d36db11075e4ef1da7abfed. is not online on ******************,60020,1429256987509
	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2780)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4337)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:4042)
	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20170)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2029)
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
	at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:112)
	at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:92)
	at java.lang.Thread.run(Thread.java:745)



Even though sleep(2) helps most of the time, it is not a guarantee
(especially on a live cluster) and not the solution I would like.

So my question is: what would be the proper way of checking that it is safe
to run the next split command?


Thank you
SB

Re: Splitting table from HBase shell using script

Posted by Sergey Belousov <se...@gmail.com>.
Thank you, Mikhail.
I will give it close attention, and if I have something useful to add, I
will.

On Fri, Apr 17, 2015 at 12:47 PM, Mikhail Antonov <ol...@gmail.com>
wrote:

> Sergey,
>
> that might be a bit off-topic, but in HBASE-13103 there's a
> discussion on how to relieve folks of having to think of proper ways
> of running their split commands. As it's now in design & prototyping
> stage, any feedback from production users is greatly appreciated!
>
> -Mikhail
>
>
> --
> Thanks,
> Michael Antonov
>

Re: Splitting table from HBase shell using script

Posted by Mikhail Antonov <ol...@gmail.com>.
Sergey,

that might be a bit off-topic, but in HBASE-13103 there's a
discussion on how to relieve folks of having to think of proper ways
of running their split commands. As it's now in design & prototyping
stage, any feedback from production users is greatly appreciated!

-Mikhail



-- 
Thanks,
Michael Antonov

Re: Splitting table from HBase shell using script

Posted by Sergey Belousov <se...@gmail.com>.
Hm,
that is what I am not sure I understand...
It is not as if I request the location of the regions every time (or ask to
move them). Here are more details on what I am doing:

1. I get the regions for the table and store them in an array.
2. I get the array of split points to create as splits_to_create -
current_splits_for_the_table
3. I run splits_to_create.each do |key| split 'table', key; sleep(2) end
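
The three steps above, with the fixed sleep replaced by polling, could be sketched roughly as follows. This is only an illustration in plain Ruby: missing_splits, wait_until and split_sequentially are hypothetical helper names, and do_split and region_count are callables the caller has to supply (they are not HBase APIs). A growing region count is used here as a stand-in signal that the previous split has been registered; that is an assumption, not a guarantee that both daughter regions are already online.

```ruby
# Split points still to create: wanted keys minus keys that already exist.
def missing_splits(wanted, current)
  wanted - current
end

# Poll the given block until it returns true; give up after `timeout` seconds.
# Returns true on success and false on timeout.
def wait_until(timeout, interval)
  deadline = Time.now + timeout
  until yield
    return false if Time.now >= deadline
    sleep(interval)
  end
  true
end

# Run one split at a time. `do_split` and `region_count` are callables
# supplied by the caller (hypothetical hooks, not HBase APIs).
def split_sequentially(keys, do_split, region_count, timeout = 60, interval = 1)
  keys.each do |key|
    before = region_count.call
    do_split.call(key)
    # Wait until the region count has grown instead of sleeping a fixed time.
    done = wait_until(timeout, interval) { region_count.call > before }
    raise "split at #{key.inspect} not visible after #{timeout}s" unless done
  end
end
```

Inside the shell, do_split could be something like lambda { |key| split 'table', key }, while region_count would need to be wired to whatever call lists the table's regions in the HBase version at hand. Raising on timeout makes the script stop instead of piling further splits on top of one that has not completed.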

It also appears that all the new regions are created on one (random) region
server if I run the script on the master; if I run it on a region server,
they are created on that region server...
With a 5-second pause between splits I do not get this error...
BTW, I also tried with balance_switch false (interesting naming :)) and saw
no difference. The same would apply to running the balancer beforehand, so
that it sits out its timeout until the next run while I am splitting, unless
I run it after I am done (which is what I do).

So why would the shell get confused when I run a bunch of 'split' commands
for the table?

I would prefer to keep running the splits in one shell and in one session
if possible...

Thank you

On Fri, Apr 17, 2015 at 5:57 PM, Esteban Gutierrez <es...@cloudera.com>
wrote:

> Hi Sergey,
>
> I think this is what is happening: the hbase shell is most likely caching
> the region locations. If the balancer is on and you run those split
> commands, it is very likely you will get that error, since the regions have
> moved to other RSs. The easiest solution, I think, is to run the splits in
> different hbase shell processes so that each one uses a new connection.
> You could also try to close the connections or purge the cached locations
> from the hbase shell process.
>
> thanks,
> esteban.
>
> --
> Cloudera, Inc.
>

Re: Splitting table from HBase shell using script

Posted by Esteban Gutierrez <es...@cloudera.com>.
Hi Sergey,

I think this is what is happening: the hbase shell is most likely caching
the region locations. If the balancer is on and you run those split
commands, it is very likely you will get that error, since the regions have
moved to other RSs. The easiest solution, I think, is to run the splits in
different hbase shell processes so that each one uses a new connection.
You could also try to close the connections or purge the cached locations
from the hbase shell process.
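
A minimal sketch of the first suggestion (one hbase shell process per split), written as a plain Ruby driver that runs outside the shell. The helper names split_script and run_splits_in_fresh_shells are made up for illustration, and it assumes that hbase is on the PATH and that a script passed to hbase shell may end with exit so the process terminates. The runner argument is injectable so the loop can be tried without a cluster.

```ruby
require 'tempfile'

# Text of a one-off shell script: a single split, then exit so the
# shell does not stay open in interactive mode.
def split_script(table, key)
  "split #{table.inspect}, #{key.inspect}\nexit\n"
end

# Run each split in its own `hbase shell` process, so every split starts
# with a fresh connection and no cached region locations.
# `runner` receives the script path; by default it launches the shell.
def run_splits_in_fresh_shells(table, keys, runner = nil)
  runner ||= lambda { |path| system('hbase', 'shell', path) }
  keys.each do |key|
    file = Tempfile.new(['split', '.rb'])
    begin
      file.write(split_script(table, key))
      file.flush
      runner.call(file.path)
    ensure
      # Always clean up the temporary script, even if the runner fails.
      file.close
      file.unlink
    end
  end
end
```

For example, run_splits_in_fresh_shells('table', ['splitkey1', 'splitkey2']) would start two separate shell processes, one after the other. The cost is a JVM start-up per split, which is the trade-off for not sharing a connection between splits.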

thanks,
esteban.

--
Cloudera, Inc.


On Fri, Apr 17, 2015 at 12:43 PM, Sergey Belousov <sergey.belousov@gmail.com
> wrote:

> Thank you Esteban for reply
>
> see inline
>
> On Apr 17, 2015 12:50 PM, "Esteban Gutierrez" <es...@cloudera.com>
> wrote:
> >
> > Sergey,
> >
> > My first question would be if you are turning off the HBase balancer
> > before splitting this region and how much time passes between locating
> > the region and then splitting it.
>
> I do not disable the balancer beforehand.
>
> > Also, are you splitting multiple regions
> > per run of your ruby script or just one?
>
> I do create multiple splits for the table at a time, but it is all
> sequential through the shell
>
> like
>
> split 'table', 'splitkey1'
> sleep(2)
> split 'table', 'splitkey2'
> sleep(2)
> split 'table', 'splitkey3'
> etc..
>
>

Re: Splitting table from HBase shell using script

Posted by Sergey Belousov <se...@gmail.com>.
Thank you Esteban for reply

see inline

On Apr 17, 2015 12:50 PM, "Esteban Gutierrez" <es...@cloudera.com> wrote:
>
> Sergey,
>
> My first question would be if you are turning off the HBase balancer
> before splitting this region and how much time passes between locating
> the region and then splitting it.

I do not disable the balancer beforehand.

> Also, are you splitting multiple regions
> per run of your ruby script or just one?

I do create multiple splits for the table at a time, but it is all
sequential through the shell

like

split 'table', 'splitkey1'
sleep(2)
split 'table', 'splitkey2'
sleep(2)
split 'table', 'splitkey3'
etc..



Re: Splitting table from HBase shell using script

Posted by Esteban Gutierrez <es...@cloudera.com>.
Sergey,

My first question would be if you are turning off the HBase balancer before
splitting this region and how much time passes between locating the region
and then splitting it. Also, are you splitting multiple regions
per run of your ruby script or just one?

thanks!
esteban.

--
Cloudera, Inc.

