Posted to user@hbase.apache.org by hua beatls <be...@gmail.com> on 2012/12/25 13:14:14 UTC
regionserver restartup error
Hi,
we want to test whether the regionserver can be restarted like this:
first step: $ kill -9 xxx (process number)
2nd step: $ ./hbase-daemon.sh start regionserver
We stop the regionserver with 'kill -9 xxx' (process number) and want to
restart it with './hbase-daemon.sh start regionserver'.
This does not work. I find the regionserver's error below:
2012-12-25 19:47:42,555 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to
Master server at hadoop1,60000,1355887294437
2012-12-25 19:47:42,599 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at
hadoop2/192.168.250.107:60020
2012-12-25 19:47:42,599 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at
hadoop1,60000,1355887294437 that we are up with port=60020,
startcode=1356436062169
2012-12-25 19:47:42,605 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: Master rejected startup
because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException:
org.apache.hadoop.hbase.ClockOutOfSyncException: Server
hadoop2,60020,1356436062169 has been rejected; Reported time is too far out
of sync with master. Time difference of 64785ms > max allowed of 30000ms
Can we use 'kill -9 xxx' (process number), or should we use
'$ ./bin/hbase-daemon.sh stop regionserver'?
This regionserver is a loaded one.
How do we restart this regionserver?
many thanks!
beatls
Re: regionserver restartup error
Posted by hua beatls <be...@gmail.com>.
Hi,
yes, the date/time was different. We corrected the NTP configuration, and the
problem is solved.
Thanks!
beatls
Re: regionserver restartup error
Posted by Nicolas Liochon <nk...@gmail.com>.
Hi,
First, check the date/time on both servers and make sure they don't differ;
that's what the error says.
You can configure the maximum allowed skew with "hbase.master.maxclockskew", but
raising it is unlikely to be a good idea: it's always safer, in any distributed
system, to have the servers sharing the same time. ntpd is often used for
this.
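
As a rough illustration of the check the master performs, the sketch below compares two epoch-millisecond timestamps against a limit. The function names skew_ms and skew_ok are hypothetical helpers, not part of HBase; the 30000 ms default is taken from the "max allowed" value in the error message. The usage lines in the comments assume GNU date and passwordless ssh to the master host.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a manual clock-skew check (not part of HBase).
# 30000 ms mirrors the "max allowed of 30000ms" reported in the error.
MAX_SKEW_MS=${MAX_SKEW_MS:-30000}

# Print the absolute difference between two millisecond timestamps.
skew_ms() {
  local a=$1 b=$2
  local d=$(( a - b ))
  echo $(( d < 0 ? -d : d ))
}

# Succeed (exit status 0) if the skew is within MAX_SKEW_MS, fail otherwise.
skew_ok() {
  [ "$(skew_ms "$1" "$2")" -le "$MAX_SKEW_MS" ]
}

# Example usage (assumes GNU date and passwordless ssh to hadoop1):
#   local_ms=$(date +%s%3N)
#   master_ms=$(ssh hadoop1 date +%s%3N)
#   skew_ok "$local_ms" "$master_ms" ||
#     echo "skew of $(skew_ms "$local_ms" "$master_ms")ms exceeds ${MAX_SKEW_MS}ms"
```

The ssh round trip adds some latency to the measurement, so this is only good for spotting skew on the order of seconds, such as the 64785 ms reported above.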
Second, it's better to use the stop command than a kill, especially a
kill -9. A stop lets the server cleanly close the regions it is
handling and unregister itself from the master. With a kill
-9, the master has to detect that this regionserver is
dead. By default, that takes 3 minutes (zookeeper timeout). In the meantime,
the regions on this server won't be available.
Lastly, there is a restart command in the hbase-daemon script: it does the
stop and the start.
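
A minimal sketch of the clean sequence, assuming it is run from the HBase install directory so that ./bin/hbase-daemon.sh resolves. The HBASE_DAEMON variable and the restart_regionserver wrapper are hypothetical conveniences; only the stop, start and restart subcommands come from this thread.

```shell
#!/usr/bin/env bash
# Path to the daemon script; override HBASE_DAEMON if HBase lives elsewhere.
HBASE_DAEMON=${HBASE_DAEMON:-./bin/hbase-daemon.sh}

# Hypothetical wrapper: stop the regionserver cleanly, then start it again.
# If the stop fails, the start is skipped so the failure is not hidden.
restart_regionserver() {
  "$HBASE_DAEMON" stop regionserver || return 1
  "$HBASE_DAEMON" start regionserver
}

# The script's own restart subcommand does the stop and the start in one call:
#   ./bin/hbase-daemon.sh restart regionserver
```

Either form avoids the wait for the zookeeper timeout that follows a kill -9.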
Cheers,
Nicolas