Posted to user@hbase.apache.org by "Daya, Raheem" <Ra...@relayhealth.com> on 2013/12/10 17:50:34 UTC

Table state

I have a distributed HBase cluster that will not start.  It looks like there is a table that is in an inconsistent state:
2013-12-10 07:40:50,447 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state : ct_claims,204845|81V6SO4EF56DD1TKOIU7AS4L5D,1386050670937.6d138b97cde8bc3e49ff34639913109c. state=PENDING_OPEN, ts=1386690050445, server=rhf-045,60020,1386689069486 .. Cannot transit it to OFFLINE.

Is there a way to manually set the table to OFFLINE?  I have tried deleting the /hbase node in ZooKeeper.  I tried bringing up the master and then a region server, and vice versa.  When I bring the master up first, it starts, but as soon as I bring up a region server the master goes down.  My thought is to move the tables to OFFLINE (assuming that is possible) and try bringing up the cluster again.  hbck will not work, as none of the region servers are up.  Does anyone have any other ideas?
Thanks,
Raheem





Re: Table state

Posted by Kevin O'dell <ke...@cloudera.com>.
Just to close the loop: the previously recommended steps helped get us back
up, but one of the HMasters is not happy now.  I will update with a final
analysis shortly.


On Tue, Dec 10, 2013 at 1:10 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Also, it might be interesting to look in the RS logs to see why this region
> cannot come back online...
>
> JM



-- 
Kevin O'Dell
Systems Engineer, Cloudera

Re: Table state

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Also, it might be interesting to look in the RS logs to see why this region
cannot come back online...

JM
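
One way to follow this suggestion is to search the region server logs for the
encoded region name from the FATAL message.  The sketch below is only an
illustration: the log directory /var/log/hbase and the function name
region_events are assumptions, not something stated in this thread, so adjust
them to your installation.

```shell
#!/bin/sh
# Encoded region name, copied from the FATAL line in the original post.
REGION="6d138b97cde8bc3e49ff34639913109c"

# ASSUMPTION: region server logs live here; the real path depends on the install.
LOG_DIR="${LOG_DIR:-/var/log/hbase}"

# region_events REGION_NAME LOG_DIRECTORY   (hypothetical helper)
# Prints every log line that mentions the region, plus the 5 lines that follow
# each match, so that a stack trace logged right after it is visible too.
region_events() {
    grep -h -F -A 5 "$1" "$2"/*.log 2>/dev/null
}
```

Run on the host named in the message (rhf-045), this would be called as
region_events "$REGION" "$LOG_DIR".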


2013/12/10 Kevin O'dell <ke...@cloudera.com>

> Hey Raheem,
>
>   You can sideline the table into tmp (mv /hbase/table /tmp/table), then
> bring HBase back online.  Once HBase is back you can use HBCK to repair
> your META (-fixMeta -fixAssignments).  Once HBase is consistent again, you
> can move the table back out of tmp and use HBCK to update META again.  If
> the issue recurs, let us know.

Re: Table state

Posted by Kevin O'dell <ke...@cloudera.com>.
Hey Raheem,

  You can sideline the table into tmp (mv /hbase/table /tmp/table), then
bring HBase back online.  Once HBase is back you can use HBCK to repair
your META (-fixMeta -fixAssignments).  Once HBase is consistent again, you
can move the table back out of tmp and use HBCK to update META again.  If
the issue recurs, let us know.
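
The steps above can be sketched as a small dry-run script.  This is an
illustration under stated assumptions, not a tested procedure: it assumes the
HBase root directory /hbase lives on HDFS (so the move is done with
"hadoop fs -mv" rather than a local mv), that the table directory is
/hbase/ct_claims (the table name is taken from the region name in the FATAL
message), and that the hadoop and hbase launchers are on the PATH.  The
function names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of the sideline-and-repair sequence.
# ASSUMPTIONS: /hbase is on HDFS and the table directory is /hbase/ct_claims.
TABLE="ct_claims"
HBASE_ROOT="/hbase"
SIDELINE_DIR="/tmp"

# With DRY_RUN=1 (the default) commands are only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1 (HBase stopped): move the table directory out of the HBase root.
sideline_table() {
    run hadoop fs -mv "$HBASE_ROOT/$TABLE" "$SIDELINE_DIR/$TABLE"
}

# Steps 3 and 5 (HBase running): let hbck repair META and region assignments.
repair_meta() {
    run hbase hbck -fixMeta -fixAssignments
}

# Step 4: move the table directory back once hbck reports a consistent cluster.
restore_table() {
    run hadoop fs -mv "$SIDELINE_DIR/$TABLE" "$HBASE_ROOT/$TABLE"
}
```

The intended order is: sideline_table, start HBase by hand (step 2),
repair_meta, restore_table, then repair_meta again.  Leaving DRY_RUN at 1
prints each command so it can be reviewed before anything touches the cluster.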


On Tue, Dec 10, 2013 at 11:50 AM, Daya, Raheem
<Ra...@relayhealth.com>wrote:

> I have a distributed HBase cluster that will not start.  It looks like
> there is a table that is in an inconsistent state:
> 2013-12-10 07:40:50,447 FATAL org.apache.hadoop.hbase.master.HMaster:
> Unexpected state :
> ct_claims,204845|81V6SO4EF56DD1TKOIU7AS4L5D,1386050670937.6d138b97cde8bc3e49ff34639913109c.
> state=PENDING_OPEN, ts=1386690050445, server=rhf-045,60020,1386689069486 ..
> Cannot transit it to OFFLINE.
>
> Is there a way to manually set the table to OFFLINE?  I have tried
> deleting the /hbase node in ZooKeeper.  I tried bringing up the master and
> then a region server, and vice versa.  When I bring the master up first, it
> starts, but as soon as I bring up a region server the master goes down.
> My thought is to move the tables to OFFLINE (assuming that is possible) and
> try bringing up the cluster again.  hbck will not work, as none of the
> region servers are up.  Does anyone have any other ideas?
> Thanks,
> Raheem
>
>
>
>
>


-- 
Kevin O'Dell
Systems Engineer, Cloudera