Posted to user@hbase.apache.org by 梁景明 <fu...@gmail.com> on 2010/06/17 11:12:20 UTC

how to recover hbase

Something has confused me for a long time:
1. How do I recover the HBase master?
2. How do I recover an HBase region server?
In Hadoop there is a checkpointed image that can be used to recover the Hadoop master,
but for HBase I only know how to back up tables and restore them.
One way to rescue HBase is to reinstall it and create the tables again,
but that can't be driven from the shell or scripted; everything has to be done manually.
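
For reference, the backup flow I mentioned is roughly the following (the table name and HDFS paths are just examples):

```shell
# Rough sketch of table-level backup/restore with the bundled
# MapReduce tools; run from the HBase install directory.
# "mytable" and "/backup/mytable" are example names.

# Export a table to a sequence-file dump in HDFS:
bin/hbase org.apache.hadoop.hbase.mapreduce.Export mytable /backup/mytable

# Later, after recreating the (empty) table with the same column
# families, load the dump back in:
bin/hbase org.apache.hadoop.hbase.mapreduce.Import mytable /backup/mytable
```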

Thanks for any help.

Re: how to recover hbase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Trying to decode what the exceptions mean without any context is
extremely hard. Your configuration looks good except for:

 <property>
   <name>hbase.regionserver.dns.interface</name>
   <value>192.168.1.122</value>
   <description></description>
 </property>

It expects an interface name (like eth0), not an IP address. And setting this alone:

 <property>
   <name>hbase.zookeeper.dns.nameserver</name>
   <value>192.168.1.122</value>
   <description></description>
 </property>

won't work either; you also need to set the interface.
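
A corrected sketch of those properties might look like the following (the interface name eth0 and the nameserver address are guesses for this particular network):

```xml
<!-- Sketch only: substitute the NIC name and nameserver for your hosts. -->
<property>
  <name>hbase.regionserver.dns.interface</name>
  <value>eth0</value>
</property>
<property>
  <name>hbase.zookeeper.dns.interface</name>
  <value>eth0</value>
</property>
<property>
  <name>hbase.zookeeper.dns.nameserver</name>
  <value>192.168.1.122</value>
</property>
```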

Let's try something: stop all the processes (kill -9 if needed),
then wipe out the logs. Start anew, zip all the logs, and send them to
me directly.
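
As a sketch, that procedure looks like this; here it runs against a scratch directory instead of a live cluster, with the cluster start/stop steps shown as comments (paths like `$HBASE_HOME/logs` are assumptions about a default install):

```shell
# Dry run of the stop / wipe / restart / collect procedure against a
# scratch directory, so the log-handling steps can be checked without
# touching a live cluster. On a real install, LOGS would be
# $HBASE_HOME/logs and the commented commands would actually run.
WORK=$(mktemp -d)
LOGS="$WORK/logs"
mkdir -p "$LOGS"
echo "stale output from the broken run" > "$LOGS/hbase-master.log"

# 1. Stop everything (on a real cluster):
#      bin/stop-hbase.sh       # then `kill -9` any leftover java procs

# 2. Wipe the old logs so the next run starts clean:
rm -f "$LOGS"/*.log "$LOGS"/*.out

# 3. Restart and reproduce the failure (on a real cluster):
#      bin/start-hbase.sh
echo "fresh output from the new run" > "$LOGS/hbase-master.log"

# 4. Bundle only the fresh logs to send along:
tar czf "$WORK/hbase-logs.tar.gz" -C "$WORK" logs
```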

J-D

On Thu, Jun 24, 2010 at 1:42 AM, 梁景明 <fu...@gmail.com> wrote:
> I don't know how to describe my situation any further; I just want to restart
> successfully and get my data back.
> 1. bin/start-hbase.sh reports everything running.
> 2. bin/stop-hbase.sh can't stop it normally.
> 3. Sometimes the region servers aren't visible. After killing the master process and rerunning
> bin/start-hbase.sh, it looks OK, but the master doesn't work.
> 4. Hadoop HDFS runs fine, and on port 50070 I can see the /hbase folders.
> 5. Here is my hbase-site.xml. test1 and s1.idfs.cn share the same IP,
> 192.168.1.122. At first I set s1.idfs.cn in hbase.zookeeper.quorum, but it only
> recognized the hostname test1. s1.idfs.cn comes from my DNS.
> <configuration>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://s1.idfs.cn:9000/hbase</value>
>    <description>The directory shared by region servers.
>    </description>
>  </property>
>  <property>
>    <name>hbase.cluster.distributed</name>
>    <value>true</value>
>    <description>
>    </description>
>  </property>
>  <property>
>    <name>fs.default.name</name>
>    <value>hdfs://s1.idfs.cn:9000</value>
>    <description></description>
>  </property>
>  <property>
>    <name>hbase.zookeeper.dns.nameserver</name>
>    <value>192.168.1.122</value>
>    <description></description>
>  </property>
>  <property>
>    <name>hbase.regionserver.dns.interface</name>
>    <value>192.168.1.122</value>
>    <description></description>
>  </property>
> <property>
>    <name>hbase.zookeeper.property.clientPort</name>
>    <value>2222</value>
>    <description>Property from ZooKeeper's config zoo.cfg.
>    The port at which the clients will connect.
>    </description>
>  </property>
>  <property>
>    <name>hbase.zookeeper.quorum</name>
>    <value>test1</value>
>  </property>
> </configuration>
>
> The regionservers file contains:
> s1.idfs.cn
> s2.idfs.cn
>
> HBase ran fine the first time, and I created tables and inserted data.
>
> 6. I used bin/zkCli.sh -server 192.168.1.122:2222 to look at /hbase in
> ZooKeeper; maybe there is some useful info for you. Thanks.
>
> [zk: 192.168.1.122:2222(CONNECTED) 0] ls /
> [hbase, zookeeper]
> [zk: 192.168.1.122:2222(CONNECTED) 16] ls /hbase
> [safe-mode, root-region-server, rs, master, shutdown]
>
> So /hbase exists at the root.
>
> [zk: 192.168.1.122:2222(CONNECTED) 10] get /hbase/master
> 192.168.1.122:60000
> cZxid = 0x1c
> ctime = Thu Jun 24 14:39:21 CST 2010
> mZxid = 0x1c
> mtime = Thu Jun 24 14:39:21 CST 2010
> pZxid = 0x1c
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x12968ae99ca0000
> dataLength = 19
> numChildren = 0
>
> That's my master, 192.168.1.122.
>
> [zk: 192.168.1.122:2222(CONNECTED) 14] get /hbase/root-region-server
> 192.168.1.123:60020
> cZxid = 0xa
> ctime = Thu Jun 24 10:38:00 CST 2010
> mZxid = 0x25
> mtime = Thu Jun 24 14:39:31 CST 2010
> pZxid = 0xa
> cversion = 0
> dataVersion = 1
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 19
> numChildren = 0
>
> I configured two region servers, but only one appears here.
>
> [zk: 192.168.1.122:2222(CONNECTED) 11] get /hbase/shutdown
> up
> cZxid = 0x1d
> ctime = Thu Jun 24 14:39:21 CST 2010
> mZxid = 0x1d
> mtime = Thu Jun 24 14:39:21 CST 2010
> pZxid = 0x1d
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 2
> numChildren = 0
>
> [zk: 192.168.1.122:2222(CONNECTED) 12] get /hbase/rs
>
> cZxid = 0x6
> ctime = Thu Jun 24 10:37:28 CST 2010
> mZxid = 0x6
> mtime = Thu Jun 24 10:37:28 CST 2010
> pZxid = 0x21
> cversion = 6
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 2
>
> [zk: 192.168.1.122:2222(CONNECTED) 19] ls /hbase/safe-mode
> []
>
>
>
>
> 2010/6/24 梁景明 <fu...@gmail.com>
>
>> Some more details: when I kill the HBase processes and restart,
>> the region server UI on 60030 comes up fine,
>> but the master UI on 60010 shows this, even though the /hbase data is still in HDFS.
>> That's my point:
>> the /hbase data stays, but I can't find any way to start HBase again.
>>
>>
>> HTTP ERROR: 500
>>
>> Trying to contact region server null for region , row '', but failed after 3 attempts.
>> Exceptions:
>>
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>>
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>>
>> RequestURI=/master.jsp
>> Caused by:
>>
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '', but failed after 3 attempts.
>> Exceptions:
>>
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>>
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>>
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1055)
>>
>>       at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:75)
>>       at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:48)
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:454)
>>
>>       at org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:127)
>>       at org.apache.hadoop.hbase.generated.master.master_jsp._jspService(master_jsp.java:132)
>>       at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
>>
>>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>       at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>>       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>>
>>       at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>       at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>>       at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>
>>       at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>>       at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>       at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>
>>       at org.mortbay.jetty.Server.handle(Server.java:324)
>>       at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>>       at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>>
>>       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>>       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>>       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>>       at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>>
>>       at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>>
>> *Powered by Jetty:// <http://jetty.mortbay.org/>*
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 2010/6/24 梁景明 <fu...@gmail.com>
>>
>>> Exactly like this: it's some problem with ZooKeeper, but I am not sure what
>>> happened to it.
>>> Everything reports as started, but ports 60030 and 60010 are not OK.
>>>
>>> ---------------------------------------------------------------------------
>>> futureha@test1:~/hbase$ bin/start-hbase.sh
>>> test1: zookeeper running as process 18596. Stop it first.
>>> master running as process 20047. Stop it first.
>>> s1.idfs.cn: regionserver running as process 18829. Stop it first.
>>> s2.idfs.cn: regionserver running as process 18763. Stop it first.
>>>
>>> ------------------------------------------------------------------------------------------
>>>
>>> The HBase logs give me the following, and I don't know how to deal with
>>> it. If ZooKeeper is dead or has problems,
>>> what do I do? stop-hbase.sh and start-hbase.sh don't work at all.
>>>
>>>
>>> ------------------------------------------------------------------------------------------------------------
>>> 2010-06-24 11:33:29,713 WARN
>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase
>>> -- check quorum servers, currently=test1:2222
>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>> KeeperErrorCode = ConnectionLoss for /hbase
>>>     at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>>     at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>>>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:808)
>>>     at
>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:405)
>>>     at
>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:432)
>>>     at
>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:520)
>>>     at
>>> org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:260)
>>>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:242)
>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>> Method)
>>>     at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>     at
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>     at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1230)
>>>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1271)
>>> 2010-06-24 11:33:31,202 INFO org.apache.zookeeper.ClientCnxn: Attempting
>>> connection to server test1/192.168.1.122:2222
>>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Priming
>>> connection to java.nio.channels.SocketChannel[connected local=/
>>> 192.168.1.122:52706 remote=test1/192.168.1.122:2222]
>>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Server
>>> connection successful
>>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Exception
>>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@163f7a1
>>> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
>>> lim=4 cap=4]
>>>     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
>>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>> exception during shutdown input
>>> java.net.SocketException: Transport endpoint is not connected
>>>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>> exception during shutdown output
>>> java.net.SocketException: Transport endpoint is not connected
>>>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>
>>>
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>>
>>>
>>> 2010/6/22 Jean-Daniel Cryans <jd...@apache.org>
>>>
>>> I'm not sure I understand what you describe, and since you didn't post
>>>> any output from your logs then it's really hard to help you debug.
>>>>
>>>> What's the problem exactly and do you see any exception in the logs?
>>>>
>>>> J-D
>>>>
>>>> On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <fu...@gmail.com> wrote:
>>>> > After reading "Description of how HBase uses ZooKeeper", I think my
>>>> > problem may be that the region server sessions in ZK were lost,
>>>> >
>>>> > and bin/start-hbase.sh can't start HBase successfully
>>>> >
>>>> > because something was lost in their connection to ZooKeeper.
>>>> >
>>>> > To get it started, should I run ZooKeeper alone, delete
>>>> > "/hbase" in
>>>> > it, and then run the start-hbase.sh script again?
>>>> >
>>>> > Will that be OK?
>>>> >
>>>> > 2010/6/19 Jean-Daniel Cryans <jd...@apache.org>
>>>> >
>>>> >> > Do you mean that if ZooKeeper is dead, the data will be lost?
>>>> >>
>>>> >> If your ZooKeeper ensemble is dead, then HBase will be unavailable, but
>>>> >> you won't lose any data. And even if your ZooKeeper data is wiped out,
>>>> >> as I said, it's only runtime data, so it doesn't matter.
>>>> >>
>>>> >> >
>>>> >> > In that case, if ZooKeeper loses .META. or -ROOT-, will the data in
>>>> >> > Hadoop never be recoverable, even though there are table folders in
>>>> >> > Hadoop?
>>>> >>
>>>> >> HBase stores the location of -ROOT- in ZooKeeper, and that's updated
>>>> >> every time the region moves. Losing it won't make -ROOT- disappear
>>>> >> forever; it's still in HDFS.
>>>> >>
>>>> >> Does it answer the question? (I'm not sure I fully understand you)
>>>> >>
>>>> >> J-D
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>

Re: how to recover hbase

Posted by 梁景明 <fu...@gmail.com>.
i dont know how to describe my situation more. i just want to restart
successful again and get my data back.
1、bin/start-hbase.sh show  all running.
2、bin/stop-hbase.sh can't stop normally.
3、regionserver cant see sometimes. after kill master process ,and restart
bin/start-hbase.sh ,it shows ok. but master can't work.
4、hadoop hdfs runs ok.and on port 50070 i can read /hbase folders.
5、here is my hbase-site.xml,and test1 and s1.idfs.cn is the same ip
192.168.1.122 ,first i set s1.idfs.cn on hbase.zookeeper.quorum but it only
know the hostname test1. s1.idfs.cn is based onmy dns.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://s1.idfs.cn:9000/hbase</value>
    <description>The directory shared by region servers.
    </description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>
    </description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://s1.idfs.cn:9000</value>
    <description></description>
  </property>
  <property>
    <name>hbase.zookeeper.dns.nameserver</name>
    <value>192.168.1.122</value>
    <description></description>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>192.168.1.122</value>
    <description></description>
  </property>
<property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
    <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>test1</value>
  </property>
</configuration>

regionserver file is
s1.idfs.cn
s2.idfs.cn

hbase runs ok first time ,and i create tables and insert data.

6、i try to  use bin/zkCli.sh -server 192.168.1.122:2222 to look at /hbase in
zookeeper ,maybe some useful info to you.thanks.

[zk: 192.168.1.122:2222(CONNECTED) 0] ls /
[hbase, zookeeper]
[zk: 192.168.1.122:2222(CONNECTED) 16] ls /hbase
[safe-mode, root-region-server, rs, master, shutdown]

see hbase in /

[zk: 192.168.1.122:2222(CONNECTED) 10] get /hbase/master
192.168.1.122:60000
cZxid = 0x1c
ctime = Thu Jun 24 14:39:21 CST 2010
mZxid = 0x1c
mtime = Thu Jun 24 14:39:21 CST 2010
pZxid = 0x1c
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x12968ae99ca0000
dataLength = 19
numChildren = 0

that 's my master 192.168.1.122

[zk: 192.168.1.122:2222(CONNECTED) 14] get /hbase/root-region-server
192.168.1.123:60020
cZxid = 0xa
ctime = Thu Jun 24 10:38:00 CST 2010
mZxid = 0x25
mtime = Thu Jun 24 14:39:31 CST 2010
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 19
numChildren = 0

i set two region servers but here just one.

[zk: 192.168.1.122:2222(CONNECTED) 11] get /hbase/shutdown
up
cZxid = 0x1d
ctime = Thu Jun 24 14:39:21 CST 2010
mZxid = 0x1d
mtime = Thu Jun 24 14:39:21 CST 2010
pZxid = 0x1d
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 2
numChildren = 0

[zk: 192.168.1.122:2222(CONNECTED) 12] get /hbase/rs

cZxid = 0x6
ctime = Thu Jun 24 10:37:28 CST 2010
mZxid = 0x6
mtime = Thu Jun 24 10:37:28 CST 2010
pZxid = 0x21
cversion = 6
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 2

[zk: 192.168.1.122:2222(CONNECTED) 19] ls /hbase/safe-mode
[]




2010/6/24 梁景明 <fu...@gmail.com>

> and more details, when i kill the  process of hbase. restart it again
> ,regionserver on 60030 can see,it started ok.
> ,but master on 60010 show this . and the data /hbase still in hadoop hdfs.
> that 's what i want to say.
> the data /hbase stays ,but i can't find any way to start hbase again.
>
>
> HTTP ERROR: 500
>
> Trying to contact region server null for region , row '', but failed after 3 attempts.
> Exceptions:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>
> RequestURI=/master.jsp
> Caused by:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '', but failed after 3 attempts.
> Exceptions:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Failed setting up proxy to /192.168.1.123:60020 after attempts=1
>
> 	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1055)
>
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:75)
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:48)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:454)
>
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:127)
> 	at org.apache.hadoop.hbase.generated.master.master_jsp._jspService(master_jsp.java:132)
> 	at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
>
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> 	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
> 	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>
> 	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> 	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> 	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>
> 	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
> 	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> 	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>
> 	at org.mortbay.jetty.Server.handle(Server.java:324)
> 	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
> 	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>
> 	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
> 	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
> 	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
> 	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>
> 	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
> *Powered by Jetty:// <http://jetty.mortbay.org/>*
>
>
>
>
>
>
>
>
>
>
> 2010/6/24 梁景明 <fu...@gmail.com>
>
> exactly like this . it 's some problem with zookeeper, i am not sure what
>> happen to zookeeper,
>> it  is all started .but port 60030 and 60010 not ok.
>>
>> ---------------------------------------------------------------------------
>> futureha@test1:~/hbase$ bin/start-hbase.sh
>> test1: zookeeper running as process 18596. Stop it first.
>> master running as process 20047. Stop it first.
>> s1.idfs.cn: regionserver running as process 18829. Stop it first.
>> s2.idfs.cn: regionserver running as process 18763. Stop it first.
>>
>> ------------------------------------------------------------------------------------------
>>
>> and logs in hbase give me the following, and i dont know how to deal with
>> it.if zookeeper is dead or goes with some problems,
>> how do i do> stop-hbase.sh & start-hbase.sh don't work at all
>>
>>
>> ------------------------------------------------------------------------------------------------------------
>> 2010-06-24 11:33:29,713 WARN
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase
>> -- check quorum servers, currently=test1:2222
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /hbase
>>     at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>     at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:808)
>>     at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:405)
>>     at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:432)
>>     at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:520)
>>     at
>> org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:260)
>>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:242)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>>     at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>     at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>     at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1230)
>>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1271)
>> 2010-06-24 11:33:31,202 INFO org.apache.zookeeper.ClientCnxn: Attempting
>> connection to server test1/192.168.1.122:2222
>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Priming
>> connection to java.nio.channels.SocketChannel[connected local=/
>> 192.168.1.122:52706 remote=test1/192.168.1.122:2222]
>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Server
>> connection successful
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Exception
>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@163f7a1
>> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
>> lim=4 cap=4]
>>     at
>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>> exception during shutdown input
>> java.net.SocketException: Transport endpoint is not connected
>>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>     at
>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>> exception during shutdown output
>> java.net.SocketException: Transport endpoint is not connected
>>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>     at
>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>>
>> 2010/6/22 Jean-Daniel Cryans <jd...@apache.org>
>>
>> I'm not sure I understand what you describe, and since you didn't post
>>> any output from your logs then it's really hard to help you debug.
>>>
>>> What's the problem exactly and do you see any exception in the logs?
>>>
>>> J-D
>>>
>>> On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <fu...@gmail.com> wrote:
>>> > after reading "Description of how HBase uses ZooKeeper"i see my problem
>>> > maybe that the regionserver session in zk is lost!
>>> >
>>> > and i use bin/start-hbase.sh cant start hbase successfully .
>>> >
>>> > because they connect to zookeeper something lost?
>>> >
>>> > to start it.one way i think zookeeper start alone ,and i delete
>>> "/hbase" in
>>> > it , and run the start-hbase.sh shell again?
>>> >
>>> > will it be ok?
>>> >
>>> > 2010/6/19 Jean-Daniel Cryans <jd...@apache.org>
>>> >
>>> >> > do u mean if ZooKeeper is dead,the data will lose?
>>> >>
>>> >> If your Zookeeper ensemble is dead, then HBase will be unavailable but
>>> >> you won't lose any data. And even if your zookeeper data is wiped out,
>>> >> like I said it's only runtime data so it doesn't matter.
>>> >>
>>> >> >
>>> >> > in that case,ZooKeeper lost .META or .ROOT ,the data in hadoop will
>>> never
>>> >> be
>>> >> > recover , thought there were some table folders in hadoop.
>>> >>
>>> >> HBase stores the location of -ROOT- in Zookeeper, and that's changed
>>> >> everytime the region moves. Losing that won't make -ROOT- disappear
>>> >> forever, it's still in HDFS.
>>> >>
>>> >> Does it answer the question? (I'm not sure I fully understand you)
>>> >>
>>> >> J-D
>>> >>
>>> >
>>>
>>
>>
>

Re: how to recover hbase

Posted by 梁景明 <fu...@gmail.com>.
and more details, when i kill the  process of hbase. restart it again
,regionserver on 60030 can see,it started ok.
,but master on 60010 show this . and the data /hbase still in hadoop hdfs.
that 's what i want to say.
the data /hbase stays ,but i can't find any way to start hbase again.


HTTP ERROR: 500

Trying to contact region server null for region , row '', but failed
after 3 attempts.
Exceptions:
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1

RequestURI=/master.jsp
Caused by:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server null for region , row '', but failed after 3
attempts.
Exceptions:
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region because: Failed setting up proxy to
/192.168.1.123:60020 after attempts=1

	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1055)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:75)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:48)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:454)
	at org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:127)
	at org.apache.hadoop.hbase.generated.master.master_jsp._jspService(master_jsp.java:132)
	at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:324)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

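A quick way to confirm whether anything is listening on the region server RPC port named in the exception above (a sketch; the host and port are taken from the log, adjust for your cluster):

```shell
# Probe the region server RPC port from the machine running the client.
# -z: only scan, don't send data; -w 5: five-second timeout.
if nc -z -w 5 192.168.1.123 60020; then
    echo "60020 is open"
else
    echo "60020 is closed or unreachable"
fi
```

If the port is closed, run `jps` on that node to see whether the HRegionServer process is actually alive.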
2010/6/24 梁景明 <fu...@gmail.com>

> Exactly like this. It's some problem with ZooKeeper; I am not sure what
> happened to it. Everything is started, but ports 60030 and 60010 are not OK.
> ---------------------------------------------------------------------------
> futureha@test1:~/hbase$ bin/start-hbase.sh
> test1: zookeeper running as process 18596. Stop it first.
> master running as process 20047. Stop it first.
> s1.idfs.cn: regionserver running as process 18829. Stop it first.
> s2.idfs.cn: regionserver running as process 18763. Stop it first.
>
> ------------------------------------------------------------------------------------------
>
> And the HBase logs give me the following; I don't know how to deal with it.
> If ZooKeeper is dead or has some problem, what do I do? stop-hbase.sh and
> start-hbase.sh don't work at all.
>
>
> ------------------------------------------------------------------------------------------------------------
> 2010-06-24 11:33:29,713 WARN
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase
> -- check quorum servers, currently=test1:2222
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:808)
>     at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:405)
>     at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:432)
>     at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:520)
>     at
> org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:260)
>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:242)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>     at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1230)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1271)
> 2010-06-24 11:33:31,202 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server test1/192.168.1.122:2222
> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.122:52706 remote=test1/192.168.1.122:2222]
> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@163f7a1
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
> lim=4 cap=4]
>     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown input
> java.net.SocketException: Transport endpoint is not connected
>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>     at
> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>     at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>     at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>     at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
>
> 2010/6/22 Jean-Daniel Cryans <jd...@apache.org>
>
> I'm not sure I understand what you describe, and since you didn't post
>> any output from your logs then it's really hard to help you debug.
>>
>> What's the problem exactly and do you see any exception in the logs?
>>
>> J-D
>>
>> On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <fu...@gmail.com> wrote:
>> > After reading "Description of how HBase uses ZooKeeper", I see that my
>> > problem may be that the region server's session in ZK was lost!
>> >
>> > And when I use bin/start-hbase.sh, HBase can't start successfully.
>> >
>> > Is it because something was lost in their connection to ZooKeeper?
>> >
>> > To start it, one way I can think of is to start ZooKeeper alone, delete
>> > "/hbase" in it, and then run the start-hbase.sh script again.
>> >
>> > Will it be OK?
>> >
>> > 2010/6/19 Jean-Daniel Cryans <jd...@apache.org>
>> >
>> >> > Do you mean that if ZooKeeper is dead, the data will be lost?
>> >>
>> >> If your Zookeeper ensemble is dead, then HBase will be unavailable but
>> >> you won't lose any data. And even if your zookeeper data is wiped out,
>> >> like I said it's only runtime data so it doesn't matter.
>> >>
>> >> >
>> >> > In that case, if ZooKeeper lost .META. or -ROOT-, could the data in
>> >> > Hadoop never be recovered, even though there are still some table
>> >> > folders in Hadoop?
>> >>
>> >> HBase stores the location of -ROOT- in Zookeeper, and that's changed
>> >> every time the region moves. Losing that won't make -ROOT- disappear
>> >> forever, it's still in HDFS.
>> >>
>> >> Does it answer the question? (I'm not sure I fully understand you)
>> >>
>> >> J-D
>> >>
>> >
>>
>
>

Re: how to recover hbase

Posted by 梁景明 <fu...@gmail.com>.
Exactly like this. It's some problem with ZooKeeper; I am not sure what
happened to it. Everything is started, but ports 60030 and 60010 are not OK.
---------------------------------------------------------------------------
futureha@test1:~/hbase$ bin/start-hbase.sh
test1: zookeeper running as process 18596. Stop it first.
master running as process 20047. Stop it first.
s1.idfs.cn: regionserver running as process 18829. Stop it first.
s2.idfs.cn: regionserver running as process 18763. Stop it first.
------------------------------------------------------------------------------------------
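For reference, a minimal check of the two web UI ports mentioned above (a sketch; `/master.jsp` is the 0.20-era master status page, and the region server URL is an assumption for your layout):

```shell
# HTTP status 200 means the UI is serving; 000 means no connection at all.
curl -s -o /dev/null -w "master UI:       %{http_code}\n" http://test1:60010/master.jsp
curl -s -o /dev/null -w "regionserver UI: %{http_code}\n" http://s1.idfs.cn:60030/
```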

And the HBase logs give me the following; I don't know how to deal with it.
If ZooKeeper is dead or has some problem, what do I do? stop-hbase.sh and
start-hbase.sh don't work at all.

------------------------------------------------------------------------------------------------------------
2010-06-24 11:33:29,713 WARN
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase
-- check quorum servers, currently=test1:2222
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:808)
    at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:405)
    at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:432)
    at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:520)
    at
org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:260)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:242)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1230)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1271)
2010-06-24 11:33:31,202 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server test1/192.168.1.122:2222
2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.122:52706 remote=test1/192.168.1.122:2222]
2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x0 to sun.nio.ch.SelectionKeyImpl@163f7a1
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
lim=4 cap=4]
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
exception during shutdown input
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at
sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
    at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
    at
org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at
sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
    at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
    at
org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
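When the master logs ConnectionLoss like this, the first thing to verify is that the quorum server from the log (test1:2222) actually answers. A sketch using ZooKeeper's four-letter-word commands:

```shell
# "ruok" should print "imok" if the ZooKeeper server is serving at all.
echo ruok | nc -w 5 test1 2222; echo

# "stat" prints the server mode (standalone/leader/follower), the connected
# clients, and the znode count; useful to see whether sessions can be held.
echo stat | nc -w 5 test1 2222
```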



2010/6/22 Jean-Daniel Cryans <jd...@apache.org>

> I'm not sure I understand what you describe, and since you didn't post
> any output from your logs then it's really hard to help you debug.
>
> What's the problem exactly and do you see any exception in the logs?
>
> J-D
>
> On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <fu...@gmail.com> wrote:
> > After reading "Description of how HBase uses ZooKeeper", I see that my
> > problem may be that the region server's session in ZK was lost!
> >
> > And when I use bin/start-hbase.sh, HBase can't start successfully.
> >
> > Is it because something was lost in their connection to ZooKeeper?
> >
> > To start it, one way I can think of is to start ZooKeeper alone, delete
> > "/hbase" in it, and then run the start-hbase.sh script again.
> >
> > Will it be OK?
> >
> > 2010/6/19 Jean-Daniel Cryans <jd...@apache.org>
> >
> >> > Do you mean that if ZooKeeper is dead, the data will be lost?
> >>
> >> If your Zookeeper ensemble is dead, then HBase will be unavailable but
> >> you won't lose any data. And even if your zookeeper data is wiped out,
> >> like I said it's only runtime data so it doesn't matter.
> >>
> >> >
> >> > In that case, if ZooKeeper lost .META. or -ROOT-, could the data in
> >> > Hadoop never be recovered, even though there are still some table
> >> > folders in Hadoop?
> >>
> >> HBase stores the location of -ROOT- in Zookeeper, and that's changed
> >> every time the region moves. Losing that won't make -ROOT- disappear
> >> forever, it's still in HDFS.
> >>
> >> Does it answer the question? (I'm not sure I fully understand you)
> >>
> >> J-D
> >>
> >
>

Re: how to recover hbase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I'm not sure I understand what you describe, and since you didn't post
any output from your logs then it's really hard to help you debug.

What's the problem exactly and do you see any exception in the logs?

J-D

On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <fu...@gmail.com> wrote:
> After reading "Description of how HBase uses ZooKeeper", I see that my
> problem may be that the region server's session in ZK was lost!
>
> And when I use bin/start-hbase.sh, HBase can't start successfully.
>
> Is it because something was lost in their connection to ZooKeeper?
>
> To start it, one way I can think of is to start ZooKeeper alone, delete
> "/hbase" in it, and then run the start-hbase.sh script again.
>
> Will it be OK?
>
> 2010/6/19 Jean-Daniel Cryans <jd...@apache.org>
>
>> > Do you mean that if ZooKeeper is dead, the data will be lost?
>>
>> If your Zookeeper ensemble is dead, then HBase will be unavailable but
>> you won't lose any data. And even if your zookeeper data is wiped out,
>> like I said it's only runtime data so it doesn't matter.
>>
>> >
>> > In that case, if ZooKeeper lost .META. or -ROOT-, could the data in
>> > Hadoop never be recovered, even though there are still some table
>> > folders in Hadoop?
>>
>> HBase stores the location of -ROOT- in Zookeeper, and that's changed
>> every time the region moves. Losing that won't make -ROOT- disappear
>> forever, it's still in HDFS.
>>
>> Does it answer the question? (I'm not sure I fully understand you)
>>
>> J-D
>>
>

Re: how to recover hbase

Posted by 梁景明 <fu...@gmail.com>.
After reading "Description of how HBase uses ZooKeeper", I see that my problem
may be that the region server's session in ZK was lost!

And when I use bin/start-hbase.sh, HBase can't start successfully.

Is it because something was lost in their connection to ZooKeeper?

To start it, one way I can think of is to start ZooKeeper alone, delete
"/hbase" in it, and then run the start-hbase.sh script again.

Will it be OK?
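For what it's worth, a sketch of that idea with ZooKeeper's own CLI (this assumes a zkCli.sh from a ZooKeeper distribution is on the PATH, and must only be run while HBase is completely stopped; the znodes are runtime state and get recreated on the next start):

```shell
# Recursively delete HBase's znodes while the whole cluster is down.
zkCli.sh -server test1:2222 <<'EOF'
rmr /hbase
quit
EOF

# Then start HBase again and let it repopulate /hbase.
bin/start-hbase.sh
```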

2010/6/19 Jean-Daniel Cryans <jd...@apache.org>

> > Do you mean that if ZooKeeper is dead, the data will be lost?
>
> If your Zookeeper ensemble is dead, then HBase will be unavailable but
> you won't lose any data. And even if your zookeeper data is wiped out,
> like I said it's only runtime data so it doesn't matter.
>
> >
> > In that case, if ZooKeeper lost .META. or -ROOT-, could the data in
> > Hadoop never be recovered, even though there are still some table
> > folders in Hadoop?
>
> HBase stores the location of -ROOT- in Zookeeper, and that's changed
> every time the region moves. Losing that won't make -ROOT- disappear
> forever, it's still in HDFS.
>
> Does it answer the question? (I'm not sure I fully understand you)
>
> J-D
>

Re: how to recover hbase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
> Do you mean that if ZooKeeper is dead, the data will be lost?

If your Zookeeper ensemble is dead, then HBase will be unavailable but
you won't lose any data. And even if your zookeeper data is wiped out,
like I said it's only runtime data so it doesn't matter.

>
> In that case, if ZooKeeper lost .META. or -ROOT-, could the data in Hadoop
> never be recovered, even though there are still some table folders in Hadoop?

HBase stores the location of -ROOT- in Zookeeper, and that's changed
every time the region moves. Losing that won't make -ROOT- disappear
forever, it's still in HDFS.
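As an illustration, the published -ROOT- location can be inspected directly in ZooKeeper (a sketch; the znode name /hbase/root-region-server matches 0.20.x-era HBase and zkCli.sh is assumed to be available):

```shell
# Print the address of the region server currently carrying -ROOT-.
zkCli.sh -server test1:2222 get /hbase/root-region-server
```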

Does it answer the question? (I'm not sure I fully understand you)

J-D

Re: how to recover hbase

Posted by 梁景明 <fu...@gmail.com>.
Thanks.

> The important content when a cluster is live is all in ZooKeeper.

Do you mean that if ZooKeeper is dead, the data will be lost?

In that case, if ZooKeeper lost .META. or -ROOT-, could the data in Hadoop
never be recovered, even though there are still some table folders in Hadoop?


2010/6/18 Jean-Daniel Cryans <jd...@apache.org>

> Inline.
>
> J-D
> On Thu, Jun 17, 2010 at 2:12 AM, 梁景明 <fu...@gmail.com> wrote:
> > Something has confused me for a long time.
> > 1. How do I recover the HBase master?
>
> The master doesn't have anything to recover. You can start as many
> masters as you want; one will be the active and the rest will just be
> waiting for the active to die and take its place. The important
> content when a cluster is live is all in ZooKeeper.
>
> > 2. How do I recover an HBase region server?
>
> The data is all in HDFS; you can lose up to 2 nodes at the same time
> and you won't lose data. Then it's all up to Hadoop.
>
> > As for Hadoop, there is a checkpoint mirror to recover the Hadoop master.
>
> As I said, there's nothing to backup for the master.
>
> > And right now I only know how to back up tables and restore them.
>
> To back up tables you can currently distcp the content of the HBase root
> directory. See http://hadoop.apache.org/common/docs/r0.20.2/distcp.html.
> There are other solutions, this mailing list is full of them.
>
> Also coming is https://issues.apache.org/jira/browse/HBASE-50
>
> > One way to rescue HBase is to reinstall HBase and create the tables again,
> > but that can't be scripted in the shell or anything; it's all manual work.
> >
> > Thanks for any help.
> >
>

Re: how to recover hbase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Inline.

J-D
On Thu, Jun 17, 2010 at 2:12 AM, 梁景明 <fu...@gmail.com> wrote:
> Something has confused me for a long time.
> 1. How do I recover the HBase master?

The master doesn't have anything to recover. You can start as many
masters as you want; one will be the active and the rest will just be
waiting for the active to die and take its place. The important
content when a cluster is live is all in ZooKeeper.
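A sketch of what that looks like in practice (assuming a standard HBase install on the standby node; the second master simply waits until the active one's ZooKeeper session expires, then takes over):

```shell
# On a second node, start another master process; it stays in standby
# until the active master dies.
bin/hbase-daemon.sh start master

# Verify the process came up on this node.
jps | grep HMaster
```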

> 2. How do I recover an HBase region server?

The data is all in HDFS; you can lose up to 2 nodes at the same time
and you won't lose data. Then it's all up to Hadoop.

> As for Hadoop, there is a checkpoint mirror to recover the Hadoop master.

As I said, there's nothing to backup for the master.

> And right now I only know how to back up tables and restore them.

To back up tables you can currently distcp the content of the HBase root
directory. See http://hadoop.apache.org/common/docs/r0.20.2/distcp.html.
There are other solutions, this mailing list is full of them.
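For example, a crude full backup of the HBase root directory (a sketch; the NameNode URIs and paths are placeholders, and HBase should be stopped or quiesced first so the copied files are consistent):

```shell
# Copy the whole HBase root directory to a backup location or second
# cluster. Run while HBase is shut down to get a consistent snapshot.
hadoop distcp \
    hdfs://namenode:9000/hbase \
    hdfs://backup-namenode:9000/hbase-backup-$(date +%Y%m%d)
```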

Also coming is https://issues.apache.org/jira/browse/HBASE-50

> One way to rescue HBase is to reinstall HBase and create the tables again,
> but that can't be scripted in the shell or anything; it's all manual work.
>
> Thanks for any help.
>