Posted to user@hbase.apache.org by Zheng Lv <lv...@gmail.com> on 2009/08/16 11:49:23 UTC

master kills itself

Hello,
    Thank you for your suggestions.
    Several days ago we found that our routing table had some problems; after
adjusting it, we are now sure that the bandwidth is OK.
    We have also enabled LZO compression.
    So we started the test program again, but after running normally for 23
hours, the master killed itself. Part of the log follows.
    By the way, this time we inserted only 10 webpages per second.
2009-08-14 13:36:31,840 INFO org.apache.hadoop.hbase.master.ServerManager: 4 region servers, 0 dead, average load 48.75
2009-08-14 13:36:32,016 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 192.168.33.5:60020, regionname: .META.,,1, startKey: <>}
2009-08-14 13:36:32,076 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 192.168.33.6:60020, regionname: -ROOT-,,0, startKey: <>}
2009-08-14 13:36:32,084 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 192.168.33.6:60020, regionname: -ROOT-,,0, startKey: <>} complete
2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 193 row(s) of meta region {server: 192.168.33.5:60020, regionname: .META.,,1, startKey: <>} complete
2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@4a407c9f
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-08-14 13:37:00,881 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu3/192.168.33.8:2222
2009-08-14 13:37:04,366 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@4ac6ee33
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-08-14 13:37:04,721 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu2/192.168.33.9:2222
2009-08-14 13:37:08,872 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@2e93ebe0
java.io.IOException: TIMED OUT
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-14 13:37:08,873 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
2009-08-14 13:37:09,486 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu2/192.168.33.9:2222
2009-08-14 13:37:12,712 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@7162d703
java.io.IOException: TIMED OUT
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-14 13:37:12,713 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
2009-08-14 13:37:13,032 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu3/192.168.33.8:2222
2009-08-14 13:37:17,482 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@1012401d
java.io.IOException: TIMED OUT
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-14 13:37:17,483 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
2009-08-14 13:37:17,856 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu7/192.168.33.6:2222
2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.33.7:40923 remote=ubuntu7/192.168.33.6:2222]
2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-08-14 13:37:21,022 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@2e101b3a
java.io.IOException: TIMED OUT
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-14 13:37:21,023 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu7/192.168.33.6:2222
2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.33.7:40926 remote=ubuntu7/192.168.33.6:2222]
2009-08-14 13:37:21,909 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-08-14 13:37:21,911 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@6bdfe124
java.io.IOException: Session Expired
        at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-08-14 13:37:21,912 ERROR org.apache.hadoop.hbase.master.HMaster: Master lost its znode, killing itself now
Regards,
LvZheng
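
As a rough aid for reading a log like the one above, the sketch below scans timestamped lines and reports unusually long silences between consecutive entries. It is only an illustration: the function name find_gaps and the 10-second threshold are assumptions made here, not part of HBase or ZooKeeper, and the sample lines are copied from the log in this message.

```python
import re
from datetime import datetime

# Leading log4j-style timestamp, e.g. "2009-08-14 13:37:00,366".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def find_gaps(lines, threshold_s=10.0):
    """Return (gap_seconds, previous_line, next_line) for each pair of
    consecutive timestamped lines more than threshold_s apart.

    threshold_s is an arbitrary, hypothetical cut-off; tune it per log.
    """
    gaps = []
    prev_ts = prev_line = None
    for line in lines:
        match = TS.match(line)
        if not match:
            # Stack-trace frames and wrapped lines carry no timestamp.
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta > threshold_s:
                gaps.append((delta, prev_line, line.rstrip()))
        prev_ts, prev_line = ts, line.rstrip()
    return gaps


# Lines taken from the master log quoted in this message.
sample = [
    "2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned",
    "2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@4a407c9f",
    "java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]",
    "2009-08-14 13:37:00,881 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu3/192.168.33.8:2222",
]

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:.3f} s of silence before: {after[:60]}")
```

On these sample lines it flags a single silence of about 28 seconds just before the first ZooKeeper read error. That alone does not prove the process was stalled, since the master may simply have had nothing to log, but it marks where to look in the region server and ZooKeeper logs for the same period.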

Re: master kills itself

Posted by Zheng Lv <lv...@gmail.com>.
Thank you! We will give it a try.

2009/8/25 Jean-Daniel Cryans <jd...@apache.org>

> No, Zookeeper will help the master election so you must start other
> masters yourself. See
> http://wiki.apache.org/hadoop/Hbase/MultipleMasters
>
> To improve that you can add more servers to hbase.zookeeper.quorum,
> change the zookeeper.session.timeout to something higher than 1 minute
> (current default) and make sure that the servers hosting ZK aren't CPU
> and mem starved (typical case is having only 2 CPUs for
> datanode/region server/zookeeper plus a MR job running).
>
> J-D

Re: master kills itself

Posted by Jean-Daniel Cryans <jd...@apache.org>.
No, ZooKeeper only helps with master election, so you must start the other
masters yourself. See
http://wiki.apache.org/hadoop/Hbase/MultipleMasters

To improve that, you can add more servers to hbase.zookeeper.quorum,
raise zookeeper.session.timeout to something higher than 1 minute
(the current default), and make sure that the servers hosting ZK aren't
CPU- or memory-starved (a typical case is having only 2 CPUs for
datanode/region server/ZooKeeper plus a running MR job).
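
One way to reason about adding servers to hbase.zookeeper.quorum: a ZooKeeper ensemble stays available only while a strict majority of its servers is up. The small sketch below works out how many server failures each ensemble size can absorb. The function name tolerated_failures is made up for this illustration and is not a ZooKeeper or HBase API.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a strict majority survives."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return (ensemble_size - 1) // 2


# Compare a few ensemble sizes side by side.
for size in range(1, 8):
    print(f"{size} server(s): survives {tolerated_failures(size)} failure(s)")
```

Going from 3 to 4 servers does not raise the number of tolerated failures, while going from 3 to 5 does, which is why odd ensemble sizes are the usual choice.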

J-D

On Tue, Aug 25, 2009 at 2:30 AM, Zheng Lv<lv...@gmail.com> wrote:
> Hello,
>    Thanks, J-D.
>    We ran the same test 3 days ago and got the same result: the master
> killed itself after running for 2 days. Now we have 2 questions.
>    1. Is it normal for the master to kill itself so quickly? If not,
> what can we do to improve it?
>    2. "Starting a Master on any node should be ok to recover, HBase is built
> for that."
>       Did you mean that a master should be started automatically, or that we
> should start one ourselves? By the way, what does ZK do? We thought ZK is
> responsible for restarting a master when the old one is dead. Is it?
>
>    Thank you,
>    LvZheng.

Re: master kills itself

Posted by Zheng Lv <lv...@gmail.com>.
Hello,
    Thanks, J-D.
    We ran the same test 3 days ago and got the same result: the master
killed itself after running for 2 days. Now we have 2 questions.
    1. Is it normal for the master to kill itself so quickly? If not,
what can we do to improve it?
    2. "Starting a Master on any node should be ok to recover, HBase is built
for that."
       Did you mean that a master should be started automatically, or that we
should start one ourselves? By the way, what does ZK do? We thought ZK is
responsible for restarting a master when the old one is dead. Is it?

    Thank you,
    LvZheng.

2009/8/16 Zheng Lv <lv...@gmail.com>

> Hello,
>     Thank you for your suggestions.
>     Several days before We found our routing talbe has some problems, after
> adjusting now we are sure that the bandwidth is ok.
>     And we have used lzo compression.
>     So we started the test program again, but after running normally for 23
> hours, the master killed itself. Following is part of the log.
>     By the way, this time we inserted 10 webpages per second only.
> 2009-08-14 13:36:31,840 INFO org.apache.hadoop.hbase.master.ServerManager:
> 4
> region servers, 0 dead, average load 48.75
> 2009-08-14 13:36:32,016 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scanning meta region {server: 192.168.33.5:60020
> ,
> regionnam
> e: .META.,,1, startKey: <>}
> 2009-08-14 13:36:32,076 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scanning meta region {server: 192.168.33.6:60020
> ,
> regionnam
> e: -ROOT-,,0, startKey: <>}
> 2009-08-14 13:36:32,084 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
> 192.168.33.6:60020
> , regionname: -ROOT-,,0, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scan of 193 row(s) of meta region {server:
> 192.168.33.5:600
> 20, regionname: .META.,,1, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All
> 1 .META. region(s) scanned
> 2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@4a407c9f
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
> lim=4 cap=4]
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:00,881 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:04,366 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@4ac6ee33
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
> lim=4 cap=4]
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:04,721 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:08,872 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@2e93ebe0
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:08,873 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:09,486 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:12,712 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@7162d703
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:12,713 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:13,032 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:17,482 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@1012401d
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:17,483 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:17,856 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected
> local=/192.168.33.7:40923 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-14 13:37:21,022 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@2e101b3a
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:21,023 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected
> local=/192.168.33.7:40926 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:21,909 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-14 13:37:21,911 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@6bdfe124
> java.io.IOException: Session Expired
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:21,912 ERROR org.apache.hadoop.hbase.master.HMaster: Master
> lost its znode, killing itself now
> Regards,
> LvZheng
>

Re: master kills itself

Posted by Jean-Daniel Cryans <jd...@apache.org>.
It seems your Master had trouble connecting to a ZK server and its
session expired. In that case it kills itself so that it cannot end up
managing the cluster at the same time as another Master, which may have
taken over if one was waiting.

Starting a Master on any node should be enough to recover; HBase is built for that.

J-D
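
The behaviour described in this reply can be illustrated with a small, self-contained sketch. It is not HBase's actual HMaster code and does not use the ZooKeeper client library; the class, enum, and method names below are hypothetical stand-ins. The only point it tries to show is the distinction visible in the log: a lost connection ("Read error", "TIMED OUT") is something the client waits out and retries, whereas a session the ensemble has declared expired means the master's ephemeral znode is gone, so the only safe reaction is to stop.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of the "kill itself on session expiry" rule.
 * All names here are invented for illustration; this is not HBase code.
 */
public class MasterFencingSketch {

    /** Simplified stand-ins for ZooKeeper session state changes. */
    public enum SessionEvent { CONNECTED, DISCONNECTED, EXPIRED }

    /** What a master could do in response to each event. */
    public enum Action { KEEP_RUNNING, WAIT_FOR_RECONNECT, ABORT }

    /** Maps one session event to the master's reaction. */
    public static Action react(SessionEvent event) {
        switch (event) {
            case CONNECTED:
                // Session is alive, so the master znode still exists.
                return Action.KEEP_RUNNING;
            case DISCONNECTED:
                // Connection lost, but the session may still be valid:
                // keep retrying other servers in the ensemble.
                return Action.WAIT_FOR_RECONNECT;
            case EXPIRED:
                // The ensemble dropped the session and its ephemeral znode.
                // Another master may already be active, so stop immediately.
                return Action.ABORT;
            default:
                throw new IllegalArgumentException("Unknown event: " + event);
        }
    }

    /** Returns the index of the first event that forces an abort, or -1. */
    public static int firstAbortIndex(List<SessionEvent> events) {
        for (int i = 0; i < events.size(); i++) {
            if (react(events.get(i)) == Action.ABORT) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Rough, simplified replay of the quoted log: connection problems,
        // a successful reconnect, then the expiry notice from the server.
        List<SessionEvent> fromLog = Arrays.asList(
                SessionEvent.CONNECTED,
                SessionEvent.DISCONNECTED,
                SessionEvent.DISCONNECTED,
                SessionEvent.CONNECTED,
                SessionEvent.EXPIRED);
        System.out.println("Abort at event index: " + firstAbortIndex(fromLog));
    }
}
```

Note that in this simplified replay the expiry only shows up after a reconnect succeeds, which matches the order of "Server connection successful" followed by "Session Expired" in the log: a client learns that its session has expired from the server, not while it is cut off.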

On Sun, Aug 16, 2009 at 2:49 AM, Zheng Lv<lv...@gmail.com> wrote:
> Hello,
>    Thank you for your suggestions.
>    Several days ago we found that our routing table had some problems; after
> adjusting it, we are now sure that the bandwidth is OK.
>    We have also enabled LZO compression.
>    So we started the test program again, but after running normally for 23
> hours, the master killed itself. Part of the log follows.
>    By the way, this time we inserted only 10 web pages per second.
> 2009-08-14 13:36:31,840 INFO org.apache.hadoop.hbase.master.ServerManager: 4
> region servers, 0 dead, average load 48.75
> 2009-08-14 13:36:32,016 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scanning meta region {server: 192.168.33.5:60020,
> regionname: .META.,,1, startKey: <>}
> 2009-08-14 13:36:32,076 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scanning meta region {server: 192.168.33.6:60020,
> regionname: -ROOT-,,0, startKey: <>}
> 2009-08-14 13:36:32,084 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
> 192.168.33.6:60020, regionname: -ROOT-,,0, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scan of 193 row(s) of meta region {server:
> 192.168.33.5:60020, regionname: .META.,,1, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: All
> 1 .META. region(s) scanned
> 2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@4a407c9f
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
> lim=4 cap=4]
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:00,881 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:04,366 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@4ac6ee33
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
> lim=4 cap=4]
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:04,721 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:08,872 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@2e93ebe0
> java.io.IOException: TIMED OUT
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:08,873 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:09,486 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:12,712 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@7162d703
> java.io.IOException: TIMED OUT
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:12,713 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:13,032 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:17,482 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80001 to sun.nio.ch.SelectionKeyImpl@1012401d
> java.io.IOException: TIMED OUT
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:17,483 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:17,856 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected
> local=/192.168.33.7:40923 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-14 13:37:21,022 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@2e101b3a
> java.io.IOException: TIMED OUT
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:21,023 WARN org.apache.zookeeper.ClientCnxn: Ignoring
> exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected
> local=/192.168.33.7:40926 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:21,909 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-14 13:37:21,911 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x22313002be80000 to sun.nio.ch.SelectionKeyImpl@6bdfe124
> java.io.IOException: Session Expired
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:21,912 ERROR org.apache.hadoop.hbase.master.HMaster: Master
> lost its znode, killing itself now
> Regards,
> LvZheng
>