Posted to user@hbase.apache.org by Bradford Stephens <br...@gmail.com> on 2010/10/26 09:06:13 UTC

Nodes up, Master sees 0 RegionServers

Hey datamigos,

I'm having trouble getting a finicky 0.20.6 cluster to behave.

The Master, ZooKeeper, and RegionServers all seem to be happy --
except the Master doesn't see any RSs. Running "status" in the shell
hangs. I'm running CDH3B-whatever-is-latest underneath. HDFS sees the
proper 3 datanodes.

Here's a Master log: http://pastebin.com/ZNPnmexF
Here's a RegionServer log: http://pastebin.com/PjUy4ra4

Any ideas? This was working properly with Hadoop 0.20.2. The new HDFS
has been installed and formatted since then.

Cheers,
B

-- 
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science

RE: Nodes up, Master sees 0 RegionServers

Posted by Michael Segel <mi...@hotmail.com>.
Not sure if this helps...

With CDH3B3, if you're using the HBase version that ships with it, installing HBase also installs ZooKeeper,
so each node has its own ZK config (zoo.cfg) in /etc/zookeeper.

You need to make sure that these zoo.cfg files match the ones on your ZooKeeper machines.

A quick thing to check: on a RegionServer, do you see a line with localhost in the file?
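
One way to automate that check is sketched below in Python. It is only an illustration, not something from this thread: it assumes the quorum is listed in the usual server.N=host:port:port form, and the function names and sample hostnames are made up for the example.

```python
import re

# Matches quorum entries such as "server.0=localhost:2888:3888".
# Commented-out lines start with "#" and therefore never match.
SERVER_LINE = re.compile(r"^\s*server\.(\d+)\s*=\s*([^:\s]+)")


def quorum_hosts(zoo_cfg_text):
    """Return {server_id: host} for every server.N entry in a zoo.cfg body."""
    hosts = {}
    for line in zoo_cfg_text.splitlines():
        match = SERVER_LINE.match(line)
        if match:
            hosts[int(match.group(1))] = match.group(2)
    return hosts


def localhost_entries(zoo_cfg_text):
    """Return the quorum entries that point at the local machine."""
    return {
        server_id: host
        for server_id, host in quorum_hosts(zoo_cfg_text).items()
        if host in ("localhost", "127.0.0.1")
    }


# Hypothetical file contents, for illustration only.
sample = (
    "tickTime=2000\n"
    "clientPort=2181\n"
    "server.0=localhost:2888:3888\n"
    "server.1=zk1.example.com:2888:3888\n"
)
print(localhost_entries(sample))  # {0: 'localhost'}
```

To use it on a real node, read /etc/zookeeper/zoo.cfg into a string and pass it to localhost_entries; comparing the quorum_hosts output from a RegionServer with the output from a ZooKeeper machine shows whether the two files agree.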

HTH

-Mike


> Date: Tue, 26 Oct 2010 12:20:44 -0700
> Subject: Re: Nodes up, Master sees 0 RegionServers
> From: ghelmling@gmail.com
> To: user@hbase.apache.org
> 
> Hey Bradford,
> 
> Disclaimer: I don't know exactly what's in CDH3B3, so this is a bit
> uninformed...
> 
> Are you using an ASF HBase 0.20.6?  If so, I don't think that'll work with
> CDH3B3 Hadoop, since it includes the secure Hadoop changes.  Even without
> enabling the security features (hadoop.security.authentication=simple -- the
> default), there are still backward incompatible changes to core Hadoop
> classes like org.apache.hadoop.security.UserGroupInformation that will bite
> you.  You can't even compile stock HBase against secure Hadoop.
> 
> So I think running CDH3B3 Hadoop means you need to run CDH3B3 HBase.
> Cloudera folks, please correct me if I'm wrong.
> 
> I hope to work up some patches to allow HBase trunk to work on both secure
> Hadoop and plain 0.20-append, but it will take a little work due to the
> backward-incompatible changes.
> 
> --gh

Re: Nodes up, Master sees 0 RegionServers

Posted by Gary Helmling <gh...@gmail.com>.
Hey Bradford,

Disclaimer: I don't know exactly what's in CDH3B3, so this is a bit
uninformed...

Are you using an ASF HBase 0.20.6?  If so, I don't think that'll work with
CDH3B3 Hadoop, since it includes the secure Hadoop changes.  Even without
enabling the security features (hadoop.security.authentication=simple -- the
default), there are still backward incompatible changes to core Hadoop
classes like org.apache.hadoop.security.UserGroupInformation that will bite
you.  You can't even compile stock HBase against secure Hadoop.

So I think running CDH3B3 Hadoop means you need to run CDH3B3 HBase.
Cloudera folks, please correct me if I'm wrong.

I hope to work up some patches to allow HBase trunk to work on both secure
Hadoop and plain 0.20-append, but it will take a little work due to the
backward-incompatible changes.

--gh


On Tue, Oct 26, 2010 at 12:13 AM, Tao Xie <xi...@gmail.com> wrote:

> I once had the same problem. In the end I found that the RSs had not been started.

Re: Nodes up, Master sees 0 RegionServers

Posted by Tao Xie <xi...@gmail.com>.
I once had the same problem. In the end I found that the RSs had not been started.


Re: Strange Zookeeper messages?

Posted by Todd Lipcon <to...@cloudera.com>.
Looks like a ZooKeeper bug - maybe try the ZK user list?

-Todd

On Wed, Oct 27, 2010 at 12:13 PM, Gangl, Michael E (388K) <
Michael.E.Gangl@jpl.nasa.gov> wrote:

> I'm running:
>
> hbase-0.20.6
> zookeeper-3.2.2
> hadoop-0.20.2
>
> Everything is on a single machine, as I'm testing out some M/R -> HBase
> code, and everything on that front is working fine. After a period of
> inactivity, however, my log was flooded with millions of copies of the
> same 'WARN' messages shown below. Any ideas on this? I couldn't find this
> issue anywhere else, and am wondering how to prevent it or stop the flood
> (other than turning logging off or raising the log level).
>
> Thanks in advance
>
> Mike


-- 
Todd Lipcon
Software Engineer, Cloudera

Strange Zookeeper messages?

Posted by "Gangl, Michael E (388K)" <Mi...@jpl.nasa.gov>.
I'm running:

hbase-0.20.6
zookeeper-3.2.2
hadoop-0.20.2

Everything is on a single machine, as I'm testing out some M/R -> HBase code, and everything on that front is working fine. After a period of inactivity, however, my log was flooded with millions of copies of the same 'WARN' messages shown below. Any ideas on this? I couldn't find this issue anywhere else, and am wondering how to prevent it or stop the flood (other than turning logging off or raising the log level).

Thanks in advance

Mike


log:

2010-10-27 11:43:02,022 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing close of session 0x0 due to java.io.IOException: Read error
2010-10-27 11:43:02,022 - INFO  [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/137.79.16.114:2181 remote=/137.78.237.31:43731]
2010-10-27 11:43:02,025 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing close of session 0x0 due to java.io.IOException: Read error
2010-10-27 11:43:02,025 - INFO  [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/137.79.16.114:2181 remote=/137.78.237.31:35002]
2010-10-27 11:43:53,218 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing close of session 0x0 due to java.io.IOException: Len error 1195725856
2010-10-27 11:43:53,218 - INFO  [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/137.79.16.114:2181 remote=/137.78.237.31:39252]
2010-10-27 11:44:15,496 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing close of session 0x0 due to java.io.IOException: Len error 1212501072
2010-10-27 11:44:15,496 - INFO  [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/137.79.16.114:2181 remote=/137.78.237.31:45169]
2010-10-27 11:44:59,654 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.net.SocketException: Invalid argument
        at sun.nio.ch.Net.setIntOption0(Native Method)
        at sun.nio.ch.Net.setIntOption(Net.java:157)
        at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:406)
        at sun.nio.ch.SocketOptsImpl.setBoolean(SocketOptsImpl.java:38)
        at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(SocketOptsImpl.java:284)
        at sun.nio.ch.OptionAdaptor.setTcpNoDelay(OptionAdaptor.java:48)
        at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:268)
        at org.apache.zookeeper.server.NIOServerCnxn.<init>(NIOServerCnxn.java:810)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.createConnection(NIOServerCnxn.java:199)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:233)
2010-10-27 11:44:59,655 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:223)
2010-10-27 11:44:59,655 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:239)
2010-10-27 11:44:59,655 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:223)
2010-10-27 11:44:59,656 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:223)
2010-10-27 11:44:59,656 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:239)
2010-10-27 11:44:59,656 - WARN  [NIOServerCxn.Factory:2181:NIOServerCnxn$Factory@249] - Ignoring exception
java.lang.NullPointerException

And this continues for as long as I let it. Millions of messages.
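
One observation on the "Len error" lines in the log above, offered as a sketch rather than a diagnosis. ZooKeeper clients prefix each request with a 4-byte big-endian length, so a "Len error N" means the first four bytes the server read were taken as the length N and rejected. Decoding the two values from the log back into bytes (the helper name below is made up for the example) gives readable ASCII:

```python
def decode_len_error(value):
    """Turn a 'Len error' value back into the four bytes it was read from."""
    raw = value.to_bytes(4, byteorder="big")
    return raw.decode("ascii", errors="replace")


# The two values reported in the log above.
for reported in (1195725856, 1212501072):
    print(reported, repr(decode_len_error(reported)))
# 1195725856 'GET '
# 1212501072 'HELP'
```

That would be consistent with something sending plain-text requests (for example an HTTP GET) to port 2181 rather than a real ZooKeeper client. The remote address in the log (137.78.237.31) is also not the local machine (137.79.16.114). Whether that is a network scanner, a monitoring probe, or something else is a guess; nothing in the thread confirms it, and it does not by itself explain the NullPointerException loop.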