Posted to user@hbase.apache.org by "Pan.W" <pw...@gmail.com> on 2010/11/04 09:13:21 UTC

HBase failure causes dual NIC ?

Hi HBase users,

I'm trying to run HBase, but I'm hitting some errors.

Running environment:
   CentOS release 5.5   
   hadoop-0.20.2 
   hbase-0.20.6 

I use two machines to run HBase (just to illustrate the issue):
    Master:       192.168.22.18 / 192.168.25.18
    RegionServer: 192.168.22.19 / 192.168.25.19
Every machine in my cluster has dual NICs, which I suspect is the problem.

Relevant configuration from hbase-site.xml:
  <property>
      <name>hbase.zookeeper.quorum</name>
      <value>192.168.25.18, 192.168.25.19</value>
      ...
  </property> 


After running start-hbase.sh, the relevant processes start.
Executing some commands in "hbase shell" then fails:
--------------------------------------------------------
hbase(main):002:0> create "table1","cf1"
NativeException: java.io.IOException: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:785)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:762)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

hbase(main):013:0> list
10/11/04 15:43:28 INFO ipc.HbaseRPC: Server at /192.168.25.19:60020 could not be reached after 1 tries, giving up.
-----------------------------------------------------------


Executing "zk_dump" then returns the following information:
-------------------------------------------------------------------
Version: 0.20.6, r965666, Mon Jul 19 16:54:48 PDT 2010
hbase(main):001:0> zk_dump
HBase tree in ZooKeeper is rooted at /hbase
  Cluster up? true
  In safe mode? false
  Master address: 192.168.25.18:60000
  Region server holding ROOT: 192.168.25.19:60020
  Region servers:
    - 192.168.22.19:60020
  Quorum Server Statistics:
    - 192.168.25.19:2181
        Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT
        Clients:
         /192.168.25.18:53266[1](queued=0,recved=0,sent=0)
        Latency min/avg/max: 0/0/0
        Received: 0
        Sent: 0
        Outstanding: 0
        Zxid: 0xc0000000d
        Mode: leader
        Node count: 11
    - 192.168.25.18:2181
        Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT
        Clients:
         /192.168.25.18:52198[1](queued=0,recved=0,sent=0)
         /192.168.25.18:59354[1](queued=0,recved=4115,sent=0)
         /192.168.25.19:41012[1](queued=0,recved=4106,sent=0)
         /192.168.25.18:52195[1](queued=0,recved=10,sent=0)
        Latency min/avg/max: 0/1/22
        Received: 8251
        Sent: 0
        Outstanding: 0
        Zxid: 0xc0000000d
        Mode: follower
        Node count: 11
------------------------------------------------------------

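From the information returned by zk_dump, it looks like inconsistent IP addresses are being used at the same time: ROOT is reported on 192.168.25.19:60020, but the only registered region server is 192.168.22.19:60020.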
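
This comparison can be done mechanically. The following is a minimal Java sketch, not part of HBase: the class name ZkDumpCheck and its methods are hypothetical, and it assumes the zk_dump text layout shown above (a "Region server holding ROOT:" line and a "Region servers:" list of "- host:port" entries). It only compares strings and does not talk to ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: checks zk_dump text for a ROOT holder that is not a registered region server. */
final class ZkDumpCheck {

    private static final String ROOT_KEY = "Region server holding ROOT:";

    /** Returns the host:port after "Region server holding ROOT:", or null if the line is absent. */
    static String rootHolder(String dump) {
        for (String line : dump.split("\n")) {
            String t = line.trim();
            if (t.startsWith(ROOT_KEY)) {
                return t.substring(ROOT_KEY.length()).trim();
            }
        }
        return null;
    }

    /** Returns the "- host:port" entries listed directly under "Region servers:". */
    static List<String> regionServers(String dump) {
        List<String> servers = new ArrayList<>();
        boolean inList = false;
        for (String line : dump.split("\n")) {
            String t = line.trim();
            if (t.equals("Region servers:")) {
                inList = true;
            } else if (inList && t.startsWith("- ")) {
                servers.add(t.substring(2).trim());
            } else if (inList) {
                break; // the first non-entry line ends the list
            }
        }
        return servers;
    }

    /** True only if the ROOT holder appears verbatim in the region server list. */
    static boolean rootHolderIsRegistered(String dump) {
        String root = rootHolder(dump);
        return root != null && regionServers(dump).contains(root);
    }

    public static void main(String[] args) {
        // Lines copied from the zk_dump output in this message.
        String dump = String.join("\n",
            "  Master address: 192.168.25.18:60000",
            "  Region server holding ROOT: 192.168.25.19:60020",
            "  Region servers:",
            "    - 192.168.22.19:60020",
            "  Quorum Server Statistics:",
            "    - 192.168.25.19:2181");
        System.out.println("ROOT holder: " + rootHolder(dump));
        System.out.println("Registered:  " + regionServers(dump));
        System.out.println("Consistent:  " + rootHolderIsRegistered(dump));
    }
}
```

For the dump above, the check reports the two addresses as inconsistent, which matches the symptom: the master looks for ROOT on an address that no region server registered under.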

Any help is greatly appreciated!


2010-11-04 



Pan.W 

Re: HBase failure causes dual NIC ?

Posted by Pan W <pw...@gmail.com>.
Hi Michael,

Your reply helped a lot, thanks.
The problem is now solved in my test environment.

The cause was the DNS configuration: a single domain name resolved to
either of the two NICs' IP addresses at random.
I asked the cluster administrator to change the DNS configuration,
and HBase works now.
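
One way to spot this kind of DNS setup before starting HBase is to ask the resolver for every address behind a hostname. A minimal Java sketch follows. The class HostnameResolutionCheck and its helpers are hypothetical names, and it assumes the resolver returns all records for the name in one answer; a DNS server that hands out one address per query would need repeated lookups, and the JVM may cache results. A host with both an IPv4 and an IPv6 record will also be flagged.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: reports whether a hostname resolves to more than one address. */
final class HostnameResolutionCheck {

    /** Builds an IPv4 address from four octets without doing any DNS lookup. */
    static InetAddress ipv4(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e); // not expected for a 4-byte array
        }
    }

    /** Distinct textual addresses contained in a resolver answer, sorted as strings. */
    static Set<String> distinctAddresses(InetAddress[] answers) {
        Set<String> out = new TreeSet<>();
        for (InetAddress a : answers) {
            out.add(a.getHostAddress());
        }
        return out;
    }

    /** True if the answer contains more than one distinct address. */
    static boolean isAmbiguous(InetAddress[] answers) {
        return distinctAddresses(answers).size() > 1;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Check the name given on the command line, or this machine's own hostname.
        String host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        InetAddress[] answers = InetAddress.getAllByName(host);
        System.out.println(host + " -> " + distinctAddresses(answers));
        if (isAmbiguous(answers)) {
            System.out.println("WARNING: name maps to several addresses; servers may register under different IPs");
        }
    }
}
```

Running it on each node with that node's hostname as the argument shows whether the name maps to both the 192.168.22.x and the 192.168.25.x address.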

-- 
Pan W <pw...@gmail.com>


Re: HBase failure causes dual NIC ?

Posted by Ted Yu <yu...@gmail.com>.
See https://issues.apache.org/jira/browse/HBASE-2502

Deactivate one of the dual NICs.


RE: HBase failure causes dual NIC ?

Posted by Michael Segel <mi...@hotmail.com>.

Short of doing an overhaul of hadoop/hbase ...

Here's the issue...

Both Hadoop and HBase are currently designed to use one NIC to talk to both the outside world and to each other.
It kind of makes sense, because Hadoop (HDFS) doesn't have a single point of entry to the cloud. Each client talks to the namenode and then to the specific datanode where the file is located, right?
So there appears to be an assumption of a single interface.

Since HBase sits on top of HDFS it seems to inherit this design.

To be honest... it's a simple design and it works for most people.
Also note that if your nodes have more than 4 SATA drives, such as 8 drives in a 2U box, you can exceed the capacity of a single 1 GbE port. In that case, if you have two NIC ports, you'll want to bond them and still present a single IP address to the rest of the world. (In some of our testing we've found that 4 SATA drives under load use about 80% of a 1 GbE port's capacity. Your mileage will vary...)

I'm not sure what you mean by renting the hardware.
Without changing the physical hardware, can you change your configuration in software? Can you assign IP addresses to ports, and configure HBase to use the NIC that sees the outside world?
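
To find out which interface carries which address, something like the following Java sketch could help. It is only an illustration: SubnetInterfaceFinder is a hypothetical class, and the 192.168.25.0/24 network is an assumption based on the quorum addresses in the original post. It lists every local interface address and marks those on that network.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

/** Hypothetical helper: lists local interface addresses and marks those on a chosen network. */
final class SubnetInterfaceFinder {

    /** True if addr and network share the same first prefixLen bits (and address family). */
    static boolean inSubnet(byte[] addr, byte[] network, int prefixLen) {
        if (addr.length != network.length || prefixLen < 0 || prefixLen > addr.length * 8) {
            return false;
        }
        for (int bit = 0; bit < prefixLen; bit++) {
            int mask = 0x80 >> (bit % 8); // most significant bit first within each byte
            if ((addr[bit / 8] & mask) != (network[bit / 8] & mask)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws SocketException {
        // Assumed network: the one used for hbase.zookeeper.quorum in the original post.
        byte[] network = {(byte) 192, (byte) 168, 25, 0};
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return; // no interfaces reported
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress a : Collections.list(nic.getInetAddresses())) {
                String mark = inSubnet(a.getAddress(), network, 24) ? "  <-- on 192.168.25.0/24" : "";
                System.out.println(nic.getName() + "  " + a.getHostAddress() + mark);
            }
        }
    }
}
```

The interface name it prints next to the marked address is the kind of value that interface-selection settings expect. HBase configurations of this era list properties such as hbase.regionserver.dns.interface and hbase.master.dns.interface, but check hbase-default.xml for your version; whether they are sufficient on 0.20.6 is not confirmed in this thread.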


On a side note...

In theory, you could change HDFS and the other configurations to allow for multiple NICs. By this I mean changing the networking so that you can specify which type of traffic goes over which address...

> Date: Thu, 4 Nov 2010 20:30:47 +0800
> From: pwcrab@gmail.com
> To: user@hbase.apache.org
> Subject: Re: Re: HBase failure causes dual NIC ?
> 
> Ted, I appreciate your help!
> I have read HBASE-2502 and the "hbase with multiple interfaces" mail thread.
> 
> I'm renting the cluster to test HBase, so I can't modify the hardware configuration.
> (Other users may need the dual NICs.)
> 
> Can I temporarily modify some code to work around the issue?
> (For example, replace the DNS lookup of the IP address with a hardcoded, fixed IP address.)
> Can anybody give me some clues?

Re: Re: HBase failure causes dual NIC ?

Posted by "Pan.W" <pw...@gmail.com>.
Ted, I appreciate your help!
I have read HBASE-2502 and the "hbase with multiple interfaces" mail thread.

I'm renting the cluster to test HBase, so I can't modify the hardware configuration.
(Other users may need the dual NICs.)

Can I temporarily modify some code to work around the issue?
(For example, replace the DNS lookup of the IP address with a hardcoded, fixed IP address.)
Can anybody give me some clues?


2010-11-04 



Pan.W 



From: Ted Yu
Sent: 2010-11-04 17:37:56
To: user
Cc:
Subject: Re: HBase failure causes dual NIC ?
 
See https://issues.apache.org/jira/browse/HBASE-2502
Deactivate one of the dual NICs.