Posted to dev@hbase.apache.org by Ramkrishna S Vasudevan <ra...@huawei.com> on 2011/04/05 06:57:53 UTC

Reg: HRegionServer not able to communicate with the new HMaster when HMaster switching happens

Hi

 

When HBase is running in HA mode, the RegionServer is connected to the
active HMaster.

When a switchover happens, the RegionServer is not able to connect to
the new active HMaster.

A "Connection refused" exception is thrown.

Is this a bug? If so, has an issue already been raised for it?

 

Regards

Ram

 

****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

 


Re: Reg: HRegionServer not able to communicate with the new HMaster when HMaster switching happens

Posted by Stack <st...@duboce.net>.
Sounds like https://issues.apache.org/jira/browse/HBASE-3545?
St.Ack

On Mon, Apr 4, 2011 at 11:10 PM, Ramkrishna S Vasudevan
<ra...@huawei.com> wrote:
> Hi
>
> No, it does not recover; the RegionServer is not able to get the new HMaster
> address.  I am using HBase 0.90.0.
> We were also able to identify the problem.
> The getMaster() method in HRegionServer has a while loop in which the region
> server tries to connect to the HMaster:
>     while ((masterAddress = masterAddressManager.getMasterAddress()) == null) {
>       if (stopped) {
>         return null;
>       }
>       LOG.debug("No master found, will retry");
>       sleeper.sleep();
>     }
>     HMasterRegionInterface master = null;
>     while (!stopped && master == null) {
>       try {
>         // Do initial RPC setup. The final argument indicates that the RPC
>         // should retry indefinitely.
>         master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
>             HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
>             masterAddress.getInetSocketAddress(), this.conf, -1,
>             this.rpcTimeout, this.rpcTimeout);
>       } catch (IOException e) {
>         e = e instanceof RemoteException ?
>             ((RemoteException)e).unwrapRemoteException() : e;
>         if (e instanceof ServerNotRunningException) {
>           LOG.info("Master isn't available yet, retrying");
>         } else {
>           LOG.warn("Unable to connect to master. Retrying. Error was:", e);
>         }
>         sleeper.sleep();
>       }
>     }
>
> The masterAddress is fetched only in the first loop, when the master is
> obtained for the first time.  Later, when the HMaster goes down, the
> HRegionServer moves into the second while loop.  There it tries to reach the
> new (switched-over) HMaster, but masterAddress still holds the old address,
> because only the masterAddressManager is updated.
>
> So, as a fix for this problem, we can fetch the updated master address from
> the masterAddressManager before setting up the proxy, as follows:
>
> masterAddress = masterAddressManager.getMasterAddress();
> master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
>     HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
>     masterAddress.getInetSocketAddress(), this.conf, -1,
>     this.rpcTimeout, this.rpcTimeout);
>
> This may resolve the issue.
>
>
> Regards
> Ram

RE: Reg: HRegionServer not able to communicate with the new HMaster when HMaster switching happens

Posted by Ramkrishna S Vasudevan <ra...@huawei.com>.
Hi 

No, it does not recover; the RegionServer is not able to get the new HMaster
address.  I am using HBase 0.90.0.
We were also able to identify the problem.
The getMaster() method in HRegionServer has a while loop in which the region
server tries to connect to the HMaster:
    while ((masterAddress = masterAddressManager.getMasterAddress()) == null) {
      if (stopped) {
        return null;
      }
      LOG.debug("No master found, will retry");
      sleeper.sleep();
    }
    HMasterRegionInterface master = null;
    while (!stopped && master == null) {
      try {
        // Do initial RPC setup. The final argument indicates that the RPC
        // should retry indefinitely.
        master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
            HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
            masterAddress.getInetSocketAddress(), this.conf, -1,
            this.rpcTimeout, this.rpcTimeout);
      } catch (IOException e) {
        e = e instanceof RemoteException ?
            ((RemoteException)e).unwrapRemoteException() : e;
        if (e instanceof ServerNotRunningException) {
          LOG.info("Master isn't available yet, retrying");
        } else {
          LOG.warn("Unable to connect to master. Retrying. Error was:", e);
        }
        sleeper.sleep();
      }
    }

The masterAddress is fetched only in the first loop, when the master is
obtained for the first time.  Later, when the HMaster goes down, the
HRegionServer moves into the second while loop.  There it tries to reach the
new (switched-over) HMaster, but masterAddress still holds the old address,
because only the masterAddressManager is updated.

So, as a fix for this problem, we can fetch the updated master address from
the masterAddressManager before setting up the proxy, as follows:

masterAddress = masterAddressManager.getMasterAddress();
master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
    HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
    masterAddress.getInetSocketAddress(), this.conf, -1,
    this.rpcTimeout, this.rpcTimeout);

This may resolve the issue.  
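
To make the difference between the two loops easier to see outside of HBase,
here is a small standalone sketch.  It is not HBase code: AddressTracker,
tryConnect and the host names are hypothetical stand-ins for the
masterAddressManager, the waitForProxy call and the two masters, and the retry
count is bounded only so that the demo terminates.

```java
import java.util.Collections;
import java.util.Set;

/** Standalone illustration only; none of these types are HBase classes. */
class StaleMasterAddressDemo {

    /**
     * Hypothetical stand-in for masterAddressManager. Each read returns the
     * next address in the given history and then stays on the last one, which
     * simulates a failover happening after the first read.
     * Assumes at least one address is supplied.
     */
    static final class AddressTracker {
        private final String[] history;
        private int reads = 0;

        AddressTracker(String... history) {
            this.history = history;
        }

        String getMasterAddress() {
            String current = history[Math.min(reads, history.length - 1)];
            reads++;
            return current;
        }
    }

    /** Hypothetical stand-in for the RPC setup: succeeds only for a live master. */
    static boolean tryConnect(String address, Set<String> liveMasters) {
        return liveMasters.contains(address);
    }

    /** Reported behaviour: the address is read once, before the retry loop. */
    static String connectWithoutRefresh(AddressTracker tracker,
                                        Set<String> liveMasters,
                                        int maxAttempts) {
        String masterAddress = tracker.getMasterAddress();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryConnect(masterAddress, liveMasters)) {
                return masterAddress;
            }
        }
        return null; // every attempt used the old address
    }

    /** Proposed fix: the address is re-read on every attempt. */
    static String connectWithRefresh(AddressTracker tracker,
                                     Set<String> liveMasters,
                                     int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String masterAddress = tracker.getMasterAddress();
            if (tryConnect(masterAddress, liveMasters)) {
                return masterAddress;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Only the new master accepts connections after the switchover.
        Set<String> live = Collections.singleton("new-master:60000");

        System.out.println(connectWithoutRefresh(
            new AddressTracker("old-master:60000", "new-master:60000"), live, 3));
        System.out.println(connectWithRefresh(
            new AddressTracker("old-master:60000", "new-master:60000"), live, 3));
    }
}
```

In this sketch the first call never connects, because it keeps retrying the
address it read before the switchover, while the second call picks up the new
address on its second attempt.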


Regards
Ram

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Tuesday, April 05, 2011 11:05 AM
To: dev@hbase.apache.org; ramakrishnas@huawei.com
Subject: Re: Reg: HRegionServer not able to communicate with the new HMaster
when HMaster switching happens

RegionServer should get notification of new master.  Do you not see
that in the regionserver logs?  Does it never recover?  What version
of hbase?
Thanks,
St.Ack



Re: Reg: HRegionServer not able to communicate with the new HMaster when HMaster switching happens

Posted by Stack <st...@duboce.net>.
RegionServer should get notification of new master.  Do you not see
that in the regionserver logs?  Does it never recover?  What version
of hbase?
Thanks,
St.Ack

On Mon, Apr 4, 2011 at 9:57 PM, Ramkrishna S Vasudevan
<ra...@huawei.com> wrote:
> Hi
>
> When HBase is running in HA mode, the RegionServer is connected to the
> active HMaster.
>
> When a switchover happens, the RegionServer is not able to connect to
> the new active HMaster.
>
> A "Connection refused" exception is thrown.
>
> Is this a bug? If so, has an issue already been raised for it?
>
> Regards
>
> Ram