Posted to issues@hbase.apache.org by "Cosmin Lehene (JIRA)" <ji...@apache.org> on 2011/03/22 15:02:05 UTC
[jira] [Updated] (HBASE-3660) If regions assignment fails, clients will be directed to stale data from .META.

     [ https://issues.apache.org/jira/browse/HBASE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cosmin Lehene updated HBASE-3660:
---------------------------------

    Attachment: HBASE-3660.patch

I just looked into this (it is really annoying for me because my IP changes a lot).

The catch in CatalogTracker.getCachedConnection looks too narrow: it catches SocketTimeoutException, but "Host is down" and "Network unreachable" are raised as SocketException.

{code}
2011-03-22 15:13:19,111 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.SocketException: Host is down
	at sun.nio.ch.Net.connect(Native Method)
	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
	at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
	at $Proxy7.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:953)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:385)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:284)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:482)
	at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)
{code}

I changed it to catch SocketException, and I no longer have any problems when changing IPs.
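
To illustrate why the narrower catch misses this case: in the JDK, SocketTimeoutException extends InterruptedIOException, not SocketException, so neither catch clause covers the other. The following is a minimal standalone sketch of the idea, not the attached patch. The class CachedConnectionSketch, the Connector interface and the connectOrNull method are hypothetical names invented for this example, and treating an unreachable server as "no connection" (null) is an assumption about the intended behavior.

```java
import java.io.IOException;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class CachedConnectionSketch {

    /** Hypothetical stand-in for the call that opens a connection to a server. */
    interface Connector<T> {
        T connect() throws IOException;
    }

    /**
     * Returns the connection, or null when the server cannot be reached.
     * Any other IOException is propagated to the caller.
     */
    static <T> T connectOrNull(Connector<T> connector) throws IOException {
        try {
            return connector.connect();
        } catch (SocketTimeoutException e) {
            // The server did not answer in time.
            return null;
        } catch (SocketException e) {
            // "Host is down" and "Network unreachable" arrive as SocketException;
            // ConnectException is a subclass, so it is handled here as well.
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate the failure from the stack trace above.
        Object conn = connectOrNull(() -> {
            throw new SocketException("Host is down");
        });
        System.out.println("connection after 'Host is down': " + conn);
    }
}
```

With only the SocketTimeoutException clause, the simulated "Host is down" failure would propagate out of connectOrNull, which matches the unhandled exception in the master log above.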




> If regions assignment fails, clients will be directed to stale data from .META.
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-3660
>                 URL: https://issues.apache.org/jira/browse/HBASE-3660
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.90.1
>            Reporter: Cosmin Lehene
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3660.patch
>
>
> I've noticed this when the IP on my machine changed (it's even easier to detect when LZO doesn't work)
> Master loads .META. successfully and then starts assigning regions.
> However LZO doesn't work so HRegionServer can't open the regions. 
> A client then tries to read from a table: it looks up the region location in .META. but is sent to a totally different server (the old value in .META.).
> This could happen without the LZO story too. 
