Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/01/04 00:44:45 UTC
[jira] Updated: (HBASE-3344) Master aborts after RPC to server that was shutting down
[ https://issues.apache.org/jira/browse/HBASE-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3344:
-------------------------
Attachment: 3344.txt
This patch adds a check for an EOFException (EOFE) inside a RemoteException. I don't think it is necessary: my thinking is that the running jar was old and lacked the fix, already committed, that catches EOFE (see above).
{code}
Index: src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java (revision 1054820)
+++ src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java (working copy)
@@ -1122,6 +1122,10 @@
           // Failed to close, so pass through and reassign
           LOG.debug("Server " + server + " returned " + ioe + " for " +
             region.getEncodedName());
+        } else if (ioe instanceof EOFException) {
+          // Failed to close, so pass through and reassign
+          LOG.debug("Server " + server + " returned " + ioe + " for " +
+            region.getEncodedName());
         } else {
           this.master.abort("Remote unexpected exception", ioe);
         }
{code}
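For illustration only, and not part of 3344.txt: in the stack trace quoted in the description, the EOFException shows up as the cause of a plain java.io.IOException. A bare instanceof test would not match that wrapped form. Whether the exception reaches the branch above wrapped or unwrapped depends on code not shown here. Below is a minimal sketch of a check that also walks the cause chain. The class EofCheck, the method isOrWrapsEof, and the host name example-host are hypothetical, not HBase APIs.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical helper for illustration; not part of HBase or of the attached patch. */
final class EofCheck {
  private EofCheck() {}

  /** Returns true if t, or any exception in its cause chain, is an EOFException. */
  static boolean isOrWrapsEof(Throwable t) {
    // Bound the walk so a cyclic cause chain cannot loop forever.
    for (int depth = 0; t != null && depth < 32; depth++) {
      if (t instanceof EOFException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    // Mimics the shape seen in the trace: an IOException whose cause is an EOFException.
    IOException wrapped = new IOException(
        "Call to example-host:60020 failed on local exception: java.io.EOFException",
        new EOFException());
    System.out.println(isOrWrapsEof(new EOFException()));          // direct EOFException
    System.out.println(isOrWrapsEof(wrapped));                     // EOFException as cause
    System.out.println(isOrWrapsEof(new IOException("refused")));  // unrelated IOException
  }
}
```

With a helper like this, the new branch could test the cause chain rather than only the top-level exception type. That is a design choice to weigh, not something the patch itself does.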
> Master aborts after RPC to server that was shutting down
> --------------------------------------------------------
>
> Key: HBASE-3344
> URL: https://issues.apache.org/jira/browse/HBASE-3344
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.0
> Reporter: Todd Lipcon
> Priority: Blocker
> Attachments: 3344.txt
>
>
> I was doing a rolling restart while a bunch of splits were happening, and the master aborted with the following:
> 2010-12-13 12:24:55,536 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region usertable,user1590589031,1291843166306.dbcbe21b3447c78560802962b87fd34f. (offlining)
> 2010-12-13 12:24:55,537 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
> java.io.IOException: Call to haus03.sf.cloudera.com/172.29.5.34:60020 failed on local exception: java.io.EOFException
> at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> at $Proxy6.closeRegion(Unknown Source)
> at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
> at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1085)
> at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1032)
> at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1791)
> at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:688)
> at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:579)
> at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
> at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
> 2010-12-13 12:24:55,541 INFO org.apache.hadoop.hbase.master.HMaster: Aborting