Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/01/04 00:44:45 UTC

[jira] Updated: (HBASE-3344) Master aborts after RPC to server that was shutting down

     [ https://issues.apache.org/jira/browse/HBASE-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3344:
-------------------------

    Attachment: 3344.txt

This patch adds a check for an EOFException (EOFE) inside a RemoteException. I don't think it is necessary, since my thinking is that the running jar was old and lacked the already-committed fix that catches EOFE (see above).

{code}
Index: src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java (revision 1054820)
+++ src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java (working copy)
@@ -1122,6 +1122,10 @@
         // Failed to close, so pass through and reassign
         LOG.debug("Server " + server + " returned " + ioe + " for " +
           region.getEncodedName());
+      } else if (ioe instanceof EOFException) {
+        // Failed to close, so pass through and reassign
+        LOG.debug("Server " + server + " returned " + ioe + " for " +
+          region.getEncodedName());
       } else {
         this.master.abort("Remote unexpected exception", ioe);
       }
{code}
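
For reference, the trace quoted from the issue shows the EOFException arriving as the "Caused by" of a plain java.io.IOException rather than as the top-level exception. The following is only a minimal sketch of how such a wrapped EOFE could be detected by walking the cause chain. The names CauseChainCheck and hasCause are hypothetical; they are not part of HBase or of the attached patch, and the host name in the message is a placeholder.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, not HBase code: illustrates checking a cause chain.
public class CauseChainCheck {

  /** Returns true if t, or any exception in its cause chain, is an instance of type. */
  public static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
    // Bound the walk so a cyclic cause chain cannot loop forever.
    for (int depth = 0; t != null && depth < 16; depth++) {
      if (type.isInstance(t)) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    // Same shape as the logged failure: an IOException whose cause is an EOFException.
    IOException wrapped = new IOException(
        "Call to host:60020 failed on local exception: java.io.EOFException",
        new EOFException());

    // A direct instanceof test only looks at the outermost exception.
    System.out.println(wrapped instanceof EOFException);          // prints false
    // Walking the causes finds the nested EOFException.
    System.out.println(hasCause(wrapped, EOFException.class));    // prints true
  }
}
```

Whether the master should treat a nested EOFE the same way as a top-level one is a separate question; the sketch only shows the detection step.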

> Master aborts after RPC to server that was shutting down
> --------------------------------------------------------
>
>                 Key: HBASE-3344
>                 URL: https://issues.apache.org/jira/browse/HBASE-3344
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>         Attachments: 3344.txt
>
>
> I was doing a rolling restart while a bunch of splits were happening, and the master aborted with the following:
> 2010-12-13 12:24:55,536 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region usertable,user1590589031,1291843166306.dbcbe21b3447c78560802962b87fd34f. (offlining)
> 2010-12-13 12:24:55,537 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
> java.io.IOException: Call to haus03.sf.cloudera.com/172.29.5.34:60020 failed on local exception: java.io.EOFException
>         at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>         at $Proxy6.closeRegion(Unknown Source)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1085)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1032)
>         at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1791)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:688)
>         at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:579)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>         at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
> 2010-12-13 12:24:55,541 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
