Posted to dev@hbase.apache.org by "Ben Maurer (JIRA)" <ji...@apache.org> on 2009/02/28 18:36:13 UTC

[jira] Created: (HBASE-1228) Hang after crash

Hang after crash
----------------

                 Key: HBASE-1228
                 URL: https://issues.apache.org/jira/browse/HBASE-1228
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.19.0
            Reporter: Ben Maurer
             Fix For: 0.19.1


After an exception that forced an HRegionServer to shut down, I'm seeing it hang in the following method for at least a few minutes:

"regionserver/0:0:0:0:0:0:0:0:60020" prio=10 tid=0x00002aaaf41a9000 nid=0x10f6 in Object.wait() [0x00000000422dd000..0x00000000422ddb10]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3025)
	- locked <0x00002aaad8fa2410> (a java.util.LinkedList)
	- locked <0x00002aaad8fa2078> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3105)
	- locked <0x00002aaad8fa2078> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3054)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
	at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:959)
	- locked <0x00002aaad8fa1f10> (a org.apache.hadoop.io.SequenceFile$Writer)
	at org.apache.hadoop.hbase.regionserver.HLog.close(HLog.java:431)
	- locked <0x00002aaab378b290> (a java.lang.Integer)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:498)
	at java.lang.Thread.run(Thread.java:619)

I believe the file system may already have been closed, so the final flush of the HLog cannot complete. The HLog should be closed proactively before shutdown begins, to maximize the chances of it surviving the crash.
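
To make the suggested ordering concrete, here is a minimal Java sketch. It is not HBase or Hadoop code: FakeFileSystem, FakeLog, shutdownLogFirst and shutdownFileSystemFirst are hypothetical stand-ins invented for illustration, and the fake log fails fast at the point where the report suspects the real client waits.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical model of the shutdown ordering; none of these types exist in HBase or Hadoop. */
public class ShutdownOrderSketch {

    /** Stand-in for a file system client that is unusable once closed. */
    static final class FakeFileSystem implements Closeable {
        private boolean open = true;
        boolean isOpen() { return open; }
        @Override public void close() { open = false; }
    }

    /** Stand-in for a write-ahead log whose close() needs one final flush. */
    static final class FakeLog implements Closeable {
        private final FakeFileSystem fs;
        private boolean flushed = false;
        FakeLog(FakeFileSystem fs) { this.fs = fs; }
        boolean isFlushed() { return flushed; }
        @Override public void close() throws IOException {
            if (!fs.isOpen()) {
                // Assumption: the report suspects a wait at this point; the sketch fails fast instead.
                throw new IOException("file system already closed");
            }
            flushed = true;
        }
    }

    /** Suggested order: close the log while the file system is still usable, then the file system. */
    static boolean shutdownLogFirst(FakeLog log, FakeFileSystem fs) {
        boolean logSaved = true;
        try {
            log.close();
        } catch (IOException e) {
            logSaved = false;
        }
        fs.close();
        return logSaved;
    }

    /** Suspected problematic order: the file system goes away before the log is closed. */
    static boolean shutdownFileSystemFirst(FakeLog log, FakeFileSystem fs) {
        fs.close();
        try {
            log.close();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        FakeFileSystem fs1 = new FakeFileSystem();
        System.out.println("log first: " + shutdownLogFirst(new FakeLog(fs1), fs1));
        FakeFileSystem fs2 = new FakeFileSystem();
        System.out.println("fs first: " + shutdownFileSystemFirst(new FakeLog(fs2), fs2));
    }
}
```

In this model the log's final flush succeeds only when the log is closed before the file system, which is the property the suggestion above is after. Whether the real hang has the same cause would still need to be confirmed from the region server logs.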

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1228) Hang after crash

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1228:
-------------------------

    Fix Version/s:     (was: 0.19.1)
                   0.19.2

Moving out (I think Ben said it was OK).

[jira] Commented: (HBASE-1228) Hang after crash

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679812#action_12679812 ] 

Jim Kellerman commented on HBASE-1228:
--------------------------------------

It would be helpful to have the region server and master logs.

[jira] Updated: (HBASE-1228) Hang after crash

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1228:
-------------------------

    Fix Version/s:     (was: 0.19.2)
                   0.20.0

Moved to 0.20.0.

[jira] Updated: (HBASE-1228) Hang on DFSOS#flushInternal for minutes after regionserver crash

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1228:
-------------------------

    Fix Version/s:     (was: 0.20.0)
          Summary: Hang on DFSOS#flushInternal for minutes after regionserver crash  (was: Hang after crash)

Moving out of 0.20.0.  Not critical.
