Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/05/05 01:30:03 UTC

[jira] [Commented] (HBASE-3814) force regionserver to halt

    [ https://issues.apache.org/jira/browse/HBASE-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029046#comment-13029046 ] 

stack commented on HBASE-3814:
------------------------------

OK.  I'd be fine with a thread watching shutdown.  Shutdowns can take a while though if there are lots of regions.  Should there be a Progressable that gets tickled on each region close so we don't prematurely time out the shutdown?
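
A minimal sketch of such a watchdog, as a plain Java daemon thread (the class and method names here are hypothetical, not part of HBase's actual API; Runtime.getRuntime().halt() is the only real call being proposed):

  // Hypothetical watchdog: halts the JVM if abort processing stalls.
  // Each call to progress() (e.g. on every region close) resets the
  // deadline, so a long-but-still-active shutdown is not killed early.
  public class AbortWatchdog extends Thread {
    private final long timeoutMs;
    private volatile long lastProgress = System.currentTimeMillis();
    private volatile boolean done = false;

    public AbortWatchdog(long timeoutMs) {
      this.timeoutMs = timeoutMs;
      setDaemon(true);            // must never keep the JVM alive itself
      setName("abort-watchdog");
    }

    /** Tickle on each region close (the Progressable role). */
    public void progress() { lastProgress = System.currentTimeMillis(); }

    /** Call once abort processing has completed cleanly. */
    public void finished() { done = true; }

    @Override
    public void run() {
      while (!done) {
        if (System.currentTimeMillis() - lastProgress > timeoutMs) {
          // halt() skips shutdown hooks and finalizers, so a stuck
          // filesystem client cannot block the process from exiting.
          Runtime.getRuntime().halt(1);
        }
        try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
      }
    }
  }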

> force regionserver to halt
> --------------------------
>
>                 Key: HBASE-3814
>                 URL: https://issues.apache.org/jira/browse/HBASE-3814
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Prakash Khemani
>
> Once abort() on a regionserver is called we should have a timeout thread that calls Runtime.getRuntime().halt() if the RS gets stuck somewhere during abort processing (a usage sketch follows the quoted logs below).
> ===
> Pumahbase132 has the following logs ... the dfsclient is not able to set up a write pipeline successfully ... it tries to abort ... but while aborting it gets stuck. I know there is a check that if we are aborting because the filesystem is closed then we should not try to flush the logs while aborting. But in this case the fs is up and running, just that it is not functioning.
> 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> java.io.IOException: Bad connect ack with firstBadLink 10.38.133.33:50010
> 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8967376451767492285_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> java.io.IOException: Bad connect ack with firstBadLink 10.38.134.59:50010
> 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7172251852699100447_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> java.io.IOException: Bad connect ack with firstBadLink 10.38.134.53:50010
> 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-9153204772467623625_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> java.io.IOException: Bad connect ack with firstBadLink 10.38.134.49:50010
> 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2513098940934276625_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3560)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2720)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2977)
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2513098940934276625_6537229 bad datanode[1] nodes == null
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280" - Aborting...
> 2011-04-21 23:48:07,216 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
> And then the RS gets stuck trying to roll the logs ...
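
Wiring the watchdog into abort processing might then look like the following (again hypothetical; region closes are simulated with sleeps so the snippet runs on its own alongside the AbortWatchdog class above):

  // Hypothetical driver: start the watchdog when abort() begins, tickle
  // it after each region close, and stop it once shutdown completes.
  public class AbortDemo {
    public static void main(String[] args) throws Exception {
      AbortWatchdog dog = new AbortWatchdog(5000); // halt after 5s without progress
      dog.start();
      for (int region = 0; region < 3; region++) {
        Thread.sleep(1000);  // stand-in for closing one region
        dog.progress();      // reset the deadline after each close
      }
      dog.finished();        // clean shutdown: the watchdog never fires
    }
  }

If abort() instead hangs before the next progress() call, the way the log roll hangs in the logs above, the deadline expires and the watchdog halts the JVM.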

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira