Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2010/11/09 00:16:11 UTC

[jira] Created: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK
-----------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-3207
                 URL: https://issues.apache.org/jira/browse/HBASE-3207
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.90.0
            Reporter: Jonathan Gray
            Assignee: Jonathan Gray
             Fix For: 0.90.0


Ran into an issue on a cluster where HDFS was taken out from under it.  The RS eventually tried to shut itself down.  As regions were being closed, they got IOException "Filesystem closed".  In the CloseRegionHandlers, this caused the close operation to not finish (neither in ZK nor in the online region list in the RS).  That, in turn, held up waitOnAllRegionsToClose(), so the RS never shut down.

If we get an IOException during a close, which can happen on a fatal error during flush, it is not recoverable, so we should complete the region close both in ZK and by removing the region from the map of online regions on that RS.
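
For illustration only, here is a minimal sketch of the behavior proposed above. It is not the HBase code or the attached patch: CloseRegionSketch, RegionCloser, CloseTracker, markClosed and the in-memory map are hypothetical stand-ins for the CloseRegionHandler, the ZK transition and the RS online region map. The point is that the ZK transition and the removal from the online map happen whether or not close() threw.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are the real HBase API.
class CloseRegionSketch {

    /** Stand-in for closing a region (flush, close store files). */
    interface RegionCloser {
        void close() throws IOException;
    }

    /** Stand-in for completing the close transition in ZK. */
    interface CloseTracker {
        void markClosed(String regionName);
    }

    private final Map<String, RegionCloser> onlineRegions = new ConcurrentHashMap<>();
    private final CloseTracker tracker;

    CloseRegionSketch(CloseTracker tracker) {
        this.tracker = tracker;
    }

    void addOnline(String regionName, RegionCloser region) {
        onlineRegions.put(regionName, region);
    }

    int onlineCount() {
        return onlineRegions.size();
    }

    /**
     * Closes the region. Returns true on a clean close, false if the region
     * was not online or close() threw an IOException. An online region is
     * always removed from the map and marked closed, even on failure.
     */
    boolean closeRegion(String regionName) {
        RegionCloser region = onlineRegions.get(regionName);
        if (region == null) {
            return false;
        }
        boolean clean = true;
        try {
            region.close();
        } catch (IOException e) {
            // Treated as unrecoverable: log it, but still finish the close
            // so a shutdown waiting on the online map is not held up.
            clean = false;
            System.err.println("Close of " + regionName + " failed: " + e.getMessage());
        }
        onlineRegions.remove(regionName);
        tracker.markClosed(regionName);
        return clean;
    }

    public static void main(String[] args) {
        List<String> closedInZk = new ArrayList<>();
        CloseRegionSketch rs = new CloseRegionSketch(closedInZk::add);
        rs.addOnline("r1", () -> { throw new IOException("Filesystem closed"); });
        rs.closeRegion("r1");
        System.out.println("online regions left: " + rs.onlineCount());
    }
}
```

With this shape, a loop that waits for the online map to drain can finish even when every close fails with "Filesystem closed".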

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-3207:
---------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed.  Thanks for the review, stack.


[jira] Updated: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-3207:
---------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-3207:
---------------------------------

    Attachment: HBASE-3207-v1.patch

Just makes it so we do the ZK transition and remove the region from online regions even if we get an IOException.  Also adds a little more detail to logging and comments.


[jira] Commented: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929845#action_12929845 ] 

stack commented on HBASE-3207:
------------------------------

+1

(On IRC, Jon explained that we don't need deleteClosingState anymore because the region is now actually considered closed.)


[jira] Commented: (HBASE-3207) If we get IOException when closing a region, we should still remove it from online regions and complete the close in ZK

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930310#action_12930310 ] 

stack commented on HBASE-3207:
------------------------------

You going to commit or what? JG?