Posted to issues@hbase.apache.org by "Liyin Tang (JIRA)" <ji...@apache.org> on 2011/06/17 20:49:47 UTC

[jira] [Created] (HBASE-4001) when region has been OPENING for too long, the master should not reassign region

when region has been OPENING for too long, the master should not reassign region
--------------------------------------------------------------------------------

                 Key: HBASE-4001
                 URL: https://issues.apache.org/jira/browse/HBASE-4001
             Project: HBase
          Issue Type: Bug
            Reporter: Liyin Tang
            Assignee: Liyin Tang


Suppose region server R1 and data node D1 are running on the same machine, and that machine is put into repair.
The master will try to reassign all of R1's regions to another region server, R2. But at that time, the name node hasn't yet figured out that data node D1 is bad.
So region server R2 will try to open the region by reading from D1, which fails and is retried. The master then believes the region has taken TOO LONG to open, so it reassigns it to R3...

The story continues until the name node figures out that data node D1 is bad, and some region server Rn finally opens the region and compacts the store file.
None of the previous region servers can open the region afterwards: they find that the file no longer exists because it has been compacted.


So the solution is that when a region has been OPENING for too long, the master should not reassign the region.
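
The proposed rule can be sketched as a small decision function. This is only an illustration, not HBase code: the names RegionState, TimeoutAction and on_timeout_check are hypothetical, and the choice to still reassign a region that no server has started opening is an assumption that the report itself does not state.

```python
from enum import Enum


class RegionState(Enum):
    # Hypothetical subset of region-in-transition states, for illustration only.
    PENDING_OPEN = "PENDING_OPEN"  # assumption: no server has started the open yet
    OPENING = "OPENING"            # a region server is already opening the region
    OPEN = "OPEN"


class TimeoutAction(Enum):
    NONE = "NONE"
    REASSIGN = "REASSIGN"
    KEEP_WAITING = "KEEP_WAITING"


def on_timeout_check(state, elapsed_ms, timeout_ms):
    """Decide what the master does with a region that is in transition."""
    if elapsed_ms < timeout_ms:
        return TimeoutAction.NONE
    if state is RegionState.OPENING:
        # Proposed behaviour: a server is already working on the open, so
        # moving the region to R3, R4, ... only repeats the slow reads from D1.
        return TimeoutAction.KEEP_WAITING
    if state is RegionState.PENDING_OPEN:
        # Assumption: nothing has picked the region up, so reassigning is safe.
        return TimeoutAction.REASSIGN
    return TimeoutAction.NONE
```

With this rule, on_timeout_check(RegionState.OPENING, 60000, 30000) keeps waiting instead of handing the region to the next server.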

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (HBASE-4001) when region has been OPENING for too long, the master should not reassign region

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-4001.
----------------------------------

    Resolution: Won't Fix

Closing this. Please reopen if you disagree.
                


[jira] [Commented] (HBASE-4001) when region has been OPENING for too long, the master should not reassign region

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052256#comment-13052256 ] 

Jean-Daniel Cryans commented on HBASE-4001:
-------------------------------------------

Not sure which HDFS version you are running, but in recent ones the DFSClient just marks the blocks on the dead DN as bad and queries the other ones.
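
The client-side behaviour described above can be sketched roughly as follows. This is not the DFSClient API: read_block, fetch and dead_nodes are hypothetical names, and the sketch simplifies by tracking dead nodes rather than individual blocks.

```python
def read_block(replicas, fetch, dead_nodes):
    """Try each replica in order, skipping nodes already marked dead.

    replicas   -- data node names holding a copy of the block (hypothetical)
    fetch      -- callable that reads the block from one node (hypothetical)
    dead_nodes -- set of node names the client has already seen fail
    """
    for node in replicas:
        if node in dead_nodes:
            continue
        try:
            return fetch(node)
        except ConnectionError:
            # Remember the failure locally so later reads skip this node,
            # without waiting for the name node to notice it is down.
            dead_nodes.add(node)
    raise IOError("no live replica for block")
```

Under these assumptions, a read that first hits D1 pays for one failed attempt, and later reads by the same client go straight to the remaining replicas.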

