Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org> on 2012/05/23 17:42:41 UTC
[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281678#comment-13281678 ]
ramkrishna.s.vasudevan commented on HBASE-6070:
-----------------------------------------------
I plan to make the following change in AM.nodeDeleted. Currently SSH already handles regions in RIT that are in the splitting state, so doing the same in AM.nodeDeleted leads to a race.
{code}
-    if (rs.isSplitting() || rs.isSplit()) {
+    if (rs.isSplit()) {
       LOG.debug("Ephemeral node deleted, regionserver crashed?, " +
         "clearing from RIT; rs=" + rs);
       regionOffline(rs.getRegion());
{code}
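To make the effect of the change easier to discuss, here is a minimal sketch of the ordering that causes the problem. It is a toy model, not HBase code: the class `SplitRaceSketch`, the `regionsInTransition` map, and the methods `nodeDeleted`, `sshFindsRegion`, and `replay` are hypothetical stand-ins, and the real AM and SSH do far more than this. It only shows that if nodeDeleted clears a SPLITTING region before SSH looks in RIT, SSH has nothing left to reassign.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these are NOT the real HBase AssignmentManager or
// ServerShutdownHandler classes; every name here is hypothetical.
public class SplitRaceSketch {
    enum State { SPLITTING, SPLIT }

    // Stand-in for the master's regions-in-transition (RIT) map.
    static final Map<String, State> regionsInTransition = new HashMap<>();

    // Models AM.nodeDeleted. With the old check a SPLITTING region is
    // cleared from RIT; with the proposed check only a SPLIT region is.
    static void nodeDeleted(String region, boolean oldCheck) {
        State s = regionsInTransition.get(region);
        if (s == null) {
            return;
        }
        boolean clear = oldCheck
            ? (s == State.SPLITTING || s == State.SPLIT)
            : (s == State.SPLIT);
        if (clear) {
            regionsInTransition.remove(region);
        }
    }

    // Models the later SSH step: it can only handle a region it still finds in RIT.
    static boolean sshFindsRegion(String region) {
        return regionsInTransition.containsKey(region);
    }

    // Replays the problematic ordering: the split has just started, the RS dies,
    // nodeDeleted fires, and only afterwards does SSH look in RIT.
    public static boolean replay(boolean oldCheck) {
        regionsInTransition.clear();
        regionsInTransition.put("region-A", State.SPLITTING);
        nodeDeleted("region-A", oldCheck);
        return sshFindsRegion("region-A");
    }

    public static void main(String[] args) {
        System.out.println("old check, SSH finds region: " + replay(true));
        System.out.println("new check, SSH finds region: " + replay(false));
    }
}
```

In this model the old condition leaves SSH with an empty RIT, while the proposed condition keeps the SPLITTING region visible so that SSH can deal with it.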
Please provide your suggestions.
> AM.nodeDeleted and SSH races creating problems for regions under SPLIT
> ----------------------------------------------------------------------
>
> Key: HBASE-6070
> URL: https://issues.apache.org/jira/browse/HBASE-6070
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1, 0.94.0
> Reporter: ramkrishna.s.vasudevan
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
>
> We tried to address the problems with Master restart and RS restart while a region SPLIT is in progress as part of HBASE-5806.
> On further analysis we found that one race condition still remains.
> 1. Split has just started and the znode is in RS_SPLIT state.
> 2. RS goes down.
> 3. The first callback for SSH comes.
> 4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
> 5. But now the nodeDeleted event comes for the SPLIT node, and there we delete the region from RIT.
> 6. After this, SSH checks whether any region is in RIT. As the region is no longer found in RIT, it is never assigned.
> When we fixed HBASE-5806, step 6 happened first and then step 5, so we missed this case. We have found it now and will come up with a patch shortly.