Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (Created) (JIRA)" <ji...@apache.org> on 2011/10/05 11:37:34 UTC

[jira] [Created] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

OpenedRegionHandler is not enforcing atomicity of the operation it is performing
--------------------------------------------------------------------------------

                 Key: HBASE-4540
                 URL: https://issues.apache.org/jira/browse/HBASE-4540
             Project: HBase
          Issue Type: Bug
            Reporter: ramkrishna.s.vasudevan
            Assignee: ramkrishna.s.vasudevan


-> OpenedRegionHandler has not yet deleted the znode of region R1, which was opened by RS1.
-> RS1 goes down.
-> ServerShutdownHandler assigns region R1 to RS2.
-> The znode of R1 is moved to the OFFLINE state by the master, or to the OPENING state by RS2 if RS2 has started opening the region.
-> The first OpenedRegionHandler now tries to delete the znode, assuming it is still in the OPENED state, and the delete fails.
-> Although the delete fails, the handler still removes the region from RIT (regions in transition) and records RS1 as the owner of R1 in the master's memory.
-> When RS2 later finishes opening the region, the master cannot process the open because the region has already been removed from RIT.
{code}
Master
======
2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state

After the region is opened in RS2
=================================
2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
{code}
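
The sequence above can be sketched in Java as follows. This is a minimal in-memory illustration under stated assumptions, not the HBase patch: RegionZNode, MasterState and OpenedHandlerSketch are hypothetical names, and the real handler talks to ZooKeeper through ZKAssign. The point it tries to show is that RIT and the owner map are updated only when the state-checked delete succeeds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the unassigned znode of one region. */
class RegionZNode {
    String state;            // e.g. "OPENED", "OFFLINE", "OPENING"
    boolean exists = true;

    RegionZNode(String state) {
        this.state = state;
    }

    /** Deletes the node only if it still exists and is in the expected state. */
    synchronized boolean deleteIfInState(String expected) {
        if (!exists || !state.equals(expected)) {
            return false;
        }
        exists = false;
        return true;
    }
}

/** Master-side bookkeeping, reduced to the two maps the report mentions. */
class MasterState {
    final Map<String, String> regionsInTransition = new HashMap<>();
    final Map<String, String> regionOwner = new HashMap<>();
}

class OpenedHandlerSketch {
    /** Returns true only if the znode was deleted and the in-memory state was updated. */
    static boolean handleOpened(RegionZNode znode, MasterState master,
                                String region, String server) {
        if (!znode.deleteIfInState("OPENED")) {
            // The node has moved on (OFFLINE or OPENING): leave RIT and the
            // owner map untouched so the newer assignment can complete.
            return false;
        }
        master.regionsInTransition.remove(region);
        master.regionOwner.put(region, server);
        return true;
    }
}
```

With the znode already in OPENING (RS2 has picked the region up), handleOpened returns false and R1 stays in RIT, so the later OPENING and OPENED transitions from RS2 still find the region in the expected state.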


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123380#comment-13123380 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
-----------------------------------------------------------

(Updated 2011-10-08 05:13:32.657832)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
-------

This updated patch is the same as the one uploaded on 07/Oct/11 14:27.
It reverts the change that passed -2 to skip the version comparison, and addresses Ted's comment about adding spaces.


Summary
-------

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
    https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 

Diff: https://reviews.apache.org/r/2251/diff


Testing
-------

Yes


Thanks,

ramkrishna


                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122884#comment-13122884 ] 

Ted Yu commented on HBASE-4540:
-------------------------------

We may want to designate other negative values for other purposes in the future.
I think using a single known value is the better choice.
The Javadoc addition above is nice.
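
The "one known value" idea can be sketched in Java as below. This is only an illustration: VersionedNode and ANY_VERSION are hypothetical names, and the node is an in-memory stand-in rather than a real znode. The only external fact relied on is that ZooKeeper's delete(path, version) treats -1 as "match any version"; rejecting every other negative value is a design choice of this sketch, made so that those values stay free for later use.

```java
/** Hypothetical in-memory node carrying a ZooKeeper-style data version. */
class VersionedNode {
    /** The single sentinel meaning "do not compare versions" (ZooKeeper uses -1 for this). */
    static final int ANY_VERSION = -1;

    private int version = 0;
    private boolean deleted = false;

    /** Every data update bumps the version, as a setData on a znode would. */
    synchronized void setData() {
        version++;
    }

    synchronized int getVersion() {
        return version;
    }

    /**
     * Deletes the node if expectedVersion matches the current version,
     * or unconditionally if expectedVersion is ANY_VERSION.
     */
    synchronized boolean delete(int expectedVersion) {
        if (expectedVersion < ANY_VERSION) {
            // Other negative values are reserved rather than silently accepted.
            throw new IllegalArgumentException("unsupported version: " + expectedVersion);
        }
        if (deleted) {
            return false;
        }
        if (expectedVersion != ANY_VERSION && expectedVersion != version) {
            return false;
        }
        deleted = true;
        return true;
    }
}
```

A caller that read the node at version 0 and then calls delete(0) after someone else has updated it gets false back, while delete(VersionedNode.ANY_VERSION) skips the comparison.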
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>

        

[jira] [Resolved] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan resolved HBASE-4540.
-------------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.90.5

Resolved in both 0.92 and 0.90.5.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch, HBASE-4540_90_1.patch
>
>

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126024#comment-13126024 ] 

stack commented on HBASE-4540:
------------------------------

+1 on commit to 0.90 branch.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch
>
>

        

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Status: Patch Available  (was: Open)

Addresses HBASE-4539 also.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123633#comment-13123633 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------



bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java, line 873
bq.  > <https://reviews.apache.org/r/2251/diff/4/?file=48949#file48949line873>
bq.  >
bq.  >     We don't have this method already in our ZK* classes?

@Stack
ZKAssign did not have a getDataAndWatch() that accepts a Stat object. Only ZKUtil had one, but it returns the data as bytes, which would then have to be converted again to RegionTransitionData.
Hence I added a utility API in ZKAssign itself, thinking it may be useful in the future as well.
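
The shape of such a helper can be sketched in Java as follows. All names here (NodeStat, TransitionData, NodeStoreSketch, getRaw, getParsed) are hypothetical stand-ins and not the ZKAssign or ZKUtil signatures; the store is a plain in-memory map. The sketch only illustrates the layering: a low-level read that returns bytes and fills in a stat, and a convenience read on top of it that returns the parsed payload.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a ZooKeeper Stat; only the data version is modeled. */
class NodeStat {
    int version;
}

/** Hypothetical parsed form of a node payload encoded as "STATE:server". */
class TransitionData {
    final String state;
    final String server;

    TransitionData(String state, String server) {
        this.state = state;
        this.server = server;
    }

    static TransitionData fromBytes(byte[] bytes) {
        String[] parts = new String(bytes, StandardCharsets.UTF_8).split(":", 2);
        return new TransitionData(parts[0], parts[1]);
    }
}

class NodeStoreSketch {
    private final Map<String, byte[]> data = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    /** Writes the payload; the version starts at 0 and grows by 1 per rewrite. */
    void put(String path, String state, String server) {
        data.put(path, (state + ":" + server).getBytes(StandardCharsets.UTF_8));
        versions.merge(path, 0, (old, ignored) -> old + 1);
    }

    /** Low-level read: raw bytes with the stat filled in, or null if the node is missing. */
    byte[] getRaw(String path, NodeStat stat) {
        byte[] bytes = data.get(path);
        if (bytes != null) {
            stat.version = versions.get(path);
        }
        return bytes;
    }

    /** Convenience read: parsed payload plus stat, so callers skip the byte conversion. */
    TransitionData getParsed(String path, NodeStat stat) {
        byte[] bytes = getRaw(path, stat);
        return bytes == null ? null : TransitionData.fromBytes(bytes);
    }
}
```

A caller can then read the state and the version in one step and pass that version to a later versioned delete.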


bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java, line 102
bq.  > <https://reviews.apache.org/r/2251/diff/4/?file=48951#file48951line102>
bq.  >
bq.  >     Good test.
bq.  >     
bq.  >     Would it be possible to test the handler without spinning up the cluster?  See TestOpenRegionHandler over under regionserver.handler in tests -- they don't spin up a cluster, just zk.  Test can run faster if no dfs+hbase.  Not important.  For the future.

@Stack
I can do that for at least one of the test cases in TestOpenedRegionHandler, but I would have to use MockServer and MockRegionServices.
I will raise a minor improvement task for that. Currently MockServer and MockRegionServices are under the regionserver.handler package, but the new test case is in the master package, so it would be better to move them to a test utility package and use them across packages. I will go ahead with this commit for now and track the new improvement JIRA to closure.
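
The general idea of testing a handler without a cluster can be sketched in Java as below. This does not use MockServer or MockRegionServices; AssignmentView, RecordingAssignmentView and HandlerUnderTest are hypothetical names invented for the illustration, and the znode outcome is passed in as a plain boolean.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical narrow view of what the handler needs from the master. */
interface AssignmentView {
    void regionOnline(String region, String server);
}

/** Hand-written mock that records calls instead of talking to a cluster. */
class RecordingAssignmentView implements AssignmentView {
    final List<String> calls = new ArrayList<>();

    @Override
    public void regionOnline(String region, String server) {
        calls.add(region + "@" + server);
    }
}

class HandlerUnderTest {
    private final AssignmentView view;

    HandlerUnderTest(AssignmentView view) {
        this.view = view;
    }

    /** Marks the region online only when the znode delete is reported as successful. */
    void process(String region, String server, boolean znodeDeleted) {
        if (znodeDeleted) {
            view.regionOnline(region, server);
        }
    }
}
```

Because the handler depends only on the small interface, a test can construct it with the recording mock, call process() for both outcomes, and inspect the recorded calls without starting DFS or HBase.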


- ramkrishna


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2469
-----------------------------------------------------------




                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of region R1, which was opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns region R1 to RS2.
> -> The znode of R1 is moved to the OFFLINE state by the master, or to the OPENING state by RS2 if RS2 has started opening the region.
> -> The first OpenedRegionHandler now tries to delete the znode, assuming it is still in the OPENED state, and the delete fails.
> -> Although the delete fails, the handler still removes the region from RIT (regions in transition) and records RS1 as the owner of R1 in the master's memory.
> -> When RS2 later finishes opening the region, the master cannot process the open because the region has already been removed from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}
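
The failure in the steps above comes from updating the master's in-memory state even when the znode delete did not succeed. A minimal sketch of one way to guard against it follows, assuming plain maps as stand-ins for ZooKeeper, RIT and the assignment table. The class and method names here (OpenedHandlerSketch, handleOpenedChecked and so on) are hypothetical and are not HBase code, and this is not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the master-side bookkeeping; these are not HBase classes. */
class OpenedHandlerSketch {
    // region -> current znode state (stand-in for the unassigned znode in ZooKeeper)
    final Map<String, String> znodeState = new HashMap<>();
    // region -> server it is being opened on (stand-in for regions in transition, RIT)
    final Map<String, String> regionsInTransition = new HashMap<>();
    // region -> server the master believes is hosting it
    final Map<String, String> assignments = new HashMap<>();

    /** Deletes the znode only if it is currently in the expected state. */
    boolean deleteZnodeInState(String region, String expectedState) {
        return znodeState.remove(region, expectedState);
    }

    /** Behaviour described in the report: memory is updated even if the delete failed. */
    void handleOpenedIgnoringResult(String region, String server) {
        deleteZnodeInState(region, "OPENED");
        regionsInTransition.remove(region);
        assignments.put(region, server);
    }

    /** Guarded variant: leave RIT and the assignment table alone when the delete fails. */
    boolean handleOpenedChecked(String region, String server) {
        if (!deleteZnodeInState(region, "OPENED")) {
            return false; // the znode was moved on by someone else; this handler is stale
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }
}
```

With the znode already in the OPENING state for RS2, the guarded variant returns false and the region stays in RIT, so the later OPENING and OPENED events from RS2 still find it in an expected state.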

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121299#comment-13121299 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

The znode version is needed to handle the following scenario:
-> RS1 opens a region and transitions its znode to OPENED.
-> OpenedRegionHandler has not yet processed it.
-> RS1 goes down and the region is assigned to RS2.
-> RS2 transitions the znode to OPENED.
-> Now the OpenedRegionHandler tries to delete the znode, and the delete succeeds because the handler thinks the region is on RS1.
-> To avoid this scenario, I have tried to use the znode version that comes with the callback we get after the node is transitioned to the OPENED state.
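
The scenario above can be sketched with a small in-memory model, assuming a znode that carries a state and a version that increases on every write. VersionedZnodeSketch and its methods are hypothetical names for illustration only, not the ZKAssign or ZooKeeper API. The convention that -1 means "skip the version comparison" follows what ZooKeeper itself uses for versioned operations.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the unassigned-region znodes. */
class VersionedZnodeSketch {
    /** A znode's state plus a version that is bumped on every write. */
    static final class Znode {
        final String state;
        final int version;

        Znode(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    private final Map<String, Znode> nodes = new HashMap<>();

    /** Writes a new state and returns the version the znode has after the write. */
    synchronized int setState(String region, String state) {
        Znode old = nodes.get(region);
        int version = (old == null) ? 0 : old.version + 1;
        nodes.put(region, new Znode(state, version));
        return version;
    }

    /**
     * Deletes the znode only if both the state and the version match.
     * An expectedVersion of -1 skips the version comparison.
     */
    synchronized boolean deleteIfMatches(String region, String expectedState, int expectedVersion) {
        Znode current = nodes.get(region);
        if (current == null || !current.state.equals(expectedState)) {
            return false;
        }
        if (expectedVersion != -1 && current.version != expectedVersion) {
            return false;
        }
        nodes.remove(region);
        return true;
    }

    synchronized boolean exists(String region) {
        return nodes.containsKey(region);
    }
}
```

A handler that remembered the version from RS1's OPENED callback would be refused once RS2 has rewritten the znode, even though the state reads OPENED again. A state-only check (expectedVersion of -1) would delete RS2's znode in the same situation.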
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122973#comment-13122973 ] 

Ted Yu commented on HBASE-4540:
-------------------------------

@Ramkrishna:
ZKAssign.transitionNode() is already using -1 to indicate no version comparison.
Your patch @ 07/Oct/11 14:27 should be good.

Sorry for the confusion.
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125174#comment-13125174 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

I will let you know the test case results tomorrow.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch
>
>

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Attachment: HBASE-4540_1.patch

I have not yet run the complete test suite.
Once it has run, I will let you know the results.
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122106#comment-13122106 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
-----------------------------------------------------------

Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Summary
-------

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
    https://issues.apache.org/jira/browse/HBASE-4540


Diffs
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 

Diff: https://reviews.apache.org/r/2251/diff


Testing
-------

Yes


Thanks,

ramkrishna


                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121707#comment-13121707 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

@JD
Yes, the bug is the same as HBASE-4416. I did not know that one already existed.
@Jon
I am working on writing test cases.
I will submit them once done.

Thanks guys for your reviews.
@Ted
I will address your comment in the updated patch.
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126646#comment-13126646 ] 

stack commented on HBASE-4540:
------------------------------

Can we resolve this issue now Ram?
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch, HBASE-4540_90_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> Servershutdownhandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by master or OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode thinking its in OPENED state but fails.
> -> Though it fails it removes the node from RIT and adds RS1 as the owner of R1 in master's memory.
> -> Now when RS2 completes opening the region the master is not able to open the region as already the reigon has been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}
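
The sequence in the description above can be modeled in a few lines of Java. This is only an illustrative sketch under stated assumptions, not the HBase code or the attached patch: `OpenedHandlerSketch`, its three maps, and both `process*` methods are hypothetical names, and the znode is reduced to a plain string state. It shows why the in-memory update should depend on the result of the conditional delete.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the master's bookkeeping; not the real AssignmentManager. */
final class OpenedHandlerSketch {
    final Map<String, String> znodeState = new HashMap<>();          // region -> state stored in ZK
    final Map<String, String> regionsInTransition = new HashMap<>(); // region -> RIT state
    final Map<String, String> onlineRegions = new HashMap<>();       // region -> owning server

    /** Deletes the znode only if it is still in the expected state; returns whether it deleted. */
    boolean deleteIfInState(String region, String expectedState) {
        return znodeState.remove(region, expectedState);
    }

    /** Unguarded variant: ignores the delete result, so memory is updated even on failure. */
    void processUnchecked(String region, String server) {
        deleteIfInState(region, "OPENED");
        regionsInTransition.remove(region);
        onlineRegions.put(region, server);
    }

    /** Guarded variant: touches in-memory state only after the delete succeeded. */
    boolean processChecked(String region, String server) {
        if (!deleteIfInState(region, "OPENED")) {
            return false; // znode was moved on (e.g. to OPENING); leave RIT alone
        }
        regionsInTransition.remove(region);
        onlineRegions.put(region, server);
        return true;
    }
}
```

With the znode already in OPENING state, `processChecked("R1", "RS1")` returns false and leaves R1 in RIT, whereas `processUnchecked` drops R1 from RIT and records RS1 as its owner, which matches the failure reported in the logs.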

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122920#comment-13122920 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
-----------------------------------------------------------

(Updated 2011-10-07 16:13:33.022073)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
-------

If we do not want to compare the version of the znode while deleting, we can pass -1 to the deleteNode API.
Uploaded the patch with this change.
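
As a rough illustration of a version-checked delete with an opt-out value, here is a minimal in-memory sketch. It is not the ZKAssign or ZooKeeper API: `FakeZnodeStore` and all of its methods are hypothetical, the `ANY_VERSION` sentinel is an assumption borrowed from ZooKeeper's own convention that a version of -1 matches any version, and a missing node simply returns false here rather than raising an exception.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a znode store; not the HBase or ZooKeeper API. */
final class FakeZnodeStore {
    /** Assumed sentinel meaning "do not compare versions". */
    static final int ANY_VERSION = -1;

    private static final class Node {
        String state;
        int version; // starts at 0 and is bumped on every data change
        Node(String state) { this.state = state; }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    void create(String path, String state) {
        nodes.put(path, new Node(state));
    }

    /** Changes the stored state and bumps the version, as a data write would. */
    void setState(String path, String state) {
        Node n = nodes.get(path);
        if (n == null) throw new IllegalStateException("no node " + path);
        n.state = state;
        n.version++;
    }

    int getVersion(String path) {
        Node n = nodes.get(path);
        if (n == null) throw new IllegalStateException("no node " + path);
        return n.version;
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    /** Deletes only if the state matches and, unless ANY_VERSION is passed, the version too. */
    boolean deleteNode(String path, String expectedState, int expectedVersion) {
        Node n = nodes.get(path);
        if (n == null || !n.state.equals(expectedState)) return false;
        if (expectedVersion != ANY_VERSION && n.version != expectedVersion) return false;
        nodes.remove(path);
        return true;
    }
}
```

The version check matters when a node leaves the OPENED state and later returns to it: the state comparison alone would pass, but a caller holding the old version is still refused.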


Summary
-------

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
    https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 

Diff: https://reviews.apache.org/r/2251/diff


Testing
-------

Yes


Thanks,

ramkrishna


                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123672#comment-13123672 ] 

Hudson commented on HBASE-4540:
-------------------------------

Integrated in HBase-TRUNK #2311 (See [https://builds.apache.org/job/HBase-TRUNK/2311/])
    HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation it is performing. Also fixes HBASE-4539 (ram)

ramkrishna : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java

                

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Attachment: HBASE-4540_90.patch

Contains the patch for the 0.90.x branch.  It also moves MockServer and MockRegionServices to the test.util package.  The changes are otherwise similar to those in the trunk version.
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121470#comment-13121470 ] 

Ted Yu commented on HBASE-4540:
-------------------------------

For ZKAssign.deleteNode(), the javadoc needs to be updated as it covers both opened and closed states:
{code}
   * <p>Returns false if the node was not in the proper state but did exist.
   *
   * <p>This method is used during table disables when a region finishes
   * successfully closing.  This is the Master acknowledging completion
   * of the specified regions transition to being closed.
{code}
                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123802#comment-13123802 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------



bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java, line 102
bq.  > <https://reviews.apache.org/r/2251/diff/4/?file=48951#file48951line102>
bq.  >
bq.  >     Good test.
bq.  >     
bq.  >     Would it be possible to test the handler without spinning up the cluster?  See TestOpenRegionHandler over under regionserver.handler in tests -- they don't spin up a cluster, just zk.  Test can run faster if no dfs+hbase.  Not important.  For the future.
bq.  
bq.  ramkrishna vasudevan wrote:
bq.      @Stack
bq.      I can do that for at least one of the test cases in TestOpenedRegionHandler, but I have to use MockServer and MockRegionServices.
bq.      I will raise a minor improvement task for that. Currently MockServer and MockRegionServices are under the regionserver.handler package, but the new test case is in the master package, so it is better to move them to a test.utility package and use them across packages. I will go with this commit for now and track the new improvement JIRA to closure.

Sounds good Ram.  Yes, we should move these out if more generally useful.


bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java, line 873
bq.  > <https://reviews.apache.org/r/2251/diff/4/?file=48949#file48949line873>
bq.  >
bq.  >     We don't have this method already in our ZK* classes?
bq.  
bq.  ramkrishna vasudevan wrote:
bq.      @Stack
bq.      ZKAssign did not have a getDataAndWatch() that accepts a Stat object.  Only ZKUtil had one, but it returned the data as bytes, which then had to be converted to RegionTransitionData.
bq.      Hence I added a utility API in ZKAssign itself, thinking it may also be useful in the future.

Sounds good.
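
To illustrate the kind of helper discussed above, here is a small self-contained sketch of a read that returns parsed transition data and fills in a stat object in the same call. Everything in it is an assumption made for illustration: `StatSketch`, `TransitionDataSketch`, `ZnodeReaderSketch`, and the "EVENT:region" payload format are hypothetical and do not reflect the real ZooKeeper `Stat`, `RegionTransitionData`, or the method added in the patch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for ZooKeeper's Stat; only the version is modeled. */
final class StatSketch {
    int version;
}

/** Hypothetical parsed payload, loosely modeled on the idea of RegionTransitionData. */
final class TransitionDataSketch {
    final String eventType;
    final String regionName;

    TransitionDataSketch(String eventType, String regionName) {
        this.eventType = eventType;
        this.regionName = regionName;
    }

    /** Parses the "EVENT:region" format that is assumed only for this sketch. */
    static TransitionDataSketch fromBytes(byte[] raw) {
        String[] parts = new String(raw, StandardCharsets.UTF_8).split(":", 2);
        if (parts.length != 2) throw new IllegalArgumentException("malformed payload");
        return new TransitionDataSketch(parts[0], parts[1]);
    }
}

/** Hypothetical single-znode reader with a byte-level and a parsed accessor. */
final class ZnodeReaderSketch {
    private byte[] data;
    private int version;

    void write(byte[] newData) {
        data = newData.clone();
        version++;
    }

    /** Byte-oriented read: returns raw data (or null) and fills in the stat. */
    byte[] getRawData(StatSketch stat) {
        stat.version = version;
        return data == null ? null : data.clone();
    }

    /** Convenience read: parsed data and stat in one call, so callers need not convert bytes. */
    TransitionDataSketch getData(StatSketch stat) {
        byte[] raw = getRawData(stat);
        return raw == null ? null : TransitionDataSketch.fromBytes(raw);
    }
}
```

A caller can then take the version from the filled-in stat and hand it to a version-checked delete, without parsing the bytes itself.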


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2469
-----------------------------------------------------------


On 2011-10-08 05:13:32, ramkrishna vasudevan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-08 05:13:32)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.      https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.


                

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123381#comment-13123381 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2461
-----------------------------------------------------------

Ship it!


- Ted



                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is in OPENED state, but fails.
> -> Though the delete fails, it removes the region from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to process the open because the region has already been removed from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}
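
The sequence described in the issue can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the patch: the znode store and the master's in-memory maps are plain `HashMap`s, and the names `deleteIfInState` and `handleOpened` are hypothetical. The point it shows is that the in-memory bookkeeping is touched only when the znode delete actually succeeded.

```java
import java.util.HashMap;
import java.util.Map;

class OpenedHandlerSketch {
    enum State { OFFLINE, OPENING, OPENED }

    // Hypothetical stand-in for the unassigned znodes kept in ZooKeeper.
    static final Map<String, State> znodes = new HashMap<>();
    // Hypothetical stand-ins for the master's in-memory bookkeeping.
    static final Map<String, String> regionsInTransition = new HashMap<>();
    static final Map<String, String> regionOwners = new HashMap<>();

    /** Deletes the znode only if it is still in the expected state. */
    static boolean deleteIfInState(String region, State expected) {
        // Map.remove(key, value) removes the entry only when it maps to that value.
        return znodes.remove(region, expected);
    }

    /** Updates the in-memory view only after the znode delete succeeded. */
    static boolean handleOpened(String region, String server) {
        if (!deleteIfInState(region, State.OPENED)) {
            // Someone else moved the znode on; leave RIT and owners untouched.
            return false;
        }
        regionsInTransition.remove(region);
        regionOwners.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        // RS2 has already moved the znode to OPENING when the stale handler for RS1 runs.
        znodes.put("R1", State.OPENING);
        regionsInTransition.put("R1", "RS2");
        boolean handled = handleOpened("R1", "RS1");
        System.out.println("handled=" + handled
            + " rit=" + regionsInTransition + " owners=" + regionOwners);
    }
}
```

In this sketch the stale handler returns `false`, so R1 stays in transition for RS2 and RS1 is never recorded as its owner.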

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122123#comment-13122123 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2395
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5498>

    The two tests share a lot of the same code; some refactoring would be good.



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5497>

    You should be resetting the conf to what was created inside TEST_UTIL.


- Jean-Daniel


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.      https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.


                

        

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Attachment: HBASE-4540_90_1.patch

Added a try/finally block in TestOpenedRegionHandler.java.
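
The attached patch is not reproduced in this thread, so the following is only a generic sketch of why a try/finally block helps in a test like this: cleanup runs even when the check in the middle throws. `FakeCluster` and `runCheck` are hypothetical names used for illustration.

```java
class TryFinallySketch {
    // Hypothetical stand-in for a test resource such as a mini cluster.
    static class FakeCluster {
        boolean running;
        void start() { running = true; }
        void shutdown() { running = false; }
    }

    /** Runs a check against the cluster and always shuts it down afterwards. */
    static void runCheck(FakeCluster cluster, Runnable check) {
        cluster.start();
        try {
            check.run();
        } finally {
            // Executed whether or not the check threw.
            cluster.shutdown();
        }
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        try {
            runCheck(cluster, () -> {
                throw new IllegalStateException("simulated test failure");
            });
        } catch (IllegalStateException expected) {
            System.out.println("check failed, cluster running=" + cluster.running);
        }
    }
}
```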
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch, HBASE-4540_90_1.patch
>
>

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125873#comment-13125873 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

@Ted,
I verified by running the test cases in my IDE; they passed.  So I feel it is OK. :)

                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch
>
>

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123569#comment-13123569 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2469
-----------------------------------------------------------

Ship it!


I'm good on commit.

Have some suggestions for future handler tests below.  I'm ok if we commit w/o addressing them here.

Nice fix Ram


http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
<https://reviews.apache.org/r/2251/#comment5578>

    We don't have this method already in our ZK* classes?



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5579>

    Do you have to spin up the cluster twice?   Could you do it once only in @BeforeClass and then shut it down in @AfterClass?  So it's run only once?
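
    The once-per-class lifecycle suggested here can be sketched without any test framework. This is a plain-Java illustration under stated assumptions, not the actual test: `FakeCluster`, `setUpOnce`, `tearDownOnce` and `runAll` are hypothetical names, and in JUnit 4 the two lifecycle methods would instead be static methods annotated with @BeforeClass and @AfterClass.

```java
class SharedFixtureSketch {
    // Hypothetical stand-in for an expensive fixture such as a mini cluster.
    static class FakeCluster {
        static int startCount = 0;
        void start() { startCount++; }
        void shutdown() { }
    }

    static FakeCluster cluster;

    // In JUnit 4 this would be a static method annotated with @BeforeClass.
    static void setUpOnce() {
        cluster = new FakeCluster();
        cluster.start();
    }

    // In JUnit 4 this would be a static method annotated with @AfterClass.
    static void tearDownOnce() {
        cluster.shutdown();
        cluster = null;
    }

    static void testOne() {
        if (cluster == null) throw new IllegalStateException("cluster not started");
    }

    static void testTwo() {
        if (cluster == null) throw new IllegalStateException("cluster not started");
    }

    /** Runs both tests against one shared cluster; returns how often it was started. */
    static int runAll() {
        int before = FakeCluster.startCount;
        setUpOnce();
        try {
            testOne();
            testTwo();
        } finally {
            tearDownOnce();
        }
        return FakeCluster.startCount - before;
    }

    public static void main(String[] args) {
        System.out.println("cluster starts: " + runAll());
    }
}
```

    Both tests see the same started cluster, and the start cost is paid once per class rather than once per test.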



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5580>

    Good test.
    
    Would it be possible to test the handler without spinning up the cluster?  See TestOpenRegionHandler over under regionserver.handler in tests -- they don't spin up a cluster, just zk.  Test can run faster if no dfs+hbase.  Not important.  For the future.


- Michael




                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122957#comment-13122957 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2431
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5524>

    Space between } and catch, please.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
<https://reviews.apache.org/r/2251/#comment5525>

    Should we expose this constant as public?
    How about naming this constant DONT_COMPARE_VERSION or NO_VERSION_COMPARISON?
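
    For context on what such a constant would stand for: ZooKeeper's own delete call takes an expected version, and a value of -1 matches any version. The sketch below imitates that behaviour with an in-memory map. It is an assumption-laden illustration, not the ZKAssign code under review; the class, the `store` map and the helper methods are hypothetical, and the constant name is borrowed from one of the suggestions above.

```java
import java.util.HashMap;
import java.util.Map;

class VersionedDeleteSketch {
    // Hypothetical constant; -1 mirrors ZooKeeper's "match any version" value.
    static final int NO_VERSION_COMPARISON = -1;

    static final class Node {
        final String data;
        final int version;
        Node(String data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    // Hypothetical in-memory stand-in for the znode tree.
    static final Map<String, Node> store = new HashMap<>();

    /** Creates the node at version 0, or bumps the version on every update. */
    static void setData(String path, String data) {
        Node old = store.get(path);
        int next = (old == null) ? 0 : old.version + 1;
        store.put(path, new Node(data, next));
    }

    /** Deletes the node if it exists and the expected version matches (or is not compared). */
    static boolean delete(String path, int expectedVersion) {
        Node node = store.get(path);
        if (node == null) {
            return false;
        }
        if (expectedVersion != NO_VERSION_COMPARISON && node.version != expectedVersion) {
            return false; // the node changed since the caller last read it
        }
        store.remove(path);
        return true;
    }

    public static void main(String[] args) {
        setData("/unassigned/R1", "OPENED");
        int seen = store.get("/unassigned/R1").version; // version read by the handler
        setData("/unassigned/R1", "OFFLINE");           // concurrent reassignment bumps it
        System.out.println(delete("/unassigned/R1", seen));
        System.out.println(delete("/unassigned/R1", NO_VERSION_COMPARISON));
    }
}
```

    The first delete is rejected because the node changed after it was read; the second succeeds because no version is compared.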


- Ted


On 2011-10-07 16:13:33, ramkrishna vasudevan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-07 16:13:33)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.      https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.


                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123658#comment-13123658 ] 

Hudson commented on HBASE-4540:
-------------------------------

Integrated in HBase-0.92 #54 (See [https://builds.apache.org/job/HBase-0.92/54/])
    HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation it is performing.  Also addresses HBASE-4539. (Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java

                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is in OPENED state, but fails.
> -> Though the delete fails, it removes the node from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open the region because the region has already been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}
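
The sequence above can be illustrated with a small in-memory model. This is only a sketch of the general idea (touch the master's in-memory state only if the state-checked znode delete succeeded); it is not the attached patch. The class, field, and method names are hypothetical, and plain maps stand in for ZooKeeper and for the master's RIT and assignment maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; none of these names are HBase APIs.
class OpenedRegionSketch {
    // Stand-in for the unassigned znodes: region -> current znode state.
    final Map<String, String> znodeState = new HashMap<>();
    // Stand-in for the master's regions-in-transition (RIT) map: region -> target server.
    final Map<String, String> regionsInTransition = new HashMap<>();
    // Stand-in for the master's in-memory region -> owning server map.
    final Map<String, String> assignments = new HashMap<>();

    // Deletes the znode only if it is still in the expected state.
    boolean deleteIfInState(String region, String expectedState) {
        return znodeState.remove(region, expectedState);
    }

    // Updates the master's memory only after the znode delete succeeded.
    boolean handleOpened(String region, String server) {
        if (!deleteIfInState(region, "OPENED")) {
            // The znode was moved on (for example to OFFLINE or OPENING) by a
            // reassignment, so leave RIT and the assignments untouched.
            return false;
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        OpenedRegionSketch master = new OpenedRegionSketch();
        // R1 has been reassigned to RS2, which has already moved the znode to OPENING.
        master.znodeState.put("R1", "OPENING");
        master.regionsInTransition.put("R1", "RS2");
        // The stale handler for RS1 runs late and must not change anything.
        System.out.println("stale handler applied: " + master.handleOpened("R1", "RS1"));
        System.out.println("R1 still in RIT: " + master.regionsInTransition.containsKey("R1"));
    }
}
```

In this model the stale handler for RS1 returns without removing R1 from RIT, so the later OPENING and OPENED transitions from RS2 still find the region in the expected state.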


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121407#comment-13121407 ] 

Ted Yu commented on HBASE-4540:
-------------------------------

For ZKAssign.deleteNode(), the following can be removed:
{code}
    LOG.debug(zkw.prefix("Successfully deleted unassigned node for region " +
        regionName + " in expected state " + expectedState));
{code}
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122842#comment-13122842 ] 

Ted Yu commented on HBASE-4540:
-------------------------------

Regarding Ram's comment @ 07/Oct/11 07:22:
Since -1 is a possible return value from ZKAssign methods, I think we should use a different value, such as -2.
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123637#comment-13123637 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

Integrated to 0.92 and trunk.
Thanks for the reviews, Ted, Gray, J-D and Stack.
Created HBASE-4558 to address Stack's comment about refactoring the test cases so that they run without starting a cluster.
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122860#comment-13122860 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

@Ted
I uploaded the patch just before you posted this comment. In that patch I had used -1.
If we are going to use -2 or some other negative value, I would add something like this to the javadoc:
   * @param expectedVersion of the znode that is to be deleted.
   *        If expectedVersion need not be compared while deleting the znode
   *        pass -2(NEGATIVE_VERSION)
Is that OK, Ted?
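
A small sketch of why the sentinel should differ from -1, assuming (as noted above) that -1 can come back from ZKAssign methods as a failure value. Everything here is a hypothetical in-memory model rather than the ZKAssign or ZKUtil API: NEGATIVE_VERSION follows the name in the proposed javadoc, while TRANSITION_FAILED, transition() and delete() are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of version-checked znode deletion.
class VersionedDeleteSketch {
    // Sentinel from the proposed javadoc: skip the version comparison.
    static final int NEGATIVE_VERSION = -2;
    // Assumed failure value returned by a transition that did not happen.
    static final int TRANSITION_FAILED = -1;

    // Stand-in for ZooKeeper: region -> current znode version.
    final Map<String, Integer> znodeVersion = new HashMap<>();

    // Bumps the version if expectedVersion matches; returns the new version
    // or TRANSITION_FAILED.
    int transition(String region, int expectedVersion) {
        Integer current = znodeVersion.get(region);
        if (current == null || current != expectedVersion) {
            return TRANSITION_FAILED;
        }
        znodeVersion.put(region, current + 1);
        return current + 1;
    }

    // Deletes the znode if the version matches, or unconditionally when the
    // caller passes NEGATIVE_VERSION.
    boolean delete(String region, int expectedVersion) {
        Integer current = znodeVersion.get(region);
        if (current == null) {
            return false;
        }
        if (expectedVersion != NEGATIVE_VERSION && current != expectedVersion) {
            return false;
        }
        znodeVersion.remove(region);
        return true;
    }

    public static void main(String[] args) {
        VersionedDeleteSketch zk = new VersionedDeleteSketch();
        zk.znodeVersion.put("R1", 3);
        int version = zk.transition("R1", 0); // fails, returns TRANSITION_FAILED
        // Passing the failure value on must not delete the znode.
        System.out.println("deleted with failed version: " + zk.delete("R1", version));
        System.out.println("deleted with sentinel: " + zk.delete("R1", NEGATIVE_VERSION));
    }
}
```

If the "skip the check" sentinel were also -1, a failure value passed straight into delete() would silently turn into an unconditional delete. Keeping the two values distinct avoids that.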
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125859#comment-13125859 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

Test cases are passing.
TestScanner and TestCatalogTrackerOnCluster had failures, which were unrelated to this change.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is in OPENED state, but fails.
> -> Though the delete fails, it removes the node from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open the region because the region has already been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Jonathan Gray (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121384#comment-13121384 ] 

Jonathan Gray commented on HBASE-4540:
--------------------------------------

Looks pretty good.  Once you get the unit tests passing, want to put it up on RB?

Also, it'd be really good if you could start thinking about how to mock these scenarios better in our unit tests.  You are finding lots of great bugs but without tests it will be hard to prevent regressions.
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121398#comment-13121398 ] 

Jean-Daniel Cryans commented on HBASE-4540:
-------------------------------------------

This reminds me of HBASE-4416.
                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of region R1, which was opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns region R1 to RS2.
> -> The znode of R1 is moved to the OFFLINE state by the master, or to the OPENING state by RS2 if RS2 has started opening the region.
> -> The first OpenedRegionHandler now tries to delete the znode, assuming it is still in the OPENED state, and the delete fails.
> -> Although the delete fails, the handler removes the region from RIT and records RS1 as the owner of R1 in the master's memory.
> -> When RS2 finishes opening the region, the master cannot complete the open because the region has already been removed from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}
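
The failure sequence quoted above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions: the class OpenedRegionSketch, its maps, and its method names are hypothetical and are not HBase APIs, and the guarded variant is one possible way to avoid the problem, not necessarily what the attached patch does. It shows why ignoring the result of the znode delete leaves the master's in-memory state wrong, and how checking the result first keeps RIT intact for RS2.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the scenario; none of these names are HBase APIs.
class OpenedRegionSketch {
    // Stand-in for the unassigned znodes: region name -> current state.
    final Map<String, String> znodes = new HashMap<>();
    // Stand-in for the master's regions-in-transition (RIT) map.
    final Map<String, String> regionsInTransition = new HashMap<>();
    // Stand-in for the master's in-memory region -> server assignments.
    final Map<String, String> assignments = new HashMap<>();

    // Deletes the znode only if it exists and is in the expected state.
    boolean deleteNodeInState(String region, String expectedState) {
        String actual = znodes.get(region);
        if (actual == null || !actual.equals(expectedState)) {
            return false;
        }
        znodes.remove(region);
        return true;
    }

    // Behaviour described in the report: the delete result is ignored,
    // so RIT and the assignment are updated even when the delete failed.
    void handleOpenedIgnoringResult(String region, String server) {
        deleteNodeInState(region, "OPENED");
        regionsInTransition.remove(region);
        assignments.put(region, server);
    }

    // Guarded variant: in-memory state changes only if the delete succeeded.
    boolean handleOpenedGuarded(String region, String server) {
        if (!deleteNodeInState(region, "OPENED")) {
            return false;
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        OpenedRegionSketch master = new OpenedRegionSketch();
        // RS2 has already moved the znode of R1 to OPENING.
        master.znodes.put("R1", "OPENING");
        master.regionsInTransition.put("R1", "PENDING_OPEN");
        // The stale handler for RS1 runs late.
        boolean applied = master.handleOpenedGuarded("R1", "RS1");
        System.out.println("stale handler applied: " + applied);
        System.out.println("R1 still in RIT: " + master.regionsInTransition.containsKey("R1"));
    }
}
```

In a real master the check and the in-memory update would also have to be safe against concurrent handlers; this model deliberately leaves that out.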


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122585#comment-13122585 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2425
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
<https://reviews.apache.org/r/2251/#comment5519>

    Yes, Ted.  I was also thinking of unifying the two deleteNode() APIs.
    The question is what to pass for expectedVersion when the version need not be checked.  Can we pass -1, and skip the version check whenever expectedVersion is -1?


- ramkrishna
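
The idea in the comment above, a single deleteNode() that skips the version check when a sentinel is passed, could look roughly like the sketch below. It is an in-memory model under stated assumptions: VersionedDeleteSketch, Node, and ANY_VERSION are hypothetical names, and the real ZKAssign signatures may differ. The choice of -1 mirrors ZooKeeper's own delete(path, version) call, where -1 matches any node version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; not the actual ZKAssign or ZKUtil code.
class VersionedDeleteSketch {
    // Sentinel meaning "do not check the version".
    static final int ANY_VERSION = -1;

    // Minimal stand-in for a znode: a state plus a version counter.
    static final class Node {
        final String state;
        final int version;

        Node(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    void put(String path, String state, int version) {
        nodes.put(path, new Node(state, version));
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    // Unified delete: checks the state always, and the version only
    // when the caller passes something other than ANY_VERSION.
    boolean deleteNode(String path, String expectedState, int expectedVersion) {
        Node node = nodes.get(path);
        if (node == null || !node.state.equals(expectedState)) {
            return false;
        }
        if (expectedVersion != ANY_VERSION && node.version != expectedVersion) {
            return false;
        }
        nodes.remove(path);
        return true;
    }

    // Convenience overload for callers that do not track a version.
    boolean deleteNode(String path, String expectedState) {
        return deleteNode(path, expectedState, ANY_VERSION);
    }

    public static void main(String[] args) {
        VersionedDeleteSketch zk = new VersionedDeleteSketch();
        zk.put("R1", "OPENED", 3);
        System.out.println("stale version: " + zk.deleteNode("R1", "OPENED", 2));
        System.out.println("any version:   " + zk.deleteNode("R1", "OPENED"));
    }
}
```

A looser variant would treat any negative expectedVersion as "skip the check" by testing expectedVersion < 0 instead of comparing against a single constant.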


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.      https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.


                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122829#comment-13122829 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
-----------------------------------------------------------

(Updated 2011-10-07 14:27:20.231903)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
-------

@Ted - I have kept the following log statement so that it can be used for debugging:
LOG.debug(zkw.prefix("Successfully deleted unassigned node for region " +
        regionName + " in expected state " + expectedState));
Refactored the testcase and made it much simpler so that it takes less time to run.


Summary
-------

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
    https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 

Diff: https://reviews.apache.org/r/2251/diff


Testing
-------

Yes


Thanks,

ramkrishna


                

        

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Fix Version/s: 0.92.0
    

        

[jira] [Reopened] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reopened HBASE-4540:
-------------------------------------------


Reopening the issue to backport the fix so that it is also available in the 0.90 branch.
                

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122862#comment-13122862 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

Would it be better to document this as "anything less than some value" (maybe 0 or -1), instead of going with one specific value?
                

        

[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4540:
------------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
    

        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122147#comment-13122147 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2399
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5504>

    Can this debug log be folded into the one above?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5505>

    Remove this extra line.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5506>

    'for transition ZK node' seems redundant.


- Ted


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.      https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.


                
> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is in OPENED state, but fails.
> -> Though it fails, it removes the node from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open the region, as the region has already been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in  the state null and not in expected PENDING_OPEN or OPENING states
> {code}


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122992#comment-13122992 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

If a node exists, its version starts from 0.
Thanks, Ted, for the confirmation. I will wait one day for further reviews and make changes accordingly; if there are none, I will take the patch attached at 07/Oct/11 14:27.
I will take care of the space between catch and }.
                


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122156#comment-13122156 ] 

jiraposter@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2400
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
<https://reviews.apache.org/r/2251/#comment5510>

    This new method is similar to deleteNode() above.
    Maybe we should retrofit the existing deleteNode() by adding an expectedVersion parameter?
    We can designate some negative constant to signify that the version check should be skipped.
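
The suggestion above can be pictured with a minimal sketch. It is only an illustration, not the patch under review: a single delete method takes an expected version, and a negative sentinel skips the check. ZooKeeper's own delete(path, version) treats -1 as "match any version", which the sentinel mirrors. The names VersionedNodes, ANY_VERSION, create and setData are hypothetical and are not part of ZKAssign or ZKUtil; the znode store is an in-memory stand-in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of versioned znodes; not an HBase or ZooKeeper API.
class VersionedNodes {
  // Sentinel meaning "skip the version check" (mirrors ZooKeeper's -1).
  static final int ANY_VERSION = -1;

  // path -> current data version; a newly created node starts at version 0.
  private final Map<String, Integer> versions = new HashMap<>();

  void create(String path) {
    versions.put(path, 0);
  }

  // Every data update bumps the version by one; a missing node is left alone.
  void setData(String path) {
    versions.computeIfPresent(path, (p, v) -> v + 1);
  }

  // Deletes the node only if it exists and, unless ANY_VERSION is passed,
  // its current version equals expectedVersion. Returns true if deleted.
  boolean deleteNode(String path, int expectedVersion) {
    Integer current = versions.get(path);
    if (current == null) {
      return false;
    }
    if (expectedVersion != ANY_VERSION && current != expectedVersion) {
      return false;
    }
    versions.remove(path);
    return true;
  }

  // Callers that do not care about the version delegate with the sentinel.
  boolean deleteNode(String path) {
    return deleteNode(path, ANY_VERSION);
  }

  public static void main(String[] args) {
    VersionedNodes zk = new VersionedNodes();
    zk.create("/unassigned/R1");   // version 0
    zk.setData("/unassigned/R1");  // another actor changed the node: version 1
    System.out.println("delete with stale version 0: " + zk.deleteNode("/unassigned/R1", 0));
    System.out.println("delete with ANY_VERSION: " + zk.deleteNode("/unassigned/R1"));
  }
}
```

With this shape, a caller that remembered the version it last saw cannot delete a node that someone else has modified in the meantime, while callers that pass the sentinel behave as before.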


- Ted




                


        

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120776#comment-13120776 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

{code}
    try {
      ZKAssign.deleteOpenedNode(server.getZooKeeper(),
          regionInfo.getEncodedName());
    } catch (KeeperException e) {
      server.abort("Error deleting OPENED node in ZK for transition ZK node ("
        + regionInfo.getRegionNameAsString() + ")", e);
    }
{code}
The return value of deleteOpenedNode is not being used. Using it may solve the problem.
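
A minimal sketch of that idea follows. It is not the actual fix: OpenedHandlerSketch and all of its fields and methods are hypothetical in-memory stand-ins for the master's bookkeeping, and the sketch assumes the value returned by deleteOpenedNode is a boolean success flag. The point is only that the handler leaves RIT and the assignment map alone when the delete did not happen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the master's state; none of these names are HBase APIs.
class OpenedHandlerSketch {
  // region -> state of its unassigned znode ("OPENED", "OPENING", ...)
  final Map<String, String> znodes = new HashMap<>();
  // region -> target server, for regions in transition (RIT)
  final Map<String, String> regionsInTransition = new HashMap<>();
  // region -> server that the master believes is hosting the region
  final Map<String, String> assignments = new HashMap<>();

  // Stand-in for ZKAssign.deleteOpenedNode: deletes the znode only if it is
  // still in OPENED state and reports whether the delete happened.
  boolean deleteOpenedNode(String region) {
    if ("OPENED".equals(znodes.get(region))) {
      znodes.remove(region);
      return true;
    }
    return false;
  }

  // Updates in-memory state only when the znode delete succeeded.
  boolean handleOpened(String region, String server) {
    if (!deleteOpenedNode(region)) {
      // The znode is in another state (for example after a reassignment),
      // so RIT and the assignment map are left untouched.
      return false;
    }
    regionsInTransition.remove(region);
    assignments.put(region, server);
    return true;
  }

  public static void main(String[] args) {
    OpenedHandlerSketch master = new OpenedHandlerSketch();
    // R1 is being reopened on RS2 after RS1 went down.
    master.znodes.put("R1", "OPENING");
    master.regionsInTransition.put("R1", "RS2");
    // The stale handler for the original open on RS1 runs late.
    boolean applied = master.handleOpened("R1", "RS1");
    System.out.println("stale handler applied: " + applied);
    System.out.println("R1 still in RIT: " + master.regionsInTransition.containsKey("R1"));
  }
}
```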
                
