Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/04/29 02:10:03 UTC
[jira] [Created] (HBASE-3829) TestMasterFailover failures in jenkins
TestMasterFailover failures in jenkins
--------------------------------------
Key: HBASE-3829
URL: https://issues.apache.org/jira/browse/HBASE-3829
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Attachments: 3829.patch
We occasionally fail the TestMasterFailover tests up on jenkins. One reason for the 180000 timeouts is that the test completed but a regionserver won't go down because it's stuck over in getMaster. Looking into it, we have all these loops in the regionserver: there is the main run loop, but then there are loops trying to send the regionserver reportForDuty, and more over in the regionserver report method. In a recent failure up on jenkins we were stuck in one of these outer loops trying to get the master.
This patch removes a bunch of these extra loops, instead having the looping done by the HRegionServer#run loop.
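
To illustrate the idea, here is a minimal sketch of a server with a single run loop that makes one connection attempt per pass and re-checks its stop flag every time around. It is not the code from 3829.patch: the class SingleLoopServer, its tryGetMaster callback, and the helper stopsWhileMasterUnreachable are all hypothetical names made up for this example, and the real HRegionServer does far more work per pass.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: class and method names are illustrative and are not
// taken from HRegionServer or from 3829.patch.
class SingleLoopServer implements Runnable {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final BooleanSupplier tryGetMaster; // one attempt; true on success
    private final long pauseMillis;

    SingleLoopServer(BooleanSupplier tryGetMaster, long pauseMillis) {
        this.tryGetMaster = tryGetMaster;
        this.pauseMillis = pauseMillis;
    }

    void stop() {
        stopped.set(true);
    }

    @Override
    public void run() {
        boolean haveMaster = false;
        // The only loop: every pass re-checks the stop flag, so a stop
        // request is seen even while the master is unreachable.
        while (!stopped.get()) {
            if (!haveMaster) {
                // Single attempt per pass; there is no inner retry loop here.
                haveMaster = tryGetMaster.getAsBoolean();
            }
            // Regular per-pass work (such as reporting to the master) would go here.
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Starts a server whose master is never reachable, asks it to stop,
    // and reports whether its thread exited within the join timeout.
    static boolean stopsWhileMasterUnreachable() {
        SingleLoopServer server = new SingleLoopServer(() -> false, 5);
        Thread t = new Thread(server);
        t.start();
        try {
            Thread.sleep(50);
            server.stop();
            t.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```

With an inner "retry until the master answers" loop that never looks at the stop flag, the same scenario could hang until the test timeout; here the stop request is honoured on the next pass.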
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3829) TestMasterFailover failures in jenkins
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3829:
-------------------------
Fix Version/s: 0.92.0
Status: Patch Available (was: Open)
This should help, but I think there is another failure type in TestMasterFailover yet to nail down.
[jira] [Updated] (HBASE-3829) TestMasterFailover failures in jenkins
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3829:
-------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
The applied patch seems to have taken care of the failures. Will open a new issue if this comes up again.
[jira] [Commented] (HBASE-3829) TestMasterFailover failures in jenkins
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027266#comment-13027266 ]
Hudson commented on HBASE-3829:
-------------------------------
Integrated in HBase-TRUNK #1888 (See [https://builds.apache.org/hudson/job/HBase-TRUNK/1888/])
[jira] [Updated] (HBASE-3829) TestMasterFailover failures in jenkins
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3829:
-------------------------
Attachment: 3829.patch