Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2022/06/11 23:14:00 UTC
[jira] [Resolved] (HBASE-20992) MTTR, Chaos, and ITBLL
[ https://issues.apache.org/jira/browse/HBASE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell resolved HBASE-20992.
-----------------------------------------
Resolution: Later
> MTTR, Chaos, and ITBLL
> ----------------------
>
> Key: HBASE-20992
> URL: https://issues.apache.org/jira/browse/HBASE-20992
> Project: HBase
> Issue Type: Sub-task
> Components: integration tests, MTTR
> Reporter: Michael Stack
> Priority: Major
>
> I've been having trouble getting a sustained, large ITBLL run to complete over the last few days. I'm seeing a bunch of the below:
> * A region splits or is moved
> * Chaos kills the Master in the middle of the Split or Move Procedure after a Region has been offlined
> * The Master takes a while to come back: it is not restarted until a couple of minutes have passed, and then there is some recovery to be done.
> So a region can be offline for minutes. By default we retry up to 16 times, which adds up to about 2.5 minutes before we give up.
> So, I can up the retries when running larger tests, but the region should also come back online faster.
> Let me hang ITBLL fixes/notes off here.
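
The "16 retries, about 2.5 minutes" figure in the description can be roughly reproduced with a small calculation. This is only a sketch under assumptions that are not stated in the message: the multiplier table below is the client backoff table as I recall it from HBase's HConstants.RETRY_BACKOFF, the base pause is the usual 100 ms default of hbase.client.pause, and jitter is ignored. The function names are hypothetical.

```python
# Assumptions (not from the message above):
# - RETRY_BACKOFF mirrors the multiplier table in HBase's HConstants, as recalled.
# - PAUSE_MS is the usual default of hbase.client.pause (100 ms).
# - The small random jitter the client adds is ignored.
RETRY_BACKOFF = [1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200]
PAUSE_MS = 100


def pause_ms(tries, pause=PAUSE_MS):
    """Sleep before the next attempt, given the zero-based number of tries so far.

    Past the end of the table the last multiplier is reused.
    """
    index = min(tries, len(RETRY_BACKOFF) - 1)
    return pause * RETRY_BACKOFF[index]


def total_wait_seconds(attempts, pause=PAUSE_MS):
    """Total time spent sleeping across the given number of attempts."""
    return sum(pause_ms(t, pause) for t in range(attempts)) / 1000.0


if __name__ == "__main__":
    # Compare the default budget with larger, hypothetical retry counts.
    for attempts in (16, 31, 61):
        seconds = total_wait_seconds(attempts)
        print(f"{attempts} attempts: {seconds:.1f} s (~{seconds / 60:.1f} min)")
```

Under these assumptions 16 attempts add up to roughly 148 seconds, in line with the 2.5 minutes quoted above. Because the multiplier is capped, each extra attempt beyond the table buys only another 20 seconds, so riding out a Master outage of several minutes by raising hbase.client.retries.number alone needs a noticeably larger value.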