Posted to notifications@asterixdb.apache.org by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu> on 2021/11/10 19:30:04 UTC
Change in asterixdb[master]: [NO ISSUE][CLUS] Interrupt global recovery on node failure
From Murtadha Hubail <mh...@apache.org>:
Murtadha Hubail has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14025 )
Change subject: [NO ISSUE][CLUS] Interrupt global recovery on node failure
......................................................................
[NO ISSUE][CLUS] Interrupt global recovery on node failure
- user model changes: no
- storage format changes: no
- interface changes: no
Details:
- When a node fails while global recovery is ongoing, interrupt
recovery to avoid unnecessary waiting.
Change-Id: I58852e046ff4021f4c5d115f5c3488b249fc61a2
---
M asterixdb/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/GlobalRecoveryManager.java
1 file changed, 12 insertions(+), 1 deletion(-)
git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/25/14025/1
diff --git a/asterixdb/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/GlobalRecoveryManager.java b/asterixdb/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/GlobalRecoveryManager.java
index 9438b16..e6ef8df 100644
--- a/asterixdb/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/GlobalRecoveryManager.java
+++ b/asterixdb/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/GlobalRecoveryManager.java
@@ -23,6 +23,7 @@
import java.util.Collections;
import java.util.List;
import java.util.Set;
+import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.apache.asterix.app.message.StorageCleanupRequestMessage;
@@ -64,6 +65,7 @@
protected final IHyracksClientConnection hcc;
protected volatile boolean recoveryCompleted;
protected volatile boolean recovering;
+ protected Future<?> recoveryFuture;
public GlobalRecoveryManager(ICCServiceContext serviceCtx, IHyracksClientConnection hcc,
IStorageComponentProvider componentProvider) {
@@ -98,7 +100,7 @@
* Perform recovery on a different thread to avoid deadlocks in
* {@link org.apache.asterix.common.cluster.IClusterStateManager}
*/
- serviceCtx.getControllerService().getExecutor().submit(() -> {
+ recoveryFuture = serviceCtx.getControllerService().getExecutor().submit(() -> {
try {
recover(appCtx);
} catch (Throwable e) {
@@ -127,6 +129,9 @@
MetadataManager.INSTANCE.commitTransaction(mdTxnCtx);
recoveryCompleted = true;
recovering = false;
+ synchronized (this) {
+ recoveryFuture = null;
+ }
LOGGER.info("Global Recovery Completed. Refreshing cluster state...");
appCtx.getClusterStateManager().refreshState();
}
@@ -166,6 +171,12 @@
@Override
public void notifyStateChange(ClusterState newState) {
+ synchronized (this) {
+ if (recovering && newState == ClusterState.UNUSABLE && recoveryFuture != null) {
+ // interrupt the recovery attempt since cluster became unusable during global recovery
+ recoveryFuture.cancel(true);
+ }
+ }
if (newState != ClusterState.ACTIVE && newState != ClusterState.RECOVERING) {
recoveryCompleted = false;
}
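For readers unfamiliar with the pattern in this diff, here is a minimal, standalone sketch of it: keep the Future returned by submit(), and call cancel(true) on it when the cluster becomes unusable so that a blocked recovery thread is interrupted. This is not the AsterixDB code. The class name, the State enum, the latches, and the 10-minute sleep standing in for the recovery work are all assumptions made for illustration; only the java.util.concurrent calls are real APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a recovery manager; not the real GlobalRecoveryManager.
class InterruptibleRecoverySketch {
    // Hypothetical cluster states, loosely mirroring the names used in the diff.
    enum State { ACTIVE, RECOVERING, UNUSABLE }

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final CountDownLatch started = new CountDownLatch(1);
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean recovering;
    private volatile boolean interrupted;
    private Future<?> recoveryFuture; // guarded by "this"

    // Run "recovery" on another thread and remember its Future.
    synchronized void startRecovery() {
        recovering = true;
        recoveryFuture = executor.submit(() -> {
            started.countDown();
            try {
                // Stand-in for a long, blocking recovery step.
                TimeUnit.MINUTES.sleep(10);
            } catch (InterruptedException e) {
                // cancel(true) interrupted the worker thread.
                interrupted = true;
            } finally {
                recovering = false;
                finished.countDown();
            }
        });
    }

    // Same shape as the check added in the diff: cancel only while recovering.
    void notifyStateChange(State newState) {
        synchronized (this) {
            if (recovering && newState == State.UNUSABLE && recoveryFuture != null) {
                recoveryFuture.cancel(true);
            }
        }
    }

    // Returns true if the blocked task ended promptly because it was interrupted.
    static boolean demo() {
        InterruptibleRecoverySketch sketch = new InterruptibleRecoverySketch();
        try {
            sketch.startRecovery();
            sketch.started.await();
            sketch.notifyStateChange(State.UNUSABLE);
            boolean done = sketch.finished.await(5, TimeUnit.SECONDS);
            return done && sketch.interrupted;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            sketch.executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("recovery interrupted: " + demo());
    }
}
```

Note that cancel(true) only delivers an interrupt. The task stops early only if its work responds to interruption, for example by blocking in a call that throws InterruptedException or by checking the thread's interrupt flag.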
--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14025
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I58852e046ff4021f4c5d115f5c3488b249fc61a2
Gerrit-Change-Number: 14025
Gerrit-PatchSet: 1
Gerrit-Owner: Murtadha Hubail <mh...@apache.org>
Gerrit-MessageType: newchange
Change in asterixdb[master]: [NO ISSUE][CLUS] Interrupt global recovery on node failure
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Murtadha Hubail <mh...@apache.org>:
Murtadha Hubail has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14025 )
Change subject: [NO ISSUE][CLUS] Interrupt global recovery on node failure
......................................................................
Patch Set 1: Code-Review+1
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I58852e046ff4021f4c5d115f5c3488b249fc61a2
Gerrit-Change-Number: 14025
Gerrit-PatchSet: 1
Gerrit-Owner: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Murtadha Hubail <mh...@apache.org>
Gerrit-Comment-Date: Thu, 11 Nov 2021 14:50:56 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][CLUS] Interrupt global recovery on node failure
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Ali Alsuliman <al...@gmail.com>:
Ali Alsuliman has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14025 )
Change subject: [NO ISSUE][CLUS] Interrupt global recovery on node failure
......................................................................
Patch Set 1: Code-Review+2
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I58852e046ff4021f4c5d115f5c3488b249fc61a2
Gerrit-Change-Number: 14025
Gerrit-PatchSet: 1
Gerrit-Owner: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Murtadha Hubail <mh...@apache.org>
Gerrit-Comment-Date: Thu, 11 Nov 2021 20:53:25 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][CLUS] Interrupt global recovery on node failure
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Jenkins <je...@fulliautomatix.ics.uci.edu>:
Jenkins has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14025 )
Change subject: [NO ISSUE][CLUS] Interrupt global recovery on node failure
......................................................................
Patch Set 1: Integration-Tests+1
Integration Tests Successful
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/12696/ : SUCCESS
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I58852e046ff4021f4c5d115f5c3488b249fc61a2
Gerrit-Change-Number: 14025
Gerrit-PatchSet: 1
Gerrit-Owner: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-Comment-Date: Wed, 10 Nov 2021 21:36:38 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment