Posted to notifications@asterixdb.apache.org by "Michael Blow (Code Review)" <do...@asterixdb.incubator.apache.org> on 2019/03/13 21:14:46 UTC

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Michael Blow has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/3274

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................

[NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Don't halt on interrupt while waiting for aborted CC tasks to complete,
or on interrupt while notifying the CC that those tasks have completed

Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
---
M hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java
M hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java
2 files changed, 36 insertions(+), 28 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/74/3274/1

diff --git a/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java b/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java
index 0f36c80..9e090f2 100644
--- a/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java
+++ b/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java
@@ -40,7 +40,7 @@
     private final CcId ccId;
     private final Deque<Task> runningTasks;
 
-    public EnsureAllCcTasksCompleted(NodeControllerService ncs, CcId ccId, Deque<Task> runningTasks) {
+    EnsureAllCcTasksCompleted(NodeControllerService ncs, CcId ccId, Deque<Task> runningTasks) {
         this.ncs = ncs;
         this.ccId = ccId;
         this.runningTasks = runningTasks;
@@ -48,40 +48,47 @@
 
     @Override
     public void run() {
+        LOGGER.info("Ensuring all tasks of CC {} have completed", ccId);
         try {
-            LOGGER.info("Ensuring all tasks of CC {} have completed", ccId);
-            final Span maxWaitTime = Span.start(2, TimeUnit.MINUTES);
-            while (!maxWaitTime.elapsed()) {
-                removeCompleted();
-                if (runningTasks.isEmpty()) {
-                    break;
-                }
-                LOGGER.info("{} tasks are still running", runningTasks.size());
-                TimeUnit.SECONDS.sleep(1); // Check once a second
-            }
+            waitForTaskCompletion();
+        } catch (InterruptedException e) {
+            LOGGER.info("interrupted waiting for CC tasks to complete; giving up");
+            Thread.currentThread().interrupt();
+        }
+    }
+
+    private void waitForTaskCompletion() throws InterruptedException {
+        final Span maxWaitTime = Span.start(TIMEOUT, TimeUnit.MILLISECONDS);
+        while (!maxWaitTime.elapsed()) {
+            removeCompleted();
             if (runningTasks.isEmpty()) {
-                LOGGER.info("All tasks of CC {} have completed", ccId);
-                ncs.notifyTasksCompleted(ccId);
-            } else {
-                LOGGER.error("{} tasks associated with CC {} failed to complete after {}ms. Giving up",
-                        runningTasks.size(), ccId, TIMEOUT);
-                logPendingTasks();
-                ExitUtil.halt(ExitUtil.EC_NC_FAILED_TO_ABORT_ALL_PREVIOUS_TASKS);
+                break;
             }
-        } catch (Throwable th) {
-            LOGGER.error("Failed to abort all previous tasks associated with CC {}", ccId, th);
+            LOGGER.info("{} tasks are still running", runningTasks.size());
+            TimeUnit.SECONDS.sleep(1); // Check once a second
+        }
+        removeCompleted();
+        if (runningTasks.isEmpty()) {
+            LOGGER.info("all tasks of CC {} have completed", ccId);
+            try {
+                ncs.notifyTasksCompleted(ccId);
+            } catch (InterruptedException e) {
+                LOGGER.info("interrupted during notifyTasksCompleted");
+                throw e;
+            } catch (Exception e) {
+                LOGGER.error("unexpected error during notifyTasksCompleted", e);
+                ExitUtil.halt(ExitUtil.EC_NC_FAILED_TO_NOTIFY_TASKS_COMPLETED);
+            }
+        } else {
+            LOGGER.error("{} tasks associated with CC {} failed to complete after {}ms. Giving up", runningTasks.size(),
+                    ccId, TIMEOUT);
+            logPendingTasks();
             ExitUtil.halt(ExitUtil.EC_NC_FAILED_TO_ABORT_ALL_PREVIOUS_TASKS);
         }
     }
 
     private void removeCompleted() {
-        final int numTasks = runningTasks.size();
-        for (int i = 0; i < numTasks; i++) {
-            Task task = runningTasks.poll();
-            if (!task.isCompleted()) {
-                runningTasks.add(task);
-            }
-        }
+        runningTasks.removeIf(Task::isCompleted);
     }
 
     private void logPendingTasks() {
@@ -89,7 +96,7 @@
             final List<Thread> pendingThreads = task.getPendingThreads();
             LOGGER.error("task {} was stuck. Stuck thread count = {}", task.getTaskAttemptId(), pendingThreads.size());
             for (Thread thread : pendingThreads) {
-                LOGGER.error("Stuck thread trace", ExceptionUtils.fromThreadStack(thread));
+                LOGGER.error("stuck thread trace", ExceptionUtils.fromThreadStack(thread));
             }
         }
     }
diff --git a/hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java b/hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java
index 52c8f55..680d55e 100644
--- a/hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java
+++ b/hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java
@@ -51,6 +51,7 @@
     public static final int EC_NETWORK_FAILURE = 16;
     public static final int EC_ACTIVE_SUSPEND_FAILURE = 17;
     public static final int EC_ACTIVE_RESUME_FAILURE = 18;
+    public static final int EC_NC_FAILED_TO_NOTIFY_TASKS_COMPLETED = 19;
     public static final int EC_FAILED_TO_CANCEL_ACTIVE_START_STOP = 22;
     public static final int EC_IMMEDIATE_HALT = 33;
     public static final int EC_HALT_ABNORMAL_RESERVED_44 = 44;
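
The interrupt handling introduced by this patch can be illustrated with a
small standalone sketch. This is not the Hyracks code: the class
InterruptAwareWait, the Outcome enum, and the await() signature are
hypothetical names made up for illustration, and BooleanSupplier stands in
for Task::isCompleted. It shows the same idea as the new run(): poll until a
deadline, and on InterruptedException restore the interrupt flag and give up
instead of halting the JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names exist in Hyracks.
final class InterruptAwareWait {

    enum Outcome { COMPLETED, TIMED_OUT, INTERRUPTED }

    // Polls until every supplier reports completion, the timeout elapses,
    // or the calling thread is interrupted.
    static Outcome await(Deque<BooleanSupplier> pending, long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (true) {
                // Drop everything that has finished since the last check.
                pending.removeIf(BooleanSupplier::getAsBoolean);
                if (pending.isEmpty()) {
                    return Outcome.COMPLETED;
                }
                if (System.nanoTime() - deadline >= 0) {
                    return Outcome.TIMED_OUT;
                }
                TimeUnit.MILLISECONDS.sleep(pollMillis);
            }
        } catch (InterruptedException e) {
            // Restore the flag so callers further up can see the interrupt,
            // then give up quietly; an interrupt is not treated as fatal.
            Thread.currentThread().interrupt();
            return Outcome.INTERRUPTED;
        }
    }

    public static void main(String[] args) {
        Deque<BooleanSupplier> pending = new ArrayDeque<>();
        pending.add(() -> true); // a task that has already completed
        System.out.println(await(pending, 1000, 10));
    }
}
```

Returning a distinct INTERRUPTED outcome lets the caller reserve a hard exit
for the timeout case only, which mirrors the split between the two catch
paths in the diff.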

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Anon. E. Moose (Code Review)" <do...@asterixdb.incubator.apache.org>.
Anon. E. Moose #1000171 has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Analytics Compatibility Compilation Successful
https://goo.gl/bbD8eU : SUCCESS

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-ssl-compression/168/ (13/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-spidersilk-tests/329/ (16/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-verify-txnlog/526/ (1/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-verify-asterix-app/5725/ (12/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1: Contrib+1

BAD Compatibility Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterixbad-compat/4062/ : SUCCESS

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/8130/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-source-format/5320/ (11/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Anon. E. Moose (Code Review)" <do...@asterixdb.incubator.apache.org>.
Anon. E. Moose #1000171 has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1: Contrib+1

Analytics Compatibility Tests Successful
https://goo.gl/k2ut7t : SUCCESS

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/hyracks-gerrit/5263/ (10/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-verify-no-installer-app/5560/ (15/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-asterix-app-openjdk11/725/ (14/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-asterix-app-sql-execution/5353/ (4/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-source-assemblies/5577/ (3/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-stabilization-f69489-compat/660/ (6/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Michael Blow (Code Review)" <do...@asterixdb.incubator.apache.org>.
Michael Blow has submitted this change and it was merged.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


[NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Don't halt on interrupt while waiting for aborted CC tasks to complete,
or on interrupt while notifying the CC that those tasks have completed

Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Reviewed-on: https://asterix-gerrit.ics.uci.edu/3274
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Contrib: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Till Westmann <ti...@apache.org>
---
M hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/EnsureAllCcTasksCompleted.java
M hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java
2 files changed, 36 insertions(+), 28 deletions(-)

Approvals:
  Anon. E. Moose #1000171: 
  Till Westmann: Looks good to me, approved
  Jenkins: Verified; No violations found; Verified
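
In this change, removeCompleted() drops the manual poll()/add() rotation in
favor of runningTasks.removeIf(Task::isCompleted). The sketch below is a
standalone comparison of the two approaches on a plain Deque. The class
RemoveCompletedSketch and both method names are hypothetical, and a generic
Predicate stands in for Task::isCompleted. Under those assumptions, both
versions keep the surviving elements in their original order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of the Hyracks code base.
final class RemoveCompletedSketch {

    // Old style: take each element off the head once and re-append it to
    // the tail if it is not done yet.
    static <T> void removeByRotation(Deque<T> tasks, Predicate<T> done) {
        final int numTasks = tasks.size();
        for (int i = 0; i < numTasks; i++) {
            T task = tasks.poll();
            if (!done.test(task)) {
                tasks.add(task);
            }
        }
    }

    // New style: let the collection filter itself in a single call.
    static <T> void removeWithRemoveIf(Deque<T> tasks, Predicate<T> done) {
        tasks.removeIf(done);
    }

    public static void main(String[] args) {
        Predicate<Integer> isEven = n -> n % 2 == 0;
        Deque<Integer> a = new ArrayDeque<>(Arrays.asList(1, 2, 3, 4, 5));
        Deque<Integer> b = new ArrayDeque<>(Arrays.asList(1, 2, 3, 4, 5));
        removeByRotation(a, isEven);
        removeWithRemoveIf(b, isEven);
        // Both deques should now hold the same elements in the same order.
        System.out.println(new ArrayList<>(a).equals(new ArrayList<>(b)));
    }
}
```

The removeIf form states the intent directly and avoids the bookkeeping of
snapshotting the size and cycling every element through the queue.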




-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-sonar/9309/ (8/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Till Westmann (Code Review)" <do...@asterixdb.incubator.apache.org>.
Till Westmann has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1: Code-Review+2

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/10841/ (9/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

BAD Compatibility Tests Started https://asterix-jenkins.ics.uci.edu/job/asterixbad-compat/4062/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-cancellation-test/5358/ (7/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/8130/ : SUCCESS

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Murtadha Hubail <mh...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-verify-storage/5934/ (5/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: [NO ISSUE][HYR] EnsureAllCcTasksCompleted failure handling
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-ensure-ancestor/3367/ (2/16)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/3274
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02819afcb80a0bcd645c3f79950c3fa12dba0274
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No