Posted to commits@cloudstack.apache.org by ro...@apache.org on 2018/05/16 10:05:09 UTC

[cloudstack] branch 4.11 updated: agent: Fixes #2633 don't wait for pending tasks on reconnection (#2638)

This is an automated email from the ASF dual-hosted git repository.

rohit pushed a commit to branch 4.11
in repository https://gitbox.apache.org/repos/asf/cloudstack.git


The following commit(s) were added to refs/heads/4.11 by this push:
     new d893fb5  agent: Fixes #2633 don't wait for pending tasks on reconnection (#2638)
d893fb5 is described below

commit d893fb5b00ba03e9718b08950b40b9af4a91b3e6
Author: Rohit Yadav <ro...@apache.org>
AuthorDate: Wed May 16 15:35:00 2018 +0530

    agent: Fixes #2633 don't wait for pending tasks on reconnection (#2638)
    
    When the agent loses its connection to the management server, the
    reconnection logic waits for any pending tasks to finish. However,
    when such tasks do finish, they fail to send an `Answer` back to the
    management server. From the management server's perspective, such
    pending operations are therefore stuck in an FSM state and need
    manual removal or fixing. This is by design: the management server's
    cmd-answer request pattern is code/execution dependent, so even if
    the answer were sent once the management server came back up
    (reconnected), the management server would fail to acknowledge and
    process it, because the listeners are missing or it is not in the
    exact state needed to handle the answer.
    
    Historically, the agent would not reconnect until its internal tasks
    had completed, but I found no reason why it should delay
    reconnection at all.
    
    Signed-off-by: Rohit Yadav <ro...@shapeblue.com>
---
 agent/src/com/cloud/agent/Agent.java | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/agent/src/com/cloud/agent/Agent.java b/agent/src/com/cloud/agent/Agent.java
index 90e3790..8a6c24b 100644
--- a/agent/src/com/cloud/agent/Agent.java
+++ b/agent/src/com/cloud/agent/Agent.java
@@ -495,19 +495,7 @@ public class Agent implements HandlerFactory, IAgentControl {
 
         _resource.disconnected();
 
-        final String lastConnectedHost = _shell.getConnectedHost();
-
-        int inProgress = 0;
-        do {
-            _shell.getBackoffAlgorithm().waitBeforeRetry();
-
-            s_logger.info("Lost connection to host: " + lastConnectedHost + ". Dealing with the remaining commands...");
-
-            inProgress = _inProgress.get();
-            if (inProgress > 0) {
-                s_logger.info("Cannot connect because we still have " + inProgress + " commands in progress.");
-            }
-        } while (inProgress > 0);
+        s_logger.info("Lost connection to host: " + _shell.getConnectedHost() + ". Attempting reconnection while we still have " + _inProgress.get() + " commands in progress.");
 
         _connection.stop();
 

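For readers who want to see the behavioural change outside the context of
Agent.java, the following is a minimal, self-contained sketch and not the
actual agent code. The class ReconnectSketch and its methods are
hypothetical names introduced for illustration; the maxPolls bound is also
an assumption, added only so the sketch terminates (the removed loop had no
such bound and slept via the backoff algorithm on each iteration).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not part of the CloudStack code base.
public class ReconnectSketch {
    // Stand-in for the agent's counter of commands still executing.
    private final AtomicInteger inProgress = new AtomicInteger();

    public void taskStarted() { inProgress.incrementAndGet(); }

    public void taskFinished() { inProgress.decrementAndGet(); }

    // Old behaviour: keep polling while commands are in progress.
    // Returns the number of polls spent before reconnection could start.
    // maxPolls is an assumption of this sketch so that it always returns.
    public int pollsBeforeReconnectOld(int maxPolls) {
        int polls = 0;
        do {
            polls++; // the removed code waited on the backoff algorithm here
        } while (inProgress.get() > 0 && polls < maxPolls);
        return polls;
    }

    // New behaviour: report the pending count and proceed immediately.
    public String reconnectMessageNew(String host) {
        return "Lost connection to host: " + host
            + ". Attempting reconnection while we still have "
            + inProgress.get() + " commands in progress.";
    }

    public static void main(String[] args) {
        ReconnectSketch agent = new ReconnectSketch();
        agent.taskStarted(); // one command that never finishes
        System.out.println(agent.pollsBeforeReconnectOld(5));
        System.out.println(agent.reconnectMessageNew("ms1"));
    }
}
```

With one command that never completes, the old loop exhausts every poll it
is allowed, whereas the new path only logs the pending count and lets the
caller go on to stop the connection and reconnect.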