Posted to issues@cloudstack.apache.org by "edison su (JIRA)" <ji...@apache.org> on 2013/12/20 00:12:08 UTC

[jira] [Commented] (CLOUDSTACK-5452) KVM - Agent is not able to connect back if management server was restarted when there are pending tasks to this host.

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853438#comment-13853438 ] 

edison su commented on CLOUDSTACK-5452:
---------------------------------------

The current agent code creates a thread for each task. Because of a limitation of Java, the agent has no way to cancel a running task (that is, its thread).
In the future, we may need to move to a process model so that tasks can be cancelled.
For now, if this issue occurs, the workaround is to restart the agent.
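
To illustrate the limitation, here is a minimal sketch. It is not the CloudStack agent code: the class and method names are hypothetical, and the child process assumes a POSIX `sleep` command is on the PATH. Cancelling a Future only sets the worker thread's interrupt flag (Thread.stop() is deprecated and unsafe), so a task that never checks for interruption keeps running. A child process, by contrast, can be killed from outside.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelDemo {

    // Hypothetical stand-in for a long-running agent task that does not
    // cooperate with interruption. Returns true if the worker thread is
    // still executing after Future.cancel(true) has been called.
    public static boolean threadTaskSurvivesCancel() throws Exception {
        AtomicBoolean stop = new AtomicBoolean(false);
        AtomicBoolean running = new AtomicBoolean(false);
        ExecutorService pool = Executors.newSingleThreadExecutor();

        Future<?> task = pool.submit(() -> {
            running.set(true);
            while (!stop.get()) {
                // Busy work that never checks the interrupt flag.
            }
            running.set(false);
        });

        // Wait until the worker has actually started.
        while (!running.get()) {
            Thread.sleep(10);
        }

        task.cancel(true);      // marks the Future cancelled, interrupts the thread
        Thread.sleep(200);      // give the interrupt time to (not) take effect
        boolean stillRunning = running.get();

        // Clean up through the cooperative flag, the only safe way to stop it.
        stop.set(true);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return stillRunning;
    }

    // Process model: the task runs in a child process that the parent can
    // kill. "sleep 60" is an assumed placeholder for real work.
    public static boolean processTaskDiesOnDestroy() throws Exception {
        Process child = new ProcessBuilder("sleep", "60").start();
        child.destroyForcibly();
        boolean exited = child.waitFor(5, TimeUnit.SECONDS);
        return exited && !child.isAlive();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("thread still running after cancel: "
                + threadTaskSurvivesCancel());
        System.out.println("process gone after destroy: "
                + processTaskDiesOnDestroy());
    }
}
```

With the thread model, a task that ignores interruption can only be removed by restarting the JVM, which matches the restart-the-agent workaround above.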

> KVM - Agent is not able to connect back if management server was restarted when there are pending tasks to this host.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5452
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5452
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: edison su
>            Priority: Critical
>             Fix For: Future
>
>
> KVM - Agent is not able to connect back if management server was restarted when there are pending tasks to this host.
> Steps to reproduce the problem:
> Setup: Advanced zone with 2 KVM (RHEL 6.3) hosts.
> Deployed a few VMs.
> Started snapshots of the ROOT volumes of the VMs.
> While the snapshot processes are still in progress, restart the management server.
> After the management server has started, the KVM hosts remain in the Disconnected state.
> Attempts to stop or start VMs fail because there is no connection to the host.
> The following is seen in the agent logs:
> 2013-12-10 20:56:46,640 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:46,640 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:56:51,641 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:51,642 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:56:56,642 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:56,643 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:01,644 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:01,644 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:06,644 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:06,645 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:11,645 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:11,646 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:16,647 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:16,647 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:21,648 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:21,648 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:26,649 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:26,675 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:31,676 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:31,677 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:36,678 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:36,678 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:41,678 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Lost connection to the server. Dealing with the remaining commands...
> :



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)