Posted to issues@cloudstack.apache.org by "Sanjeev N (JIRA)" <ji...@apache.org> on 2013/12/23 14:07:50 UTC

[jira] [Updated] (CLOUDSTACK-5610) [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanjeev N updated CLOUDSTACK-5610:
----------------------------------

    Attachment: management-server.rar
                cloud.dmp

Attached MS log file and cloud DB dump.

> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails
> -------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5610
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5610
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Hypervisor Controller, Management Server
>    Affects Versions: 4.3.0
>         Environment: Latest build from 4.3 with commit :d462db4ae5c30e677d5810111f9ea5ca6812bce2
> Storage: SMB for both primary and secondary
> Hypervisor: Hyper-v
>            Reporter: Sanjeev N
>            Priority: Blocker
>              Labels: hyper-V,
>             Fix For: 4.3.0
>
>         Attachments: cloud.dmp, management-server.rar
>
>
> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails
> Steps to Reproduce:
> =================
> 1. Bring up CloudStack in an advanced zone with two or more Hyper-V hosts, using SMB for both primary and secondary storage.
> 2. Enable the zone and deploy a few VMs. Make sure the VMs are distributed across all the hosts.
> 3. Power off one of the hosts on which VMs are running.
> Expected Result:
> ==============
> The host should go into the Alert state and all the VMs running on it should be stopped.
> Actual Result:
> ============
> The host remains in the Up state and all of its VMs are still shown as Running.
> The MS log shows ping commands to the hypervisor agent and the system VM agents. Even though the agents are behind on ping, the agent status remains Up.
> In this state I tried to deploy a VM, and the deployment planner chose the powered-off host, so the VM deployment failed.
> The CPVM was also running on the powered-off host and likewise remained in the Running state. Since the CPVM agent is not reachable from CloudStack, it should have been stopped and started on another host in the cluster.
> 2013-12-23 18:19:25,334 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9) org.apache.http.conn.HttpHostConnectException: Connection to http://10.147.40.31:8250 refused
> 2013-12-23 18:19:25,334 INFO  [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9) Cannot ping host 10.147.40.31 (IP 10.147.40.31), pingAns (blank means null) is:com.cloud.agent.api.UnsupportedAnswer
> 2013-12-23 18:19:25,334 WARN  [c.c.a.m.DirectAgentAttache] (DirectAgent-331:ctx-831c60e9) Unable to get current status on 5(10.147.40.31)
> 2013-12-23 18:19:25,336 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) Investigating why host 5 has disconnected with event AgentDisconnected
> 2013-12-23 18:19:25,336 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) checking if agent (5) is alive
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239: Sending  { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239: Executing:  { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-325:ctx-39f5ed39) Seq 5-1482556239: Executing request
> 2013-12-23 18:19:25,339 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39) POST request tohttp://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand with contents{"contextMap":{},"wait":50}
> 2013-12-23 18:19:25,340 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39) Sending cmd to http://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand cmd data:{"contextMap":{},"wait":50}
> 2013-12-23 18:19:46,345 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) checking if agent (5) is alive
> 2013-12-23 18:19:46,347 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) sending ping from (1) to agent's host ip address (10.147.40.31)
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Sending  { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}] }
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Executing:  { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}] }
> 2013-12-23 18:19:46,350 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Executing request
> 2013-12-23 18:19:46,350 INFO  [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80) Executing resource PingTestCommand: {"_computingHostIp":"10.147.40.31","contextMap":{},"wait":20}
> 2013-12-23 18:19:46,351 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80) Unable to execute ping command on DomR (null), domR may not be ready yet. failure due to There was a problem while connecting to null:3922
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Response Received:
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Processing:  { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"PingTestCommand failed","wait":0}}] }
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Received:  { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, { Answer } }
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.AbstractInvestigatorImpl] (AgentTaskPool-16:ctx-be3804c7) host (10.147.40.31) cannot be pinged, returning null ('I don't know')
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) could not reach agent, could not reach agent's host, returning that we don't have enough information
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) PingInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) ManagementIPSysVMInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) KVMInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) VMwareInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) Agent state cannot be determined, do nothing
> Attaching MS log and cloud DB.
> Agent 5 is the host which was powered off.
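
Note on the log above: every investigator that was consulted (PingInvestigator, ManagementIPSysVMInvestigator, KVMInvestigator, VMwareInvestigator) reports that it cannot determine the host state, after which AgentManagerImpl logs "Agent state cannot be determined, do nothing". No Hyper-V-specific investigator appears in these lines. The following is a minimal Java sketch of that chain pattern only, not CloudStack's actual code: the names InvestigatorChainSketch, Investigator, HostStatus and firstVerdict are hypothetical, and an empty Optional stands in for the "I don't know" answer seen in the log.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class InvestigatorChainSketch {
    /** Definite verdicts; an empty Optional models "I don't know". */
    enum HostStatus { UP, DOWN }

    /** Hypothetical stand-in for one investigator in the chain. */
    interface Investigator {
        Optional<HostStatus> investigate(long hostId);
    }

    /** Returns the first definite verdict, or empty if every investigator is undecided. */
    static Optional<HostStatus> firstVerdict(long hostId, List<Investigator> chain) {
        for (Investigator investigator : chain) {
            Optional<HostStatus> verdict = investigator.investigate(hostId);
            if (verdict.isPresent()) {
                return verdict;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Mirrors the log: four investigators, none of which can decide.
        Investigator undecided = hostId -> Optional.empty();
        List<Investigator> chain = Arrays.asList(undecided, undecided, undecided, undecided);

        Optional<HostStatus> verdict = firstVerdict(5L, chain);
        // With no verdict the caller has nothing to act on, so the host stays Up.
        System.out.println(verdict.map(HostStatus::name)
                .orElse("Agent state cannot be determined"));
    }
}
```

In this sketch the host only leaves the Up state if at least one investigator in the chain returns a definite DOWN. Whether the missing verdict for Hyper-V hosts is the actual root cause of this issue is an assumption, not something the log confirms.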



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)