You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@cloudstack.apache.org by "sadhu suresh (JIRA)" <ji...@apache.org> on 2014/01/23 12:18:39 UTC

[jira] [Comment Edited] (CLOUDSTACK-5713) vmsync:KVM:NPE when agentmanager tries to update the destroyed vm state during vmsync

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879860#comment-13879860 ] 

sadhu suresh edited comment on CLOUDSTACK-5713 at 1/23/14 11:18 AM:
--------------------------------------------------------------------

 noticed  Nullpointer exception when VM is stopped(shut-downed)outside the cloudstack in KVM 
Issue still exits in KVM.

I have tested  on upgrade environment :(upgraded from 4.2.1 to 4.3) 



2014-01-23 21:32:49,151 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentManager-Handler-13:null) VM is not HA enabled so we're done.
2014-01-23 21:32:49,151 WARN  [o.a.c.f.j.AsyncJobExecutionContext] (AgentManager-Handler-13:null) Job is executed without a context, setup psudo job for the executing thread
2014-01-23 21:32:49,159 WARN  [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) Caught:
java.lang.NullPointerException
        at org.apache.cloudstack.framework.jobs.AsyncJobExecutionContext.getCurrentExecutionContext(AsyncJobExecutionContext.java:172)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1292)
        at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
        at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2804)
        at com.cloud.vm.VirtualMachineManagerImpl.deltaHostSync(VirtualMachineManagerImpl.java:2410)
        at com.cloud.vm.VirtualMachineManagerImpl.processCommands(VirtualMachineManagerImpl.java:2940)
        at com.cloud.agent.manager.AgentManagerImpl.handleCommands(AgentManagerImpl.java:292)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1184)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1270)
        at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:677)
        at com.cloud.utils.nio.Task.run(Task.java:83)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2014-01-23 21:32:49,179 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) SeqA 1-213: Sending Seq 1-213:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.Answer":{"result":false,"wait":0}}] }
2014-01-23 21:32:49,186 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-9b04e2ef) Zone 1 is ready to launch console proxy
2014-01-23 21:32:49,265 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-12:null) Ping from 4
2014-01-23 21:32:57,420 DEBUG [c.c.s.StatsCollector] (StatsCollector-1:ctx-f4506fa8) HostStatsCollector is running...
2014-01-23 21:32:58,010 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-f4506fa8) Seq 1-722862312: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2014-01-23 21:32:58,619 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-f4506fa8) Seq 4-1319370895: Received:  { Ans: , MgmtId: 7322717913215, via: 4, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2014-01-23 21:33:06,323 DEBUG [c.c.s.StatsCollector] (StatsCollector-1:ctx-10707bfe) StorageCollector is running...
2014-01-23 21:33:06,402 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 3-1709113424: Received:  { Ans: , MgmtId: 7322717913215, via: 3, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,487 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 1-722862313: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,609 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 1-722862314: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,863 INFO  [c.c.u.AccountManagerImpl] (AccountChecker-1:ctx-17af7e43) Found 0 removed accounts to cleanup



was (Author: sadhu):
Please ignore the above exception. noticed same exception Nullpointer exception when VM is stopped(shutdowned)outside the cloudstack in KVM 
Issue still exits in KVM.

I have tested  on upgrade environment :(upgraded from 4.2.1 to 4.3) 



2014-01-23 21:32:49,151 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentManager-Handler-13:null) VM is not HA enabled so we're done.
2014-01-23 21:32:49,151 WARN  [o.a.c.f.j.AsyncJobExecutionContext] (AgentManager-Handler-13:null) Job is executed without a context, setup psudo job for the executing thread
2014-01-23 21:32:49,159 WARN  [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) Caught:
java.lang.NullPointerException
        at org.apache.cloudstack.framework.jobs.AsyncJobExecutionContext.getCurrentExecutionContext(AsyncJobExecutionContext.java:172)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1292)
        at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
        at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2804)
        at com.cloud.vm.VirtualMachineManagerImpl.deltaHostSync(VirtualMachineManagerImpl.java:2410)
        at com.cloud.vm.VirtualMachineManagerImpl.processCommands(VirtualMachineManagerImpl.java:2940)
        at com.cloud.agent.manager.AgentManagerImpl.handleCommands(AgentManagerImpl.java:292)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1184)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1270)
        at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:677)
        at com.cloud.utils.nio.Task.run(Task.java:83)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2014-01-23 21:32:49,179 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) SeqA 1-213: Sending Seq 1-213:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.Answer":{"result":false,"wait":0}}] }
2014-01-23 21:32:49,186 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-9b04e2ef) Zone 1 is ready to launch console proxy
2014-01-23 21:32:49,265 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-12:null) Ping from 4
2014-01-23 21:32:57,420 DEBUG [c.c.s.StatsCollector] (StatsCollector-1:ctx-f4506fa8) HostStatsCollector is running...
2014-01-23 21:32:58,010 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-f4506fa8) Seq 1-722862312: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2014-01-23 21:32:58,619 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-f4506fa8) Seq 4-1319370895: Received:  { Ans: , MgmtId: 7322717913215, via: 4, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2014-01-23 21:33:06,323 DEBUG [c.c.s.StatsCollector] (StatsCollector-1:ctx-10707bfe) StorageCollector is running...
2014-01-23 21:33:06,402 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 3-1709113424: Received:  { Ans: , MgmtId: 7322717913215, via: 3, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,487 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 1-722862313: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,609 DEBUG [c.c.a.t.Request] (StatsCollector-1:ctx-10707bfe) Seq 1-722862314: Received:  { Ans: , MgmtId: 7322717913215, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2014-01-23 21:33:06,863 INFO  [c.c.u.AccountManagerImpl] (AccountChecker-1:ctx-17af7e43) Found 0 removed accounts to cleanup


> vmsync:KVM:NPE when agentmanager tries to update the destroyed  vm state during vmsync
> --------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5713
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5713
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: KVM
>    Affects Versions: 4.3.0
>            Reporter: sadhu suresh
>            Assignee: Kelven Yang
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: management-server.rar
>
>
> env:4.3,kvm
> steps:
> 1. create a advanced zone and add a host
> 2. deploy 2 vms
> 3. from host destroy the  vm 
> 4. check the vm state in Cloudstack
> actual result:
> during vm sync ,agent manager throws NULLpointer exception  and due to this   it fail to update the vm state.
> VM":{"state":"PowerOn","host":"centos62"}},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":3,"wait":0}}] }
> 2014-01-02 08:56:14,614 DEBUG [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-13:null) VM i-2-7-VM: cs state = Running and realState = Stopped
> 2014-01-02 08:56:14,615 DEBUG [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-13:null) VM i-2-7-VM: cs state = Running and realState = Stopped
> 2014-01-02 08:56:14,672 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentManager-Handler-13:null) VM does not require investigation so I'm marking it as Stopped: VM[User|vm2]
> 2014-01-02 08:56:14,673 WARN  [o.a.c.alerts] (AgentManager-Handler-13:null)  alertType:: 8 // dataCenterId:: 1 // podId:: 1 // clusterId:: null // message:: VM (name: vm2, id: 7) stopped unexpectedly on host id:3, availability zone id:1, pod id:1
> 2014-01-02 08:56:14,786 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentManager-Handler-13:null) VM is not HA enabled so we're done.
> 2014-01-02 08:56:14,825 WARN  [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) Caught:
> java.lang.NullPointerException
>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1268)
>         at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
>         at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2719)
>         at com.cloud.vm.VirtualMachineManagerImpl.deltaHostSync(VirtualMachineManagerImpl.java:2325)
>         at com.cloud.vm.VirtualMachineManagerImpl.processCommands(VirtualMachineManagerImpl.java:2855)
>         at com.cloud.agent.manager.AgentManagerImpl.handleCommands(AgentManagerImpl.java:292)
>         at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1184)
>         at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1270)
>         at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:677)
>         at com.cloud.utils.nio.Task.run(Task.java:83)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:701)
> 2014-01-02 08:56:14,827 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) SeqA 3-196: Sending Seq 3-196:  { Ans: , MgmtId: 7020459589851, via: 3, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.Answer":{"result":false,"wait":0}}] }
> 2014-01-02 08:56:15,657 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) SeqA 4-1523: Processing Seq 4-1523:  { Cmd , MgmtId: -1, via: 4, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":3,"_loadInfo":"{\n  \"connections\": []\n}","wait":0}}] }
> 2014-01-02 08:56:15,755 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) SeqA 4-1523: Sending Seq 4-1523:  { Ans: , MgmtId: 7020459589851, via: 4, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
> virsh # destroy i-2-7-VM
> Domain i-2-7-VM destroyed
> virsh # list
>  Id    Name                           State
> ----------------------------------------------------
>  1     s-4-VM                         running
>  2     v-3-VM                         running
>  3     r-6-VM                         running
> but on cloudstack it still shows as running.
> mysql> select * from vm_instance where id=7\G;
> *************************** 1. row ***************************
>                       id: 7
>                     name: vm2
>                     uuid: ff79a39f-0309-47c9-be1a-62b7f9223943
>            instance_name: i-2-7-VM
>                    state: Running
>           vm_template_id: 4
>              guest_os_id: 112
>      private_mac_address: 02:00:52:5d:00:03
>       private_ip_address: 10.1.1.240
>                   pod_id: 1
>           data_center_id: 1
>                  host_id: 3
>             last_host_id: 3
>                 proxy_id: NULL
>        proxy_assign_time: NULL
>             vnc_password: U8wzZMnrqJ/JXjGSV5QeStfk3S6tTpeN
>               ha_enabled: 0
>            limit_cpu_use: 0
>             update_count: 3
>              update_time: 2014-01-02 11:11:39
>                  created: 2014-01-02 11:11:36
>                  removed: NULL
>                     type: User
>                  vm_type: User
>               account_id: 2
>                domain_id: 1
>      service_offering_id: 1
>           reservation_id: ab656607-c35a-409e-90aa-9a029703b3ef
>          hypervisor_type: KVM
>         disk_offering_id: NULL
>                    owner: 2
>                host_name: vm2
>             display_name: vm2
>            desired_state: NULL
>     dynamically_scalable: 0
>               display_vm: 1
>              power_state: NULL
>  power_state_update_time: NULL
> power_state_update_count: 0
>               power_host: NULL
> 1 row in set (0.00 sec)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)