Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2018/10/23 18:56:49 UTC

[GitHub] somejfn edited a comment on issue #2890: KVMHAMonitor thread blocks indefinitely while NFS not available

URL: https://github.com/apache/cloudstack/issues/2890#issuecomment-432374446
 
 
   Confirmed: we see similar behavior on 4.11.2rc3, and the agent went into the Down state. Agent logs:
   
   810986-e702-36ea-a87b-fd48064ecb12
   2018-10-23 13:14:40,391 INFO  [kvm.resource.LibvirtConnection] (agentRequest-Handler-4:null) (logid:f8cd7cf7) No existing libvirtd connection found. Opening a new one
   2018-10-23 13:14:40,392 WARN  [kvm.resource.LibvirtConnection] (agentRequest-Handler-4:null) (logid:f8cd7cf7) Can not find a connection for Instance i-4-24-VM. Assuming the default connection.
   2018-10-23 13:14:40,399 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:null) (logid:f8cd7cf7) Trying to fetch storage pool 4e49054a-463f-306f-9678-b0d9b02af9a1 from libvirt
   2018-10-23 13:14:51,496 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:null) (logid:3a0df8e5) Trying to fetch storage pool 0e233ec5-ea14-439e-bfde-a8c7566d254c from libvirt
   2018-10-23 13:14:51,498 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:null) (logid:3a0df8e5) Asking libvirt to refresh storage pool 0e233ec5-ea14-439e-bfde-a8c7566d254c
   2018-10-23 13:15:25,027 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:581a1d95) Trying to fetch storage pool 0e233ec5-ea14-439e-bfde-a8c7566d254c from libvirt
   2018-10-23 13:15:25,029 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:581a1d95) Asking libvirt to refresh storage pool 0e233ec5-ea14-439e-bfde-a8c7566d254c
   2018-10-23 13:15:25,590 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-5:null) (logid:581a1d95) Trying to fetch storage pool 3e810986-e702-36ea-a87b-fd48064ecb12 from libvirt
   2018-10-23 13:15:25,592 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-5:null) (logid:581a1d95) Asking libvirt to refresh storage pool 3e810986-e702-36ea-a87b-fd48064ecb12
   
   2018-10-23 13:21:28,804 WARN  [kvm.resource.KVMHAChecker] (Script-3:null) (logid:) Interrupting script.
   2018-10-23 13:21:28,806 WARN  [kvm.resource.KVMHAChecker] (pool-15160-thread-1:null) (logid:c3d5dcaf) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 10.73.96.232 -p /vol/t500_0_fls3_pool36_root -m /mnt/d05f1c9d-9454-3707-a6c4-781398af198d -h 10.73.96.212 -r -t 60 .  Output is:
   2018-10-23 13:21:32,826 WARN  [kvm.resource.KVMHAChecker] (Script-7:null) (logid:) Interrupting script.
   2018-10-23 13:21:32,827 WARN  [kvm.resource.KVMHAChecker] (pool-15161-thread-1:null) (logid:c3d5dcaf) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 10.73.96.232 -p /vol/t500_0_fls3_pool36_root -m /mnt/d05f1c9d-9454-3707-a6c4-781398af198d -h 10.73.96.212 -r -t 60 .  Output is:
   2018-10-23 13:21:36,846 WARN  [kvm.resource.KVMHAChecker] (Script-4:null) (logid:) Interrupting script.
   2018-10-23 13:21:36,847 WARN  [kvm.resource.KVMHAChecker] (pool-15162-thread-1:null) (logid:4a3cb34f) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 10.73.96.232 -p /vol/t500_0_fls3_pool36_root -m /mnt/d05f1c9d-9454-3707-a6c4-781398af198d -h 10.73.96.212 -r -t 60 .  Output is:
   2018-10-23 13:24:44,205 INFO  [cloud.agent.Agent] (Agent-Handler-1:null) (logid:5a5a7500) Lost connection to host: 10.73.96.19. Attempting reconnection while we still have 5 commands in progress.
   2018-10-23 13:24:44,206 INFO  [utils.nio.NioClient] (Agent-Handler-1:null) (logid:5a5a7500) NioClient connection closed
   2018-10-23 13:24:44,206 INFO  [cloud.agent.Agent] (Agent-Handler-1:null) (logid:5a5a7500) Reconnecting to host:10.73.96.19
   2018-10-23 13:24:44,207 INFO  [utils.nio.NioClient] (Agent-Handler-1:null) (logid:5a5a7500) Connecting to 10.73.96.19:8250
   2018-10-23 13:24:44,207 INFO  [utils.nio.Link] (Agent-Handler-1:null) (logid:5a5a7500) Conf file found: /etc/cloudstack/agent/agent.properties
   
   Note that the agent sometimes goes into the Disconnect state successfully, but the host HA framework might still fire after the kvm.ha.degraded.max.period timer expires, which is not expected. In any case, we want to avoid large-scale KVM host resets via IPMI for storage-related problems, because a reset is more damaging than waiting for primary storage to come back.
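
To illustrate the failure mode in the issue title, here is a minimal Python sketch of bounding a storage check with a timeout so that a monitoring thread does not block while NFS is unavailable. This is not CloudStack code: run_check, timeout_s, and the sleeping stand-in command are hypothetical names chosen for illustration, and the real agent may handle this differently.

```python
import subprocess
import sys


def run_check(cmd, timeout_s):
    """Run cmd and return "ok", "failed", or "timeout".

    Hypothetical helper: the caller never waits longer than timeout_s.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best-effort kill. There is deliberately no second wait() here:
        # a process stuck in uninterruptible I/O may not exit until the
        # storage comes back, and waiting for it would block the caller.
        proc.kill()
        return "timeout"
    return "ok" if returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a hung heartbeat script: sleeps longer than the timeout.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_check(hung, timeout_s=1))
```

A caller could treat "timeout" as a storage problem to wait out rather than as grounds for resetting the host, which matches the preference stated above. That mapping is an assumption of this sketch, not documented behavior.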
   
