Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2018/10/08 14:57:40 UTC
[GitHub] csquire edited a comment on issue #2722: CLOUDSTACK-10310 Fix KVM
reboot on storage issue
csquire edited a comment on issue #2722: CLOUDSTACK-10310 Fix KVM reboot on storage issue
URL: https://github.com/apache/cloudstack/pull/2722#issuecomment-427859929
This PR doesn't seem to completely fix the problem (or maybe this is a completely new problem). We installed the RC release with this PR on a test system and are able to get the KVM host marked as `Down` by using iptables to drop outgoing requests to NFS. My investigation shows that the line [`storage = conn.storagePoolLookupByUUIDString(uuid);`](https://github.com/apache/cloudstack/blob/4.11/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java#L95) blocks indefinitely. As a result, `kvmheartbeat.sh` is never executed, a host investigation is started, the host with blocked NFS is marked as `Down`, and finally all VMs on that host are rescheduled, which results in duplicate VMs.
I pulled a thread dump and found that the KVMHAMonitor thread hangs here until NFS is unblocked. I haven't dug any deeper yet, though.
```
"Thread-20" - Thread t@135
   java.lang.Thread.State: RUNNABLE
    at com.sun.jna.Native.invokePointer(Native Method)
    at com.sun.jna.Function.invokePointer(Function.java:470)
    at com.sun.jna.Function.invoke(Function.java:404)
    at com.sun.jna.Function.invoke(Function.java:315)
    at com.sun.jna.Library$Handler.invoke(Library.java:212)
    at com.sun.proxy.$Proxy3.virStoragePoolLookupByUUIDString(Unknown Source)
    at org.libvirt.Connect.storagePoolLookupByUUIDString(Unknown Source)
    at com.cloud.hypervisor.kvm.resource.KVMHAMonitor$Monitor.runInContext(KVMHAMonitor.java:95)
    - locked <1afb3370> (a java.util.concurrent.ConcurrentHashMap)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None
```
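To illustrate the failure mode, here is a minimal sketch of how a lookup that may block indefinitely could be bounded by a timeout, so that the monitor loop would still get a chance to reach the heartbeat step. This is not code from the PR or from CloudStack: `LookupTimeoutSketch`, `callWithTimeout`, and the simulated lookups are hypothetical names, and the hanging lookup is only a stand-in for `conn.storagePoolLookupByUUIDString(uuid)`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LookupTimeoutSketch {

    // Hypothetical helper: runs a potentially blocking call on a worker thread
    // and gives up waiting after timeoutMs. Returns null on timeout so the
    // caller can carry on instead of hanging with it.
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call, long timeoutMs) throws Exception {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker. A thread stuck inside a
            // native call may stay blocked, but the caller is no longer stuck.
            future.cancel(true);
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Daemon threads, so a permanently blocked worker cannot keep the JVM alive.
        ExecutorService pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "lookup-worker");
            t.setDaemon(true);
            return t;
        });

        // Assumption: simulated stand-in for a storage pool lookup that hangs
        // while NFS traffic is dropped.
        Callable<String> hangingLookup = () -> {
            Thread.sleep(60_000);
            return "pool";
        };
        // Assumption: simulated stand-in for a lookup that returns promptly.
        Callable<String> healthyLookup = () -> "pool";

        System.out.println(callWithTimeout(pool, healthyLookup, 500)); // prints "pool"
        System.out.println(callWithTimeout(pool, hangingLookup, 500)); // prints "null"
    }
}
```

Note that a timeout like this only frees the calling thread. A worker blocked in a native call would not necessarily be released, and in the dump above the lookup runs while holding a lock on the `ConcurrentHashMap`, so whether this approach fits `KVMHAMonitor` would need a closer look.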
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services