You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2020/03/09 11:48:58 UTC

[GitHub] [cloudstack] slavkap opened a new pull request #3949: Fix CloudRuntimeException of migrated volume for Volume stats

slavkap opened a new pull request #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949
 
 
   ## Description
   When volume is migrated between data stores CloudStack keeps the original UUID and changes the path of the volume. In volume stats command - LibvirtGetVolumeStatsCommandWrapper.java migrated NFS volumes could not be found and CloudRuntimeException is thrown but not handled.
   
   DB record:
   '29', '2', '1', '1', '2', NULL, NULL, 'nfsDisk', 'd644f5c6-bf70-4826-8d2b-d4bc373a9198', '5368709120', '/export/primary', 'efba7e17-dddd-4bbb-8233-347c9a5faaff', '1', '1', NULL, NULL, 'DATADISK', NULL, '3', NULL, NULL, '0', '2020-01-28 08:56:00', NULL, '2020-02-19 12:21:53', NULL, 'Ready', NULL, '21', NULL, NULL, NULL, '1', 'QCOW2', NULL, NULL, NULL, 'thin'
   
   2020-03-06 17:38:06,775 DEBUG [kvm.storage.LibvirtStoragePool] (agentRequest-Handler-1:null) (logid:4858b218) volume: d644f5c6-bf70-4826-8d2b-d4bc373a9198 not exist on storage pool
   2020-03-06 17:38:06,776 WARN  [cloud.agent.Agent] (agentRequest-Handler-1:null) (logid:4858b218) Caught: 
   com.cloud.utils.exception.CloudRuntimeException: Can't find volume:d644f5c6-bf70-4826-8d2b-d4bc373a9198
   	at com.cloud.hypervisor.kvm.storage.LibvirtStoragePool.getPhysicalDisk(LibvirtStoragePool.java:149)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.getVolumeStat(LibvirtGetVolumeStatsCommandWrapper.java:63)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:52)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:40)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
   	at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1483)
   	at com.cloud.agent.Agent.processRequest(Agent.java:640)
   	at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1053)
   	at com.cloud.utils.nio.Task.call(Task.java:83)
   	at com.cloud.utils.nio.Task.call(Task.java:29)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   
   ## Types of changes
   <!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
   - [ ] Breaking change (fix or feature that would cause existing functionality to change)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ X] Bug fix (non-breaking change which fixes an issue)
   - [ ] Enhancement (improves an existing feature and functionality)
   - [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
   
   
   ## How Has This Been Tested?
   tested with volumes migrated from on NFS to another NFS primary storage
   tested with volumes migrated from StorPool primary storage to NFS primary storage
   on CloudStack version 4.11.2.0, 4.13.0.0 and master
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-598105944
 
 
   For testing, I changed the global setting volume.stats.interval to 10000 to make stats run every 10 seconds.
   
   I did not get the exception in the description after migrating storage. A different null pointer was thrown every time stats ran, but as soon as the migration was finished the null pointer was no longer thrown.
   
   To reproduce the runtime exception I just removed the file from where it was in primary storage.
   
   This PR would hide the runtime exception which I feel is not the desired behavior here if you would agree @slavkap ?
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] DaanHoogland commented on issue #3949: Fix: catch CloudRuntimeException in LibvirtGetVolumeStatsCommandWrapper.java

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on issue #3949: Fix: catch CloudRuntimeException in LibvirtGetVolumeStatsCommandWrapper.java
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-601754641
 
 
   @andrijapanicsb pure bug fix, do we include it given blockers are still out?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596534825
 
 
   @DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-599951663
 
 
   @slavkap, I agree the NPE is not caused by this change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596828327
 
 
   <b>Trillian test result (tid-1223)</b>
   Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
   Total time taken: 26719 seconds
   Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3949-t1223-kvm-centos7.zip
   Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
   Smoke tests completed. 76 look OK, 1 have error(s)
   Only failed tests results shown below:
   
   
   Test | Result | Time (s) | Test File
   --- | --- | --- | ---
   test_02_vpc_privategw_static_routes | `Failure` | 172.24 | test_privategw_acl.py
   test_03_vpc_privategw_restart_vpc_cleanup | `Failure` | 171.98 | test_privategw_acl.py
   test_04_rvpc_privategw_static_routes | `Failure` | 233.22 | test_privategw_acl.py
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-599420687
 
 
   I have tested #3884 with the code from this PR, I still get the null pointer while migrating a volume and stats run.
   
   I wasn't able to reproduce the error in the description after migrating storage.
   
   These are the steps I followed:
   - Change volume.stats.interval global setting to 10000
   - Create an  instance
   - Stop the instance
   - Migrate the volume to a different primary storage
   - Check the management server log for the error.
   
   Results:
   No runtime exception
   
   The next step I followed was:
   - Delete the file from primary storage
   
   Result:
   I have verified the error in management server logs:
   
   `2020-03-13 16:54:51,585 DEBUG [c.c.a.m.AgentManagerImpl] (StatsCollector-6:ctx-96e6db97) (logid:24749a89) Details from executing class com.cloud.agent.api.GetVolumeStatsCommand: Can't get vm disk stats: Can't find volume:76071844-5bab-42cc-a656-9c47cf0b7797`
   
   Seems like this PR improves logging of the error when a volume is missing from the storage pool, not when a volume was migrated.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
Spaceman1984 commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-598590681
 
 
   This is the null pointer on the agent while migrating a volume, happens when stats are run.
   
   ```
   java.lang.NullPointerException
   	at com.cloud.hypervisor.kvm.storage.LibvirtStoragePool.getPhysicalDisk(LibvirtStoragePool.java:128)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.getVolumeStat(LibvirtGetVolumeStatsCommandWrapper.java:64)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:53)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:41)
   	at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
   	at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1500)
   	at com.cloud.agent.Agent.processRequest(Agent.java:640)
   	at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1053)
   	at com.cloud.utils.nio.Task.call(Task.java:83)
   	at com.cloud.utils.nio.Task.call(Task.java:29)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   2020-03-13 09:03:17,681 DEBUG [cloud.agent.Agent] (agentRequest-Handler-2:null) (logid:33b84d86) Seq 1-3499859860420296912:  { Ans: , MgmtId: 2130706689, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.hypervisor.kvm.storage.LibvirtStoragePool.getPhysicalDisk(LibvirtStoragePool.java:128)\n\tat com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.getVolumeStat(LibvirtGetVolumeStatsCommandWrapper.java:64)\n\tat com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:53)\n\tat com.cloud.hypervisor.kvm.resource.wrapper.LibvirtGetVolumeStatsCommandWrapper.execute(LibvirtGetVolumeStatsCommandWrapper.java:41)\n\tat com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1500)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:640)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1053)\n\tat com.cloud.utils.nio.Task.call(Task.java:83)\n\tat com.cloud.utils.nio.Task.call(Task.java:29)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n","wait":0}}] }
   ```
   
   I have also configured nfs in /etc/exports to be 2 separate entries:
   ```
   /export *(rw,async,no_root_squash,no_subtree_check)
   /alternate *(rw,async,no_root_squash,no_subtree_check)
   ```
   I will test #3884 merged with this PR for the error on the management server, I haven't seen it yet.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 edited a comment on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
Spaceman1984 edited a comment on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-598105944
 
 
   For testing, I changed the global setting volume.stats.interval to 10000 to make stats run every 10 seconds.
   
   I did not get the exception in the description after migrating storage. A different null pointer was thrown every time stats ran, but as soon as the migration was finished the null pointer was no longer thrown.
   
   To reproduce the runtime exception, I just removed the file from where it was in primary storage.
   
   This PR would hide the runtime exception which I feel is not the desired behavior here if you would agree @slavkap ?
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596488787
 
 
   @slavkap can you rebase your commit on latest head of the 4.13 branch, please?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596607991
 
 
   @blueorangutan test

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] slavkap commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
slavkap commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-599639477
 
 
   I will change the description, because when I have created this PR  I wasn't updated my local repo with the latest changes with  #3884. The CloudRuntimeException could be thrown in a lot of cases when volume stats command is executed, but I have tested it only if the volume is missing(migrated between datastores before the change from 3884). You can get it also, if for some reason can't get storage pool.
   
   About the NPE if I find time I could check it, but this change doesn't causes it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596545291
 
 
   Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1035

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] slavkap commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
slavkap commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-598190351
 
 
   @Spaceman1984 thanks that you spend time to test it! My bad that I didn't pull the latest changes and the fix is from #3884 where mgmt sends the path rather than the UUID of the volume. Unfortunately I couldn't reproduce the NPE that you hit.
   I have tested it with the latest changes and in this case you wouldn't get the runtime exception. I think that the exception is not hidden because we send it back to the management for example like this: 
   DEBUG [c.c.a.m.AgentManagerImpl] (StatsCollector-2:ctx-bc39a30f) (logid:cd207a3e) Details from executing class com.cloud.agent.api.GetVolumeStatsCommand: Can't get vm disk stats: Unable to attach volume a43deb8f-b8b5-4e73-9043-76298cacd27b. Error:  Error: volume 'a43deb8f-b8b5-4e73-9043-76298cacd27b' does not exist.
   In such cases when CloudRuntimeException is thrown we won't fill up the agent's log with the stack trace
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596534710
 
 
   @blueorangutan package

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] Spaceman1984 commented on issue #3949: Fix: catch CloudRuntimeException in LibvirtGetVolumeStatsCommandWrapper.java

Posted by GitBox <gi...@apache.org>.
Spaceman1984 commented on issue #3949: Fix: catch CloudRuntimeException in LibvirtGetVolumeStatsCommandWrapper.java
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-601742398
 
 
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [cloudstack] blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on issue #3949: Fix CloudRuntimeException of migrated volume for Volume stats
URL: https://github.com/apache/cloudstack/pull/3949#issuecomment-596608278
 
 
   @DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services