Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2021/11/03 07:18:24 UTC

[GitHub] [cloudstack] poonam-agarwal28 opened a new issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

poonam-agarwal28 opened a new issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660


   ##### ISSUE TYPE
   
    * Improvement Request
   
   ##### COMPONENT NAME
   <!--
   CEPH, API
   -->
   ~~~
   
   ~~~
   
   ##### CLOUDSTACK VERSION
   <!--
   4.15
   -->
   
   ~~~
   
   ~~~
   
   ##### CONFIGURATION
   <!--
   Cloudstack 4.15 with KVM Hypervisors and RBD/CEPH as Primary Storage. Secondary Storage on NFS.
   -->
   
   
   ##### OS / ENVIRONMENT
   <!--
   CentOS 7.8
   -->
   
   
   ##### SUMMARY
   <!-- 
   With CEPH as primary storage, the current VM deployment process runs the command below during deployment, converting the image to raw and writing it to the CEPH pool in a single step. This takes a huge amount of time: for example, the 15GB template below took around 3+ hours. The VM deployment then fails due to the job timeout.
   
   If this is split into a 2-step process as below, it takes much less time:
   
   Step 1: Convert the image to raw.
   Step 2: Move the raw image to CEPH, using either qemu-img convert or rbd as shown below.
   
    -->
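The two steps proposed above can be sketched as a small shell helper. This is only an illustration under stated assumptions, not how CloudStack performs the copy: the names `run`, `two_step_import`, and `DRY_RUN`, the scratch path, and the image name are hypothetical, and the script prints the commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch of the proposed 2-step flow; helper names are illustrative only.
set -eu

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: source template, $2: local scratch path for the raw image,
# $3: destination Ceph pool, $4: destination RBD image name
two_step_import() {
    # Step 1: convert the template to raw on local disk.
    run qemu-img convert -p -O raw "$1" "$2"
    # Step 2: push the raw file into the pool with the rbd client.
    run rbd import --dest-pool "$3" "$2" "$4"
}

# Dry run: prints the two commands instead of executing them.
two_step_import /mnt/template.qcow2 /media/template.raw cloudstack testimage
```

Note that the intermediate raw file needs local scratch space equal to the full virtual size of the template, which matters for templates in the 300GB range.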
   
   
   ##### STEPS TO REPRODUCE
   <!--
   Upload a large template with a root volume size of 300GB or higher (NFS as secondary storage).
   Deploy a VM from the template with CEPH as the backend primary storage.
   -->
   
   <!-- Paste example playbooks or commands between quotes below -->
   ~~~
   [root@TEST]# time qemu-img convert -p -O raw /mnt/f268aaee-6090-3fc9-9e9d-41b807bfa8c5.qcow2 rbd:cloudstack/test450:mon_host=VXILAB1-CMON.ceph.local:auth_supported=cephx:id=cloudstack:key=AQDicxxxxxxxxxxxxxxxxFrNxwwxwxw==:rbd_default_format=2:client_mount_timeout=30
       (100.00/100%)
   
   real    114m40.176s
   user    10m34.924s
   sys     4m3.139s
   
   [root@VXIMUM1-CHVZR-2 mnt]# time qemu-img convert -p -O raw /media/ISETest12.raw rbd:cloudstack/testimage:mon_host=VXILAB1-CMON.ceph.local:auth_supported=cephx:id=cloudstack:key=AQxxxxxxxxxxxxxxrNDqlJueQ==:rbd_default_format=2:client_mount_timeout=30
       (0.00/100%)
       (32.02/100%)
   
       (100.00/100%)
   
   real    74m1.770s
   user    4m40.347s
   sys     11m20.207s
   
   [root@VXIMUM1-CHVZR-1 media]# time rbd import ISETest12.raw --dest-pool cloudstack
   Importing image: 100% complete...done.
   
   real    7m58.793s
   user    1m30.641s
   sys     2m39.082s
   
   [root@VXIMUM1-CHVZR-2 mnt]# time qemu-img convert -p -O raw /media/f268aaee-6090-3fc9-9e9d-41b807bfa8c5.qcow2 /media/ISETest12.raw
       (100.00/100%)
   
   real    9m3.571s
   user    4m8.513s
   sys     2m41.161s
   ~~~
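To make the comparison concrete, the wall-clock (`real`) figures above can be added up with a few lines of shell. This is a rough sketch: the helper `to_seconds` is a made-up name, fractions of a second are dropped, and the runs were taken on different hosts, so the ratio is indicative only.

```shell
#!/bin/sh
set -eu

# Convert a `time` value such as 114m40.176s to whole seconds (fraction dropped).
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%d\n", $1 * 60 + $2 }'
}

direct=$(to_seconds 114m40.176s)     # qcow2 -> rbd in one qemu-img convert
convert_s=$(to_seconds 9m3.571s)     # qcow2 -> local raw file
import_s=$(to_seconds 7m58.793s)     # local raw file -> pool via rbd import
two_step=$((convert_s + import_s))

echo "direct:   ${direct}s"
echo "two-step: ${two_step}s"
# Ratio with one decimal place, using integer arithmetic only.
ratio10=$((direct * 10 / two_step))
echo "speedup:  $((ratio10 / 10)).$((ratio10 % 10))x"
```

On these numbers the two-step path takes roughly 17 minutes against roughly 115 minutes for the direct conversion.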
   
   <!-- You can also paste gist.github.com links for larger files -->
   
   ##### EXPECTED RESULTS
   <!-- What did you expect to happen when running the steps above? -->
   
   ~~~
   The expected result is that VM deployment on CEPH should complete in a reasonable time before the job times out.
   ~~~
   
   ##### ACTUAL RESULTS
   <!-- What actually happened? -->
   
   <!-- Paste verbatim command output between quotes below -->
   ~~~
   The actual result is that the first VM deployment on CEPH takes a long time and eventually times out because the conversion process is extremely slow.
   ~~~
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] poonam-agarwal28 edited a comment on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
poonam-agarwal28 edited a comment on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992220947


   > The bigger templates become, the longer it takes to convert them. That is simply how it currently works: we need to use qemu-img to convert the template.
   > 
   > A template of 300GB is a rather large file and can indeed take a lot of time.
   > 
   > You can increase the job timeout if that helps. What would you suggest instead?
   
   @wido 
   The issue already details the proposed solution: instead of converting the image and writing it to the CEPH pool in a single step, if we convert the image locally and then upload it to the CEPH pool using the rbd client, the results are much better.





[GitHub] [cloudstack] poonam-agarwal28 commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
poonam-agarwal28 commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992278997


   @rhtyd 
   a) rbd import ISETest12.raw --dest-pool cloudstack
   
   This is the rbd import command you can use to import the converted image directly into CEPH.
   
   
   b) The ceph-common package provides the rbd binary used to execute the command:
   
   ceph-common-10.2.11-0.el7.x86_64 : Ceph Common
   Repo        : @ceph-x86-64
   Matched from:
   Filename    : /usr/bin/rbd
   





[GitHub] [cloudstack] poonam-agarwal28 commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
poonam-agarwal28 commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-987847579


   @rhtyd Please note the output below:
   [root@VXIMUM1-CHVZR-1 media]# time qemu-img convert -p -n -t none -O raw /media/ISETest12.raw rbd:cloudstack/testimage:mon_host=VXILAB1-CMON.ceph.local:auth_supported=cephx:id=cloudstack:key=AQDicAVfG2lpIhAAYu4bnFtZjy04FrNDqlJueQ==:rbd_default_format=2:client_mount_timeout=30
       (32.02/100%)
       (33.02/100%)
       (100.00/100%)
   real    78m42.581s
   user    1m48.733s
   sys     3m59.782s
   





[GitHub] [cloudstack] rhtyd commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
rhtyd commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-987741135


   Hi @poonam-agarwal28, thanks for sharing the issue and the benchmarks. Can you share the output of:
   ```
   qemu-img --version
   virsh version
   ```
   
   Can you also run and share the benchmark for converting your raw or qcow2 template to rbd with the flags `-t none -n`, something like:
   ```
   time qemu-img convert -p -n -t none -O raw /media/ISETest12.raw rbd:cloudstack/testimage:mon_host=VXILAB1-CMON.ceph.local:auth_supported=cephx:id=cloudstack:key=AQxxxxxxxxxxxxxxrNDqlJueQ==:rbd_default_format=2:client_mount_timeout=30
   ```





[GitHub] [cloudstack] rhtyd commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
rhtyd commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-987850076


   Thanks for sharing, @poonam-agarwal28. Can you also share your qemu-img and ceph versions?





[GitHub] [cloudstack] rhtyd commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
rhtyd commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992367837


   Thanks for confirming the dependency packages, @poonam-agarwal28, but my question was whether this places any requirements on the host where the command is run. For example, can you run the `rbd import` command on any KVM host, irrespective of whether it's an OSD/mon host? If you run it on a non-OSD/non-mon KVM host, do you need to pass any other parameters (such as credentials, path/port/IP address, or a config file)?
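For illustration, the sketch below shows what passing the connection details explicitly could look like on a host that has only the `rbd` client installed. It builds the command line without running it. The helper name `rbd_import_cmd`, the keyring path, and the image name are assumptions; `-m`, `--id`, and `--keyring` are standard `rbd` options, but whether this is sufficient in the reporter's environment is not confirmed anywhere in this thread.

```shell
#!/bin/sh
set -eu

# Build (not run) an rbd import command with explicit connection details.
# $1: monitor address, $2: cephx user id (without the "client." prefix),
# $3: keyring file, $4: destination pool, $5: raw source file, $6: image name
rbd_import_cmd() {
    printf 'rbd -m %s --id %s --keyring %s import --dest-pool %s %s %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# The keyring path below is a hypothetical location, not taken from the issue.
rbd_import_cmd VXILAB1-CMON.ceph.local cloudstack \
    /etc/ceph/ceph.client.cloudstack.keyring \
    cloudstack /media/ISETest12.raw testimage
```

If a suitable `/etc/ceph/ceph.conf` and keyring are already present on the host, the explicit options may be unnecessary, which would match the shorter command used in the benchmarks.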





[GitHub] [cloudstack] wido commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
wido commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-991302670


   The bigger templates become, the longer it takes to convert them. That is simply how it currently works: we need to use qemu-img to convert the template.
   
   A template of 300GB is a rather large file and can indeed take a lot of time.
   
   You can increase the job timeout if that helps. What would you suggest instead?
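As a sketch of the timeout workaround, the snippet below prints CloudMonkey commands that would raise the asynchronous job timeouts. It assumes the relevant global settings are `job.cancel.threshold.minutes` and `job.expire.minutes`; this thread does not establish which timeout actually fires, so treat the setting names, the helper name `job_timeout_cmds`, and the value 240 as placeholders to verify against your own installation.

```shell
#!/bin/sh
set -eu

# Print (not run) CloudMonkey commands that would raise the async job timeouts.
# $1: new value in minutes
job_timeout_cmds() {
    for name in job.cancel.threshold.minutes job.expire.minutes; do
        printf 'cmk update configuration name=%s value=%s\n' "$name" "$1"
    done
}

job_timeout_cmds 240
```

Raising the timeout only hides the slow copy; it does not shorten the conversion itself.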





[GitHub] [cloudstack] poonam-agarwal28 commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
poonam-agarwal28 commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992219379


   @rhtyd 
   [root@VXIMUM1-CHVZR-1 ~]# qemu-img --version
   qemu-img version 2.10.0(qemu-kvm-ev-2.10.0-21.el7_5.7.1)
   Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
   [root@VXIMUM1-CHVZR-1 ~]# ceph --version
   ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
   [root@VXIMUM1-CHVZR-1 ~]# virsh version
   Compiled against library: libvirt 4.5.0
   Using library: libvirt 4.5.0
   Using API: QEMU 4.5.0
   Running hypervisor: QEMU 2.10.0
   
   





[GitHub] [cloudstack] rhtyd commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
rhtyd commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992264097


   Thanks for sharing the details @poonam-agarwal28, we'll use your env details to set up a test env.
   Meanwhile, it may be faster for you to confirm whether the `rbd import` command can be executed on any KVM host, even one that is not a Ceph host (OSD or monitor). If it works on such hosts, and if you can share whether it requires installing a generally available dependency (such as `ceph-common`), then we can explore implementing the proposed solution.
   





[GitHub] [cloudstack] poonam-agarwal28 commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
poonam-agarwal28 commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-992220947


   > The bigger templates become, the longer it takes to convert them. That is simply how it currently works: we need to use qemu-img to convert the template.
   > 
   > A template of 300GB is a rather large file and can indeed take a lot of time.
   > 
   > You can increase the job timeout if that helps. What would you suggest instead?
   
   
   The issue already details the proposed solution: instead of converting the image and writing it to the CEPH pool in a single step, if we convert the image locally and then upload it to the CEPH pool using the rbd client, the results are much better.





[GitHub] [cloudstack] nvazquez commented on issue #5660: CEPH Deployments for large VM images failing due to high image conversion time and job timeout

Posted by GitBox <gi...@apache.org>.
nvazquez commented on issue #5660:
URL: https://github.com/apache/cloudstack/issues/5660#issuecomment-1029573480


   Ping @poonam-agarwal28 

