Posted to issues@cloudstack.apache.org by "Mandar Barve (JIRA)" <ji...@apache.org> on 2014/07/10 08:37:05 UTC
[jira] [Commented] (CLOUDSTACK-3896) [PrimaryStorage]
deleteStoragePool is not kicking GC for the downloaded system vm templates
on Primary Storage
[ https://issues.apache.org/jira/browse/CLOUDSTACK-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057197#comment-14057197 ]
Mandar Barve commented on CLOUDSTACK-3896:
------------------------------------------
Looking into the logs, here are a few things that I see:
The management server log shows that pool id 12, named primaryZone2, threw an exception for the deleteStoragePool command.
013-07-29 14:43:45,503 ERROR [cloud.api.ApiServer] (catalina-exec-20:null) unhandled exception executing api command: deleteStoragePool
com.cloud.utils.exception.CloudRuntimeException: Cannot delete pool primaryZone2 as there are associated volumes for this pool
at com.cloud.storage.StorageManagerImpl.deletePool(StorageManagerImpl.java:829)
at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at org.apache.cloudstack.api.command.admin.storage.DeletePoolCmd.execute(DeletePoolCmd.java:78)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
at com.cloud.api.ApiServlet.doGet(ApiServlet.java:66)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
The pool has a template with id = 1 in Ready state. The volumes table has a couple of volumes that are in Expunged state and refer to this template.
mysql> select * from volumes where template_id=1 AND pool_id=12\G;
*************************** 1. row ***************************
id: 27
account_id: 1
domain_id: 1
pool_id: 12
last_pool_id: NULL
instance_id: 24
device_id: 0
name: ROOT-24
uuid: c43cf1b3-238c-4d1f-b55b-d2fb5150bc4e
size: 2097152000
folder: NULL
path: 6501771c-65d5-4971-adff-e1f03626bac5
pod_id: NULL
data_center_id: 2
iscsi_name: NULL
host_ip: NULL
volume_type: ROOT
pool_type: NULL
disk_offering_id: 9
template_id: 1
first_snapshot_backup_uuid: NULL
recreatable: 1
created: 2013-07-29 07:22:12
attached: NULL
updated: 2013-07-29 09:14:04
removed: 2013-07-29 09:14:04
state: Expunged
chain_info: NULL
update_count: 6
disk_type: NULL
display_volume: 0
format: VHD
min_iops: NULL
max_iops: NULL
*************************** 2. row ***************************
id: 28
account_id: 1
domain_id: 1
pool_id: 12
last_pool_id: NULL
instance_id: 25
device_id: 0
name: ROOT-25
uuid: 92ff2b77-d2aa-4e0f-9508-b5414df2730f
size: 2097152000
folder: NULL
path: 2c36952b-2584-4b60-aff9-9b42cbc92258
pod_id: NULL
data_center_id: 2
iscsi_name: NULL
host_ip: NULL
volume_type: ROOT
pool_type: NULL
disk_offering_id: 11
template_id: 1
first_snapshot_backup_uuid: NULL
recreatable: 1
created: 2013-07-29 07:22:13
attached: NULL
updated: 2013-07-29 09:14:07
removed: 2013-07-29 09:14:07
state: Expunged
chain_info: NULL
update_count: 6
disk_type: NULL
display_volume: 0
format: VHD
min_iops: NULL
max_iops: NULL
2 rows in set (0.00 sec)
Management server logs also show a message that says "Storage pool garbage collector found 0 templates to cleanup in storage pool primaryZone2", which is a little confusing. This code looks to clean up those templates that are "unused". It checks that the template is not a router template, is already DOWNLOADED, and has no references in the volumes table. This template should really be "unused", since both volumes referring to it are in Expunged state and the following query returns a count of 0:
mysql> SELECT COUNT(*) FROM volumes WHERE volumes.pool_id=12 AND volumes.template_id=1 AND volumes.removed IS NULL;
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
Why does the garbage collector find 0 "unused" templates on this storage pool?
The code goes through all template ids on the storage pool and, for each one, checks whether the corresponding row in the vm_template table is marked as type SYSTEM. This template is marked as SYSTEM and, as a result, is considered to be in use:
mysql> select * from vm_template where id=1\G
*************************** 1. row ***************************
id: 1
unique_name: routing-1
name: SystemVM Template (XenServer)
uuid: 4cdfb5c8-f4ef-11e2-a91c-069f2c0000aa
public: 0
featured: 0
type: SYSTEM
hvm: 0
bits: 64
url: http://10.147.28.7/templates/acton/acton-systemvm-02062012.vhd.bz2
format: VHD
created: 2013-07-25 11:28:59
removed: NULL
account_id: 1
checksum: f613f38c96bf039f2e5cbf92fa8ad4f8
display_text: SystemVM Template (XenServer)
enable_password: 0
enable_sshkey: 0
guest_os_id: 133
bootable: 1
prepopulate: 0
cross_zones: 1
extractable: 0
hypervisor_type: XenServer
source_template_id: NULL
template_tag: NULL
sort_key: 0
size: NULL
state: Allocated
update_count: 0
updated: NULL
dynamically_scalable: 0
1 row in set (0.00 sec)
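To make the selection logic described above concrete, here is a minimal sketch in Java. It is not the actual CloudStack code: the class UnusedTemplateSketch, the PoolTemplate holder and the findUnused method are hypothetical names, and the three checks are simply the ones described in this comment (template type is not SYSTEM, download state is DOWNLOADED, no volumes with removed IS NULL refer to the template).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "unused template" selection described above.
// None of these names come from the CloudStack code base.
class UnusedTemplateSketch {

    // Stand-in for one template_spool_ref row joined with its vm_template row.
    static final class PoolTemplate {
        final long templateId;
        final String type;          // vm_template.type, e.g. "SYSTEM"
        final String downloadState; // template_spool_ref.download_state
        final long liveVolumeRefs;  // count of volumes rows with removed IS NULL

        PoolTemplate(long templateId, String type, String downloadState, long liveVolumeRefs) {
            this.templateId = templateId;
            this.type = type;
            this.downloadState = downloadState;
            this.liveVolumeRefs = liveVolumeRefs;
        }
    }

    // Returns the ids of templates the garbage collector would treat as unused.
    static List<Long> findUnused(List<PoolTemplate> templates) {
        List<Long> unused = new ArrayList<>();
        for (PoolTemplate t : templates) {
            if ("SYSTEM".equals(t.type)) {
                continue; // SYSTEM templates are always considered in use
            }
            if (!"DOWNLOADED".equals(t.downloadState)) {
                continue; // only fully downloaded templates are candidates
            }
            if (t.liveVolumeRefs > 0) {
                continue; // still referenced by a volume that is not removed
            }
            unused.add(t.templateId);
        }
        return unused;
    }

    public static void main(String[] args) {
        // Pool 12 as seen in the queries above: template 1, SYSTEM, DOWNLOADED, 0 live volumes.
        List<PoolTemplate> pool12 = Arrays.asList(new PoolTemplate(1, "SYSTEM", "DOWNLOADED", 0));
        System.out.println("unused templates found: " + findUnused(pool12).size());
    }
}
```

With the single row from pool 12, the SYSTEM check short-circuits before the volume reference count is even consulted, which would be consistent with the "found 0 templates to cleanup" log message.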
> [PrimaryStorage] deleteStoragePool is not kicking GC for the downloaded system vm templates on Primary Storage
> --------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-3896
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3896
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the default.)
> Components: Storage Controller
> Affects Versions: 4.2.0
> Environment: commit # ca474d0e09f772cb22abf2802a308a2da5351592
> Reporter: venkata swamybabu budumuru
> Priority: Minor
> Fix For: 4.4.0
>
> Attachments: logs.tgz
>
>
> Steps to reproduce:
> 1. Have the latest CloudStack setup with at least 1 advanced zone using XenServer
> 2. Make sure that the system VM is up and running (which means the system VM template has been downloaded to the primary storage)
> 3. Disable the zone and destroy the system VMs
> 4. Place the primary and secondary storage in maintenance mode.
> 5. Delete both the primary and secondary storage
> # select * from storage_pool where id=12
> *************************** 9. row ***************************
> id: 12
> name: primaryZone2
> uuid: NULL
> pool_type: NetworkFilesystem
> port: 2049
> data_center_id: 2
> pod_id: 2
> cluster_id: 2
> used_bytes: 1993387966464
> capacity_bytes: 5902284816384
> host_address: 10.147.28.7
> user_info: NULL
> path: /export/home/swamy/primary.campo.xen.1.cluster
> created: 2013-07-29 07:19:06
> removed: 2013-07-29 09:14:19
> update_time: NULL
> status: Maintenance
> storage_provider_name: DefaultPrimary
> scope: CLUSTER
> hypervisor: NULL
> managed: 0
> capacity_iops: NULL
> 6. check cloud.template_spool_ref for the above system vm template.
> Observations:
> (i) template_spool_ref still shows that system vm template as "Ready"
> (ii) storage GC didn't happen for the above template.
> mysql> select * from template_spool_ref where pool_id=12\G
> *************************** 1. row ***************************
> id: 10
> pool_id: 12
> template_id: 1
> created: 2013-07-29 07:22:12
> last_updated: NULL
> job_id: NULL
> download_pct: 100
> download_state: DOWNLOADED
> error_str: NULL
> local_path: 332cedca-b187-4af8-9d0a-ac3379741211
> install_path: 332cedca-b187-4af8-9d0a-ac3379741211
> template_size: 0
> marked_for_gc: 0
> state: Ready
> update_count: 2
> updated: 2013-07-29 07:36:24
> 1 row in set (0.00 sec)
> (iii) storage.cleanup.interval is enabled and set to 10 in my setup.
> Attaching all the required logs along with db dump to the bug.