You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cloudstack.apache.org by Darren Shepherd <da...@gmail.com> on 2013/10/31 19:39:29 UTC

race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE

The following code results in a NPE in bad situations

        templatePoolRef =
_tmpltPoolDao.acquireInLockTable(templatePoolRefId,
storagePoolMaxWaitSeconds);

        if (templatePoolRef == null) {
            if (s_logger.isDebugEnabled()) {
                s_logger.info("Unable to acquire lock on
VMTemplateStoragePool " + templatePoolRefId);
            }
            templatePoolRef =
_tmpltPoolDao.findByPoolTemplate(dataStore.getId(), template.getId());
            if (templatePoolRef.getState() ==
ObjectInDataStoreStateMachine.State.Ready ) {
                s_logger.info("Unable to acquire lock on
VMTemplateStoragePool " + templatePoolRefId + ", But Template " +
template.getUniqueName() + " is already copied to primary storage, skip
copying");
                createVolumeFromBaseImageAsync(volume,
templateOnPrimaryStoreObj, dataStore, future);
                return;
            }
            throw new CloudRuntimeException("Unable to acquire lock on
VMTemplateStoragePool: " + templatePoolRefId);
        }

If two threads are trying to stage the same template thread one gets the
lock, thread two will wait.  If thread one fails to stage the template it
will delete the templatePoolRef from the database.  Thread two will now get
the lock in op_lock, but the internal findById will not find a
templatePoolRef because it has been deleted and return null from
acquireInLockTable().  Technically thread two has the lock, but the ref
templatePoolRef wasn't found.  The subsequent line "templatePoolRef =
_tmpltPoolDao.findByPoolTemplate(...)" will return null, because it doesn't
exist and then on the next line templatePoolRef.getState() will throw a NPE.

Darren

Re: race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE

Posted by Min Chen <mi...@citrix.com>.
Darren, I just checked the code, you are right. In case of one thread throws exception in downloading template to primary, it will delete the entry in template_store_ref, causing the second thread failing with NPE. We need to fix this in 4.3. Please file a bug for this.

Thanks
-min

From: Darren Shepherd <da...@gmail.com>>
Date: Thursday, October 31, 2013 11:39 AM
To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <de...@cloudstack.apache.org>>, Min Chen <mi...@citrix.com>>
Subject: race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE

The following code results in a NPE in bad situations

        templatePoolRef = _tmpltPoolDao.acquireInLockTable(templatePoolRefId, storagePoolMaxWaitSeconds);

        if (templatePoolRef == null) {
            if (s_logger.isDebugEnabled()) {
                s_logger.info<http://s_logger.info>("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId);
            }
            templatePoolRef = _tmpltPoolDao.findByPoolTemplate(dataStore.getId(), template.getId());
            if (templatePoolRef.getState() == ObjectInDataStoreStateMachine.State.Ready ) {
                s_logger.info<http://s_logger.info>("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId + ", But Template " + template.getUniqueName() + " is already copied to primary storage, skip copying");
                createVolumeFromBaseImageAsync(volume, templateOnPrimaryStoreObj, dataStore, future);
                return;
            }
            throw new CloudRuntimeException("Unable to acquire lock on VMTemplateStoragePool: " + templatePoolRefId);
        }

If two threads are trying to stage the same template thread one gets the lock, thread two will wait.  If thread one fails to stage the template it will delete the templatePoolRef from the database.  Thread two will now get the lock in op_lock, but the internal findById will not find a templatePoolRef because it has been deleted and return null from acquireInLockTable().  Technically thread two has the lock, but the ref templatePoolRef wasn't found.  The subsequent line "templatePoolRef = _tmpltPoolDao.findByPoolTemplate(...)" will return null, because it doesn't exist and then on the next line templatePoolRef.getState() will throw a NPE.

Darren