Posted to notifications@libcloud.apache.org by an...@apache.org on 2016/04/12 01:20:12 UTC
libcloud git commit: Fix a race condition in GCE’s list_nodes()
Repository: libcloud
Updated Branches:
refs/heads/trunk b444381e9 -> b1d073195
Fix a race condition in GCE’s list_nodes()
Closes #727
Invoking GCE’s `list_nodes()` while some VMs are being shut down can cause
`list_nodes()` to raise the following exception:
```
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 1411, in list_nodes
v.get('instances', [])]
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 5065, in _to_node
extra['boot_disk'] = self.ex_get_volume(bd['name'], bd['zone'])
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 3982, in ex_get_volume
response = self.connection.request(request, method='GET').object
File "/usr/lib/python2.7/site-packages/libcloud/common/google.py", line 684, in request
*args, **kwargs)
File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 736, in request
response = responseCls(**kwargs)
File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 119, in __init__
self.object = self.parse_body()
File "/usr/lib/python2.7/site-packages/libcloud/common/google.py", line 259, in parse_body
raise ResourceNotFoundError(message, self.status, code)
libcloud.common.google.ResourceNotFoundError: {'domain': 'global', 'message': "The resource 'projects/lenaic/zones/europe-west1-c/disks/devops-reg' was not found", 'reason': 'notFound'}
```
The above error occurred while the `devops-reg` machine was being deleted.
The issue occurs when the following events happen in this order:
* [`list_nodes()` sends a request to list all the instances.](https://github.com/apache/libcloud/blob/trunk/libcloud/compute/drivers/gce.py#L1622)
At this point, the `devops-reg` instance still exists.
* The `devops-reg` instance is deleted.
* `list_nodes()` calls `_to_node()`, which calls [`ex_get_volume()` to retrieve information about the instance's volumes.](https://github.com/apache/libcloud/blob/trunk/libcloud/compute/drivers/gce.py#L4235)
Because the instance was deleted after it was listed, `ex_get_volume()` raises a `ResourceNotFoundError` exception.
When this happens, we should simply discard the node that was deleted during the execution of `list_nodes()` and return the information about the other nodes.
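The sequence above, and the skip-on-not-found behaviour, can be sketched in isolation. This is a minimal illustration only, not the driver code: `FakeDriver`, its constructor arguments, and the local `ResourceNotFoundError` class are hypothetical stand-ins, and the instance names other than `devops-reg` are made up.

```python
class ResourceNotFoundError(Exception):
    """Local stand-in for libcloud.common.google.ResourceNotFoundError."""


class FakeDriver:
    """Hypothetical driver that mimics the list/convert race."""

    def __init__(self, instances, deleted):
        # Names returned by the (simulated) list request.
        self._instances = list(instances)
        # Names deleted after the list request was answered.
        self._deleted = set(deleted)

    def _to_node(self, name):
        # Mimics the volume lookup failing for an instance that
        # disappeared after it was listed.
        if name in self._deleted:
            raise ResourceNotFoundError("resource %r was not found" % name)
        return {"name": name}

    def list_nodes(self):
        nodes = []
        for name in self._instances:
            try:
                nodes.append(self._to_node(name))
            except ResourceNotFoundError:
                # Deleted between listing and conversion: skip it and
                # keep converting the remaining instances.
                continue
        return nodes


driver = FakeDriver(["web-1", "devops-reg", "db-1"], deleted=["devops-reg"])
print([node["name"] for node in driver.list_nodes()])  # ['web-1', 'db-1']
```

Catching the exception per instance, rather than around the whole loop, is what lets the remaining nodes still be returned.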
Project: http://git-wip-us.apache.org/repos/asf/libcloud/repo
Commit: http://git-wip-us.apache.org/repos/asf/libcloud/commit/b1d07319
Tree: http://git-wip-us.apache.org/repos/asf/libcloud/tree/b1d07319
Diff: http://git-wip-us.apache.org/repos/asf/libcloud/diff/b1d07319
Branch: refs/heads/trunk
Commit: b1d0731959637ab9df8876a3f3b9ad9fb2f38efd
Parents: b444381
Author: Lénaïc Huard <lh...@amadeus.com>
Authored: Fri Mar 25 17:00:48 2016 +0100
Committer: anthony-shaw <an...@gmail.com>
Committed: Tue Apr 12 09:19:46 2016 +1000
----------------------------------------------------------------------
libcloud/compute/drivers/gce.py | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/libcloud/blob/b1d07319/libcloud/compute/drivers/gce.py
----------------------------------------------------------------------
diff --git a/libcloud/compute/drivers/gce.py b/libcloud/compute/drivers/gce.py
index 14e10ab..4aa92ac 100644
--- a/libcloud/compute/drivers/gce.py
+++ b/libcloud/compute/drivers/gce.py
@@ -1625,11 +1625,31 @@ class GCENodeDriver(NodeDriver):
             # The aggregated response returns a dict for each zone
             if zone is None:
                 for v in response['items'].values():
-                    zone_nodes = [self._to_node(i) for i in
-                                  v.get('instances', [])]
-                    list_nodes.extend(zone_nodes)
+                    for i in v.get('instances', []):
+                        try:
+                            list_nodes.append(self._to_node(i))
+                        # If a GCE node has been deleted between
+                        # - is was listed by `request('.../instances', 'GET')
+                        # - it is converted by `self._to_node(i)`
+                        # `_to_node()` will raise a ResourceNotFoundError.
+                        #
+                        # Just ignore that node and return the list of the
+                        # other nodes.
+                        except ResourceNotFoundError:
+                            pass
             else:
-                list_nodes = [self._to_node(i) for i in response['items']]
+                for i in response['items']:
+                    try:
+                        list_nodes.append(self._to_node(i))
+                    # If a GCE node has been deleted between
+                    # - is was listed by `request('.../instances', 'GET')
+                    # - it is converted by `self._to_node(i)`
+                    # `_to_node()` will raise a ResourceNotFoundError.
+                    #
+                    # Just ignore that node and return the list of the
+                    # other nodes.
+                    except ResourceNotFoundError:
+                        pass
         return list_nodes
 
     def ex_list_regions(self):