Posted to dev@libcloud.apache.org by "Tomaz Muraus (JIRA)" <ji...@apache.org> on 2013/03/23 05:51:16 UTC

[dev] [jira] [Updated] (LIBCLOUD-254) Provide generator based iteration instead of LazyList

     [ https://issues.apache.org/jira/browse/LIBCLOUD-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomaz Muraus updated LIBCLOUD-254:
----------------------------------

    Fix Version/s: 0.12.1
    
> Provide generator based iteration instead of LazyList
> -----------------------------------------------------
>
>                 Key: LIBCLOUD-254
>                 URL: https://issues.apache.org/jira/browse/LIBCLOUD-254
>             Project: Libcloud
>          Issue Type: Improvement
>          Components: Core, Storage
>            Reporter: Mahendra M
>            Assignee: Tomaz Muraus
>            Priority: Minor
>             Fix For: 0.12.1
>
>         Attachments: libcloud-254.patch
>
>
> LazyList was introduced in LIBCLOUD-78 for more efficient iteration over objects stored in a container (S3, CloudFiles, etc. limit the maximum number of objects returned in a single call).
> LazyList solved this problem, but I think it might have the following issues when handling containers with a large number of objects:
> 1) It loads the entire list into memory.
> 2) The caller has to wait for the entire list to be loaded into memory before any operation can be performed.
> 3) The API invocation using get_more() and value_list is more complex than it needs to be.
> Python generators alleviate these problems: results can be returned to the caller as they arrive from the server.
> The following changes were made to the libcloud APIs:
> 1) A new API, iterate_container_objects(), was introduced. Storage drivers need to implement this instead of list_container_objects(). The new API returns a generator, and using it alleviates the three problems above.
> 2) list_container_objects() now simply does list(self.iterate_container_objects(container)); it is kept for backwards compatibility. It would be better if users started using the iterate_*() APIs instead.
> 3) The same changes have been made to the DNS base class.
> 4) LazyList can be removed from libcloud if everyone is OK with that.
> 5) The generator-based interface can be used (WIP) to provide paginated access to objects, which is useful for web pages and apps where the user pages through results. This can be implemented by providing "start_key" and "count" parameters (similar to CouchDB) instead of generating the entire list and then applying an offset, which is more performant and memory-efficient than generating the entire list for every request.
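
To illustrate the description above, here is a minimal sketch of the generator-based pattern. It is not the code from libcloud-254.patch: SketchDriver, fetch_page(), PAGE_SIZE and the in-memory object list are hypothetical stand-ins for a real driver and a backend that caps the number of objects returned per call. Only the method names iterate_container_objects() and list_container_objects() come from the issue text.

```python
from itertools import islice

# Hypothetical stand-in for a backend that returns at most PAGE_SIZE
# object names per call, plus a marker pointing at the next page.
PAGE_SIZE = 3
_FAKE_OBJECTS = ["obj-%02d" % i for i in range(8)]


def fetch_page(marker=None):
    """Return (names, next_marker); next_marker is None on the last page."""
    start = 0 if marker is None else marker
    names = _FAKE_OBJECTS[start:start + PAGE_SIZE]
    end = start + len(names)
    next_marker = end if end < len(_FAKE_OBJECTS) else None
    return names, next_marker


class SketchDriver(object):
    """Illustrative only; not the actual libcloud storage driver."""

    def __init__(self):
        self.calls = 0  # number of backend requests made so far

    def iterate_container_objects(self, container):
        # 'container' is unused in this sketch; a real driver would
        # pass it to the backend request.
        marker = None
        while True:
            self.calls += 1
            names, marker = fetch_page(marker)
            for name in names:
                # Hand each object to the caller as soon as its page arrives.
                yield name
            if marker is None:
                break

    def list_container_objects(self, container):
        # Backwards-compatible wrapper: materialise the whole generator.
        return list(self.iterate_container_objects(container))


driver = SketchDriver()
# Taking only the first two objects triggers a single backend request.
first_two = list(islice(driver.iterate_container_objects("demo"), 2))
```

Because the generator is lazy, a caller that stops early never requests the remaining pages, while list_container_objects() still returns a complete list for existing callers.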
