You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@libcloud.apache.org by GitBox <gi...@apache.org> on 2021/05/25 13:26:02 UTC

[GitHub] [libcloud] Ido-Levi opened a new pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Ido-Levi opened a new pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584


   … creating the iterator
   
   ## Changes Title (replace this with a logical title for your changes)
   Optimized 'iterate_container_objects' by filtering the objects before creating the iterator for the local driver
   
   ### Description
   In the local driver - if a prefix is present, till now all of the objects in storage would've been serialized and just then filtered, changed it so the filtering would be beforehand.
   
   
   Replace this with the PR description (mention the changes you have made, why
   you have made them, provide some background and any references to the provider
   documentation if needed, etc.).
   
   For more information on contributing, please see [Contributing](http://libcloud.readthedocs.org/en/latest/development.html#contributing)
   section of our documentation.
   
   ### Status
   done, ready for review
   
   ### Checklist (tick everything that applies)
   
   - [v] [Code linting](http://libcloud.readthedocs.org/en/latest/development.html#code-style-guide) (required, can be done after the PR checks)
   - [v] Documentation
   - [v] [Tests](http://libcloud.readthedocs.org/en/latest/testing.html)
   - [ ] [ICLA](http://libcloud.readthedocs.org/en/latest/development.html#contributing-bigger-changes) (required for bigger changes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] Ido-Levi commented on a change in pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
Ido-Levi commented on a change in pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#discussion_r638838434



##########
File path: libcloud/storage/drivers/local.py
##########
@@ -276,7 +279,7 @@ def iterate_container_objects(self, container, prefix=None,
         """
         prefix = self._normalize_prefix_argument(prefix, ex_prefix)
 
-        objects = self._get_objects(container)
+        objects = self._get_objects(container, prefix)
         return self._filter_listed_container_objects(objects, prefix)

Review comment:
       Aight, removed the call to `_filter_listed_container_objects`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] Kami commented on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
Kami commented on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-949553345


   Sorry for the delay - the change has now been merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] c-w commented on a change in pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
c-w commented on a change in pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#discussion_r638812859



##########
File path: libcloud/storage/drivers/local.py
##########
@@ -276,7 +279,7 @@ def iterate_container_objects(self, container, prefix=None,
         """
         prefix = self._normalize_prefix_argument(prefix, ex_prefix)
 
-        objects = self._get_objects(container)
+        objects = self._get_objects(container, prefix)
         return self._filter_listed_container_objects(objects, prefix)

Review comment:
       If we now do the filtering natively in the driver I don't we need the fallback call to `_filter_listed_container_objects` anymore.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] asfgit merged pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
asfgit merged pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] Kami commented on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
Kami commented on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-983881324


   @louis-van-der-stam #1631 should fix this regression. Sadly we didn't have an existing test cases for this scenario which would have caught this issue early. Can you please verify the fix?
   
   @Ido-Levi Thanks for your original contribution and good intention with the performance optimization, but sadly the change needs to be reverted since it breaks the expected behavior (prefix filtering being applied against the object name - which may include actual bucket name, but that's provider / implementation specific).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] louis-van-der-stam commented on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
louis-van-der-stam commented on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-983712951


   After upgrading to version 3.4.1 from version 3.3.1, the unit tests of my project broke! After diving into the code, it turns out that currently the local driver takes the prefix and appends it as a directory. Effectively this means that the prefix is not used as a file prefix anymore, but instead as a sub-directory name of the container main directory.
   
   Looking a bit further, I was unable to find a os level function that does filtering when requesting a directory listing. Please unmerge this PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] Kami commented on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
Kami commented on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-983859677


   @louis-van-der-stam Thanks for reporting this issue.
   
   @Ido-Levi It looks like that the change was probably inadvertently backward incompatible and breaking (prefix filtering should apply to the object name and not to the sub-directory / bucket name). It sounds like we should revert it. What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] louis-van-der-stam commented on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
louis-van-der-stam commented on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-927527078


   Hi, it seems I'm experiencing the same issue. Seeing the PR is open, will it go into a release? Are there still issues that need to be resolved? Can I help?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [libcloud] Kami edited a comment on pull request #1584: Optimized 'iterate_container_objects' by filtering the objects before…

Posted by GitBox <gi...@apache.org>.
Kami edited a comment on pull request #1584:
URL: https://github.com/apache/libcloud/pull/1584#issuecomment-983881324


   @louis-van-der-stam #1631 should fix this regression. Sadly we didn't have an existing test cases for this scenario which would have caught this issue early. Can you please verify the fix?
   
   @Ido-Levi Thanks for your original contribution and good intention with the performance optimization, but sadly the change needs to be reverted since it breaks the expected behavior (prefix filtering being applied against the object name - which may include actual bucket name, but that's provider / implementation specific).
   
   EDIT: Thinking about it some more, we may still be able to introduce optimization for some scenarios in the future (perhaps by checking if the name contains path separator or similar, but that would only apply to some scenarios and likely also has more edge cases which we need to handle correctly).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@libcloud.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org