You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@jclouds.apache.org by Andrew Gaul <no...@github.com> on 2015/03/10 23:33:38 UTC

[jclouds] Add utility to crawl a blobstore (#702)

This avoids error-prone boilerplate when using paginated lists.
You can view, comment on, or merge this pull request online at:

  https://github.com/jclouds/jclouds/pull/702

-- Commit Summary --

  * Add utility to crawl a blobstore

-- File Changes --

    M blobstore/src/main/java/org/jclouds/blobstore/util/BlobStoreUtils.java (79)
    M blobstore/src/test/java/org/jclouds/blobstore/util/BlobStoreUtilsTest.java (24)

-- Patch Links --

https://github.com/jclouds/jclouds/pull/702.patch
https://github.com/jclouds/jclouds/pull/702.diff

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702

Re: [jclouds/jclouds] Add utility to crawl a blobstore (#702)

Posted by Andrew Gaul <no...@github.com>.
Closed #702.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702#event-835791821

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Ignasi Barrera <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

The advance method does not guarantee that the new iterator has a next element right?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r26179082

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Ignasi Barrera <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

Changes lgtm now. However, advancing the iterator in the `hasNext` is not the best pattern to use. I'd suggest to use the Guava's `AbstractIterator` to implement the iterator. You can take a look at the GCE's [AdvancingIterator](https://github.com/jclouds/jclouds-labs-google/blob/master/googlecloud/src/main/java/org/jclouds/googlecloud/internal/AdvancingIterator.java), which does pretty much the same. Your `crawl` methods should use that iterator and the [this concat method](https://github.com/jclouds/jclouds-labs-google/blob/master/googlecloud/src/main/java/org/jclouds/googlecloud/internal/ListPages.java#L34-44) to return an Iterator that can advance between pages, without having to change the state of the iterator in the `hasNext` method. WDYT?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r26255537

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Ignasi Barrera <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

To clarify: it is unlikely that the provider returns an empty page if there is a marker, but since it is quite a common pattern to use fallbacks that return empty lists on 404, we should take care and be defensive here. Don't assume that the iterable has content at this point.

Also IMO this class deserves its own unit test to verify it behaves properly when there are 0 pages (the initial advance returns nothing), a single page, multiple pages, and empty pages.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r26179656

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Andrew Gaul <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

I amended the pull request with the three tests you suggested and handing the presence of a marker with no more results.

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r26181564

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Ignasi Barrera <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

Looking at your implementation, that shouldn't be needed. That concat would substitute your `while(true)` loop.

In the links I passed as an example, you have an `Iterator<Page<T>>`, that is, an iterator that knows how to "fetch pages", but does not iterate the elements of each page. The `concat` method is used to "flatten" that iterator in an iterator that iterates over the element of each page and jumps to the next one once finished. Basically, if you model it to be an Iterator of pages (which is what it really is), you can take benefit of Guava's `Iterators.concat(Iterator)` to do the "jump from one page to the next one" for you.

It is just an implementation detail. I just shared how pagination is implemented in other parts of the code (which IMO ends up being simpler code). The current implementation looks fine, BTW :)

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r28673416

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Ka-Hing Cheung <no...@github.com>.
:+1: 

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702#issuecomment-78163635

Re: [jclouds] Add utility to crawl a blobstore (#702)

Posted by Andrew Gaul <no...@github.com>.
> +         marker = set.getNextMarker();
> +         iterator = set.iterator();
> +      }
> +
> +      @Override
> +      public boolean hasNext() {
> +         return iterator.hasNext() || marker != null;
> +      }
> +
> +      @Override
> +      public StorageMetadata next() {
> +         while (true) {
> +            if (!iterator.hasNext()) {
> +               advance();
> +            }
> +            StorageMetadata metadata = iterator.next();

I rewrote this to use `AdvancingIterator` although I do not understand the suggestion to use concat -- could you expand on this?

---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702/files#r28665484

Re: [jclouds/jclouds] Add utility to crawl a blobstore (#702)

Posted by Andrew Gaul <no...@github.com>.
Abandoning.  I believe we want something like this, but more discoverable, e.g., within `BlobStore` itself.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/702#issuecomment-256095527