You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@samza.apache.org by Amit Yadav <am...@hotmail.com> on 2016/02/26 19:37:51 UTC

Re: Review Request 41068: SAMZA-813: Add Seek functionality to KeyValueStoreIterator

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41068/
-----------------------------------------------------------

(Updated Feb. 26, 2016, 6:37 p.m.)


Review request for samza and Yi Pan (Data Infrastructure).


Changes
-------

Rebased to latest.


Repository: samza


Description
-------

Added Seek functionality to KeyValueStore and KeyValueStoreIterator. The following alternatives were considered

1. Create a 'SeekingKeyValueIterator' and 'SeekingKeyValueStore' - Have a completely separate classes which extends their existing
   counterparts. This solution is appealing as it provides separation of concerns. However, this also requires creating a complete
   hierarchy of all the classes which are implementing the KeyValueStore/Iterator interface. Therefore the effort to value didnt
   made much sense.

2. Use the existing 'range' functionality - This solution requires a new iterator to be created which could supply the 'end'
   key for the range. We could make this user configurable, however this assumes that the user can somehow come up with an
   end key in all the cases. This assumption may not be true all the times. Also, I feel this is more of a workaround then
   a real solution, more of range degenerated to a seekTo functionality and therefore is not a clean interaface.

3. Add seek functionality to existing interfaces and classes :- In the end, this is what I implemented since it is the cleanest solution
   in amongst all the options. This also requires minimum changes. The downside of this approach is that not al stores can support seek
   to a particular key (especially, the one derived from java iterators).


Diffs (updated)
-----

  samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java 854ebbfa640e96a87977ae25b67736ef57fadd8e 
  samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueStore.java b1fea7bc38d1a44e7bba6bdeb638dc33c0936c6f 
  samza-kv-inmemory/src/main/scala/org/apache/samza/storage/kv/inmemory/InMemoryKeyValueStore.scala 72f25a354eaa98e8df379d07d9cc8613dfafd13a 
  samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala d614f8a58882628129ec30c0fe8800395d60ed99 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStore.scala 9a5b2d5f8f76220dfc8637823e2b63dffc138a5d 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStoreMetrics.scala df8efaeefba829a47cb744d7a755cbe9e1f562f1 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngine.scala e5a66a4770b9553a1cc48fbb505f52d123c6c754 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineMetrics.scala 233fba91caf041bfb78189efef00ce8fc56f9f15 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStoreMetrics.scala 967d5097253640ee41ee84e551e15fa2285c00e2 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStore.scala 7bba6ff37d8266674e7f15c10c7c146f4a41fc91 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStoreMetrics.scala 743151a310792d4a6ff48ea102e796eb9f630bba 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/NullSafeKeyValueStore.scala 3de257c3117c16b7cfdb7a3d7bb28d5cddc7c10a 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStore.scala 8e183efcdec6fd3f921fc2bfe1971c95715930ed 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStoreMetrics.scala 841e4a236da99cbfe121054654c648887f39c3f6 
  samza-kv/src/test/scala/org/apache/samza/storage/kv/MockKeyValueStore.scala 595dd0df6fde50f91ab5a046a193559326f2a1d5 

Diff: https://reviews.apache.org/r/41068/diff/


Testing
-------


Thanks,

Amit Yadav


Re: Review Request 41068: SAMZA-813: Add Seek functionality to KeyValueStoreIterator

Posted by Amit Yadav <am...@hotmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41068/
-----------------------------------------------------------

(Updated March 9, 2016, 12:26 a.m.)


Review request for samza and Yi Pan (Data Infrastructure).


Repository: samza


Description
-------

Added Seek functionality to KeyValueStore and KeyValueStoreIterator. The following alternatives were considered

1. Create a 'SeekingKeyValueIterator' and 'SeekingKeyValueStore' - Have a completely separate classes which extends their existing
   counterparts. This solution is appealing as it provides separation of concerns. However, this also requires creating a complete
   hierarchy of all the classes which are implementing the KeyValueStore/Iterator interface. Therefore the effort to value didnt
   made much sense.

2. Use the existing 'range' functionality - This solution requires a new iterator to be created which could supply the 'end'
   key for the range. We could make this user configurable, however this assumes that the user can somehow come up with an
   end key in all the cases. This assumption may not be true all the times. Also, I feel this is more of a workaround then
   a real solution, more of range degenerated to a seekTo functionality and therefore is not a clean interaface.

3. Add seek functionality to existing interfaces and classes :- In the end, this is what I implemented since it is the cleanest solution
   in amongst all the options. This also requires minimum changes. The downside of this approach is that not al stores can support seek
   to a particular key (especially, the one derived from java iterators).


Diffs (updated)
-----

  samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java 854ebbfa640e96a87977ae25b67736ef57fadd8e 
  samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueStore.java b1fea7bc38d1a44e7bba6bdeb638dc33c0936c6f 
  samza-kv-inmemory/src/main/scala/org/apache/samza/storage/kv/inmemory/InMemoryKeyValueStore.scala 72f25a354eaa98e8df379d07d9cc8613dfafd13a 
  samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala d614f8a58882628129ec30c0fe8800395d60ed99 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStore.scala c28f8db8cb59bd5415e78535877acc1e5bee0f67 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStoreMetrics.scala a11b5fe4f91259228eb262cfbc1d36f2cba4e08d 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngine.scala e5a66a4770b9553a1cc48fbb505f52d123c6c754 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineMetrics.scala 233fba91caf041bfb78189efef00ce8fc56f9f15 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStoreMetrics.scala 967d5097253640ee41ee84e551e15fa2285c00e2 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStore.scala 7bba6ff37d8266674e7f15c10c7c146f4a41fc91 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStoreMetrics.scala 743151a310792d4a6ff48ea102e796eb9f630bba 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/NullSafeKeyValueStore.scala 3de257c3117c16b7cfdb7a3d7bb28d5cddc7c10a 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStore.scala 8e183efcdec6fd3f921fc2bfe1971c95715930ed 
  samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStoreMetrics.scala 841e4a236da99cbfe121054654c648887f39c3f6 
  samza-kv/src/test/scala/org/apache/samza/storage/kv/MockKeyValueStore.scala 595dd0df6fde50f91ab5a046a193559326f2a1d5 

Diff: https://reviews.apache.org/r/41068/diff/


Testing
-------


Thanks,

Amit Yadav


Re: Review Request 41068: SAMZA-813: Add Seek functionality to KeyValueStoreIterator

Posted by Amit Yadav <am...@hotmail.com>.

> On March 7, 2016, 8:42 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java, line 27
> > <https://reviews.apache.org/r/41068/diff/2/?file=1272213#file1272213line27>
> >
> >     I recommend to drop this seekToLast() for now. Based on your use case description, to achieve the backward traverse, you would need both seekToLast() and previous() in the iterator. Adding seekToLast() only won't be sufficient.

Agreed. The latest diff has this change.


- Amit


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41068/#review122271
-----------------------------------------------------------


On March 9, 2016, 12:26 a.m., Amit Yadav wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41068/
> -----------------------------------------------------------
> 
> (Updated March 9, 2016, 12:26 a.m.)
> 
> 
> Review request for samza and Yi Pan (Data Infrastructure).
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Added Seek functionality to KeyValueStore and KeyValueStoreIterator. The following alternatives were considered
> 
> 1. Create a 'SeekingKeyValueIterator' and 'SeekingKeyValueStore' - Have a completely separate classes which extends their existing
>    counterparts. This solution is appealing as it provides separation of concerns. However, this also requires creating a complete
>    hierarchy of all the classes which are implementing the KeyValueStore/Iterator interface. Therefore the effort to value didnt
>    made much sense.
> 
> 2. Use the existing 'range' functionality - This solution requires a new iterator to be created which could supply the 'end'
>    key for the range. We could make this user configurable, however this assumes that the user can somehow come up with an
>    end key in all the cases. This assumption may not be true all the times. Also, I feel this is more of a workaround then
>    a real solution, more of range degenerated to a seekTo functionality and therefore is not a clean interaface.
> 
> 3. Add seek functionality to existing interfaces and classes :- In the end, this is what I implemented since it is the cleanest solution
>    in amongst all the options. This also requires minimum changes. The downside of this approach is that not al stores can support seek
>    to a particular key (especially, the one derived from java iterators).
> 
> 
> Diffs
> -----
> 
>   samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java 854ebbfa640e96a87977ae25b67736ef57fadd8e 
>   samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueStore.java b1fea7bc38d1a44e7bba6bdeb638dc33c0936c6f 
>   samza-kv-inmemory/src/main/scala/org/apache/samza/storage/kv/inmemory/InMemoryKeyValueStore.scala 72f25a354eaa98e8df379d07d9cc8613dfafd13a 
>   samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala d614f8a58882628129ec30c0fe8800395d60ed99 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStore.scala c28f8db8cb59bd5415e78535877acc1e5bee0f67 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStoreMetrics.scala a11b5fe4f91259228eb262cfbc1d36f2cba4e08d 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngine.scala e5a66a4770b9553a1cc48fbb505f52d123c6c754 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineMetrics.scala 233fba91caf041bfb78189efef00ce8fc56f9f15 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStoreMetrics.scala 967d5097253640ee41ee84e551e15fa2285c00e2 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStore.scala 7bba6ff37d8266674e7f15c10c7c146f4a41fc91 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStoreMetrics.scala 743151a310792d4a6ff48ea102e796eb9f630bba 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/NullSafeKeyValueStore.scala 3de257c3117c16b7cfdb7a3d7bb28d5cddc7c10a 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStore.scala 8e183efcdec6fd3f921fc2bfe1971c95715930ed 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStoreMetrics.scala 841e4a236da99cbfe121054654c648887f39c3f6 
>   samza-kv/src/test/scala/org/apache/samza/storage/kv/MockKeyValueStore.scala 595dd0df6fde50f91ab5a046a193559326f2a1d5 
> 
> Diff: https://reviews.apache.org/r/41068/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Amit Yadav
> 
>


Re: Review Request 41068: SAMZA-813: Add Seek functionality to KeyValueStoreIterator

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41068/#review122271
-----------------------------------------------------------



@Amit, the patch lgtm overall. I have a comment on seekToLast(). If you feel that seekToLast() is still useful w/o backward traversing function previous() in the iterator, please add some explanation and we can ship it.

Thanks!


samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java (line 27)
<https://reviews.apache.org/r/41068/#comment184192>

    I recommend to drop this seekToLast() for now. Based on your use case description, to achieve the backward traverse, you would need both seekToLast() and previous() in the iterator. Adding seekToLast() only won't be sufficient.


- Yi Pan (Data Infrastructure)


On Feb. 26, 2016, 6:37 p.m., Amit Yadav wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41068/
> -----------------------------------------------------------
> 
> (Updated Feb. 26, 2016, 6:37 p.m.)
> 
> 
> Review request for samza and Yi Pan (Data Infrastructure).
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Added Seek functionality to KeyValueStore and KeyValueStoreIterator. The following alternatives were considered
> 
> 1. Create a 'SeekingKeyValueIterator' and 'SeekingKeyValueStore' - Have a completely separate classes which extends their existing
>    counterparts. This solution is appealing as it provides separation of concerns. However, this also requires creating a complete
>    hierarchy of all the classes which are implementing the KeyValueStore/Iterator interface. Therefore the effort to value didnt
>    made much sense.
> 
> 2. Use the existing 'range' functionality - This solution requires a new iterator to be created which could supply the 'end'
>    key for the range. We could make this user configurable, however this assumes that the user can somehow come up with an
>    end key in all the cases. This assumption may not be true all the times. Also, I feel this is more of a workaround then
>    a real solution, more of range degenerated to a seekTo functionality and therefore is not a clean interaface.
> 
> 3. Add seek functionality to existing interfaces and classes :- In the end, this is what I implemented since it is the cleanest solution
>    in amongst all the options. This also requires minimum changes. The downside of this approach is that not al stores can support seek
>    to a particular key (especially, the one derived from java iterators).
> 
> 
> Diffs
> -----
> 
>   samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueIterator.java 854ebbfa640e96a87977ae25b67736ef57fadd8e 
>   samza-api/src/main/java/org/apache/samza/storage/kv/KeyValueStore.java b1fea7bc38d1a44e7bba6bdeb638dc33c0936c6f 
>   samza-kv-inmemory/src/main/scala/org/apache/samza/storage/kv/inmemory/InMemoryKeyValueStore.scala 72f25a354eaa98e8df379d07d9cc8613dfafd13a 
>   samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala d614f8a58882628129ec30c0fe8800395d60ed99 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStore.scala 9a5b2d5f8f76220dfc8637823e2b63dffc138a5d 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStoreMetrics.scala df8efaeefba829a47cb744d7a755cbe9e1f562f1 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngine.scala e5a66a4770b9553a1cc48fbb505f52d123c6c754 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineMetrics.scala 233fba91caf041bfb78189efef00ce8fc56f9f15 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStoreMetrics.scala 967d5097253640ee41ee84e551e15fa2285c00e2 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStore.scala 7bba6ff37d8266674e7f15c10c7c146f4a41fc91 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStoreMetrics.scala 743151a310792d4a6ff48ea102e796eb9f630bba 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/NullSafeKeyValueStore.scala 3de257c3117c16b7cfdb7a3d7bb28d5cddc7c10a 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStore.scala 8e183efcdec6fd3f921fc2bfe1971c95715930ed 
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStoreMetrics.scala 841e4a236da99cbfe121054654c648887f39c3f6 
>   samza-kv/src/test/scala/org/apache/samza/storage/kv/MockKeyValueStore.scala 595dd0df6fde50f91ab5a046a193559326f2a1d5 
> 
> Diff: https://reviews.apache.org/r/41068/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Amit Yadav
> 
>