Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2015/03/10 02:06:38 UTC

[jira] [Assigned] (SAMZA-505) CachedStore doesn't support Array keys well

     [ https://issues.apache.org/jira/browse/SAMZA-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini reassigned SAMZA-505:
-------------------------------------

    Assignee: Chris Riccomini

> CachedStore doesn't support Array keys well
> -------------------------------------------
>
>                 Key: SAMZA-505
>                 URL: https://issues.apache.org/jira/browse/SAMZA-505
>             Project: Samza
>          Issue Type: Bug
>          Components: kv
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>             Fix For: 0.9.0
>
>
> Several people have hit an issue when using the Key/Value store with byte[] keys. CachedStore uses a HashMap, and arrays inherit equals/hashCode from Object, so they compare by object identity rather than by content. Two byte[] keys with the same contents are therefore treated as different keys, and the HashMap behaves unexpectedly. This isn't really a bug, just a common misunderstanding of how things work. It's compounded by the fact that caching is on by default. This yields the behavior:
> {code}
> store.put("a".getBytes, 1)
> store.get("a".getBytes) // returns null: this getBytes call creates a new array instance
> {code}
> See [this discussion|http://stackoverflow.com/questions/1058149/using-a-byte-array-as-hashmap-key-java] for details.
> Our TestKeyValueStore uses byte[] keys, but it keeps them in a list and reuses the exact same instances, so it doesn't hit this problem.
> I think we should wrap array keys in ByteBuffer, or use our own wrapper. We'll have to make sure to unwrap before calling the put/get/delete operations on the underlying store.
> Initially, I was thinking that the safest thing to do would be to have CachedStore check all keys and throw an exception when a key is an array. This would allow users to choose the best course of action (ByteBuffer.wrap, use an alternative key, write a custom wrapper class, etc). But I think this approach doesn't work in some cases. If a cached store uses a JSON serde and the user's key is an Array[Int], that key is valid: a JSON serde would just serialize it as [1,2,3], and everything should work in this case.
> Since this problem is basically an implementation detail introduced by CachedStore, I think it should be fixed internally by wrapping/unwrapping array keys.
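
The identity problem and the proposed wrapping fix can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not Samza's CachedStore: the class ArrayKeyCacheSketch and its methods are hypothetical names, a plain HashMap stands in for the cache, and ByteBuffer.wrap is only one of the wrapping options mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a plain HashMap stands in for the cache layer.
final class ArrayKeyCacheSketch {

    // Cache keyed directly by byte[]: lookups depend on array identity.
    public static Integer lookupWithRawArrayKey() {
        Map<byte[], Integer> cache = new HashMap<>();
        cache.put("a".getBytes(StandardCharsets.UTF_8), 1);
        // The second getBytes call allocates a new array, so this lookup misses.
        return cache.get("a".getBytes(StandardCharsets.UTF_8));
    }

    // Cache keyed by ByteBuffer: equals/hashCode are based on the bytes.
    public static Integer lookupWithWrappedKey() {
        Map<ByteBuffer, Integer> cache = new HashMap<>();
        cache.put(ByteBuffer.wrap("a".getBytes(StandardCharsets.UTF_8)), 1);
        // A different array with the same contents now finds the entry.
        return cache.get(ByteBuffer.wrap("a".getBytes(StandardCharsets.UTF_8)));
    }

    // Unwrap before passing the key to the underlying store.
    // For a buffer created with ByteBuffer.wrap(array), this returns that array.
    public static byte[] unwrap(ByteBuffer wrapped) {
        return wrapped.array();
    }

    public static void main(String[] args) {
        System.out.println("raw byte[] key:     " + lookupWithRawArrayKey());
        System.out.println("wrapped ByteBuffer: " + lookupWithWrappedKey());
    }
}
```

Note that ByteBuffer's equals and hashCode depend on the buffer's remaining bytes, so a wrapped key should not have its position or limit changed while it is in the map.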



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)