Posted to dev@kafka.apache.org by "Ismael Juma (JIRA)" <ji...@apache.org> on 2016/04/04 20:12:25 UTC

[jira] [Commented] (KAFKA-3499) byte[] should not be used as Map key nor Set member

    [ https://issues.apache.org/jira/browse/KAFKA-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224691#comment-15224691 ] 

Ismael Juma commented on KAFKA-3499:
------------------------------------

[~guozhang], this looks important to fix so I set the fix version to 0.10.0.0. Please change it if you disagree.

> byte[] should not be used as Map key nor Set member
> ---------------------------------------------------
>
>                 Key: KAFKA-3499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3499
>             Project: Kafka
>          Issue Type: Bug
>          Components: kafka streams
>            Reporter: josh gruenberg
>             Fix For: 0.10.0.0
>
>
> On the JVM, arrays do not override equals or hashCode to incorporate their contents; they inherit the identity-based Object.equals/hashCode. As a result, collections that rely on equals/hashCode (e.g., HashMap/HashSet and variants) treat two arrays with equal contents as distinct elements.
> Many of the Kafka Streams internal classes currently use generic HashMaps and Sets to manage caches and invalidation status. For example, RocksDBStore.cacheDirtyKeys is a HashSet<K>. Then, in RocksDBWindowStore, the Elements are constructed as RocksDBStore<byte[], byte[]>.
> Similarly, the MemoryLRUCache<K, RocksDBCacheEntry> internally holds a LinkedHashMap<K,V> map and a HashSet<K> keys, and these end up holding byte[] keys. Finally, user code may attempt to use any of these provided types with byte[], with undesirable results: a lookup with an equal-content array that is not the same instance will miss.
> Keys that are byte-arrays should be wrapped in a type that incorporates the content in their computation of equals/hashCode. java.nio.ByteBuffer is one such type that could be used, but a purpose-built immutable class would likely be a better solution.
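
The behaviour described in the issue can be reproduced with a minimal sketch like the one below. It is only an illustration, not the fix that was applied in Kafka: the class names ByteArrayKeyDemo and BytesKey and the helper methods are hypothetical. It contrasts a HashSet<byte[]> with a HashSet of a small immutable wrapper whose equals/hashCode are based on the array contents.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ByteArrayKeyDemo {

    // Hypothetical immutable wrapper (not a Kafka class): equality and hash
    // code are computed from the array contents rather than object identity.
    static final class BytesKey {
        private final byte[] bytes;
        private final int hash;

        BytesKey(byte[] bytes) {
            this.bytes = bytes.clone();              // defensive copy keeps the key immutable
            this.hash = Arrays.hashCode(this.bytes); // content-based hash, computed once
        }

        byte[] get() {
            return bytes.clone();                    // never expose the internal array
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    // Number of distinct elements when raw byte[] instances are the set members.
    static int distinctRaw(byte[]... keys) {
        Set<byte[]> set = new HashSet<>();
        for (byte[] k : keys) {
            set.add(k);
        }
        return set.size();
    }

    // Number of distinct elements when each byte[] is wrapped in BytesKey.
    static int distinctWrapped(byte[]... keys) {
        Set<BytesKey> set = new HashSet<>();
        for (byte[] k : keys) {
            set.add(new BytesKey(k));
        }
        return set.size();
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3}; // same contents as a, different instance

        System.out.println(distinctRaw(a, b));     // 2: identity-based equals/hashCode
        System.out.println(distinctWrapped(a, b)); // 1: content-based equals/hashCode
    }
}
```

The defensive copies are what make the wrapper safe as a key: if a caller could mutate the array after insertion, the cached hash code would no longer match the contents. The same concern applies to java.nio.ByteBuffer, whose hashCode depends on its current remaining content, which is one reason a purpose-built immutable class may be preferable.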


