Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2020/11/30 23:52:00 UTC

[jira] [Comment Edited] (KAFKA-7918) Streams store cleanup: inline byte-store generic parameters

    [ https://issues.apache.org/jira/browse/KAFKA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241139#comment-17241139 ] 

Matthias J. Sax edited comment on KAFKA-7918 at 11/30/20, 11:51 PM:
--------------------------------------------------------------------

The existing caching layer also collapses writes into the changelog for the same key – but if you don't do the serialization, it's very hard to budget memory usage, because you no longer know how many bytes each cached entry occupies. That could make the system unstable, and your JVM might crash with out-of-memory errors.
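
To make the budgeting point concrete, here is a minimal sketch of a cache that stores serialized values and can therefore account for its size in bytes. It is not the actual Streams caching layer: the class `ByteBudgetCache`, its methods, and the use of `String` keys are all hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the Kafka Streams cache implementation.
final class ByteBudgetCache {
    private final long maxBytes;
    private long usedBytes = 0L;
    // Access-ordered map, so iteration starts at the least recently used entry.
    private final LinkedHashMap<String, byte[]> entries =
        new LinkedHashMap<>(16, 0.75f, true);
    // Keys that were evicted; a real cache would forward these entries downstream.
    private final List<String> evictedKeys = new ArrayList<>();

    ByteBudgetCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Because values are already serialized, the size of an entry is exact.
    private static long sizeOf(String key, byte[] value) {
        return key.getBytes(StandardCharsets.UTF_8).length + value.length;
    }

    void put(String key, byte[] serializedValue) {
        byte[] previous = entries.put(key, serializedValue);
        if (previous != null) {
            // Overwriting collapses the older write for the same key.
            usedBytes -= sizeOf(key, previous);
        }
        usedBytes += sizeOf(key, serializedValue);

        // Evict least recently used entries until the byte budget is met.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= sizeOf(eldest.getKey(), eldest.getValue());
            evictedKeys.add(eldest.getKey());
            it.remove();
        }
    }

    long usedBytes() { return usedBytes; }

    int size() { return entries.size(); }

    List<String> evictedKeys() { return evictedKeys; }
}
```

With deserialized objects there is no cheap equivalent of `sizeOf`, since the heap footprint of an arbitrary object graph is not known up front, and that is where the budgeting problem comes from.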

We obviously want to improve the system, and if you have a good solution, feel free to propose a KIP ([https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]). Getting rid of the expensive (de)serialization cost could be a huge win. My gut feeling, though, is that without good memory management it would be hard to convince people; stability might be more important than performance.


was (Author: mjsax):
The existing caching layer also collapses writes into the changelog for the same key – but if you don't do the serialization, it's very hard to budget the used memory. It could make the system unstable and your JVM might crash with ouf-of-memory errors.

> Streams store cleanup: inline byte-store generic parameters
> -----------------------------------------------------------
>
>                 Key: KAFKA-7918
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7918
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: John Roesler
>            Assignee: A. Sophie Blee-Goldman
>            Priority: Major
>             Fix For: 2.3.0
>
>
> Currently, the fundamental layer of stores in Streams is the "bytes store".
> The easiest place to see this is `org.apache.kafka.streams.state.Stores`, where all the `StoreBuilder`s require an `XXBytesStoreSupplier`.
> We provide several implementations of these bytes stores, typically an in-memory one and a persistent one (aka RocksDB).
> Inside these bytes stores, the key is always `Bytes` and the value is always `byte[]` (serialization happens at a higher level). However, the store implementations are still generically typed with plain `K` and `V`.
> This is good for flexibility, but it makes the code a little harder to understand. I think that we used to do serialization at a lower level, so the generics are a hold-over from that.
> It would simplify the code if we just inlined the actual k/v types and maybe even renamed the classes from (e.g.) `InMemoryKeyValueStore<K,V>` to `InMemoryKeyValueBytesStore`, and so forth.
>  
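
The proposed inlining in the description above could look roughly like the following sketch. It is illustrative only and does not reproduce the actual Streams classes: `BytesKey` is a hypothetical stand-in for `org.apache.kafka.common.utils.Bytes` so the example has no Kafka dependency, and both store classes are heavily simplified.

```java
import java.util.Arrays;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for Kafka's Bytes class: an immutable, comparable byte[] wrapper.
final class BytesKey implements Comparable<BytesKey> {
    private final byte[] data;

    BytesKey(byte[] data) {
        this.data = data.clone();
    }

    // Lexicographic comparison, treating each byte as unsigned.
    @Override
    public int compareTo(BytesKey other) {
        int n = Math.min(data.length, other.data.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(data[i] & 0xff, other.data[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(data.length, other.data.length);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytesKey && Arrays.equals(data, ((BytesKey) o).data);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(data);
    }
}

// Before (simplified): generically typed, although only ever used with bytes.
final class InMemoryKeyValueStore<K extends Comparable<K>, V> {
    private final NavigableMap<K, V> map = new TreeMap<>();

    void put(K key, V value) { map.put(key, value); }

    V get(K key) { return map.get(key); }
}

// After (simplified): the actual key/value types are inlined and the name says so.
final class InMemoryKeyValueBytesStore {
    private final NavigableMap<BytesKey, byte[]> map = new TreeMap<>();

    void put(BytesKey key, byte[] value) { map.put(key, value); }

    byte[] get(BytesKey key) { return map.get(key); }
}
```

The behavior is the same in both versions; the second just makes it visible from the signatures that no serialization happens at this layer.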



--
This message was sent by Atlassian Jira
(v8.3.4#803005)