Posted to dev@kafka.apache.org by Alexander Jipa <al...@gmail.com> on 2016/06/08 18:47:05 UTC

Kafka Streams aggregation store

Hello,
According to http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple:
“In terms of implementation Kafka Streams stores this derived aggregation in a local embedded key-value store (RocksDB by default, but you can plug in anything).”
 
So I tried running the word count example on my Windows machine (for a local test) and got an error because RocksDB is not available for Windows.
I thought it would be easy to switch to an in-memory store.
But after a while I figured out that the KStream aggregation implementation doesn’t allow that.
It looks like aggregateByKey (and thus countByKey) always uses a persistent store.
Moreover, it looks like there’s no way to change the default persistent store…
 
Although I more or less achieved the goal by manually wiring a Source, a Producer and a Sink, that does not make for easy coding…
 
The questions that I have are:
- Is there a plan to provide persistent store support for Kafka Streams on Windows?
- Is there a plan to provide a KStream API for specifying a custom store/factory for aggregations?
- Is there a way to change the default persistent store from RocksDB?
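
For illustration only, the kind of pluggable store asked about in the second question could look roughly like the sketch below. It is plain Java with no Kafka dependency, and every name in it (CountStore, InMemoryCountStore, PluggableStoreSketch, countByKey) is hypothetical rather than part of the Kafka Streams API. The idea is simply that the aggregation receives a store factory instead of hard-coding a persistent store.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical store abstraction; NOT the Kafka Streams KeyValueStore interface.
interface CountStore {
    Long get(String key);
    void put(String key, Long value);
}

// Hypothetical in-memory implementation backed by a HashMap.
final class InMemoryCountStore implements CountStore {
    private final Map<String, Long> map = new HashMap<>();

    @Override
    public Long get(String key) {
        return map.get(key);
    }

    @Override
    public void put(String key, Long value) {
        map.put(key, value);
    }
}

final class PluggableStoreSketch {
    // Counts occurrences per key, keeping the running totals in whatever
    // store the caller-supplied factory creates.
    static CountStore countByKey(List<String> keys, Supplier<CountStore> storeFactory) {
        CountStore store = storeFactory.get();
        for (String key : keys) {
            Long current = store.get(key);
            store.put(key, current == null ? 1L : current + 1L);
        }
        return store;
    }

    public static void main(String[] args) {
        // The caller chooses the store; here the in-memory one.
        CountStore counts = countByKey(
                Arrays.asList("kafka", "streams", "kafka"),
                InMemoryCountStore::new);
        System.out.println("kafka=" + counts.get("kafka")
                + ", streams=" + counts.get("streams"));
    }
}
```

A persistent implementation would plug in the same way, by passing a different factory to countByKey.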
 
Best Regards,
Alexander Jipa

Re: Kafka Streams aggregation store

Posted by Eno Thereska <en...@gmail.com>.

Hi Alexander,

I haven't tried Kafka Streams on Windows, but I did notice that Microsoft has merged code on GitHub to make RocksDB available on Windows. Perhaps this is useful:
https://blogs.msdn.microsoft.com/bingdevcenter/2015/07/22/open-source-contribution-from-bing-rocksdb-is-now-available-in-windows-platform/

Thanks,
Eno
