Posted to jira@kafka.apache.org by "Sagar Rao (Jira)" <ji...@apache.org> on 2021/04/08 15:13:00 UTC

[jira] [Comment Edited] (KAFKA-8295) Optimize count() using RocksDB merge operator

    [ https://issues.apache.org/jira/browse/KAFKA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317116#comment-17317116 ] 

Sagar Rao edited comment on KAFKA-8295 at 4/8/21, 3:12 PM:
-----------------------------------------------------------

Hey [~ableegoldman], this is a very old ticket and my question is slightly unrelated, but I was looking at merge for one of my use cases, so I thought I would ask you. I'm not sure what the state was when you created this ticket, but there is a MergeOperator which can be customised: [https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/MergeOperator.html]

Was this option available when you created the ticket? Also, do you think this could be added as an option for state stores in Kafka Streams?


was (Author: sagarrao):
Hey [~ableegoldman], this is a very old ticket, but I was looking at merge for one of my use cases, so I thought I would ask you. I'm not sure what the state was when you created this ticket, but there is a MergeOperator which can be customised: [https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/MergeOperator.html]

But this would still cross the JNI boundary to actually run the merge operator.

> Optimize count() using RocksDB merge operator
> ---------------------------------------------
>
>                 Key: KAFKA-8295
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8295
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> In addition to the regular put/get/delete, RocksDB provides a fourth operation, merge. This essentially provides an optimized read/update/write path in a single operation. One of the built-in (C++) merge operators exposed over the Java API is a counter. We should be able to leverage this for a more efficient implementation of count().
>  
> (Note: Unfortunately, it seems unlikely we can use this to optimize general aggregations, even if RocksJava allowed for a custom merge operator, unless we provide a way for the user to specify and connect a C++ implemented aggregator – otherwise we incur too much cost crossing the JNI to see a net performance benefit.)
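
To make the difference between the two update paths in the description concrete, here is a minimal in-memory sketch in plain Java. It is only a model and does not use RocksDB or Kafka Streams: ToyStore, countWithGetPut and countWithMerge are hypothetical names invented for illustration. It assumes a simplified picture of merge in which operands are recorded on write and only folded into the value on a later read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeCountSketch {

    /** Toy key-value store (hypothetical, not a RocksDB API) with a counter-style merge. */
    public static final class ToyStore {
        private final Map<String, Long> base = new HashMap<>();
        private final Map<String, List<Long>> pendingOperands = new HashMap<>();

        /** Returns the base value plus any pending merge operands, or null if the key is unknown. */
        public Long get(String key) {
            Long value = base.get(key);
            List<Long> operands = pendingOperands.get(key);
            if (operands == null) {
                return value;
            }
            long total = (value == null) ? 0L : value;
            for (long operand : operands) {
                total += operand;
            }
            return total;
        }

        /** Overwrites the value and discards operands recorded before this put. */
        public void put(String key, long value) {
            base.put(key, value);
            pendingOperands.remove(key);
        }

        /** Records an operand without reading the current value. */
        public void merge(String key, long operand) {
            pendingOperands.computeIfAbsent(key, k -> new ArrayList<>()).add(operand);
        }
    }

    /** count() as read/update/write: one get and one put per record. */
    public static void countWithGetPut(ToyStore store, String key) {
        Long current = store.get(key);
        store.put(key, current == null ? 1L : current + 1L);
    }

    /** count() as merge: a single write per record, no read on the update path. */
    public static void countWithMerge(ToyStore store, String key) {
        store.merge(key, 1L);
    }

    public static void main(String[] args) {
        ToyStore viaGetPut = new ToyStore();
        ToyStore viaMerge = new ToyStore();
        for (int i = 0; i < 3; i++) {
            countWithGetPut(viaGetPut, "k");
            countWithMerge(viaMerge, "k");
        }
        // Both paths end up with the same count for "k".
        System.out.println(viaGetPut.get("k") + " " + viaMerge.get("k"));
    }
}
```

Both paths produce the same count. The merge variant just moves the addition off the write path, which is the saving the ticket hopes to get from the built-in counter operator.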



--
This message was sent by Atlassian Jira
(v8.3.4#803005)