Posted to jira@kafka.apache.org by "Guozhang Wang (Jira)" <ji...@apache.org> on 2020/06/18 18:59:00 UTC

[jira] [Comment Edited] (KAFKA-10005) Decouple RestoreListener from RestoreCallback and not enable bulk loading for RocksDB

    [ https://issues.apache.org/jira/browse/KAFKA-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139533#comment-17139533 ] 

Guozhang Wang edited comment on KAFKA-10005 at 6/18/20, 6:58 PM:
-----------------------------------------------------------------

As a quick summary, my proposal is threefold:

1) Use {{db.addFileWithFileInfo(externalSstFileInfo)}} during restoration to add batches of records directly as SST files; this replaces the benefit we currently get from bulk loading.
2) Move the restoration off the stream thread to a different thread (pool), both for restoring active tasks and for updating standby tasks.
3) If needed, also disable compaction during the restoration and do a one-phase full compaction when it completes. I'm keeping this "optional" for now since disabling compaction has both pros and cons, and if we get good performance from 1) and 2) alone then maybe we can afford to keep compaction enabled.
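
To make point 2) concrete, here is a minimal sketch of running restoration on a dedicated pool so the calling thread only waits for completion. Everything in it is hypothetical: {{RestorationPoolSketch}}, {{SketchStore}} and {{restoreAll}} are illustrative names, the in-memory map stands in for a real state store, and each changelog is assumed to be already split into batches. It is not how Streams is or would be implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these types exist in Kafka Streams.
final class RestorationPoolSketch {

    /** In-memory stand-in for a state store being restored. */
    static final class SketchStore {
        final Map<String, String> data = new ConcurrentHashMap<>();

        void applyBatch(Map<String, String> batch) {
            data.putAll(batch);
        }
    }

    /**
     * Restores every store on a pool thread. Batches of one store are applied
     * in order by a single task; different stores may restore concurrently.
     */
    static void restoreAll(Map<SketchStore, List<Map<String, String>>> changelogs, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map.Entry<SketchStore, List<Map<String, String>>> entry : changelogs.entrySet()) {
                pending.add(pool.submit(() -> {
                    for (Map<String, String> batch : entry.getValue()) {
                        entry.getKey().applyBatch(batch);
                    }
                }));
            }
            // Wait for all restorations and surface the first failure, if any.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("restoration interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("restoration failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the caller blocks until all stores are restored; a real stream thread would presumably keep processing other tasks and poll for completion instead.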

We already have an internal BulkLoadStore interface, which e.g. RocksDBStore implements; we can leverage that interface to "toggle" restoration mode for 1) and 3) above.
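
A rough sketch of what such a toggle could look like, assuming a single boolean switch on the store. The interface and class names ({{RestorationModeToggle}}, {{ToggleSketchStore}}) and the method names are made up for illustration and are not the existing internal interface; the store only records what a real RocksDB-backed store would do (ingesting SST files, disabling auto-compaction, running one full compaction) and makes no RocksDB calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; the name and method are illustrative only.
interface RestorationModeToggle {
    void toggleRestorationMode(boolean restoring);
}

// Hypothetical store that only models the state changes; no RocksDB calls are made.
final class ToggleSketchStore implements RestorationModeToggle {
    boolean autoCompactionEnabled = true;
    int fullCompactions = 0;
    final List<String> ingestedFiles = new ArrayList<>();
    private boolean restoring = false;

    @Override
    public void toggleRestorationMode(boolean restoring) {
        if (this.restoring == restoring) {
            return; // already in the requested mode
        }
        this.restoring = restoring;
        if (restoring) {
            // Point 3): stop background compaction while restoring.
            autoCompactionEnabled = false;
        } else {
            // Point 3): one full compaction on completion, then back to normal.
            fullCompactions++;
            autoCompactionEnabled = true;
        }
    }

    /** Point 1): in restoration mode, batches arrive as ready-made SST files. */
    void ingest(String sstFileName) {
        if (!restoring) {
            throw new IllegalStateException("store is not in restoration mode");
        }
        ingestedFiles.add(sstFileName);
    }
}
```

The toggle only flips in-memory flags here, which is the point of the sketch: switching modes does not have to involve closing and reopening the store.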

cc [~cadonna]



> Decouple RestoreListener from RestoreCallback and not enable bulk loading for RocksDB
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10005
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10005
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>            Priority: Major
>             Fix For: 2.6.0
>
>
> In Kafka Streams we have two restoration callbacks:
> * RestoreCallback (BatchingRestoreCallback): specified per-store via registration to specify the logic of applying a batch of records read from the changelog to the store. Used for both updating standby tasks and restoring active tasks.
> * RestoreListener: specified per-instance via `setRestoreListener`, to specify the logic for `onRestoreStart / onRestoreEnd / onBatchRestored`.
> As we can see, these two callbacks serve quite different purposes; however, today we allow users to register a per-store RestoreCallback which also implements the RestoreListener. This odd mixing is actually motivated by Streams' internal usage, to enable / disable bulk loading inside RocksDB. For users, however, it is less meaningful to make a callback a listener as well, since `onRestoreStart / End` has the storeName passed in, so users can just define different listening logic for different stores if needed.
> On the other hand, this mixing of the two callbacks forces Streams to check internally whether the passed-in per-store callback also implements the listener, and if so to trigger its calls, which increases complexity. Besides, toggling RocksDB for bulk loading requires us to open / close / reopen / reclose 4 times during the restoration, which could also be costly.
> Given that we have KIP-441 in place, I think we should consider ways other than toggling bulk loading during restoration for Streams (e.g. using different threads for restoration).
> The proposal for this ticket is to completely decouple the listener from the callback -- i.e. we would no longer presume that users pass in a callback that implements both RestoreCallback and RestoreListener, and for RocksDB we replace the bulk loading mechanism with other ways of optimization: https://rockset.com/blog/optimizing-bulk-load-in-rocksdb/
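
The decoupling described in the issue can be illustrated with a simplified sketch. The interfaces below are hypothetical stand-ins with reduced signatures, not the actual Streams interfaces; the assumption is only that the restore loop receives the per-store callback and the per-instance listener as two separate objects and never checks whether one implements the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Kafka Streams interfaces.
final class DecoupledRestoreSketch {

    /** Per-store: how a batch of changelog records is applied to the store. */
    interface BatchCallback {
        void restoreBatch(List<String> records);
    }

    /** Per-instance: progress notifications, keyed by store name. */
    interface Listener {
        void onRestoreStart(String storeName);
        void onBatchRestored(String storeName, long numRestored);
        void onRestoreEnd(String storeName, long totalRestored);
    }

    /** Listener that just records the events it sees, for demonstration. */
    static final class RecordingListener implements Listener {
        final List<String> events = new ArrayList<>();

        @Override
        public void onRestoreStart(String storeName) {
            events.add("start:" + storeName);
        }

        @Override
        public void onBatchRestored(String storeName, long numRestored) {
            events.add("batch:" + storeName + ":" + numRestored);
        }

        @Override
        public void onRestoreEnd(String storeName, long totalRestored) {
            events.add("end:" + storeName + ":" + totalRestored);
        }
    }

    /** The restore loop takes both roles separately; no instanceof checks. */
    static long restore(String storeName, List<List<String>> batches,
                        BatchCallback callback, Listener listener) {
        listener.onRestoreStart(storeName);
        long total = 0;
        for (List<String> batch : batches) {
            callback.restoreBatch(batch);
            total += batch.size();
            listener.onBatchRestored(storeName, batch.size());
        }
        listener.onRestoreEnd(storeName, total);
        return total;
    }
}
```

Because the listener receives the store name on every call, one instance-wide listener can still branch per store, which is the argument the description makes against per-store callbacks doubling as listeners.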



--
This message was sent by Atlassian Jira
(v8.3.4#803005)