Posted to jira@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2018/02/15 00:35:03 UTC
[jira] [Issue Comment Deleted] (KAFKA-4608)
RocksDBWindowStore.fetch() is inefficient for large ranges
[ https://issues.apache.org/jira/browse/KAFKA-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guozhang Wang updated KAFKA-4608:
---------------------------------
Comment: was deleted
(was: I have filed https://issues.apache.org/jira/browse/KAFKA-6560 to tackle this issue; it aims to use only point queries for window stores rather than range queries.)
> RocksDBWindowStore.fetch() is inefficient for large ranges
> ----------------------------------------------------------
>
> Key: KAFKA-4608
> URL: https://issues.apache.org/jira/browse/KAFKA-4608
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 0.10.1.1
> Reporter: Elias Levy
> Priority: Major
>
> It is not unreasonable for a user to call {{RocksDBWindowStore.fetch}} to scan for a key across a large time range. For instance, someone may call it with a {{timeFrom}} of zero or a {{timeTo}} of max long in an attempt to fetch all matching keys from the beginning of time, or until the end of time.
> But if you do so, {{fetch}} will peg the CPU, as it attempts to iterate over every single segment id in the range. That is obviously very inefficient.
> {{fetch}} should trim the {{timeFrom}}/{{timeTo}} range based on the available time range in the {{segments}} hash map, so that it only iterates over the available time range.
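The trimming suggested above can be sketched in Java. This is a hypothetical, simplified model, not the actual Kafka Streams code: the segment map, the segment interval, and the class and method names are all illustrative assumptions. The idea is to clamp the requested segment-id range to the ids actually present, so a fetch over [0, Long.MAX_VALUE] touches only the live segments instead of iterating billions of empty ids.

```java
import java.util.TreeMap;

// Hypothetical sketch of trimming a fetch range to the segments that exist.
// Names and the segment interval are illustrative, not Kafka's actual code.
public class SegmentRangeClamp {

    static final long SEGMENT_INTERVAL_MS = 60_000L; // assumed segment size

    // Segments keyed by segment id; values stand in for RocksDB segment handles.
    static final TreeMap<Long, String> segments = new TreeMap<>();

    static long segmentId(final long timestamp) {
        return timestamp / SEGMENT_INTERVAL_MS;
    }

    /**
     * Returns the inclusive [first, last] segment ids to scan for the given
     * time range, clamped to the segments actually present, or null if the
     * requested range does not overlap any existing segment.
     */
    static long[] segmentsToScan(final long timeFrom, final long timeTo) {
        if (segments.isEmpty()) {
            return null;
        }
        final long from = Math.max(segmentId(timeFrom), segments.firstKey());
        final long to = Math.min(segmentId(timeTo), segments.lastKey());
        return from <= to ? new long[] {from, to} : null;
    }

    public static void main(final String[] args) {
        segments.put(100L, "segment-100");
        segments.put(101L, "segment-101");
        // A fetch over all of time now visits only the two live segments.
        final long[] range = segmentsToScan(0L, Long.MAX_VALUE);
        System.out.println(range[0] + ".." + range[1]);
    }
}
```

With only two segments present, a scan over the full time range collapses to ids 100..101, which is exactly the behavior the description asks {{fetch}} to adopt.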
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)