
Posted to jira@kafka.apache.org by "Peter Davis (JIRA)" <ji...@apache.org> on 2017/12/15 02:59:00 UTC

[jira] [Commented] (KAFKA-5285) optimize upper / lower byte range for key range scan on windowed stores

    [ https://issues.apache.org/jira/browse/KAFKA-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291962#comment-16291962 ] 

Peter Davis commented on KAFKA-5285:
------------------------------------

While debugging a recent performance blocker in an app of mine, I came to suspect that `ReadOnlySessionStore.fetch(from,to)`, which uses a timestamp of `Long.MAX_VALUE`, calls `upperRange` with a `maxSuffix` filled mostly with 0xFF bytes.  The resulting upper bound is therefore also mostly 0xFF, and the RocksDB iterator effectively scans (binaryKeyFrom...infinity).  The result appears "correct" because `SessionKeySchema.hasNextCondition` filters out the bogus records, but with a large number of keys this is much worse than a mere performance issue: the iterator walks thousands of unnecessary records and is slow as molasses.

It looks like the issue dates to KIP-155.

In [`SessionKeySchema#upperRange`](https://github.com/apache/kafka/commit/e28752357705568219315375c666f8e500db9c12#diff-52e7d2701ecab21b32621d9b13b7f33bR57), why is `putLong(to)` (the timestamp) called twice, and why does it not use `put(key)` to build the `maxRange`?

When using a timestamp less than `Long.MAX_VALUE`, the issue is avoided because `OrderedBytes.upperRange` copies more of the real key.  But `ReadOnlySessionStore.fetch` does not let one specify a different timestamp.
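
To make the effect concrete, here is a minimal, self-contained sketch of the behavior I am describing. It is not the Kafka source: `UpperRangeSketch` and its methods are hypothetical names, and the copying rule (keep key bytes only while they are >= the first suffix byte, always keeping at least one) is my assumption about how `OrderedBytes.upperRange` behaves. The suffix is built from two `putLong(to)` calls, as in the linked commit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch for illustration only; NOT the actual Kafka implementation.
final class UpperRangeSketch {
    // Suffix is two 8-byte timestamps, mirroring the two putLong(to) calls.
    static final int SUFFIX_SIZE = 2 * Long.BYTES;

    // Assumed rule: copy key bytes while they compare >= the first suffix byte
    // (always copying at least the first key byte), then append the suffix.
    static byte[] upperRange(byte[] key, byte[] maxSuffix) {
        ByteBuffer out = ByteBuffer.allocate(key.length + maxSuffix.length);
        int i = 0;
        while (i < key.length
                && (i < 1 || (key[i] & 0xFF) >= (maxSuffix[0] & 0xFF))) {
            out.put(key[i++]);
        }
        out.put(maxSuffix);
        byte[] result = new byte[out.position()];
        out.flip();
        out.get(result);
        return result;
    }

    // Builds the suffix from the upper timestamp and delegates to upperRange.
    static byte[] sessionUpperRange(byte[] key, long to) {
        byte[] maxSuffix = ByteBuffer.allocate(SUFFIX_SIZE)
                .putLong(to)
                .putLong(to)
                .array();
        return upperRange(key, maxSuffix);
    }

    // Lowercase hex dump, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = "abc".getBytes(StandardCharsets.UTF_8);
        // Only the first key byte survives; the rest of the bound is 0x7F/0xFF.
        System.out.println(hex(sessionUpperRange(key, Long.MAX_VALUE)));
        // prints 617fffffffffffffff7fffffffffffffff
        // A small timestamp starts with 0x00, so the whole key is kept.
        System.out.println(hex(sessionUpperRange(key, 1000L)));
        // prints 61626300000000000003e800000000000003e8
    }
}
```

Under these assumptions, the `Long.MAX_VALUE` bound keeps only the `a` of `abc`, so every key starting with `a` (and possibly more) falls inside the scanned range, whereas the smaller timestamp yields a bound that still contains the full key.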

> optimize upper / lower byte range for key range scan on windowed stores
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-5285
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5285
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Xavier Léauté
>            Assignee: Xavier Léauté
>              Labels: performance
>
> The current implementation of {{WindowKeySchema}} / {{SessionKeySchema}} {{upperRange}} and {{lowerRange}} does not make any assumptions with respect to the other key bound (e.g. the upper byte bound does not depend on the lower key bound).
> It should be possible to optimize the byte range somewhat further using the information provided by the lower bound.
> More specifically, by incorporating that information, we should be able to eliminate the corresponding {{upperRangeFixedSize}} and {{lowerRangeFixedSize}}, since the result should be the same if we implement that optimization.


