Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/24 22:15:00 UTC

[jira] [Commented] (KUDU-1291) Efficiently support predicates on non-prefix key components

    [ https://issues.apache.org/jira/browse/KUDU-1291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17144441#comment-17144441 ] 

Grant Henke commented on KUDU-1291:
-----------------------------------

A patch was posted for this here: [https://gerrit.cloudera.org/#/c/10983/]

A blog post describing the design and the in-progress work is also available: [https://kudu.apache.org/2018/09/26/index-skip-scan-optimization-in-kudu.html]

> Efficiently support predicates on non-prefix key components
> -----------------------------------------------------------
>
>                 Key: KUDU-1291
>                 URL: https://issues.apache.org/jira/browse/KUDU-1291
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: perf, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> In a lot of workloads, users have a compound primary key where the first component (or few components) is low cardinality. For example, a time series workload may have (year, month, day, entity_id, timestamp) as a primary key. A metrics or log storage workload might have (hostname, timestamp).
> It's common to want to do cross-user or cross-date analytics like 'WHERE timestamp BETWEEN <a> and <b>' without specifying any predicate for the first column(s) of the PK. Currently, we do not execute this efficiently, but rather scan the whole table evaluating the predicate.
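
For illustration only, the idea of avoiding a full scan when the predicate is on a non-prefix key column can be sketched on a sorted in-memory list, in the spirit of the index skip scan named in the blog post linked above. This is a minimal sketch, not Kudu's implementation: the function name `skip_scan`, the (hostname, timestamp) tuple layout, and numeric timestamps are all assumptions made for the example.

```python
from bisect import bisect_left

def skip_scan(rows, lo, hi):
    """Return rows whose timestamp lies in [lo, hi].

    `rows` is a list of (hostname, timestamp) tuples sorted ascending,
    standing in for data ordered by a compound primary key.
    Hypothetical helper for illustration; timestamps are assumed numeric.
    """
    out = []
    i, n = 0, len(rows)
    while i < n:
        prefix = rows[i][0]
        # Seek within the current prefix to the first timestamp >= lo.
        j = bisect_left(rows, (prefix, lo), i, n)
        # Read matching rows until the prefix changes or timestamp > hi.
        while j < n and rows[j][0] == prefix and rows[j][1] <= hi:
            out.append(rows[j])
            j += 1
        # Skip the rest of this prefix: seek to the next distinct prefix.
        i = bisect_left(rows, (prefix, float("inf")), j, n)
    return out

rows = sorted([
    ("host-a", 1), ("host-a", 5), ("host-a", 9),
    ("host-b", 2), ("host-b", 6),
    ("host-c", 5), ("host-c", 7),
])
matches = skip_scan(rows, 5, 6)
```

The number of seeks grows with the number of distinct prefix values, so this approach only pays off when the leading key column is low cardinality, which is the case the issue description targets.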



--
This message was sent by Atlassian Jira
(v8.3.4#803005)