Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/11/12 07:00:00 UTC

[jira] [Commented] (KUDU-1644) Simplify IN-list predicate values based on tablet partition key or rowset PK bounds

    [ https://issues.apache.org/jira/browse/KUDU-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230392#comment-17230392 ] 

ASF subversion and git services commented on KUDU-1644:
-------------------------------------------------------

Commit 6a7cadc7eddeaaa374971d5ba16fec8422e33db9 in kudu's branch refs/heads/master from ningw
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=6a7cadc ]

KUDU-1644 hash-partition based in-list predicate optimization

Hash pruning for IN-list queries on a single hash key: reduce the predicate
values by matching them against each tablet's hash partition.
This patch reduces the IN-list predicate values pushed to each tablet
without changing the content returned.

The table has P partitions and N records. The IN-list predicate has V values.

Before:
For each tablet, the time complexity of a hash-key based IN-list query is:
LOG(V) * N

After:
Complexity becomes:
LOG(V/P) * N
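
As a rough illustration of these two estimates, the sketch below assumes that
each scanned row is tested for membership with a binary search over the sorted
IN-list, and that the V values spread evenly across the P partitions. Both are
assumptions made for illustration, as are the function name and sample sizes;
none of this is taken from the Kudu source.

```python
import math


def estimated_comparisons(num_rows: int, num_values: float) -> float:
    """Rough cost model: one binary search (log2 of the list size) per row.

    Assumption for illustration only; not Kudu's actual cost accounting.
    """
    return num_rows * math.log2(num_values)


# Hypothetical sizes: N rows scanned, V in-list values, P hash partitions.
N, V, P = 1_000_000, 1024, 16

before = estimated_comparisons(N, V)      # every tablet sees all V values
after = estimated_comparisons(N, V / P)   # each tablet sees about V/P values

print(f"before: {before:.0f} comparisons, after: {after:.0f} comparisons")
```

Under this model the saving per tablet is N * LOG(P), so it grows with the
number of hash partitions.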

E.g.
Hash partitioning of table 'profile':
partition by hash(id) partitions 3, assuming for simplicity that the hash function is mod.
select * from profile where id in (1,2,3,4,5,6,7,8,9,10)

Before:
Tablet 1: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 2: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 3: id in (1,2,3,4,5,6,7,8,9,10)

After:
Tablet 1: id in (3,6,9)
Tablet 2: id in (1,4,7,10)
Tablet 3: id in (2,5,8)
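
The pruning in this example can be sketched as follows. This is a minimal
illustration, not the Kudu implementation: `toy_hash` (plain modulo, as the
example assumes) and `prune_in_list` are hypothetical names, and hash bucket b
is assumed to map to Tablet b + 1.

```python
from collections import defaultdict


def toy_hash(value: int, num_partitions: int) -> int:
    """Stand-in hash function: plain modulo, as in the example above."""
    return value % num_partitions


def prune_in_list(values, num_partitions):
    """Group in-list values by the hash bucket they fall into.

    Returns a dict mapping bucket index -> values that can match in that
    bucket. Every bucket is present, possibly with an empty list.
    """
    per_bucket = defaultdict(list)
    for v in values:
        per_bucket[toy_hash(v, num_partitions)].append(v)
    return {b: per_bucket.get(b, []) for b in range(num_partitions)}


# select * from profile where id in (1,2,3,4,5,6,7,8,9,10), 3 hash partitions
pruned = prune_in_list(range(1, 11), 3)
for bucket, vals in pruned.items():
    print(f"Tablet {bucket + 1}: id in ({','.join(map(str, vals))})")
```

Each tablet receives only the values that hash to its bucket, and the union of
the per-tablet lists equals the original IN-list, so the rows returned by the
query are unchanged.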

Change-Id: I202001535669a72de7fbb9e766dbc27db48e0aa2
Reviewed-on: http://gerrit.cloudera.org:8080/16674
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>


> Simplify IN-list predicate values based on tablet partition key or rowset PK bounds
> -----------------------------------------------------------------------------------
>
>                 Key: KUDU-1644
>                 URL: https://issues.apache.org/jira/browse/KUDU-1644
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: perf, tablet
>            Reporter: Dan Burkert
>            Priority: Major
>         Attachments: image-2019-12-05-14-52-05-846.png, image-2019-12-05-14-52-18-487.png, image-2019-12-05-14-53-51-175.png, image-2019-12-05-14-53-57-741.png, image-2019-12-05-14-54-03-485.png
>
>
> When new scans are optimized by the tablet, the tablet's partition key bounds aren't taken into account in order to remove predicates from the scan.  One of the most important such optimizations is that IN-list predicates could remove values based on the tablet's constraints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)