Posted to user@kudu.apache.org by Boris Tyukin <bo...@boristyukin.com> on 2018/12/11 20:59:19 UTC

KuduScanner with multiple sets of compound primary keys

Hi guys,

My Kudu table has several PK columns, and I need to create a scanner that
pulls multiple rows, each identified by its own set of primary key values.
In Impala, it would be something like:

SELECT pk1, pk2, col1 FROM table1
WHERE
      (pk1 = 1 AND pk2 = 11)
   OR (pk1 = 2 AND pk2 = 22)
   OR (pk1 = 3 AND pk2 = 33)

I tried one KuduScanner per PK set and it works, but I want to see if I can
get better performance by scanning all PK sets at once. I cannot figure out
how to express the OR part with KuduScannerBuilder and its addPredicate
method.

Thanks!

Re: KuduScanner with multiple sets of compound primary keys

Posted by Boris Tyukin <bo...@boristyukin.com>.
got it, thanks Adar!

On Tue, Dec 11, 2018 at 4:26 PM Adar Lieber-Dembo <ad...@cloudera.com> wrote:

> Unfortunately that isn't possible with Kudu today. The workaround is,
> as you said, to perform one scan per predicate and to union the
> results.
>
> KUDU-2494 tracks adding support for disjunctions (i.e. OR predicates);
> if this is something you'd be interested in working on, your patches
> would be welcome.

Re: KuduScanner with multiple sets of compound primary keys

Posted by Adar Lieber-Dembo <ad...@cloudera.com>.
Unfortunately that isn't possible with Kudu today. The workaround is,
as you said, to perform one scan per predicate and to union the
results.

KUDU-2494 tracks adding support for disjunctions (i.e. OR predicates);
if this is something you'd be interested in working on, your patches
would be welcome.
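
For illustration, the workaround above (one scan per key set, results
unioned on the client) could be sketched roughly as follows. This is not
code from the Kudu project: MultiKeyScan, scanAll, fakeScan and demo are
hypothetical names, and the per-key scan is a plain Function so the sketch
runs against an in-memory stand-in for table1 without a cluster. With the
real Java client, each per-key scan would presumably be a KuduScanner built
via KuduClient.newScannerBuilder with one
KuduPredicate.newComparisonPredicate(..., ComparisonOp.EQUAL, ...) per PK
column. Running the scans on a thread pool is an assumption about where a
speedup might come from, not a measured result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class MultiKeyScan {

    // Runs scanOne once per key on a fixed-size pool and concatenates the
    // per-key results in key order (the client-side "union").
    static <K, R> List<R> scanAll(List<K> keys, Function<K, List<R>> scanOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<R>>> tasks = new ArrayList<>();
            for (K key : keys) {
                tasks.add(() -> scanOne.apply(key));
            }
            List<R> union = new ArrayList<>();
            // invokeAll waits for all tasks and returns futures in task order.
            for (Future<List<R>> f : pool.invokeAll(tasks)) {
                union.addAll(f.get());
            }
            return union;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while scanning", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a per-key scan failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // In-memory stand-in for table1; each row is {pk1, pk2, col1}.
    static final List<List<Long>> TABLE = Arrays.asList(
            Arrays.asList(1L, 11L, 100L),
            Arrays.asList(1L, 99L, 500L),
            Arrays.asList(2L, 22L, 200L),
            Arrays.asList(3L, 33L, 300L),
            Arrays.asList(4L, 44L, 400L));

    // Hypothetical stand-in for a single Kudu scan with pk1 = ? AND pk2 = ?.
    static List<List<Long>> fakeScan(List<List<Long>> table, List<Long> key) {
        List<List<Long>> out = new ArrayList<>();
        for (List<Long> row : table) {
            if (row.get(0).equals(key.get(0)) && row.get(1).equals(key.get(1))) {
                out.add(row);
            }
        }
        return out;
    }

    // Equivalent of the three OR branches from the original question.
    static List<List<Long>> demo() {
        List<List<Long>> keys = Arrays.asList(
                Arrays.asList(1L, 11L),
                Arrays.asList(2L, 22L),
                Arrays.asList(3L, 33L));
        return scanAll(keys, key -> fakeScan(TABLE, key), 3);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[1, 11, 100], [2, 22, 200], [3, 33, 300]]
    }
}
```

Because every key set is a full primary key, each scan returns at most one
row and the results cannot overlap, so simple concatenation is enough. If
the key sets could overlap, the union step would need de-duplication. When
swapping in real scanners, it is worth checking the client documentation for
which objects can be shared across threads.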
