Posted to user@kudu.apache.org by Irtiza Ali <ia...@an10.io> on 2018/09/02 06:25:53 UTC

Kudu's data pagination

Hello everyone,

Is there a way to paginate Kudu's data using its Python client?



Re: Kudu's data pagination

Posted by Dan Burkert <da...@apache.org>.
Without the SORT BY requirement, it's possible to do this by setting the
lower bound of the scan's primary key range to the last key of the previous
page, incremented, then adding a limit and making the scan fault-tolerant.

Here are the options you'll need to configure:

https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#lowerBound-org.apache.kudu.client.PartialRow-
https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#limit-long-
https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#setFaultTolerant-boolean-
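
To illustrate the idea only, here is a minimal Python sketch of key-based
paging. It does not call the Kudu client: an in-memory list sorted by an
integer primary key stands in for a tablet, and fetch_page, paginate, ROWS
and KEYS are hypothetical names invented for the example. With a real
scanner, the lower bound, limit and fault-tolerant options linked above would
play the role of fetch_page.

```python
from bisect import bisect_left

# Hypothetical stand-in for a single tablet: rows kept sorted by an integer
# primary key. A real scan would go through the Kudu client instead.
ROWS = [(key, f"user-{key}") for key in range(1, 26)]
KEYS = [key for key, _ in ROWS]


def fetch_page(lower_bound, limit):
    """Return up to `limit` rows whose key is >= lower_bound, in key order."""
    start = bisect_left(KEYS, lower_bound)
    return ROWS[start:start + limit]


def paginate(limit):
    """Yield pages, restarting each scan just past the last key seen."""
    lower_bound = KEYS[0] if KEYS else 0
    while True:
        page = fetch_page(lower_bound, limit)
        if not page:
            return
        yield page
        # "Incremented previous value": with an integer key this is just +1.
        lower_bound = page[-1][0] + 1


pages = list(paginate(limit=10))
```

The caller only has to remember the last key of the page it just showed, so
no offset counting is needed. For non-integer or compound keys the
"increment" step would need a different definition, which this sketch does
not cover.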

- Dan


Re: Kudu's data pagination

Posted by William Berkeley <wd...@cloudera.com>.
Hi Irtiza. What do you mean by paginate? I'm guessing you mean doing
something like taking the results of a query like

SELECT name, age FROM users SORT BY age DESC

and displaying the results on some UI 10 at a time, say.

If that's the case, the answer is no. It requires additional application
code. In general, Kudu cannot return rows in order. So, if you want rows
101-110, you must retrieve *all* the rows, select the top 110, and then
display only the final 10.
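
As a rough sketch of that application-side step, the snippet below picks
rows 101-110 by age from an unordered collection. The users data and the
page_of_sorted helper are made up for illustration; in practice the rows
would come from a full scan through the client.

```python
import heapq

# Made-up rows standing in for an unordered full-table scan: (name, age).
users = [(f"user-{i}", (i * 37) % 200) for i in range(500)]


def page_of_sorted(rows, first, last):
    """Return rows ranked first..last (1-based, inclusive) by age, descending."""
    # Keep only the `last` largest rows, then drop the ones before `first`.
    top = heapq.nlargest(last, rows, key=lambda row: row[1])
    return top[first - 1:last]


page = page_of_sorted(users, 101, 110)
```

Every row still has to be read once, but only the top 110 are held in
memory at any time.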

In special cases when the sort is on a prefix of the primary key, scan
tokens can be used to have Kudu return sorted subsets of rows from each
tablet, which you can partially merge to get the desired result set.
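
The merge step could look something like the following sketch. It assumes
each tablet's rows have already been fetched in key order; tablet_results
and merged_page are hypothetical names, and no Kudu API is involved.

```python
import heapq
from itertools import islice

# Hypothetical per-tablet results, each already sorted by the key prefix.
tablet_results = [
    [(1, "a"), (4, "d"), (9, "i")],
    [(2, "b"), (3, "c"), (10, "j")],
    [(5, "e"), (6, "f"), (7, "g"), (8, "h")],
]


def merged_page(sorted_streams, offset, size):
    """Lazily merge sorted streams and return `size` rows starting at `offset`."""
    merged = heapq.merge(*sorted_streams, key=lambda row: row[0])
    return list(islice(merged, offset, offset + size))


page = merged_page(tablet_results, offset=3, size=4)
```

Because heapq.merge is lazy, the merge stops as soon as the requested page
has been produced, which is what makes it a partial merge.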

With a lot of data it's best to retrieve a large amount of sorted results
and paginate from the cached results, rather than running a new query per
page.
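
One way to picture this, as a sketch only: sort once, keep the result, and
serve each page as a slice. The data, the CALLS counter and the function
names below are invented for the example; a real application would also
need to decide when to refresh the cache.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive query actually runs


@lru_cache(maxsize=1)
def sorted_results():
    """Stand-in for one expensive query whose rows are sorted client-side."""
    CALLS["count"] += 1
    rows = [(f"user-{i}", (i * 7) % 90) for i in range(200)]
    return tuple(sorted(rows, key=lambda row: row[1], reverse=True))


def get_page(page_number, page_size=10):
    """Return one page (1-based) as a slice of the cached, sorted results."""
    start = (page_number - 1) * page_size
    return sorted_results()[start:start + page_size]


first = get_page(1)
eleventh = get_page(11)
```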

-Will
