Posted to user@orc.apache.org by "Stephen Samuel (Sam)" <sa...@sksamuel.com> on 2017/01/26 20:32:09 UTC

Predicate Pushdowns with Java API

Hello Orcers,

Is it correct to say that if a predicate is used, it will only apply to
either all the rows in a stripe (or whatever) or none of them? I.e., if a
chunk contains a matching row, you'll get back the entire chunk, and so we
must always filter afterwards just in case?

I ask because this is the behaviour I am seeing in the Java client when
using a search argument, but I cannot find any documentation on what the
intended behaviour is meant to be.

Cheers

Re: Predicate Pushdowns with Java API

Posted by Owen O'Malley <om...@apache.org>.
On Thu, Jan 26, 2017 at 12:32 PM, Stephen Samuel (Sam) <sa...@sksamuel.com>
wrote:

> Hello Orcers,
>
> Is it correct to say that if a predicate is used, it will only apply to
> either all the rows in a stripe (or whatever) or none of them? I.e., if a
> chunk contains a matching row, you'll get back the entire chunk, and so we
> must always filter afterwards just in case?
>

Actually, there are three levels of predicate pushdown:
* File level
* Stripe level
* Row group (10k row) level

Any set of rows that is skipped contains no rows that satisfy the
SearchArg. No one has implemented row-level filtering that would remove
individual rows from the result set. So yes, your code will need to apply
the filter to the rows that are returned.
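
To make the consequence concrete, here is a minimal sketch in plain Java.
It is a simplified model and not ORC's actual implementation: the class
RowGroupPruningSketch, the RowGroup type, and the methods mightContain,
readWithPushdown, and applyResidualFilter are all hypothetical names made
up for illustration, and the model assumes a single long column with
min/max statistics per row group and an equality predicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of chunk-level pruning. NOT the ORC reader; all names are hypothetical.
final class RowGroupPruningSketch {

    // Stand-in for one row group plus the min/max statistics kept for it.
    static final class RowGroup {
        final long[] values;
        final long min;
        final long max;

        RowGroup(long... values) {
            this.values = values;
            long lo = Long.MAX_VALUE;
            long hi = Long.MIN_VALUE;
            for (long v : values) {
                lo = Math.min(lo, v);
                hi = Math.max(hi, v);
            }
            this.min = lo;
            this.max = hi;
        }

        // Statistics can only prove that no row matches; they cannot pick out the matching rows.
        boolean mightContain(long target) {
            return min <= target && target <= max;
        }
    }

    // Models what a reader with pushdown returns: every row of every group that was not skipped.
    static List<Long> readWithPushdown(List<RowGroup> groups, long target) {
        List<Long> out = new ArrayList<>();
        for (RowGroup group : groups) {
            if (!group.mightContain(target)) {
                continue; // skipped: guaranteed to hold zero matching rows
            }
            for (long v : group.values) {
                out.add(v); // kept: the whole group comes back, matches and non-matches alike
            }
        }
        return out;
    }

    // The filter the caller still has to apply to the returned rows.
    static List<Long> applyResidualFilter(List<Long> rows, long target) {
        List<Long> out = new ArrayList<>();
        for (long v : rows) {
            if (v == target) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<RowGroup> groups = Arrays.asList(
            new RowGroup(1L, 2L, 3L),
            new RowGroup(40L, 42L, 44L),
            new RowGroup(100L, 200L));

        List<Long> returned = readWithPushdown(groups, 42L);
        System.out.println(returned);                            // prints [40, 42, 44]
        System.out.println(applyResidualFilter(returned, 42L));  // prints [42]
    }
}
```

In this model the first and third groups are skipped because 42 falls
outside their min/max range, while the middle group is returned in full,
so the caller has to drop 40 and 44 itself.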

.. Owen


> I ask because this is the behaviour I am seeing in the Java client when
> using a search argument, but I cannot find any documentation on what the
> intended behaviour is meant to be.
>
> Cheers
>