Posted to user@hive.apache.org by Andrew Mains <an...@kontagent.com> on 2015/03/31 01:48:42 UTC

Predicate pushdown on HBase snapshots

Hi all,

Looking at the current implementation on trunk, Hive's HBase integration 
doesn't seem to support predicate pushdown for queries over 
HBase snapshots. Does this seem like a reasonable feature to add?
It would be nice to have rough feature parity between queries running 
over snapshots and queries running over live tables.

Thanks!

Andrew

Re: Predicate pushdown on HBase snapshots

Posted by Andrew Mains <an...@kontagent.com>.
> Are you suggesting taking advantage of the sorted order to seek to the key
> mentioned in a SARG?

Pretty much, yes. It's essentially the same use case as predicate 
pushdown for the live table case (already implemented), which converts 
predicates into a scan, and we should be able to reuse a significant 
amount of that code. It is perhaps a somewhat limited use case, but I'd 
argue that it's a reasonably significant one for Hive on HBase: if 
you've designed your HBase row key based on your query patterns, it's 
reasonable to expect that most queries over snapshots will be SARGable 
(that's certainly true for our use case, though I can't speak to 
others).
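
To make the idea concrete, here is a minimal sketch of turning a row key 
predicate into scan bounds. It is not the Hive or HBase code: a TreeMap 
stands in for the sorted rows of a snapshot, and KeyRangeSketch, KeyRange, 
between and scan are hypothetical names made up for illustration. The one 
assumption carried over from HBase is that the stop row of a scan is 
exclusive.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for a snapshot's rows, sorted by row key.
class KeyRangeSketch {

    /** Hypothetical holder for scan bounds: startRow inclusive, stopRow exclusive. */
    static final class KeyRange {
        final String startRow;
        final String stopRow;

        KeyRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Turns "key BETWEEN lo AND hi" (both ends inclusive) into scan bounds. */
    static KeyRange between(String lo, String hi) {
        // Appending the smallest char gives the first key that sorts after hi,
        // so an exclusive stop row still includes hi itself.
        return new KeyRange(lo, hi + '\u0000');
    }

    /** Reads only the rows inside the bounds instead of filtering every row. */
    static NavigableMap<String, String> scan(NavigableMap<String, String> rows, KeyRange range) {
        return rows.subMap(range.startRow, true, range.stopRow, false);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String k : new String[] {"a1", "a2", "b1", "b2", "c1"}) {
            rows.put(k, "value-" + k);
        }
        // prints [a2, b1, b2]
        System.out.println(scan(rows, between("a2", "b2")).keySet());
    }
}
```

The point of the sketch is that rows outside the bounds are never visited, 
which is where the saving over a full scan plus filter would come from.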

Given that, does it seem worthwhile enough to file a ticket? We may 
implement it either way (depending on how our preliminary performance 
testing of queries over snapshots goes).

Thanks!

Andrew

On 3/30/15 8:03 PM, Gopal Vijayaraghavan wrote:
>> Looking at the current implementation on trunk, hive's hbase integration
>> doesn't currently seem to support predicate pushdown for queries over
>> HBase snapshots. Does this seem like a reasonable feature to add?
>> It would be nice to have relative feature parity between queries running
>> over snapshots and queries running over live tables.
> Are you suggesting taking advantage of the sorted order to seek to the key
> mentioned in a SARG?
>
> That particular method will be limited to simple filters on exactly one
> key or, perhaps with a few seeks, the more generic IN/BETWEEN SARGs.
>
> But for that case, it will provide a significant boost.
>
> Cheers,
> Gopal


Re: Predicate pushdown on HBase snapshots

Posted by Gopal Vijayaraghavan <go...@apache.org>.
>Looking at the current implementation on trunk, hive's hbase integration
>doesn't currently seem to support predicate pushdown for queries over
>HBase snapshots. Does this seem like a reasonable feature to add?
>It would be nice to have relative feature parity between queries running
>over snapshots and queries running over live tables.

Are you suggesting taking advantage of the sorted order to seek to the key
mentioned in a SARG?

That particular method will be limited to simple filters on exactly one
key or, perhaps with a few seeks, the more generic IN/BETWEEN SARGs.

But for that case, it will provide a significant boost.
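
As a rough sketch of the "few seeks" case for an IN list (an illustration 
only, not Hive code): a TreeMap stands in for the sorted rows, and 
InListSeekSketch and seekAll are hypothetical names. Each distinct key in 
the list costs one seek, and the keys are visited in sorted order so the 
reads move forward through the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustration only: a TreeMap stands in for rows sorted by row key.
class InListSeekSketch {

    /** Resolves "key IN (...)" with one seek per distinct key, in sorted key order. */
    static List<String> seekAll(NavigableMap<String, String> rows, List<String> wantedKeys) {
        List<String> found = new ArrayList<>();
        for (String key : new TreeSet<>(wantedKeys)) {
            // ceilingEntry jumps to the first row at or after the key, like a seek.
            Map.Entry<String, String> entry = rows.ceilingEntry(key);
            if (entry != null && entry.getKey().equals(key)) {
                found.add(entry.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String k : new String[] {"a1", "a2", "b1", "b2", "c1"}) {
            rows.put(k, "value-" + k);
        }
        // prints [value-a1, value-b2]; "zz" has no matching row
        System.out.println(seekAll(rows, Arrays.asList("b2", "zz", "a1")));
    }
}
```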

Cheers,
Gopal



Re: Predicate pushdown on HBase snapshots

Posted by Andrew Mains <an...@kontagent.com>.
Filed https://issues.apache.org/jira/browse/HIVE-10545 for this; we're 
planning on taking this up in the next couple of weeks.

On 3/30/15 4:48 PM, Andrew Mains wrote:
> hive's hbase integration doesn't currently seem to support predicate 
> pushdown for queries over HBase snapshots.