Posted to dev@hive.apache.org by "Andrew Mains (JIRA)" <ji...@apache.org> on 2015/04/30 02:05:05 UTC
[jira] [Created] (HIVE-10545) Implement predicate pushdown for queries over HBase snapshots
Andrew Mains created HIVE-10545:
-----------------------------------
Summary: Implement predicate pushdown for queries over HBase snapshots
Key: HIVE-10545
URL: https://issues.apache.org/jira/browse/HIVE-10545
Project: Hive
Issue Type: Improvement
Components: HBase Handler
Reporter: Andrew Mains
Hive's HBase integration currently supports queries over HBase snapshots, and it supports predicate pushdown for queries over HBase tables, but it does not support predicate pushdown for queries over HBase snapshots. This seems to be largely because the HBase handler uses the `mapred` TableSnapshotInputFormat implementation, which doesn't support pushing a Scan to the job, rather than the `mapreduce` implementation, which does (see https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapred/TableMapReduceUtil.html#initTableSnapshotMapJob(java.lang.String,%20java.lang.String,%20java.lang.Class,%20java.lang.Class,%20java.lang.Class,%20org.apache.hadoop.mapred.JobConf,%20boolean,%20org.apache.hadoop.fs.Path vs https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html#initTableSnapshotMapperJob(java.lang.String,%20org.apache.hadoop.hbase.client.Scan,%20java.lang.Class,%20java.lang.Class,%20java.lang.Class,%20org.apache.hadoop.mapreduce.Job,%20boolean,%20org.apache.hadoop.fs.Path)).
Hive should be able to switch to the `mapreduce` implementation (performing the necessary shimming between `mapred` and `mapreduce`) and thereby gain the ability to push predicates down to the input format in the same way as is done with HiveTableInputFormat. This switch should significantly improve performance for queries that specify range or equality conditions on the row key, which seems like a reasonably common case.
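To illustrate why pushing a row key range down to the reader matters, here is a minimal, self-contained sketch. It is a toy model only: it uses no Hive or HBase classes, and the names `RowKeyPushdownSketch`, `ScanResult`, `scanWithoutPushdown`, `scanWithPushdown`, and `sampleRows` are hypothetical. A `TreeMap` stands in for the sorted rows of a snapshot, and the range is treated as half-open (start inclusive, stop exclusive), which is assumed here to mirror how a scan's start and stop rows behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model (not Hive or HBase code): compares filtering after a full read
// with handing the row key range to the reader itself.
class RowKeyPushdownSketch {

    /** Matching values plus the number of rows the reader had to visit. */
    static final class ScanResult {
        final List<String> values;
        final int rowsVisited;

        ScanResult(List<String> values, int rowsVisited) {
            this.values = values;
            this.rowsVisited = rowsVisited;
        }
    }

    /** No pushdown: every row is read, and the range check happens afterwards. */
    static ScanResult scanWithoutPushdown(TreeMap<String, String> rows,
                                          String startRow, String stopRow) {
        List<String> out = new ArrayList<>();
        int visited = 0;
        for (Map.Entry<String, String> e : rows.entrySet()) {
            visited++;
            String key = e.getKey();
            if (key.compareTo(startRow) >= 0 && key.compareTo(stopRow) < 0) {
                out.add(e.getValue());
            }
        }
        return new ScanResult(out, visited);
    }

    /** Pushdown: the reader receives the range and seeks within the sorted keys. */
    static ScanResult scanWithPushdown(TreeMap<String, String> rows,
                                       String startRow, String stopRow) {
        List<String> out = new ArrayList<>();
        int visited = 0;
        // subMap(start, inclusive, stop, exclusive) only iterates keys in range.
        for (Map.Entry<String, String> e
                : rows.subMap(startRow, true, stopRow, false).entrySet()) {
            visited++;
            out.add(e.getValue());
        }
        return new ScanResult(out, visited);
    }

    /** Builds 1000 rows with zero-padded keys so that string order matches numeric order. */
    static TreeMap<String, String> sampleRows() {
        TreeMap<String, String> rows = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            rows.put(String.format("row-%04d", i), "value-" + i);
        }
        return rows;
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = sampleRows();
        ScanResult full = scanWithoutPushdown(rows, "row-0100", "row-0110");
        ScanResult pushed = scanWithPushdown(rows, "row-0100", "row-0110");
        System.out.println("without pushdown, rows visited: " + full.rowsVisited);
        System.out.println("with pushdown, rows visited: " + pushed.rowsVisited);
        System.out.println("same results: " + full.values.equals(pushed.values));
    }
}
```

Both methods return the same ten values for this range, but the first visits all 1000 rows while the second visits only the ten in range. In this toy model, that gap is what the proposed switch would be expected to close for row key predicates over snapshots.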