Posted to dev@hive.apache.org by nick dimiduk <nd...@gmail.com> on 2014/07/22 22:27:08 UTC

Review Request 23824: Add HiveHBaseTableSnapshotInputFormat

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23824/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan, Navis Ryu, Sushanth Sowmyan, and Swarnim Kulkarni.


Bugs: HIVE-6584
    https://issues.apache.org/jira/browse/HIVE-6584


Repository: hive-git


Description
-------

HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows an MR job to consume a stable, read-only view of an HBase table directly from HDFS. Bypassing the online region server API provides a nice performance boost for full scans. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a mapred implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions.

See JIRA for further conversation.


Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 593c566 
  conf/hive-default.xml.template ba922d0 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSplit.java 998c15c 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java dbf5e51 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseInputFormatUtil.java PRE-CREATION 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1032cc9 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java PRE-CREATION 
  hbase-handler/src/test/queries/positive/hbase_handler_snapshot.q PRE-CREATION 
  hbase-handler/src/test/results/positive/external_table_ppd.q.out 6f1adf4 
  hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out b92db11 
  hbase-handler/src/test/results/positive/hbase_handler_snapshot.q.out PRE-CREATION 
  hbase-handler/src/test/templates/TestHBaseCliDriver.vm 01d596a 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 96a0de2 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java cdc0a65 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2fefa06 
  pom.xml b5a5697 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java c80a2a3 

Diff: https://reviews.apache.org/r/23824/diff/


Testing
-------

Unit tests, local-mode testing, pseudo-distributed mode testing, and testing on a small distributed cluster. Tests covered HBase 0.98.3 and the HEAD of the 0.98 branch.


Thanks,

nick dimiduk


Re: Review Request 23824: Add HiveHBaseTableSnapshotInputFormat

Posted by nick dimiduk <nd...@gmail.com>.

(Updated July 31, 2014, 6:33 p.m.)


Review request for hive, Ashutosh Chauhan, Navis Ryu, Sushanth Sowmyan, and Swarnim Kulkarni.


Changes
-------

Updating with patch v14 from JIRA.


Bugs: HIVE-6584
    https://issues.apache.org/jira/browse/HIVE-6584


Repository: hive-git


Description
-------

HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows an MR job to consume a stable, read-only view of an HBase table directly from HDFS. Bypassing the online region server API provides a nice performance boost for full scans. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a mapred implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions.

See JIRA for further conversation.


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 15bc0a3 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSplit.java 998c15c 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java dbf5e51 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseTableSnapshotInputFormatUtil.java PRE-CREATION 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseInputFormatUtil.java PRE-CREATION 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1032cc9 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java PRE-CREATION 
  hbase-handler/src/test/queries/positive/hbase_handler_snapshot.q PRE-CREATION 
  hbase-handler/src/test/results/positive/external_table_ppd.q.out 6f1adf4 
  hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out b92db11 
  hbase-handler/src/test/results/positive/hbase_handler_snapshot.q.out PRE-CREATION 
  hbase-handler/src/test/templates/TestHBaseCliDriver.vm 01d596a 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 96a0de2 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java cdc0a65 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java ccfb58f 
  pom.xml b3216e1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 40d910c 

Diff: https://reviews.apache.org/r/23824/diff/


Testing
-------

Unit tests, local-mode testing, pseudo-distributed mode testing, and testing on a small distributed cluster. Tests covered HBase 0.98.3 and the HEAD of the 0.98 branch.


Thanks,

nick dimiduk