Posted to dev@drill.apache.org by Aditya Kishore <ad...@gmail.com> on 2014/08/29 21:27:01 UTC
Re: Review Request 25190: DRILL-1346: Use HBase table size information to improve scan parallelization
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25190/
-----------------------------------------------------------
(Updated Aug. 29, 2014, 12:27 p.m.)
Review request for drill.
Changes
-------
Correcting the issue id and description.
Summary (updated)
-----------------
DRILL-1346: Use HBase table size information to improve scan parallelization
Bugs: DRILL-1346
https://issues.apache.org/jira/browse/DRILL-1346
Repository: drill-git
Description
-----------
This patch improves the estimate of the size of the HBase data under scan by computing the total data size of all regions and a sample of rows. The estimation can be disabled by setting "drill.exec.hbase.scan.sizecalculator.enabled" to false.
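As a rough illustration, turning the estimation off might look like the fragment below. Only the option name comes from this request; placing it in a drill-override.conf on each Drillbit is an assumption, since the patch itself only touches the module's drill-module.conf.

```
# Hypothetical override; the file location is an assumption, not part of this patch.
# Disables the HBase table size estimation described above.
drill.exec.hbase.scan.sizecalculator.enabled: false
```

Leaving the option unset should keep the estimation on, given that the description presents it as something to be disabled.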
Diffs
-----
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java 8e9ae18
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseScanSpec.java c2ee723
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/TableStatsCalculator.java PRE-CREATION
contrib/storage-hbase/src/main/resources/drill-module.conf 0edceaf
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestTableGenerator.java 3678c78
Diff: https://reviews.apache.org/r/25190/diff/
Testing
-------
Manual testing with a full/partial scan on a large HBase table.
Thanks,
Aditya Kishore