Posted to dev@drill.apache.org by Aditya Kishore <ad...@gmail.com> on 2014/05/31 02:07:51 UTC
Review Request 22098: DRILL-672 Queries against hbase table do not close
after the data is returned.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22098/
-----------------------------------------------------------
Review request for drill.
Repository: drill-git
Description
-------
The bug is in the parallelization logic of HBaseGroupScan. The current code favors region affinity over load distribution.
Since Drill's parallelizer code already takes care of creating endpoint slots based on affinity, HBaseGroupScan should only distribute the work evenly among the provided slots.
The modified algorithm ensures that, for 'm' regions to scan and 'n' endpoint slots:
1. Each slot gets at least floor(m/n) and at most ceil(m/n) regions.
2. Slots on the same host, where that host has region affinity, get an even share of the regions hosted on it.
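The two guarantees above can be illustrated with a small, self-contained sketch. This is not the code from the patch: the class name RegionAssignmentSketch, the method names assign and pickSlot, and the use of plain strings for regions and hosts are all assumptions made for illustration. It uses a simple three-pass greedy approach: place regions on local slots up to floor(m/n), top up any slot that is still below floor(m/n), then hand out the remaining (m mod n) regions one per slot, preferring local slots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the HBaseGroupScan implementation.
final class RegionAssignmentSketch {

  /**
   * regionHosts maps a region name to the host serving it.
   * slotHosts gives the host of each endpoint slot.
   * Returns one list of region names per slot, in slot order.
   */
  static List<List<String>> assign(Map<String, String> regionHosts, List<String> slotHosts) {
    int m = regionHosts.size();
    int n = slotHosts.size();
    if (n == 0) {
      throw new IllegalArgumentException("at least one slot is required");
    }
    int minPerSlot = m / n;            // floor(m/n)
    int maxPerSlot = (m + n - 1) / n;  // ceil(m/n)

    List<List<String>> slots = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      slots.add(new ArrayList<>());
    }

    // Pass 1: give each region to the least-loaded slot on its own host,
    // as long as that slot is still below floor(m/n).
    Deque<String> leftover = new ArrayDeque<>();
    for (Map.Entry<String, String> e : regionHosts.entrySet()) {
      int slot = pickSlot(slots, slotHosts, e.getValue(), minPerSlot);
      if (slot >= 0) {
        slots.get(slot).add(e.getKey());
      } else {
        leftover.add(e.getKey());
      }
    }

    // Pass 2: top up slots that are still below floor(m/n). Enough leftover
    // regions always exist because n * floor(m/n) <= m.
    for (List<String> slot : slots) {
      while (slot.size() < minPerSlot) {
        slot.add(leftover.remove());
      }
    }

    // Pass 3: fewer than n regions remain; place each on a slot that is
    // below ceil(m/n), preferring a slot on the region's own host.
    while (!leftover.isEmpty()) {
      String region = leftover.remove();
      int slot = pickSlot(slots, slotHosts, regionHosts.get(region), maxPerSlot);
      if (slot < 0) {
        slot = pickSlot(slots, slotHosts, null, maxPerSlot);
      }
      slots.get(slot).add(region);
    }
    return slots;
  }

  /**
   * Returns the index of the least-loaded slot holding fewer than limit
   * regions, restricted to the given host when host is non-null, or -1.
   */
  private static int pickSlot(List<List<String>> slots, List<String> slotHosts,
                              String host, int limit) {
    int best = -1;
    for (int i = 0; i < slots.size(); i++) {
      if (host != null && !host.equals(slotHosts.get(i))) {
        continue;
      }
      if (slots.get(i).size() >= limit) {
        continue;
      }
      if (best < 0 || slots.get(i).size() < slots.get(best).size()) {
        best = i;
      }
    }
    return best;
  }
}
```

For example, with five regions (three on hostA, one on hostB, one on hostC) and three slots (two on hostA, one on hostB), every slot ends up with one or two regions, and the region on hostB stays on the hostB slot. The greedy passes do not necessarily maximize affinity in every case; they only show that the floor/ceil bounds and a local preference can be satisfied together.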
Diffs
-----
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java 809aa86
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSubScan.java d9f2b7c
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/BaseHBaseTest.java 96f0c4a
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/HBaseTestsSuite.java e30f79e
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java PRE-CREATION
Diff: https://reviews.apache.org/r/22098/diff/
Testing
-------
Added test TestHBaseRegionScanAssignments.
Thanks,
Aditya Kishore
Re: Review Request 22098: DRILL-672 Queries against hbase table do not close
after the data is returned.
Posted by Aditya Kishore <ad...@gmail.com>.
(Updated May 31, 2014, 12:35 a.m.)
Review request for drill.
Changes
-------
This revised patch improves both functionality and performance.
* All slots are now guaranteed the maximum possible region affinity.
* With the earlier patches, some of the test cases took up to 7 milliseconds to apply the assignment, while this one completes every test case in under 200 microseconds.
Repository: drill-git
Diffs (updated)
-----
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java 809aa86
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java caee8ed
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSubScan.java d9f2b7c
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/BaseHBaseTest.java 96f0c4a
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/HBaseTestsSuite.java e30f79e
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java PRE-CREATION
Diff: https://reviews.apache.org/r/22098/diff/
Testing
-------
Added test TestHBaseRegionScanAssignments.
Thanks,
Aditya Kishore
Re: Review Request 22098: DRILL-672 Queries against hbase table do not close
after the data is returned.
Posted by Aditya Kishore <ad...@gmail.com>.
(Updated May 30, 2014, 5:54 p.m.)
Review request for drill.
Changes
-------
Updated patch.
Repository: drill-git
Diffs (updated)
-----
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java 809aa86
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSubScan.java d9f2b7c
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/BaseHBaseTest.java 96f0c4a
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/HBaseTestsSuite.java e30f79e
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java PRE-CREATION
Diff: https://reviews.apache.org/r/22098/diff/
Testing
-------
Added test TestHBaseRegionScanAssignments.
Thanks,
Aditya Kishore