Posted to dev@hbase.apache.org by Ted Yu <te...@yahoo.com> on 2011/04/09 16:00:23 UTC
Review Request: Speedup LoadIncrementalHFiles
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/572/
-----------------------------------------------------------
Review request for hbase and Todd Lipcon.
Summary
-------
I refactored LoadIncrementalHFiles so that tryLoad() queues work items in a List<ServerCallable<Void>>. doBulkLoad() periodically sends a batch of ServerCallables to the HBase cluster.
I added the following method to HConnection/HConnectionManager:
public <T> void getRegionServerWithRetries(ExecutorService pool,
List<ServerCallable<T>> callables, Object[] results)
This method uses a thread pool to send multiple ServerCallables through getRegionServerWithRetries(ServerCallable<T> callable).
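For readers without the diff at hand, the fan-out described above can be sketched roughly as follows. This is not the code from the patch: it uses plain java.util.concurrent.Callable in place of ServerCallable, the class and method names (BatchCallSketch, runAll) are made up for illustration, and storing the failure cause in the results array is just one plausible convention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the pooled getRegionServerWithRetries() variant.
public class BatchCallSketch {

    /**
     * Submits every callable to the pool, then waits for each one in order.
     * results[i] receives the return value of callables.get(i), or the
     * Throwable that made it fail (an assumed convention, not HBase's).
     */
    public static <T> void runAll(ExecutorService pool,
                                  List<Callable<T>> callables,
                                  Object[] results) {
        if (results.length != callables.size()) {
            throw new IllegalArgumentException(
                "results.length must equal callables.size()");
        }
        // Submit everything first so the calls can run concurrently.
        List<Future<T>> futures = new ArrayList<>(callables.size());
        for (Callable<T> c : callables) {
            futures.add(pool.submit(c));
        }
        // Collect outcomes in submission order.
        for (int i = 0; i < futures.size(); i++) {
            try {
                results[i] = futures.get(i).get();
            } catch (ExecutionException e) {
                results[i] = e.getCause();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                results[i] = e;
            }
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<String>> work = new ArrayList<>();
        work.add(() -> "loaded-a");
        work.add(() -> { throw new IllegalStateException("region moved"); });
        Object[] results = new Object[work.size()];
        runAll(pool, work, results);
        pool.shutdown();
        System.out.println(results[0]);
        System.out.println(results[1]);
    }
}
```

In the real method each callable would presumably go through the existing single-callable getRegionServerWithRetries(), so the retry logic stays in one place and the pool only adds concurrency.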
I introduced two new config parameters: hbase.loadincremental.threads.max and hbase.loadincremental.batch.size.
hbase.loadincremental.batch.size sets the batch size above which HConnection.getRegionServerWithRetries() is called. In Adam's case there are many small HFiles, so LoadIncrementalHFiles shouldn't wait until all HFiles have been scanned.
hbase.loadincremental.threads.max controls the maximum number of threads in the thread pool.
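To make the interplay of the two parameters concrete, here is a small sketch of the queue-and-flush pattern. It is an illustration under assumptions, not the patch itself: java.util.Properties stands in for the HBase Configuration object, the defaults ("10" and "4") are placeholders rather than the values in the diff, the class name BatchedLoaderSketch is invented, and flushing once the queue reaches the batch size is my reading of the threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of batching work items before sending them.
public class BatchedLoaderSketch {
    // Property names are from the review request; defaults below are assumed.
    static final String BATCH_SIZE_KEY = "hbase.loadincremental.batch.size";
    static final String MAX_THREADS_KEY = "hbase.loadincremental.threads.max";

    private final int batchSize;
    private final ExecutorService pool;
    private final List<Callable<Void>> queue = new ArrayList<>();
    private int batchesSent = 0;

    public BatchedLoaderSketch(Properties conf) {
        this.batchSize = Integer.parseInt(conf.getProperty(BATCH_SIZE_KEY, "10"));
        int maxThreads = Integer.parseInt(conf.getProperty(MAX_THREADS_KEY, "4"));
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    /** Queues one work item and sends a batch once enough have piled up. */
    public void enqueue(Callable<Void> item) {
        queue.add(item);
        if (queue.size() >= batchSize) {
            flush();
        }
    }

    /** Runs everything queued so far on the pool and waits for completion. */
    public void flush() {
        if (queue.isEmpty()) {
            return;
        }
        try {
            pool.invokeAll(queue); // blocks until every queued item has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        queue.clear();
        batchesSent++;
    }

    /** Sends any leftover items, stops the pool, returns the batch count. */
    public int close() {
        flush();
        pool.shutdown();
        return batchesSent;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(BATCH_SIZE_KEY, "2");
        conf.setProperty(MAX_THREADS_KEY, "2");
        BatchedLoaderSketch loader = new BatchedLoaderSketch(conf);
        AtomicInteger loaded = new AtomicInteger();
        for (int i = 0; i < 5; i++) {
            loader.enqueue(() -> { loaded.incrementAndGet(); return null; });
        }
        int batches = loader.close();
        System.out.println(loaded.get() + " items in " + batches + " batches");
    }
}
```

With many small HFiles, a small batch size lets loading start while the remaining files are still being scanned, at the cost of more round trips.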
This addresses bug HBASE-3721.
https://issues.apache.org/jira/browse/HBASE-3721
Diffs
-----
/src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1090500
/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1090500
/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1090500
/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1090500
Diff: https://reviews.apache.org/r/572/diff
Testing
-------
TestLoadIncrementalHFiles and TestHFileOutputFormat pass.
Thanks,
Ted