Posted to user@hbase.apache.org by Jonathan Bender <jo...@gmail.com> on 2011/04/09 18:13:08 UTC

Bulk load importer performance

Hello,

I am looking to understand the performance of the completebulkload command
line tool, since I'm trying to evaluate it against other HBase loading
options.  Is there some log file available that displays when the HFiles are
fully loaded into HBase after running that tool?  I don't see much in the
way of documentation about it.

Thanks!
Jon

Re: Bulk load importer performance

Posted by Stack <st...@duboce.net>.
You might have to dig a little in the code to figure out what
completebulkload is up to ('completebulkload' runs this class:
http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.html#72).

Perhaps you are running into the fact that the bulk loader runs
serially.  See this thread:
http://search-hadoop.com/m/48tibljaWr1/adam+gopnick&subj=Re+Speeding+up+LoadIncrementalHFiles+

An issue has been filed with a patch to address it, but we are
waiting on the original reporter to try the posted patch and say
whether or not it improves the situation.
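
To make the serial-versus-parallel point concrete, here is a rough,
self-contained sketch. It does not use the HBase API at all:
loadOne() is a hypothetical stand-in that just sleeps where a real
loader would hand one HFile to a region server, and the class and
method names are made up for illustration. It only shows why loading
files one at a time takes roughly (number of files) x (time per file),
and how you could time either approach from the client side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BulkLoadTimingSketch {

    // Hypothetical stand-in for loading a single HFile.
    // A real loader would talk to a region server; this only sleeps.
    static String loadOne(String hfile) throws InterruptedException {
        Thread.sleep(50);
        return hfile;
    }

    // Serial: each file is loaded only after the previous one finishes.
    static List<String> loadSerially(List<String> hfiles) throws Exception {
        List<String> loaded = new ArrayList<>();
        for (String f : hfiles) {
            loaded.add(loadOne(f));
        }
        return loaded;
    }

    // Parallel: files are submitted to a fixed-size thread pool and
    // results are collected in submission order.
    static List<String> loadInParallel(List<String> hfiles, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String f : hfiles) {
                futures.add(pool.submit(() -> loadOne(f)));
            }
            List<String> loaded = new ArrayList<>();
            for (Future<String> fut : futures) {
                loaded.add(fut.get()); // blocks until that file is done
            }
            return loaded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up file names, for illustration only.
        List<String> hfiles = Arrays.asList("hfile-1", "hfile-2", "hfile-3", "hfile-4");

        long t0 = System.nanoTime();
        loadSerially(hfiles);
        long serialMs = (System.nanoTime() - t0) / 1_000_000;

        long t1 = System.nanoTime();
        loadInParallel(hfiles, 4);
        long parallelMs = (System.nanoTime() - t1) / 1_000_000;

        System.out.println("serial ms:   " + serialMs);
        System.out.println("parallel ms: " + parallelMs);
    }
}
```

Wrapping the load call with System.nanoTime() like this is one way to
measure when the load finished from the client side if you can't find
it in a log.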

St.Ack
