Posted to user@hbase.apache.org by Thomas Downing <td...@proteus-technologies.com> on 2010/08/13 11:37:25 UTC

HBase at high ingest rates

First, I want to thank all on this list who responded to all
my questions, and all the HBase developers as well.

After trying both hadoop-0.20-append and Cloudera's
latest, along with various settings, we are still leaking both
file and socket handles at high ingest rates.  The handle
usage increases slowly, but linearly, until a process or OS
limit is reached.  The larger problem is that HBase does
not recover from that situation.  Once the file handles
had been driven high, though still under quota, slackening
the ingest significantly brought no recovery; only stopping
ingest completely and restarting it did.
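
Handle growth like this can be tracked with something along the
following lines -- a minimal sketch, assuming a Linux /proc
filesystem; the pid argument and 5-second poll interval are
illustrative, not any actual tooling from this report:

    import java.io.File;

    // Polls the open file descriptor count of a target process
    // (for example a region server pid) by listing /proc/<pid>/fd.
    public class FdWatcher {
        public static void main(String[] args) throws InterruptedException {
            File fdDir = new File("/proc/" + args[0] + "/fd");
            while (true) {
                String[] fds = fdDir.list();  // null if the process is gone
                int count = (fds == null) ? -1 : fds.length;
                System.out.println(System.currentTimeMillis() + "\t" + count);
                Thread.sleep(5000);  // illustrative 5s poll interval
            }
        }
    }

A steadily rising count under constant load is the pattern
described above.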

By high ingest rate I mean about 1200 records/sec per
node, 2K per record, 50 records per batch put.
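
That write pattern looks roughly like the following -- a minimal
sketch against the 0.20-era HBase client API, with a made-up table
name, column family, and row key scheme, and with pacing omitted:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // Issues batch puts of 50 records at ~2K per record, the shape
    // of the load described above.  All names are illustrative.
    public class IngestSketch {
        public static void main(String[] args) throws Exception {
            HTable table = new HTable(new HBaseConfiguration(), "ingest_test");
            byte[] family = Bytes.toBytes("f");
            byte[] qualifier = Bytes.toBytes("q");
            byte[] value = new byte[2048];  // ~2K payload per record

            long row = 0;
            while (true) {
                List<Put> batch = new ArrayList<Put>(50);
                for (int i = 0; i < 50; i++) {
                    Put put = new Put(Bytes.toBytes(row++));
                    put.add(family, qualifier, value);
                    batch.add(put);
                }
                table.put(batch);  // one 50-record batch put
                // ~24 such batches per second per node gives the
                // 1200 records/sec figure; pacing logic is omitted.
            }
        }
    }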

I truly wish I had the time to dig in to find the answers, or
at least indications, but my time is not my own.  We will be
keeping a close eye on HBase, and will likely revisit it
before long.

Thanks again
Thomas Downing


Re: HBase at high ingest rates

Posted by Stack <st...@duboce.net>.
Thanks for writing back to the list, Thomas.  I've made
https://issues.apache.org/jira/browse/HBASE-2913 to take a look into
this over on our end.  Any data you might have lying around -- lsof
dumps or whatever -- that you don't mind attaching to illustrate what
you were seeing would be appreciated.  Then watch the issue so you
can see when it's fixed (smile).

Go easy,
St.Ack
