Posted to user@hbase.apache.org by Neil Yalowitz <ne...@gmail.com> on 2012/01/16 02:16:23 UTC

multi-threaded HTablePool, incrementColumnValue, compaction and large data set

I'm seeing something unusual here and I wanted to see if it has occurred
for any other HBase 0.90 users.  I've read several emails here that
recommend NOT using multi-threading in an MR job, so that's certainly under
consideration.  If anyone could add to their experiences with
multi-threading in an MR job it would be very helpful.  We are testing both
implementations (with threading and without), but the threaded solution is
causing the problem.

We are processing log files with PUTs in the Map phase and a follow-up
incrementColumnValue() to a separate "counts" table in the Reducer.  The
reduce phase is multi-threaded.  The Reducer initializes an HTablePool in
setup().  In reduce(), it starts threads (fed through a Java
BlockingQueue/CompletionService) that call incrementColumnValue() and,
depending on the value returned, create a PUT in the "counter" table.  In
cleanup(), it calls completionService.take() (the result is ignored) and
flushes the PUTs queued by the threads.
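
For readers following along, the flow described above can be sketched
roughly as follows.  This is a minimal, hypothetical illustration and not
the poster's code: it uses only java.util.concurrent, an in-memory map
stands in for the "counts" table, and an in-memory queue stands in for the
buffered PUTs.  The class name, the method names, and the rule "queue a PUT
on the first increment of a row" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of a threaded reducer; no HBase or Hadoop classes are used. */
class ThreadedCounterSketch {
    // Hypothetical stand-in for the "counts" table: row key -> counter.
    private final ConcurrentMap<String, AtomicLong> counts = new ConcurrentHashMap<>();
    // Hypothetical stand-in for PUTs buffered by the worker threads.
    private final Queue<String> pendingPuts = new ConcurrentLinkedQueue<>();
    private final ExecutorService workers;
    private final CompletionService<Long> completion;
    private int submitted = 0;

    // Plays the role of setup(): create the shared worker pool once.
    ThreadedCounterSketch(int threads) {
        workers = Executors.newFixedThreadPool(threads);
        completion = new ExecutorCompletionService<>(workers);
    }

    // Plays the role of reduce(): hand one increment to a worker thread.
    void reduce(final String rowKey) {
        completion.submit(() -> {
            // Atomic increment, standing in for incrementColumnValue().
            long value = counts.computeIfAbsent(rowKey, k -> new AtomicLong())
                               .incrementAndGet();
            // Assumed rule: the first increment of a row queues a PUT.
            if (value == 1L) {
                pendingPuts.add(rowKey);
            }
            return value;
        });
        submitted++;
    }

    // Plays the role of cleanup(): wait for every task, then "flush".
    List<String> cleanup() throws InterruptedException, ExecutionException {
        try {
            for (int i = 0; i < submitted; i++) {
                // get() rethrows a worker failure as ExecutionException.
                completion.take().get();
            }
        } finally {
            workers.shutdown();
        }
        return new ArrayList<>(pendingPuts);
    }

    // Current counter value for a row, or 0 if the row was never seen.
    long count(String rowKey) {
        AtomicLong c = counts.get(rowKey);
        return c == null ? 0L : c.get();
    }
}
```

In this sketch, cleanup() takes one Future per submitted task and calls
get() on it, so a failure inside a worker thread is rethrown instead of
being silently dropped.  Whether that matters for the hang described here
is not known.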

There are no issues for approximately the first 100GB of data inserted.
After approximately 100GB, however, every subsequent job freezes during the
Reduce phase.  At some point the Reduce tasks (where the
incrementColumnValue() takes place) hang and are eventually killed with
reason: task client has not responded for 600 seconds.  The counters in the
reduce job grow briefly, but then all the tasks' counters stop increasing
and the tasks are eventually killed.

Oddly, the problem does not occur if compaction is completely disabled (not
just major, but also setting hbase.hstore.compactionThreshold = 9999999
and hbase.hstore.blockingStoreFiles = 9999999).

Could there be a bug with HTablePool for large datasets and compaction?
Again, this works as expected for approximately the first 100 jobs (1GB
each) but consistently fails after that.  Also, to repeat, the problem does
not occur with ALL compaction disabled.

Difficult problem to describe, but I'm hoping someone may have some
feedback and/or similar experiences.  I can provide code examples if anyone
is curious.



Neil Yalowitz

Re: multi-threaded HTablePool, incrementColumnValue, compaction and large data set

Posted by Sambit Tripathy <sa...@gmail.com>.

Did you have any success with this problem? You could try asynchbase, an
HBase client used in OpenTSDB.


