Posted to user@hbase.apache.org by Ryan Brush <rb...@gmail.com> on 2012/07/03 05:13:47 UTC

Possible unintended use of finalizers in HTablePool

While generating load against a library that makes extensive use of
HTablePool in 0.92, I noticed that the largest heap consumer was
java.lang.ref.Finalizer.  Digging in, I discovered that HTablePool's
internal PooledHTable extends HTable, so a ThreadPoolExecutor and its
supporting objects are instantiated every time a pooled HTable is
retrieved.  Since ThreadPoolExecutor has a finalizer, it and everything
it references can't be garbage collected until the finalizer runs.  The
result is that by using HTablePool we create a large number of
finalizable objects that stay on the heap longer than they should, and
these are our largest source of pressure on the garbage collector.  It
looks like this will also be a problem in 0.94 and trunk.
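
To illustrate the mechanism described above, here is a minimal sketch
that does not use HBase at all.  BaseTable and PooledWrapper are
hypothetical stand-ins for HTable and PooledHTable; the only point is
that a subclass always runs its superclass constructor, so every wrapper
builds its own executor even though it only delegates.  Whether those
executors are actually queued for finalization depends on the JDK (the
ThreadPoolExecutor in the JDKs of this era overrides finalize()).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class InheritedExecutorSketch {

    // Counts how many executors have been constructed so far.
    static final AtomicInteger EXECUTORS_CREATED = new AtomicInteger();

    /** Hypothetical stand-in for HTable: the constructor builds an executor. */
    static class BaseTable {
        protected final ThreadPoolExecutor executor;

        BaseTable() {
            // No tasks are submitted, so no worker threads are started.
            executor = new ThreadPoolExecutor(
                    1, 1, 60, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>());
            EXECUTORS_CREATED.incrementAndGet();
        }
    }

    /** Hypothetical stand-in for PooledHTable: delegates, but still extends. */
    static class PooledWrapper extends BaseTable {
        private final BaseTable delegate;

        PooledWrapper(BaseTable delegate) {
            // The implicit super() call here allocates another executor.
            this.delegate = delegate;
        }
    }

    /** Returns how many executors are built for one shared table plus N wrappers. */
    static int executorsCreatedFor(int checkouts) {
        int before = EXECUTORS_CREATED.get();
        BaseTable shared = new BaseTable();
        for (int i = 0; i < checkouts; i++) {
            new PooledWrapper(shared);
        }
        return EXECUTORS_CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("executors created: " + executorsCreatedFor(100));
    }
}
```

With 100 simulated checkouts the count is 101: one for the shared table
and one more for each wrapper.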

Anyway, I started on the obvious patch, which is to have PooledHTable
implement HTableInterface rather than derive from HTable, but ran afoul of
a unit test that asserts items returned from HTablePool must be HTable
instances -- I'm presuming this is there for backward compatibility.  Is
it worth logging a JIRA to track this (backward-incompatible) change?
Perhaps there's another approach I should be taking?  For the time being
I will probably move forward by creating my own version of HTablePool (in
a separate package) to avoid the issue at hand, since it's otherwise a
good fit for my needs.
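
A rough sketch of the shape of that change, again without HBase on the
classpath.  Table, ExpensiveTable, Pool and PooledTable are hypothetical
names, not the real HBase classes; Table stands in for HTableInterface
and has far fewer methods.  The wrapper implements the interface and
delegates, so borrowing from the pool allocates only a small wrapper and
the expensive table is reused.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PooledTableSketch {

    /** Hypothetical stand-in for HTableInterface. */
    interface Table {
        String get(String row);
        void close();
    }

    /** Hypothetical stand-in for HTable: assume construction is the costly part. */
    static final class ExpensiveTable implements Table {
        @Override public String get(String row) { return "value-for-" + row; }
        @Override public void close() { }
    }

    static final class Pool {
        private final Deque<Table> idle = new ArrayDeque<>();
        private int created;

        /** Hands out a lightweight wrapper around a reused (or new) table. */
        synchronized Table borrow() {
            Table table = idle.poll();
            if (table == null) {
                table = new ExpensiveTable();
                created++;
            }
            return new PooledTable(table);
        }

        synchronized int createdCount() { return created; }

        private synchronized void giveBack(Table table) { idle.push(table); }

        /** Implements the interface instead of extending the expensive class. */
        private final class PooledTable implements Table {
            private final Table delegate;
            private boolean closed;

            PooledTable(Table delegate) { this.delegate = delegate; }

            @Override public String get(String row) { return delegate.get(row); }

            @Override public void close() {
                // Return the underlying table to the pool once; ignore repeat calls.
                if (!closed) {
                    closed = true;
                    giveBack(delegate);
                }
            }
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        for (int i = 0; i < 100; i++) {
            Table t = pool.borrow();
            t.get("row-" + i);
            t.close();
        }
        System.out.println("expensive tables created: " + pool.createdCount());
    }
}
```

The trade-off is the one the unit test guards against: callers that cast
the pooled object to the concrete class would break, since the wrapper is
no longer an instance of it.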

Re: Possible unintended use of finalizers in HTablePool

Posted by yu...@gmail.com.
You can log a JIRA and attach your patch there.

Thanks


