You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@commons.apache.org by Holger Hoffstaette <ho...@wizards.de> on 2006/09/06 02:53:17 UTC

[pool] Any existing util.concurrent implementations?

Hi,

Are there any implementations of StackObjectPool/KeyedStackObjectPool that
make use of ConcurrentHashMap, Lock and especially the lock-free
ConcurrentLinkedQueue instead Vector? I cannot imagine I'm the only one
who would like to see these, maybe as an optional jdk5 extension or
initial 2.0 branch. The only dealbreaker for CLQ could IMHO be the bounded
number of entries which might need some deep thought (being a hot field),
but other than that..any ideas or existing code in this direction?

thanks
Holger



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org

Re: [pool] Any existing util.concurrent implementations?

Posted by Holger Hoffstaette <ho...@wizards.de>.

On Mon, 11 Sep 2006 19:07:07 -0400, Sandy McArthur wrote:

>> Btw just curious how you test for lock contention etc.? I only know some
>> JDK 1.5/1.6 features that measure the native locks and thread wait time
>> but nothing for 1.4. I could provide test feedback on a shiny new
>> dual-core Opteron.
> 
> I don't have any truly robust tests, just a series of micro-benchmarks
> that try to simulate single threaded and multi-threaded access patterns
> with fake-expensive poolable objects that I've run many times on a single
> cpu, a hyper threaded cpu, and a quad xeon cpu servers. My conclusion a
> few months ago was that  the serialization that happens prevented the quad
> cpu server from delivering more than about 1.3 times the throughput of a
> single cpu server. I personally think it's reasonable to expect a quad cpu
> server should give you at least a 3 times performance boost over a single
> cpu server.

At least, yes - preferrably 4. :)
The pool is of course a bit of a bottleneck by design, since that is its
purpose in certain situations. Hard to break out of that. On the other
hand just using something like ConcurrentHashMap for the KeyedObjectPool
will likely make a difference already, assuming the keys are accessed with
relatively even distribution.

> But thread-safe access to the backing idle object pool collection isn't
> the main bottleneck despite that most people seem to look their to make
> commons pool faster.
> 
> What slows the provided pool implementations down is the state transitions
> poolable objects go through and the potential expense activating or
> passivating poolable objects while keeping the limits in check. For
> non-trivial poolable objects, which I assume all are, else you wouldn't be
> pooling them, the pool spends most of it's time in the activateObject,
> validateObject, and passivateObject methods. If those can run in parallel

Right, this makes a pool different from a bunch of simple queues.

> then the total throughput of the pool increases greatly, especially on
> multi-cpu servers. The problem is if you allow activate, validate, and
> passivateObject methods to run in parallel it gets much more complicated
> wether or not a poolable object that is transitioning state will cause a
> limit to be exceeded.

I see. Obvious thing would be a single (or more? one for each phase, like
a pipeline?) background worker - that way the 90% where fast handoff is OK
and does not clash with maxActive/minActive would be served quickly and
could be processed in the background while quickly returning to the
caller. SynchronousQueue (which is not really a queue) is made especially
for those cases. Unfortunately that would conflict with the synchronous
nature of borrow/returnObject. Maybe an opportunity for something like
returnObject(Object obj, boolean async)?

> For example pretty much every pool I've seen simply uses the internal
> Collection's .size() method as the result of getNumIdle(). But without
> full synchronization this isn't sufficient because a previously idle
> poolable object isn't really active until activateObject is done. A naive
> implementation will have a race condition that could allow too many
> database connections be created or whatever else is being pooled.

Yes, the JCIP book (great reading btw!) calls these "hot fields" since
they are the single contention points required for correctness. The huge
problem is that the "precision" of the pool cannot be relaxed for those
cases where you can tolerate a temporary one-off count. Maybe we can sneak
something in based on alternative implementations.

> I wasn't aware of backport-util-concurrent, I'll look into it.

What?! run, don't walk! :) It's really great and works extremely well.

> Patches attached to a issue can be submitted as soon as you have them
> ready, but you have to wait on a commiter to commit them. Then you have to
> be asked to become a committer, accept, and then it takes a some time and
> paper work to become official. You can read about it here:
> http://jakarta.apache.org/site/getinvolved.html

I got my CLA faxed in already so that is the least problem.

cheers
Holger

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

Re: [pool] Any existing util.concurrent implementations?

Posted by Sandy McArthur <sa...@apache.org>.

Note: This thread previously lived on the Commons User list.

On 9/11/06, Holger Hoffstaette <ho...@wizards.de> wrote:
> On Sat, 09 Sep 2006 22:53:00 -0400, Sandy McArthur wrote:
>
> > I have been following this plan for pool:
> > http://wiki.apache.org/jakarta-commons/PoolRoadMap The current plan isn't
> > to require jdk 5 until Pool 3.0.
>
> OK. I don't agree but I guess that's my problem. ;-)
> However I think we can meet in the middle and make everyone happy, see
> below.
>
> > The code in svn for pool 2.0 implements the updated behavior but doesn't
> > have the performance characteristics I'm satisfied with on multi-cpu box.
> > (Current pool versions have the same performance bottle neck.) I got
> > stalled on reworking the code because of my wedding but I hope to start
> > moving forward again soon.
>
> Great, thanks for the update. I agree with all points on the roadmap for
> 2.0, especially the behavioral things like never having a null factory
> etc.
>
> Btw just curious how you test for lock contention etc.? I only know some
> JDK 1.5/1.6 features that measure the native locks and thread wait time
> but nothing for 1.4. I could provide test feedback on a shiny new
> dual-core Opteron.

I don't have any truly robust tests, just a series of micro-benchmarks
that try to simulate single threaded and multi-threaded access
patterns with fake-expensive poolable objects that I've run many times
on a single cpu, a hyper threaded cpu, and a quad xeon cpu servers. My
conclusion a few months ago was that  the serialization that happens
prevented the quad cpu server from delivering more than about 1.3
times the throughput of a single cpu server. I personally think it's
reasonable to expect a quad cpu server should give you at least a 3
times performance boost over a single cpu server.

> > While I have been working with Pool, there have been a number of people
> > who have submitted fast / non-blocking / unsynchronized pool
> > implementations but to achieve their speed they tend to ignore steps
> > needed to make a pool implementation thread-safe. There are many times
> > strict thread-safety isn't needed but I'm weary of including such code in
> > the official distribution.
>
> That sounds wrong - the whole idea behind lock-free/wait-free algorithms is
> that they are still thread-safe, yet with less unencessary synchronization
> and contention. If these contributions lead to a loss of thread safety for
> the pool they are just wrong, period. This stuff is hard and just removing
> synchronized statements because they work on someone's machine somewhere
> is a recipe for disaster.

Agreed, almost. If you don't need a pool with limits
(maxActive/maxIdle/etc) and don't care if getNumIdle() or getNumActive
are accurate then it is safe to strip away most all synchronization
for pure speed.

But thread-safe access to the backing idle object pool collection
isn't the main bottleneck despite that most people seem to look their
to make commons pool faster.

What slows the provided pool implementations down is the state
transitions poolable objects go through and the potential expense
activating or passivating poolable objects while keeping the limits in
check. For non-trivial poolable objects, which I assume all are, else
you wouldn't be pooling them, the pool spends most of it's time in the
activateObject, validateObject, and passivateObject methods. If those
can run in parallel then the total throughput of the pool increases
greatly, especially on multi-cpu servers. The problem is if you allow
activate, validate, and passivateObject methods to run in parallel it
gets much more complicated wether or not a poolable object that is
transitioning state will cause a limit to be exceeded.

For example pretty much every pool I've seen simply uses the internal
Collection's .size() method as the result of getNumIdle(). But without
full synchronization this isn't sufficient because a previously idle
poolable object isn't really active until activateObject is done. A
naive implementation will have a race condition that could allow too
many database connections be created or whatever else is being pooled.

> Anyway -
> since the roadmap for 2.0 indicates that JDK 1.4 is the target baseline it
> would IMHO be a wasted chance not to get best of both worlds -
> better data structures and readyness for JDK 1.5, yet run on JDK 1.4 to
> not cut off existing users. I therefore encourage you to consider the use
> of the backport concurrent library
> (http://dcl.mathcs.emory.edu/util/backport-util-concurrent/).
>
> It has a ASF-compatible license and some other Apache projects are
> adopting it too; the most recent example is MINA which had their own
> ThreadPool but *of course* struggled with the resulting and completely
> predictable bugs until last week.
>
> Using the backport would even enable the use of some highly cool stuff
> that is so far only in Mustang - mostly Deques which have significantly
> higher concurrency than single-lock Queues.
> Obviously producing a 'native' Tiger/Mustang version 3.0 from that will be
> trivial by simply fixing the package names.

I wasn't aware of backport-util-concurrent, I'll look into it.

> If you want to discuss this further please send me email or let's move
> this to commons-dev. I would love to help with the development. My main
> project is with the Mule ESB (http://mule.codehaus.org/) and we make
> extensive use of both commons-pool and the backport library; having both
> work together would benefit everybody, especially since we're still based
> on JDK 1.4 as well (with a possible move to 1.5 in 2007).
>
> Please let me know what you think or even better when and how I can start
> committing.. :)

Patches attached to a issue can be submitted as soon as you have them
ready, but you have to wait on a commiter to commit them. Then you
have to be asked to become a committer, accept, and then it takes a
some time and paper work to become official. You can read about it
here: http://jakarta.apache.org/site/getinvolved.html
-- 
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

Re: [pool] Any existing util.concurrent implementations?

Posted by Holger Hoffstaette <ho...@wizards.de>.

Sandy,

thanks for your reply.

On Sat, 09 Sep 2006 22:53:00 -0400, Sandy McArthur wrote:

> I have been following this plan for pool:
> http://wiki.apache.org/jakarta-commons/PoolRoadMap The current plan isn't
> to require jdk 5 until Pool 3.0.

OK. I don't agree but I guess that's my problem. ;-)
However I think we can meet in the middle and make everyone happy, see
below.

> The code in svn for pool 2.0 implements the updated behavior but doesn't
> have the performance characteristics I'm satisfied with on multi-cpu box.
> (Current pool versions have the same performance bottle neck.) I got
> stalled on reworking the code because of my wedding but I hope to start
> moving forward again soon.

Great, thanks for the update. I agree with all points on the roadmap for 
2.0, especially the behavioral things like never having a null factory
etc.

Btw just curious how you test for lock contention etc.? I only know some
JDK 1.5/1.6 features that measure the native locks and thread wait time
but nothing for 1.4. I could provide test feedback on a shiny new
dual-core Opteron.

> While I have been working with Pool, there have been a number of people
> who have submitted fast / non-blocking / unsynchronized pool
> implementations but to achieve their speed they tend to ignore steps
> needed to make a pool implementation thread-safe. There are many times
> strict thread-safety isn't needed but I'm weary of including such code in
> the official distribution.

That sounds wrong - the whole idea behind lock-free/wait-free algorithms is
that they are still thread-safe, yet with less unencessary synchronization
and contention. If these contributions lead to a loss of thread safety for
the pool they are just wrong, period. This stuff is hard and just removing
synchronized statements because they work on someone's machine somewhere
is a recipe for disaster.

Anyway -
since the roadmap for 2.0 indicates that JDK 1.4 is the target baseline it
would IMHO be a wasted chance not to get best of both worlds -
better data structures and readyness for JDK 1.5, yet run on JDK 1.4 to
not cut off existing users. I therefore encourage you to consider the use
of the backport concurrent library
(http://dcl.mathcs.emory.edu/util/backport-util-concurrent/).

It has a ASF-compatible license and some other Apache projects are
adopting it too; the most recent example is MINA which had their own
ThreadPool but *of course* struggled with the resulting and completely
predictable bugs until last week.

Using the backport would even enable the use of some highly cool stuff
that is so far only in Mustang - mostly Deques which have significantly
higher concurrency than single-lock Queues. 
Obviously producing a 'native' Tiger/Mustang version 3.0 from that will be
trivial by simply fixing the package names.

If you want to discuss this further please send me email or let's move
this to commons-dev. I would love to help with the development. My main
project is with the Mule ESB (http://mule.codehaus.org/) and we make
extensive use of both commons-pool and the backport library; having both
work together would benefit everybody, especially since we're still based
on JDK 1.4 as well (with a possible move to 1.5 in 2007).

Please let me know what you think or even better when and how I can start
committing.. :)

regards,
Holger

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org

Re: [pool] Any existing util.concurrent implementations?

Posted by Sandy McArthur <sa...@apache.org>.

I have been following this plan for pool:
http://wiki.apache.org/jakarta-commons/PoolRoadMap The current plan
isn't to require jdk 5 until Pool 3.0.

The code in svn for pool 2.0 implements the updated behavior but
doesn't have the performance characteristics I'm satisfied with on
multi-cpu box. (Current pool versions have the same performance bottle
neck.) I got stalled on reworking the code because of my wedding but I
hope to start moving forward again soon.

While I have been working with Pool, there have been a number of
people who have submitted fast / non-blocking / unsynchronized pool
implementations but to achieve their speed they tend to ignore steps
needed to make a pool implementation thread-safe. There are many times
strict thread-safety isn't needed but I'm weary of including such code
in the official distribution.

On 9/5/06, Holger Hoffstaette <ho...@wizards.de> wrote:
>
> Hi,
>
> Are there any implementations of StackObjectPool/KeyedStackObjectPool that
> make use of ConcurrentHashMap, Lock and especially the lock-free
> ConcurrentLinkedQueue instead Vector? I cannot imagine I'm the only one
> who would like to see these, maybe as an optional jdk5 extension or
> initial 2.0 branch. The only dealbreaker for CLQ could IMHO be the bounded
> number of entries which might need some deep thought (being a hot field),
> but other than that..any ideas or existing code in this direction?
>
> thanks
> Holger
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-user-help@jakarta.apache.org
>
>

-- 
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org