Posted to dev@commons.apache.org by Sandy McArthur <sa...@apache.org> on 2006/03/26 08:28:15 UTC

Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

On 3/25/06, Rahul Akolkar <ra...@gmail.com> wrote:
> On 3/25/06, Sandy McArthur <sa...@apache.org> wrote:
> <snip/>
> >
> > The main behavior of the composite pools are configured via four
> > type-safe enum types. I'll describe what each type controls and then
> > suggest name variants. Let me know which one you think is the most
> > self-evident and user friendly. Feel free to suggest new names.
> >
> > 1. "Specifies the how objects are borrowed and returned to the pool."
> > a) BorrowType  b) BorrowStrategy  c) BorrowPolicy  d) BorrowBehavior
> >
> > 2. "Specifies the behavior of the pool when the pool is out of idle objects."
> > a) ExhaustionPolicy  b) ExhaustionBehavior  c) ExhaustionType d)
> > ExhaustionStrategy
> >
> > 3. "Specifies the behavior of when there is a limit on the number of
> > concurrently borrowed objects."
> > a) LimitStrategy  b) LimitPolicy  c) LimitBehavior  d) LimitType
> >
> > 4. "Specifies how active objects are tracked while they are borrowed
> > from the pool."
> > a) TrackingBehavior  b) TrackingType  c) TrackingStrategy  d) TrackingPolicy
> >
> > The enums above don't actually specify any implementation, they
> > describe desired features of a pool. The actual implementation isn't
> > broken down into four parts like that so try not to confuse how you
> > would implement that feature with how you would request that feature.
> >
> <snap/>
>
> Why isn't it broken down like that?

Because there are fundamentally three parts to a pool's behavior.
1. How objects are treated while they are in the idle object pool.
2. How objects are added/removed from the idle object pool.
3. How objects are treated while they are out of the pool, aka: active.

I chose to map these three aspects to four types of behavior because
that made the most sense in balancing the usability of the public
interface against allowing the functionality to expand in new ways.
Expressing all possible combinations with three enums created too many
permutations of enum choices to remain usable. Splitting the choices
across four enum types groups them into logical chunks and
means the programmer only has to consider a handful of choices at a
time instead of dozens.
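
To make the arithmetic concrete, here is a minimal sketch. The enum
names are borrowed from the candidate names quoted above, but every
constant and the ChoiceCount helper are hypothetical illustrations,
not the actual composite pool API.

```java
// Hypothetical constants; the real composite pool enums may differ.
enum BorrowPolicy { FIFO, LIFO, SOFT_FIFO, SOFT_LIFO }
enum ExhaustionPolicy { GROW, FAIL }
enum LimitPolicy { FAIL, WAIT }
enum TrackingPolicy { SIMPLE, REFERENCE, DEBUG }

// Hypothetical helper that only counts choices; it builds no pool.
final class ChoiceCount {
    // Every distinct pool the four enums can describe together.
    static int combinations() {
        return BorrowPolicy.values().length
             * ExhaustionPolicy.values().length
             * LimitPolicy.values().length
             * TrackingPolicy.values().length;
    }

    // The most options a programmer weighs for any single setting.
    static int largestSingleChoice() {
        int max = BorrowPolicy.values().length;
        max = Math.max(max, ExhaustionPolicy.values().length);
        max = Math.max(max, LimitPolicy.values().length);
        max = Math.max(max, TrackingPolicy.values().length);
        return max;
    }
}
```

With these made-up constants the four enums describe 48 distinct
pools, yet no single decision offers more than four options.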

> IMO, such enum types have limited use, unless we can guarantee
> reasonable (ideally, full) closure. Often, it is not possible to
> enumerate all the types / strategies / policies that may make sense
> for the varying use cases that we only attempt to foresee. In many
> cases, such as this one, my personal preference is to leave things
> pluggable, rather than enumerable.

The composite pool is already pluggable: you must give it a
PoolableObjectFactory. :-)

> We should instead, if you and
> others agree, define the contracts between a "pool" and each of the
> four "behaviors" that you list above. We can supply (n) out-of-the-box
> implementations, but leave it open for a user to *easily* define a
> (n+1)th should such a need arise (and I believe it will, sooner or
> later).

We already provide a number of out of the box implementations:
GenericObjectPool, StackObjectPool, SoftReferenceObjectPool, and soon
a "Composite ObjectPool". (There are also similar KeyedObjectPools)

Not everything is made better because it's made pluggable (or
subclassable). Anything that you expose as public or protected you
cannot change without risking compatibility. By making all of the
implementation details private you can completely change the
implementation without worrying about breaking compatibility.

Also, because the composite pool implementation is so separated from
the way it is configured, it allows for internal optimizations. The
composite pool factory currently optimizes the created pool in a
number of ways, including:
* detecting when the idle pool will never grow over a conservatively
tweaked internal threshold and choosing an ArrayList over a LinkedList,
because the worst case performance of an ArrayList with a size of
~15 is still better than the best case performance of a LinkedList
with a size greater than zero.
* detecting when a configured expression of a pool can be more
efficiently expressed as a different configuration and still have the
same behavior.
* detecting a pool with a self-contradictory configuration and
preventing the creation of a broken pool.
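
The first bullet can be sketched roughly as follows. This is only an
illustration: IdleListChooser, newIdleList, the "negative means
unbounded" convention, and the exact threshold of 15 are assumptions,
not the composite pool factory's real code.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper; not part of commons-pool.
final class IdleListChooser {
    // Assumed value: the message above only says "~15".
    static final int SMALL_POOL_THRESHOLD = 15;

    // In this sketch a negative maxIdle means "no upper bound".
    static <T> List<T> newIdleList(int maxIdle) {
        if (maxIdle >= 0 && maxIdle <= SMALL_POOL_THRESHOLD) {
            // Small, bounded pool: a compact array-backed list.
            return new ArrayList<>(maxIdle);
        }
        // Large or unbounded pool: cheap insertion and removal at the ends.
        return new LinkedList<>();
    }
}
```

Because callers only ever see a List, the choice stays an internal
detail that can change between releases.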

I also have some more intrusive optimizations planned that may not be
available with a more exposed implementation. The largest performance
killer of the composite pool code right now is serialization due to
synchronization (threads queuing on locks), not the
java.io.Serializable type. Different
configurations need different amounts of synchronization to remain
thread-safe and correct. Currently the composite pool code
synchronizes more than is needed for the default and most common
configuration. When I have time I'll add another optimization that
figures out what is the narrowest amount of synchronization needed to
remain thread-safe and maintain correct behavior. I'm pretty sure
other optimizations will be made available when the composite pool can
depend on Java 1.5 and take advantage of j.u.concurrent features.

With a fully pluggable API the synchronization optimization above
wouldn't really be available. You could use marker interfaces or add
methods to query the synchronization needs of a plugin, but that would
be poorly usable. The same logic explains why you should always use a
j.u.Iterator to loop across a List instead of checking for the
j.u.RandomAccess marker interface.
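
A small sketch of that comparison follows. java.util.RandomAccess and
java.util.Iterator are real JDK types; the ListSums class and its
method names are made up for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class.
final class ListSums {
    // Caller branches on a marker interface: two code paths to maintain.
    static int sumWithMarkerCheck(List<Integer> list) {
        int total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
                total += it.next();
            }
        }
        return total;
    }

    // One code path: each List supplies an Iterator suited to its internals.
    static int sumWithIterator(List<Integer> list) {
        int total = 0;
        for (Integer value : list) {
            total += value;
        }
        return total;
    }
}
```

Both methods return the same sum for any List; the second simply
leaves the traversal strategy to the implementation.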

I'm not against pluggable APIs. They often make sense, but not always,
and this is one time they don't. I also want the composite pool code
to be "future proof". Peter Steijn, who emailed a week ago, is exploring
some new ways to improve the performance of Pool (and by extension
Dbcp) by using some more complex threading behaviors. We've discussed
some of his ideas off-list and, provided his ideas pan out, maybe the
composite pool code in Pool 2.1 will be faster and client code using
pool won't have to know or care how the improved performance came
about.

> As a concrete example, for [scxml], we define a SCXMLExecutor (the
> state machine "engine") accompanied by a SCXMLSemantics interface [1].
> The basic modus operandi for an engine is simple - when an event is
> triggered, figure out which (if any) transition(s) to follow, and
> transit to the new set of states executing any specified actions along
> the way. However, there are numerous points of contention along the
> way. Lets take dispute resolution for example -- when more than one
> outbound transitions from a single state holds true. Which path do we
> take? The default implementation available in the distro is puristic,
> it will throw a ModelException. However, a user may want:
>
>  * The transition defined closest to the document root to be followed
>  * The transition defined farthest from the document root to be followed
>  * The transition whose origin and target have the lowest common
> ancestor to be followed
>  * The transition whose origin and target have the highest common
> ancestor to be followed
>
> Even after one of above dispute resolution algorithms is applied, if
> we end up with more than one candidate transitions, the user may want:
>
>  * A ModelException to be thrown
>  * The transition that appears first in document order to be followed
>  * The transition that appears last in document order to be followed
>
> To implement any of the above choices, the user may simply extend the
> default SCXMLSemantics implementation, override the
> filterTransitionsSet() method, and use the new semantics while
> instantiating the SCXMLExecutor.
>
> This approach means:
>
>  * We don't have to forsee all dispute resolution algorithms, and
> provide implementations
>  * Users don't have to convince anyone that the algorithm they need is
> useful, they can just implement it if they need it
>  * We don't even have to contend that the default puristic behavior
> that doesn't tolerate any non-determinism is the most common or the
> most useful one, it is just one that is chosen as default (because I
> personally believe it leads to better proof of correctness arguments).
>
> Since we're talking about Pool 2.0 and beyond, perhaps a focus on
> similar extensibility is justified, and maybe we should revisit the
> enumeration approach, even before we get to names.

If you want to implement a more pluggable pool implementation and put
the pluggable pool and composite pool code in a steel cage match to the
death based on usability, flexibility, and performance, I'm all for it.

> -Rahul
>
> (long, possibly fragmented URL below)
>
> [1] http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/scxml/trunk/src/main/java/org/apache/commons/scxml/SCXMLSemantics.java?view=markup

--
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

Posted by Peter Steijn <ps...@gmail.com>.
>
> > > 2.1 because we'd rather get 2.0 out first, instead of waiting to try
> > > out the ideas? If you guys think its appropriate, you might even move
> > > that conversation here? Maybe others are interested as well; I am :-)
> > > Plus it will give us better background while reading the commit
> > > messages.
> >
> > Peter Steijn initiated the private email to me. It's up to him if he
> > wants to discuss it on list. I will let him talk about his ideas for
> > his thesis to whomever he chooses.
> >
> <snap/>
>
> Ofcourse, wanted to let him know I was interested as well.
>

Hi!

This is Peter Steijn.  I just wanted to let everyone who is interested know
that I am in the process of writing two different implementations of my
proposed optimization for the object pool module.

Look forward to some alpha code and also a formal explanation sometime
within the next day or two.

-Pete

Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

Posted by Rahul Akolkar <ra...@gmail.com>.
On 3/27/06, Sandy McArthur <sa...@apache.org> wrote:
> On 3/27/06, Rahul Akolkar <ra...@gmail.com> wrote:
<snip/>
> >
> > Isn't the PoolableObjectFactory orthogonal to the four enum types you
> > mention? Those tune the FactoryConfig?
>
> Yes, hence the smiley.
>
<snap/>

Aha ;-)


> The implication for a programmer wanting a feature that isn't
> expressible with the enums is one of:
> * the programmer customizes one of the other existing ObjectPool
> implementations available to him already.
> * the programmer uses his ASL given right to customize and enhance the
> source to meet his needs. (If we're lucky he'll submit a patch back to
> us.)
>
> Neither of those are terrible, end of the world implications.
> Personally I think the second one is pretty good.
>
<snip/>

That's always the case, and not everyone has the privilege of being
able to use unreleased home-brew versions.


> >
> > 2.1 because we'd rather get 2.0 out first, instead of waiting to try
> > out the ideas? If you guys think its appropriate, you might even move
> > that conversation here? Maybe others are interested as well; I am :-)
> > Plus it will give us better background while reading the commit
> > messages.
>
> Peter Steijn initiated the private email to me. It's up to him if he
> wants to discuss it on list. I will let him talk about his ideas for
> his thesis to whomever he chooses.
>
<snap/>

Of course, wanted to let him know I was interested as well.

[Snipped good summary of some of the existing pool issues and
expectation out of first composite pool release].


> > P.S.- [pool] code is quite hard to read with all that horizontal
> > scrolling. Irrespective of the code already in place, maybe we should
> > stick to a reasonable (80?) character line width for new code?
>
> The code I contribute to apache is code I wrote for pleasure. The code
> I contribute is in the form that was most pleasurable for me to write
> in. I impose no restrictions on how others choose to write their code.
> If you wish to compensate me to write code differently or reject my
> contributions because of such trivial issues, that is fine. The ASL
> grants anyone the right reformat ASL licensed code however they see
> fit. I only request that I am not stripped of attribution for my
> contributions.
<snip/>

<exclamation-mark/>

It was "merely a suggestion". Will address this in further detail in a
separate email, later.

-Rahul


> --
> Sandy McArthur
>
> "He who dares not offend cannot be honest."
> - Thomas Paine
>



Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

Posted by Sandy McArthur <sa...@apache.org>.
On 3/27/06, Rahul Akolkar <ra...@gmail.com> wrote:
> On 3/26/06, Sandy McArthur <sa...@apache.org> wrote:
> > On 3/25/06, Rahul Akolkar <ra...@gmail.com> wrote:
> > > On 3/25/06, Sandy McArthur <sa...@apache.org> wrote:
> > > <snip/>
> > > > The enums above don't actually specify any implementation, they
> > > > describe desired features of a pool. The actual implementation isn't
> > > > broken down into four parts like that so try not to confuse how you
> > > > would implement that feature with how you would request that feature.
> > > >
> > > <snap/>
> > >
> > > Why isn't it broken down like that?
> >
> > Because there are fundamentally three parts to a pool's behavior.
> > 1. How objects are treated while they are in the idle object pool.
> > 2. How objects are added/removed from the idle object pool.
> > 3. How objects are treated while they are out of the pool, aka: active.
> >
> > I choose to map these three aspects to four types of behavior because
> <snip/>
> > The composite pool is already plugable, you must give it a
> > PoolableObjectFactory. :-)
> <snap/>
>
> Isn't the PoolableObjectFactory orthogonal to the four enum types you
> mention? Those tune the FactoryConfig?

Yes, hence the smiley.

> > Not everything is made better because it's made plugable (or
> <snip/>
> > I'm not against plugable APIs. They often make sense but not always,
> > and this is one time they don't. I also want the composite pool code
> > to be "future proof".
> <snap/>
> > With a fully plugable API the synchronization optimization above
> > wouldn't really be available.
> <snip/>
>
> No silver bullet, we can only guarantee what is provided out of the
> box. All the internal optimizations you list may be appealing, and
> what I take from your post is that the optimizations rely on fact that
> the four "behavior" impls will always be one of the enum'ed ones. The
> question that I still cannot address is -- what is the implication of
> needing a enum'ed behavior that is not provided? Maybe that is what
> really makes this "future proof". I don't think we know.

The implication for a programmer wanting a feature that isn't
expressible with the enums is one of:
* the programmer customizes one of the other existing ObjectPool
implementations available to him already.
* the programmer uses his ASL given right to customize and enhance the
source to meet his needs. (If we're lucky he'll submit a patch back to
us.)

Neither of those are terrible, end of the world implications.
Personally I think the second one is pretty good.

> > Peter Steijn who emailed a week ago is exploring
> > some new ways to improve the performance of Pool (and by extension
> > Dbcp) by using some more complex threading behaviors. We've discussed
> > some of his ideas off-list and provided his ideas pan out maybe the
> > composite pool code in Pool 2.1 will be faster and client code using
> > pool won't have to know or care how the improved performance came
> > about.
> >
> <snap/>
>
> 2.1 because we'd rather get 2.0 out first, instead of waiting to try
> out the ideas? If you guys think its appropriate, you might even move
> that conversation here? Maybe others are interested as well; I am :-)
> Plus it will give us better background while reading the commit
> messages.

Peter Steijn initiated the private email to me. It's up to him if he
wants to discuss it on list. I will let him talk about his ideas for
his thesis to whomever he chooses.

Regarding the commit message mentioning Peter, I just wanted
to give him credit for triggering the chain of thought that led me to
take an idea I had for optimizing destroyObject (recently committed)
and apply it to makeObject (also recently committed).

> > > Since we're talking about Pool 2.0 and beyond, perhaps a focus on
> > > similar extensibility is justified, and maybe we should revisit the
> > > enumeration approach, even before we get to names.
> >
> > If you want to implement a more plugable pool implementation and put
> > the plugable pool and composite pool code in a steel cage match to the
> > death based on usability, flexibility, and performance I'm all for it.
> >
> <snip/>
>
> I think this has value. In terms of a major version release, we might
> want to use some of liberties we get the best we can, and to that end,
> playing with multiple options can be beneficial. If I can make some
> progress on the things that are already on my plate within the next
> few months, I'll play in a new branch (and in that case, I'll ping the
> list for objections first). However I am hardly talking about a
> complete rewrite, so it should have the same usability and performance
> (for out-of-the-box impls) since I'll base it off of your
> contribution, just to be more flexible on the enums front. Maybe that
> puts it in a higher weight class already? ;-)

Go for it, but when designing a good API it's just as important to
consider what you are leaving out.

For example, Gary Gregory recently requested that isClosed() in
BaseObjectPool be made public, but that isn't really what he wanted.
What he wanted was for the pool not to throw an exception when you
call close a second time or when you call returnObject after close has
been called, and he'll get this behavior with pool 2.

Making isClosed public like he requested seems like it would have met
his needs as a quick and easy fix, but it wouldn't really work. Not
only is it harder to use, because now he'd have to call isClosed to
guard each access to the pool to avoid exceptions, but it wouldn't work
because between the call to isClosed and returnObject the pool could
have been closed by another thread. To fix that he'd have to
synchronize (hurting pool performance), check isClosed, and then call
returnObject.
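
The check-then-act problem can be sketched like this. TolerantPool is
a hypothetical class, not the commons-pool API, and it uses plain
synchronized methods only for brevity; the point is that the closed
check and the return happen atomically inside the pool rather than in
client code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Racy client-side pattern that a public isClosed() would invite:
//   if (!pool.isClosed()) {      // another thread may close the pool here
//       pool.returnObject(obj);  // ...so this call can still fail
//   }

// Hypothetical pool that tolerates calls made after close().
final class TolerantPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private boolean closed;

    // Check and act under one lock; no caller-side guard is needed.
    synchronized boolean returnObject(T obj) {
        if (closed) {
            return false; // discard quietly instead of throwing
        }
        idle.push(obj);
        return true;
    }

    // Closing a second time is harmless.
    synchronized void close() {
        closed = true;
        idle.clear();
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

Client code then calls returnObject unconditionally and never needs to
ask whether the pool is closed.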

GenericObjectPool really suffers from naively tacking on features
without an eye to usability or solving the real problem. A FIFO pool
doesn't need an idle object evictor, all idle objects get touched over
time during normal usage. GOP has an evictor because people noticed
old idle objects never being removed from the pool. This is because
GOP was actually a LIFO: during heavy usage the pool may grow large,
and until the next heavy usage the deepest idle object won't be
touched at all.
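
A toy simulation of that difference, assuming light usage where each
borrowed object is returned before the next borrow. IdleOrderDemo and
its touched method are hypothetical and model only the ordering of
idle objects, nothing else about a real pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class.
final class IdleOrderDemo {
    // Borrows and immediately returns one object `cycles` times and
    // reports which pooled object ids were ever handed out.
    static Set<Integer> touched(boolean lifo, int poolSize, int cycles) {
        Deque<Integer> idle = new ArrayDeque<>();
        for (int id = 0; id < poolSize; id++) {
            idle.addLast(id);
        }
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < cycles; i++) {
            Integer obj = idle.pollFirst(); // borrow from the head
            seen.add(obj);
            if (lifo) {
                idle.addFirst(obj);         // LIFO: back on top
            } else {
                idle.addLast(obj);          // FIFO: back of the queue
            }
        }
        return seen;
    }
}
```

Under LIFO the same object is reused on every cycle and the rest sit
idle indefinitely, while FIFO rotates through all of them.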

When the composite pool is first released I fully expect the first
feature request to be a naive request for a minIdle configuration
option. It was a completely intentional decision that the composite
pool code does not have a minIdle feature. Adding minIdle only treats
the symptom: making new objects is slow, which slows the pool.
What should be addressed is how the pool can remain fast despite slow
operations. I didn't know how to solve this back in November 2005 when
I wrote the composite pool code, but Peter has some ideas and he helped
me get an idea. The end result is/will be a pool that needs one less
configuration option and performs optimally.

> P.S.- [pool] code is quite hard to read with all that horizontal
> scrolling. Irrespective of the code already in place, maybe we should
> stick to a reasonable (80?) character line width for new code?

The code I contribute to apache is code I wrote for pleasure. The code
I contribute is in the form that was most pleasurable for me to write
in. I impose no restrictions on how others choose to write their code.
If you wish to compensate me to write code differently or reject my
contributions because of such trivial issues, that is fine. The ASL
grants anyone the right to reformat ASL licensed code however they see
fit. I only request that I am not stripped of attribution for my
contributions.
--
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine



Re: [pool] why the composite pool implementation isn't plugable [was: picking descriptive class names]

Posted by Rahul Akolkar <ra...@gmail.com>.
On 3/26/06, Sandy McArthur <sa...@apache.org> wrote:
> On 3/25/06, Rahul Akolkar <ra...@gmail.com> wrote:
> > On 3/25/06, Sandy McArthur <sa...@apache.org> wrote:
> > <snip/>
> > > The enums above don't actually specify any implementation, they
> > > describe desired features of a pool. The actual implementation isn't
> > > broken down into four parts like that so try not to confuse how you
> > > would implement that feature with how you would request that feature.
> > >
> > <snap/>
> >
> > Why isn't it broken down like that?
>
> Because there are fundamentally three parts to a pool's behavior.
> 1. How objects are treated while they are in the idle object pool.
> 2. How objects are added/removed from the idle object pool.
> 3. How objects are treated while they are out of the pool, aka: active.
>
> I choose to map these three aspects to four types of behavior because
<snip/>
> The composite pool is already plugable, you must give it a
> PoolableObjectFactory. :-)
<snap/>

Isn't the PoolableObjectFactory orthogonal to the four enum types you
mention? Those tune the FactoryConfig?


>
> Not everything is made better because it's made plugable (or
<snip/>
> I'm not against plugable APIs. They often make sense but not always,
> and this is one time they don't. I also want the composite pool code
> to be "future proof".
<snap/>
> With a fully plugable API the synchronization optimization above
> wouldn't really be available.
<snip/>

No silver bullet, we can only guarantee what is provided out of the
box. All the internal optimizations you list may be appealing, and
what I take from your post is that the optimizations rely on the fact
that the four "behavior" impls will always be one of the enum'ed ones.
The question that I still cannot address is -- what is the implication
of needing an enum'ed behavior that is not provided? Maybe that is what
really makes this "future proof". I don't think we know.


> Peter Steijn who emailed a week ago is exploring
> some new ways to improve the performance of Pool (and by extension
> Dbcp) by using some more complex threading behaviors. We've discussed
> some of his ideas off-list and provided his ideas pan out maybe the
> composite pool code in Pool 2.1 will be faster and client code using
> pool won't have to know or care how the improved performance came
> about.
>
<snap/>

2.1 because we'd rather get 2.0 out first, instead of waiting to try
out the ideas? If you guys think it's appropriate, you might even move
that conversation here? Maybe others are interested as well; I am :-)
Plus it will give us better background while reading the commit
messages.


> >
> > Since we're talking about Pool 2.0 and beyond, perhaps a focus on
> > similar extensibility is justified, and maybe we should revisit the
> > enumeration approach, even before we get to names.
>
> If you want to implement a more plugable pool implementation and put
> the plugable pool and composite pool code in a steel cage match to the
> death based on usability, flexibility, and performance I'm all for it.
>
<snip/>

I think this has value. In terms of a major version release, we might
want to use some of the liberties we get as best we can, and to that
end, playing with multiple options can be beneficial. If I can make some
progress on the things that are already on my plate within the next
few months, I'll play in a new branch (and in that case, I'll ping the
list for objections first). However I am hardly talking about a
complete rewrite, so it should have the same usability and performance
(for out-of-the-box impls) since I'll base it off of your
contribution, just to be more flexible on the enums front. Maybe that
puts it in a higher weight class already? ;-)

-Rahul

P.S.- [pool] code is quite hard to read with all that horizontal
scrolling. Irrespective of the code already in place, maybe we should
stick to a reasonable (80?) character line width for new code?
