Posted to dev@avalon.apache.org by Leo Sutic <le...@inspireinfrastructure.com> on 2002/06/09 15:47:36 UTC

[Design] ContainerManager is under fire--let's find the best

You wrote:
>The CM's lookup/release pair remind me a lot of the new/delete pair --
>which we don't have in Java because the VM does the 'delete' for us.
>
>If it is ok to let the VM reclaim memory assigned to objects without 
>having much control over the process, why is there a problem applying 
>this approach to reclaiming components for a pool?
>
>couldn't the same arguments work in each case?

Robert,

the arguments are not the same. You equate a component with something like
a String instance - an object that is cheap and doesn't cause any harm by
being on the heap. For example, the code:

  for (int i = 0; i < 1000; i++) {
    String s = new String ("This is String number " + i);
  }

eats up some heap space, but that is all. The rest of the system is not
adversely affected by the String instances on the heap, as we have not
"used up" 1000 precious String instances. To put it another way, there is
no shortage of Strings - use as many as you want!

However, this code:

  for (int i = 0; i < 1000; i++) {
    InputStream is = new FileInputStream ("some_file." + i);
  }

is more dangerous. As we open files without closing them, the file handles
will remain open until the InputStreams are GC'ed. This is bad, because as
long as those InputStreams are on the heap, they hold on to a limited
system resource - file handles. There is no shortage of Strings, but there
is a shortage of file handles. That's why you should call close() on all
streams that you open, instead of waiting for them to be GC'ed.
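
A minimal sketch of the close-as-you-go alternative to the loop above,
written in present-day Java. The class name CloseStreamsDemo, the method
readAll and the temporary files it creates are made up for illustration;
only FileInputStream, Files and the try-with-resources statement come from
the JDK.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CloseStreamsDemo {
    /**
     * Creates `count` one-byte temporary files (hypothetical test data),
     * reads the single byte from each, and returns the sum of those bytes.
     */
    static int readAll(int count) {
        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                Path file = Files.createTempFile("some_file", "." + i);
                Files.write(file, new byte[] { 'x' });
                // try-with-resources calls close() when the block exits,
                // so the file handle is given back before the next
                // iteration instead of whenever the GC gets around to it.
                try (InputStream is = new FileInputStream(file.toFile())) {
                    sum += is.read();
                } finally {
                    // Runs after the stream has been closed.
                    Files.delete(file);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }
}
```

With this shape at most one file handle is open at any time, however many
iterations the loop runs.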

There are other examples of "precious" objects - for example DB
connections. PostgreSQL allows (by default) only a maximum of 24
simultaneous connections. That's why you should call close() on your DB
connections as soon as possible. (Other DBs may have other defaults.)

I hope the above explains why I do not think GC of components can be
considered equivalent to GC of regular objects.

"But", you say, "all components that are 'rare' and thus must be close()'d
or similar usually have a close() method, so why bother at all? It appears
that all objects that need something besides normal GC already have the
extra mechanisms they need."

The problem with that is that in Avalon, you do not know if the component
you are using needs something besides the usual GC. Every component is
defined by its interface, but the actual implementation may vary. So you
may have an implementation that works fine with normal GC, and one that
needs to be explicitly close()'d or equivalent. As the interface must be
able to accommodate both, you end up writing the interface for the worst
case: you include a close() method (or something else equivalent).

Consider, for example, that you could switch implementations of InputStream
to one that does not need you to call close(). Let's say this may or may
not be supported on your target system. So we have a FileInputStream whose
close() method is empty (for backwards compatibility). Right, what do you
code? Well, since you do not know if your code will run on a system with
no-need-to-close streams, you end up writing code to close the stream
anyway.

The same reasoning is behind my wish for a release() mechanism. Since we do
not know whether the component is pooled or not, and since no suitable
automatic method of reclaiming components has been found (and I doubt it
will be found in time for Avalon 6 even), we must have a release()
mechanism.
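
A minimal sketch of the lookup/release pairing argued for above. The names
Greeter, PoolingManager and ReleaseDemo are hypothetical stand-ins made up
for this illustration and are not the actual Avalon interfaces; the point
is only that the client code looks the same whether or not the
implementation behind the interface is pooled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical component role; the client sees only this interface. */
interface Greeter {
    String greet(String name);
}

/** Hypothetical stand-in for a component manager that pools components. */
final class PoolingManager {
    private final Deque<Greeter> idle = new ArrayDeque<>();
    private int created = 0;

    /** Hands out an idle instance if there is one, else creates one. */
    Greeter lookup() {
        Greeter g = idle.poll();
        if (g == null) {
            created++;
            g = name -> "Hello, " + name;
        }
        return g;
    }

    /** Returns the instance to the pool so the next lookup() can reuse it. */
    void release(Greeter g) {
        idle.push(g);
    }

    int created() {
        return created;
    }
}

final class ReleaseDemo {
    /** Performs `lookups` lookups and reports how many instances were created. */
    static int run(int lookups, boolean releaseAfterUse) {
        PoolingManager cm = new PoolingManager();
        for (int i = 0; i < lookups; i++) {
            Greeter g = cm.lookup();
            try {
                g.greet("component " + i);
            } finally {
                // A well-behaved client always releases here, because it
                // cannot know whether the implementation is pooled. The
                // flag exists only to show what happens when it does not.
                if (releaseAfterUse) {
                    cm.release(g);
                }
            }
        }
        return cm.created();
    }
}
```

In this sketch ReleaseDemo.run(1000, true) creates a single instance, while
ReleaseDemo.run(1000, false) creates 1000 of them and leaves their
reclamation to the GC.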


> If the only reason for using pools is performance then some tuning is 
> to be expected whenever they are used (e.g. determining the pool 
> size). This tuning is going to be dependent on implementation (e.g. 
> different implementations will require different pool sizes for best 
> performance).

The GC settings are:

 1) *Brutally* difficult to tune. I do not even know if it is
    possible.

 2) Global. Tune one pool, tune all. This makes it difficult to find
    values that work. Set value A too high, and your DB connection
    pool is suffering, set it lower and some other pool starts acting
    up.

> finally, if a pool runs out of components there is nothing stopping it
> from running the garbage collector itself [though again, this would
> need tuning -- but might reduce the differences between VMs].

Actually, there is. How do you run the collector? The System.gc() call is
only a *suggestion* to the VM to collect. Calling it does not guarantee GC
at all. And even if you do run GC, there is no guarantee that *all*
unreachable objects are finalized - for various reasons, the VM may leave
some for the next pass, which may not happen for a while.

I hope this explains my reasoning behind my opposition to the
GC-as-component-release line, and my strong pro-release() stance.

/LS



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Design] ContainerManager is under fire--let's find the best

Posted by Robert Mouat <ro...@mouat.net>.
It seems that I was too quick to assume that a pool would only be used for
performance reasons and forgot that it can also be used to manage scarce
resources.

In the case where a pool is being used to increase performance by avoiding
creation/initialization overhead, I think that using the VM's GC can
work.  The case where you have a scarce resource is a little trickier...

Leo Sutic wrote:

> You wrote:
> >The CM's lookup/release pair remind me a lot of the new/delete pair --
> >which we don't have in Java because the VM does the 'delete' for us.
> >
> >If it is ok to let the VM reclaim memory assigned to objects without
> >having much control over the process, why is there a problem applying
> >this approach to reclaiming components for a pool?
> >
> >couldn't the same arguments work in each case?
> the arguments are not the same.

<snip>
stuff convincing me that the arguments are not the same.
</snip>

> I hope the above explains why I do not think GC of components can be considered
> equivalent to GC of regular objects.

yes.

> "But", you say, "all components that are "rare" and thus must be close()'d or
> similar, they usually have a close() method, so why bother at all? It appears
> that all objects that need something besides normal GC already have the extra
> mechanisms they need."

I might have been tempted to say something like that... :)

> The problem with that is that in Avalon, you do not know if the component you
> are using needs something besides the usual GC. Every component is defined by
> its interface, but the actual implementation may vary. So you may have an
> implementation that works fine with normal GC, and one that needs to be
> explicitly close()'d or equivalent. As the interface must be able to accommodate
> both, you end up writing the interface for worst-case: you include a close()
> method (or something else equivalent).
> 
> Consider, for example, that you could switch implementations of InputStream to 
> one that does not need you to call close(). Let's say this may or may not be 
> supported on your target system. So we have a FileInputStream whose close() 
> method is empty (for backwards compatibility). Right, what do you code? Well, 
> since you do not know if your code will run on a system with no-need-to-close-
> streams, you end up writing code to close the stream anyway.
> 
> The same reasoning is behind my wish for a release() mechanism. Since we do not 
> know whether the component is pooled or not, and since no suitable automatic 
> method of reclaiming components has been found (and I doubt it will be found in
> time for Avalon 6 even), we must have a release() mechanism.

The problem comes when you have a component that holds a scarce resource
from initialize() until dispose(), and the client using the component
doesn't know about it...

If the client knew the component was holding a scarce resource
(e.g. cm.lookup( "DBConnection" )), then the cm.lookup() could return a
DBConnectionManager which would have a release() method.

If the component only held the resource from, say, open() to close(), or
if it only held the resource while implementing a single method, then a
release() method isn't needed.

Now I'll try to convince you that the situation when a client uses a
component and the component holds a scarce resource throughout its
lifetime (initialize() to dispose()) can be avoided.

firstly, my understanding is that a component is unaware that a release()
has occurred: if it is being pooled then it is returned unaware to the
pool; if it is not being pooled then the dispose() method will soon be
called.

It is possible that a component that is not poolable implements an
interface that doesn't have open/close (or similar) methods, tries to do
everything with one transaction and relies on dispose() to release the
dbconnection/filehandle -- I'm going to claim that this is bad design, and
that the interface doesn't require everything to be done with a single
dbconnection/filehandle.  [are there any counter-examples to this?]

Now if the component is poolable and implements an interface that doesn't
have open/close (or similar) methods then it can't try to do everything in
a single transaction (since it won't know when one transaction ends and
another begins), so it shouldn't need to keep a dbconnection/filehandle
open for its whole lifetime (just obtaining them as needed, and
releasing them afterwards).
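
A minimal sketch of the "obtain as needed, release afterwards" design
described above. Everything here is hypothetical: ScarceHandles models a
limited resource such as DB connections or file handles with a Semaphore,
and PerCallComponent is an invented component that holds a handle only for
the duration of a single method call.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical scarce resource: at most `limit` handles open at once. */
final class ScarceHandles {
    private final Semaphore permits;

    ScarceHandles(int limit) {
        permits = new Semaphore(limit);
    }

    boolean tryOpen() {
        return permits.tryAcquire();
    }

    void close() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }
}

/** Hypothetical component that never keeps a handle between calls. */
final class PerCallComponent {
    private final ScarceHandles handles;

    PerCallComponent(ScarceHandles handles) {
        this.handles = handles;
    }

    String fetch(String key) {
        if (!handles.tryOpen()) {
            throw new IllegalStateException("no handles left");
        }
        try {
            return "value-for-" + key;
        } finally {
            // The handle goes back before the method returns, so nothing
            // scarce is held from initialize() to dispose().
            handles.close();
        }
    }
}

final class PerCallDemo {
    /** Keeps `components` live instances and returns the free handle count. */
    static int run(int components, int limit) {
        ScarceHandles handles = new ScarceHandles(limit);
        PerCallComponent[] live = new PerCallComponent[components];
        for (int i = 0; i < components; i++) {
            live[i] = new PerCallComponent(handles);
            live[i].fetch("key" + i);
        }
        return handles.available();
    }
}
```

Under these assumptions a thousand such components can stay reachable
while sharing two handles, since no instance holds one between calls.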

Is this convincing?  I didn't want to try to come up with examples for
each different case -- if anyone has examples that contradict what I've
said I'll try and address them.

> > If the only reason for using pools is performance then some tuning is 
> > to be expected whenever they are used (e.g. determining the pool 
> > size). This tuning is going to be dependent on implementation (e.g. 
> > different implementations will require different pool sizes for best 
> > performance).
> The GC settings are:
> 
>  1) *Brutally* difficult to tune. I do not even know if it is
>     possible.
> 
>  2) Global. Tune one pool, tune all. This makes it difficult to find
>     values that work. Set value A too high, and your DB connection
>     pool is suffering, set it lower and some other pool starts acting
>     up.

actually I was thinking of tuning the individual pool settings rather than
the VM's GC -- e.g. if you are going to use GC to release components then
you'd probably want a larger pool than if you were going to explicitly
call release(), and the size of the pool may also depend on how good the
VM's GC was.

perhaps it might be possible to write a pool manager that grew/shrank the
size of the pool depending on how often requests came in while the pool
was empty [if such a beast doesn't already exist].
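
One possible shape for such a pool manager, as a rough sketch only. The
class AdaptivePool, its threshold parameter and the grow/shrink rule are
all assumptions made up for this illustration: the target size grows after
a number of requests arrive while the pool is empty, and shrinks after a
number of requests are served while spare instances are still sitting
idle.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool whose target size follows observed demand. */
final class AdaptivePool {
    private final Deque<Object> idle = new ArrayDeque<>();
    private final int threshold;
    private int targetSize;
    private int misses;     // requests that found the pool empty
    private int spareHits;  // requests served while spares remained idle

    AdaptivePool(int initialSize, int threshold) {
        this.targetSize = initialSize;
        this.threshold = threshold;
    }

    Object acquire() {
        Object component = idle.poll();
        if (component == null) {
            spareHits = 0;
            if (++misses >= threshold) {
                targetSize++;          // demand exceeded supply: grow
                misses = 0;
            }
            return new Object();
        }
        misses = 0;
        if (!idle.isEmpty() && ++spareHits >= threshold && targetSize > 1) {
            targetSize--;              // spare capacity went unused: shrink
            idle.pollLast();           // drop one idle instance
            spareHits = 0;
        }
        return component;
    }

    void release(Object component) {
        // Instances beyond the target size are dropped and left to the GC.
        if (idle.size() < targetSize) {
            idle.push(component);
        }
    }

    int targetSize() {
        return targetSize;
    }
}

final class AdaptivePoolDemo {
    /** Returns the target size after a burst and after a quiet period. */
    static int[] simulate() {
        AdaptivePool pool = new AdaptivePool(1, 2);
        Object[] held = new Object[3];
        for (int i = 0; i < held.length; i++) {
            held[i] = pool.acquire();  // burst: three held at once
        }
        for (Object component : held) {
            pool.release(component);
        }
        int afterBurst = pool.targetSize();
        for (int i = 0; i < 4; i++) {
            pool.release(pool.acquire());  // quiet: one at a time
        }
        return new int[] { afterBurst, pool.targetSize() };
    }
}
```

The rule here is deliberately crude; a real pool would likely want
time-based decay and upper bounds, which this sketch leaves out.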

> > finally, if a pool runs out of components there is nothing stopping it
> > from running the garbage collector itself [though again, this would 
> > need tuning -- but might reduce the differences between VMs].
> Actually, there is. How do you run the collector? The System.gc() call is only
> a *suggestion* to the VM to collect. Calling it does not guarantee GC at all. 
> And even if you do run GC, there is no guarantee that *all* unreachable objects 
> are finalized - for various reasons, the VM may leave some for the next pass. 
> Which may not happen in a while.

oops, I should read the javadocs more carefully:

  Calling the gc method suggests that the Java Virtual Machine expend
  effort toward recycling unused objects in order to make the memory they
  currently occupy available for quick reuse. When control returns from
  the method call, the Java Virtual Machine has made a best effort to
  reclaim space from all discarded objects. 

you're right, while this implies that we can expect memory to be reclaimed
from finalized objects, it doesn't say that finalization will occur (and
hence would be of no use in the current context).

> I hope this explains my reasoning behind my opposition to the GC-as-component-
> release line, and my strong pro-release() stance.

yes, and I hope that my arguments show that it should be possible to
design components so that an explicit release() isn't needed...

having said that, it doesn't mean that this is the best way to go -- it
might cause troubles with legacy components that hold
dbconnections/filehandles... or if somehow you get stuck trying to combine
an inappropriate interface-implementation pair (e.g. the implementation
needs a close() and the interface doesn't have such a method).

.... I think I'll now go and hold a different point of view on a different
branch of this thread... :)

Robert.


