You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hc.apache.org by Roland Weber <ht...@dubioso.net> on 2007/01/20 23:46:47 UTC

Re: [HttpConn] manager

Hi Oleg,

> Mike put a lot of hard work into the multithreaded connection manager.
> To this point it still constitutes the most significant and complex
> contribution to the project. Even though it does look like some kind of
> black magic MTHCM has been working really well for us. I urge you to not
> start taking it apart until we have enough test coverage to test for
> regressions. Let's just get it working with the new API and then see
> what could be improved.

I appreciate the work that Mike and others have put into the MTHCM.
And I can well imagine how it developed over time. I'm sure that
originally connections were handed out directly, which explains why
they get deleted and created instead of being managed. Then the
wrappers came, and so on.
But I'm currently looking at the result of all these changes over
time. And I can't see where to start with making _small_ changes.
I can't "just make it work with the new API".
I thought it couldn't be so hard to have one interface for the
connections being managed and another one for the wrappers being
given to the caller. Then I look at the code and realize that the
connections are not being managed by reference, they are subclassed.
And the only purpose of the subclass is to create a weak reference
to the object itself, which will then be put into some static map,
to be retrieved I don't know when for I don't know what. Oh, and
there is code to put references in and out of those maps for reasons
I don't understand either.
The idea of moving some of the classes out to make things simpler
was a dead end too. The stuff is nested, with most of the complexity
being in the ConnectionPool class with plenty of private methods.
It won't help to move it to an extra file and make the methods public.

Basically nothing in MTHCM works in a way that is remotely in line
with my understanding of what the class is supposed to do. I don't
work or think that way. To me, a pool is a finite set of elements,
some of which are allocated and some of which are not. You keep
references to those elements and when the pool is shut down, you
shut down the elements. The manager *knows* which elements it is
managing. Weak references and garbage collection???
I'm sorry, but I have serious issues with this code. I'm not just
feeling a bit uncomfortable about it, and it's not "I would have
implemented this differently" either. It's not the formatting and
not the variable or method names. The class is based on an extremely
complex design which is completely alien to me. I believe that much
of the complexity will simply vanish if a more straightforward
design is applied to the problem.

I'll give the code one more try tomorrow. But only one.
Meanwhile, I'll appreciate any background information on the
current design that somebody can give me.

cheers,
  Roland


---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Michael Becke <mb...@gmail.com>.
> I got it. When being allocated, the connection was counted
> for the "connections per host" limit. When it is released,
> some other thread waiting for a connection to that host can
> be unblocked. We need to remember the host configuration
> for which the connection was allocated to uncount it when
> it gets released.

That is correct.

> I guess there is not even an unspecific list? Connections
> are kept in two lists simultaneously, one is always host
> specific while the other one keeps all connections in the
> pool. Currently, connections are never re-assigned to a
> different host, but instead they are deleted and a new
> connection is created for the new host.

It does keep a list of freeConnections (for the purpose of deleting a
least recently used connection).  This only occurs when the host limit
_has not_ been reached, the total allocated connection limit _has_
been reached, and there are free connections.  As you say though a
connection is never switched between hosts, it is just deleted and a
new one is created.

> I was a bit confused that HttpMethodDirector updates a
> HostConfiguration when following redirects, but that is
> not the HostConfiguration stored in the connection.

Yes, this is one of the strange "features" of the connection manager.
The current limits on connections per host (route really) are only
enforced at checkout time.  Once a connection is outside of the
connection manager it can be reused for other hosts.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Roland Weber <ht...@dubioso.net>.
> It seems that even closed connections are being put back
> into the pool for the host to which they were opened.

I got it. When being allocated, the connection was counted
for the "connections per host" limit. When it is released,
some other thread waiting for a connection to that host can
be unblocked. We need to remember the host configuration
for which the connection was allocated to uncount it when
it gets released.

> In my first attempt, I delete the HostConfiguration when
> the connection gets closed. The result is a SEVERE error
> message when the connection is released. I would have
> expected the closed or unassigned connections to just
> end up in the non-specific list of free connections.

I guess there is not even an unspecific list? Connections
are kept in two lists simultaneously, one is always host
specific while the other one keeps all connections in the
pool. Currently, connections are never re-assigned to a
different host, but instead they are deleted and a new
connection is created for the new host.

I was a bit confused that HttpMethodDirector updates a
HostConfiguration when following redirects, but that is
not the HostConfiguration stored in the connection.

cheers,
  Roland


---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Roland Weber <ht...@dubioso.net>.
Hi Mike,

looks like I was a bit premature: I do have a question :-)

It seems that even closed connections are being put back
into the pool for the host to which they were opened.
In my first attempt, I delete the HostConfiguration when
the connection gets closed. The result is a SEVERE error
message when the connection is released. I would have
expected the closed or unassigned connections to just
end up in the non-specific list of free connections.
Is my understanding correct, and would it be acceptable
to handle connections that way?

cheers,
  Roland

---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Roland Weber <ht...@dubioso.net>.
Hi Mike,

> I'm sorry I didn't see your email earlier or I could have saved you
> some headaches.  Sounds like you've figured it out though.  Your
> description of the weak references, etc below is spot on and quite
> well written.

Thanks, it's good to learn that I figured it out correctly.

> Please let me know if there is anything I can add/explain.

For the time being, I'm reasonably comfortable with the code.
But I'd appreciate it very much if you could spend some time on
a code review once I'm through with the rewrite. Since you've got
the most experience with the MTHCM, you're in the best position
to spot where I might have gone into the wrong direction. Due to
the collection classes, there are plenty of downcasts. So I can't
rely on the compiler's type checks when dealing with the
replacements for the old connection interface on various levels.

cheers,
  Roland



---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Michael Becke <mb...@gmail.com>.
Hi Roland,

I'm sorry I didn't see your email earlier or I could have saved you
some headaches.  Sounds like you've figured it out though.  Your
description of the weak references, etc below is spot on and quite
well written.  As you have discovered, most of the complexity of the
MTHCM comes from the support for handling GCed connections.  This part
of the MTHCM was added after the original design was completed, so
there is likely a better/simpler way to achieve the same results.
Please let me know if there is anything I can add/explain.

Mike

On 1/20/07, Roland Weber <ht...@dubioso.net> wrote:
> Good news,
>
> some inspiring music on mind-shaking volume helped me
> solve the riddle of the MTHCM, and with a good night's
> sleep I'm ready to tackle the code. I even dimly remember
> the discussions about the garbage collection thing.
>
> The reason for the weak references is that an application
> may allocate a connection and then just forget about it
> without releasing it. Since the manager does not maintain
> a hard reference to the connection, it will be GCed and
> the vanishing weak reference tells the manager(s) that a
> connection has been lost. Without the connection object,
> it's hard to tell _which_ manager issued the connection
> that was lost, hence there is a second map from the weak
> references to the responsible managers.
> I guess this strategy was introduced before the wrappers
> that are being handed out nowadays. With the wrappers in
> place, we can manage the connections in the traditional
> way and instead keep a weak reference to the _wrapper_.
> When the wrapper is GCed, the connection can be reclaimed.
> That way, connections will never be cleaned up by the GC
> (except when the whole connection manager is GCed) and
> we don't have to rely on finalizers being called in order
> to release for example native resources associated with
> the connection.
>
> Another missing puzzle piece is the route tracking that
> is implemented in the old connection class by means of
> a HostConfiguration and tunnelCreated flag. The new
> interfaces have been defined bottom-up, so they don't
> have this feature which is required by the manager. It's
> not possible to map a managed connection to an operated
> connection by a simple adapter without adding route
> tracking. I'll need an extra class here, and that is the
> place where I can fit the subclassing I mentioned earlier.
>
> I'll get started right away...
>
> cheers,
>   Roland
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Re: [HttpConn] manager

Posted by Roland Weber <ht...@dubioso.net>.
Good news,

some inspiring music on mind-shaking volume helped me
solve the riddle of the MTHCM, and with a good night's
sleep I'm ready to tackle the code. I even dimly remember
the discussions about the garbage collection thing.

The reason for the weak references is that an application
may allocate a connection and then just forget about it
without releasing it. Since the manager does not maintain
a hard reference to the connection, it will be GCed and
the vanishing weak reference tells the manager(s) that a
connection has been lost. Without the connection object,
it's hard to tell _which_ manager issued the connection
that was lost, hence there is a second map from the weak
references to the responsible managers.
I guess this strategy was introduced before the wrappers
that are being handed out nowadays. With the wrappers in
place, we can manage the connections in the traditional
way and instead keep a weak reference to the _wrapper_.
When the wrapper is GCed, the connection can be reclaimed.
That way, connections will never be cleaned up by the GC
(except when the whole connection manager is GCed) and
we don't have to rely on finalizers being called in order
to release for example native resources associated with
the connection.

Another missing puzzle piece is the route tracking that
is implemented in the old connection class by means of
a HostConfiguration and tunnelCreated flag. The new
interfaces have been defined bottom-up, so they don't
have this feature which is required by the manager. It's
not possible to map a managed connection to an operated
connection by a simple adapter without adding route
tracking. I'll need an extra class here, and that is the
place where I can fit the subclassing I mentioned earlier.

I'll get started right away...

cheers,
  Roland

---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org