Posted to dev@cxf.apache.org by Daniel Kulp <dk...@apache.org> on 2012/07/27 21:20:10 UTC

Async http client experiments....

I've committed a few experiments I've been working on to:

https://svn.apache.org/repos/asf/cxf/sandbox/dkulp_async_clients


Basically, I've been trying to find an async client that is somewhat usable 
for CXF without completely re-writing all of CXF.   Not exactly an easy 
task.  For "POST"s, they pretty much are all designed around being able to 
blast out pre-rendered content (like File's or byte[]).   Doesn't really fit 
with CXF's way of streaming out the soap messages as they are created.   

My notes on 4 client API's I've played with:

1) Ning/Sonatype Async Client:
http://sonatype.github.com/async-http-client/
(using Netty backend, didn't try the others)

This one really needed pre-rendered content.  When you can determine the 
Content-Length up front, it's really not too bad.   However, once you try to 
flip to Chunked, I just kept running into issues.  In the end, I could not 
find an API that would let me use Chunked that actually worked.   The one 
that looked like it should actually didn't write the chunk headers out.  No 
idea what's up with it.  Got very frustrated with it.

2) Netty directly:  https://netty.io/
I did get the simple cases working with this fairly easily.  Seems quite 
powerful and it actually performed really well.  The problem is that it's 
really very low level and doesn't provide a lot of things like Keep-Alive 
connection management "out of the box".  That's something we'd have to write 
a lot of code to handle.  Not something I wanted to really tackle.   That 
said, for small requests, Netty was the only one to stick the HTTP headers 
AND the body into a single network packet.   Even the URLConnection in the 
JDK doesn't do that.  Pretty cool.

3) Jetty Client:  http://wiki.eclipse.org/Jetty/Tutorial/HttpClient
I did get this to work for most of my test cases.   There are definitely a 
few "issues" that I think are likely bugs in Jetty, but easily worked 
around.  For example, to get it to ask for chunks, you have to first return 
an "empty" input stream.  Minor bugs aside, it did seem to work OK.  
However, using the chunks did expose some issues on a raw network side.   
Doing tcp dumps revealed that most of the chunk headers ended up in their 
own TCP packets so it ended up taking a bit longer due to extra packets 
transferred.  The other thing I didn't really like was it kind of required 
much more byte[] copying than I really wanted. 

4) Apache HTTP Components (HC)- this was the first one I tried, ran into 
performance issues, abandoned it to test the others, then came back to it 
and figured out the performance issue.  :-)   I had this "working", but a 
simple "hello world" echo in a loop resulted in VERY VERY slow operation, 
about 20x slower than the URLConnection in the JDK.   Couldn't figure out 
what was going on which is why I started looking at the others.   I came 
back to it and started doing wireshark captures and discovered that it was 
waiting for ACK packets whereas the other clients were not.   The main issue 
was that the docs for how to set the TCP_NO_DELAY flag (which, to me, should 
be the default) seem to be more geared toward the 3.x or non-NIO versions.   
Anyway, once I managed to get that set, things improved significantly.  For 
non-chunked data, it seems to be working very well.  For chunked data, it 
seems to work well 99% of the time.   It's that last 1% that's going to 
drive me nuts.  :-(   It's occasionally writing out bad chunk headers, and 
I have no idea why.   A raw wireshark look certainly shows bad chunk headers 
heading out.   I don't know if it's something I'm doing or a bug in their 
stuff.  Don't really know yet.
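
For anyone unfamiliar with the flag mentioned in note 4: TCP_NO_DELAY turns off Nagle's algorithm, which otherwise holds back small writes until earlier data has been ACKed. The sketch below is NOT the HttpCore NIO configuration used in the sandbox. It only shows the flag on a plain JDK socket over loopback, and the class and method names (NoDelayDemo, connectWithNoDelay) are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of CXF or HttpComponents.
final class NoDelayDemo {

    /** Connects to a throwaway loopback listener, sets TCP_NODELAY, and reports the flag. */
    static boolean connectWithNoDelay() {
        // Port 0 lets the OS pick a free port; the listen backlog holds the
        // connection, so no accept() call is needed for this demo.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setTcpNoDelay(true);      // disable Nagle: small writes go out immediately
            return client.getTcpNoDelay();   // read the option back from the socket
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay());
    }
}
```

With an NIO-based client the same option has to be applied to the channel's socket through whatever configuration hook the library offers, which is the part the docs did not make obvious.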

In any case, I'm likely going to pursue option #4 a bit more and see if I can 
figure out the last issue with it.
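
As a reference point for the chunk header problem: in HTTP/1.1 chunked transfer coding each chunk goes out as the size in hex, CRLF, the data, CRLF, and the body ends with a zero-size chunk. A minimal sketch of that framing in plain Java follows, assuming no chunk extensions and no trailers. ChunkedWriter is a hypothetical name, not a class from CXF or HC.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that frames data as HTTP/1.1 chunks (no extensions, no trailers).
final class ChunkedWriter {
    private static final byte[] CRLF = {'\r', '\n'};
    private final OutputStream out;

    ChunkedWriter(OutputStream out) {
        this.out = out;
    }

    /** Writes one chunk: hex size, CRLF, payload, CRLF. */
    void writeChunk(byte[] data, int off, int len) throws IOException {
        if (len == 0) {
            return; // a zero-size chunk would end the body, so skip empty writes
        }
        out.write(Integer.toHexString(len).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(data, off, len);
        out.write(CRLF);
    }

    /** Writes the terminating zero-size chunk followed by the empty trailer section. */
    void finish() throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }

    /** Frames two small chunks into memory and returns what would go on the wire. */
    static String demo() {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            ChunkedWriter writer = new ChunkedWriter(wire);
            byte[] first = "hello".getBytes(StandardCharsets.US_ASCII);
            byte[] second = " world".getBytes(StandardCharsets.US_ASCII);
            writer.writeChunk(first, 0, first.length);
            writer.writeChunk(second, 0, second.length);
            writer.finish();
            return new String(wire.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Comparing a wireshark capture against this layout is one way to spot a bad header: the hex size must match the byte count up to the next CRLF exactly.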

From a performance standpoint, for synchronous request/response, none of 
them perform as well as the in-jdk HttpURLConnection for what we do.   Netty 
came the closest at about 5% slower.  HC was about 10%, Jetty about 12%.  
Gave up on Ning before running benchmarks.   

However, as expected, the real win is when you use the JAX-WS async API's to 
call onto a "slow" service.   (I only have the HC version working for this 
at this point)   If you tune the connection pool to allow virtually 
unlimited connections to localhost (mimics the HttpURLConnection for 
comparison), it's pretty awesome.   I have a simple service that uses the 
CXF continuations to delay for about 2 seconds on the server side to mimic 
some processing on the server side.    5K async requests using the in-JDK 
stuff that we have using our default thread pool settings and such takes 
about 35 seconds for all the requests to complete.    With HC, I have that 
down to about 9 seconds.   However, that does require a bit of tuning of the 
HC defaults and settings which is the next bit of complication.   Using the 
async apis to call a "fast" (where response is returned very quickly) 
service won't benefit as much.
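
To illustrate why the async API pays off mainly for slow services, here is a rough simulation using only the JDK. It assumes a hypothetical callSlowService that completes a future after a fixed delay (a timer stands in for the network and the server-side continuation); nothing here is CXF or HC code, and it does not reproduce the numbers above. The point is only that all requests are in flight at once without a thread parked on each one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation; the "service" is just a timer that completes a future.
final class SlowServiceDemo {

    /** Pretends to call a slow service: the response arrives after delayMillis. */
    static CompletableFuture<String> callSlowService(
            ScheduledExecutorService timer, int id, long delayMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        timer.schedule(() -> {
            response.complete("response-" + id);
        }, delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    /** Fires all requests without blocking, then waits once for the whole batch. */
    static int runAsync(int requests, long delayMillis) {
        // A single timer thread serves every pending request.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(callSlowService(timer, i, delayMillis));
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            int completed = 0;
            for (CompletableFuture<String> f : pending) {
                if (f.isDone() && !f.isCompletedExceptionally()) {
                    completed++;
                }
            }
            return completed;
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runAsync(5000, 2000);
        long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(done + " responses in about " + millis + " ms");
    }
}
```

With a blocking client the same batch needs either one thread per request or several rounds through a bounded pool, which is where the gap for slow services comes from.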

In any case, I've committed my experiments to the sandbox.   It does require 
the latest trunk code for transport-http as well.   Any help or thoughts or 
anything would be more than welcome.  :-)



-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2012-09-06 at 10:14 -0400, Daniel Kulp wrote:
> On Sep 6, 2012, at 9:42 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> 
> > On Wed, 2012-09-05 at 15:30 -0400, Daniel Kulp wrote:
> >> On Sep 5, 2012, at 10:49 AM, Daniel Kulp <dk...@apache.org> wrote:
> >> 
> >> OK.  Went ahead and ripped out the HttpHost related stuff and replaced with a CXFHttpHost which is basically a copy of the HttpHost with extra params for the TLS and Proxy stuff (Proxy stuff unused right now).   With that, we're back to a  single connection for all the TLS requests (providing they CAN).  So that looks good now.    That pretty much just leaves the Proxy stuff as the major missing piece.
> >> 
> > 
> > Would not aggregation be more appropriate here? 
> > 
> > class CXFRoute {
> > 
> >  HttpHost host;
> >  HttpHost proxy;
> >  SSLStuff sstuff;
> > 
> > }
> 
> Possibly, but if HttpHost is eventually made non-final, with the current setup it would be very trivial to rip out large chunks of code.   :-)
> 
> 

No problem at all. Besides, if this experimental code eventually lands
in the main code line I'll make sure generic bits get migrated back to
the framework. This should help reduce the footprint of custom code you
need to maintain internally. I have already made necessary changes to
the shared content i/o classes in the 4.3 branch of HttpCore so that
they could be used instead of the custom classes currently in the
sandbox.

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 6, 2012, at 9:42 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

> On Wed, 2012-09-05 at 15:30 -0400, Daniel Kulp wrote:
>> On Sep 5, 2012, at 10:49 AM, Daniel Kulp <dk...@apache.org> wrote:
>> 
>> OK.  Went ahead and ripped out the HttpHost related stuff and replaced with a CXFHttpHost which is basically a copy of the HttpHost with extra params for the TLS and Proxy stuff (Proxy stuff unused right now).   With that, we're back to a  single connection for all the TLS requests (providing they CAN).  So that looks good now.    That pretty much just leaves the Proxy stuff as the major missing piece.
>> 
> 
> Would not aggregation be more appropriate here? 
> 
> class CXFRoute {
> 
>  HttpHost host;
>  HttpHost proxy;
>  SSLStuff sstuff;
> 
> }

Possibly, but if HttpHost is eventually made non-final, with the current setup it would be very trivial to rip out large chunks of code.   :-)


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2012-09-05 at 15:30 -0400, Daniel Kulp wrote:
> On Sep 5, 2012, at 10:49 AM, Daniel Kulp <dk...@apache.org> wrote:
> 
> > 
> > On Sep 5, 2012, at 9:26 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> >> 
> >> Just ditch HttpHost. You obviously need a richer class to represent
> >> connection routes. HttpClient has HttpRoute class for that matter. You
> >> probably should be using a custom class that also includes HTTP proxy
> >> and SSL context bits specific to CXF.
> >> 
> >> Does this help you in any way?
> > 
> > Yep.   Just feels like I have to subclass/override a lot of behavior from HC instead of "using" it.   If HttpHost wasn't final, it would be so much more useful.   Is there a particular reason why it has to be final?
> > 
> > Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.
> 
> OK.  Went ahead and ripped out the HttpHost related stuff and replaced with a CXFHttpHost which is basically a copy of the HttpHost with extra params for the TLS and Proxy stuff (Proxy stuff unused right now).   With that, we're back to a  single connection for all the TLS requests (providing they CAN).  So that looks good now.    That pretty much just leaves the Proxy stuff as the major missing piece.
> 

Would not aggregation be more appropriate here? 

class CXFRoute {

  HttpHost host;
  HttpHost proxy;
  SSLStuff sstuff;
  
}

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 5, 2012, at 10:49 AM, Daniel Kulp <dk...@apache.org> wrote:

> 
> On Sep 5, 2012, at 9:26 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
>> 
>> Just ditch HttpHost. You obviously need a richer class to represent
>> connection routes. HttpClient has HttpRoute class for that matter. You
>> probably should be using a custom class that also includes HTTP proxy
>> and SSL context bits specific to CXF.
>> 
>> Does this help you in any way?
> 
> Yep.   Just feels like I have to subclass/override a lot of behavior from HC instead of "using" it.   If HttpHost wasn't final, it would be so much more useful.   Is there a particular reason why it has to be final?
> 
> Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.

OK.  Went ahead and ripped out the HttpHost related stuff and replaced with a CXFHttpHost which is basically a copy of the HttpHost with extra params for the TLS and Proxy stuff (Proxy stuff unused right now).   With that, we're back to a  single connection for all the TLS requests (providing they CAN).  So that looks good now.    That pretty much just leaves the Proxy stuff as the major missing piece.

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 11, 2012, at 8:27 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

> On Mon, 2012-09-10 at 17:10 -0400, Daniel Kulp wrote:
>> On Sep 10, 2012, at 6:13 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
>> 
>>> I think this problem can be easily fixed. If the expected X500Principal
>>> is known in advance, one can stick it into the local execution context
>>> as ClientContext#USER_TOKEN attribute [1]. This will force the client to
>>> request connections with the given state only.
>> 
>> Yep.   Worked perfectly.
>> 
>> I committed my latest updates to the sandbox.   A little more cleanup and I'll likely merge it into the main build.   
>> 
>> I did do a pretty big update to the SharedOutputBuffer to better handle the case where a large byte[] is passed into the write method.   It will now avoid copying blocks out of there into the buffer to then copy into the encoder.   Avoids a bunch of copies.   Also avoids a bunch of flipping back and forth between threads as the enable/disable output stuff is flipped back and forth. 
>> 
>> 
> 
> Great stuff! Actually one probably could apply similar optimization to
> the SharedInputBuffer as well to reduce intermediate buffer copying for
> input operations.

Just did that.   That said, it's much less critical there.  Woodstox uses an internal 4000 byte buffer for decoding XML.   Thus, under "normal" circumstances, we only get 4K byte[]'s in there.   That said, I added a little detection in there so if we DO have to wait for data for the 4K buffer, we'll just read directly into that buffer.   JAX-RS and maybe MTOM might pass in larger blocks so they may benefit more. Not really sure.

With my simple test case, the updates drop the number of times the main processing thread needed to block for input from about 25% of the time to about 5%.   However, this CAN increase the number of buffer copies.   I need to run some tests to see which is worse: the extra byte[] copies that occur on background threads or the waiting by the main thread.


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2012-09-10 at 17:10 -0400, Daniel Kulp wrote:
> On Sep 10, 2012, at 6:13 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> 
> > I think this problem can be easily fixed. If the expected X500Principal
> > is known in advance, one can stick it into the local execution context
> > as ClientContext#USER_TOKEN attribute [1]. This will force the client to
> > request connections with the given state only.
> 
> Yep.   Worked perfectly.
> 
> I committed my latest updates to the sandbox.   A little more cleanup and I'll likely merge it into the main build.   
> 
> I did do a pretty big update to the SharedOutputBuffer to better handle the case where a large byte[] is passed into the write method.   It will now avoid copying blocks out of there into the buffer to then copy into the encoder.   Avoids a bunch of copies.   Also avoids a bunch of flipping back and forth between threads as the enable/disable output stuff is flipped back and forth. 
> 
> 

Great stuff! Actually one probably could apply similar optimization to
the SharedInputBuffer as well to reduce intermediate buffer copying for
input operations.

Oleg 


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 10, 2012, at 6:13 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

> I think this problem can be easily fixed. If the expected X500Principal
> is known in advance, one can stick it into the local execution context
> as ClientContext#USER_TOKEN attribute [1]. This will force the client to
> request connections with the given state only.

Yep.   Worked perfectly.

I committed my latest updates to the sandbox.   A little more cleanup and I'll likely merge it into the main build.   

I did do a pretty big update to the SharedOutputBuffer to better handle the case where a large byte[] is passed into the write method.   It will now avoid copying blocks out of there into the buffer to then copy into the encoder.   Avoids a bunch of copies.   Also avoids a bunch of flipping back and forth between threads as the enable/disable output stuff is flipped back and forth. 
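
The large-write idea can be sketched in isolation. This is a simplified, single-threaded model and an assumption on my part, not the actual SharedOutputBuffer code: the real class also coordinates with the I/O reactor thread and a content encoder, none of which is modeled here. DirectWriteSketch and its members are hypothetical names, and a ByteArrayOutputStream stands in for the encoder. Small writes are staged in the buffer; a write at least as large as the buffer skips the staging copy.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical, single-threaded model of "skip the intermediate buffer for large writes".
final class DirectWriteSketch {
    private final byte[] buffer;
    private int count;                       // bytes currently staged in buffer
    private long bytesCopiedIntoBuffer;      // how many bytes took the extra copy
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the encoder

    DirectWriteSketch(int capacity) {
        this.buffer = new byte[capacity];
    }

    void write(byte[] src, int off, int len) {
        if (len >= buffer.length) {
            // Large block: drain what is staged (to keep ordering), then hand
            // the caller's array straight to the sink with no staging copy.
            flush();
            sink.write(src, off, len);
            return;
        }
        if (len > buffer.length - count) {
            flush(); // not enough room left, drain first
        }
        System.arraycopy(src, off, buffer, count, len);
        count += len;
        bytesCopiedIntoBuffer += len;
    }

    void flush() {
        sink.write(buffer, 0, count);
        count = 0;
    }

    long bytesCopiedIntoBuffer() {
        return bytesCopiedIntoBuffer;
    }

    /** Flushes staged bytes and returns everything that reached the sink. */
    byte[] toByteArray() {
        flush();
        return sink.toByteArray();
    }
}
```

java.io.BufferedOutputStream takes the same shortcut when a write is at least as large as its internal buffer.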


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2012-09-09 at 08:14 -0400, Daniel Kulp wrote:
> On Sep 8, 2012, at 4:33 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
> > I committed your patch with some minor tweaks. Please double-check.
> > 
> 
> Cool.  Thanks.  I'll take a look on Monday.
> 
> > The fundamental problem here is that SSL connections set up on a per
> > request (or per client basis) when kept alive by the connection manager
> > can be later leased to another client thread with a different security
> > context if one is not careful.  
> 
> Actually, I'm struggling with the OPPOSITE problem with the Async client.   The Async client stores the X500Principal as the "state" for the connection (which is great BTW).   Thus, subsequent calls that don't provide a state will never re-use the connection.   I need to do a little work to store the principal, but it all looks very doable.   Hopefully will finish that up on Monday. 
> 
> 

Hi Daniel

I think this problem can be easily fixed. If the expected X500Principal
is known in advance, one can stick it into the local execution context
as ClientContext#USER_TOKEN attribute [1]. This will force the client to
request connections with the given state only.

[1]
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/advanced.html#stateful_conn

> > We can cut as many BETA releases as needed and as often as needed. There
> > have been not that many changes since BETA2, so I would not rush BETA3,
> > but I'll call for a release vote as soon as you need it. What I am not
> > really comfortable committing myself to is any time frame for a GA
> > release. 
> 
> That's fine.   As long as there are "releases" in the central repo, we can at least proceed.  For performance reasons, the async transport likely won't be a default, but it will be available to those that may need it.   Definitely don't push a beta3 yet (unless someone else really needs it) as I do want to verify more tests with it, check how it will work in OSGi, etc...
> 

All right. Let's hold it off for a little while. Once you are sure no
more changes are needed let me know and I'll start the release process.

Cheers

Oleg



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 8, 2012, at 4:33 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
> I committed your patch with some minor tweaks. Please double-check.
> 

Cool.  Thanks.  I'll take a look on Monday.

> The fundamental problem here is that SSL connections set up on a per
> request (or per client basis) when kept alive by the connection manager
> can be later leased to another client thread with a different security
> context if one is not careful.  

Actually, I'm struggling with the OPPOSITE problem with the Async client.   The Async client stores the X500Principal as the "state" for the connection (which is great BTW).   Thus, subsequent calls that don't provide a state will never re-use the connection.   I need to do a little work to store the principal, but it all looks very doable.   Hopefully will finish that up on Monday. 


> We can cut as many BETA releases as needed and as often as needed. There
> have been not that many changes since BETA2, so I would not rush BETA3,
> but I'll call for a release vote as soon as you need it. What I am not
> really comfortable committing myself to is any time frame for a GA
> release. 

That's fine.   As long as there are "releases" in the central repo, we can at least proceed.  For performance reasons, the async transport likely won't be a default, but it will be available to those that may need it.   Definitely don't push a beta3 yet (unless someone else really needs it) as I do want to verify more tests with it, check how it will work in OSGi, etc...


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2012-09-07 at 12:53 -0400, Daniel Kulp wrote:
> On Sep 6, 2012, at 9:33 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> 
> > On Wed, 2012-09-05 at 10:49 -0400, Daniel Kulp wrote:
> >> Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.
> >> 
> >> 
> > 
> > HTTP proxy support is not all that difficult _as long as_ support for
> > complex HTTP authentication schemes is not required. If you need to
> > support NTLM, Kerberos and the likes, expect a lot of grief and massive
> > scalp hair loss. HttpAsyncClient provides full support for HTTP proxies
> > and most commonly used authentication schemes but it is still considered
> > BETA quality. If you want to re-use CXF authentication code HttpCore
> > would give you more flexibility at the expense of having to write more
> > custom code. Otherwise you might want consider trying out
> > HttpAsyncClient.
> 
> Looking at the proxy stuff, I really have no interest in re-inventing the wheel on this.  :-)   Thus, it does look like going with HttpAsyncClient may make the most sense.   I started digging into this a bit, but ran into the SSL configured "per connection factory" issue that I had before.  HOWEVER, this looks easily solvable with a simple patch to HttpAsyncClient.  I've logged the issue and attached it to:
> 
> https://issues.apache.org/jira/browse/HTTPASYNC-25
> 
> Anyway, with the HttpAsyncClient stuff, the amount of code in CXF is SIGNIFICANTLY reduced.   A LOT less duplicated code.   That's good.   However, we would need to have beta3 released pretty soon.   Any ideas on what the plans are for beta3?     If beta3 won't be soon, I'll likely go with what I have now and detect if the proxy is configured and force use of the old transport if it is.   If it will be release real soon, I'll pursue using it.    I do have all the systests/transports tests passing with the HttpAsyncClient based stuff, including the new proxy based tests I added yesterday. 
> 
> Thoughts?
> 

Hi Daniel 

I committed your patch with some minor tweaks. Please double-check.

The fundamental problem here is that SSL connections set up on a per
request (or per client basis) when kept alive by the connection manager
can be later leased to another client thread with a different security
context if one is not careful.  

We can cut as many BETA releases as needed and as often as needed. There
have been not that many changes since BETA2, so I would not rush BETA3,
but I'll call for a release vote as soon as you need it. What I am not
really comfortable committing myself to is any time frame for a GA
release.  

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 6, 2012, at 9:33 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

> On Wed, 2012-09-05 at 10:49 -0400, Daniel Kulp wrote:
>> Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.
>> 
>> 
> 
> HTTP proxy support is not all that difficult _as long as_ support for
> complex HTTP authentication schemes is not required. If you need to
> support NTLM, Kerberos and the likes, expect a lot of grief and massive
> scalp hair loss. HttpAsyncClient provides full support for HTTP proxies
> and most commonly used authentication schemes but it is still considered
> BETA quality. If you want to re-use CXF authentication code HttpCore
> would give you more flexibility at the expense of having to write more
> custom code. Otherwise you might want consider trying out
> HttpAsyncClient.

Looking at the proxy stuff, I really have no interest in re-inventing the wheel on this.  :-)   Thus, it does look like going with HttpAsyncClient may make the most sense.   I started digging into this a bit, but ran into the SSL configured "per connection factory" issue that I had before.  HOWEVER, this looks easily solvable with a simple patch to HttpAsyncClient.  I've logged the issue and attached it to:

https://issues.apache.org/jira/browse/HTTPASYNC-25

Anyway, with the HttpAsyncClient stuff, the amount of code in CXF is SIGNIFICANTLY reduced.   A LOT less duplicated code.   That's good.   However, we would need to have beta3 released pretty soon.   Any ideas on what the plans are for beta3?     If beta3 won't be soon, I'll likely go with what I have now and detect if the proxy is configured and force use of the old transport if it is.   If it will be released real soon, I'll pursue using it.    I do have all the systests/transports tests passing with the HttpAsyncClient based stuff, including the new proxy based tests I added yesterday. 

Thoughts?

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2012-09-05 at 10:49 -0400, Daniel Kulp wrote:
> On Sep 5, 2012, at 9:26 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

...

> >> 2) Lots of issues trying to associate the request with the connection at connection creation time.   I needed information from the request in order to setup the SSL stuff.  However, I had a LOT of issues trying to accomplish that.   The only thing the ConnectionFactory gets it the IOSession.  All that has in it is the HttpHost.  The HttpHost is final.   The connect occurs on a different thread so thread locals won't work.   Also note that all the fields in SSLIOSession are final and private.  Thus, I also couldn't find a way to delay the SSL setup stuff till later.  Thus, I had to get the request/connection stuff associated up front.  I ended up using a IdentityHashMap to associate the HttpHost with the request stuff, but it seemed very hacky.
> >> 
> >> Not really sure what the right solution would be.   My initial thought was to make HttpHost non-final so that I could subclass it with an CXFHttpHost or SSLHttpHost or something that would contain extra methods that I could call.   Alternatively, add a "properties" map to HttpHost or similar that I could store additional stuff.   Another option would be to have the IOSession passed to the factory also contain the "state" (if specified)  as a property that could be queried.  
> >> 
> > 
> > Just ditch HttpHost. You obviously need a richer class to represent
> > connection routes. HttpClient has HttpRoute class for that matter. You
> > probably should be using a custom class that also includes HTTP proxy
> > and SSL context bits specific to CXF.
> > 
> > Does this help you in any way?
> 
> Yep.   Just feels like I have to subclass/override a lot of behavior from HC instead of "using" it. 

Well, in a way this is intended. For better or worse HttpCore was always
meant to be a set of low level components one could re-use and extend to
build custom HTTP services, primarily HTTP proxies and gateways. For end
users we provide HttpClient and HttpAsyncClient modules that are based
on HttpCore and represent more 'out of the box' products.

>   If HttpHost wasn't final, it would be so much more useful.   Is there a particular reason why it has to be final?
> 

There is no particular reason but this is the very first time some one
actually wanted to extend it. I can't help thinking aggregation would be
more appropriate here than extension.  

> Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.
> 
> 

HTTP proxy support is not all that difficult _as long as_ support for
complex HTTP authentication schemes is not required. If you need to
support NTLM, Kerberos and the likes, expect a lot of grief and massive
scalp hair loss. HttpAsyncClient provides full support for HTTP proxies
and most commonly used authentication schemes but it is still considered
BETA quality. If you want to re-use CXF authentication code HttpCore
would give you more flexibility at the expense of having to write more
custom code. Otherwise you might want to consider trying out
HttpAsyncClient.

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 5, 2012, at 9:26 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> HC connection pools distinguish between route info and connection state.
> Routes are immutable and are well known in advance. A route in its basic
> form is a scheme/host/port combo. At connection creation time only the
> route matters as all HTTP connections start out their life cycle as
> state-less per HTTP specification. In real life settings, however, HTTP
> connections may acquire certain state during their life-cycle, for
> instance, by serving a resource protected with a connection based auth
> scheme such as NTLM. So, the object that represents a state of a
> connection may need to be associated with the connection after it has
> already been created. SSL is similar. An HTTP connection can be created
> as plain and later get upgraded to TLS/SSL. If the target server
> requires the client to authenticate itself with a certificate, the
> connection becomes state-full after successful authentication only.   
> 
> Once a connection becomes state-full it can only be leased if requested
> with the same combination of route info and state. I believe this is the
> reason why you are getting a new SSL connection for each new HTTPS
> request. I guess the state object used in the lease request is not equal
> to that associated with the pool entry.

Well, I was actually getting the opposite.   Since the "state" object wasn't stored anywhere, I was getting a single connection for all the SSL requests to a service/host and thus was ending up having the wrong credentials after the first connection.    For now, I forced a dummy state object on there to force a new connection per request to avoid the security issue of wrong credentials.   Once I figure all the state things out, this should fix itself.
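
The lease rule described in the quoted text (an idle connection is handed out only when both route and state match) can be modeled in a few lines. This is a toy model under my own assumptions, not the HC pool implementation, which has more cases than shown here; StatefulPoolSketch, Entry, lease and release are hypothetical names, with a String standing in for the route and an arbitrary Object for the state.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Toy pool: an idle entry is reused only if route AND state both match the request.
final class StatefulPoolSketch {

    static final class Entry {
        final String route;
        final int id;
        Object state; // null while the connection is stateless

        Entry(String route, int id) {
            this.route = route;
            this.id = id;
        }
    }

    private final List<Entry> idle = new ArrayList<>();
    private int created;

    /** Returns a matching idle entry, or creates a new one if none matches. */
    Entry lease(String route, Object state) {
        for (Iterator<Entry> it = idle.iterator(); it.hasNext();) {
            Entry e = it.next();
            if (e.route.equals(route) && Objects.equals(e.state, state)) {
                it.remove();
                return e;
            }
        }
        return new Entry(route, ++created);
    }

    /** Hands the entry back, recording any state it picked up (e.g. a client principal). */
    void release(Entry e, Object state) {
        e.state = state;
        idle.add(e);
    }

    int createdCount() {
        return created;
    }
}
```

In this model, if nothing records the state on release, every request for the route matches the same idle entry regardless of credentials; if a state is recorded but later requests pass none, the entry is never matched and a new connection is created each time.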

> If in your case SSL context details are effectively a given, you should
> make them a part of the route info instead of the state. HC connection
> pools are designed to be able to use any arbitrary immutable class as a
> route info.

OK.  Subclass AbstractNIOConnPool instead of using BasicNIOConnPool.
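
A rough idea of what "SSL details as part of the route" could look like, as a sketch only: an immutable key with value-based equals/hashCode so a pool can keep separate connections per TLS configuration. The class name, its fields and the tlsConfigId string are all assumptions for illustration; a plain HashMap stands in for the pool, and no HC classes are used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical immutable route key that carries proxy and TLS identity along with the target.
final class CxfRouteSketch {

    static final class Route {
        final String scheme;
        final String host;
        final int port;
        final String proxy;        // e.g. "proxy.example.org:3128", or null for direct
        final String tlsConfigId;  // stand-in for the client's TLS settings, or null

        Route(String scheme, String host, int port, String proxy, String tlsConfigId) {
            this.scheme = Objects.requireNonNull(scheme);
            this.host = Objects.requireNonNull(host);
            this.port = port;
            this.proxy = proxy;
            this.tlsConfigId = tlsConfigId;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Route)) {
                return false;
            }
            Route r = (Route) o;
            return port == r.port
                && scheme.equals(r.scheme)
                && host.equals(r.host)
                && Objects.equals(proxy, r.proxy)
                && Objects.equals(tlsConfigId, r.tlsConfigId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, host, port, proxy, tlsConfigId);
        }
    }

    /** Counts how many separate per-route pools the given routes would map to. */
    static int distinctPools(Route... routes) {
        Map<Route, Integer> pools = new HashMap<>();
        for (Route r : routes) {
            pools.merge(r, 1, Integer::sum);
        }
        return pools.size();
    }

    /** Same host and port, two different client certs: two pools, not three. */
    static int demo() {
        Route alice = new Route("https", "example.org", 443, null, "client-cert-alice");
        Route aliceAgain = new Route("https", "example.org", 443, null, "client-cert-alice");
        Route bob = new Route("https", "example.org", 443, null, "client-cert-bob");
        return distinctPools(alice, aliceAgain, bob);
    }
}
```

Because the key is immutable and compared by value, two requests with identical settings share connections while a different client cert gets its own.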

>> 2) Lots of issues trying to associate the request with the connection at connection creation time.   I needed information from the request in order to setup the SSL stuff.  However, I had a LOT of issues trying to accomplish that.   The only thing the ConnectionFactory gets it the IOSession.  All that has in it is the HttpHost.  The HttpHost is final.   The connect occurs on a different thread so thread locals won't work.   Also note that all the fields in SSLIOSession are final and private.  Thus, I also couldn't find a way to delay the SSL setup stuff till later.  Thus, I had to get the request/connection stuff associated up front.  I ended up using a IdentityHashMap to associate the HttpHost with the request stuff, but it seemed very hacky.
>> 
>> Not really sure what the right solution would be.   My initial thought was to make HttpHost non-final so that I could subclass it with an CXFHttpHost or SSLHttpHost or something that would contain extra methods that I could call.   Alternatively, add a "properties" map to HttpHost or similar that I could store additional stuff.   Another option would be to have the IOSession passed to the factory also contain the "state" (if specified)  as a property that could be queried.  
>> 
> 
> Just ditch HttpHost. You obviously need a richer class to represent
> connection routes. HttpClient has HttpRoute class for that matter. You
> probably should be using a custom class that also includes HTTP proxy
> and SSL context bits specific to CXF.
> 
> Does this help you in any way?

Yep.   Just feels like I have to subclass/override a lot of behavior from HC instead of "using" it.   If HttpHost wasn't final, it would be so much more useful.   Is there a particular reason why it has to be final?

Still not sure about the Proxy stuff at all, but that's likely because I don't know much about Proxies at all.  I'll likely need to look more into what proxy stuff does on the wire.


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Tue, 2012-09-04 at 15:20 -0400, Daniel Kulp wrote:
> On Aug 6, 2012, at 10:43 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> >>>> CXF allows these to be configured on a per-client (and sometimes even
> >>>> per- request) basis.    It looks like HC seems to have these on the
> >>>> connection factory.
> >>> 
> >>> It does not have to be this way. Connections can start off as plain and
> >>> later get upgraded to TLS/SSL by the protocol handling code. One would
> >>> have to tweak the default protocol handler a bit, though, probably by
> >>> overriding the HttpAsyncRequestExecutor#requestReady method [2].
> >> 
> >> Ah.  OK.   I'll take a look there.   I was looking in the various connection 
> >> factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
> >> :-)
> >> 
> > 
> > I also added a CXF specific connection manager implementation in order
> > to enable SSL customization based on request configuration. However,
> > before I proceed I would like to understand the requirements a little
> > better. Can I assume that once an SSL connection has been fully set up
> > using request level config parameters and can be pooled and safely
> > reused for subsequent requests to the same service (even by different
> > users) or requests may contain different / incompatible SSL
> > configuration for the same service?   
> 
> They may contain different configurations.  In particular, different client certs if using client certs as the authentication mechanism.
> 
> I found the "state" object to pass into the connection factory and plan on implementing a new state object that will attempt to handle that by comparing the various values on the HTTPConduit settings.  However, I *DID* run into a couple of problems that, while I was able to work around them with some hacks, likely should be fixed in the HTTP stuff.  In particular:
> 
> 1) The "state" object that is passed into the factory is never saved anywhere.   IMO, if a new connection is created based on the state, it might be good to record that state (or at least make that an option) so the state stuff actually works.   I got around this by subclassing the BasicNIOConnPool to override the BasicNIOConnPool.createEntry method to add a state object to the entry.   A bit involved.
> 

Let me try to explain the rationale behind the design. 

HC connection pools distinguish between route info and connection state.
Routes are immutable and are well known in advance. A route in its basic
form is a scheme/host/port combo. At connection creation time only the
route matters, as all HTTP connections start out their life cycle as
stateless per the HTTP specification. In real-life settings, however, HTTP
connections may acquire certain state during their life cycle, for
instance, by serving a resource protected with a connection-based auth
scheme such as NTLM. So, the object that represents the state of a
connection may need to be associated with the connection after it has
already been created. SSL is similar. An HTTP connection can be created
as plain and later get upgraded to TLS/SSL. If the target server
requires the client to authenticate itself with a certificate, the
connection becomes stateful only after successful authentication.

Once a connection becomes stateful it can only be leased if requested
with the same combination of route info and state. I believe this is the
reason why you are getting a new SSL connection for each new HTTPS
request. I guess the state object used in the lease request is not equal
to that associated with the pool entry.
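
For anyone following along, the lease rule described above can be sketched in a few lines of plain Java. This is a toy model and not HttpCore's pool code: `ToyConnPool`, `release` and `lease` are made-up names, and letting a still-stateless connection satisfy any request is an assumption about the intended behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the lease rule; NOT HttpCore's pool implementation. */
final class ToyConnPool<R, C> {

    /** One idle connection plus the state it has acquired, if any. */
    private static final class Entry<R, C> {
        final R route;
        final C conn;
        final Object state; // null means the connection is still stateless

        Entry(R route, C conn, Object state) {
            this.route = route;
            this.conn = conn;
            this.state = state;
        }
    }

    private final List<Entry<R, C>> idle = new ArrayList<>();

    /** Hand a connection back, recording the state it now carries. */
    synchronized void release(R route, C conn, Object state) {
        idle.add(new Entry<>(route, conn, state));
    }

    /**
     * Lease an idle connection for the given route and state.
     * An exact state match wins; a stateless entry is the fallback (assumption).
     * Returns null when nothing fits, i.e. the caller has to open a new connection.
     */
    synchronized C lease(R route, Object state) {
        Entry<R, C> match = null;
        for (Entry<R, C> e : idle) {
            if (!e.route.equals(route)) {
                continue;
            }
            if (state != null && state.equals(e.state)) {
                match = e; // same route and same state
                break;
            }
            if (e.state == null && match == null) {
                match = e; // remember a stateless candidate
            }
        }
        if (match == null) {
            return null;
        }
        idle.remove(match); // Entry uses identity equality, so this removes exactly that entry
        return match.conn;
    }
}
```

If the state object passed to `lease` does not implement `equals` consistently with the one stored on release, every lease misses and a fresh connection is opened each time, which would look exactly like the "new SSL connection per request" symptom.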

If in your case SSL context details are effectively a given, you should
make them a part of the route info instead of the state. HC connection
pools are designed to be able to use any arbitrary immutable class as a
route info.

> 2) Lots of issues trying to associate the request with the connection at connection creation time.  I needed information from the request in order to set up the SSL stuff.  However, I had a LOT of issues trying to accomplish that.  The only thing the ConnectionFactory gets is the IOSession.  All that has in it is the HttpHost.  The HttpHost is final.  The connect occurs on a different thread so thread locals won't work.  Also note that all the fields in SSLIOSession are final and private.  Thus, I also couldn't find a way to delay the SSL setup stuff till later.  Thus, I had to get the request/connection stuff associated up front.  I ended up using an IdentityHashMap to associate the HttpHost with the request stuff, but it seemed very hacky.
>
> Not really sure what the right solution would be.  My initial thought was to make HttpHost non-final so that I could subclass it with a CXFHttpHost or SSLHttpHost or something that would contain extra methods that I could call.  Alternatively, add a "properties" map to HttpHost or similar where I could store additional stuff.  Another option would be to have the IOSession passed to the factory also contain the "state" (if specified) as a property that could be queried.
> 

Just ditch HttpHost. You obviously need a richer class to represent
connection routes. HttpClient has the HttpRoute class for that purpose.
You probably should be using a custom class that also includes HTTP proxy
and SSL context bits specific to CXF.
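
A rough sketch of what such a route class could look like, using only the JDK. The name `CxfRoute` and the choice of fields are hypothetical; in particular, a plain client certificate alias stands in for whatever SSL context details CXF would really need to compare.

```java
import java.util.Locale;
import java.util.Objects;

/** Hypothetical immutable route key; which fields belong here is an assumption. */
final class CxfRoute {
    private final String scheme;
    private final String host;
    private final int port;
    private final String proxy;           // "host:port" of the HTTP proxy, or null for a direct route
    private final String clientCertAlias; // stand-in for the SSL context bits, or null

    CxfRoute(String scheme, String host, int port, String proxy, String clientCertAlias) {
        this.scheme = Objects.requireNonNull(scheme).toLowerCase(Locale.ROOT);
        this.host = Objects.requireNonNull(host).toLowerCase(Locale.ROOT);
        this.port = port;
        this.proxy = proxy;
        this.clientCertAlias = clientCertAlias;
    }

    /** Two routes are interchangeable only if every component matches. */
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CxfRoute)) {
            return false;
        }
        CxfRoute r = (CxfRoute) o;
        return port == r.port
            && scheme.equals(r.scheme)
            && host.equals(r.host)
            && Objects.equals(proxy, r.proxy)
            && Objects.equals(clientCertAlias, r.clientCertAlias);
    }

    @Override
    public int hashCode() {
        return Objects.hash(scheme, host, port, proxy, clientCertAlias);
    }

    @Override
    public String toString() {
        return scheme + "://" + host + ":" + port
            + (proxy == null ? "" : " via " + proxy)
            + (clientCertAlias == null ? "" : " cert=" + clientCertAlias);
    }
}
```

Because the class is immutable and defines `equals`/`hashCode` over all fields, a pool keyed on it would keep connections made with different client certs apart without needing a separate state object.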

Does this help you in any way?

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Aug 6, 2012, at 10:43 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
>>>> CXF allows these to be configured on a per-client (and sometimes even
>>>> per- request) basis.    It looks like HC seems to have these on the
>>>> connection factory.
>>> 
>>> It does not have to be this way. Connections can start off as plain and
>>> later get upgraded to TLS/SSL by the protocol handling code. One would
>>> have to tweak the default protocol handler a bit, though, probably by
>>> overriding the HttpAsyncRequestExecutor#requestReady method [2].
>> 
>> Ah.  OK.   I'll take a look there.   I was looking in the various connection 
>> factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
>> :-)
>> 
> 
> I also added a CXF specific connection manager implementation in order
> to enable SSL customization based on request configuration. However,
> before I proceed I would like to understand the requirements a little
> better. Can I assume that once an SSL connection has been fully set up
> using request level config parameters and can be pooled and safely
> reused for subsequent requests to the same service (even by different
> users) or requests may contain different / incompatible SSL
> configuration for the same service?   

They may contain different configurations.  In particular, different client certs if using client certs as the authentication mechanism.

I found the "state" object to pass into the connection factory and plan on implementing a new state object that will attempt to handle that by comparing the various values on the HTTPConduit settings.  However, I *DID* run into a couple of problems that, while I was able to work around them with some hacks, likely should be fixed in the HTTP stuff.  In particular:

1) The "state" object that is passed into the factory is never saved anywhere.   IMO, if a new connection is created based on the state, it might be good to record that state (or at least make that an option) so the state stuff actually works.   I got around this by subclassing the BasicNIOConnPool to override the BasicNIOConnPool.createEntry method to add a state object to the entry.   A bit involved.

2) Lots of issues trying to associate the request with the connection at connection creation time.  I needed information from the request in order to set up the SSL stuff.  However, I had a LOT of issues trying to accomplish that.  The only thing the ConnectionFactory gets is the IOSession.  All that has in it is the HttpHost.  The HttpHost is final.  The connect occurs on a different thread so thread locals won't work.  Also note that all the fields in SSLIOSession are final and private.  Thus, I also couldn't find a way to delay the SSL setup stuff till later.  Thus, I had to get the request/connection stuff associated up front.  I ended up using an IdentityHashMap to associate the HttpHost with the request stuff, but it seemed very hacky.

Not really sure what the right solution would be.  My initial thought was to make HttpHost non-final so that I could subclass it with a CXFHttpHost or SSLHttpHost or something that would contain extra methods that I could call.  Alternatively, add a "properties" map to HttpHost or similar where I could store additional stuff.  Another option would be to have the IOSession passed to the factory also contain the "state" (if specified) as a property that could be queried.
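
To show why the workaround needs an IdentityHashMap rather than a regular HashMap, here is a small stand-alone sketch. It is not the code from the sandbox: `PendingRequests` and `HostKey` are invented for illustration, with `HostKey` standing in for a value-equal final class like HttpHost, and a String standing in for the per-request SSL settings.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

/** Illustration only: associates per-request data with a specific host *instance*. */
final class PendingRequests {

    /** Stand-in for a final, value-equal host class. */
    static final class HostKey {
        final String scheme;
        final String host;
        final int port;

        HostKey(String scheme, String host, int port) {
            this.scheme = scheme;
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof HostKey)) {
                return false;
            }
            HostKey k = (HostKey) o;
            return port == k.port && scheme.equals(k.scheme) && host.equals(k.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, host, port);
        }
    }

    // Identity semantics: two equal-looking hosts stay separate entries.
    // Wrapped in synchronizedMap because the connect happens on another thread.
    private final Map<HostKey, String> byInstance =
        Collections.synchronizedMap(new IdentityHashMap<HostKey, String>());

    /** Called on the requesting thread before the connect is started. */
    void register(HostKey key, String sslSettings) {
        byInstance.put(key, sslSettings);
    }

    /** Called from the connection factory; removes the entry so the map does not grow. */
    String claim(HostKey key) {
        return byInstance.remove(key);
    }

    int size() {
        return byInstance.size();
    }
}
```

With a regular HashMap the second `register` for the same host/port would overwrite the first, so two concurrent requests with different client certs could pick up each other's settings. The cost is that the exact same instance has to reach the factory, which is part of what makes the approach feel hacky.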

Anyway, things are definitely in "ok" shape right now.   Definitely making some significant progress.  Slowly but surely.

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2012-09-06 at 10:12 -0400, Daniel Kulp wrote:
> On Sep 6, 2012, at 9:40 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> > On Wed, 2012-09-05 at 10:55 -0400, Daniel Kulp wrote:
> >> On Sep 5, 2012, at 9:35 AM, Alessio Soldano <as...@redhat.com> wrote:
> >> 
> >>> On 09/04/2012 09:03 PM, Daniel Kulp wrote:
> >>>> 
> >>>> That said, there is still a bit of work todo:
> >>>> 
> >>>> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.
> >>> 
> >>> Not sure if this would be the best option for you, but some time ago I
> >>> wrote some JBossWS tests related to http proxy configuration by using
> >>> LittleProxy [1], perhaps it's convenient here too.
> >>> 
> >>> [1] http://www.littleshoot.org/littleproxy/ 
> >> 
> >> Cool. I see they just started a vote for 0.5 so if I wait a day or three, I can just start with the newer version.  :-)
> >> 
> >> Thanks for the pointer.
> >> 
> >> Dan
> >> 
> > 
> > I personally use Squid [1] running locally in non-daemon mode to test
> > HttpClient against. It is one of the most widely used proxies around.
> 
> 
> I was originally looking at this, but there is a big disadvantage:  it's external to the build.   The nice thing about the littleproxy thing is in your junit tests, you can pop up a proxy on any port, use it, bring it down, etc… just like any other service we may be using.   It would work on every developers machine.  
> 
> Right now, we have ZERO coverage of the proxy support in CXF, even for the HttpURLConnection based transport.   With the little proxy thing, I'm hoping to rectify that.    That said, I don't think it provides all the various possible proxy situations that squid can, but at this point, some tests are better than none.   :-)
> 
> 

This is certainly a very valid point. Actually we are in the same boat.
Close to zero coverage of proxy code. And we have all the building
blocks necessary to put together a full blown caching proxy. The trouble
is that the _real_ troublemaker is never the proxy itself. In most cases
proxies are almost transparent to the client. The real ugly b**ch is
always a combination of HTTP proxy and authentication (either on the
proxy or the target side), especially NTLM, which is not very easy to
emulate.

Oleg



Re: Async http client experiments....

Posted by Alessio Soldano <as...@redhat.com>.
On 09/06/2012 04:12 PM, Daniel Kulp wrote:
>>
>> I personally use Squid [1] running locally in non-daemon mode to test
>> HttpClient against. It is one of the most widely used proxies around.
> 
> 
> I was originally looking at this, but there is a big disadvantage:  it's external to the build.   The nice thing about the littleproxy thing is in your junit tests, you can pop up a proxy on any port, use it, bring it down, etc… just like any other service we may be using.   It would work on every developers machine.  

+100, exactly my reasoning

Alessio

-- 
Alessio Soldano
Web Service Lead, JBoss

Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 6, 2012, at 9:40 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> On Wed, 2012-09-05 at 10:55 -0400, Daniel Kulp wrote:
>> On Sep 5, 2012, at 9:35 AM, Alessio Soldano <as...@redhat.com> wrote:
>> 
>>> On 09/04/2012 09:03 PM, Daniel Kulp wrote:
>>>> 
>>>> That said, there is still a bit of work todo:
>>>> 
>>>> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.
>>> 
>>> Not sure if this would be the best option for you, but some time ago I
>>> wrote some JBossWS tests related to http proxy configuration by using
>>> LittleProxy [1], perhaps it's convenient here too.
>>> 
>>> [1] http://www.littleshoot.org/littleproxy/ 
>> 
>> Cool. I see they just started a vote for 0.5 so if I wait a day or three, I can just start with the newer version.  :-)
>> 
>> Thanks for the pointer.
>> 
>> Dan
>> 
> 
> I personally use Squid [1] running locally in non-daemon mode to test
> HttpClient against. It is one of the most widely used proxies around.


I was originally looking at this, but there is a big disadvantage:  it's external to the build.  The nice thing about the littleproxy thing is that in your JUnit tests, you can pop up a proxy on any port, use it, bring it down, etc… just like any other service we may be using.  It would work on every developer's machine.

Right now, we have ZERO coverage of the proxy support in CXF, even for the HttpURLConnection based transport.   With the little proxy thing, I'm hoping to rectify that.    That said, I don't think it provides all the various possible proxy situations that squid can, but at this point, some tests are better than none.   :-)
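
As a rough illustration of the "pop up a proxy on any port inside the test" idea, here is a JDK-only sketch. It deliberately does not use the LittleProxy API (no claims about its classes here); `StubProxy` is a made-up throwaway that answers a single request itself instead of forwarding it, which is enough to check that a client really sends its request to the proxy. The target URL is an assumption and is never contacted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Throwaway stand-in for a real proxy: answers one request and records its request line. */
final class StubProxy implements AutoCloseable {
    private final ServerSocket server;
    private volatile String requestLine;

    StubProxy() throws IOException {
        server = new ServerSocket(0); // port 0 lets the OS pick any free port
        Thread worker = new Thread(this::serveOnce, "stub-proxy");
        worker.setDaemon(true);
        worker.start();
    }

    int port() {
        return server.getLocalPort();
    }

    String requestLine() {
        return requestLine;
    }

    private void serveOnce() {
        try (Socket s = server.accept()) {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.ISO_8859_1));
            requestLine = in.readLine();
            // Skip the remaining request headers up to the blank line.
            for (String line = in.readLine(); line != null && !line.isEmpty(); line = in.readLine()) {
                // nothing to do per header in this sketch
            }
            OutputStream out = s.getOutputStream();
            out.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
                .getBytes(StandardCharsets.ISO_8859_1));
            out.flush();
        } catch (IOException ignored) {
            // Server socket closed during shutdown; fine for a sketch.
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
    }

    /** Starts a stub, sends one GET through it, shuts it down, returns what the stub saw. */
    static String requestThroughStub(String url) throws IOException {
        try (StubProxy proxy = new StubProxy()) {
            Proxy p = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", proxy.port()));
            HttpURLConnection c = (HttpURLConnection) URI.create(url).toURL().openConnection(p);
            int status = c.getResponseCode();
            c.disconnect();
            if (status != 200) {
                throw new IOException("unexpected status " + status);
            }
            return proxy.requestLine();
        }
    }
}
```

A real test would start an actual forwarding proxy in the same way (any free port, started and stopped by the test) and point the CXF conduit at it; the stub only covers the "did the request go via the proxy" part.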


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2012-09-05 at 10:55 -0400, Daniel Kulp wrote:
> On Sep 5, 2012, at 9:35 AM, Alessio Soldano <as...@redhat.com> wrote:
> 
> > On 09/04/2012 09:03 PM, Daniel Kulp wrote:
> >> 
> >> That said, there is still a bit of work todo:
> >> 
> >> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.
> > 
> > Not sure if this would be the best option for you, but some time ago I
> > wrote some JBossWS tests related to http proxy configuration by using
> > LittleProxy [1], perhaps it's convenient here too.
> > 
> > [1] http://www.littleshoot.org/littleproxy/ 
> 
> Cool. I see they just started a vote for 0.5 so if I wait a day or three, I can just start with the newer version.  :-)
> 
> Thanks for the pointer.
> 
> Dan
> 

I personally use Squid [1] running locally in non-daemon mode to test
HttpClient against. It is one of the most widely used proxies around.

Oleg

[1] http://en.wikipedia.org/wiki/Squid_%28software%29


> > 
> > Cheers
> > Alessio
> > 
> > -- 
> > Alessio Soldano
> > Web Service Lead, JBoss
> 



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sep 5, 2012, at 9:35 AM, Alessio Soldano <as...@redhat.com> wrote:

> On 09/04/2012 09:03 PM, Daniel Kulp wrote:
>> 
>> That said, there is still a bit of work todo:
>> 
>> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.
> 
> Not sure if this would be the best option for you, but some time ago I
> wrote some JBossWS tests related to http proxy configuration by using
> LittleProxy [1], perhaps it's convenient here too.
> 
> [1] http://www.littleshoot.org/littleproxy/ 

Cool. I see they just started a vote for 0.5 so if I wait a day or three, I can just start with the newer version.  :-)

Thanks for the pointer.

Dan

> 
> Cheers
> Alessio
> 
> -- 
> Alessio Soldano
> Web Service Lead, JBoss

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Alessio Soldano <as...@redhat.com>.
On 09/04/2012 09:03 PM, Daniel Kulp wrote:
> 
> That said, there is still a bit of work todo:
> 
> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.

Not sure if this would be the best option for you, but some time ago I
wrote some JBossWS tests related to http proxy configuration by using
LittleProxy [1], perhaps it's convenient here too.

[1] http://www.littleshoot.org/littleproxy/

Cheers
Alessio

-- 
Alessio Soldano
Web Service Lead, JBoss

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Tue, 2012-09-04 at 15:03 -0400, Daniel Kulp wrote:
> Finally had a chance over the last week or so to re-dig into the async http stuff.   You've probably seen a lot of commits and such related to it.  :-)
> 
> Good news:
> ALL of the CXF system tests now pass using it instead of the URLConnection based transport.  I'm pretty excited about that as that's a huge milestone.  
> 

Hi Dan

I am glad things are shaping up well. 

> That said, there is still a bit of work todo:
> 
> 1) Proxy support - haven't looked into this at all yet.   Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to setup a simple proxy to dig into this.
>
> 2) HTTPs Keep-Alive - right now, with HTTPs, a new connect is created for every request.   With HTTPs, that's really expensive as it needs to do the handshakes and such for every request.     This is next on my TODO to look at.   One issue is that the BasicNIOConnPool  really only compares the host name/port/scheme to determine connection re-use.  Not any of the certs or anything like that.   I'll need to implement a "state" object to handle checking a lot of that.   Definitely next.  
> 
> 3) Configuration - now that things are working, I'll need to find a way to expose some configuration things.  Specifically around things like number of connections to hold open, number per target/host, thread pools, etc….
> 
> 4) Flip the logging and such to using our normal Logging framework.
> 
> 5) Move it all from the sandbox to trunk.  
> 
> 
> Anyway, definitely getting closer.   :-)
> 

I am completely buried under work at my day job (the one that helps
me pay the bills) and am unlikely to be able to find a lot of time to
write code but will happily assist with HC specific issues. Just let me
know.

Cheers

Oleg

> 
> Dan
> 
> 
> 
> On Aug 6, 2012, at 10:43 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> 
> > On Fri, 2012-08-03 at 09:17 -0400, Daniel Kulp wrote:
> >> On Friday, August 03, 2012 11:29:41 AM Oleg Kalnichevski wrote:
> >>>> I just went through and added a little bit of error handling into it as
> >>>> well.   Mostly around a connection refused exception and for read
> >>>> timeouts. For the read timeout, I DID have to add an ioct.shutdown() in
> >>>> the SharedInBuffer consume method when read or it kept getting called
> >>>> repeatedly since we didn't read anything.
> >>> 
> >>> Hmmm. This should not be necessary. It is the responsibility of the
> >>> protocol handler to shut down things properly if connection goes down or
> >>> something breaks at runtime. In case of an I/O timeout the protocol
> >>> handler always shuts down the execution handler, which in its turn
> >>> should shut down shared i/o buffers [1]. I'll look into it.
> >> 
> >> Well, it's mostly because I didn't actually set the timeouts on the 
> >> connection at all.   Haven't figured that out yet mostly due to the same 
> >> issus as with the SSL/connection timeouts/etc...  as we set them on a per-
> >> request basis normally.   For now so things don't "hang", I just added 
> >> timeouts on the wait(...) calls we use when we need to block for additional 
> >> IO.    In that case, the wait(...) times out and I need to shutdown the 
> >> buffers and such manually so when the data DOES eventually arrive, we ignore 
> >> it.   Setting the "real" timeouts on the connections would definitely help.
> >> 
> > 
> > 
> > Hi Daniel
> > 
> > I tweaked the code a bit to make sure i/o timeouts (both connect and
> > receive ones) are handled correctly and can be specified on a per
> > request basis overriding the defaults. Please review / verify.
> > 
> > I also added connection pool lease / release logging. I am no longer
> > seeing anything suspicious related to connection management. When
> > executing requests sequentially the connection manager allocates one
> > connection only. 
> > 
> >>>> CXF allows these to be configured on a per-client (and sometimes even
> >>>> per- request) basis.    It looks like HC seems to have these on the
> >>>> connection factory.
> >>> 
> >>> It does not have to be this way. Connections can start off as plain and
> >>> later get upgraded to TLS/SSL by the protocol handling code. One would
> >>> have to tweak the default protocol handler a bit, though, probably by
> >>> overriding the HttpAsyncRequestExecutor#requestReady method [2].
> >> 
> >> Ah.  OK.   I'll take a look there.   I was looking in the various connection 
> >> factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
> >> :-)
> >> 
> > 
> > I also added a CXF specific connection manager implementation in order
> > to enable SSL customization based on request configuration. However,
> > before I proceed I would like to understand the requirements a little
> > better. Can I assume that once an SSL connection has been fully set up
> > using request level config parameters and can be pooled and safely
> > reused for subsequent requests to the same service (even by different
> > users) or requests may contain different / incompatible SSL
> > configuration for the same service?   
> > 
> > Oleg   
> > 
> > 
> 



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
Finally had a chance over the last week or so to re-dig into the async http stuff.   You've probably seen a lot of commits and such related to it.  :-)

Good news:
ALL of the CXF system tests now pass using it instead of the URLConnection based transport.  I'm pretty excited about that as that's a huge milestone.  

That said, there is still a bit of work to do:

1) Proxy support - haven't looked into this at all yet.  Hard for me to test as well as I don't have any sort of proxy setup anywhere.  I may need to look into how to set up a simple proxy to dig into this.

2) HTTPS Keep-Alive - right now, with HTTPS, a new connection is created for every request.  With HTTPS, that's really expensive as it needs to do the handshakes and such for every request.  This is next on my TODO to look at.  One issue is that the BasicNIOConnPool really only compares the host name/port/scheme to determine connection re-use.  Not any of the certs or anything like that.  I'll need to implement a "state" object to handle checking a lot of that.  Definitely next.

3) Configuration - now that things are working, I'll need to find a way to expose some configuration things.  Specifically around things like number of connections to hold open, number per target/host, thread pools, etc.

4) Flip the logging and such to using our normal Logging framework.

5) Move it all from the sandbox to trunk.  


Anyway, definitely getting closer.   :-)


Dan



On Aug 6, 2012, at 10:43 AM, Oleg Kalnichevski <ol...@apache.org> wrote:

> On Fri, 2012-08-03 at 09:17 -0400, Daniel Kulp wrote:
>> On Friday, August 03, 2012 11:29:41 AM Oleg Kalnichevski wrote:
>>>> I just went through and added a little bit of error handling into it as
>>>> well.   Mostly around a connection refused exception and for read
>>>> timeouts. For the read timeout, I DID have to add an ioct.shutdown() in
>>>> the SharedInBuffer consume method when read or it kept getting called
>>>> repeatedly since we didn't read anything.
>>> 
>>> Hmmm. This should not be necessary. It is the responsibility of the
>>> protocol handler to shut down things properly if connection goes down or
>>> something breaks at runtime. In case of an I/O timeout the protocol
>>> handler always shuts down the execution handler, which in its turn
>>> should shut down shared i/o buffers [1]. I'll look into it.
>> 
>> Well, it's mostly because I didn't actually set the timeouts on the 
>> connection at all.   Haven't figured that out yet mostly due to the same 
>> issus as with the SSL/connection timeouts/etc...  as we set them on a per-
>> request basis normally.   For now so things don't "hang", I just added 
>> timeouts on the wait(...) calls we use when we need to block for additional 
>> IO.    In that case, the wait(...) times out and I need to shutdown the 
>> buffers and such manually so when the data DOES eventually arrive, we ignore 
>> it.   Setting the "real" timeouts on the connections would definitely help.
>> 
> 
> 
> Hi Daniel
> 
> I tweaked the code a bit to make sure i/o timeouts (both connect and
> receive ones) are handled correctly and can be specified on a per
> request basis overriding the defaults. Please review / verify.
> 
> I also added connection pool lease / release logging. I am no longer
> seeing anything suspicious related to connection management. When
> executing requests sequentially the connection manager allocates one
> connection only. 
> 
>>>> CXF allows these to be configured on a per-client (and sometimes even
>>>> per- request) basis.    It looks like HC seems to have these on the
>>>> connection factory.
>>> 
>>> It does not have to be this way. Connections can start off as plain and
>>> later get upgraded to TLS/SSL by the protocol handling code. One would
>>> have to tweak the default protocol handler a bit, though, probably by
>>> overriding the HttpAsyncRequestExecutor#requestReady method [2].
>> 
>> Ah.  OK.   I'll take a look there.   I was looking in the various connection 
>> factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
>> :-)
>> 
> 
> I also added a CXF specific connection manager implementation in order
> to enable SSL customization based on request configuration. However,
> before I proceed I would like to understand the requirements a little
> better. Can I assume that once an SSL connection has been fully set up
> using request level config parameters and can be pooled and safely
> reused for subsequent requests to the same service (even by different
> users) or requests may contain different / incompatible SSL
> configuration for the same service?   
> 
> Oleg   
> 
> 

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2012-08-03 at 09:17 -0400, Daniel Kulp wrote:
> On Friday, August 03, 2012 11:29:41 AM Oleg Kalnichevski wrote:
> > > I just went through and added a little bit of error handling into it as
> > > well.   Mostly around a connection refused exception and for read
> > > timeouts. For the read timeout, I DID have to add an ioct.shutdown() in
> > > the SharedInBuffer consume method when read or it kept getting called
> > > repeatedly since we didn't read anything.
> > 
> > Hmmm. This should not be necessary. It is the responsibility of the
> > protocol handler to shut down things properly if connection goes down or
> > something breaks at runtime. In case of an I/O timeout the protocol
> > handler always shuts down the execution handler, which in its turn
> > should shut down shared i/o buffers [1]. I'll look into it.
> 
> Well, it's mostly because I didn't actually set the timeouts on the 
> connection at all.   Haven't figured that out yet mostly due to the same 
> issus as with the SSL/connection timeouts/etc...  as we set them on a per-
> request basis normally.   For now so things don't "hang", I just added 
> timeouts on the wait(...) calls we use when we need to block for additional 
> IO.    In that case, the wait(...) times out and I need to shutdown the 
> buffers and such manually so when the data DOES eventually arrive, we ignore 
> it.   Setting the "real" timeouts on the connections would definitely help.
>  


Hi Daniel

I tweaked the code a bit to make sure i/o timeouts (both connect and
receive ones) are handled correctly and can be specified on a per
request basis overriding the defaults. Please review / verify.

I also added connection pool lease / release logging. I am no longer
seeing anything suspicious related to connection management. When
executing requests sequentially the connection manager allocates one
connection only. 

> > > CXF allows these to be configured on a per-client (and sometimes even
> > > per- request) basis.    It looks like HC seems to have these on the
> > > connection factory.
> > 
> > It does not have to be this way. Connections can start off as plain and
> > later get upgraded to TLS/SSL by the protocol handling code. One would
> > have to tweak the default protocol handler a bit, though, probably by
> > overriding the HttpAsyncRequestExecutor#requestReady method [2].
> 
> Ah.  OK.   I'll take a look there.   I was looking in the various connection 
> factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
> :-)
> 

I also added a CXF specific connection manager implementation in order
to enable SSL customization based on request configuration. However,
before I proceed I would like to understand the requirements a little
better. Can I assume that once an SSL connection has been fully set up
using request level config parameters it can be pooled and safely
reused for subsequent requests to the same service (even by different
users), or may requests contain different / incompatible SSL
configuration for the same service?

Oleg   



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Friday, August 03, 2012 11:29:41 AM Oleg Kalnichevski wrote:
> > I just went through and added a little bit of error handling into it as
> > well.   Mostly around a connection refused exception and for read
> > timeouts. For the read timeout, I DID have to add an ioct.shutdown() in
> > the SharedInBuffer consume method when read or it kept getting called
> > repeatedly since we didn't read anything.
> 
> Hmmm. This should not be necessary. It is the responsibility of the
> protocol handler to shut down things properly if connection goes down or
> something breaks at runtime. In case of an I/O timeout the protocol
> handler always shuts down the execution handler, which in its turn
> should shut down shared i/o buffers [1]. I'll look into it.

Well, it's mostly because I didn't actually set the timeouts on the 
connection at all.   Haven't figured that out yet mostly due to the same 
issues as with the SSL/connection timeouts/etc...  as we set them on a 
per-request basis normally.   For now so things don't "hang", I just added 
timeouts on the wait(...) calls we use when we need to block for additional 
IO.    In that case, the wait(...) times out and I need to shut down the 
buffers and such manually so when the data DOES eventually arrive, we ignore 
it.   Setting the "real" timeouts on the connections would definitely help.
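
Roughly, the workaround looks like the sketch below. This is a simplified stand-alone model and not the code in the sandbox or HC's SharedInputBuffer: `TimedInBuffer` and its methods are invented names, and it holds a single chunk instead of a real byte buffer.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketTimeoutException;

/** Toy single-slot input buffer showing the timed wait(...) plus manual shutdown. */
final class TimedInBuffer {
    private byte[] pending;   // chunk delivered by the I/O thread, not yet read
    private boolean shutdown; // once true, late data is dropped

    /** Called by the I/O reactor thread when data arrives. */
    synchronized void consume(byte[] data) {
        if (shutdown) {
            return; // the reader already gave up, so ignore data that arrives late
        }
        pending = data;
        notifyAll();
    }

    /** Called by the application thread; blocks for at most timeoutMillis. */
    synchronized byte[] read(long timeoutMillis) throws IOException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (pending == null) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                shutdown = true; // manual shutdown: nothing else will do it for us here
                throw new SocketTimeoutException("read timed out after " + timeoutMillis + " ms");
            }
            try {
                // wait(0) would block forever, so never pass less than 1 ms
                wait(Math.max(1L, remainingNanos / 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while waiting for data");
            }
        }
        byte[] result = pending;
        pending = null;
        return result;
    }

    synchronized boolean isShutdown() {
        return shutdown;
    }
}
```

The loop re-checks the deadline after every wake-up, so a spurious wake-up cannot shorten or extend the timeout. With real socket timeouts set on the connection, the protocol handler would be the one shutting things down and the `shutdown` flag would not be needed.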
 
> > CXF allows these to be configured on a per-client (and sometimes even
> > per- request) basis.    It looks like HC seems to have these on the
> > connection factory.
> 
> It does not have to be this way. Connections can start off as plain and
> later get upgraded to TLS/SSL by the protocol handling code. One would
> have to tweak the default protocol handler a bit, though, probably by
> overriding the HttpAsyncRequestExecutor#requestReady method [2].

Ah.  OK.   I'll take a look there.   I was looking in the various connection 
factories and pools.   Haven't looked there yet.   Lots of stuff going on.  
:-)

Dan



> 
> >   I thought about a new factory per request, but the pool seems to
> > 
> > be tied to the connection factory  as well and a full pool per request
> > would be awful.   I then wanted to see if I could pass some sort of
> > contextual object in (there is that BasicHttpContext object) that I
> > could pull information out of at connection time or similar, but I
> > couldn't really figure out where that went, at least not until after the
> > IOSession is setup, which is too late.   Also thought about a thread
> > local, but that doesn't work with the async connection.
> 
> I'll try to find time to write all the necessary plumbing code on this
> weekend.
> 
> Cheers
> 
> Oleg
> 
> > Any ideas on that would be a huge help.  I'll keep digging a bit as
> > well.
> > Kind of learning more about the HC stuff and NIO and all kind of things,
> > which is kind of fun.   :-)
> 
> [1]
> http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/xref/org/apache/http/nio/protocol/HttpAsyncRequestExecutor.html#269
> [2]
> http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/xref/org/apache/http/nio/protocol/HttpAsyncRequestExecutor.html#124
-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2012-08-02 at 14:58 -0400, Daniel Kulp wrote:
> On Wednesday, August 01, 2012 05:58:23 PM Oleg Kalnichevski wrote:
> > Hi Daniel
...
>  
> > I also noticed that for some reason additional connections get open and
> > closed without transmitting any data, which obviously should not be
> > happening. I'll investigate the problem if you confirm that you want me
> > to proceed.
> 
> Interesting.  I'm actually not seeing that now.  :-)   I ran the testCalls 
> things with small and larger requests and in both cases, all 20K requests 
> went on a single connection.  Was quite impressed.  (I actually expected the 
> Jetty backend to have a limit on number of requests per keepalive as I ran 
> into a similar setting in Tomcat last week that defaulted to 100).
> 

I would like to investigate nonetheless. Besides, connection pool
logging might come in handy in other situations as well.

> I just went through and added a little bit of error handling into it as 
> well.   Mostly around a connection refused exception and for read timeouts.   
> For the read timeout, I DID have to add an ioct.shutdown() in the 
> SharedInBuffer consume method when read or it kept getting called repeatedly 
> since we didn't read anything.
> 

Hmmm. This should not be necessary. It is the responsibility of the
protocol handler to shut down things properly if connection goes down or
something breaks at runtime. In case of an I/O timeout the protocol
handler always shuts down the execution handler, which in its turn
should shut down shared i/o buffers [1]. I'll look into it.


> Also started working on the resends and such, but haven't yet added a 
> testcase for that.  
> 
> So, stuff I have left to look at are basically:  (and I've add //FIXME 
> comments in the code where I *think* something needs to happen)
> 
> 1) Proxy support
> 2) SSL/TLS
> 3) Connection timeouts
> 

Connect timeouts, i/o timeouts, TLS/SSL layering should be supported out
of the box. Proxy support can present a certain conceptual challenge as
it automatically implies a decision as to whether or not HTTP
authentication is to be supported and if so to what extent. Complex auth
schemes such as SPNEGO and NTLM entail a lot of additional complexity and
may require additional external dependencies. HC provides full support
for HTTP request proxying and HTTPS tunneling in its higher level
components (HttpClient and HttpAsyncClient) but not in core. So, the
options are to upgrade to HttpAsyncClient or to implement proxy support
in the CXF HTTP conduit. The former most likely means having to live
with different (potentially inconsistent) HTTP authentication
frameworks. The latter means having more flexibility at the price of
having more code to maintain. So, essentially it is a classic buy
(re-use) or build dilemma.

> CXF allows these to be configured on a per-client (and sometimes even per-
> request) basis.    It looks like HC seems to have these on the connection 
> factory.

It does not have to be this way. Connections can start off as plain and
later get upgraded to TLS/SSL by the protocol handling code. One would
have to tweak the default protocol handler a bit, though, probably by
overriding the HttpAsyncRequestExecutor#requestReady method [2]. 

>   I thought about a new factory per request, but the pool seems to 
> be tied to the connection factory as well and a full pool per request would 
> be awful.   I then wanted to see if I could pass some sort of contextual 
> object in (there is that BasicHttpContext object) that I could pull 
> information out of at connection time or similar, but I couldn't really figure 
> out where that went, at least not until after the IOSession is setup, which 
> is too late.   Also thought about a thread local, but that doesn't work with 
> the async connection.   
> 

I'll try to find time to write all the necessary plumbing code on this
weekend.

Cheers

Oleg

> Any ideas on that would be a huge help.  I'll keep digging a bit as well.   
> Kind of learning more about the HC stuff and NIO and all kind of things, 
> which is kind of fun.   :-)
> 
> 

[1]
http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/xref/org/apache/http/nio/protocol/HttpAsyncRequestExecutor.html#269
[2]
http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/xref/org/apache/http/nio/protocol/HttpAsyncRequestExecutor.html#124



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Wednesday, August 01, 2012 05:58:23 PM Oleg Kalnichevski wrote:
> Hi Daniel
> 
> I believe I have fixed the thread synchronization issue. All tests
> repeatedly passed for me. Could you please have a look at my changes and
> confirm that everything is more or less sane and HTTP transport still
> performs at an acceptable level?

Finally got a chance to look at all the stuff you did.  Awesome work!

 
> As the next I could optimize the code and hopefully bring its
> performance closer to that of the default blocking transport.
> 
> I also noticed that for some reason additional connections get open and
> closed without transmitting any data, which obviously should not be
> happening. I'll investigate the problem if you confirm that you want me
> to proceed.

Interesting.  I'm actually not seeing that now.  :-)   I ran the testCalls 
things with small and larger requests and in both cases, all 20K requests 
went on a single connection.  Was quite impressed.  (I actually expected the 
Jetty backend to have a limit on number of requests per keepalive as I ran 
into a similar setting in Tomcat last week that defaulted to 100).

I just went through and added a little bit of error handling into it as 
well.   Mostly around a connection refused exception and for read timeouts.   
For the read timeout, I DID have to add an ioct.shutdown() in the 
SharedInBuffer consume method when read or it kept getting called repeatedly 
since we didn't read anything.

Also started working on the resends and such, but haven't yet added a 
testcase for that.  

So, the stuff I have left to look at is basically:  (and I've added //FIXME 
comments in the code where I *think* something needs to happen)

1) Proxy support
2) SSL/TLS
3) Connection timeouts

CXF allows these to be configured on a per-client (and sometimes even per-
request) basis.    It looks like HC seems to have these on the connection 
factory.  I thought about a new factory per request, but the pool seems to 
be tied to the connection factory as well and a full pool per request would 
be awful.   I then wanted to see if I could pass some sort of contextual 
object in (there is that BasicHttpContext object) that I could pull 
information out of at connection time or similar, but I couldn't really figure 
out where that went, at least not until after the IOSession is setup, which 
is too late.   Also thought about a thread local, but that doesn't work with 
the async connection.   

Any ideas on that would be a huge help.  I'll keep digging a bit as well.   
Kind of learning more about the HC stuff and NIO and all kind of things, 
which is kind of fun.   :-)


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2012-07-30 at 19:47 -0400, Daniel Kulp wrote:
> On Monday, July 30, 2012 11:41:55 PM Oleg Kalnichevski wrote:
> > On Mon, 2012-07-30 at 17:06 -0400, Daniel Kulp wrote:
> > > BTW: I think the NHttpClient.java referenced from:
> > > http://hc.apache.org/httpcomponents-core-ga/examples.html
> > > needs work as well.   I don't think most of the parameters set in the
> > > HttpParams actually will actually go into effect.   That was part of my
> > > original issue as the TCP_NODELAY setting didn't take effect until I set
> > > that on the ConnectingIOReactor instead of in the params.    Not sure if
> > > it's a bug in the BasicNIOConnPool that was ignoring the params or a bug
> > > in the example.    Looking at the code in BasicNIOConnPool, I think the
> > > connection timeout is the only setting that would work from the params.
> >
> > Higher level components such as HttpAsyncClient usually take care of
> > creating and initializing the I/O reactor based on the HTTP level
> > parameters. When using core components directly more work may be needed.
> > I/O reactor config in NHttpClient is obviously wrong. TCP_NODELAY should
> > actually be set to true per default, which is something I clearly
> > overlooked.
> 
> I was JUST about to ask about that.  :-)     false is OK for GET requests, 
> but for PUT/POST, it completely kills performance as it adds up to a 200ms 
> delay between sending the headers and the body as it waits for the ack of 
> the headers.   Would you like me to log some issues in JIRA for these so 
> they get tracked?
> 

Hi Daniel

I believe I have fixed the thread synchronization issue. All tests
repeatedly passed for me. Could you please have a look at my changes and
confirm that everything is more or less sane and HTTP transport still
performs at an acceptable level?

As the next step, I could optimize the code and hopefully bring its
performance closer to that of the default blocking transport.

I also noticed that for some reason additional connections get opened and
closed without transmitting any data, which obviously should not be
happening. I'll investigate the problem if you confirm that you want me
to proceed.

Cheers

Oleg 


Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2012-07-30 at 19:47 -0400, Daniel Kulp wrote:
> On Monday, July 30, 2012 11:41:55 PM Oleg Kalnichevski wrote:
> > On Mon, 2012-07-30 at 17:06 -0400, Daniel Kulp wrote:
> > > BTW: I think the NHttpClient.java referenced from:
> > > http://hc.apache.org/httpcomponents-core-ga/examples.html
> > > needs work as well.   I don't think most of the parameters set in the
> > > HttpParams actually will actually go into effect.   That was part of my
> > > original issue as the TCP_NODELAY setting didn't take effect until I set
> > > that on the ConnectingIOReactor instead of in the params.    Not sure if
> > > it's a bug in the BasicNIOConnPool that was ignoring the params or a bug
> > > in the example.    Looking at the code in BasicNIOConnPool, I think the
> > > connection timeout is the only setting that would work from the params.
> >
> > Higher level components such as HttpAsyncClient usually take care of
> > creating and initializing the I/O reactor based on the HTTP level
> > parameters. When using core components directly more work may be needed.
> > I/O reactor config in NHttpClient is obviously wrong. TCP_NODELAY should
> > actually be set to true per default, which is something I clearly
> > overlooked.
> 
> I was JUST about to ask about that.  :-)     false is OK for GET requests, 
> but for PUT/POST, it completely kills performance as it adds up to a 200ms 
> delay between sending the headers and the body as it waits for the ack of 
> the headers.   Would you like me to log some issues in JIRA for these so 
> they get tracked?
> 

Absolutely! Please do.

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Monday, July 30, 2012 11:41:55 PM Oleg Kalnichevski wrote:
> On Mon, 2012-07-30 at 17:06 -0400, Daniel Kulp wrote:
> > BTW: I think the NHttpClient.java referenced from:
> > http://hc.apache.org/httpcomponents-core-ga/examples.html
> > needs work as well.   I don't think most of the parameters set in the
> > HttpParams actually will actually go into effect.   That was part of my
> > original issue as the TCP_NODELAY setting didn't take effect until I set
> > that on the ConnectingIOReactor instead of in the params.    Not sure if
> > it's a bug in the BasicNIOConnPool that was ignoring the params or a bug
> > in the example.    Looking at the code in BasicNIOConnPool, I think the
> > connection timeout is the only setting that would work from the params.
>
> Higher level components such as HttpAsyncClient usually take care of
> creating and initializing the I/O reactor based on the HTTP level
> parameters. When using core components directly more work may be needed.
> I/O reactor config in NHttpClient is obviously wrong. TCP_NODELAY should
> actually be set to true per default, which is something I clearly
> overlooked.

I was JUST about to ask about that.  :-)     false is OK for GET requests, 
but for PUT/POST, it completely kills performance as it adds up to a 200ms 
delay between sending the headers and the body as it waits for the ack of 
the headers.   Would you like me to log some issues in JIRA for these so 
they get tracked?
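
For anyone following along, this is roughly what "setting TCP_NODELAY" 
means at the socket level.  It is a minimal sketch using only the JDK's 
NIO classes, not the HC I/O reactor configuration; the class name 
NoDelayExample and the throwaway local server are there only for 
illustration.  With the option set to true, Nagle's algorithm is disabled, 
so a small header write followed by a body write is not held back waiting 
for the ACK of the first segment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NoDelayExample {

    /** Opens a client channel with Nagle's algorithm disabled. */
    public static SocketChannel openNoDelay(InetSocketAddress address) throws IOException {
        SocketChannel channel = SocketChannel.open();
        try {
            // TCP_NODELAY=true: small writes are sent immediately
            channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
            channel.connect(address);
            return channel;
        } catch (IOException e) {
            channel.close();
            throw e;
        }
    }

    /** Connects to a throwaway local server and reports the option value. */
    public static boolean demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 = any free port; the server only exists so connect() succeeds
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            try (SocketChannel client = openNoDelay(address)) {
                return client.getOption(StandardSocketOptions.TCP_NODELAY);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY = " + demo());
    }
}
```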

-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2012-07-30 at 17:06 -0400, Daniel Kulp wrote:
> On Monday, July 30, 2012 10:40:32 PM Oleg Kalnichevski wrote:
> > Individual ContentDecoder / ContentEncoder implementations are marked
> > with @NonThreadSafe annotation but public APIs do not make it clear
> > enough that codecs are not thread safe. Thank you for pointing that out.
> > I'll update javadocs and the tutorial as you suggested.
> 
> Well, the @NonThreadSafe annotation wouldn't have really helped me I think.   
> *I'm* only accessing the given encoder/decoder on a single thread.   Thus, 
> from my standpoint, it should have worked.  :-)    Yea, that single thread 
> is different than the  thread that passed it to me, but it is all on a 
> single thread.  
> 

This is correct. However, there is also an I/O dispatch thread of the I/O
reactor that writes out content of the session buffer as soon as the
underlying channel becomes writable. Chunk headers can get corrupted if
both the I/O reactor and the worker thread happen to access the buffer
at the same time. This also explains why you have never seen data
corruption with Content-Length delimited messages, as in this case data
gets written to the underlying channel directly bypassing all
buffers.     

> 
> BTW: I think the NHttpClient.java referenced from:
> http://hc.apache.org/httpcomponents-core-ga/examples.html
> needs work as well.   I don't think most of the parameters set in the 
> HttpParams actually will actually go into effect.   That was part of my 
> original issue as the TCP_NODELAY setting didn't take effect until I set 
> that on the ConnectingIOReactor instead of in the params.    Not sure if 
> it's a bug in the BasicNIOConnPool that was ignoring the params or a bug in 
> the example.    Looking at the code in BasicNIOConnPool, I think the 
> connection timeout is the only setting that would work from the params.
> 

Higher level components such as HttpAsyncClient usually take care of
creating and initializing the I/O reactor based on the HTTP level
parameters. When using core components directly more work may be needed.
I/O reactor config in NHttpClient is obviously wrong. TCP_NODELAY should
actually be set to true per default, which is something I clearly
overlooked.

Oleg



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Monday, July 30, 2012 10:40:32 PM Oleg Kalnichevski wrote:
> Individual ContentDecoder / ContentEncoder implementations are marked
> with @NonThreadSafe annotation but public APIs do not make it clear
> enough that codecs are not thread safe. Thank you for pointing that out.
> I'll update javadocs and the tutorial as you suggested.

Well, the @NonThreadSafe annotation wouldn't have really helped me I think.   
*I'm* only accessing the given encoder/decoder on a single thread.   Thus, 
from my standpoint, it should have worked.  :-)    Yea, that single thread 
is different than the  thread that passed it to me, but it is all on a 
single thread.  


BTW: I think the NHttpClient.java referenced from:
http://hc.apache.org/httpcomponents-core-ga/examples.html
needs work as well.   I don't think most of the parameters set in the 
HttpParams will actually go into effect.   That was part of my 
original issue as the TCP_NODELAY setting didn't take effect until I set 
that on the ConnectingIOReactor instead of in the params.    Not sure if 
it's a bug in the BasicNIOConnPool that was ignoring the params or a bug in 
the example.    Looking at the code in BasicNIOConnPool, I think the 
connection timeout is the only setting that would work from the params.


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2012-07-30 at 10:35 -0400, Daniel Kulp wrote:
> On Sunday, July 29, 2012 12:59:15 PM Oleg Kalnichevski wrote:
> > All right. Then, with your permission, I'll try to commit some of my
> > changes. You can always revert or re-work them if anything is not to
> > your liking.
> 
> Yep.  That's the purpose of version control.   :-)  Major thanks for taking 
> a look.  That's a huge help.
> 
>  
> > I think I have found the problem. It appears to be a classic
> > thread-safety issue. Content codecs (ContentEncoder / ContentDecoder
> > instances) are not thread-safe and are not meant to be accessed by
> > worker threads directly. 
> 
> OK.  I actually thought about that originally when I wrote the code.  
> However, I didn't see anything in the javadocs for HttpAsyncRequestProducer 
> or for the Encoder saying they shouldn't.   Since you can use the IOControl 
> object to suspend the input, I assumed you would need to keep the IOControl 
> to resume it and thus IT should be usable on multiple threads.  If one is 
> usable, I thought the other should be as well.
> 
> IMO, it would be a HUGE help if the javadocs would be updated to 
> specifically say something like:
>
> "The encoder should only be used within the context of the call to 
> produceContent.   The IOControl object can be retained and used on other 
> threads to allow resuming events when appropriate." 
> 
> Likewise for the decoder if appropriate there as well.  If that was in the 
> javadoc, I wouldn't have written it like I did.  :-)
> 

Hi Daniel

Individual ContentDecoder / ContentEncoder implementations are marked
with @NonThreadSafe annotation but public APIs do not make it clear
enough that codecs are not thread safe. Thank you for pointing that out.
I'll update javadocs and the tutorial as you suggested. 

> 
> > Worker threads are expected to use shared
> > buffers to produce and consume message bodies and let the I/O dispatcher
> > take care of transferring data from and to the underlying channel
> > (through a content codec) instead of trying to force data in or out of
> > the content codec directly. The reason why content codecs are not meant
> > to be shared is to minimize the chances of blocking the I/O dispatch
> > thread and all other connections on the same I/O selector along with
> > it.
> > 
> > I'll see if I can re-work your code a little and fix the threading
> > issue.
> 
> OK.  Thanks for that.   I'll take a look as well.
> 
> Major thanks for jumping in and taking a look and providing good 
> explanations.  Huge help!
> 

You are probably helping me as much as I am helping you.

Cheers

Oleg



Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Sunday, July 29, 2012 12:59:15 PM Oleg Kalnichevski wrote:
> All right. Then, with your permission, I'll try to commit some of my
> changes. You can always revert or re-work them if anything is not to
> your liking.

Yep.  That's the purpose of version control.   :-)  Major thanks for taking 
a look.  That's a huge help.

 
> I think I have found the problem. It appears to be a classic
> thread-safety issue. Content codecs (ContentEncoder / ContentDecoder
> instances) are not thread-safe and are not meant to be accessed by
> worker threads directly. 

OK.  I actually thought about that originally when I wrote the code.  
However, I didn't see anything in the javadocs for HttpAsyncRequestProducer 
or for the Encoder saying they shouldn't.   Since you can use the IOControl 
object to suspend the input, I assumed you would need to keep the IOControl 
to resume it and thus IT should be usable on multiple threads.  If one is 
usable, I thought the other should be as well.

IMO, it would be a HUGE help if the javadocs would be updated to 
specifically say something like:

"The encoder should only be used within the context of the call to 
produceContent.   The IOControl object can be retained and used on other 
threads to allow resuming events when appropriate." 

Likewise for the decoder if appropriate there as well.  If that was in the 
javadoc, I wouldn't have written it like I did.  :-)


> Worker threads are expected to use shared
> buffers to produce and consume message bodies and let the I/O dispatcher
> take care of transferring data from and to the underlying channel
> (through a content codec) instead of trying to force data in or out of
> the content codec directly. The reason why content codecs are not meant
> to be shared is to minimize the chances of blocking the I/O dispatch
> thread and all other connections on the same I/O selector along with
> it.
> 
> I'll see if I can re-work your code a little and fix the threading
> issue.

OK.  Thanks for that.   I'll take a look as well.

Major thanks for jumping in and taking a look and providing good 
explanations.  Huge help!


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sat, 2012-07-28 at 16:47 -0400, Daniel Kulp wrote:
> On Saturday, July 28, 2012 03:14:16 PM Oleg Kalnichevski wrote:
> > > 4) Apache HTTP Components (HC)- this was the first one I tried, ran into
> > > performance issues, abandoned it to test the others, then came back to
> > > it and figured out the performance issue.  :-)   I had this "working",
> > > but a simple "hello world" echo in a loop resulted in VERY VERY slow
> > > operation, about 20x slower than the URLConnection in the JDK.
> > > Couldn't figure out what was going on which is why I started looking
> > > at the others.   I came back to it and started doing wireshark
> > > captures and discovered that it was waiting for ACK packets whereas
> > > the other clients were not.   The main issue was that the docs for how
> > > to set the TCP_NODELAY flag (which, to me, should be the default) seem
> > > to be more geared toward the 3.x or non-NIO versions. Anyway, once I
> > > managed to get that set, things improved significantly.  For
> > > non-chunked data, it seems to be working very well.  For chunked data,
> > > it seems to work well 99% of the time.  It's that last 1% that's going
> > > to drive me nuts.  :-(   It's occasionally writing out bad chunk
> > > headers, and I have no idea why.  A raw wireshark look certainly shows
> > > bad chunk headers heading out.   I don't know if it's something I'm
> > > doing or a bug in their stuff.  Don't really know yet.
> > 
> > Hi Daniel
> > 
> > If my memory serves me well, we have not had a single confirmed case of
> > message corruption I could think of for many years. I took a cursory
> > look at your code and could not spot anything obviously wrong. I am
> > attaching a patch that adds wire and i/o event logging to your HTTP
> > conduit. If you set 'org.apache.http' category to DEBUG priority you
> > should be able to see what kind of stuff gets written to and read from
> > the underlying NIO channel and compare it with what you see with
> > Wireshark.
> 
> I'll take a look at the patch a bit on monday.   BTW:  the CXF sandbox where 
> I'm working on this is open to all Apache committers.  Feel free to commit 
> anything that may help.  :-) 
> 

Hi Daniel

All right. Then, with your permission, I'll try to commit some of my
changes. You can always revert or re-work them if anything is not to
your liking.

> 
> > If you let me know how to reproduce the issue I'll happily investigate
> > and try to find out what causes it. If there is anything wrong with
> > HttpCore I'll get it fixed.
> 
> I just committed a quick update to the testcase to make the test messages 
> large enough to start chunking.   If you run the 
> AsyncHTTPConduitTest.testCalls  
> test in the unit test dir, it will USUALLY blow up after a few hundred 
> requests.    It will likely just hang if run from maven (I don't have any 
> exception handling in place yet so it's waiting for a response that just 
> won't be there).   The Jetty server it starts up has bailed due to an 
> invalid chunk size.   If you run from with Eclipse or similar, you will 
> likely see the log output from Jetty which will say something like:
> 
> Caused by: java.io.IOException: bad chunk char: 112
> 	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:905)
> 	at org.eclipse.jetty.http.HttpParser.blockForContent(HttpParser.java:1173)
> 	at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:57)
> 	at java.io.FilterInputStream.read(FilterInputStream.java:116)
> 
> 
> From my wireshark trace, I see what was sent as:
> 
> POST http://localhost:9001/SoapContext/SoapPort HTTP/1.1
> Accept: */*
> User-Agent: Apache CXF 2.7.0-SNAPSHOT
> SOAPAction: ""
> Transfer-Encoding: chunked
> Content-Type: text/xml; charset=UTF-8
> Host: localhost:9001
> Connection: Keep-Alive
> 
> 
> <soap:Envelope..........
> 
> Since there are two lines between the "Keep-Alive" line and the 
> <soap:env..., I assume some space was reserved in the chunk for the header, 
> but it wasn't written there.   Not really sure.   
> 

I think I have found the problem. It appears to be a classic
thread-safety issue. Content codecs (ContentEncoder / ContentDecoder
instances) are not thread-safe and are not meant to be accessed by
worker threads directly. Worker threads are expected to use shared
buffers to produce and consume message bodies and let the I/O dispatcher
take care of transferring data from and to the underlying channel
(through a content codec) instead of trying to force data in or out of
the content codec directly. The reason why content codecs are not meant
to be shared is to minimize the chances of blocking the I/O dispatch
thread and all other connections on the same I/O selector along with
it.  
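
To illustrate the pattern described above, here is a minimal sketch in 
plain Java.  It is not HttpCore code: the class SharedOutputBufferSketch is 
hypothetical, and a ByteArrayOutputStream stands in for the content codec 
and the underlying channel.  The worker thread only ever writes into the 
lock-protected buffer; the dispatcher side is the only one that moves data 
towards the "channel".

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical shared buffer between a worker and an I/O dispatcher. */
public class SharedOutputBufferSketch {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private boolean closed;

    /** Worker thread: queue bytes; never touches the codec or channel. */
    public synchronized void write(byte[] data) {
        if (closed) {
            throw new IllegalStateException("buffer closed");
        }
        pending.write(data, 0, data.length);
    }

    /** Worker thread: signal that the message body is complete. */
    public synchronized void close() {
        closed = true;
    }

    /**
     * Dispatcher thread: move queued bytes into the sink (stand-in for the
     * content codec).  Returns true once the buffer is closed and drained.
     */
    public synchronized boolean drainTo(ByteArrayOutputStream sink) {
        sink.write(pending.toByteArray(), 0, pending.size());
        pending.reset();
        return closed;
    }

    /** Runs one worker against a dispatcher loop on the calling thread. */
    public static String demo() {
        SharedOutputBufferSketch buffer = new SharedOutputBufferSketch();
        Thread worker = new Thread(() -> {
            buffer.write("hello ".getBytes(StandardCharsets.US_ASCII));
            buffer.write("world".getBytes(StandardCharsets.US_ASCII));
            buffer.close();
        });
        worker.start();

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // A real dispatcher would be driven by "channel writable" events
        // rather than spinning; the loop keeps the sketch self-contained.
        while (!buffer.drainTo(sink)) {
            Thread.yield();
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because both sides go through the same lock, the dispatcher never sees a 
half-written block, which is the kind of interleaving that seems to have 
corrupted the chunk headers.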

I'll see if I can re-work your code a little and fix the threading
issue.

> Another note is that all the calls are occurring one after another.   With 
> the keep-alives, I kind of would have expected just a single connection, but 
> I'm seeing additional connections being established.  Might be something I'm 
> doing (not marking a connection complete in time or something).   On my todo 
> list to look at as well.
> 

I'll look into the connection pooling issue as soon as the basic
transport works reliably. 

>  
> > > In anycase, I'm likely going to pursue option #4 a bit more and see if I
> > > can figure out the last issue with it.
> > > 
> > > From a performance standpoint, for synchronous request/response, none
> > > of them perform as well as the in-jdk HttpURLConnection for what we
> > > do.  Netty came the closest at about 5% slower.  HC was about 10%,
> > > Jetty about 12%. Gave up on Ning before running benchmarks.
> > 
> > I am quite surprised the difference compared to HttpURLConnection is
> > relatively small. In my experience a decent blocking HTTP client
> > outperforms a decent non-blocking HTTP client by as much as 50% as long
> > as the number of concurrent connections is moderate (<500). NIO starts
> > paying off only with concurrency well over 2000 or when connections stay
> > idle most of the time.
> 
> Oh, cool.  I was actually disappointed with the numbers.  If they look good 
> to you, I'm happy.  :-)
> 
> Most of the overhead in this case is not so much with the http library but 
> with CXF.   CXF is generating the XML during the write process and that's 
> not exactly "cheap".   Thus, it's not like "normal" benchmarks that are 
> trying to blast pre-generated content back and forth as byte[].  
> 

That makes sense.

Cheers

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Saturday, July 28, 2012 03:14:16 PM Oleg Kalnichevski wrote:
> > 4) Apache HTTP Components (HC)- this was the first one I tried, ran into
> > performance issues, abandoned it to test the others, then came back to
> > it and figured out the performance issue.  :-)   I had this "working",
> > but a simple "hello world" echo in a loop resulted in VERY VERY slow
> > operation, about 20x slower than the URLConnection in the JDK.
> > Couldn't figure out what was going on which is why I started looking
> > at the others.   I came back to it and started doing wireshark
> > captures and discovered that it was waiting for ACK packets whereas
> > the other clients were not.   The main issue was that the docs for how
> > to set the TCP_NODELAY flag (which, to me, should be the default) seem
> > to be more geared toward the 3.x or non-NIO versions. Anyway, once I
> > managed to get that set, things improved significantly.  For
> > non-chunked data, it seems to be working very well.  For chunked data,
> > it seems to work well 99% of the time.  It's that last 1% that's going
> > to drive me nuts.  :-(   It's occasionally writing out bad chunk
> > headers, and I have no idea why.  A raw wireshark look certainly shows
> > bad chunk headers heading out.   I don't know if it's something I'm
> > doing or a bug in their stuff.  Don't really know yet.
> 
> Hi Daniel
> 
> If my memory serves me well, we have not had a single confirmed case of
> message corruption I could think of for many years. I took a cursory
> look at your code and could not spot anything obviously wrong. I am
> attaching a patch that adds wire and i/o event logging to your HTTP
> conduit. If you set 'org.apache.http' category to DEBUG priority you
> should be able to see what kind of stuff gets written to and read from
> the underlying NIO channel and compare it with what you see with
> Wireshark.

I'll take a look at the patch on Monday.   BTW:  the CXF sandbox where 
I'm working on this is open to all Apache committers.  Feel free to commit 
anything that may help.  :-) 


> If you let me know how to reproduce the issue I'll happily investigate
> and try to find out what causes it. If there is anything wrong with
> HttpCore I'll get it fixed.

I just committed a quick update to the testcase to make the test messages 
large enough to start chunking.   If you run the 
AsyncHTTPConduitTest.testCalls  
test in the unit test dir, it will USUALLY blow up after a few hundred 
requests.    It will likely just hang if run from Maven (I don't have any 
exception handling in place yet so it's waiting for a response that just 
won't be there).   The Jetty server it starts up has bailed due to an 
invalid chunk size.   If you run from within Eclipse or similar, you will 
likely see the log output from Jetty which will say something like:

Caused by: java.io.IOException: bad chunk char: 112
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:905)
	at org.eclipse.jetty.http.HttpParser.blockForContent(HttpParser.java:1173)
	at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:57)
	at java.io.FilterInputStream.read(FilterInputStream.java:116)


From my Wireshark trace, I see what was sent as:

POST http://localhost:9001/SoapContext/SoapPort HTTP/1.1
Accept: */*
User-Agent: Apache CXF 2.7.0-SNAPSHOT
SOAPAction: ""
Transfer-Encoding: chunked
Content-Type: text/xml; charset=UTF-8
Host: localhost:9001
Connection: Keep-Alive


<soap:Envelope..........

Since there are two lines between the "Keep-Alive" line and the 
<soap:env..., I assume some space was reserved in the chunk for the header, 
but it wasn't written there.   Not really sure.   
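
For reference, a minimal sketch of what a well-formed chunk looks like on 
the wire under HTTP/1.1 chunked transfer coding: the size in hex, CRLF, the 
data, CRLF, with a zero-size chunk ending the body.  The class and method 
names below are made up for illustration and are not part of CXF, HttpCore 
or Jetty.  Assuming the number in Jetty's message is the decimal value of 
the offending byte, 112 is ASCII 'p', which is not a hex digit, so the 
parser was presumably reading message content where it expected a chunk 
size.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class ChunkedFraming {

    // Frames one non-empty chunk: <size in hex> CRLF <data> CRLF.
    static byte[] encodeChunk(byte[] data) {
        if (data.length == 0) {
            // A zero-size chunk would be read as the end of the body.
            throw new IllegalArgumentException("use lastChunk() to end the body");
        }
        byte[] header = (Integer.toHexString(data.length) + "\r\n")
                .getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header, 0, header.length);
        out.write(data, 0, data.length);
        out.write('\r');
        out.write('\n');
        return out.toByteArray();
    }

    // The terminating chunk (no trailers): "0" CRLF CRLF.
    static byte[] lastChunk() {
        return "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
    }

    // True if the byte may appear in a chunk-size field.
    static boolean isHexDigit(int b) {
        return (b >= '0' && b <= '9')
                || (b >= 'a' && b <= 'f')
                || (b >= 'A' && b <= 'F');
    }

    public static void main(String[] args) {
        byte[] body = "<soap:Envelope/>".getBytes(StandardCharsets.UTF_8);
        byte[] framed = encodeChunk(body);
        // Show the size line that a receiver parses before the data.
        String sizeLine = new String(framed, 0, 2, StandardCharsets.US_ASCII);
        System.out.println("chunk size line: " + sizeLine);
        System.out.println("112 is '" + (char) 112 + "', hex digit? " + isHexDigit(112));
    }
}
```

A receiver that lands on anything other than a hex digit where the size 
line should start has no way to resynchronise, which would fit the server 
giving up on the connection.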

Another note is that all the calls are occurring one after another.   With 
the keep-alives, I kind of would have expected just a single connection, but 
I'm seeing additional connections being established.  Might be something I'm 
doing (not marking a connection complete in time or something).   On my todo 
list to look at as well.

 
> > In any case, I'm likely going to pursue option #4 a bit more and see if I
> > can figure out the last issue with it.
> > 
> > From a performance standpoint, for synchronous request/response, none of
> > them perform as well as the in-jdk HttpURLConnection for what we do.
> > Netty came the closest at about 5% slower.  HC was about 10%, Jetty
> > about 12%. Gave up on Ning before running benchmarks.
> 
> I am quite surprised the difference compared to HttpURLConnection is
> relatively small. In my experience a decent blocking HTTP client
> outperforms a decent non-blocking HTTP client by as much as 50% as long
> as the number of concurrent connections is moderate (<500). NIO starts
> paying off only with concurrency well over 2000 or when connections stay
> idle most of the time.

Oh, cool.  I was actually disappointed with the numbers.  If they look good 
to you, I'm happy.  :-)

Most of the overhead in this case is not so much with the http library but 
with CXF.   CXF is generating the XML during the write process and that's 
not exactly "cheap".   Thus, it's not like "normal" benchmarks that are 
trying to blast pre-generated content back and forth as byte[].  
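
To illustrate the difference, here is a rough sketch of writing a SOAP 
envelope straight to an output stream as it is generated, using the StAX 
API from the JDK.  This is not CXF's actual code path; the class name, the 
"echo" element and the ByteArrayOutputStream standing in for the transport 
are all assumptions for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class, not part of CXF.
class StreamingEnvelope {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Writes the envelope element by element; nothing is rendered up front,
    // so the total length is unknown until the last element is closed.
    static void writeEnvelope(OutputStream out, String text) {
        try {
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            w.writeStartElement("soap", "Envelope", SOAP_NS);
            w.writeNamespace("soap", SOAP_NS);
            w.writeStartElement("soap", "Body", SOAP_NS);
            w.writeStartElement("echo");
            w.writeCharacters(text);
            w.writeEndElement(); // echo
            w.writeEndElement(); // soap:Body
            w.writeEndElement(); // soap:Envelope
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write envelope", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream handed out by the HTTP transport.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeEnvelope(out, "hello world");
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because the size is not known until the writer finishes, the transport 
either has to buffer the whole message to set Content-Length or fall back 
to chunking, and the XML generation cost is paid inside the write call.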



-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2012-07-27 at 15:20 -0400, Daniel Kulp wrote:
> I've committed a few experiments I've been working on to:
> 
> https://svn.apache.org/repos/asf/cxf/sandbox/dkulp_async_clients
> 
> 
> Basically, I've been trying to find an async client that is somewhat usable 
> for CXF without completely re-writing all of CXF.   Not exactly an easy 
> task.  For "POST"s, they pretty much are all designed around being able to 
> blast out pre-rendered content (like File's or byte[]).   Doesn't really fit 
> with CXF's way of streaming out the soap messages as they are created.   
> 

...

> 4) Apache HTTP Components (HC)- this was the first one I tried, ran into 
> performance issues, abandoned it to test the others, then came back to it 
> and figured out the performance issue.  :-)   I had this "working", but a 
> simple "hello world" echo in a loop resulted in VERY VERY slow operation, 
> about 20x slower than the URLConnection in the JDK.   Couldn't figure out 
> what was going on which is why I started looking at the others.   I came 
> back to it and started doing wireshark captures and discovered that it was 
> waiting for ACK packets whereas the other clients were not.   The main issue 
> was that the docs for how to set the TCP_NO_DELAY flag (which, to me, should 
> be the default) seem to be more geared toward the 3.x or non-NIO versions.   
> Anyway, once I managed to get that set, things improved significantly.  For 
> non-chunked data, it seems to be working very well.  For chunked data, it 
> seems to work well 99% of the time.   It's that last 1% that's going to 
> drive me nuts.  :-(   It's occasionally writing out bad chunk headers, and 
> I have no idea why.   A raw wireshark look certainly shows bad chunk headers 
> heading out.   I don't know if it's something I'm doing or a bug in their 
> stuff.  Don't really know yet.
> 

Hi Daniel

If my memory serves me well, we have not had a single confirmed case of
message corruption I could think of for many years. I took a cursory
look at your code and could not spot anything obviously wrong. I am
attaching a patch that adds wire and i/o event logging to your HTTP
conduit. If you set 'org.apache.http' category to DEBUG priority you
should be able to see what kind of stuff gets written to and read from
the underlying NIO channel and compare it with what you see with
Wireshark. 
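
A minimal sketch of raising that category to debug level from code, 
assuming the logging in the patch goes through Commons Logging (as
HttpClient's does) and that Commons Logging ends up on java.util.logging,
where DEBUG corresponds to Level.FINE.  With log4j the same thing would be
a configuration entry instead.  The class name is hypothetical.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only.
class HttpWireLogging {

    // Returns the logger so the caller can keep a strong reference to it;
    // java.util.logging only holds loggers weakly.
    static Logger enable() {
        Logger logger = Logger.getLogger("org.apache.http");
        logger.setLevel(Level.FINE);

        // The default console handler only prints INFO and above,
        // so attach one that lets FINE through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) {
        Logger logger = enable();
        logger.fine("wire logging enabled for " + logger.getName());
    }
}
```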

If you let me know how to reproduce the issue I'll happily investigate
and try to find out what causes it. If there is anything wrong with
HttpCore I'll get it fixed. 


> In any case, I'm likely going to pursue option #4 a bit more and see if I can 
> figure out the last issue with it.
> 
> From a performance standpoint, for synchronous request/response, none of 
> them perform as well as the in-jdk HttpURLConnection for what we do.   Netty 
> came the closest at about 5% slower.  HC was about 10%, Jetty about 12%.  
> Gave up on Ning before running benchmarks.   
> 

I am quite surprised the difference compared to HttpURLConnection is
relatively small. In my experience a decent blocking HTTP client
outperforms a decent non-blocking HTTP client by as much as 50% as long
as the number of concurrent connections is moderate (<500). NIO starts
paying off only with concurrency well over 2000 or when connections stay
idle most of the time.

Cheers

Oleg


Re: Async http client experiments....

Posted by Daniel Kulp <dk...@apache.org>.
On Friday, July 27, 2012 03:11:18 PM Jeff Genender wrote:
> Any reason you chose not to use Mina?

Two answers:

1)  I would put Mina in the category of lower level than even Netty.   
Probably doable, but would require a lot of work to write all the HTTP 
semantics onto it.   The "asyncweb" subproject of Mina could be a starting 
point, but that hasn't even had a commit to it in over 18 months and has 
never had a release.

2) Are you volunteering to look into it?   :-)



Dan



-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com

Re: Async http client experiments....

Posted by Jeff Genender <jg...@apache.org>.
Any reason you chose not to use Mina?

Jeff
