Posted to dev@cxf.apache.org by Sergey Beryozkin <se...@iona.com> on 2007/05/22 13:03:13 UTC

Re: Problems with Chunking

Hi

Are there any plans for CXF to support chunking for (POST) requests?
For client applications uploading large binary attachments, that would be a big plus...

Cheers, Sergey
 

> -----Original Message-----
> From: Dan Diephouse [mailto:dan@envoisolutions.com] 
> Sent: 20 February 2007 17:14
> To: cxf-dev@incubator.apache.org
> Subject: Re: Problems with Chunking
> 
> On 2/20/07, Glynn, Eoghan <eo...@iona.com> wrote:
> >
> >
> >
> > > Are the redirect/authentication cases in particular HTTP server bugs
> > > or limitations of HTTP?
> > >
> > > It sounds like the latter.
> >
> > Well, you could call the 401/30x response codes either a feature or 
> > limitation of HTTP, depending on your point of view.
> >
> > > I suppose we could keep a list of servers that we should
> > > default to non-chunked, but it sounds like that doesn't help the
> > > other cases.
> > >
> > > How about this counter-counter proposal :-) It seems we have a lot
> > > of cases which actually require non-chunked requests:
> > > - broken servers
> > > - authentication
> > > - redirects
> >
> > While superficially these scenarios look similar, if we attempt to
> > solve the problem adaptively there are two crucial differences that
> > become apparent:
> >
> > 1. We might infer that a single 401 is possible/probable, and hence
> > buffer only those initial outgoing request(s) up to the receipt of the
> > 401, before reverting to chunked. Hence the initial cost of buffering
> > the requests (often just one) could be amortized over many subsequent
> > unbuffered requests. But to avoid the ASP.NET bug, we cannot *ever*
> > fall back to chunking (hence my objection to your idea to buffer only
> > up to the first 100k).
> >
> > 2. While we can take an educated guess in the HTTPConduit that a 401
> > *may* occur (e.g. the user has gone to the trouble of configuring an
> > AuthorizationPolicy, so obviously they're expecting a challenge),
> > we never know for sure when or if a 401 or redirect will be received.
> > On the other hand, if the server-stack support for incoming chunks is
> > broken, it's not going to magically fix itself in the middle of a
> > sequence of requests.
> >
> > So it seems two different adaptive strategies might be appropriate 
> > here ...
> >
> > For 401/30x: apply a heuristic to decide if a 401/30x is likely, and
> > if so buffer up outgoing requests until a 401/30x is received, then
> > transparently resend and revert to chunking. If we wanna get fancy,
> > allow for multiple 401/30x (e.g. a redirect followed by a challenge).
> > So we're unsure upfront, thus we temporarily disable chunking for this
> > target endpoint.
> >
> > For broken servers: probe upfront with an innocuous GET to see if the
> > server barfs, and if so *permanently* turn off chunking. So we can be
> > sure upfront, and then permanently disable chunking for this target
> > endpoint.
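
The two strategies described above could be sketched roughly as a small
per-endpoint state holder. This is only an illustration: the class
AdaptiveChunking, its Mode enum and all method names are hypothetical and
not part of CXF, and the "fancy" case of several 401/30x responses in a
row is deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-endpoint chunking decision; not a CXF class. */
final class AdaptiveChunking {
    enum Mode { BUFFER_UNTIL_CHALLENGE, CHUNKED, NEVER_CHUNK }

    private final Map<String, Mode> modes = new ConcurrentHashMap<>();

    /** Start buffered if a 401/30x looks likely (e.g. credentials are configured). */
    void register(String endpoint, boolean challengeLikely) {
        modes.put(endpoint, challengeLikely ? Mode.BUFFER_UNTIL_CHALLENGE : Mode.CHUNKED);
    }

    /** Result of the upfront probe: a failed probe disables chunking for good. */
    void probeResult(String endpoint, boolean serverAcceptsChunks) {
        if (!serverAcceptsChunks) {
            modes.put(endpoint, Mode.NEVER_CHUNK);
        }
    }

    /** Feed each response status back in; a 401 or redirect ends the buffering phase. */
    void onResponse(String endpoint, int status) {
        boolean challengeOrRedirect =
            status == 401 || status == 301 || status == 302 || status == 303 || status == 307;
        if (challengeOrRedirect) {
            // Only the temporary buffering mode reverts to chunking; NEVER_CHUNK is sticky.
            modes.computeIfPresent(endpoint,
                (key, mode) -> mode == Mode.BUFFER_UNTIL_CHALLENGE ? Mode.CHUNKED : mode);
        }
    }

    /** Unknown endpoints are treated as non-chunked. */
    boolean useChunking(String endpoint) {
        return modes.get(endpoint) == Mode.CHUNKED;
    }
}
```

A caller would consult useChunking() before each request and report the
status of each response through onResponse().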
> 
> 
> I don't see how this tells us if the server supports chunked
> requests or not. GETs don't have a request body, hence it can't be tested.


AFAIK, TRACE is the only HTTP 1.1 method that's explicitly disallowed from
including an entity body.

So I think we probably could fabricate a GET with a chunked
transfer-encoding, and a single chunk of size zero.

A quick play with the java.net.HttpURLConnection suggests though that it
would frustrate us in doing so ... HttpURLConnection.getOutputStream()
changes a GET method to POST, and then barfs if the method is anything
other than POST or PUT.
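
For anyone who wants to see the method switch for themselves, here is a
minimal sketch. The class name GetWithBodyDemo is made up, and the
throwaway loopback listener is only there so the connection has something
to connect to. On the JDKs I am aware of, the reported method comes back
as POST; whether other methods are then rejected seems to vary by JDK
version, so treat the behaviour as something to check rather than a
guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.URI;

/** Hypothetical demo class; shows what getOutputStream() does to a GET. */
final class GetWithBodyDemo {

    /** Returns the request method reported after asking a GET connection for its output stream. */
    static String methodAfterGetOutputStream() {
        // Loopback listener that never reads or answers; the client only needs to connect.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            URI uri = URI.create("http://127.0.0.1:" + server.getLocalPort() + "/probe");
            HttpURLConnection conn =
                (HttpURLConnection) uri.toURL().openConnection(Proxy.NO_PROXY);
            conn.setRequestMethod("GET");
            conn.setDoOutput(true);
            conn.getOutputStream();          // no body is written here
            String method = conn.getRequestMethod();
            conn.disconnect();
            return method;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Method after getOutputStream(): " + methodAfterGetOutputStream());
    }
}
```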

We could try using PUT as the probe method, but I guess PUTs are often
rejected for other reasons.

Otherwise we could by-pass java.net.HttpURLConnection, open a direct
socket connection, and slap on a GET ...
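
A rough sketch of such a direct-socket probe, sending a GET with a chunked
transfer-encoding and a single zero-size chunk, might look like this. The
class ChunkedProbe and its method names are hypothetical, and how to
interpret the returned status (which codes count as the server "barfing")
is left open as an assumption to be settled per server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical probe that bypasses HttpURLConnection; not a CXF class. */
final class ChunkedProbe {

    /** Builds a GET whose body is a single terminating chunk of size zero. */
    static String buildRequest(String hostHeader, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + hostHeader + "\r\n"
             + "Transfer-Encoding: chunked\r\n"
             + "Connection: close\r\n"
             + "\r\n"
             + "0\r\n"    // last chunk: size zero
             + "\r\n";    // end of the (empty) trailer
    }

    /** Sends the probe and returns the status code of the response, or -1 if unparsable. */
    static int probe(String host, int port, String path) throws IOException {
        String hostHeader = port == 80 ? host : host + ":" + port;
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000);   // do not hang forever on a silent server
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(hostHeader, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            return parseStatus(readLine(socket.getInputStream()));
        }
    }

    /** Extracts the numeric code from a status line such as "HTTP/1.1 200 OK". */
    static int parseStatus(String statusLine) {
        String[] parts = statusLine.split(" ");
        if (parts.length < 2) {
            return -1;
        }
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Reads bytes up to the first line feed, dropping carriage returns. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                line.append((char) c);
            }
        }
        return line.toString();
    }
}
```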

Anyhoo, just an idea ...

 
> > > So why not turn off chunking by default and put in a log
> > > message which states something to the effect of: "HTTP chunking is
> > > turned off by default for compatibility reasons.
> > > For possible performance improvements, try enabling chunking."
> >
> > Why turn off an optimization by default to facilitate working around a
> > bug in a single server-stack?
> 
> 
> My assertion this whole time is that it isn't just a single server stack.
> There are many that have this problem. I'm trying to come up with a list,
> but people don't go around collecting data on this and leaving it on the
> internet for me, unfortunately.


OK, but it's not rocket science for the server-stack to support incoming
chunks, and there are definitely plenty of examples around that do it
nicely (Jetty and Tomcat to name but two).

/Eoghan


> > If we don't think it's worth the trouble to use an adaptive approach as
> > outlined above, why not just add a note to the docco ... "there are
> > known problems with ASP.NET consuming the chunked transfer-encoding,
> > which can be worked around by disabling chunking with the following
> > configuration fragment ..."?
> 
> 
> We'll definitely need docs on it, but it's not just an ASP.NET problem.
>
> > Then clients of ASP.NET can work around the server bug, while clients of
> > any other server-stack can continue to get the benefit of chunked
> > requests without having to change their config.
> >
> > > For small requests (i.e. a couple K, and the most common), it's
> > > likely to be the same performance, as woodstox wraps the outputstream
> > > in a BufferedOutputStream.  Is performance the only reason you want
> > > it turned on by default?
> >
> > Yep, performance for large outgoing requests is the concern.
> 
> 
> 
>  - Dan
> 
> --
> Dan Diephouse
> Envoi Solutions
> http://envoisolutions.com | http://netzooid.com/blog
>