Posted to dev@httpd.apache.org by Zvi Har'El <rl...@math.technion.ac.il> on 2002/05/22 17:09:27 UTC

Is Apache Proxy Half-Duplex?

Hello,

Experimenting with an Apache Proxy,  I noticed that in version 1.3 (the latest
cvs snapshot) it behaves in a half-duplex fashion. That is, it doesn't read the
backend server response until it has finished transmitting the client's
request body. This is pretty annoying, mainly if the request involves a very
large post (file upload), and the backend server response, after the headers,
says "Please wait patiently...". I wonder: are there any intentions to change
this? It seems that full-duplex operation requires two threads per proxy, which
is not how the Apache proxy server works. Is the situation different, or going
to be different, in Apache 2? Just for reference, the Squid proxy doesn't
suffer from this deficiency.

Thanks for your attention,

Zvi.

-- 
Dr. Zvi Har'El     mailto:rl@math.technion.ac.il     Department of Mathematics
tel:+972-54-227607                   Technion - Israel Institute of Technology
fax:+972-4-8324654 http://www.math.technion.ac.il/~rl/     Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
                                Wednesday, 12 Sivan 5762, 22 May 2002,  6:03PM

Re: Is Apache Proxy Half-Duplex?

Posted by Bill Stoddard <bi...@wstoddard.com>.
I really want to spend some time on this (when I have some time to spend on it).  As part
of this work, we need to consider what needs to be done to the filter API to support an
event driven MPM. Need to get this prioritized above the other stuff I'm working on :-(

Bill

> > From: minfrin@localhost.localdomain [mailto:minfrin@localhost.localdomain]
> >
> > Bill Stoddard wrote:
> >
> > > This is a variation of the problem Aaron and I were interested in with
> > > CGI scripts (and directly related to an open PR against 2.0.36).
> > > Unfortunately, I think filters need some more work to make this possible.
> > > As Will said, we need to be able to poll/select on both the frontend and
> > > backend descriptors and do the right thing. I may be mistaken but I
> > > thought the proxy in 1.3 handled this correctly...
> >
> > The v1.3 proxy was always read request then read response, it never did
> > these two simultaneously.
> >
> > Would the changes to the filters be that drastic? We would in theory
> > just have to poll for incoming data (in either direction), then read
> > brigades from the relevant filter stack...? No...?
>
> I tried to think through how to do this back in November, when I last
> touched the proxy.  The easiest way to do this, is to add a new read
> mode to  input filters, APR_READ_POLL.  Each filter would be responsible
> for returning any data that it has if called in this mode.  If none of
> the filters in the stack have any data, then the filter that has the
> socket must return the socket bucket, but it is allowed to keep a copy
> of the socket itself.  The filter_poll call can then use apr_poll to
> determine which of the sockets have any data.  This wouldn't be the
> cleanest code, but it should work.
>
> Ryan
>
>


Re: Is Apache Proxy Half-Duplex?

Posted by Graham Leggett <mi...@sharp.fm>.
Ryan Bloom wrote:

> I tried to think through how to do this back in November, when I last
> touched the proxy.  The easiest way to do this, is to add a new read
> mode to  input filters, APR_READ_POLL.  Each filter would be responsible
> for returning any data that it has if called in this mode.  If none of
> the filters in the stack have any data, then the filter that has the
> socket must return the socket bucket, but it is allowed to keep a copy
> of the socket itself.  The filter_poll call can then use apr_poll to
> determine which of the sockets have any data.  This wouldn't be the
> cleanest code, but it should work.

How about giving the read-from-filter-stack code the ability to read
from more than one stack simultaneously?

The basic idea would be that you would read from one or more stacks
(where now you can only read from one stack). The read would return a
brigade, and a variable to say which stack the brigade came from. It
would be up to the calling code to decide intelligently on what to do
with the brigade based on which stack it came from.

This way a proxy might implement request pipelining. One or more
requests would arrive via Apache's framework, and as they arrived the
proxy could kick off possibly more than one request to possibly more
than one backend. It would then read from the backends, and possibly
buffer the content, before pipelining it in the correct sequence to the
client. If the backends were slow, this could offer a significant
performance improvement, as the backend would not have to wait till
request A is finished before starting the generation of request B.
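
A minimal sketch of the multi-stack read described above, written in Python
rather than against the real filter API. Socket pairs stand in for backend
connections, and `read_any` and the labels "A" and "B" are hypothetical names,
not Apache or APR functions. It only illustrates a read that reports which
source produced the data, so that responses can be buffered and replayed in
request order.

```python
import selectors
import socket


def read_any(sel, bufsize=4096):
    """Wait until one registered source is readable; return (label, data)."""
    key, _events = sel.select()[0]
    return key.data, key.fileobj.recv(bufsize)


# Socket pairs stand in for two backend connections (an assumption for the demo).
backend_a, proxy_a = socket.socketpair()
backend_b, proxy_b = socket.socketpair()

sel = selectors.DefaultSelector()
sel.register(proxy_a, selectors.EVENT_READ, data="A")
sel.register(proxy_b, selectors.EVENT_READ, data="B")

responses = {}

# Backend B happens to answer first, although request A was sent first.
backend_b.sendall(b"response B")
first = read_any(sel)
responses[first[0]] = first[1]

backend_a.sendall(b"response A")
second = read_any(sel)
responses[second[0]] = second[1]

# Replay the buffered responses to the client in request order.
ordered = [responses[label] for label in ("A", "B")]

sel.close()
for s in (backend_a, proxy_a, backend_b, proxy_b):
    s.close()
```

The caller decides what to do with each chunk based on the label that comes
back with it, which is the part the single-stack read cannot express.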

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: Is Apache Proxy Half-Duplex?

Posted by Aaron Bannert <aa...@clove.org>.
On Fri, May 24, 2002 at 07:20:34PM +0100, Ben Laurie wrote:
> Seems to me that you really want an apr_poll equivalent that works on
> bucket brigades - that would make this clean, and could be quite elegant 
> (IMO).

My thought as well:

http://www.apachelabs.org/apr-mbox/200203.mbox/%3C20020306171049.L10674@clove.org%3E

-aaron

Re: Is Apache Proxy Half-Duplex?

Posted by Ben Laurie <be...@algroup.co.uk>.
Jeff Trawick wrote:
> Ben Laurie <be...@algroup.co.uk> writes:
> 
> 
>>Seems to me that you really want an apr_poll equivalent that works on
>>bucket brigades - that would make this clean, and could be quite
>>elegant (IMO).
> 
> 
> What seems useful (to me) is for apr_poll() to operate on a generic
> I/O handle (instead of apr_socket_t) and for the APR app to be able to
> retrieve the generic I/O handle from an APR socket or pipe or
> whatever.  The bucket code could then be able to return a generic I/O
> handle corresponding to a bucket (extend it for brigades as well).
> 
> (For some bucket types (e.g., heap) perhaps they always appear
> readable, or perhaps trying to retrieve the handle indicates that the
> operation is inappropriate.)

Well, you may want that under the hood, but it clearly ain't right for 
bucket brigades, coz the brigade itself might have data ready.

I suppose write brigades will have to say no at some point. Hmm. 
Non-trivial.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


Re: Is Apache Proxy Half-Duplex?

Posted by Jeff Trawick <tr...@attglobal.net>.
Ben Laurie <be...@algroup.co.uk> writes:

> Seems to me that you really want an apr_poll equivalent that works on
> bucket brigades - that would make this clean, and could be quite
> elegant (IMO).

What seems useful (to me) is for apr_poll() to operate on a generic
I/O handle (instead of apr_socket_t) and for the APR app to be able to
retrieve the generic I/O handle from an APR socket or pipe or
whatever.  The bucket code could then be able to return a generic I/O
handle corresponding to a bucket (extend it for brigades as well).

(For some bucket types (e.g., heap) perhaps they always appear
readable, or perhaps trying to retrieve the handle indicates that the
operation is inappropriate.)

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: Is Apache Proxy Half-Duplex?

Posted by Ben Laurie <be...@algroup.co.uk>.
Ryan Bloom wrote:
>>From: minfrin@localhost.localdomain [mailto:minfrin@localhost.localdomain]
>>
>>Bill Stoddard wrote:
>>
>>>This is a variation of the problem Aaron and I were interested in with
>>>CGI scripts (and directly related to an open PR against 2.0.36).
>>>Unfortunately, I think filters need some more work to make this possible.
>>>As Will said, we need to be able to poll/select on both the frontend and
>>>backend descriptors and do the right thing. I may be mistaken but I
>>>thought the proxy in 1.3 handled this correctly...
>>
>>The v1.3 proxy was always read request then read response, it never did
>>these two simultaneously.
>>
>>Would the changes to the filters be that drastic? We would in theory
>>just have to poll for incoming data (in either direction), then read
>>brigades from the relevant filter stack...? No...?
> 
> 
> I tried to think through how to do this back in November, when I last
> touched the proxy.  The easiest way to do this, is to add a new read
> mode to  input filters, APR_READ_POLL.  Each filter would be responsible
> for returning any data that it has if called in this mode.  If none of
> the filters in the stack have any data, then the filter that has the
> socket must return the socket bucket, but it is allowed to keep a copy
> of the socket itself.  The filter_poll call can then use apr_poll to
> determine which of the sockets have any data.  This wouldn't be the
> cleanest code, but it should work.

Seems to me that you really want an apr_poll equivalent that works on
bucket brigades - that would make this clean, and could be quite elegant 
(IMO).

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


RE: Is Apache Proxy Half-Duplex?

Posted by Ryan Bloom <rb...@covalent.net>.
> From: minfrin@localhost.localdomain [mailto:minfrin@localhost.localdomain]
>
> Bill Stoddard wrote:
>
> > This is a variation of the problem Aaron and I were interested in with
> > CGI scripts (and directly related to an open PR against 2.0.36).
> > Unfortunately, I think filters need some more work to make this possible.
> > As Will said, we need to be able to poll/select on both the frontend and
> > backend descriptors and do the right thing. I may be mistaken but I
> > thought the proxy in 1.3 handled this correctly...
>
> The v1.3 proxy was always read request then read response, it never did
> these two simultaneously.
>
> Would the changes to the filters be that drastic? We would in theory
> just have to poll for incoming data (in either direction), then read
> brigades from the relevant filter stack...? No...?

I tried to think through how to do this back in November, when I last
touched the proxy.  The easiest way to do this is to add a new read
mode to input filters, APR_READ_POLL.  Each filter would be responsible
for returning any data that it has if called in this mode.  If none of
the filters in the stack have any data, then the filter that has the
socket must return the socket bucket, but it is allowed to keep a copy
of the socket itself.  The filter_poll call can then use apr_poll to
determine which of the sockets have any data.  This wouldn't be the
cleanest code, but it should work.
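
A rough sketch of that poll-mode read, in Python instead of the real filter
API. `BufferingFilter`, `read_poll` and `filter_poll` are made-up names for
illustration, and socket pairs stand in for the client and backend
connections. A filter hands back any data it already holds; only if none of
them has data does the code fall through to a select() on the sockets.

```python
import select
import socket


class BufferingFilter:
    """Toy input filter that may hold data it has already read."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = b""

    def read_poll(self):
        """Hypothetical poll-mode read: return held data, else the socket."""
        if self.pending:
            data, self.pending = self.pending, b""
            return data, None
        return None, self.sock


def filter_poll(filters, timeout=1.0):
    """Return (index, data) for the first filter that can produce data."""
    waiting = {}
    for index, flt in enumerate(filters):
        data, sock = flt.read_poll()
        if data is not None:
            return index, data  # buffered data wins, no system call needed
        waiting[sock] = index
    readable, _, _ = select.select(list(waiting), [], [], timeout)
    if not readable:
        return None, b""
    sock = readable[0]
    return waiting[sock], sock.recv(4096)


client_peer, client_side = socket.socketpair()
backend_peer, backend_side = socket.socketpair()
client = BufferingFilter(client_side)
backend = BufferingFilter(backend_side)

backend.pending = b"HTTP/1.1 200 OK\r\n"   # already buffered inside the filter
client_peer.sendall(b"more request body")  # still waiting in the kernel

first = filter_poll([client, backend])
second = filter_poll([client, backend])

for s in (client_peer, client_side, backend_peer, backend_side):
    s.close()
```

Checking the filters before the sockets matters: data held inside a filter
would never show up in a poll on the descriptor alone.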

Ryan



Re: Is Apache Proxy Half-Duplex?

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> This is a variation of the problem Aaron and I were interested in with CGI scripts (and
> directly related to an open PR against 2.0.36).  Unfortunately, I think filters need some
> more work to make this possible. As Will said, we need to be able to poll/select on both
> the frontend and backend descriptors and do the right thing. I may be mistaken but I
> thought the proxy in 1.3 handled this correctly...

The v1.3 proxy was always read request then read response, it never did
these two simultaneously.

Would the changes to the filters be that drastic? We would in theory
just have to poll for incoming data (in either direction), then read
brigades from the relevant filter stack...? No...?

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: Is Apache Proxy Half-Duplex?

Posted by Bill Stoddard <bi...@wstoddard.com>.
> "William A. Rowe, Jr." wrote:
>
> > >Half duplex in the sense that a reply follows a request. ie a reply does
> > >not get sent during a request.
> >
> > Cannot, Should not, or generally Does not?
> >
> > POST accept modules might certainly echo...
> >
> > Headers:...
> >
> > Banners of the next page
> > Accepting Input ... [long pause]
> > Processing Results ... [long pause]
> >
> > With the caveats that you can't begin a response body if you potentially
> > expect to error out on the results, and there are no promises that this will
> > ever be rendered, but that's not the point.  If you can find in the HTTP spec
> > where this is disallowed, please point me at it!
>
> Ok, then I've misunderstood this.
>
> The bottom line is that we must be able to read a request and read a
> reply simultaneously using filters. Is this possible?

This is a variation of the problem Aaron and I were interested in with CGI scripts (and
directly related to an open PR against 2.0.36).  Unfortunately, I think filters need some
more work to make this possible. As Will said, we need to be able to poll/select on both
the frontend and backend descriptors and do the right thing. I may be mistaken but I
thought the proxy in 1.3 handled this correctly...
>

> One point where we need this is in the CONNECT proxy, which needs to
> read bytes from both the foreign server and client simultaneously.
>
Yep


Bill


Re: Is Apache Proxy Half-Duplex?

Posted by Graham Leggett <mi...@sharp.fm>.
"William A. Rowe, Jr." wrote:

> >Half duplex in the sense that a reply follows a request. ie a reply does
> >not get sent during a request.
> 
> Cannot, Should not, or generally Does not?
> 
> POST accept modules might certainly echo...
> 
> Headers:...
> 
> Banners of the next page
> Accepting Input ... [long pause]
> Processing Results ... [long pause]
> 
> With the caveats that you can't begin a response body if you potentially
> expect to error out on the results, and there are no promises that this will
> ever be rendered, but that's not the point.  If you can find in the HTTP spec
> where this is disallowed, please point me at it!

Ok, then I've misunderstood this.

The bottom line is that we must be able to read a request and read a
reply simultaneously using filters. Is this possible?

One point where we need this is in the CONNECT proxy, which needs to
read bytes from both the foreign server and client simultaneously.
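
For illustration, the simultaneous read in both directions can be sketched
with a single select() loop and no second thread. This is plain Python over
socket pairs, not the filter API; `relay` is a hypothetical helper, and the
"browser" and "origin" ends are stand-ins for the real peers.

```python
import select
import socket


def relay(client, backend, bufsize=4096):
    """Copy bytes in both directions until both peers have finished sending."""
    peer = {client: backend, backend: client}
    open_sources = {client, backend}
    while open_sources:
        readable, _, _ = select.select(list(open_sources), [], [])
        for src in readable:
            data = src.recv(bufsize)
            if data:
                peer[src].sendall(data)
            else:
                # EOF on this side: stop reading it and pass the half-close on.
                open_sources.discard(src)
                peer[src].shutdown(socket.SHUT_WR)


browser, client = socket.socketpair()
origin, backend = socket.socketpair()

# Both ends have something to say before the other one is finished.
browser.sendall(b"upload chunk")
origin.sendall(b"Please wait patiently...")
browser.shutdown(socket.SHUT_WR)
origin.shutdown(socket.SHUT_WR)

relay(client, backend)

to_origin = origin.recv(4096)
to_browser = browser.recv(4096)

for s in (browser, client, origin, backend):
    s.close()
```

The blocking sendall() keeps the sketch short; a real proxy would also have
to poll for writability so that a slow reader on one side cannot stall the
other direction.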

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: Is Apache Proxy Half-Duplex?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 09:00 AM 5/23/2002, Graham Leggett wrote:
>Bill Stoddard wrote:
>
> > > HTTP v1.1 is a half duplex protocol - this is 100% correct behavior.
> >
> > Where in the spec does it say that?
>
>Half duplex in the sense that a reply follows a request. ie a reply does
>not get sent during a request.

Cannot, Should not, or generally Does not?

POST accept modules might certainly echo...

Headers:...

Banners of the next page
Accepting Input ... [long pause]
Processing Results ... [long pause]

With the caveats that you can't begin a response body if you potentially
expect to error out on the results, and there are no promises that this will
ever be rendered, but that's not the point.  If you can find in the HTTP spec
where this is disallowed, please point me at it!

>(The 100-continue handling I understand is an exception to this, but I
>think this can be ignored for this example).

And there your argument falls down on its face.

Two threads is probably not the way to go... Taking a CGI example, we
probably want to poll on all three sources [client body socket read, stdout
and stderr] and both sinks [server response socket write and stdin].

We probably need several accessor bits in the core filter to actually make
this work, where the module wants the server to cooperate in this manner.
It won't be pretty.
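
As a rough, POSIX-only illustration of the single-threaded polling idea, the
Python sketch below feeds a child's stdin while draining its stdout and
stderr from one loop. It covers only the three pipes; the two client-side
sockets are left out, and `run_child`, `CHUNK` and the child script are all
invented for the example rather than taken from mod_cgi.

```python
import os
import selectors
import subprocess
import sys

CHUNK = 512  # small enough that a write will not block once the pipe polls writable

CHILD_SCRIPT = r"""
import sys
sys.stdout.write("Accepting Input ...\n")
sys.stdout.flush()
size = len(sys.stdin.buffer.read())
sys.stderr.write("read %d bytes\n" % size)
sys.stdout.write("Processing Results ...\n")
"""


def run_child(argv, body):
    """Feed body to the child's stdin while draining its stdout and stderr."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            bufsize=0)
    sel = selectors.DefaultSelector()
    captured = {"stdout": b"", "stderr": b""}
    if body:
        sel.register(proc.stdin, selectors.EVENT_WRITE, "stdin")
    else:
        proc.stdin.close()
    sel.register(proc.stdout, selectors.EVENT_READ, "stdout")
    sel.register(proc.stderr, selectors.EVENT_READ, "stderr")
    sent = 0
    while sel.get_map():
        for key, _events in sel.select():
            if key.data == "stdin":
                sent += os.write(key.fd, body[sent:sent + CHUNK])
                if sent == len(body):
                    sel.unregister(proc.stdin)
                    proc.stdin.close()  # EOF tells the child the body is done
            else:
                data = os.read(key.fd, 4096)
                if data:
                    captured[key.data] += data
                else:
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
    proc.wait()
    return captured["stdout"], captured["stderr"]


# The body is larger than a typical pipe buffer, so the child has to be
# read from and written to at the same time for this to finish.
out, err = run_child([sys.executable, "-c", CHILD_SCRIPT], b"x" * 100000)
```

The child emits its first line of output before it has read any input, which
is the "Accepting Input ... [long pause]" case from earlier in this message.
The sketch does not handle a child that exits without reading its input.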

Bill


Re: Is Apache Proxy Half-Duplex?

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> > HTTP v1.1 is a half duplex protocol - this is 100% correct behavior.
> 
> Where in the spec does it say that?

Half duplex in the sense that a reply follows a request. ie a reply does
not get sent during a request.

(The 100-continue handling I understand is an exception to this, but I
think this can be ignored for this example).

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: Is Apache Proxy Half-Duplex?

Posted by Bill Stoddard <bi...@wstoddard.com>.
> Zvi Har'El wrote:
> 
> > Experimenting with an Apache Proxy,  I noticed that in version 1.3 (the latest
> > cvs snapshot) it behaves in a half-duplex fashion. That is, it doesn't read the
> > backend server response until it has finished transmitting the client's
> > request body.
> 
> HTTP v1.1 is a half duplex protocol - this is 100% correct behavior.

Where in the spec does it say that?

Bill



Re: Is Apache Proxy Half-Duplex?

Posted by Graham Leggett <mi...@sharp.fm>.
Zvi Har'El wrote:

> Experimenting with an Apache Proxy,  I noticed that in version 1.3 (the latest
> cvs snapshot) it behaves in a half-duplex fashion. That is, it doesn't read the
> backend server response until it has finished transmitting the client's
> request body.

HTTP v1.1 is a half duplex protocol - this is 100% correct behavior.

> This is pretty annoying, mainly if the request involves a very
> large post (file upload), and the backend server response, after the headers,
> says "Please wait patiently...". I wonder: are there any intentions to change
> this?

I doubt it. Would have to change HTTP.

> It seems that full-duplex operation requires two threads per proxy, which
> is not how the Apache proxy server works. Is the situation different, or going
> to be different, in Apache 2? Just for reference, the Squid proxy doesn't
> suffer from this deficiency.

Can you explain better exactly what the proxy is doing that you think is
wrong?

There is no way in the HTTP protocol for the server to start responding
before the request is completely uploaded, for obvious reasons. I don't
understand how Squid could be doing this.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: Is Apache Proxy Half-Duplex?

Posted by Igor Sysoev <is...@rambler-co.ru>.
On Wed, 22 May 2002, Zvi Har'El wrote:

> Experimenting with an Apache Proxy,  I noticed that in version 1.3 (the latest
> cvs snapshot) it behaves in a half-duplex fashion. That is, it doesn't read the
> backend server response until it has finished transmitting the client's
> request body. This is pretty annoying, mainly if the request involves a very
> large post (file upload), and the backend server response, after the headers,
> says "Please wait patiently...". I wonder: are there any intentions to change
> this? It seems that full-duplex operation requires two threads per proxy, which
> is not how the Apache proxy server works. Is the situation different, or going
> to be different, in Apache 2? Just for reference, the Squid proxy doesn't
> suffer from this deficiency.

If you use Apache 1.3 then you can try mod_accel.
mod_accel uses temporary files if the backend response or client POST
is bigger than the memory buffer.

Igor Sysoev
http://sysoev.ru