Posted to modperl@perl.apache.org by Mark James <mr...@bigpond.net.au> on 2003/03/05 13:13:02 UTC

[mp2] CGI redirects incorrectly handled?

I'm having CGI redirect problems with mp2 (cvs).

Instead of being redirected to the proper web page, I'm sometimes
getting a "302 Moved" page containing a link to the correct URL.

Seems to be related to the following code in modperl_cgi.c:

if (location && (location[0] == '/') && (r->status == 200)) {
         r->method = apr_pstrdup(r->pool, "GET");
         r->method_number = M_GET;


The Location field I'm redirecting to is a fully-qualified URL,
starting with http://, but still at the local server.  A debug
put in above this code confirms that "location" is set to a
string starting with "http".

Why is the test for "location[0] == '/'" there?  Which section
of code is usually responsible for stripping off the server
part of the address if it is local?

Mark


ImageMagick

Posted by Lee Goddard <ho...@LeeGoddard.com>.

A while ago someone on this list experienced a similar problem
to myself, with ImageMagick crashing an Apache mod_perl server.

If it was you, and if that problem related to Writing an image,
please get in touch off list: I've been given a potential fix for it.

Cheers
lee



Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Stas Bekman wrote:
> 
>> Can you send a short script (removing all the irrelevant bits) that we 
>> can reproduce the problem with?
> 
> 
> Made a script that generated the same POST request and same
> redirect as the problem code.  The problem was not reproduced!
> 
> The only difference I can see between working POSTs (both those
> in my package code and the one in the test script) and the failing
> one (a particular one in my package) is in the distribution of the
> data across the TCP packets that carry the POST.

[snipped the packet dumps]

> Could mod_perl, with its persistent buffer, be tripping up on this?
> I'm trying now to trace the data through the mp2 code. -- Mark

mod_perl 2.0 buffers only the content (the response body, not the headers),
assuming that you aren't installing any output filters, so it's one of the
Apache core output filters that decides how to split the data.

A normal output filter list ends with:

byterange(0x8841110): ctx=0x0, r=0x88402d0, c=0x883a390
content_length(0x8841128): ctx=0x0, r=0x88402d0, c=0x883a390
http_header(0x8841140): ctx=0x0, r=0x88402d0, c=0x883a390
core(0x883a760): ctx=0x883a738, r=0x0, c=0x883a390

You can dump this using the dump_filters macro, which you can load by running

   gdb> source /path/to/httpd-2.0/.gdbinit

The 'core' filter is in httpd-2.0/src/core.c: core_output_filter(...)

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:

> Can you send a short script (removing all the irrelevant bits) that we 
> can reproduce the problem with?

Made a script that generated the same POST request and same
redirect as the problem code.  The problem was not reproduced!

The only difference I can see between working POSTs (both those
in my package code and the one in the test script) and the failing
one (a particular one in my package) is in the distribution of the
data across the TCP packets that carry the POST.

Packet 1 always has the POST request and a set of headers.

When it works (a 302 is sent), Packet 2 has the Content-Type
header, Packet 3 has the Content-Length header, and Packet 4
has the POSTed variables:

PACKET 2:
0x0000   4500 0065 f659 4000 4006 4993 9084 ecce        E..e.Y@.@.I.....
0x0010   9084 ecce 9ea1 0050 482e 99c3 4828 27ae        .......PH...H('.
0x0020   8018 7fff cd14 0000 0101 080a 00a9 6185        ..............a.
0x0030   00a9 6185 436f 6e74 656e 742d 5479 7065        ..a.Content-Type
0x0040   3a20 6170 706c 6963 6174 696f 6e2f 782d        :.application/x-
0x0050   7777 772d 666f 726d 2d75 726c 656e 636f        www-form-urlenco
0x0060   6465 640d 0a                                   ded..

PACKET 3:
0x0000   4500 004b f65a 4000 4006 49ac 9084 ecce        E..K.Z@.@.I.....
0x0010   9084 ecce 9ea1 0050 482e 99f4 4828 27ae        .......PH...H('.
0x0020   8018 7fff d88d 0000 0101 080a 00a9 6185        ..............a.
0x0030   00a9 6185 436f 6e74 656e 742d 4c65 6e67        ..a.Content-Leng
0x0040   7468 3a20 3334 320d 0a0d 0a                    th:.342....

PACKET 4:
0x0000   4500 018a f65b 4000 4006 486c 9084 ecce        E....[@.@.Hl....
0x0010   9084 ecce 9ea1 0050 482e 9a0b 4828 27ae        .......PH...H('.
0x0020   8018 7fff 945c 0000 0101 080a 00a9 6185        .....\........a.
0x0030   00a9 6185 5f70 6173 735f 6964 3d75 7365        ..a._pass_id=use
0x0040   7233 2534 306d 616b 6574 6865 6361 7365        r3%40makethecase
0x0050   2e6e 6574 265f 7061 7373 5f70 6173 733d        .net&_pass_pass=
[rest of packet cut]


But when it fails (a 200 with a 302 link is sent),
all is in the one packet:

0x0000   4500 01af 1d11 4000 4006 2192 9084 ecce        E.....@.@.!.....
0x0010   9084 ecce 9faf 0050 f769 cb03 f764 9ec7        .......P.i...d..
0x0020   8018 bb9e fe65 0000 0101 080a 00ad a4b7        .....e..........
0x0030   00ad a4b7 436f 6e74 656e 742d 5479 7065        ....Content-Type
0x0040   3a20 6170 706c 6963 6174 696f 6e2f 782d        :.application/x-
0x0050   7777 772d 666f 726d 2d75 726c 656e 636f        www-form-urlenco
0x0060   6465 640d 0a43 6f6e 7465 6e74 2d4c 656e        ded..Content-Len
0x0070   6774 683a 2033 3037 0d0a 0d0a 5f70 6173        gth:.307...._pas
0x0080   735f 6964 3d75 7365 7233 2534 306d 616b        s_id=user3%40mak
0x0090   6574 6865 6361 7365 2e6e 6574 265f 7061        ethecase.net&_pa
0x00a0   7373 5f70 6173 733d                            ss_pass=
[rest of packet cut]


Could mod_perl, with its persistent buffer, be tripping up on this?
I'm trying now to trace the data through the mp2 code. -- Mark


Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Further down modperl_cgi.c has:

     else if (location && (r->status == 200)) {
         MP_dRCFG;

         /* Note that if a script wants to produce its own Redirect
          * body, it now has to explicitly *say* "Status: 302"
          */

but it seems that it's being done already by CGI.pm.
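
For reference, the two Location branches quoted so far can be modeled in
isolation. This is only a minimal sketch, not the modperl_cgi.c source:
classify_location and the enum names are made up for illustration, and it
assumes (as in mod_cgi, from which this code was copied) that the second
branch ends in a plain 302 response.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the two conditions mirror the quoted code. */
enum redirect_kind {
    REDIRECT_NONE,      /* no Location, or the script set its own Status */
    REDIRECT_INTERNAL,  /* Location begins with '/': served locally as a GET */
    REDIRECT_EXTERNAL   /* any other Location: answered with a 302 (assumed) */
};

enum redirect_kind classify_location(const char *location, int status)
{
    assert(status >= 100 && status <= 599);

    /* Both quoted branches require a Location header and r->status == 200. */
    if (location == NULL || status != 200) {
        return REDIRECT_NONE;
    }
    /* First branch: location[0] == '/' */
    if (location[0] == '/') {
        return REDIRECT_INTERNAL;
    }
    /* Second branch: any other Location value */
    return REDIRECT_EXTERNAL;
}
```

In this model, an absolute http:// Location printed together with an explicit
"Status: 302" matches neither branch, because the status is no longer 200 by
the time the test runs.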

Can you send a short script (removing all the irrelevant bits) that we can 
reproduce the problem with?




Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Mark James wrote:

> "303 See Other" is the correct post-POST redirect response:
>     http://rfc.net/rfc2616.html#s10.3.4
> which your first link suggests works in all browsers.

Well, taking a closer look, 303 doesn't work in Netscape 3 or 4.


> CGI.pm always returns a 302, though, if necessary, I can edit
> its reply in my script before printing it.  I'll give this a go.

This didn't work.  Got a 303 link page instead of a 302 one.


Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:
> Mark James wrote:
>> No, the problem only manifests under mod_perl (2, haven't used 1).
> 
> Sorry, I'm not following your comment. I suggested testing with 
> mod_cgi (under Apache2), since mod_perl mimics mod_cgi's behavior here.

I changed the handler in httpd.conf from perl-script to cgi-script
and the problem went away.


> Should POST-redirect return 307?
> http://ppewww.ph.gla.ac.uk/~flavell/www/post-redirect.html
> http://rfc.net/rfc2616.html#s10.3.8

"303 See Other" is the correct post-POST redirect response:
	http://rfc.net/rfc2616.html#s10.3.4
which your first link suggests works in all browsers.

CGI.pm always returns a 302, though, if necessary, I can edit
its reply in my script before printing it.  I'll give this a go.

Thanks -- Mark




Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Stas Bekman wrote:
> 
>> Mark James wrote:
>>
>>> I'm having CGI redirect problems mp2 (cvs).
>>
>>
>> as the comment just above this line says, that code was copy-n-pasted 
>> from mod_cgi. Can you reproduce the same problem while running a cgi 
>> script?
> 
> 
> No, the problem only manifests under mod_perl (2, haven't used 1).

Sorry, I'm not following your comment. I suggested testing with mod_cgi 
(under Apache2), since mod_perl mimics mod_cgi's behavior here.

>> Also, could it be related to the recent change I applied, which 
>> Beau has already reported as broken? Maybe your headers don't get 
>> parsed. What happens if you do:
>> [patch]
> 
> 
> Applied the patch, but the problem still occurred. No change also
> when I commented out the location[0]=='/' test.
> 
> The redirect header being printed by my perl script is:
> 
> Server: Apache/2.0.44 (Unix) mod_perl/1.99_09-dev Perl/v5.8.0
> Status: 302 Moved
> Date: Thu, 06 Mar 2003 01:10:54 GMT
> Location: 
> http://makethecase.net/db?auth=ckffb2a5c44ee0&editCmds=compact&file=62
> 
> Which is returned as a 302 link page.  This is a redirect response to a 
> POST.
> 
> Strangely, another redirect, with header:
> 
> Server: Apache/2.0.44 (Unix) mod_perl/1.99_09-dev Perl/v5.8.0
> Status: 302 Moved
> Date: Thu, 06 Mar 2003 01:15:54 GMT
> Location: 
> http://makethecase.net/db?_reason=6%20Case1Pro&_restart=editPart&checkSequenceNumber=60&cmd=authenticate&editCmds=compact&file=62&partnum= 
> 
> 
> works just fine.  This is a redirect after a GET.

Should POST-redirect return 307?
http://ppewww.ph.gla.ac.uk/~flavell/www/post-redirect.html
http://rfc.net/rfc2616.html#s10.3.8



Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:
> Mark James wrote:
>> I'm having CGI redirect problems mp2 (cvs).
> 
> as the comment just above this line says, that code was copy-n-pasted 
> from mod_cgi. Can you reproduce the same problem while running a cgi 
> script?

No, the problem only manifests under mod_perl (2, haven't used 1).

> Also, could it be related to the recent change I applied, which Beau 
> has already reported as broken? Maybe your headers don't get parsed. 
> What happens if you do:
> [patch]

Applied the patch, but the problem still occurred. No change also
when I commented out the location[0]=='/' test.

The redirect header being printed by my perl script is:

Server: Apache/2.0.44 (Unix) mod_perl/1.99_09-dev Perl/v5.8.0
Status: 302 Moved
Date: Thu, 06 Mar 2003 01:10:54 GMT
Location: http://makethecase.net/db?auth=ckffb2a5c44ee0&editCmds=compact&file=62

Which is returned as a 302 link page.  This is a redirect response to a POST.

Strangely, another redirect, with header:

Server: Apache/2.0.44 (Unix) mod_perl/1.99_09-dev Perl/v5.8.0
Status: 302 Moved
Date: Thu, 06 Mar 2003 01:15:54 GMT
Location: http://makethecase.net/db?_reason=6%20Case1Pro&_restart=editPart&checkSequenceNumber=60&cmd=authenticate&editCmds=compact&file=62&partnum=

works just fine.  This is a redirect after a GET.

Mark


Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> I'm having CGI redirect problems mp2 (cvs).
> 
> Instead of being redirected to the proper web page, I'm sometimes
> getting a "302 Moved" page containing a link to the correct URL.
> 
> Seems to be related to the following code in modperl_cgi.c:
> 
> if (location && (location[0] == '/') && (r->status == 200)) {
>         r->method = apr_pstrdup(r->pool, "GET");
>         r->method_number = M_GET;
> 
> 
> The Location field I'm redirecting to is a fully-qualified URL,
> starting with http://, but still at the local server.  A debug
> put in above this code confirms that "location" is set to a
> string starting with "http".
> 
> Why is the test for "location[0] == '/'" there?  Which section
> of code is usually responsible for stripping off the server
> part of the address if it is local?

as the comment just above this line says, that code was copy-n-pasted from 
mod_cgi. Can you reproduce the same problem while running a cgi script?

Also, could it be related to the recent change I applied, which Beau has 
already reported as broken? Maybe your headers don't get parsed. What happens 
if you do:

Index: src/modules/perl/modperl_filter.c
===================================================================
RCS file: /home/cvs/modperl-2.0/src/modules/perl/modperl_filter.c,v
retrieving revision 1.54
diff -u -r1.54 modperl_filter.c
--- src/modules/perl/modperl_filter.c   3 Mar 2003 03:39:06 -0000       1.54
+++ src/modules/perl/modperl_filter.c   5 Mar 2003 23:15:44 -0000
@@ -55,7 +55,7 @@
      apr_bucket *bucket;
      const char *work_buf = buf;

-    if (wb->header_parse && !wb->r->content_type) {
+    if (wb->header_parse) {
          request_rec *r = wb->r;
          const char *bodytext = NULL;
          int status;





Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Stas Bekman wrote:

> As I wrote this, I'm actually starting to think that it's Apache who 
> should ignore the flush bucket if it had seen no other data so far, and 
> not generate any headers till it actually sees the real data.

But when I went to produce a patch in http_filter, I figured that it would be 
wrong for the same reason that mod_perl shouldn't handle this as a special 
case: that behavior might be a desired one.

Another workaround for your problem could be a custom output filter that 
drops any bucket brigade containing only a flush bucket, if it has not yet 
seen any real data buckets.
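
The decision such a filter would make can be sketched without APR. This is
only a model under stated assumptions: the bucket enum, filter_state and
pass_brigade are hypothetical names, and a real filter would walk an
apr_bucket_brigade instead of a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the bucket kinds a filter would see. */
enum bucket_type { BUCKET_DATA, BUCKET_FLUSH, BUCKET_EOS };

/* Per-request filter context: has any real data gone by yet? */
struct filter_state {
    bool seen_data;
};

/*
 * Returns true if the brigade should be passed to the next filter,
 * false if it should be dropped. A brigade is dropped only when it
 * holds nothing but flush buckets and no data has been seen so far.
 */
bool pass_brigade(struct filter_state *st,
                  const enum bucket_type *brigade, size_t n)
{
    assert(st != NULL);
    assert(brigade != NULL || n == 0);

    bool only_flush = (n > 0);
    for (size_t i = 0; i < n; i++) {
        if (brigade[i] != BUCKET_FLUSH) {
            only_flush = false;
        }
        if (brigade[i] == BUCKET_DATA) {
            st->seen_data = true;
        }
    }
    return !(only_flush && !st->seen_data);
}
```

Once a data bucket has passed through, later flush-only brigades are let
through unchanged, so a handler that streams output still gets its flushes.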



Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Stas Bekman wrote:
> 
>>> So can flushing be held off until either (1) blank line is printed,
>>> (2) the 8k buffer fills, or (3) send_http_header is called?
>>
>>  
>> 1) is relevant only for handlers that print headers, rather than set them
>> 2) absolutely not, what if you want to flush data before?
>> 3) send_http_header doesn't exist in Apache2/mod_perl2
> 
> 
> I didn't realise that mp2 doesn't use send_http_header.  That explains
> the appearance of wb->r->content_type in the mp2 code. 

I'm not quite happy yet about the current situation with send_http_header API 
removal. Currently an explicit call to $r->content_type (only in mod_perl 
handlers) turns the headers parsing off, if it was on. Which works fine as a 
replacement for $r->send_http_header. However it's possible that some earlier 
phase calls $r->content_type and the response phase still wants to *print* its 
headers, rather than use API to set them. If that's the case, we are in trouble.

I'll be soon working on providing the API for PerlOptions and other config 
options. And while most of these things are read-only, I'm thinking that I 
might be able to add a read/write accessor for ParseHeaders. So one can turn 
the parsing on and off, regardless of the setting in httpd.conf. I 
believe that would be the perfect solution, since it'll give developers 
total flexibility.

> So is it
> true that if headers are sent using the API then no output filtering
> and transmission occurs until the 8k buffer is either filled or flushed
> (explicitly or after exit)?

That's correct. Though the headers are really sent by the Apache core http 
output filter, once we send the response body.

>> Only in the case that your handler is configured with:
>>
>>   PerlOptions +ParseHeaders
>>
>> *and*
>>
>> it prints headers ala:
>>
>>   print "Content-type: ...."
>>
>> In all other cases where headers are set via the API, e.g. 
>> $r->content_type, $r->headers_out, etc, there is no such thing as 
>> "the handler has sent an empty line signaling the end of sending 
>> headers", because it never sends any headers at all, but uses the API to 
>> set them.
> 
> 
> Is +ParseHeaders always indicative of explicit header printing, or
> can it also be set when using the API?  If the former, then yes, if
> +ParseHeaders is set flushing could be held off until a blank line is seen.

See my plan and current situation explanation above.

>> Do we now agree that the event of "end of sending headers" is possible 
>> only in the case explained at the top?
> 
> 
> Yes, the key I was missing is that mp2 no longer uses send_http_header.
> Can you just lock out flushing when nothing has been printed and
> content_type is undefined? (You imply above that the content_type
> setting is persistent, so the script would have to undef it if necessary.)
> Then all the user script has to do is to make sure any Status header is
> either printed or set via headers_out in the first batch of
> printing/setting code before flush is called (again).

As I suggested earlier, I think the solution is to ignore rflush calls if we 
expect to parse headers and they weren't parsed yet. However if the buffer 
overflows, we have not much choice but to send it out. But I think that this 
will be a satisfactory solution.

Also since close($fh) always flushes in mod_cgi, I think you can get rid of 
the explicit flush for good. Since mod_perl's CLOSE is no-op, it won't cause 
the flush (at least for now).
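
A rough model of that rule, for discussion only. Apart from header_parse,
which is the flag seen in the modperl_filter.c patch earlier in the thread,
every name here (struct wbucket, wb_write, wb_flush, wb_end_of_request) is
hypothetical, and the "downstream" is just a byte counter rather than a
filter chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define WB_SIZE 8192  /* the 8k buffer size discussed in the thread */

struct wbucket {
    char     buf[WB_SIZE];
    size_t   len;           /* bytes currently buffered */
    bool     header_parse;  /* printed headers still expected, not yet parsed */
    size_t   sent;          /* bytes handed downstream so far */
    unsigned passes;        /* how many times the buffer was handed on */
};

/* Hand the buffer downstream; real code would parse the headers first. */
static void wb_pass(struct wbucket *wb)
{
    wb->sent += wb->len;
    wb->len = 0;
    wb->header_parse = false;
    wb->passes++;
}

/* Explicit flush: ignored while headers are still waiting to be parsed. */
bool wb_flush(struct wbucket *wb)
{
    assert(wb != NULL);
    if (wb->header_parse) {
        return false;
    }
    wb_pass(wb);
    return true;
}

/* Buffered write: a full buffer is sent out regardless of header_parse. */
void wb_write(struct wbucket *wb, const char *data, size_t n)
{
    assert(wb != NULL);
    assert(data != NULL || n == 0);
    while (n > 0) {
        size_t room = WB_SIZE - wb->len;
        size_t chunk = (n < room) ? n : room;
        memcpy(wb->buf + wb->len, data, chunk);
        wb->len += chunk;
        data += chunk;
        n -= chunk;
        if (wb->len == WB_SIZE) {
            wb_pass(wb);
        }
    }
}

/* End of request: whatever is left is always sent. */
void wb_end_of_request(struct wbucket *wb)
{
    assert(wb != NULL);
    wb_pass(wb);
}
```

Under this model a flush issued before any headers were printed is a no-op,
which is the case that produced the unwanted 200 response. The cost is that a
handler which prints headers and then flushes would not see them sent until
the buffer fills or the request ends.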



Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:

>> So can flushing be held off until either (1) blank line is printed,
>> (2) the 8k buffer fills, or (3) send_http_header is called?
>  
> 1) is relevant only for handlers that print headers, rather than set them
> 2) absolutely not, what if you want to flush data before?
> 3) send_http_header doesn't exist in Apache2/mod_perl2

I didn't realise that mp2 doesn't use send_http_header.  That explains
the appearance of wb->r->content_type in the mp2 code.  So is it
true that if headers are sent using the API then no output filtering
and transmission occurs until the 8k buffer is either filled or flushed
(explicitly or after exit)?


> Only in the case that your handler is configured with:
> 
>   PerlOptions +ParseHeaders
> 
> *and*
> 
> it prints headers ala:
> 
>   print "Content-type: ...."
> 
> In all other cases where headers are set via the API, e.g. 
> $r->content_type, $r->headers_out, etc, there is no such thing as "the 
> handler has sent an empty line signaling the end of sending headers", 
> because it never sends any headers at all, but uses the API to set them.

Is +ParseHeaders always indicative of explicit header printing, or
can it also be set when using the API?  If the former, then yes, if
+ParseHeaders is set flushing could be held off until a blank line is seen.



>> With the current mp2 code, if you decide to
>> wait for the end of headers before doing cgi parsing and flushing then
>> the code is assuming that either the headers are less than 8k or that any
>> Status header is in the first 8k.  Otherwise the code would have to
>> be re-written to use continuous (spilling and merging) buffer buckets
>> like mod_cgi.  It can hold off on flushing indefinitely.
> 
> 
> That approach will break this handler:
> 
> sub handler {
>   my $r = shift;
>   $r->content_type('text/plain');
>   $r->rflush; # send something to the client immediately
>   long_job();
>   return Apache::OK
> }
> 
> However notice that it doesn't have to set content_type() because some 
> earlier handler could have done that and that should work as well:
> 
> sub handler {
>   my $r = shift;
>   $r->rflush; # send something to the client immediately
>   long_job();
>   return Apache::OK
> }
> 
> So as you can see, this handler doesn't tell us when it's done with 
> headers.
> 
> OK, you may say that that previous handler should have marked the end of 
> headers settings, but that would be wrong if the response handler wants 
> to set other headers as well.
> 
> Do we now agree that the event of "end of sending headers" is possible 
> only in the case explained at the top?

Yes, the key I was missing is that mp2 no longer uses send_http_header.
Can you just lock out flushing when nothing has been printed and
content_type is undefined? (You imply above that the content_type
setting is persistent, so the script would have to undef it if necessary.)
Then all the user script has to do is to make sure any Status header is
either printed or set via headers_out in the first batch of
printing/setting code before flush is called (again).



Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
>>> The
>>> difference between mod_cgi and mod_perl is that mod_cgi does not
>>> activate the filter brigade until it has read all the headers.
>>
>>
>>
>> But in the case of mod_perl, this "event" is valid only for handlers 
>> which print their own headers, rather than using mod_perl API to set 
>> them. In the generic case, there is no way to tell whether a handler 
>> is going to set more headers or it has done with it.
> 
> 
> So can flushing be held off until either (1) blank line is printed,
> (2) the 8k buffer fills, or (3) send_http_header is called?

1) is relevant only for handlers that print headers, rather than set them

2) absolutely not, what if you want to flush data before?

3) send_http_header doesn't exist in Apache2/mod_perl2

>> I suppose that we could prevent flushing in the case the handler is 
>> configured to parse headers. Does it make sense?
> 
> 
> No. Could you explain your reasoning.

Only in the case that your handler is configured with:

   PerlOptions +ParseHeaders

*and*

it prints headers ala:

   print "Content-type: ...."

In all other cases where headers are set via the API, e.g. $r->content_type, 
$r->headers_out, etc, there is no such thing as "the handler has sent an 
empty line signaling the end of sending headers", because it never sends any 
headers at all, but uses the API to set them.

Are we on the same page now?
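
To make the printed-headers case concrete: the "end of headers" event is
nothing more than the first blank line in the printed output. A minimal
sketch of spotting it in a buffer follows; find_body_offset is a made-up
helper, not a mod_perl or Apache function, and it ignores the corner case of
output that begins with a blank line.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the offset of the first body byte, i.e. the byte right after
 * the blank line that ends the printed headers, or 0 if no blank line
 * has been seen in buf[0..len) yet. Accepts both "\n\n" and "\r\n\r\n".
 */
size_t find_body_offset(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != '\n') {
            continue;
        }
        /* A line end followed directly by "\n" is a blank line. */
        if (i + 1 < len && buf[i + 1] == '\n') {
            return i + 2;
        }
        /* A line end followed by "\r\n" is a blank line too. */
        if (i + 2 < len && buf[i + 1] == '\r' && buf[i + 2] == '\n') {
            return i + 3;
        }
    }
    return 0;
}
```

A handler that sets its headers through the API never produces such a blank
line, so a scan like this has nothing to find in that case.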

>>> mod_cgi uses spilling buckets, each of size 8K, to buffer script output,
>>> including during the header scan, while mod_perl seems to scan the 
>>> headers
>>> when the first 8K buffer is either filled or flushed.
>>
>>
>>
>> I don't think this is related to the issue in question. Since the 
>> problem is what to do on flush.
>>
>> Also we might have a problem if the headers to parse are bigger than 
>> the size of the buffer (8k). Do you think someone will ever need to 
>> send headers bigger than 8k?
> 
> 
> Yes, I mentioned the buffer size in case your objection to my proposal
> to wait until end of headers was seen was based on the possibility of
> more than 8k of headers.  

Again, the concept of "the end of headers" exists only in certain cases.

> With the current mp2 code, if you decide to
> wait for the end of headers before doing cgi parsing and flushing then
> the code is assuming that either the headers are less than 8k or that any
> Status header is in the first 8k.  Otherwise the code would have to
> be re-written to use continuous (spilling and merging) buffer buckets
> like mod_cgi.  It can hold off on flushing indefinitely.

That approach will break this handler:

sub handler {
   my $r = shift;
   $r->content_type('text/plain');
   $r->rflush; # send something to the client immediately
   long_job();
   return Apache::OK
}

However notice that it doesn't have to set content_type() because some earlier 
handler could have done that and that should work as well:

sub handler {
   my $r = shift;
   $r->rflush; # send something to the client immediately
   long_job();
   return Apache::OK
}

So as you can see, this handler doesn't tell us when it's done with headers.

OK, you may say that the previous handler should have marked the end of 
header setting, but that would be wrong if the response handler wants to set 
other headers as well.

Do we now agree that the event of "end of sending headers" is possible only in 
the case explained at the top?



Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:

> Since the mod_perl's internal STDOUT buffer isn't mangled if you re-tie 
> it later, and it'll be always flushed at the end of the request, there 
> is no *need* to flush on CLOSE. However in order to be consistent with 
> perl fh close behavior, it probably needs to be changed to flush its 
> buffer.
> 
> What do you think?

Dunno.  But the problem I had would have been even harder to track
down if commenting out the flush hadn't fixed it.


>> The
>> difference between mod_cgi and mod_perl is that mod_cgi does not
>> activate the filter brigade until it has read all the headers.
> 
> 
> But in the case of mod_perl, this "event" is valid only for handlers 
> which print their own headers, rather than using mod_perl API to set 
> them. In the generic case, there is no way to tell whether a handler is 
> going to set more headers or it has done with it.

So can flushing be held off until either (1) blank line is printed,
(2) the 8k buffer fills, or (3) send_http_header is called?

> I suppose that we could prevent flushing in the case the handler is 
> configured to parse headers. Does it make sense?

No. Could you explain your reasoning.


>> mod_cgi uses spilling buckets, each of size 8K, to buffer script output,
>> including during the header scan, while mod_perl seems to scan the 
>> headers
>> when the first 8K buffer is either filled or flushed.
> 
> 
> I don't think this is related to the issue in question. Since the 
> problem is what to do on flush.
> 
> Also we might have a problem if the headers to parse are bigger than the 
> size of the buffer (8k). Do you think someone will ever need to send 
> headers bigger than 8k?

Yes, I mentioned the buffer size in case your objection to my proposal
to wait until end of headers was seen was based on the possibility of
more than 8k of headers.  With the current mp2 code, if you decide to
wait for the end of headers before doing cgi parsing and flushing then
the code is assuming that either the headers are less than 8k or that any
Status header is in the first 8k.  Otherwise the code would have to
be re-written to use continuous (spilling and merging) buffer buckets
like mod_cgi.  It can hold off on flushing indefinitely.


Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Stas Bekman wrote:
> 
>> Mark James wrote:
>>
>>> STDOUT is flushed prior to a fork to exec an external binary (rcs).
>>
>>
>> I understand the cause. But I hope that you agree with me that this is 
>> an application's problem. If you haven't sent anything to STDOUT yet, 
>> don't flush. And if this is not under your control, reopen STDOUT to 
>> /dev/null before you call that piece of code, that flushes and then 
>> re-tie STDOUT again.
>> (See t/response/TestModperl/request_rec_tie_api.pm)
> 
> 
> I guess the best way to fix the problem in-application would be to
> either run nph, or do the /dev/null redirect you suggest.
> 
> Interestingly, commenting out the pre-fork flush fixes the problem
> under mod_perl because close in mod_perl seems to be a no-op rather
> than a flush.  If the close is also no problem under mod_cgi then
> is there any real need for such a pre-fork flush in my script?

Since mod_perl's internal STDOUT buffer isn't mangled if you re-tie it 
later, and it'll always be flushed at the end of the request, there is no 
*need* to flush on CLOSE. However, in order to be consistent with Perl's fh 
close behavior, it probably needs to be changed to flush its buffer.

What do you think?

>>> I see. But why is there no problem when using mod_cgi?
>>
>>
>> That's an interesting question. mod_cgi is a generic handler, which 
>> can run applications written in any language. Therefore it has no clue 
>> of what flush is. It simply creates a pipe to the application, and 
>> expects the headers followed by the data.
>>
>> In your case, when cgi script flushes STDOUT, nothing happens at all, 
>> because there is no data to flush. So mod_cgi gets the headers and the 
>> data and all is cool
>>
>> When the same code is run under mod_perl, flush generates a special 
>> bucket which is sent out to the filters chain, and since no headers 
>> are generated yet, they get generated and sent out.
> 
> 
> Well, even under mod_cgi a program can still fflush or write.  

Ah, of course!

> The
> difference between mod_cgi and mod_perl is that mod_cgi does not
> activate the filter brigade until it has read all the headers.

But in the case of mod_perl, this "event" is valid only for handlers which 
print their own headers, rather than using the mod_perl API to set them. In the 
generic case, there is no way to tell whether a handler is going to set more 
headers or is done with them.

I suppose that we could prevent flushing in the case the handler is configured 
to parse headers. Does it make sense?

>>> Why would a perl handler script want to flush data down the filter chain
>>> before it had finished writing all of it?
>>
>>
>> Here is an example: You have a long running process, you want the 
>> headers to be sent immediately, but the data won't follow for a while. 
>> So you create the headers, do $r->rflush, and later on send the data.
> 
> 
> OK. So would there be a problem if mod_perl waited for the blank line
> end of headers, or at least a Status header, before passing the buckets
> down the line, just like mod_cgi?

See above.

> mod_cgi uses spilling buckets, each of size 8K, to buffer script output,
> including during the header scan, while mod_perl seems to scan the headers
> when the first 8K buffer is either filled or flushed.

I don't think this is related to the issue in question, since the problem is
what to do on flush.

Also we might have a problem if the headers to parse are bigger than the size 
of the buffer (8k). Do you think someone will ever need to send headers bigger 
than 8k?
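
For illustration, the "wait for the blank line" idea discussed above can be sketched in plain Perl. This is not how mod_cgi or mod_perl implement it; split_cgi_output is a hypothetical helper invented for this note. It returns nothing while the header block is still incomplete, which is the point at which a real implementation would have to keep buffering, however large the headers grow.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an empty list until the blank line that ends
# the header block has been seen; then returns (status, headers, body).
sub split_cgi_output {
    my ($buf) = @_;

    # Accept either CRLF or bare LF line endings.
    return unless $buf =~ /\A(.*?)\r?\n\r?\n(.*)\z/s;
    my ($head, $body) = ($1, $2);

    my %headers;
    for my $line (split /\r?\n/, $head) {
        # Skip anything that does not look like "Name: value".
        my ($name, $value) = $line =~ /\A([^:\s]+):\s*(.*)\z/ or next;
        $headers{lc $name} = $value;
    }

    # Default to 200 unless the script printed a Status header.
    my $status = 200;
    $status = $1 if defined $headers{status} && $headers{status} =~ /\A(\d{3})/;

    return ($status, \%headers, $body);
}

# Headers not terminated yet: nothing can be decided.
my @partial = split_cgi_output("Status: 302 Moved\nLocation: http://example.com/\n");
print "incomplete\n" unless @partial;

# Blank line seen: the status is known before anything is sent on.
my ($status, $headers, $body) =
    split_cgi_output("Status: 302 Moved\nLocation: http://example.com/\n\n");
print "$status -> $headers->{location}\n";
```

With the sample input this prints "incomplete" and then "302 -> http://example.com/".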

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:
> Mark James wrote:
>> STDOUT is flushed prior to a fork to exec an external binary (rcs).
> 
> I understand the cause. But I hope that you agree with me that this is
> an application's problem. If you haven't sent anything to STDOUT yet,
> don't flush. And if this is not under your control, reopen STDOUT to
> /dev/null before you call the piece of code that flushes, and then
> re-tie STDOUT afterwards.
> (See t/response/TestModperl/request_rec_tie_api.pm)

I guess the best way to fix the problem in-application would be to
either run nph, or do the /dev/null redirect you suggest.

Interestingly, commenting out the pre-fork flush fixes the problem
under mod_perl because close in mod_perl seems to be a no-op rather
than a flush.  If the close is also no problem under mod_cgi then
is there any real need for such a pre-fork flush in my script?


>> I see. But why is there no problem when using mod_cgi?
> 
> That's an interesting question. mod_cgi is a generic handler, which can
> run applications written in any language. Therefore it has no clue of
> what flush is. It simply creates a pipe to the application, and expects
> the headers followed by the data.
>
> In your case, when the CGI script flushes STDOUT, nothing happens at all,
> because there is no data to flush. So mod_cgi gets the headers and the
> data and all is cool.
> 
> When the same code is run under mod_perl, flush generates a special 
> bucket which is sent out to the filters chain, and since no headers are 
> generated yet, they get generated and sent out.

Well, even under mod_cgi a program can still fflush or write.  The
difference between mod_cgi and mod_perl is that mod_cgi does not
activate the filter brigade until it has read all the headers.


>> Why would a perl handler script want to flush data down the filter chain
>> before it had finished writing all of it?
> 
> Here is an example: You have a long running process, you want the 
> headers to be sent immediately, but the data won't follow for a while. 
> So you create the headers, do $r->rflush, and later on send the data.

OK. So would there be a problem if mod_perl waited for the blank line
end of headers, or at least a Status header, before passing the buckets
down the line, just like mod_cgi?

mod_cgi uses spilling buckets, each of size 8K, to buffer script output,
including during the header scan, while mod_perl seems to scan the headers
when the first 8K buffer is either filled or flushed.



Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Stas Bekman wrote:
> 
>> Mark James wrote:
>>
>>> The cause of the problem was my perl code calling flush.pl and
>>> flushing STDOUT at a point prior to it printing the response headers.
>>
>>
>> Hmm, why do you flush?
> 
> 
> STDOUT is flushed prior to a fork to exec an external binary (rcs).
> The child is closing STDOUT and then redirecting it into a pipe
> to the parent.  I didn't write this part of the code, but the
> comment on the flushing is:
> 
>     # flush now, lest data in a buffer get flushed on close() in every stinking
>     # child process.
> 
> The code for the forking is:
>     "bulletproof fork" from camel book, 2ed, page 167
> 
> If necessary I can propose a patch to this perl package to make the
> flushing conditional on not running under mod_perl.

I understand the cause. But I hope that you agree with me that this is an
application's problem. If you haven't sent anything to STDOUT yet, don't
flush. And if this is not under your control, reopen STDOUT to /dev/null
before you call the piece of code that flushes, and then re-tie STDOUT
afterwards. (See t/response/TestModperl/request_rec_tie_api.pm)

Technically it's possible to add a flag in mod_perl 2.0 to ignore any flush
attempts if no data has been printed yet. However, this could become an
undesirable behavior for someone who wants to send a flush before any data is
sent. In your case, you can work around the problem; in the case of a person
who wants the other behavior, there is no workaround. So I suggest that we
keep the mp behavior generic and not create special cases we may regret
later on.

>> The way Apache2 is designed is that the moment you send anything down
>> the filter chain, the headers are generated, because they have to be
>> sent before any data goes out. However mod_perl has an internal buffer
>> and it won't flush the data before it gets full or the code tells it
>> to flush using $r->rflush. If $| is set to a true value, then the
>> buffer is not used and the data is flushed on every print.
> 
> 
> I see. But why is there no problem when using mod_cgi?

That's an interesting question. mod_cgi is a generic handler, which can run
applications written in any language. Therefore it has no clue of what flush
is. It simply creates a pipe to the application, and expects the headers
followed by the data.

In your case, when the CGI script flushes STDOUT, nothing happens at all,
because there is no data to flush. So mod_cgi gets the headers and the data
and all is cool.

When the same code is run under mod_perl, flush generates a special bucket 
which is sent out to the filters chain, and since no headers are generated 
yet, they get generated and sent out.

As I write this, I'm actually starting to think that it's Apache that should
ignore the flush bucket if it has seen no other data so far, and not generate
any headers until it actually sees real data.

>>> Everything seems to work if the ap_rflush call is removed
>>> from mpxs_output_flush, but I don't know if this is the
>>> proper way to fix it.
>>
>>
>>
>> No, this is not a proper way to fix it. Otherwise those who want to 
>> flush their output won't be able to do so.
> 
> 
> Why would a perl handler script want to flush data down the filter chain
> before it had finished writing all of it?

Here is an example: You have a long running process, you want the headers to 
be sent immediately, but the data won't follow for a while. So you create the 
headers, do $r->rflush, and later on send the data.



Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Stas Bekman wrote:

> Mark James wrote:
>> The cause of the problem was my perl code calling flush.pl and
>> flushing STDOUT at a point prior to it printing the response headers.
> 
> Hmm, why do you flush?

STDOUT is flushed prior to a fork to exec an external binary (rcs).
The child is closing STDOUT and then redirecting it into a pipe
to the parent.  I didn't write this part of the code, but the
comment on the flushing is:

     # flush now, lest data in a buffer get flushed on close() in every stinking
     # child process.

The code for the forking is:
	"bulletproof fork" from camel book, 2ed, page 167

If necessary I can propose a patch to this perl package to make the
flushing conditional on not running under mod_perl.
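
Such a conditional flush might look roughly like the sketch below. flush_before_fork is a hypothetical name, not a function from the package in question, and the sketch assumes that mod_perl sets the MOD_PERL environment variable for the running script while plain CGI does not.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical wrapper around the package's pre-fork flush.
# Returns 1 if the handle was flushed, 0 if the flush was skipped.
sub flush_before_fork {
    my ($fh) = @_;

    # Under mod_perl an early flush would push out default (200) headers,
    # so skip it there.
    return 0 if $ENV{MOD_PERL};

    # Plain CGI: flush so that buffered data is not written again by each
    # child process when it closes the handle.
    $fh->flush;
    return 1;
}

flush_before_fork(\*STDOUT);
```

The caller would invoke flush_before_fork(\*STDOUT) at the point where the unconditional flush sits today, immediately before the fork.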


> The way Apache2 is designed is that the moment you send anything down
> the filter chain, the headers are generated, because they have to be
> sent before any data goes out. However mod_perl has an internal buffer
> and it won't flush the data before it gets full or the code tells it to
> flush using $r->rflush. If $| is set to a true value, then the buffer is
> not used and the data is flushed on every print.

I see. But why is there no problem when using mod_cgi?


>> Everything seems to work if the ap_rflush call is removed
>> from mpxs_output_flush, but I don't know if this is the
>> proper way to fix it.
> 
> 
> No, this is not a proper way to fix it. Otherwise those who want to 
> flush their output won't be able to do so.

Why would a perl handler script want to flush data down the filter chain
before it had finished writing all of it?

Mark


Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
Mark James wrote:
> Mark James wrote:
> 
>> I'm having CGI redirect problems with mp2 (cvs).
>>
>> Instead of being redirected to the proper web page, I'm sometimes
>> getting a "302 Moved" page containing a link to the correct URL.
> 
> 
> Damn this was a hard bug to track down.
> 
> The cause of the problem was my perl code calling flush.pl and
> flushing STDOUT at a point prior to it printing the response headers.
> Under mp2, flushing STDOUT calls mpxs_output_flush in
> xs/Apache/RequestIO/Apache__RequestIO.h, which in turn calls
> ap_rflush, which triggers creation of the HTTP header, which
> at this stage, prior to my script printing its 302 header,
> uses a 200 OK status.  So instead of a proper redirect
> being sent back to the browser, a normal web page with an
> embedded 302 link is sent.

Hmm, why do you flush?

The way Apache2 is designed is that the moment you send anything down the
filter chain, the headers are generated, because they have to be sent before
any data goes out. However mod_perl has an internal buffer and it won't flush
the data before it gets full or the code tells it to flush using $r->rflush.
If $| is set to a true value, then the buffer is not used and the data is
flushed on every print.
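
The effect described above can be reproduced with a toy model in plain Perl. Nothing here is Apache or mod_perl code: ToyResponse and its methods are made up for this sketch, which only assumes the behavior stated in this thread, namely that the first flush commits the headers with whatever status is set at that moment.

```perl
use strict;
use warnings;

# Toy model only: none of these names are mod_perl or Apache APIs.
package ToyResponse;

sub new {
    my ($class) = @_;
    return bless {
        status       => 200,   # default status until the script says otherwise
        buffer       => '',    # output held back until a flush
        wire         => '',    # what the client would receive
        headers_sent => 0,
    }, $class;
}

# Changing the status has no effect once the headers have gone out.
sub set_status {
    my ($self, $status) = @_;
    $self->{status} = $status unless $self->{headers_sent};
}

sub print_body {
    my ($self, $data) = @_;
    $self->{buffer} .= $data;
}

# The first flush emits the headers, even if the buffer is empty.
sub flush {
    my ($self) = @_;
    unless ($self->{headers_sent}) {
        $self->{wire} .= "HTTP/1.1 $self->{status}\r\n\r\n";
        $self->{headers_sent} = 1;
    }
    $self->{wire} .= $self->{buffer};
    $self->{buffer} = '';
}

package main;

my $early = ToyResponse->new;
$early->flush;             # flush before the script has said "302"
$early->set_status(302);   # too late: a 200 header is already on the wire
$early->flush;

my $late = ToyResponse->new;
$late->set_status(302);    # status set first, then flushed
$late->flush;

print "flush first:  ", (split /\r\n/, $early->{wire})[0], "\n";
print "status first: ", (split /\r\n/, $late->{wire})[0], "\n";
```

In this model the response that is flushed first goes out as 200, and the one that sets its status first goes out as 302, which matches the symptom reported in the original message.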

> Everything seems to work if the ap_rflush call is removed
> from mpxs_output_flush, but I don't know if this is the
> proper way to fix it.

No, this is not a proper way to fix it. Otherwise those who want to flush 
their output won't be able to do so.





Re: [mp2] CGI redirects incorrectly handled?

Posted by Stas Bekman <st...@stason.org>.
dom@idealx.com wrote:
>>The cause of the problem was my perl code calling flush.pl and
>>flushing STDOUT at a point prior to it printing the response headers.
>>Under mp2, flushing STDOUT calls mpxs_output_flush in
>>xs/Apache/RequestIO/Apache__RequestIO.h, which in turn calls
>>ap_rflush, which triggers creation of the HTTP header, which
>>at this stage, prior to my script printing its 302 header,
>>uses a 200 OK status.
> 
> 
>   Meaning no offence to the mp2 developers, I find this observed
> behaviour inappropriate - I recently had to develop a reverse-proxy
> and got bitten by undocumented semantics of this sort every so often;
> I had to resort to reading the source with pencil & paper like the
> original poster apparently did.

I think you are confusing mp2 design with Apache2 design. This is how Apache2
works; the main reason, I believe, is to accommodate the filtering mechanism.

As for undocumented behavior, you are welcome to submit documentation patches
or wait until someone writes them.

>   What is the architectural justification for not choosing one of
> those two behaviours about header output, and erring on the middle
> side:
> 
>   * headers are out-of-band, and the first call to print() prepends
>     whatever headers were set using the appropriate API
>     (e.g. print_header() should have no effect afterwards, or maybe
>     should set HTTP/1.1 trailers);

That's exactly how it works. The first print/puts/printf/rflush causes the
headers to be sent (assuming that STDOUT is unbuffered), using whatever
headers were set so far. Am I missing something here? This is a 1:1 mapping to
Apache behavior.

The only difference is the mod_perl internal STDOUT buffer used for buffered 
STDOUT.

>   * headers are regular flow, and Apache / mp2 never tries to add its
>     own ones (almost impossible to ensure under Apache / mp1).
> 
>   Thanks for any insight on this topic - maybe there is a FAQ
> somewhere about MP2 architecture ?

Doug's architecture notes are available online; I made some changes to bring
them up to date. They certainly could use some more work. Patches are welcome.



Re: [mp2] CGI redirects incorrectly handled?

Posted by do...@idealx.com.
> 
> The cause of the problem was my perl code calling flush.pl and
> flushing STDOUT at a point prior to it printing the response headers.
> Under mp2, flushing STDOUT calls mpxs_output_flush in
> xs/Apache/RequestIO/Apache__RequestIO.h, which in turn calls
> ap_rflush, which triggers creation of the HTTP header, which
> at this stage, prior to my script printing its 302 header,
> uses a 200 OK status.

  Meaning no offence to the mp2 developers, I find this observed
behaviour inappropriate - I recently had to develop a reverse-proxy
and got bitten by undocumented semantics of this sort every so often;
I had to resort to reading the source with pencil & paper like the
original poster apparently did.

  What is the architectural justification for not choosing one of
those two behaviours about header output, and erring on the middle
side:

  * headers are out-of-band, and the first call to print() prepends
    whatever headers were set using the appropriate API
    (e.g. print_header() should have no effect afterwards, or maybe
    should set HTTP/1.1 trailers);

  * headers are regular flow, and Apache / mp2 never tries to add its
    own ones (almost impossible to ensure under Apache / mp1).

  Thanks for any insight on this topic - maybe there is a FAQ
somewhere about MP2 architecture ?

-- 
Dominique QUATRAVAUX                           Ingénieur développeur senior
01 44 42 00 08                                 IDEALX



Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Mark James wrote:
> I'm having CGI redirect problems with mp2 (cvs).
> 
> Instead of being redirected to the proper web page, I'm sometimes
> getting a "302 Moved" page containing a link to the correct URL.

Damn this was a hard bug to track down.

The cause of the problem was my perl code calling flush.pl and
flushing STDOUT at a point prior to it printing the response headers.
Under mp2, flushing STDOUT calls mpxs_output_flush in
xs/Apache/RequestIO/Apache__RequestIO.h, which in turn calls
ap_rflush, which triggers creation of the HTTP header, which
at this stage, prior to my script printing its 302 header,
uses a 200 OK status.  So instead of a proper redirect
being sent back to the browser, a normal web page with an
embedded 302 link is sent.

Everything seems to work if the ap_rflush call is removed
from mpxs_output_flush, but I don't know if this is the
proper way to fix it.

Mark


Re: [mp2] CGI redirects incorrectly handled?

Posted by Nick Tonkin <ni...@tonkinresolutions.com>.
On Thu, 6 Mar 2003, Mark James wrote:

> Nick Tonkin wrote:
>
> > Now that I think about it, maybe you're using CGI.pm to do your redirect?
> > If so, maybe the code in CGI.pm has not been correctly updated?
>
> Yes Nick, I'm using CGI.pm version 2.91 (the latest).  Its redirect code
> sends a "Status: 302 Moved".

That wouldn't be the problem, if there is one. The problem would be in how
CGI.pm sets the Location header, noting the differences in syntax I
pointed out earlier.

You might try one of the CGI mailing lists to see if anyone there knows
whether the code is compliant with what I posted before. The documentation
is at http://xrl.us/dfb

Otherwise, try setting the redirect location manually as I showed you and
see if the problem persists.


- nick

-- 

~~~~~~~~~~~~~~~~~~~~
Nick Tonkin   {|8^)>


Re: [mp2] CGI redirects incorrectly handled?

Posted by Mark James <mr...@bigpond.net.au>.
Nick Tonkin wrote:

> Now that I think about it, maybe you're using CGI.pm to do your redirect?
> If so, maybe the code in CGI.pm has not been correctly updated?

Yes Nick, I'm using CGI.pm version 2.91 (the latest).  Its redirect code
sends a "Status: 302 Moved".

Mark


Re: [mp2] CGI redirects incorrectly handled?

Posted by Nick Tonkin <ni...@tonkinresolutions.com>.
On Wed, 5 Mar 2003, Mark James wrote:

> I'm having CGI redirect problems with mp2 (cvs).


How are you telling the server to redirect? You do know it's different
from mp1, right?

In mp2 you need to do:

my $location = 'http://foo.bar.baz';

$r->headers_out->{'Location'} = $location;
# Or use $r->err_headers_out->{'Location'} which you will have
# to do with any other headers you want to have sent with the
# redirect, such as cookies

return Apache::HTTP_MOVED_TEMPORARILY;
# Apache::REDIRECT still supported, this is the correct
# constant though.

Now that I think about it, maybe you're using CGI.pm to do your redirect?
If so, maybe the code in CGI.pm has not been correctly updated?


Hope this helps.

- nick

-- 

~~~~~~~~~~~~~~~~~~~~
Nick Tonkin   {|8^)>