You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Justin Erenkrantz <je...@apache.org> on 2002/11/01 09:53:50 UTC

Re: [PATCH] checking for failures encountered by core_output_filter

--On Thursday, October 31, 2002 4:58 PM -0500 Jeff Trawick 
<tr...@attglobal.net> wrote:

> default_handler(), which should be returning HTTP status code,
> returns whatever ap_pass_brigade() returns.  default_handler()
> would have to change too.  Any other handlers as well (In the
> thread I pointed to, Ryan claimed that 99% of handlers return
> whatever ap_pass_brigade() returned.  So I suspect that some other
> handlers need to be changed as well.)

Right, it's been a long-standing problem that this API is completely 
horked (yet another reason I'm not overjoyed with the thought of 
forcing the 2.0.43 module API on all developers until we release a 
2.4/3.0).  OtherBill just suggested to me a solution like changing 
the default_handler line from:

return ap_pass_brigade(r->output_filters, bb);

to:

rv = ap_pass_brigade(r->output_filters, bb);
if (rv != APR_SUCCESS && r->status == HTTP_OK) {
    return HTTP_INTERNAL_SERVER_ERROR;
}
else {
    return r->status;
}

That way, if a filter does set r->status for some reason, we return 
that value, otherwise we return 500.

Handlers should be returning HTTP status codes, while filters should 
be returning APR status codes.  I find myself believe that is the 
right model, because in theory (gah!), the filters could be more than 
just HTTP.  Too bad we can't use enums for one of those status codes 
and wash our hands of the whole deal by relying on type safety.  Oh, 
well.

I think your commits to check c->aborted in various filters should be 
replaced by getting core_input_filter to return APR_ECONNABORTED.

Namely the following lines in core_output_filter (server/core.c:4002) 
are wrong:

    /* No need to check for SUCCESS, we did that above. */
    if (!APR_STATUS_IS_EAGAIN(rv)) {
        c->aborted = 1;
    }

    /* The client has aborted, but the request was successful. We
     * will report success, and leave it to the access and error
     * logs to note that the connection was aborted.
     */
    return APR_SUCCESS;

I don't buy that logic at all.  Why should we be returning 
APR_SUCCESS here?  We want to signal an error, so that the filters 
stop what they are doing and exit.

Hmm.  By what I just said for default_handler, if a client aborts, 
that would mean that the handler would be returning with a 500. 
Hmmm.  Perhaps we could catch the c->aborted case in the default 
handler and just 'lie' and say that it would have been a 200.  Hmm. 
Not sure about that one.  Thoughts?

Random thought: define a new HTTP status code which means 'Client got 
the hell out of here, so we stopped early' (207??).  Remember that 
this status code is only going to be presented via the access logs, 
so we really don't need 'support' from HTTP clients.  That could 
solve our problem...

> Code playing with HTTP status codes should be analyzed to ensure
> that what is logged is what was written to the client (I guess?)
> (i.e., some subsequent error doesn't overwrite r->status or
> whatever is passed to the logger).

Do we really want to be logging if a filter has an error?  Here's 
another random thought: introduce a logging filter at the top of the 
filter chain which will write a note to the error log if the filters 
return something other than APR_SUCCESS.  This way we don't have to 
explicitly handle that in all of the handlers.

> clear as mud?

Oh, sure.  -- justin

Re: [PATCH] checking for failures encountered by core_output_filter

Posted by "William A. Rowe, Jr." <wr...@apache.org>.
At 10:07 AM 11/4/2002, Jeff Trawick wrote:
>Justin Erenkrantz <je...@apache.org> writes:
>
>> >> I don't buy that logic at all.  Why should we be returning
>> >> APR_SUCCESS here?  We want to signal an error, so that the filters
>> >> stop what they are doing and exit.
>> >
>> > Talk to Ryan :)
>> 
>> Well, he ain't here no more.  Just us folks.  Do you (or anyone else)
>> see a reason why we shouldn't change this?
>
>interpret "Talk to Ryan" as "I can't justify the current behavior" :)
>
>We should make the change to return an APR error code here.

Agreed.  If we really have to dig through and find ALL of the areas where
the current behavior disagrees, this seems like the time to do it.  Choose
one; return apr_status_t and pay attention only to r->status and c->aborted
for logging (the right decision IMHO) or return HTTP statuses (ick.)

Either way it's the same amount of work, no?

I'm discovering the same confusion in the SSL filter right now, which has
never paid attention to c->aborted within ssl_engine_io.c.


>> > I'm kinda stuck thinking that the philosophy should be that we log
>> > whatever was written to the client in the status line, but there are
>> > some implementation details with that one.
>> 
>> How do we differentiate between a 200 that all the data was sent for
>> and a 200 where the client aborted?  If someone looked at the access
>> logs, they would see that the transmitted length wasn't correct in the
>> aborted case.  Perhaps that was Ryan's initial motivation - did he
>> want the length to be what we wanted to send not what we actually
>> sent?  (mod_logio is bringing up some of these concerns as well.)
>> 
>> I guess I'd like to see some notation in access log that indicates,
>> "Hey, we know we didn't send everything, but that's because they left
>> in the middle."  Yes, it also might be noted in the error log, but I
>> gotta believe it's goodness to have tools looking at just the access
>> logs know that the client aborted.
>
>FWIW, this is what happens with 1.3 when the client drops the
>connection before reading the entire response.  In this case, the
>client read several bytes of a 1MB response (though the client TCP
>accepted a lot more).
>
>  [Mon Nov  4 10:31:22 2002] [info] [client 9.27.32.119] (32)Broken pipe: 
>  client stopped connection before send mmap completed
>
>  9.27.32.119 - - [04/Nov/2002:10:31:22 -0500] "GET /bigfile.html" 200 65536
>
>Even though Apache could have noticed that it sent only 64K out of
>1MB, it still logged 200.  The 64K isn't even an indication of how
>much the client really read.  It is simply how much TCP was willing to
>buffer.  Also, I suspect that we can hit the first indication of a
>dropped connection after the previous response has been noted in the
>log.

Right, and that's good behavior.  The REQUEST resulted in a 200.  What
the RESPONSE did while serving it is another matter.

>I'm not sure what promises we can make about what is in the access
>log.
>
>> > One issue with that is that the filter that encounters the error
>> > really should do the logging so that the most specific information
>> > is available.  If some other code also logs less-specific info,
>> > then it is all for naught.
>> 
>> Makes sense.  So, core_output_filter should be calling
>> ap_log_rerror. Hmm.  Does it have access to the request_rec?  Didn't
>> we change that? Or, was I vetoed on that?  -- justin
>
>I don't recall that conversation.  Regardless, I wonder how 
>core_output_filter could know the r associated with a particular
>bucket?

It shouldn't need to... the response should probably percolate back
to the handler, via r->status, but the connection filter shouldn't need
to be aware of it.  If you wanted to log the 'error' associated with the
request (e.g. transmission troubles), that would probably need to 
percolate with the bucket.

Bill


Re: [PATCH] checking for failures encountered by core_output_filter

Posted by Jeff Trawick <tr...@attglobal.net>.
Justin Erenkrantz <je...@apache.org> writes:

> Makes sense.  So, core_output_filter should be calling
> ap_log_rerror. Hmm.  Does it have access to the request_rec?  Didn't
> we change that? Or, was I vetoed on that?  -- justin

why no ap_log_cerror() I wonder?  at least that would be better than
ap_log_error() in that it can format the client address

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] checking for failures encountered by core_output_filter

Posted by Jeff Trawick <tr...@attglobal.net>.
Justin Erenkrantz <je...@apache.org> writes:

> >> I don't buy that logic at all.  Why should we be returning
> >> APR_SUCCESS here?  We want to signal an error, so that the filters
> >> stop what they are doing and exit.
> >
> > Talk to Ryan :)
> 
> Well, he ain't here no more.  Just us folks.  Do you (or anyone else)
> see a reason why we shouldn't change this?

interpret "Talk to Ryan" as "I can't justify the current behavior" :)

We should make the change to return an APR error code here.

> > I'm kinda stuck thinking that the philosophy should be that we log
> > whatever was written to the client in the status line, but there are
> > some implementation details with that one.
> 
> How do we differentiate between a 200 that all the data was sent for
> and a 200 where the client aborted?  If someone looked at the access
> logs, they would see that the transmitted length wasn't correct in the
> aborted case.  Perhaps that was Ryan's initial motivation - did he
> want the length to be what we wanted to send not what we actually
> sent?  (mod_logio is bringing up some of these concerns as well.)
> 
> I guess I'd like to see some notation in access log that indicates,
> "Hey, we know we didn't send everything, but that's because they left
> in the middle."  Yes, it also might be noted in the error log, but I
> gotta believe it's goodness to have tools looking at just the access
> logs know that the client aborted.

FWIW, this is what happens with 1.3 when the client drops the
connection before reading the entire response.  In this case, the
client read several bytes of a 1MB response (though the client TCP
accepted a lot more).

  [Mon Nov  4 10:31:22 2002] [info] [client 9.27.32.119] (32)Broken pipe: 
  client stopped connection before send mmap completed

  9.27.32.119 - - [04/Nov/2002:10:31:22 -0500] "GET /bigfile.html" 200 65536

Even though Apache could have noticed that it sent only 64K out of
1MB, it still logged 200.  The 64K isn't even an indication of how
much the client really read.  It is simply how much TCP was willing to
buffer.  Also, I suspect that we can hit the first indication of a
dropped connection after the previous response has been noted in the
log.

I'm not sure what promises we can make about what is in the access
log.

> > One issue with that is that the filter that encounters the error
> > really should do the logging so that the most specific information
> > is available.  If some other code also logs less-specific info,
> > then it is all for naught.
> 
> Makes sense.  So, core_output_filter should be calling
> ap_log_rerror. Hmm.  Does it have access to the request_rec?  Didn't
> we change that? Or, was I vetoed on that?  -- justin

I don't recall that conversation.  Regardless, I wonder how 
core_output_filter could know the r associated with a particular
bucket?

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] checking for failures encountered by core_output_filter

Posted by Justin Erenkrantz <je...@apache.org>.
--On Friday, November 1, 2002 11:13 AM -0500 Jeff Trawick 
<tr...@attglobal.net> wrote:

> The only filter I changed was the content-length filter.
>
> But yes I would agree that in a commit which fixes our handlers as
> you mentioned we also fix the content-length filter to not worry
> about c->aborted.
>
> Note that I changed the mod_cgi handler to look at c->aborted as
> well as rv after ap_pass_brigade().  In all honesty I don't know
> who gets to set c->aborted and whether rv absolutely must be
> non-APR_SUCCESS if aborted is set.  But I suspect that mod_cgi
> wouldn't have to check both rv and c->aborted to see if an error
> occurred talking to the client.

Right.  That's why I'm saying we should change core_output_filter as 
I described (so that it returns APR_ECONNABORTED in this case). 
(Don't think we're disagreeing on that.)

>> I don't buy that logic at all.  Why should we be returning
>> APR_SUCCESS here?  We want to signal an error, so that the filters
>> stop what they are doing and exit.
>
> Talk to Ryan :)

Well, he ain't here no more.  Just us folks.  Do you (or anyone else) 
see a reason why we shouldn't change this?

> I'm kinda stuck thinking that the philosophy should be that we log
> whatever was written to the client in the status line, but there are
> some implementation details with that one.

How do we differentiate between a 200 that all the data was sent for 
and a 200 where the client aborted?  If someone looked at the access 
logs, they would see that the transmitted length wasn't correct in 
the aborted case.  Perhaps that was Ryan's initial motivation - did 
he want the length to be what we wanted to send not what we actually 
sent?  (mod_logio is bringing up some of these concerns as well.)

I guess I'd like to see some notation in access log that indicates, 
"Hey, we know we didn't send everything, but that's because they left 
in the middle."  Yes, it also might be noted in the error log, but I 
gotta believe it's goodness to have tools looking at just the access 
logs know that the client aborted.

> One issue with that is that the filter that encounters the error
> really should do the logging so that the most specific information
> is available.  If some other code also logs less-specific info,
> then it is all for naught.

Makes sense.  So, core_output_filter should be calling ap_log_rerror. 
Hmm.  Does it have access to the request_rec?  Didn't we change that? 
Or, was I vetoed on that?  -- justin

Re: [PATCH] checking for failures encountered by core_output_filter

Posted by Jeff Trawick <tr...@attglobal.net>.
Justin Erenkrantz <je...@apache.org> writes:

> I think your commits to check c->aborted in various filters should be
> replaced by getting core_input_filter to return APR_ECONNABORTED.

The only filter I changed was the content-length filter.

But yes I would agree that in a commit which fixes our handlers as you
mentioned we also fix the content-length filter to not worry about
c->aborted.

Note that I changed the mod_cgi handler to look at c->aborted as well
as rv after ap_pass_brigade().  In all honesty I don't know who gets
to set c->aborted and whether rv absolutely must be non-APR_SUCCESS if
aborted is set.  But I suspect that mod_cgi wouldn't have to check
both rv and c->aborted to see if an error occurred talking to the
client.

> Namely the following lines in core_output_filter (server/core.c:4002)
> are wrong:
> 
>     /* No need to check for SUCCESS, we did that above. */
>     if (!APR_STATUS_IS_EAGAIN(rv)) {
>         c->aborted = 1;
>     }
> 
>     /* The client has aborted, but the request was successful. We
>      * will report success, and leave it to the access and error
>      * logs to note that the connection was aborted.
>      */
>     return APR_SUCCESS;
> 
> I don't buy that logic at all.  Why should we be returning APR_SUCCESS
> here?  We want to signal an error, so that the filters stop what they
> are doing and exit.

Talk to Ryan :)

> 
> Hmm.  By what I just said for default_handler, if a client aborts,
> that would mean that the handler would be returning with a 500. Hmmm.
> Perhaps we could catch the c->aborted case in the default handler and
> just 'lie' and say that it would have been a 200.  Hmm. Not sure about
> that one.  Thoughts?

I'm kinda stuck thinking that the philosophy should be that we log
whatever was written to the client in the status line, but there are
some implementation details with that one.

> Random thought: define a new HTTP status code which means 'Client got
> the hell out of here, so we stopped early' (207??).  Remember that
> this status code is only going to be presented via the access logs, so
> we really don't need 'support' from HTTP clients.  That could solve
> our problem...
> 
> > Code playing with HTTP status codes should be analyzed to ensure
> > that what is logged is what was written to the client (I guess?)
> > (i.e., some subsequent error doesn't overwrite r->status or
> > whatever is passed to the logger).
> 
> Do we really want to be logging if a filter has an error?

What I meant by "what is logged" is "which HTTP status code ends up in
the access log".  I didn't mean error log.  We'll always write to the
access log (i.e., we'll always run that logging hook) for any request
that we actually read, right?

>                                                               Here's
> another random thought: introduce a logging filter at the top of the
> filter chain which will write a note to the error log if the filters
> return something other than APR_SUCCESS.  This way we don't have to
> explicitly handle that in all of the handlers.

One issue with that is that the filter that encounters the error
really should do the logging so that the most specific information is
available.  If some other code also logs less-specific info, then it
is all for naught.

I'll try not to think about this until Monday :)

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...