You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Bojan Smojver <bo...@rexursive.com> on 2002/09/17 23:36:00 UTC

[PATCH]: Update request in connection based filters

Justin,

After mucking around a bit with what you suggested, this seemed like the
straightforward thing to do. Not sure how things would work on internal
redirects etc. Can you please have a look. I'm sure it's naive, but so
am I (for now :-)

Bojan

Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
OOPS, forgot the variable itself...

Bojan

On Wed, 2002-09-18 at 07:36, Bojan Smojver wrote:
> Justin,
> 
> After mucking around a bit with what you suggested, this seemed like the
> straightforward thing to do. Not sure how things would work on internal
> redirects etc. Can you please have a look. I'm sure it's naive, but so
> am I (for now :-)
> 
> Bojan

Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting Justin Erenkrantz <je...@apache.org>:

> Well, I'd do (plus the declaration of f in your later post):
> 
> for (f = conn->input_filters; f; f = f->next) {
>    f->r = r;
> }
> 
> for (f = conn->output_filters; f; f = f->next) {
>    f->r = r;
> }

What was that book again... K&R, The C Programming Language. Maybe I should find
out where it went :-)

I have changed the code of mod_logio.c significantly in the past few hours,
throwing out the request based filters altogether and just keeping the
connection based filters, now that f->r is not NULL any more. The whole thing
turned out to be much simpler and very accurate on simple requests. I'm yet to
do heavy testing, but I reckon even things like SSL should be covered.

> Now, the question is whether anyone has any major qualms with this.
> Basically, it now allows connection filters some access to the
> current request_rec (as defined by the protocol module).  
> 
> Internal redirects would be handled by the update_r_in_filters func
> in http_request.c (which is equivalent to this for statement - hmm,
> I forsee ap_update_filters in our future), so that should be fine.

I'm not sure that's the case. It only updates content and protocol filters (i.e.
the ones tied to the request_rec), but it doesn't touch connection filters. I
reckon we should do the above for internal redirects as well. That is, if I
really understand what internal redirects are in the first place :-)

> The only potential problem I forsee is whether the input bytes could
> be logged on one request, and the output on another.  Therefore, we
> might want to walk r->prev and add the totals to get our real bytes
> read and sent.  Oooh, internal_internal_redirect copies the
> r->read_length to the new request, so I think we're fine.

I'll try to test that by using some funcionality that depends on
internal_internal_redirect. From what I see in internal_internal_redirect, the
new->notes gets created afresh, so whatever I put in r->notes is going to get
lost. Big dramas! Looks like we'll *have to* introduce r->bytes_in and
r->bytes_out if we want this to work at all...

This would, of course, require yet another MMN bump... And it would require
revisiting all of the code that deals with copying request (e.g.
internal_internal_redirect) and then making sure that those fields are copied as
well. It also might make a lot of people upset... :-( On the other hand, it's
going to make mod_logio.c almost trivial and probably faster.

> For the logging case, I think this makes a lot of sense as it lets
> us simplify how we log.

I can verify that it is almost impossible to log the correct number of output
bytes (on the wire but per request) unless the requests are tied to connection
filters. Input bytes are different, because request based filters run after
connection based filters and we can always do what I was doing before (i.e.
store in c->notes and then copy into r->whatever). However, request based output
filters are already out of the way when connection based filters take over and
if f->r is NULL, it is not possible to store the number of bytes into the
request. So, the above patch and other neccessary additions would be essential
to achieve this.

Bojan

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

Re: [PATCH]: Update request in connection based filters

Posted by Justin Erenkrantz <je...@apache.org>.
On Wed, Sep 18, 2002 at 07:36:00AM +1000, Bojan Smojver wrote:
> Justin,
> 
> After mucking around a bit with what you suggested, this seemed like the
> straightforward thing to do. Not sure how things would work on internal
> redirects etc. Can you please have a look. I'm sure it's naive, but so
> am I (for now :-)
> 
> Bojan

> --- httpd-2.0-vanilla/server/protocol.c	Fri Sep  6 16:01:19 2002
> +++ httpd-2.0/server/protocol.c	Wed Sep 18 07:32:37 2002
> @@ -895,6 +895,15 @@
>      r->connection      = conn;
>      r->server          = conn->base_server;
>  
> +    /* Update the request in connection filters */
> +    f = conn->input_filters;
> +    for (; f; f = f->next)
> +        f->r = r;
> +
> +    f = conn->output_filters;
> +    for (; f; f = f->next)
> +        f->r = r;
> +

Well, I'd do (plus the declaration of f in your later post):

for (f = conn->input_filters; f; f = f->next) {
   f->r = r;
}

for (f = conn->output_filters; f; f = f->next) {
   f->r = r;
}

Now, the question is whether anyone has any major qualms with this.
Basically, it now allows connection filters some access to the
current request_rec (as defined by the protocol module).  

Internal redirects would be handled by the update_r_in_filters func
in http_request.c (which is equivalent to this for statement - hmm,
I forsee ap_update_filters in our future), so that should be fine.

The only potential problem I forsee is whether the input bytes could
be logged on one request, and the output on another.  Therefore, we
might want to walk r->prev and add the totals to get our real bytes
read and sent.  Oooh, internal_internal_redirect copies the
r->read_length to the new request, so I think we're fine.

For the logging case, I think this makes a lot of sense as it lets
us simplify how we log.  -- justin

Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting Ryan Morgan <rm...@covalent.net>:

> I would tend to agree with Ryan here, it should not be assumed that the
> request_rec will be available in the core_output_filter.

I wasn't planning on making any changes to core_output_filter that would make it
depend on f->r being set. The whole thing would be happening in a separate
module (mod_logio.c), which would either act (if f->r is set) or leave it alone
(if f->r is NULL). So, HTTP requests, for which f->r would be set even in the
connection based filters would be dealt with, anything else would be ignored by
the module.

Bojan

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

Re: [PATCH]: Update request in connection based filters

Posted by Ryan Morgan <rm...@covalent.net>.
I would tend to agree with Ryan here, it should not be assumed that the
request_rec will be available in the core_output_filter.  One possible
solution would be to have c->bytes_in and c->bytes_out, then have identical
fields in the request_rec that would be updated upon the completion of the
request.  (counting only the bytes sent during *that* request)  I gave this 
a try a while back, but ran into problems because the request processing 
can bail out and run ap_log_transaction whenever it wants. (on 404's, etc).  
The request processing in apache is not symmetrical enough for that approach.

Then there's the issue of counting the bytes in the set-aside buckets
from buffered output (like Brian mentioned), but thats another problem
altogether.

-Ryan

On Tue, Sep 17, 2002 at 11:22:03PM -0400, rbb@apache.org wrote:
> 
> You don't want to do this.  A connection based filter is connection
> oriented.  By definition, it has no concept of a request.  While this
> might make sense for HTTP, other protocol modules will have a much harder
> time with this change.
> 
> Ryan
> 
> On 18 Sep 2002, Bojan Smojver wrote:
> 
> > Justin,
> > 
> > After mucking around a bit with what you suggested, this seemed like the
> > straightforward thing to do. Not sure how things would work on internal
> > redirects etc. Can you please have a look. I'm sure it's naive, but so
> > am I (for now :-)
> > 
> > Bojan
> > 
> 
> -- 
> 
> _______________________________________________________________________________
> Ryan Bloom                        	rbb@apache.org
> 550 Jean St
> Oakland CA 94610
> -------------------------------------------------------------------------------
> 
> 

Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
There is another possibility to fix this, and that's within the module
itself (I've tested it and it works):

--------------------------------------------------
static void logio_create_req(request_rec *r) {
    ap_filter_t *f;
    conn_rec *c = r->connection;

    /* Tie connection filters with request */
    for (f = c->input_filters; f; f = f->next)
        f->r = r;

    for (f = c->output_filters; f; f = f->next)
        f->r = r;
}

static void register_hooks(apr_pool_t *p) {
    ap_hook_pre_connection(logio_pre_conn, NULL, NULL, APR_HOOK_MIDDLE);
    ap_hook_create_request(logio_create_req, NULL, NULL,
APR_HOOK_MIDDLE);
    
    ap_register_input_filter(logio_filter_name, logio_in_filter, NULL,
                             AP_FTYPE_CONNECTION);
    ap_register_output_filter(logio_filter_name, logio_out_filter, NULL,
                              AP_FTYPE_CONNECTION);
}
--------------------------------------------------

Since this looks like modules own business, it seems like a good
compromise to me. What do you guys think?

Also, I've think that the correct thing to do in mod_logio is to be
AP_FTYPE_CONNECTION + 6, since SSL is AP_FTYPE_CONNECTION + 5, and we
want to be before it in input filtering and after it in output
filtering. I have the module working with r->notes for now.

Bojan

On Wed, 2002-09-18 at 13:22, rbb@apache.org wrote:
> 
> You don't want to do this.  A connection based filter is connection
> oriented.  By definition, it has no concept of a request.  While this
> might make sense for HTTP, other protocol modules will have a much harder
> time with this change.
> 
> Ryan


Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting Brian Pane <br...@apache.org>:

> However, if you can do the byte counting in a request-level filter
> (right after the HTTP output filter?), that should enable you to
> bypass the problem.

Counting in the request based filter is easy, but there are a few problems:

- request based input filters can't see the headers, only bodies
- request based input and output filters aren't taking into account connection
protocols like SSL

The above then causes all number to be wrong.

Bojan

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

Re: [PATCH]: Update request in connection based filters

Posted by Brian Pane <br...@apache.org>.
On Tue, 2002-09-17 at 21:46, Bojan Smojver wrote:
> Quoting rbb@apache.org:
> 
> > The problem is that the filters aren't HTTP specific.  If you make this
> > change, then the filters will be written to take advantage of the f->r
> > field in connection level filters.  Since that field doesn't make sense in
> > non-HTTP requests, modifying the filter_rec so that it is there is a
> > mistake.
> > 
> > Think about it this way.  If you add the request_rec to the
> > core_output_filter, and that filter is modified to use it, then all
> > protocol modules must ensure that the request_rec is added to the filter,
> > even if it doesn't make sense.
> > 
> > If you make it optional to put the request_rec in the filter_rec, then you
> > will be making it much harder to write filters.
> 
> OK, I think I understand now what you're trying to say. Things can go wrong if
> important fields can be NULL when you want them to be non-NULL and vice versa. I
> reckon that good documentation on the matter would help developers understand
> that they cannot depend on f->r being set for some protocols, which would
> eliminate confusion. But, this is the first ever filter that I'm writing and
> it's awfully simple, so I could be dead wrong.

The real problem is that "r" is HTTP-specific.  If we were designing
the server from scratch, I suppose we could introduce a protocol
independent "generic request structure" for use in any protocol that
had some sort of discrete request, and make that accessible within
the connection-level filters.  But for historical reasons, the
request_rec is HTTP-specific.

However, if you can do the byte counting in a request-level filter
(right after the HTTP output filter?), that should enable you to
bypass the problem.

Brian




Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
On Thu, 2002-09-19 at 09:07, rbb@apache.org wrote:

> yep, requests are always logged in order.

Thanks. Back to the code :-)

Bojan


Re: [PATCH]: Update request in connection based filters

Posted by rb...@apache.org.
On 19 Sep 2002, Bojan Smojver wrote:

> On Thu, 2002-09-19 at 07:59, rbb@apache.org wrote:
> 
> > Great, now one more change.  :-)
> > 
> > Don't add the fields to the structure.  The conn_Rec has a vector for
> > modules to add data to.  Create a log_config structure, and add the fields
> > there.  This allows the core structures to remain as they are, and it
> > still allows for logging the information.  Since the data is collected and
> > used by the same module, there is no reason to put the information in the
> > core structure.
> 
> OK, this is the ap_conf_vector_t *conn_config, right?
> 
> Cool, one less change to make for me. One less MMN bump as well. One
> less module in Apache too. Not bad :-)
> 
> One more thing to consider - the requests are going to be logged in the
> same order they appear on the connection, right? Otherwise, we'll be
> logging the wrong number of I/O bytes per request...

yep, requests are always logged in order.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
On Thu, 2002-09-19 at 07:59, rbb@apache.org wrote:

> Great, now one more change.  :-)
> 
> Don't add the fields to the structure.  The conn_Rec has a vector for
> modules to add data to.  Create a log_config structure, and add the fields
> there.  This allows the core structures to remain as they are, and it
> still allows for logging the information.  Since the data is collected and
> used by the same module, there is no reason to put the information in the
> core structure.

OK, this is the ap_conf_vector_t *conn_config, right?

Cool, one less change to make for me. One less MMN bump as well. One
less module in Apache too. Not bad :-)

One more thing to consider - the requests are going to be logged in the
same order they appear on the connection, right? Otherwise, we'll be
logging the wrong number of I/O bytes per request...

Bojan


Re: [PATCH]: Update request in connection based filters

Posted by rb...@apache.org.
On 19 Sep 2002, Bojan Smojver wrote:

> On Thu, 2002-09-19 at 00:05, rbb@apache.org wrote:
> 
> > The easiest way, would be to put the filters in mod_log_config, and have
> > that module save the information in a connection_rec vector.
> 
> OK, I think I roughly understand what is the plan:
> 
> - introduce c->bytes_in and c->bytes_out (c is conn_rec here)
> - put connection based filters from mod_logio into mod_log_config with
> one difference - count bytes in c->bytes_in and c->bytes_out
> - every time the request is being logged, reset c->bytes_in and
> c->bytes_out to zero
> 
> I think I should be able to put together something like that. Thanks for
> the tip.

Great, now one more change.  :-)

Don't add the fields to the structure.  The conn_Rec has a vector for
modules to add data to.  Create a log_config structure, and add the fields
there.  This allows the core structures to remain as they are, and it
still allows for logging the information.  Since the data is collected and
used by the same module, there is no reason to put the information in the
core structure.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
On Thu, 2002-09-19 at 00:05, rbb@apache.org wrote:

> The easiest way, would be to put the filters in mod_log_config, and have
> that module save the information in a connection_rec vector.

OK, I think I roughly understand what is the plan:

- introduce c->bytes_in and c->bytes_out (c is conn_rec here)
- put connection based filters from mod_logio into mod_log_config with
one difference - count bytes in c->bytes_in and c->bytes_out
- every time the request is being logged, reset c->bytes_in and
c->bytes_out to zero

I think I should be able to put together something like that. Thanks for
the tip.

Bojan


Re: [PATCH]: Update request in connection based filters

Posted by rb...@apache.org.
On Wed, 18 Sep 2002, Bojan Smojver wrote:

> Quoting rbb@apache.org:
> 
> > The problem is that the filters aren't HTTP specific.  If you make this
> > change, then the filters will be written to take advantage of the f->r
> > field in connection level filters.  Since that field doesn't make sense in
> > non-HTTP requests, modifying the filter_rec so that it is there is a
> > mistake.
> > 
> > Think about it this way.  If you add the request_rec to the
> > core_output_filter, and that filter is modified to use it, then all
> > protocol modules must ensure that the request_rec is added to the filter,
> > even if it doesn't make sense.
> > 
> > If you make it optional to put the request_rec in the filter_rec, then you
> > will be making it much harder to write filters.
> 
> OK, I think I understand now what you're trying to say. Things can go wrong if
> important fields can be NULL when you want them to be non-NULL and vice versa. I
> reckon that good documentation on the matter would help developers understand
> that they cannot depend on f->r being set for some protocols, which would
> eliminate confusion. But, this is the first ever filter that I'm writing and
> it's awfully simple, so I could be dead wrong.

It isn't that you are "dead wrong", but that type of situation where a
field can be uninitialized based on protocol is going to be almost
impossible to debug.  The point of filters is that they are stand-alone
functions, if you give them the information they expect, they will
work.  If you add the r field to connection filters, somebody will look at
it and write a connection filter that requires it to be there.  Once that
happens, the filter will only work with HTTP.

> Let me go back to the core of the problem - how do I store the number of bytes
> that were sent/received on the wire into the request_rec structure in order to
> log that? After discussion with Justin, it seemed that connection based filters
> were the place to do it because request based filters either don't see the input
> headers or they don't take into account things like SSL. I mean, logging
> something like that should be a fairly trivial exercise and yet it seems to be
> creating major dramas...

The easiest way, would be to put the filters in mod_log_config, and have
that module save the information in a connection_rec vector.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting rbb@apache.org:

> The problem is that the filters aren't HTTP specific.  If you make this
> change, then the filters will be written to take advantage of the f->r
> field in connection level filters.  Since that field doesn't make sense in
> non-HTTP requests, modifying the filter_rec so that it is there is a
> mistake.
> 
> Think about it this way.  If you add the request_rec to the
> core_output_filter, and that filter is modified to use it, then all
> protocol modules must ensure that the request_rec is added to the filter,
> even if it doesn't make sense.
> 
> If you make it optional to put the request_rec in the filter_rec, then you
> will be making it much harder to write filters.

OK, I think I understand now what you're trying to say. Things can go wrong if
important fields can be NULL when you want them to be non-NULL and vice versa. I
reckon that good documentation on the matter would help developers understand
that they cannot depend on f->r being set for some protocols, which would
eliminate confusion. But, this is the first ever filter that I'm writing and
it's awfully simple, so I could be dead wrong.

Let me go back to the core of the problem - how do I store the number of bytes
that were sent/received on the wire into the request_rec structure in order to
log that? After discussion with Justin, it seemed that connection based filters
were the place to do it because request based filters either don't see the input
headers or they don't take into account things like SSL. I mean, logging
something like that should be a fairly trivial exercise and yet it seems to be
creating major dramas...

Bojan

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

Re: [PATCH]: Update request in connection based filters

Posted by rb...@apache.org.
On Wed, 18 Sep 2002, Bojan Smojver wrote:

> I understand. But isn't the ap_read_request HTTP specific given it's creating
> the request and all? In other words, if protocol is HTTP, we let connection
> filter know about the request. If not, we don't. That to me sounds exactly the
> same as your suggestion.
> 
> I must be missing something...

The problem is that the filters aren't HTTP specific.  If you make this
change, then the filters will be written to take advantage of the f->r
field in connection level filters.  Since that field doesn't make sense in
non-HTTP requests, modifying the filter_rec so that it is there is a
mistake.

Think about it this way.  If you add the request_rec to the
core_output_filter, and that filter is modified to use it, then all
protocol modules must ensure that the request_rec is added to the filter,
even if it doesn't make sense.

If you make it optional to put the request_rec in the filter_rec, then you
will be making it much harder to write filters.

Ryan


> 
> Bojan
> 
> Quoting rbb@apache.org:
> 
> > You don't want to do this.  A connection based filter is connection
> > oriented.  By definition, it has no concept of a request.  While this
> > might make sense for HTTP, other protocol modules will have a much harder
> > time with this change.
> > 
> > Ryan
> 
> -------------------------------------------------
> This mail sent through IMP: http://horde.org/imp/
> 

-- 

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [PATCH]: Update request in connection based filters

Posted by Bojan Smojver <bo...@rexursive.com>.
I understand. But isn't the ap_read_request HTTP specific given it's creating
the request and all? In other words, if protocol is HTTP, we let connection
filter know about the request. If not, we don't. That to me sounds exactly the
same as your suggestion.

I must be missing something...

Bojan

Quoting rbb@apache.org:

> You don't want to do this.  A connection based filter is connection
> oriented.  By definition, it has no concept of a request.  While this
> might make sense for HTTP, other protocol modules will have a much harder
> time with this change.
> 
> Ryan

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

Re: [PATCH]: Update request in connection based filters

Posted by rb...@apache.org.
You don't want to do this.  A connection based filter is connection
oriented.  By definition, it has no concept of a request.  While this
might make sense for HTTP, other protocol modules will have a much harder
time with this change.

Ryan

On 18 Sep 2002, Bojan Smojver wrote:

> Justin,
> 
> After mucking around a bit with what you suggested, this seemed like the
> straightforward thing to do. Not sure how things would work on internal
> redirects etc. Can you please have a look. I'm sure it's naive, but so
> am I (for now :-)
> 
> Bojan
> 

-- 

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------