You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by David Burry <db...@tagnet.org> on 2002/10/11 01:04:41 UTC

apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

The documentation for Apache 2.0.43 for mod_log_config states:

%...b: Bytes sent, excluding HTTP headers. In CLF format i.e. a '-' rather than a 0 when no bytes are sent.

However, in testing I clearly see it's logging the number of bytes _requested_ (that is, that apache intended to send)!!! not the actual bytes _sent_!!!  If a user presses the cancel button on their browser or they're cut off in the middle, this number is not accurate at all, because it appears the entire file was sent when it was not.

We're running a site that serves many large files (dozen megs or so typically) for download.  It depends on this "bytes sent" number for statistics and monitoring to see if and when a download has completed including with a "206" byte range response... Typical throughput is 600 mbit/sec, 3 terabytes/day, running on Solaris 8 on 4 Sun E280R's with 4 gig of ram each...  We're seriously considering rolling back to old hardware with Apache 1.3.x (which seems to log actual bytes sent by the way) and CacheFlow machines because of this issue...  Is there a patch out for this problem instead? or is someone already working on this problem?  or does anyone have an idea where the root of this problem is and I might take a stab at patching myself?

As a side note, we tried but we were unable to use mod_mem_cache on this setup we suspect due to mutex issues, might be possible if we spread this out over more machines (but what's the point of memory caching if you have to do that), mod_file_cache works ok though for a static list of the most-often-used files, though mmap does have its limitations on how large of a file can be stuck into memory...

We have sendfile installed and I configured with the following:

./configure --prefix=/usr/local/apache_2.0.43 --enable-usertrack --enable-file-cache --enable-cache --enable-mem-cache --enable-static-support --disable-cgid --disable-cgi --enable-rewrite --with-mpm=worker --disable-userdir

Dave


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting David Burry <db...@tagnet.org>:

> If we're not going to fix %s shouldn't we at least fix the documentation to
> be more accurate?  2.0 and 1.3 really are quite different here.

I kind of remember that someone was working on this (i.e. on fixing it) a while
ago, but I can't remember the details... Or maybe it was the other way around -
maybe the Apache 1.3 behaviour was pronounced wrong. Not sure any more.

Bojan

Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by "William A. Rowe, Jr." <wr...@apache.org>.
At 10:21 PM 10/11/2002, rbb@apache.org wrote:

>What we are learning here is simple.  We need to do the counting in the
>core_output_filter.  If that means adding a field to the conn_rec, or
>somehow getting the request_rec in the core_output_filter doesn't
>matter.  The count needs to be done in the core_output_filter, by tallying
>the amount of data actually written.

How?  The request is destroyed already.  The remaining body is sitting
on the connection scope waiting to flush out.  The model is borked.

A change proposal for 2.1 that I hinted to earlier.  Three scopes, not two...

connection
  request
    handler

(or call them
connection
  response
    request
whichever we prefer.)

The middle tier can live out the body, even as we begin a new
request handler for another pipelined request.  The hard work of
creating the response has already fallen out of scope, so the
impact to the memory footprint shouldn't be all that great.

Bill



Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
On Sat, 2002-10-12 at 20:19, Bojan Smojver wrote:

> r->bytes_sent =
>   (total bytes sent) - ( (total size of brigades) - (content length) )

Actually, I think this maths wouldn't work for SSL because content
length is calculated before it. Hmm...

Bojan


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
On Sat, 2002-10-12 at 13:21, rbb@apache.org wrote:

> What we are learning here is simple.  We need to do the counting in the
> core_output_filter.  If that means adding a field to the conn_rec, or
> somehow getting the request_rec in the core_output_filter doesn't
> matter.  The count needs to be done in the core_output_filter, by tallying
> the amount of data actually written.

On a more practical note...

Looks like sendfile_it_all will have to change a bit (lucky it's not a
public function). As it is, it doesn't seem to return the length
actually sent (i.e. &unused_bytes_sent would have to be passed into it).
Otherwise, we should be able to add up the unused_bytes_sent (and change
the name of the variable at the same time :-) to get total bytes sent.

Now, this would give us the total bytes sent but not in the sense of
r->bytes_sent. However, it would be relatively easy to figure it out:

r->bytes_sent =
  (total bytes sent) - ( (total size of brigades) - (content length) )

Total size of brigades is what mod_logio output filter does now, so
that's easy. Content length is what what content_length_filter does now,
that's easy too. The difference between them is obviously the length of
the headers. If the end result (r->bytes_sent) is negative, we just toss
it and keep r->bytes_sent = 0.

Obviously, the functionality, or the filters themselves, from mod_logio
could move into core.c and mod_log_config would make sure the counters
are reset when the transaction ends.

Bojan


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
Funny enough, there is a variable called unused_bytes_sent in that
function - kind of makes it obvious it's not being used ;-)

I thought that making f->r non-NULL was rejected due to complications
with other protocols that don't understand requests? Anyway, we don't
really need to store anything in request, we just need to retrieve from
connection, log it and then reset the counters.

Instead of introducing more fields to conn_rec (to prevent another MMN
bump), why don't we use the same thing you suggested for mod_logio,
which is putting the counting structure of core_output_filter into
f->c->conn_config? This structure would not be a mod_logio option, but
rather always there. mod_logio would then just use it to log. Or, even
better, all the logging would be done through mod_log_config, which
would make mod_logio redundant. By the same token, the input counting
could also go into core_input_filter...

Bojan

On Sat, 2002-10-12 at 13:21, rbb@apache.org wrote:
> 
> What we are learning here is simple.  We need to do the counting in the
> core_output_filter.  If that means adding a field to the conn_rec, or
> somehow getting the request_rec in the core_output_filter doesn't
> matter.  The count needs to be done in the core_output_filter, by tallying
> the amount of data actually written.
> 
> Ryan


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Brian Pane <br...@cnet.com>.
rbb@apache.org wrote:

>What we are learning here is simple.  We need to do the counting in the
>core_output_filter.  If that means adding a field to the conn_rec, or
>somehow getting the request_rec in the core_output_filter doesn't
>matter.  The count needs to be done in the core_output_filter, by tallying
>the amount of data actually written.
>  
>

Yes, exactly.  The only problem is that this will require
a design change.  Currently, the last write can complete
after the request_rec no longer exists (if the connection
is a keepalive).

Solving this is tricky.  One possibility is to transfer
ownership of the the request (including its pool) to
the core output filter as soon as we pass an EOF to that
filter.  That would complicate the memory management a
bit, but it might be worth the effort because it would
give us almost all the infrastructure needed to support
a semi-async MPM.

Brian



Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by rb...@apache.org.
What we are learning here is simple.  We need to do the counting in the
core_output_filter.  If that means adding a field to the conn_rec, or
somehow getting the request_rec in the core_output_filter doesn't
matter.  The count needs to be done in the core_output_filter, by tallying
the amount of data actually written.

Ryan

On 12 Oct 2002, Bojan Smojver wrote:

> Nope, that doesn't work. The number is always zero.
> 
> Bojan
> 
> On Sat, 2002-10-12 at 12:22, Bojan Smojver wrote:
> 
> > Anyway, I think what's causing this problem is the fact that mod_logio
> > calculates the length of all brigades that are ready to be sent out. If
> > the sending gets interrupted in the middle, the whole thing gets
> > reported, instead of just the buckets/brigades that were actually sent
> > out. Maybe I should use (AP_FTYPE_NETWORK + 1) instead of
> > (AP_FTYPE_NETWORK - 1) for the output filter (i.e. count after the
> > sending, not before). I'll give it a try...
> 

-- 

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
Nope, that doesn't work. The number is always zero.

Bojan

On Sat, 2002-10-12 at 12:22, Bojan Smojver wrote:

> Anyway, I think what's causing this problem is the fact that mod_logio
> calculates the length of all brigades that are ready to be sent out. If
> the sending gets interrupted in the middle, the whole thing gets
> reported, instead of just the buckets/brigades that were actually sent
> out. Maybe I should use (AP_FTYPE_NETWORK + 1) instead of
> (AP_FTYPE_NETWORK - 1) for the output filter (i.e. count after the
> sending, not before). I'll give it a try...


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by rb...@apache.org.
On 12 Oct 2002, Bojan Smojver wrote:

> On Fri, 2002-10-11 at 18:58, David Burry wrote:
> 
> > This should also be a concern for anyone who's using mod_logio to charge for
> > bandwidth, because customers should be concerned about some serious
> > overcharging going on here!
> 
> Only if you charge for outgoing bandwidth. On incoming bandwidth, I
> think it should be pretty accurate because it won't count what wasn't
> received.
> 
> Anyway, I think what's causing this problem is the fact that mod_logio
> calculates the length of all brigades that are ready to be sent out. If
> the sending gets interrupted in the middle, the whole thing gets
> reported, instead of just the buckets/brigades that were actually sent
> out. Maybe I should use (AP_FTYPE_NETWORK + 1) instead of
> (AP_FTYPE_NETWORK - 1) for the output filter (i.e. count after the
> sending, not before). I'll give it a try...

That won't work.  The core_filter will return without calling your
filter.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
On Fri, 2002-10-11 at 18:58, David Burry wrote:

> This should also be a concern for anyone who's using mod_logio to charge for
> bandwidth, because customers should be concerned about some serious
> overcharging going on here!

Only if you charge for outgoing bandwidth. On incoming bandwidth, I
think it should be pretty accurate because it won't count what wasn't
received.

Anyway, I think what's causing this problem is the fact that mod_logio
calculates the length of all brigades that are ready to be sent out. If
the sending gets interrupted in the middle, the whole thing gets
reported, instead of just the buckets/brigades that were actually sent
out. Maybe I should use (AP_FTYPE_NETWORK + 1) instead of
(AP_FTYPE_NETWORK - 1) for the output filter (i.e. count after the
sending, not before). I'll give it a try...

Bojan


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by David Burry <db...@tagnet.org>.
ok.... serious problems still...  I've sucessfully installed mod_logio
(finally), and %O is STILL not logging the actual bytes transferred, but the
bytes apache is INTENDING to eventually transfer... if I stop it in the
middle by canceling or disconnecting, this number is horribly too big!!!!
Of course it's a couple bytes larger than %b because of the headers
included, but this is still totally useless for the purpose of figuring out
if/when/whether the whole file was actually downloaded.  Anyone have any
ideas why this is so and how to fix it?  Is the logging happening completely
before the request is finished, could that be the problem?

This should also be a concern for anyone who's using mod_logio to charge for
bandwidth, because customers should be concerned about some serious
overcharging going on here!

I had to install libtool 1.4 and m4 1.4 and this was my configure line:
./configure --prefix=/usr/local/apache_2.0.43+logio --enable-logio --enable-
usertrack --enable-file-cache --enable-cache --enable-mem-cache --enable-sta
tic-support --disable-cgid --disable-cgi --enable-rewrite --with-mpm=worker 
--disable-userdir


Dave

----- Original Message -----
From: "David Burry" <db...@tagnet.org>
> Yes, I've been thinking of experimenting with mod_logio, but I'm a bit
> hesitant to hack out a patch from (or use whole-hog) CVS HEAD into a
> production site that gets 3 terabytes of traffic per day and I'm
embarassed
> to admit how much revenue depends on... ;o)  Thanks for the link, I'll try
> that.  It won't be as accurate as getting the byte count without the
> headers, but at least it should be something better than nothing if it
works
> as described...
>
> If we're not going to fix %s shouldn't we at least fix the documentation
to
> be more accurate?  2.0 and 1.3 really are quite different here.
>
> Dave
>
> ----- Original Message -----
> From: "Bojan Smojver" <bo...@rexursive.com>
> > Have you tried using mod_logio? If won't only give you the "body" bytes,
> but
> > also the "headers" bytes. It reports the number of input bytes too and
> should
> > understand encryption and compression. You can either check it out from
> Apache
> > CVS (HEAD), or download the patch for 2.0.43 here:
> > http://www.rexursive.com/software.html.
> >
> > You'd use it with %I (for input) and %O (for output). It would be
> interesting to
> > know if it reports accurately in the case you described...
>


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by David Burry <db...@tagnet.org>.
Yes, I've been thinking of experimenting with mod_logio, but I'm a bit
hesitant to hack out a patch from (or use whole-hog) CVS HEAD into a
production site that gets 3 terabytes of traffic per day and I'm embarassed
to admit how much revenue depends on... ;o)  Thanks for the link, I'll try
that.  It won't be as accurate as getting the byte count without the
headers, but at least it should be something better than nothing if it works
as described...

If we're not going to fix %s shouldn't we at least fix the documentation to
be more accurate?  2.0 and 1.3 really are quite different here.

Dave

----- Original Message -----
From: "Bojan Smojver" <bo...@rexursive.com>
> Have you tried using mod_logio? If won't only give you the "body" bytes,
but
> also the "headers" bytes. It reports the number of input bytes too and
should
> understand encryption and compression. You can either check it out from
Apache
> CVS (HEAD), or download the patch for 2.0.43 here:
> http://www.rexursive.com/software.html.
>
> You'd use it with %I (for input) and %O (for output). It would be
interesting to
> know if it reports accurately in the case you described...


Re: apache 2.0.43: %b not showing "bytes sent" but "bytes requested"

Posted by Bojan Smojver <bo...@rexursive.com>.
Quoting David Burry <db...@tagnet.org>:

> However, in testing I clearly see it's logging the number of bytes
> _requested_ (that is, that apache intended to send)!!! not the actual bytes
> _sent_!!!  If a user presses the cancel button on their browser or they're
> cut off in the middle, this number is not accurate at all, because it appears
> the entire file was sent when it was not.

Have you tried using mod_logio? If won't only give you the "body" bytes, but
also the "headers" bytes. It reports the number of input bytes too and should
understand encryption and compression. You can either check it out from Apache
CVS (HEAD), or download the patch for 2.0.43 here:
http://www.rexursive.com/software.html.

You'd use it with %I (for input) and %O (for output). It would be interesting to
know if it reports accurately in the case you described...

Bojan