You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Brian Pane <bp...@pacbell.net> on 2001/09/10 08:51:16 UTC

[PATCH 2] Re: [PATCH] performance patch for mod_log_config

Cliff Woolley wrote:

>On Sun, 9 Sep 2001, Brian Pane wrote:
>
>>I think putting it in APR would work.  The one limitation I can think of
>>is that adding the cache in apr_explode_localtime() itself wouldn't be a
>>win because we'd have to add the overhead of a gettimeofday() call to
>>check whether the supplied time was indeed current (and thus susceptible
>>to caching).
>>
>
>It took me a minute to figure out why this was a problem, but you're
>right.  I don't really like the idea of combining time_now and
>explode_localtime, though.  What about some kind of deal like
>apr_explode_recent_localtime() which is the same as
>apr_explode_localtime() except that the input must never be older than
>TIME_CACHE_SIZE seconds?  Or is that too httpd-tailored?  If so, no sweat,
>make it an ap_ function instead of an apr_ function.
>
I think it's relatively httpd-specific, so I used 
ap_explode_recent_localtime().
Here's the revised patch, plus the new files include/util_time.h and 
server/util_time.c

--Brian

Index: modules/loggers/mod_log_config.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/modules/loggers/mod_log_config.c,v
retrieving revision 1.68
diff -u -r1.68 mod_log_config.c
--- modules/loggers/mod_log_config.c    2001/08/27 20:50:01    1.68
+++ modules/loggers/mod_log_config.c    2001/09/10 06:42:02
@@ -463,9 +463,9 @@
     a problem with this, you can set the define.  -djg
     */
 #ifdef I_INSIST_ON_EXTRA_CYCLES_FOR_CLF_COMPLIANCE
-    apr_explode_localtime(&xt, apr_time_now());
+    ap_explode_recent_localtime(&xt, apr_time_now());
 #else
-    apr_explode_localtime(&xt, r->request_time);
+    ap_explode_recent_localtime(&xt, r->request_time);
 #endif
     if (a && *a) {              /* Custom format */
         apr_strftime(tstr, &retcode, MAX_STRING_LEN, a, &xt);

Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config

Posted by Ryan Bloom <rb...@covalent.net>.

On Sunday 09 September 2001 23:51, Brian Pane wrote:
> Cliff Woolley wrote:
> >On Sun, 9 Sep 2001, Brian Pane wrote:
> >>I think putting it in APR would work.  The one limitation I can think of
> >>is that adding the cache in apr_explode_localtime() itself wouldn't be a
> >>win because we'd have to add the overhead of a gettimeofday() call to
> >>check whether the supplied time was indeed current (and thus susceptible
> >>to caching).
> >
> >It took me a minute to figure out why this was a problem, but you're
> >right.  I don't really like the idea of combining time_now and
> >explode_localtime, though.  What about some kind of deal like
> >apr_explode_recent_localtime() which is the same as
> >apr_explode_localtime() except that the input must never be older than
> >TIME_CACHE_SIZE seconds?  Or is that too httpd-tailored?  If so, no sweat,
> >make it an ap_ function instead of an apr_ function.
>
> I think it's relatively httpd-specific, so I used
> ap_explode_recent_localtime().
> Here's the revised patch, plus the new files include/util_time.h and
> server/util_time.c

+1

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config

Posted by Brian Pane <bp...@pacbell.net>.

dean gaudet wrote:

[...]

>but still, does this really work?  'cause the time used is
>r->request_time, if a request takes longer than 14 seconds or so to
>generate/serve then the logger is going to be asking for timestamps which
>are dangerously close to your cache rollover.   and since you can't
>update a cache entry atomically it's possible they'll see a partially
>updated entry, and get bogus results.
>
You're right; a request that takes almost exactly 15 seconds is
vulnerable to corrupt data.  I have a fix, though, that doesn't
require either locking or a gettimeofday() call.  I'll post an
updated patch once I finish testing it.

--Brian

Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config

Posted by dean gaudet <de...@arctic.org>.

On Mon, 10 Sep 2001, Brian Pane wrote:

> dean gaudet wrote:
>
> > why is there a need for 15 entries?  if it's a multiprocess server then
> > there's only a need for 1 or 2 entries.  if it's a multithreaded server
> > then you need to lock the cache (which you're not doing :)
>
> The whole point of the design is to do the multithreaded cache
> without locks.

oh i see.  i think you should carefully comment this then, 'cause this
is really subtle... the fact that you copy into the cache from distinct
apr_exploded_time_t * is what makes it safe.  (i.e. take a look at the
win32 time.c code and notice there's a few ++ operations which would
not be multithreaded safe.)

but still, does this really work?  'cause the time used is
r->request_time, if a request takes longer than 14 seconds or so to
generate/serve then the logger is going to be asking for timestamps which
are dangerously close to your cache rollover.   and since you can't
update a cache entry atomically it's possible they'll see a partially
updated entry, and get bogus results.

> > isn't the real win in eliminating both the divisions required by the
> > explode time functions and the divisions required by the printf?
>
> That's what I used to think, but the profile data showed that localtime()
> (which does most of the work in apr_explode_localtime()) is an order of
> magnitude more expensive than the apr_snprintf() that formats the resulting
> data.  Caching or otherwise optimizing the printf in the future wouldn't
> hurt, but it's a more marginal win than caching the exploded time struct.
>
> > also -- you could eliminate the seconds = t / APR_USECS_PER_SEC
> > calculation (death to all divisions!  especially the 64-bit ones)...
> > because this division already happens inside the explode function:
> >
> > when you miss the cache and do a full explode, then subtract the
> > resulting
> > tm_usec off the input time and you've got an apr_time_t which lands
> > exactly on a second boundary.  use that as your cache key, and you can
> > then do ranged comparisons (t >= cache->t && t < cache->t +
> > APR_USECS_PER_SEC).
>
> The one problem I can see with this is that removing the division means
> that you have to scan through the table to find out whether you have a
> cache hit or not--which probably negates the performance benefit of
> removing the division.

hehe, try this:

% cat >t.c <<EOF
long long r(long long t)
{
	return t / 1000000LL;
}

int main(int argc, char **argv)
{
	return 0;
}
EOF
% gcc -O2 -g t.c
% gdb a.out
(gdb) disass r
(gdb) disass __divdi3

(__divdi3 is what you should see a call to in r ... assuming you're on
a 32-bit box.)  64-bit division is a really big out of line subroutine :)

that's why strength reductions like i was suggesting are a good idea.

i hadn't looked at your code close enough to see that you didn't already
scan the cache ... cache scanning isn't SMP safe either...

btw, TIME_CACHE_SIZE should be a power of 2 so that the cache hash can
be strength reduced to a binary-and :)

-dean

Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config

Posted by Brian Pane <bp...@pacbell.net>.

dean gaudet wrote:

> why is there a need for 15 entries?  if it's a multiprocess server then
> there's only a need for 1 or 2 entries.  if it's a multithreaded server
> then you need to lock the cache (which you're not doing :) 

The whole point of the design is to do the multithreaded cache
without locks.  Instead of trying to prevent a race condition,
the cache is designed to make the race condition safe: when two
readers or a reader and a writer collide, the threads end up
computing the exploded time redundantly.  The race condition
that's not safe, though, is when one thread tries to explode
a time_t representing 'now - num_entries' seconds ago at the
same time that another thread is updating that same cache
element with the exploded value for the current time (the two
timestamps map to the same element in the ring buffer).  The
number of entries in the ring buffer thus determines how old
an input timestamp the cache can support and still provide
deterministic results: I used the 15-second value to provide
a large safety factor.

> isn't the real win in eliminating both the divisions required by the
> explode time functions and the divisions required by the printf?

That's what I used to think, but the profile data showed that localtime()
(which does most of the work in apr_explode_localtime()) is an order of
magnitude more expensive than the apr_snprintf() that formats the resulting
data.  Caching or otherwise optimizing the printf in the future wouldn't
hurt, but it's a more marginal win than caching the exploded time struct.

>   fwiw,
> tux does binary logging, and has an external tool to convert it to CLF...
> and the spec committee accepted that.  and tux also optimises the rfc822
> Date header by updating a cached copy once per second 

I was thinking doing something like that.  One of the alternate designs
that I considered would have had a thread running in the background would
update the time once a second, but I ended up picking the cache-on-demand
solution because it's more portable to threadless MPMs.  (I also 
contemplated
having the accept loop in the worker MPM wake up once a second to update the
cached time, but that was just too ugly. :-)

> also -- you could eliminate the seconds = t / APR_USECS_PER_SEC
> calculation (death to all divisions!  especially the 64-bit ones)...
> because this division already happens inside the explode function:
>
> when you miss the cache and do a full explode, then subtract the 
> resulting
> tm_usec off the input time and you've got an apr_time_t which lands
> exactly on a second boundary.  use that as your cache key, and you can
> then do ranged comparisons (t >= cache->t && t < cache->t +
> APR_USECS_PER_SEC). 

The one problem I can see with this is that removing the division means
that you have to scan through the table to find out whether you have a
cache hit or not--which probably negates the performance benefit of
removing the division.

--Brian

Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config

Posted by dean gaudet <de...@arctic.org>.

why is there a need for 15 entries?  if it's a multiprocess server then
there's only a need for 1 or 2 entries.  if it's a multithreaded server
then you need to lock the cache (which you're not doing :)

isn't the real win in eliminating both the divisions required by the
explode time functions and the divisions required by the printf?  fwiw,
tux does binary logging, and has an external tool to convert it to CLF...
and the spec committee accepted that.  and tux also optimises the rfc822
Date header by updating a cached copy once per second

also -- you could eliminate the seconds = t / APR_USECS_PER_SEC
calculation (death to all divisions!  especially the 64-bit ones)...
because this division already happens inside the explode function:

when you miss the cache and do a full explode, then subtract the resulting
tm_usec off the input time and you've got an apr_time_t which lands
exactly on a second boundary.  use that as your cache key, and you can
then do ranged comparisons (t >= cache->t && t < cache->t +
APR_USECS_PER_SEC).

-dean