You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@apr.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2002/07/11 07:33:51 UTC

[PATCH 2] example binary BUSEC patch for benchmarking only

At 10:38 PM 7/10/2002, William A. Rowe, Jr. wrote:
>At 10:03 PM 7/10/2002, Brian Pane wrote:
>>Bill Stoddard wrote:
>>
>>>I've not looked at the generated code, but profiling indicates that an
>>>additional division is happening, adding an extra 231 instructions.
>>>(xlc_r -O2)
>>
>>If you redefine the macro as a shift, does the profile look better?

Ok, attached is the code redone as binary math.  I'm tired, could be
any number of major blunders in it, but on first pass, it looked right.

Bill

Re: [PATCH 3] example binary BUSEC for discussion

Posted by Brian Pane <br...@cnet.com>.

William A. Rowe, Jr. wrote:

> Just as a trivial example of using apr_time_sec_get... it helps us
> avoid casting over in the new poll/unix/poll.c line 164, where
>
>         tv.tv_sec = apr_time_sec_get(timeout);
>
> avoids a compiler emit that we are downcasting an in64 to a long. 

What's your strategy for updating the macro when we
someday start supporting 64-bit seconds in the apr_time
code?  Change the macro definition to cast to apr_uint64_t
instead of apr_uint32_t, or leave the 32-bit cast in place?
I have reservations about the latter, as it would create a
hidden Y2038 problem in code like your example above.

--Brian

Re: [PATCH 3] example binary BUSEC for discussion

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

A simple example of the disparity below...

At 02:31 PM 7/11/2002, you wrote:
>At 02:13 PM 7/11/2002, Brian Pane wrote:
>>William A. Rowe, Jr. wrote:
>>
>>>apr_int32_t apr_time_sec_get(time)  deprecates apr_time_sec(time)
>...
>>>apr_int64_t apr_time_as_sec(time)
>>
>>The naming convention is way too subtle:
>>   apr_time_[unit]_get     means 32-bit
>>   apr_time_as_[unit]      means 64-bit
>
>WRONG! apr_time_msec_get(time) gets the fractional component
>of time... e.g. 32500.15 seconds would return 150 msecs..
>
>apr_time_as_msec returns the entire time value!  So we end up with
>a time of 32,500,150 in msec's.  That's why these are int64.
>
>>How often do you anticipate that people will be using the
>>32-bit versions?  If it's not very often, then I'd rather
>>just supply the 64-bit versions in the API, because those
>>are the safe ones (no Y2038 problem).  People who really
>>need to do 32-bit math for performance reasons would have
>>to cast explicitly, but IMHO that's good because it forces
>>them to think more carefully about whether the loss of
>>precision is safe.
>
>The 32bit flavor of sec, vs. a 64 bit flavor is a simple way of
>avoiding casts in the user's code.  I'm only asking for the number
>of seconds in an apr_time_t in order to set a time, calculate a
>difference, etc.  But I don't care too much, one way or the other.

Just as a trivial example of using apr_time_sec_get... it helps us
avoid casting over in the new poll/unix/poll.c line 164, where

         tv.tv_sec = apr_time_sec_get(timeout);

avoids a compiler emit that we are downcasting an in64 to a long.

>apr_time_as_sec gives me a HUGE number of seconds to work
>with as an absolute time from 1/1/70.  That must remain 64bit.

Re: [PATCH 3] example binary BUSEC for discussion

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 02:13 PM 7/11/2002, Brian Pane wrote:
>William A. Rowe, Jr. wrote:
>
>>apr_int32_t apr_time_sec_get(time)  deprecates apr_time_sec(time)
>>apr_int32_t apr_time_msec_get(time) deprecates apr_time_msec(time)
>>apr_int32_t apr_time_usec_get(time) deprecates apr_time_usec(time)
>>apr_int32_t apr_time_nsec_get(time) deprecates apr_time_nsec(time)
>
>Oh no, I just recently finished changing all the code to use
>those now-deprecated macros. :-)

I know... but there are two different 'things'...

>>apr_int64_t apr_time_as_sec(time)
>>apr_int64_t apr_time_as_msec(time) or apr_time_as_272yrs_msec(time)
>>apr_int64_t apr_time_as_usec(time) or apr_time_as_3mos_usec(time)
>>apr_int64_t apr_time_as_nsec(time) or apr_time_as_2hrs_nsec(time)
>
>The naming convention is way too subtle:
>   apr_time_[unit]_get     means 32-bit
>   apr_time_as_[unit]      means 64-bit

WRONG! apr_time_msec_get(time) gets the fractional component
of time... e.g. 32500.15 seconds would return 150 msecs..

apr_time_as_msec returns the entire time value!  So we end up with
a time of 32,500,150 in msec's.  That's why these are int64.

>How often do you anticipate that people will be using the
>32-bit versions?  If it's not very often, then I'd rather
>just supply the 64-bit versions in the API, because those
>are the safe ones (no Y2038 problem).  People who really
>need to do 32-bit math for performance reasons would have
>to cast explicitly, but IMHO that's good because it forces
>them to think more carefully about whether the loss of
>precision is safe.

The 32bit flavor of sec, vs. a 64 bit flavor is a simple way of
avoiding casts in the user's code.  I'm only asking for the number
of seconds in an apr_time_t in order to set a time, calculate a
difference, etc.  But I don't care too much, one way or the other.

apr_time_as_sec gives me a HUGE number of seconds to work
with as an absolute time from 1/1/70.  That must remain 64bit.

Bill

Re: [PATCH 3] example binary BUSEC for discussion

Posted by Brian Pane <br...@apache.org>.

William A. Rowe, Jr. wrote:

> At 11:31 AM 7/11/2002, Brian Pane wrote:
>
>> I don't see a way to eliminate the "/ 1000" to convert usec to
>> msec.  But we may be able to do all the math in 32-bit mode, by
>> limiting the maximum timeout to the number of milliseconds that
>> will fit in 32 bits, which works out to a max timeout of about
>> 50 days.
>
>
> Actually, here is my current math, with optimized forms for specific
> situations we encounter often (aprox. limits are given in the macro 
> name).
>
> They break down to;
>
> apr_int32_t apr_time_sec_get(time)  deprecates apr_time_sec(time)
> apr_int32_t apr_time_msec_get(time) deprecates apr_time_msec(time)
> apr_int32_t apr_time_usec_get(time) deprecates apr_time_usec(time)
> apr_int32_t apr_time_nsec_get(time) deprecates apr_time_nsec(time)

Oh no, I just recently finished changing all the code to use
those now-deprecated macros. :-)

> apr_int64_t apr_time_as_sec(time)
> apr_int64_t apr_time_as_msec(time) or apr_time_as_272yrs_msec(time)
> apr_int64_t apr_time_as_usec(time) or apr_time_as_3mos_usec(time)
> apr_int64_t apr_time_as_nsec(time) or apr_time_as_2hrs_nsec(time) 

The naming convention is way too subtle:
   apr_time_[unit]_get     means 32-bit
   apr_time_as_[unit]      means 64-bit

How often do you anticipate that people will be using the
32-bit versions?  If it's not very often, then I'd rather
just supply the 64-bit versions in the API, because those
are the safe ones (no Y2038 problem).  People who really
need to do 32-bit math for performance reasons would have
to cast explicitly, but IMHO that's good because it forces
them to think more carefully about whether the loss of
precision is safe.

--Brian

Re: [PATCH 3] example binary BUSEC for discussion

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 02:49 PM 07/11/2002, William A. Rowe, Jr. wrote:
>The code follows, patch attached is hard to follow.  [Like my code 
>isn't :o]
>
>/* Number of binary microseconds per second is (2^20)
>  * to keep the math fast and simple [binary arithmetic.]
>  */
>#define APR_BUSEC_BITS 20
>#define APR_BUSEC_PER_SEC APR_TIME_C(2 ^ APR_BUSEC_BITS)

Thanks, but (2 ^ APR_BUSEC_BITS) is 22 (^ is XOR).  You want 1 << 
APR_BUSEC_BITS.

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"

Re: [PATCH 3] example binary BUSEC for discussion

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

Guys, everyone here admits we are lousy at inventing names...

all suggestions for better names of these macros are welcome!

At 01:49 PM 7/11/2002, William A. Rowe, Jr. wrote:
Actually, here is my current math, with optimized forms for specific
>situations we encounter often (aprox. limits are given in the macro name).
>
>They break down to;
>
>apr_int32_t apr_time_sec_get(time)  deprecates apr_time_sec(time)
>apr_int32_t apr_time_msec_get(time) deprecates apr_time_msec(time)
>apr_int32_t apr_time_usec_get(time) deprecates apr_time_usec(time)
>apr_int32_t apr_time_nsec_get(time) deprecates apr_time_nsec(time)
>
>apr_int64_t apr_time_as_sec(time)
>apr_int64_t apr_time_as_msec(time) or apr_time_as_272yrs_msec(time)
>apr_int64_t apr_time_as_usec(time) or apr_time_as_3mos_usec(time)
>apr_int64_t apr_time_as_nsec(time) or apr_time_as_2hrs_nsec(time)
>
>apr_time_t apr_time_mmake(sec, msec)
>apr_time_t apr_time_umake(sec, usec)
>apr_time_t apr_time_nmake(sec, nsec)
>
>apr_time_t apr_time_from_sec(sec)
>apr_time_t apr_time_from_msec(msec) or apr_time_from_3mos_msec(msec)
>apr_time_t apr_time_from_usec(usec) or apr_time_from_3mos_usec(usec)
>apr_time_t apr_time_from_nsec(nsec) or apr_time_from_3mos_nsec(nsec)

[PATCH 3] example binary BUSEC for discussion

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 11:31 AM 7/11/2002, Brian Pane wrote:
>I don't see a way to eliminate the "/ 1000" to convert usec to
>msec.  But we may be able to do all the math in 32-bit mode, by
>limiting the maximum timeout to the number of milliseconds that
>will fit in 32 bits, which works out to a max timeout of about
>50 days.

Actually, here is my current math, with optimized forms for specific
situations we encounter often (aprox. limits are given in the macro name).

They break down to;

apr_int32_t apr_time_sec_get(time)  deprecates apr_time_sec(time)
apr_int32_t apr_time_msec_get(time) deprecates apr_time_msec(time)
apr_int32_t apr_time_usec_get(time) deprecates apr_time_usec(time)
apr_int32_t apr_time_nsec_get(time) deprecates apr_time_nsec(time)

apr_int64_t apr_time_as_sec(time)
apr_int64_t apr_time_as_msec(time) or apr_time_as_272yrs_msec(time)
apr_int64_t apr_time_as_usec(time) or apr_time_as_3mos_usec(time)
apr_int64_t apr_time_as_nsec(time) or apr_time_as_2hrs_nsec(time)

apr_time_t apr_time_mmake(sec, msec)
apr_time_t apr_time_umake(sec, usec)
apr_time_t apr_time_nmake(sec, nsec)

apr_time_t apr_time_from_sec(sec)
apr_time_t apr_time_from_msec(msec) or apr_time_from_3mos_msec(msec)
apr_time_t apr_time_from_usec(usec) or apr_time_from_3mos_usec(usec)
apr_time_t apr_time_from_nsec(nsec) or apr_time_from_3mos_nsec(nsec)

The code follows, patch attached is hard to follow.  [Like my code isn't :o]

/* Number of binary microseconds per second is (2^20)
  * to keep the math fast and simple [binary arithmetic.]
  */
#define APR_BUSEC_BITS 20
#define APR_BUSEC_PER_SEC APR_TIME_C(2 ^ APR_BUSEC_BITS)

/* apr_time_sec_get returns 32 bits and is subject to the Y2038 bug
  * @see apr_time_as_sec for 64 bit resolution
  */
#define apr_time_sec_get(time) ((apr_int32_t)((time) >> APR_BUSEC_BITS))

/* These functions 'get' the xsec fractional component of the time.
  *
  * msec extraction won't overflow 32 bits so convert early,
  * while usec/nsec computations must be in all 64 bit math.
  */
#define apr_time_msec_get(time) \
     (((apr_int32_t)((time) & (APR_BUSEC_PER_SEC - 1)) * 1000) \
                                                    >> APR_BUSEC_BITS)

#define apr_time_usec_get(time) \
     ((apr_int32_t)((((time) & (APR_BUSEC_PER_SEC - 1)) \
                             * APR_TIME_C(1000000)) >> APR_BUSEC_BITS))

#define apr_time_nsec_get(time) \
     ((apr_int32_t)((((time) & (APR_BUSEC_PER_SEC - 1)) \
                             * APR_TIME_C(1000000000)) >> APR_BUSEC_BITS))


/* apr_time_as_sec returns 64 bits and escapes the Y2038 bug
  * @see apr_time_sec_get for 32 bit resolution
  */
#define apr_time_as_sec(time) ((apr_int64_t)((time) >> APR_BUSEC_BITS))

/* These functions return the given time value, complete,
  * in the given resolution (not simply the fractional component)
  *
  * They carry no resolution penalty, and due to the binary math,
  * they suffer little performance penalty either.
  */
#define apr_time_as_msec(time) \
     (((apr_int64_t)(time) >> APR_BUSEC_BITS) \
      + ((((apr_int32_t)(time) & (APR_BUSEC_PER_SEC - 1)) \
          * 1000) >> APR_BUSEC_BITS))

#define apr_time_as_usec(time) \
     (((apr_int64_t)(time) >> APR_BUSEC_BITS) \
      + ((((apr_int64_t)(time) & (APR_BUSEC_PER_SEC - 1)) \
          * APR_TIME_C(1000000)) >> APR_BUSEC_BITS))

#define apr_time_as_nsec(time) \
     (((apr_int64_t)(time) >> APR_BUSEC_BITS) \
      + ((((apr_int64_t)(time) & (APR_BUSEC_PER_SEC - 1)) \
          * APR_TIME_C(1000000000)) >> APR_BUSEC_BITS))

/* this computation is alright, but not great, we are limited to 272
  * years.  No Y2038 bug, but consider it an 'optimized form' for which
  * we have some loss of resolution.  Doing this in 32 bit signed math
  * would yield only 2 seconds, which is not worth implementing compared
  * to the existing apr_time_msec_get().
  */
#define apr_time_as_272yrs_msec(time) \
     (((apr_int64_t)(time) * APR_TIME_C(1000)) >> APR_BUSEC_BITS)

/* Another optimized form, we are limited to some 101 days.
  * Consider it a bleeding optimization with a great loss of resolution.
  */
#define apr_time_as_3mos_usec(time) \
     (((apr_int64_t)(time) * APR_TIME_C(1000000)) >> APR_BUSEC_BITS)

/* The final optimized form, we are limited to some 2.44 hours.
  * Use with extreme care.
  */
#define apr_time_as_2hrs_nsec(time) \
     (((apr_int64_t)(time) * APR_TIME_C(1000000000)) >> APR_BUSEC_BITS)


/* These have no loss of resolution if the 2nd xsec arg is < 1second
  */
#define apr_time_mmake(sec, msec) ((apr_time_t)(sec) << APR_BUSEC_BITS \
           + ((apr_time_t)(msec) << APR_BUSEC_BITS) / APR_TIME_C(1000))

#define apr_time_umake(sec, usec) ((apr_time_t)(sec) << APR_BUSEC_BITS \
           + ((apr_time_t)(usec) << APR_BUSEC_BITS) / APR_TIME_C(1000000))

#define apr_time_nmake(sec, nsec) ((apr_time_t)(sec) << APR_BUSEC_BITS \
           + ((apr_time_t)(nsec) << APR_BUSEC_BITS) / APR_TIME_C(1000000000))


#define apr_time_from_sec(sec)  ((apr_time_t)(sec) << APR_BUSEC_BITS)

/* These carry no resolution penalty, just a nasty performance penalty.
  */
#define apr_time_from_msec(msec) \
     ((((apr_time_t)(msec) / APR_TIME_C(1000)) << APR_BUSEC_BITS) \
      + ((((apr_time_t)(msec) % APR_TIME_C(1000)) << APR_BUSEC_BITS) \
         / APR_TIME_C(1000)))

#define apr_time_from_usec(usec) \
     ((((apr_time_t)(usec) / APR_TIME_C(1000000)) << APR_BUSEC_BITS) \
      + ((((apr_time_t)(usec) % APR_TIME_C(1000000)) << APR_BUSEC_BITS) \
         / APR_TIME_C(1000000)))

#define apr_time_from_nsec(nsec) \
     ((((apr_time_t)(nsec) / APR_TIME_C(1000000000)) << APR_BUSEC_BITS) \
      + ((((apr_time_t)(nsec) % APR_TIME_C(1000000000)) << APR_BUSEC_BITS) \
         / APR_TIME_C(1000000000)))

/* 97 day macros; you would expect these to be correct, until you
  * consider that the initial SHL reduces precision to 43 bits (signed)
  * of which 20 bits is the busecs.  23 remaining bits ~= 97 days.
  * The apr_time_xmake macros above doesn't suffer the problem, since
  * we presume that the xsec component is < 1 second... and the 'proper'
  * apr_time_from_xsec macros will split into integral and fractional
  * seconds while converting, so there is no unexpected loss of range.
  */
#define apr_time_from_3mos_msec(msec) \
     (((apr_time_t)(msec) << APR_BUSEC_BITS) / APR_TIME_C(1000))

#define apr_time_from_3mos_usec(usec) \
     (((apr_time_t)(usec) << APR_BUSEC_BITS) / APR_TIME_C(1000000))

#define apr_time_from_3mos_nsec(nsec) \
     (((apr_time_t)(usec) << APR_BUSEC_BITS) / APR_TIME_C(1000000000))

/* XXX These are ambiguous and dangerous.  The symbol names must be
  * deprecated to assure folks don't misuse them (_xsec_get vs _as_xsec)
  */
#define apr_time_sec apr_time_sec_get
#define apr_time_msec apr_time_msec_get
#define apr_time_usec apr_time_usec_get
#define apr_time_nsec apr_time_nsec_get

/* XXX Also dangerous, since folks might assume the literal meaning.
  * we have no need to define this constant anymore for library users
  */
#define APR_USEC_PER_SEC APR_BUSEC_PER_SEC

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 10:41 AM 7/11/2002, Brian Pane wrote:
>On Thu, 2002-07-11 at 06:58, Bill Stoddard wrote:
>
> > I ran a quick profile with this patch and it eliminated a couple of
> > divisions (calls to __divi64 reduced from 4 to 2 in my test setup. your
> > mileage may vary) which was good for 493 instructions. Still have 3 
> __divu64
> > and 2 __divi64 calls. The three __divu64 calls are in the 
> gettimeofday() CRT
> > function, so there is not much we can do about these directly.  One 
> __divi64
> > is in apr_poll (convert microseconds to milliseconds. This can probably be
> > optimized away). The other __divi64 is somewhere in cached_explode
> > (util_time.c).
>
>The only division that I know of in cached_explode() is:
>
>     struct exploded_time_cache_element *cache_element =
>         &(cache[seconds % TIME_CACHE_SIZE]);
>
>Is that where the division operation is being generated
>on your test system?

Actually, we still have division on construct busec from usecs.

That would be the one /1000000 which can never be optimized
away.  Fortunately, it should be infrequent.

Bill

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Ryan Bloom <rb...@covalent.net>.

Brian, fix is commit soon.

Ryan

----------------------------------------------
Ryan Bloom                  rbb@covalent.net
645 Howard St.              rbb@apache.org
San Francisco, CA 

> -----Original Message-----
> From: Brian Pane [mailto:bpane@pacbell.net]
> Sent: Thursday, July 11, 2002 9:31 AM
> To: APR Development List
> Subject: RE: [PATCH 2] example binary BUSEC patch for benchmarking
only
> 
> On Thu, 2002-07-11 at 06:58, Bill Stoddard wrote:
> > I ran a quick profile with this patch and it eliminated a couple of
> > divisions (calls to __divi64 reduced from 4 to 2 in my test setup.
your
> > mileage may vary) which was good for 493 instructions. Still have 3
> __divu64
> > and 2 __divi64 calls. The three __divu64 calls are in the
gettimeofday()
> CRT
> > function, so there is not much we can do about these directly.  One
> __divi64
> > is in apr_poll (convert microseconds to milliseconds. This can
probably
> be
> > optimized away). The other __divi64 is somewhere in cached_explode
> > (util_time.c).
> 
> With the cached_explode() division now fixed, I just looked at the
> division in apr_poll().  The code that's there now is doing the math
> like this:
> 
>     if (timeout > 0) {
>         timeout /= 1000; /* convert microseconds to milliseconds */
>     }
> 
> It's assuming that timeout, an apr_interval_time_t, represents an
> absolute time in microseconds--but that may not be true much longer
> if we switch to binary microseconds.  The calculation should be
> something like:
> 
>     timeout_msec = apr_time_usec(timeout) / 1000;
>     timeout_sec = apr_time_sec(timeout);
>     if (timeout_sec) {
>         timeout_msec += 1000 * timeout_sec;
>     }
> 
> I don't see a way to eliminate the "/ 1000" to convert usec to
> msec.  But we may be able to do all the math in 32-bit mode, by
> limiting the maximum timeout to the number of milliseconds that
> will fit in 32 bits, which works out to a max timeout of about
> 50 days.
> 
> --Brian
>

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Brian Pane <bp...@pacbell.net>.

On Thu, 2002-07-11 at 06:58, Bill Stoddard wrote:
> I ran a quick profile with this patch and it eliminated a couple of
> divisions (calls to __divi64 reduced from 4 to 2 in my test setup. your
> mileage may vary) which was good for 493 instructions. Still have 3 __divu64
> and 2 __divi64 calls. The three __divu64 calls are in the gettimeofday() CRT
> function, so there is not much we can do about these directly.  One __divi64
> is in apr_poll (convert microseconds to milliseconds. This can probably be
> optimized away). The other __divi64 is somewhere in cached_explode
> (util_time.c).

With the cached_explode() division now fixed, I just looked at the
division in apr_poll().  The code that's there now is doing the math
like this:

    if (timeout > 0) {
        timeout /= 1000; /* convert microseconds to milliseconds */
    }

It's assuming that timeout, an apr_interval_time_t, represents an
absolute time in microseconds--but that may not be true much longer
if we switch to binary microseconds.  The calculation should be
something like:

    timeout_msec = apr_time_usec(timeout) / 1000;
    timeout_sec = apr_time_sec(timeout);
    if (timeout_sec) {
        timeout_msec += 1000 * timeout_sec;
    }

I don't see a way to eliminate the "/ 1000" to convert usec to
msec.  But we may be able to do all the math in 32-bit mode, by
limiting the maximum timeout to the number of milliseconds that
will fit in 32 bits, which works out to a max timeout of about
50 days.

--Brian

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Brian Pane <bp...@pacbell.net>.

On Thu, 2002-07-11 at 06:58, Bill Stoddard wrote:

> I ran a quick profile with this patch and it eliminated a couple of
> divisions (calls to __divi64 reduced from 4 to 2 in my test setup. your
> mileage may vary) which was good for 493 instructions. Still have 3 __divu64
> and 2 __divi64 calls. The three __divu64 calls are in the gettimeofday() CRT
> function, so there is not much we can do about these directly.  One __divi64
> is in apr_poll (convert microseconds to milliseconds. This can probably be
> optimized away). The other __divi64 is somewhere in cached_explode
> (util_time.c).

The only division that I know of in cached_explode() is:

    struct exploded_time_cache_element *cache_element =
        &(cache[seconds % TIME_CACHE_SIZE]);

Is that where the division operation is being generated
on your test system?

TIME_CACHE_SIZE is 16, specifically so that the compiler
can optimize away the division, but if there's still a
division being generated, we can replace it with
  "cache[seconds & TIME_CACHE_MASK]"

--Brian

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Ryan Bloom <rb...@covalent.net>.

> From: Bill Stoddard [mailto:bill@wstoddard.com]
> 
> I ran a quick profile with this patch and it eliminated a couple of
> divisions (calls to __divi64 reduced from 4 to 2 in my test setup.
your
> mileage may vary) which was good for 493 instructions. Still have 3
> __divu64
> and 2 __divi64 calls. The three __divu64 calls are in the
gettimeofday()
> CRT
> function, so there is not much we can do about these directly.  One

Has anybody looked at ExtendedStatus to make sure that we are
eliminating all of the time calls that we can during request processing?

> __divi64
> is in apr_poll (convert microseconds to milliseconds. This can
probably be

If we move to using the macros, it should go away.

Ryan

RE: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Bill Stoddard <bi...@wstoddard.com>.

I ran a quick profile with this patch and it eliminated a couple of
divisions (calls to __divi64 reduced from 4 to 2 in my test setup. your
mileage may vary) which was good for 493 instructions. Still have 3 __divu64
and 2 __divi64 calls. The three __divu64 calls are in the gettimeofday() CRT
function, so there is not much we can do about these directly.  One __divi64
is in apr_poll (convert microseconds to milliseconds. This can probably be
optimized away). The other __divi64 is somewhere in cached_explode
(util_time.c).

Bill

> At 10:38 PM 7/10/2002, William A. Rowe, Jr. wrote:
> >At 10:03 PM 7/10/2002, Brian Pane wrote:
> >>Bill Stoddard wrote:
> >>
> >>>I've not looked at the generated code, but profiling indicates that an
> >>>additional division is happening, adding an extra 231 instructions.
> >>>(xlc_r -O2)
> >>
> >>If you redefine the macro as a shift, does the profile look better?
>
> Ok, attached is the code redone as binary math.  I'm tired, could be
> any number of major blunders in it, but on first pass, it looked right.
>
> Bill

Re: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Cliff Woolley <jw...@virginia.edu>.

On Thu, 11 Jul 2002, Greg Marr wrote:

> I keep thinking that APR_USEC_PER_SEC should be (1 << 20), or now
> (1 << APR_USEC_BITS) instead of the magical constant.  I have no way
> of verifying with a quick glance that 1048576 is really 2^20.

You don't know your powers of 2?  Memorize, Greg, Memorize.  ;)

Re: [PATCH 2] example binary BUSEC patch for benchmarking only

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 01:33 AM 07/11/2002, William A. Rowe, Jr. wrote:
>Ok, attached is the code redone as binary math.  I'm tired, could be 
>any number of major blunders in it, but on first pass, it looked right.
>
>-/** number of microseconds per second */
>-#define APR_USEC_PER_SEC APR_TIME_C(1000000)
>+/** number of binary microseconds per second (2^20) */
>+#define APR_USEC_PER_SEC APR_TIME_C(1048576)
>+#define APR_USEC_BITS 20

I keep thinking that APR_USEC_PER_SEC should be (1 << 20), or now
(1 << APR_USEC_BITS) instead of the magical constant.  I have no way 
of verifying with a quick glance that 1048576 is really 2^20.

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"