
Posted to dev@thrift.apache.org by Rush Manbert <ru...@manbert.com> on 2009/05/27 17:14:35 UTC

Windows Timers - Seeking opinions

Hi all,

I have moved my Boostified Thrift C++ library code to my Windows XP
box and I now have the concurrency test running. It's slow as
anything, but I'm hoping that's because it's a 5-year-old single-CPU
Athlon 64. I know my new code is as fast as the current library on the
Mac.

It turned out that one of the biggest headaches on Windows was the
system timer. I read a lot of Internet posts and went through three
implementations before getting one that is acceptable. There are,
however, tradeoffs, and I was hoping that interested and knowledgeable
parties here might offer some opinions/insight.

The Windows system timer is just basically crappy. The millisecond
epoch time clock provided by the C runtime library really updates at
only about 60 Hz; on each update it just jumps to whatever the current
millisecond count should be. And when the system gets busy (lots of
threads) the updates become very haphazard.

The highest precision timer is the performance counter, which runs at
some frequency that is subdivided off the CPU clock or a peripheral
clock. On my system it runs at about 3.5 MHz, so it's plenty
fast. But it has its own problems. From what I have read, it can slow
down because of power save modes on the computer, and each core in a
multi-core machine has its own counter. This means that if your
process gets moved between cores, the timer reading can jump, and it
can possibly appear to go backward. I believe we could get around that
by remembering the last time value that was returned, and just not
letting time go backward. For the purposes of Thrift, I think that
would work. I don't know how to handle it if the timer can really slow
down, though.
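
To make the "don't let time go backward" idea concrete, here is a
minimal sketch. It is not the code in my library: MonotonicClamp is a
made-up name, the raw reading is passed in as a callable so the sketch
stays portable, and it uses C++11 standard library types rather than
Boost. It only hides backward jumps; it does nothing about a counter
that really runs slow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <mutex>
#include <utility>

// Hypothetical wrapper: remembers the highest value handed out so far and
// never returns anything lower, whatever the raw source reports.
class MonotonicClamp {
 public:
  // rawSource stands in for a raw reading such as the performance counter
  // converted to milliseconds (an assumption of this sketch).
  explicit MonotonicClamp(std::function<int64_t()> rawSource)
      : raw_(std::move(rawSource)),
        last_(std::numeric_limits<int64_t>::min()) {
    assert(raw_ && "a raw time source is required");
  }

  // Returns the raw reading, or the previous result if the raw reading
  // appears to have gone backward (for example after a core switch).
  int64_t now() {
    std::lock_guard<std::mutex> guard(lock_);
    const int64_t reading = raw_();
    last_ = std::max(last_, reading);
    return last_;
  }

 private:
  std::function<int64_t()> raw_;
  std::mutex lock_;  // callers may be on different threads
  int64_t last_;
};
```

With this, a backward jump shows up to callers as time standing still
for a moment, which should be harmless for waits with a timeout.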

The last one that I tried, after reading about the nasties in the
performance counter, is timeGetTime(). This gives the count of
milliseconds since Windows was started. As long as you calibrate it
against the epoch time on the first access (the same thing has to be
done if you use the performance counter), it seems to give good
results with millisecond accuracy. However, in order to use it you
must link against Winmm.lib, which pulls in Winmm.dll. This appears to
be a safe choice, as the dll is supplied with all Windows systems.
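
The calibration step could look roughly like the sketch below. The
class name and interface are hypothetical, and the tick value is passed
in rather than read from timeGetTime() so the sketch compiles anywhere.
One thing it assumes that I did not mention above: timeGetTime()
returns a 32-bit count that wraps after about 49.7 days, so the sketch
accumulates unsigned differences instead of subtracting from a fixed
base.

```cpp
#include <cstdint>

// Hypothetical helper: turns a free-running 32-bit millisecond tick count
// (the kind of value timeGetTime() returns) into epoch milliseconds.
class CalibratedTickClock {
 public:
  // epochMsNow and tickNow should be sampled as close together as possible;
  // this pair is the one-time calibration against the epoch clock.
  CalibratedTickClock(int64_t epochMsNow, uint32_t tickNow)
      : epochMs_(epochMsNow), lastTick_(tickNow) {}

  // Feed in the current tick count, get back estimated epoch milliseconds.
  // Not thread-safe, and it must be called at least once per wrap period.
  int64_t toEpochMs(uint32_t tickNow) {
    // Unsigned subtraction is modulo 2^32, so the elapsed value is still
    // right when the counter has wrapped since the previous call.
    const uint32_t elapsed = tickNow - lastTick_;
    epochMs_ += elapsed;
    lastTick_ = tickNow;
    return epochMs_;
  }

 private:
  int64_t epochMs_;    // epoch time corresponding to lastTick_
  uint32_t lastTick_;  // tick count seen on the previous call
};
```

The epoch clock is only consulted once, so its coarse update rate
affects the starting offset but not the spacing of later readings.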

So at this point, I'm planning to go with timeGetTime(), making it a
requirement that all Windows apps that use Thrift be linked against
Winmm.dll. Do I hear any well-reasoned objections or alternatives?

Thanks,
Rush

Re: Windows Timers - Seeking opinions

Posted by Rush Manbert <ru...@manbert.com>.

Hi Chad,
On May 27, 2009, at 4:06 PM, Chad Walters wrote:

> Rush,
>
> Not an area of expertise for me but I had a couple questions:
>
>
> 1.  What problems were there using the Boost timer library?

This was actually my first attempt. The Boost library uses the low
precision Windows timer under the hood, so it's updating at 60 Hz. I
was surprised at this, but guessed that they didn't want to introduce
other dependencies and didn't want to deal with the performance
counter issues.

>
> 2.  Did you look at Intel's Threading Building Blocks tick_count  
> implementation?

No. I didn't find this when I was searching. I'll take a look.

<snip>

Best regards,
Rush

Re: Windows Timers - Seeking opinions

Posted by Rush Manbert <ru...@manbert.com>.

Yes, that was one of them.

This article is interesting, from an official source, and recent:
http://software.intel.com/en-us/articles/measure-code-sections-using-the-enhanced-timer/

It appears that the Enhanced Timer is what tick_count uses, which  
makes me think that maybe the "Beware of QueryPerformanceCounter()"  
article is out of date. If the multi-core stuff is not an issue, then  
for our (Thrift internals) purposes a clock slowdown may not matter.  
The use within the library is to wait with a timeout, so you're at the  
mercy of the OS implementation for that, no matter what. The test  
pattern is generally that the main thread starts a test thread, then  
waits for the test thread to reach some known point where it waits,  
then signals the test thread to continue to the next wait point, etc.  
The test thread records progress by setting the value of a variable  
that the two threads share. The controlling thread calls wait with a  
short timeout, just to cede the CPU and give the thread under test a  
chance to run, but it does that in a loop that times out in hundreds  
or thousands of milliseconds. If time appeared to slow down during  
those tests it shouldn't matter.
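
For anyone who has not looked at the tests, the controlling thread's
side of that pattern is roughly the sketch below. This is an
illustration only: waitForProgress is a made-up name, and it uses
C++11 std::thread primitives instead of the library's own classes. The
condition variable is never notified; the timed wait is there purely
to give up the CPU for a short slice.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: polls the shared progress variable until it reaches
// `target`, waiting `slice` between checks, and gives up after `overall`.
inline bool waitForProgress(const std::atomic<int>& progress, int target,
                            std::chrono::milliseconds slice,
                            std::chrono::milliseconds overall) {
  std::mutex m;
  std::condition_variable cv;
  const auto deadline = std::chrono::steady_clock::now() + overall;
  std::unique_lock<std::mutex> lock(m);
  while (progress.load() < target) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // the thread under test never got to the wait point
    }
    // Short timed wait: cedes the CPU so the thread under test can run.
    // An early (spurious) wakeup is harmless, the loop just checks again.
    cv.wait_for(lock, slice);
  }
  return true;
}
```

If the clock runs slow, the overall timeout just takes longer in real
time to expire; a test that is going to pass still passes.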

I still have both the timeGetTime() and QueryPerformanceCounter()  
implementations in Util.cpp, selectable at compile time. I guess all I  
need now is a multi-core notebook running XP or Vista. I might be able  
to come up with one of those from our QA group. Or maybe I can scare  
up a volunteer from the list to do some testing.
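
The compile-time switch is along the lines of the sketch below. This
is not the actual Util.cpp code: SKETCH_USE_QPC and elapsedMs are
placeholder names, and the non-Windows branch exists only so the
sketch builds on other platforms. Neither Windows branch does the
epoch calibration or guards against time going backward.

```cpp
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>
#include <mmsystem.h>  // timeGetTime(); requires linking with winmm.lib
#else
#include <chrono>
#endif

// Returns milliseconds from an arbitrary starting point. Define the
// placeholder macro SKETCH_USE_QPC to select the performance counter.
inline int64_t elapsedMs() {
#if defined(_WIN32) && defined(SKETCH_USE_QPC)
  LARGE_INTEGER freq;
  LARGE_INTEGER count;
  QueryPerformanceFrequency(&freq);
  QueryPerformanceCounter(&count);
  // Split the division so count * 1000 cannot overflow on fast counters.
  const int64_t whole = count.QuadPart / freq.QuadPart;
  const int64_t rest = count.QuadPart % freq.QuadPart;
  return whole * 1000 + rest * 1000 / freq.QuadPart;
#elif defined(_WIN32)
  // Milliseconds since Windows started, as a 32-bit value that wraps.
  return static_cast<int64_t>(timeGetTime());
#else
  // Portable stand-in so the sketch compiles off Windows.
  using namespace std::chrono;
  return static_cast<int64_t>(
      duration_cast<milliseconds>(steady_clock::now().time_since_epoch())
          .count());
#endif
}
```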

- Rush

On May 28, 2009, at 8:38 PM, Chad Walters wrote:

>
> Hmm, is this one of the articles you looked at?
> http://www.virtualdub.org/blog/pivot/entry.php?id=106
>
> That is definitely a bit off-putting.
>
> <snip>

Re: Windows Timers - Seeking opinions

Posted by Chad Walters <Ch...@microsoft.com>.

Hmm, is this one of the articles you looked at?
http://www.virtualdub.org/blog/pivot/entry.php?id=106

That is definitely a bit off-putting.

Not sure about the power save mode issue, but I think someone told me that
tbb::tick_count gets the value from the same CPU each time.

http://www.threadingbuildingblocks.org/documentation.php

From the TBB reference manual:

7 Timing
Parallel programming is about speeding up wall clock time, which is the real
time that it takes a program to run. Unfortunately, some of the obvious wall
clock timing routines provided by operating systems do not always work
reliably across threads, because the hardware thread clocks are not
synchronized. The library provides support for timing across threads. The
routines are wrappers around operating services that we have verified as safe
to use across threads.

From the TBB tutorial:

8 Timing
...
Unlike some timing interfaces, tick_count is guaranteed to be safe to use
across threads. It is valid to subtract tick_count values that were created
by different threads. A tick_count difference can be converted to seconds.
The resolution of tick_count corresponds to the highest resolution timing
service on the platform that is valid across threads in the same process.
Since the CPU timer registers are not valid across threads on some platforms,
this means that the resolution of tick_count can not be guaranteed to be
consistent across platforms.

I'd be interested to see whether this works for you.

Chad

On 5/28/09 8:31 AM, "Rush Manbert" <ru...@manbert.com> wrote:

> <snip>

Re: Windows Timers - Seeking opinions

Posted by Rush Manbert <ru...@manbert.com>.

Hi Chad,

It uses QueryPerformanceCounter on Windows. I'm going to go back and  
test my QPF code again. It's possible (oh so remotely possible ;-)  
that I screwed something up.

Best regards,
Rush

On May 27, 2009, at 4:06 PM, Chad Walters wrote:
<snip>
>
> 2.  Did you look at Intel's Threading Building Blocks tick_count  
> implementation?
>

Re: Windows Timers - Seeking opinions

Posted by Chad Walters <Ch...@microsoft.com>.

Rush,

Not an area of expertise for me but I had a couple questions:


 1.  What problems were there using the Boost timer library?
 2.  Did you look at Intel's Threading Building Blocks tick_count implementation?

Chad

On 5/27/09 8:14 AM, "Rush Manbert" <ru...@manbert.com> wrote:

<snip>