Posted to dev@httpd.apache.org by Andrew Oliver <ac...@gmail.com> on 2011/04/05 13:27:33 UTC

Theory on recent Phoronix benchmark?

http://www.phoronix.com/scan.php?page=article&item=ubuntu_natty_pae64&num=3

"When it comes to running the Apache web-server in these different
configurations, there is a 3% improvement when moving from the i686 to i686
PAE kernel and 2% on top of that when moving to the x86_64 Ubuntu. With the
newer Core i7 Sandy Bridge notebook there is a 6% boost in performance with
the PAE kernel, but the 64-bit performance strangely suffers a setback. In
this test and the rest, they are built from source for the respective
architecture."

Anyone have any theory on why 64-bit was so much worse (suggest looking at
the general article for context rather than solely the excerpt above)?

-Andy

Re: Theory on recent Phoronix benchmark?

Posted by Andrew Oliver <ac...@gmail.com>.
That's an interesting point.  The reason this piqued my interest is that it
isn't really in line with my last JVM benchmarking (granted, some time ago).
Performance degradation was a factor, in particular when I didn't increase
the heap size a little, but I've not seen this level of degradation.
Having enjoyed the pleasure of 16-32bit thunking when I "got" to write an
OS/2 device driver a good bit back, I rather like your theory for that.  I
wrote the Phoronix dude(ette/s) and if (he/she/it/they) don't reply I'll
take a crack at it on 11.04.

Thanks,

Andy

On Tue, Apr 5, 2011 at 5:58 PM, William A. Rowe Jr. <wr...@rowe-clan.net> wrote:

> On 4/5/2011 3:52 PM, Stefan Fritsch wrote:
> > On Tuesday 05 April 2011, Andrew Oliver wrote:
> >> That is just the thing.  Other things that should have been
> >> similarly affected in the benchmark were not.  Take a gander if
> >> you would at some of the rest of that article...
> >
> > HTTPD uses lots of pointers when handling per-dir and per-module
> > configuration data. I agree with Bill that the 2x size increase in
> > pointers is likely a major performance factor. Maybe the other
> > workloads don't use so many pointers. They don't have a java benchmark
> > AFAICS, which should be similarly affected.
> >
> > Or it is just bad luck that with 32bit, HTTPD's working set just fits
> > into some cache while with 64bit, it doesn't. It would be interesting
> > to see the same comparison with 2.3.11. There were some optimizations
> > which should reduce CPU cache usage.
>
> I'm actually not entirely clear if they were using 64 bit executables
> throughout all of their tests for x86_64; in fact, I suspect they weren't.
>
> If it is a 32 bit binary (and CC="gcc -m64" ./configure might be needed
> here depending on gcc defaults), there is a painful threshold of thunking
> 32 bit calls into the 64 bit kernel.
>
> But one interesting thing about their 'stellar' performance stats on the
> x86_64 is that most apps are powered by assembler and very specific word
> size manipulations, e.g. the sound waveform or image bitmap memory footprint
> doesn't change, and openssl gets to employ SSE2 (post i686) manipulations.
>
> Finally, I'd expect no advantage from system caching for httpd moving
> from 2GB to 24GB of ram, which
>
>

Re: Theory on recent Phoronix benchmark?

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 4/5/2011 3:52 PM, Stefan Fritsch wrote:
> On Tuesday 05 April 2011, Andrew Oliver wrote:
>> That is just the thing.  Other things that should have been
>> similarly affected in the benchmark were not.  Take a gander if
>> you would at some of the rest of that article...
> 
> HTTPD uses lots of pointers when handling per-dir and per-module 
> configuration data. I agree with Bill that the 2x size increase in 
> pointers is likely a major performance factor. Maybe the other 
> workloads don't use so many pointers. They don't have a java benchmark 
> AFAICS, which should be similarly affected.
> 
> Or it is just bad luck that with 32bit, HTTPD's working set just fits 
> into some cache while with 64bit, it doesn't. It would be interesting 
> to see the same comparison with 2.3.11. There were some optimizations 
> which should reduce CPU cache usage.

I'm actually not entirely clear if they were using 64 bit executables
throughout all of their tests for x86_64; in fact, I suspect they weren't.

If it is a 32 bit binary (and CC="gcc -m64" ./configure might be needed
here depending on gcc defaults), there is a painful threshold of thunking
32 bit calls into the 64 bit kernel.

But one interesting thing about their 'stellar' performance stats on the
x86_64 is that most apps are powered by assembler and very specific word
size manipulations, e.g. the sound waveform or image bitmap memory footprint
doesn't change, and openssl gets to employ SSE2 (post i686) manipulations.

Finally, I'd expect no advantage from system caching for httpd moving
from 2GB to 24GB of ram, which


Re: Theory on recent Phoronix benchmark?

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Tuesday 05 April 2011, Andrew Oliver wrote:
> That is just the thing.  Other things that should have been
> similarly affected in the benchmark were not.  Take a gander if
> you would at some of the rest of that article...

HTTPD uses lots of pointers when handling per-dir and per-module 
configuration data. I agree with Bill that the 2x size increase in 
pointers is likely a major performance factor. Maybe the other 
workloads don't use so many pointers. They don't have a java benchmark 
AFAICS, which should be similarly affected.

Or it is just bad luck that with 32bit, HTTPD's working set just fits 
into some cache while with 64bit, it doesn't. It would be interesting 
to see the same comparison with 2.3.11. There were some optimizations 
which should reduce CPU cache usage.

Re: Theory on recent Phoronix benchmark?

Posted by Andrew Oliver <ac...@gmail.com>.
That is just the thing.  Other things that should have been similarly
affected in the benchmark were not.  Take a gander if you would at some of
the rest of that article...
On Apr 5, 2011 11:32 AM, "William A. Rowe Jr." <wr...@rowe-clan.net> wrote:
> On 4/5/2011 6:27 AM, Andrew Oliver wrote:
>>
>> Anyone have any theory on why 64-bit was so much worse (suggest looking
>> at general article for context rather than solely the excerpt above)?
>
> Simple memory access. Intel doesn't scale to 64 bits as cleanly as, say,
> a sparcv9 64 bit binary vs sparcv8 32 bit.
>
> Pointers and longs, and so most resources, consume 2x heap and stack
> (plain ints stay at 32 bits), except of course strings.
>
> All this means you are falling out of L1, L2 cache out to memory pretty
> regularly. Pick some other applications, you should find similar results
> on most any intel program, including 32 vs 64 bit jvm performance.

Re: Theory on recent Phoronix benchmark?

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 4/5/2011 6:27 AM, Andrew Oliver wrote:
> 
> Anyone have any theory on why 64-bit was so much worse (suggest looking at general article
> for context rather than solely the excerpt above)?

Simple memory access.  Intel doesn't scale to 64 bits as cleanly as, say,
a sparcv9 64 bit binary vs sparcv8 32 bit.

Pointers and longs, and so most resources, consume 2x heap and stack
(plain ints stay at 32 bits), except of course strings.

All this means you are falling out of L1, L2 cache out to memory pretty
regularly.  Pick some other applications, you should find similar results
on most any intel program, including 32 vs 64 bit jvm performance.

Re: Theory on recent Phoronix benchmark?

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> http://www.phoronix.com/scan.php?page=article&item=ubuntu_natty_pae64&num=3
> 
> "When it comes to running the Apache web-server in these different
> configurations, there is a 3% improvement when moving from the i686
> to i686 PAE kernel and 2% on top of that when moving to the x86_64
> Ubuntu. With the newer Core i7 Sandy Bridge notebook there is a 6%
> boost in performance with the PAE kernel, but the 64-bit performance
> strangely suffers a setback. In this test and the rest, they are
> built from source for the respective architecture."
> 
> Anyone have any theory on why 64-bit was so much worse (suggest

On 64-bit on Core i7 (in this benchmark) -- that's a little more
specific. So it'd be worth looking at these two CPUs and seeing where
they differ:

Intel Core 2 Duo T9300 (2.50GHz dual-core) 
http://ark.intel.com/Product.aspx?id=33917 vs

Core i7 2820QM (2.30GHz quad-core + Hyper Threading)
http://ark.intel.com/Product.aspx?id=52227

Which in essence is: 6MB L2 Cache vs 8MB L3 Cache.


> looking at general article for context rather than solely the excerpt
> above)?
> 
> -Andy

i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/