Posted to common-issues@hadoop.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2012/10/13 11:03:04 UTC
[jira] [Created] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Gopal V created HADOOP-8926:
-------------------------------
Summary: hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
Key: HADOOP-8926
URL: https://issues.apache.org/jira/browse/HADOOP-8926
Project: Hadoop Common
Issue Type: Improvement
Components: util
Affects Versions: 2.0.3-alpha
Environment: Ubuntu 10.10 i386
Reporter: Gopal V
Priority: Trivial
While running microbenchmarks for the HDFS write code path, a significant fraction of the CPU time was consumed by DataChecksum.update().
The attached patch converts the static arrays in PureJavaCrc32 into a single linear array for a performance boost in the inner loop.
Milliseconds to checksum 1 GB (16400 loops over a 64 KB chunk):
|| platform || original (ms) || cache-aware (ms) || improvement (%) ||
| x86 | 3894 | 2304 | 40.83 |
| x86_64 | 2131 | 1826 | 14 |
The performance improvement on x86 is considerably larger than on x86_64, due to the extra register/stack pressure caused by the static arrays.
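For readers who want to reproduce the shape of this measurement, here is a minimal timing sketch. It is not the benchmark used in this report: the class name ChecksumTiming and its methods are hypothetical, and it times java.util.zip.CRC32 from the JDK rather than PureJavaCrc32. It also shows how the improvement column can be derived from the two timing columns.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical timing harness; class and method names are illustrative only. */
final class ChecksumTiming {
    /** Runs `loops` checksum updates over one reusable chunk and returns elapsed milliseconds. */
    static long timeMillis(Checksum sum, int chunkBytes, int loops) {
        byte[] chunk = new byte[chunkBytes];
        for (int i = 0; i < chunk.length; i++) {
            chunk[i] = (byte) (i * 31 + 7); // arbitrary filler data
        }
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            sum.update(chunk, 0, chunk.length);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Relative time saved, in percent: (original - improved) / original * 100. */
    static double improvementPercent(long originalMs, long improvedMs) {
        return 100.0 * (originalMs - improvedMs) / originalMs;
    }

    public static void main(String[] args) {
        // Same shape as the measurement in the report: 16400 loops over a 64 KB chunk.
        long ms = timeMillis(new CRC32(), 64 * 1024, 16400);
        System.out.println("elapsed ms: " + ms);
        System.out.printf("x86 improvement: %.2f%%%n", improvementPercent(3894, 2304));
        System.out.printf("x86_64 improvement: %.2f%%%n", improvementPercent(2131, 1826));
    }
}
```

A single un-warmed run like this is only a rough indication; JIT warm-up and run-to-run variance should be accounted for before comparing numbers.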
A closer analysis of the JIT-compiled code for PureJavaCrc32 shows the following assembly fragment:
{code}
0x40f1e345: mov $0x184,%ecx
0x40f1e34a: mov 0x4415b560(%ecx),%ecx ;*getstatic T8_5
; - PureJavaCrc32::update@95 (line 61)
; {oop('PureJavaCrc32')}
0x40f1e350: mov %ecx,0x2c(%esp)
{code}
In short, the static variables T8_0 through T8_7 are being spilled to the stack because of register pressure. The x86_64 case is less likely to produce such pessimistic JIT code because more registers are available.
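The idea of replacing eight static tables with one linear array can be sketched as follows. This is an illustration only and not the attached patch: the class name FlatTableCrc32, the table generation code, and the helper readIntLE are assumptions made for this example. It uses the standard slicing-by-8 scheme for the reflected CRC-32 polynomial, with slice k stored at offset k * 256 in a single int[].

```java
/** Hypothetical sketch, not the HADOOP-8926 patch: CRC-32 with one flat lookup table. */
final class FlatTableCrc32 {
    private static final int SLICE = 256;
    // Eight 256-entry tables stored back to back in one array, so the inner
    // loop needs a single array reference instead of eight static fields.
    private static final int[] T = new int[8 * SLICE];

    static {
        // Slice 0: the ordinary byte-wise table for the reflected polynomial 0xEDB88320.
        for (int i = 0; i < SLICE; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            T[i] = c;
        }
        // Slices 1..7: each entry advances the previous slice by one more zero byte.
        for (int k = 1; k < 8; k++) {
            for (int i = 0; i < SLICE; i++) {
                int prev = T[(k - 1) * SLICE + i];
                T[k * SLICE + i] = (prev >>> 8) ^ T[prev & 0xff];
            }
        }
    }

    /** Reads four bytes as a little-endian int. */
    private static int readIntLE(byte[] b, int off) {
        return (b[off] & 0xff) | (b[off + 1] & 0xff) << 8
             | (b[off + 2] & 0xff) << 16 | (b[off + 3] & 0xff) << 24;
    }

    /** Returns the CRC-32 of b[off .. off+len) as an unsigned 32-bit value in a long. */
    static long crc32(byte[] b, int off, int len) {
        final int[] t = T; // one local reference for the whole loop
        int crc = 0xffffffff;
        while (len >= 8) {
            final int one = readIntLE(b, off) ^ crc;
            final int two = readIntLE(b, off + 4);
            crc = t[7 * SLICE + (one & 0xff)]
                ^ t[6 * SLICE + ((one >>> 8) & 0xff)]
                ^ t[5 * SLICE + ((one >>> 16) & 0xff)]
                ^ t[4 * SLICE + (one >>> 24)]
                ^ t[3 * SLICE + (two & 0xff)]
                ^ t[2 * SLICE + ((two >>> 8) & 0xff)]
                ^ t[1 * SLICE + ((two >>> 16) & 0xff)]
                ^ t[(two >>> 24)];
            off += 8;
            len -= 8;
        }
        // Remaining 0..7 bytes, one at a time, using slice 0 only.
        while (len-- > 0) {
            crc = (crc >>> 8) ^ t[(crc ^ b[off++]) & 0xff];
        }
        return ~crc & 0xffffffffL;
    }
}
```

Whether a flat table actually avoids the spills shown above depends on the JVM and platform, so any such change should be checked against the JIT output and a benchmark.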
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Status: Open (was: Patch Available)
Reworking for readability of code
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans updated HADOOP-8926:
----------------------------------------
Resolution: Fixed
Fix Version/s: 2.0.3-alpha, 3.0.0
Status: Resolved (was: Patch Available)
Thanks Gopal. I put this into trunk and branch-2.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476945#comment-13476945 ]
Hadoop QA commented on HADOOP-8926:
-----------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12549285/crc32-faster%2Breadable.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1632//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1632//console
This message is automatically generated.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476916#comment-13476916 ]
Gopal V commented on HADOOP-8926:
---------------------------------
On x86_64, on an EC2 m1.xl instance (after the changes):
Performance table (unit: MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32 ||
| 1 | 9.799 | 72.921 |
| 2 | 18.850 | 177.113 |
| 4 | 42.687 | 214.704 |
| 8 | 70.552 | 318.484 |
| 16 | 111.875 | 416.191 |
| 32 | 153.779 | 496.209 |
| 64 | 190.493 | 544.428 |
| 128 | 215.851 | 564.414 |
| 256 | 232.110 | 590.515 |
| 512 | 240.359 | 581.974 |
| 1024 | 244.682 | 597.676 |
| 2048 | 246.642 | 599.621 |
| 4096 | 249.438 | 604.247 |
| 8192 | 249.247 | 605.547 |
| 16384 | 249.524 | 606.494 |
| 32768 | 249.508 | 602.449 |
| 65536 | 250.977 | 604.064 |
| 131072 | 249.678 | 597.944 |
| 262144 | 249.505 | 603.270 |
| 524288 | 250.805 | 602.656 |
| 1048576 | 250.900 | 602.949 |
| 2097152 | 250.137 | 601.563 |
| 4194304 | 249.406 | 602.058 |
| 8388608 | 249.937 | 598.310 |
| 16777216 | 249.892 | 592.417 |
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478692#comment-13478692 ]
Gopal V commented on HADOOP-8926:
---------------------------------
Thanks Nicholas. There is probably room for further improvement if we had direct ByteBuffers instead of byte[] arrays.
I'm tracing the code path from the network I/O to see whether an NIO change could get me direct ByteBuffers.
Meanwhile, if this is significant enough for a back-port, I can open a new ticket and port this patch to branch-1.1 (or is branch-1 the "live" one?).
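To illustrate what checksumming a direct ByteBuffer looks like at the API level, here is a small sketch. It is an assumption-laden example, not part of this patch: it uses java.util.zip.CRC32 from the JDK (whose update(ByteBuffer) method is available since Java 8) rather than PureJavaCrc32, and the class name DirectBufferChecksum is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch: checksum the remaining bytes of a (possibly direct) ByteBuffer. */
final class DirectBufferChecksum {
    static long checksum(ByteBuffer buf) {
        CRC32 crc = new CRC32();
        // update(ByteBuffer) consumes the buffer it is given, so work on a
        // duplicate to leave the caller's position and limit untouched.
        crc.update(buf.duplicate());
        return crc.getValue();
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        direct.put("123456789".getBytes(StandardCharsets.US_ASCII));
        direct.flip();
        System.out.println(Long.toHexString(checksum(direct)));
    }
}
```

The point of the sketch is only that no intermediate byte[] copy is needed when the data already sits in a direct buffer.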
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484078#comment-13484078 ]
Hudson commented on HADOOP-8926:
--------------------------------
Integrated in Hadoop-Hdfs-0.23-Build #415 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/415/])
svn merge -c 1399005 FIXES: HADOOP-8926. hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data (Gopal V via bobby) (Revision 1401803)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401803
Files :
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32.java
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestPureJavaCrc32.java
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Status: Patch Available (was: Open)
Updated for readability and fewer instructions in the inner loop.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477318#comment-13477318 ]
Gopal V commented on HADOOP-8926:
---------------------------------
The patches are against Hadoop 2.x (trunk) for now; I will port them to branch-1.1 once they have baked in.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475581#comment-13475581 ]
Hadoop QA commented on HADOOP-8926:
-----------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12549014/crc32-faster%2Btest.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1623//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1623//console
This message is automatically generated.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Attachment: pure-crc32-cache-hit.patch
main/ patch
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Priority: Trivial
> Attachments: pure-crc32-cache-hit.patch
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Attachment: crc32-faster+readable.patch
Rewrote the core loop for readability and turned all loop locals into final variables.
Converted the small loop into a switch statement (Java tableswitch instruction).
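As a rough illustration of the loop-to-switch change described above, the sketch below handles the 0 to 7 trailing bytes with a fall-through switch. It is not the code from the attached patch: the class TailSwitchCrc32, the helper step, and the byte-at-a-time bulk loop are assumptions chosen to keep the example short. Dense consecutive case labels like these are what javac typically compiles to a tableswitch.

```java
/** Hypothetical sketch: tail bytes handled by a fall-through switch instead of a loop. */
final class TailSwitchCrc32 {
    private static final int[] T = new int[256];

    static {
        // Ordinary byte-wise CRC-32 table for the reflected polynomial 0xEDB88320.
        for (int i = 0; i < T.length; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            T[i] = c;
        }
    }

    /** Folds one byte into the running CRC state. */
    private static int step(int crc, byte b) {
        return (crc >>> 8) ^ T[(crc ^ b) & 0xff];
    }

    /** Returns the CRC-32 of b[off .. off+len) as an unsigned 32-bit value in a long. */
    static long crc32(byte[] b, int off, int len) {
        int crc = 0xffffffff;
        // Bulk: eight bytes per iteration (byte-wise here to keep the sketch short).
        while (len >= 8) {
            for (int i = 0; i < 8; i++) {
                crc = step(crc, b[off + i]);
            }
            off += 8;
            len -= 8;
        }
        // Tail: 0..7 bytes; each case deliberately falls through to the next.
        switch (len) {
            case 7: crc = step(crc, b[off++]); // fall through
            case 6: crc = step(crc, b[off++]); // fall through
            case 5: crc = step(crc, b[off++]); // fall through
            case 4: crc = step(crc, b[off++]); // fall through
            case 3: crc = step(crc, b[off++]); // fall through
            case 2: crc = step(crc, b[off++]); // fall through
            case 1: crc = step(crc, b[off++]); // fall through
            default: break;
        }
        return ~crc & 0xffffffffL;
    }
}
```

The switch removes the loop counter and branch from the tail path; whether that is measurably faster would need to be confirmed with a benchmark on the target JVM.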
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo (Nicholas), SZE updated HADOOP-8926:
-------------------------------------------
Priority: Major (was: Trivial)
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477863#comment-13477863 ]
Hudson commented on HADOOP-8926:
--------------------------------
Integrated in Hadoop-Mapreduce-trunk #1228 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1228/])
HADOOP-8926. hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data (Gopal V via bobby) (Revision 1399005)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399005
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestPureJavaCrc32.java
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477835#comment-13477835 ]
Hudson commented on HADOOP-8926:
--------------------------------
Integrated in Hadoop-Hdfs-trunk #1198 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1198/])
HADOOP-8926. hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data (Gopal V via bobby) (Revision 1399005)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399005
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestPureJavaCrc32.java
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Attachment: crc32-faster+test.patch
Patch covering the polynomial table generation, CRC32 and CRC32C.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Priority: Trivial
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477393#comment-13477393 ]
Hudson commented on HADOOP-8926:
--------------------------------
Integrated in Hadoop-trunk-Commit #2875 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/2875/])
HADOOP-8926. hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data (Gopal V via bobby) (Revision 1399005)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399005
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestPureJavaCrc32.java
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476795#comment-13476795 ]
Gopal V commented on HADOOP-8926:
---------------------------------
Sure, I will fix the readability of the patch by adding T8_[0-7]_start and re-run it through the JIT to make sure the class variables are getting inlined into the index arithmetic.
If that still results in a register spill, I will move them out of the loop as local final variables and re-submit a patch today.
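The fallback mentioned here, copying class-level state into local final variables before the loop, can be sketched as follows. This is an illustration only, assuming a plain byte-at-a-time CRC-32 with one 256-entry table; the class name and method are hypothetical and this is not the code from the patch. Whether the hoisting actually changes the generated code depends on the JIT and would need to be checked in the compiled output.

```java
import java.util.zip.CRC32;

// Hypothetical illustration of hoisting a static field into a local.
final class HoistedTableSketch {
    // Reflected CRC-32 polynomial, the one used by java.util.zip.CRC32.
    private static final int POLY = 0xEDB88320;
    private static final int[] TABLE = new int[256];

    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    static long crc32(byte[] b, int off, int len) {
        // Read the static field once, outside the loop, so the loop body
        // only refers to a local variable.
        final int[] t = TABLE;
        int crc = 0xFFFFFFFF;
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ t[(crc ^ b[i]) & 0xff];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "hadoop".getBytes();
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        System.out.println(crc32(data, 0, data.length) == reference.getValue());
    }
}
```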
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HADOOP-8926:
----------------------------
Labels: optimization (was: )
Release Note: Speed up Crc32 by improving the cache hit-ratio of hadoop.util.PureJavaCrc32 (was: Improve cache hit-ratio of hadoop.util.PureJavaCrc32 )
Status: Patch Available (was: Open)
The peak throughput (on my machine) went from 283 MB/s to 323 MB/s (according to TestPureJavaCrc32$PerformanceTest)
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478329#comment-13478329 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-8926:
------------------------------------------------
Gopal, amazing work! Your patch gives a performance improvement of 10% or more.
java.version = 1.6.0_35
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_35-b10-428-11M3811
java.vm.version = 20.10-b01-428
java.vm.vendor = Apple Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = x86_64
os.name = Mac OS X
os.version = 10.7.4
Performance Table (The unit is MB/sec)
|| Num Bytes || CRC32 || PureJavaCrc32_8926 || PureJavaCrc32 ||
| 1 | 11.896 | 129.832 | 164.845 |
| 2 | 24.097 | 192.742 | 210.266 |
| 4 | 46.274 | 222.059 | 233.902 |
| 8 | 82.332 | 488.716 | 438.514 |
| 16 | 131.682 | 587.312 | 602.784 |
| 32 | 187.265 | 796.510 | 760.628 |
| 64 | 237.088 | 938.650 | 891.017 |
| 128 | 264.795 | 1049.774 | 913.666 |
| 256 | 291.785 | 1095.084 | 987.380 |
| 512 | 298.590 | 1126.067 | 1002.899 |
| 1024 | 305.349 | 1152.375 | 1040.211 |
| 2048 | 309.342 | 1119.713 | 1033.258 |
| 4096 | 309.162 | 1170.767 | 1047.746 |
| 8192 | 321.775 | 1189.724 | 1053.065 |
| 16384 | 320.457 | 1181.128 | 1060.138 |
| 32768 | 324.524 | 1169.965 | 1050.610 |
| 65536 | 322.380 | 1160.471 | 1053.854 |
| 131072 | 315.983 | 1138.223 | 1009.193 |
| 262144 | 324.293 | 1190.476 | 1020.782 |
| 524288 | 316.003 | 1136.979 | 1015.389 |
| 1048576 | 321.715 | 1081.465 | 1033.750 |
| 2097152 | 318.330 | 1189.680 | 1072.054 |
| 4194304 | 316.710 | 1138.496 | 1024.352 |
| 8388608 | 315.701 | 1124.909 | 1030.505 |
| 16777216 | 325.575 | 1154.724 | 1031.285 |
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Gopal V (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483346#comment-13483346 ]
Gopal V commented on HADOOP-8926:
---------------------------------
Opened HADOOP-8971 as a backport of this to branch-1
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Updated] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans updated HADOOP-8926:
----------------------------------------
Fix Version/s: 0.23.5
I pulled this into branch-0.23 too.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476230#comment-13476230 ]
Robert Joseph Evans commented on HADOOP-8926:
---------------------------------------------
The change looks like a clean refactoring to me. The hard-coded 256*X in several places in the code is not too hard to understand, but I think it would be cleaner if we created a few static final values like
{code}
private static final int T8_7_start = 256*7;
...
private static final int T8_0_start = 0; //256*0
...
localCrc = (T[T8_7_start + (c0 & 0xff)] ^ T[T8_6_start + (c1 & 0xff)])
^ (T[T8_5_start + (c2 & 0xff)] ^ T[T8_4_start + (c3 & 0xff)]);
...
{code}
My only other comment is that we may now need to update the comment at the top of PureJavaCrc32.java. ~10x to 1.8x as fast does not seem accurate any more :)
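For readers without the patch at hand, the layout under discussion can be sketched end to end: eight 256-entry tables stored back to back in one int array, addressed through named start offsets. This is a minimal sketch under stated assumptions, not the committed PureJavaCrc32. The class name, the upper-case constant names, and the runtime table generation are all assumptions made for illustration; the loop is the standard slicing-by-8 form, checked here against java.util.zip.CRC32.

```java
import java.util.zip.CRC32;

// Hypothetical sketch of a single flattened lookup table with named offsets.
final class FlatCrc32Sketch {
    private static final int POLY = 0xEDB88320; // reflected CRC-32 polynomial

    private static final int T8_0_START = 0 * 256;
    private static final int T8_1_START = 1 * 256;
    private static final int T8_2_START = 2 * 256;
    private static final int T8_3_START = 3 * 256;
    private static final int T8_4_START = 4 * 256;
    private static final int T8_5_START = 5 * 256;
    private static final int T8_6_START = 6 * 256;
    private static final int T8_7_START = 7 * 256;

    // All eight tables live in one contiguous array.
    private static final int[] T = new int[8 * 256];

    static {
        // Table 0: the classic byte-at-a-time table.
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            T[T8_0_START + i] = c;
        }
        // Table k advances table k-1 by one more zero byte.
        for (int k = 1; k < 8; k++) {
            for (int i = 0; i < 256; i++) {
                int prev = T[(k - 1) * 256 + i];
                T[k * 256 + i] = (prev >>> 8) ^ T[T8_0_START + (prev & 0xff)];
            }
        }
    }

    static long crc32(byte[] b) {
        int crc = 0xFFFFFFFF;
        int off = 0;
        int len = b.length;
        // Main loop: consume eight bytes per iteration.
        while (len >= 8) {
            int c0 = (b[off] ^ crc) & 0xff;
            int c1 = (b[off + 1] ^ (crc >>> 8)) & 0xff;
            int c2 = (b[off + 2] ^ (crc >>> 16)) & 0xff;
            int c3 = (b[off + 3] ^ (crc >>> 24)) & 0xff;
            crc = (T[T8_7_START + c0] ^ T[T8_6_START + c1])
                ^ (T[T8_5_START + c2] ^ T[T8_4_START + c3]);
            crc ^= (T[T8_3_START + (b[off + 4] & 0xff)] ^ T[T8_2_START + (b[off + 5] & 0xff)])
                 ^ (T[T8_1_START + (b[off + 6] & 0xff)] ^ T[T8_0_START + (b[off + 7] & 0xff)]);
            off += 8;
            len -= 8;
        }
        // Tail: remaining bytes one at a time through table 0.
        while (len-- > 0) {
            crc = (crc >>> 8) ^ T[T8_0_START + ((crc ^ b[off++]) & 0xff)];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "The quick brown fox jumps over the lazy dog".getBytes();
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        System.out.println(crc32(data) == reference.getValue());
    }
}
```

Because the offsets are compile-time constants, each lookup indexes one array with a constant added to the index, rather than loading one of eight separate static array references.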
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479245#comment-13479245 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-8926:
------------------------------------------------
Sure, let's backport this to branch-1.
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Assigned] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suresh Srinivas reassigned HADOOP-8926:
---------------------------------------
Assignee: Gopal V
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477766#comment-13477766 ]
Hudson commented on HADOOP-8926:
--------------------------------
Integrated in Hadoop-Yarn-trunk #6 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/6/])
HADOOP-8926. hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data (Gopal V via bobby) (Revision 1399005)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399005
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestPureJavaCrc32.java
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>
[jira] [Commented] (HADOOP-8926) hadoop.util.PureJavaCrc32 cache
hit-ratio is low for static data
Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477039#comment-13477039 ]
Robert Joseph Evans commented on HADOOP-8926:
---------------------------------------------
The changes look good to me, +1. Gopal, which versions of Hadoop are you targeting with this change?
> hadoop.util.PureJavaCrc32 cache hit-ratio is low for static data
> ----------------------------------------------------------------
>
> Key: HADOOP-8926
> URL: https://issues.apache.org/jira/browse/HADOOP-8926
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.0.3-alpha
> Environment: Ubuntu 10.10 i386
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Trivial
> Labels: optimization
> Attachments: crc32-faster+readable.patch, crc32-faster+test.patch, pure-crc32-cache-hit.patch
>
>