Posted to commits@hama.apache.org by Apache Wiki <wi...@apache.org> on 2009/01/19 03:44:59 UTC
[Hama Wiki] Trivial Update of "PerformanceEvaluation" by udanax
The following page has been changed by udanax:
http://wiki.apache.org/hama/PerformanceEvaluation
------------------------------------------------------------------------------
- == Benchmarks ==
+ * work in progress.
+ * See also : http://blog.udanax.org/2009/01/distributed-matrix-multiplication-with.html
- These timings include data load and export operations.
-
- Dependencies Information :
-
- * Hadoop 0.18.2
- * HBase 0.18.1
-
- Hardware Information :
-
- * 4 Intel(R) Xeon(R) CPU 2.33GHz, SATA hard disk, Physical Memory 16,626,844 KB
- ----
-
- * Dense matrix add
- * Dense matrix multiply
-
- ||<bgcolor="#ececec"> Version ||<bgcolor="#ececec"> Operation ||<bgcolor="#ececec"> Cluster Size ||<bgcolor="#ececec"> Rows ||<bgcolor="#ececec"> Columns ||<bgcolor="#ececec"> Total Maps ||<bgcolor="#ececec"> Total Reduces ||<bgcolor="#ececec"> Time (seconds) ||<bgcolor="#ececec"> Bytes Read ||<bgcolor="#ececec"> Bytes Written||<bgcolor="#ececec"> mapred.child.java.opts ||
- ||Trunk 718158 || Mult ||2 node ||300 ||300 ||2||2||12 ||1,464,484 || 2,929,092|| -Xmx200m ||
- ||Trunk 720735 || Mult ||2 node ||1,000 ||1,000 ||2||2||20 || 16,166,452 || 32,333,028 || -Xmx200m ||
- ||Trunk 722320 || Add || 2 node ||3,000 ||3,000 ||4||2||298 || 1,053,503,366 || 1,575,781,107 || -Xmx200m ||
- ||Trunk 722320 || Mult ||2 node ||3,000 ||3,000 ||4||2||124 || 590,672,392 || 872,228,808 || -Xmx200m ||
- ||Trunk 722320 || Mult ||2 node ||5,000 ||5,000 ||50||4||912 || 24,434,034,076 || 34,631,558,186 || -Xmx200m ||
-
- {{{
- NOTE: The following numbers are obtained by using poe+ on the entire code, including minimal I/O and matrix construction.
-
- Matrix-Matrix Multiply of 5,000 by 5,000 dense matrix
-
- Mflip/s  Wall sec  Library
- -------  --------  -------------------------------------------
-   8,300        30  PESSL PDGEMM (16 processors)
-   7,900        32  ScaLAPACK routine PDGEMM (16 processors)
-   7,900        32  ESSL-SMP routine DGEMM (16 threads)
-   7,900        32  NAG-SMP routine F01CKF (16 threads)
-   1,200       213  ESSL routine DGEMM
-
- Matrix-Matrix Multiply of 20,000 by 20,000 dense matrix
-
- Mflip/s  Wall sec  Library and configuration
- -------  --------  -------------------------------------------
- 158,900       100  ScaLAPACK PDGEMM (256 proc, 16 nodes)
- 146,200       110  PESSL PDGEMM (256 proc, 16 nodes)
- 105,400       150  ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128)
- 100,960       160  PESSL PDGEMM (144 proc, 9 nodes, block 128)
-  79,400       200  PESSL PDGEMM (144 proc, 9 nodes, block 1024)
-  74,800       214  ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024)
-  55,000       290  PESSL PDGEMM (64 proc, 4 nodes)
-  50,000       320  ScaLAPACK PDGEMM (64 proc, 4 nodes)
-  27,160       590  PESSL PDGEMM (32 proc, 2 nodes)
-  25,630       625  ScaLAPACK PDGEMM (32 proc, 2 nodes)
-  15,800     1,010  PESSL PDGEMM (16 Proc, 1 node)
-  15,600     1,025  ScaLAPACK PDGEMM (16 Proc, 1 node)
-
- Matrix-Matrix Multiply of Larger Dense Matrix
-
- Gflip/s  Wall sec     Size  Library and configuration
- -------  --------  -------  -------------------------------------------
-   163.6     1,529   50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
-   163.4     1,531   50,000  PESSL PDGEMM (256 proc, 16 nodes)
-   179.6    11,141  100,000  PESSL PDGEMM (256 proc, 16 nodes, 128 block)
-   210.7     9,495  100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, 128 block)
- }}}
- ----
-
- * Dense LU factorization
- * Transpose
- * Matrix tridiagonalization, for eigenvalue computations of symmetric matrices.
-