Posted to commits@hama.apache.org by Apache Wiki <wi...@apache.org> on 2009/01/19 03:44:59 UTC

[Hama Wiki] Trivial Update of "PerformanceEvaluation" by udanax

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/hama/PerformanceEvaluation

------------------------------------------------------------------------------
- == Benchmarks ==
+  * Work in progress.
+  * See also: http://blog.udanax.org/2009/01/distributed-matrix-multiplication-with.html
  
- The timings below include data load and export operations.
- 
- Dependency information:
- 
-  * Hadoop 0.18.2
-  * HBase 0.18.1
- 
- Hardware information:
- 
-  * 4 Intel(R) Xeon(R) CPU 2.33GHz, SATA hard disk, Physical Memory 16,626,844 KB
- ----
- 
-  * Dense matrix add
-  * Dense matrix multiply
- 
- ||<bgcolor="#ececec"> Version ||<bgcolor="#ececec"> Operation ||<bgcolor="#ececec"> Cluster Size ||<bgcolor="#ececec"> Rows ||<bgcolor="#ececec"> Columns ||<bgcolor="#ececec"> Total Maps ||<bgcolor="#ececec"> Total Reduces ||<bgcolor="#ececec"> Time (seconds) ||<bgcolor="#ececec"> Bytes Read ||<bgcolor="#ececec"> Bytes Written||<bgcolor="#ececec"> mapred.child.java.opts ||
- ||Trunk 718158 || Mult || 2 nodes || 300 || 300 || 2 || 2 || 12 || 1,464,484 || 2,929,092 || -Xmx200m ||
- ||Trunk 720735 || Mult || 2 nodes || 1,000 || 1,000 || 2 || 2 || 20 || 16,166,452 || 32,333,028 || -Xmx200m ||
- ||Trunk 722320 || Add || 2 nodes || 3,000 || 3,000 || 4 || 2 || 298 || 1,053,503,366 || 1,575,781,107 || -Xmx200m ||
- ||Trunk 722320 || Mult || 2 nodes || 3,000 || 3,000 || 4 || 2 || 124 || 590,672,392 || 872,228,808 || -Xmx200m ||
- ||Trunk 722320 || Mult || 2 nodes || 5,000 || 5,000 || 50 || 4 || 912 || 24,434,034,076 || 34,631,558,186 || -Xmx200m ||
- 
- {{{
- NOTE: The following numbers were obtained by running poe+ on the entire code, including minimal I/O and matrix construction.
- 
- Matrix-Matrix Multiply of 5,000 by 5,000 dense matrix
- 
- Mflip/s  Wall sec   Library
- -------  --------   -------------------------------------------
-  8,300       30     PESSL PDGEMM (16 processors)
-  7,900       32     ScaLAPACK routine PDGEMM (16 processors)
-  7,900       32     ESSL-SMP routine DGEMM (16 threads)
-  7,900       32     NAG-SMP routine F01CKF (16 threads)
-  1,200      213     ESSL routine DGEMM
- 
- Matrix-Matrix Multiply of 20,000 by 20,000 dense matrix
- 
- Mflip/s  Wall sec   Library and configuration
- -------  --------   -------------------------------------------
- 158,900     100     ScaLAPACK PDGEMM (256 proc, 16 nodes) 
- 146,200     110     PESSL PDGEMM (256 proc, 16 nodes) 
- 105,400     150     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128) 
- 100,960     160     PESSL PDGEMM (144 proc, 9 nodes, block 128) 
-  79,400     200     PESSL PDGEMM (144 proc, 9 nodes, block 1024) 
-  74,800     214     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024) 
-  55,000     290     PESSL PDGEMM (64 proc, 4 nodes) 
-  50,000     320     ScaLAPACK PDGEMM (64 proc, 4 nodes) 
-  27,160     590     PESSL PDGEMM (32 proc, 2 nodes) 
-  25,630     625     ScaLAPACK PDGEMM (32 proc, 2 nodes) 
-  15,800   1,010     PESSL PDGEMM (16 proc, 1 node)
-  15,600   1,025     ScaLAPACK PDGEMM (16 proc, 1 node)
- 
- Matrix-Matrix Multiply of Larger Dense Matrix
- 
- Gflip/s Wall sec Size    Library and configuration
- ------- -------- -------  -------------------------------------------
- 163.6   1,529   50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
- 163.4   1,531   50,000  PESSL PDGEMM (256 proc, 16 nodes)
- 179.6  11,141  100,000  PESSL PDGEMM (256 proc, 16 nodes, block 128)
- 210.7   9,495  100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, block 128)
- }}}
- ----
- 
-  * Dense LU factorization
-  * Transpose
-  * Matrix tridiagonalization, for eigenvalue computations of symmetric matrices.
-
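
To put the "Mult" wall times from the removed benchmark table on a rate scale, the sketch below converts each run into an approximate Mflop/s figure. It assumes a dense n x n multiply costs about 2 * n^3 floating-point operations; that cost model, the function names, and the constant name are illustrative and do not come from the wiki page. Because the Hama timings include data load and export, the results are end-to-end rates rather than kernel rates, and they are not necessarily the same quantity as the Mflip/s values that poe+ reports in the reference listing.

```python
# Rough throughput estimate for the dense multiply rows of the removed table.
# Assumption (not from the wiki page): a dense n x n multiply costs about
# 2 * n**3 floating-point operations.


def dense_multiply_mflops(n, seconds):
    """Return the approximate Mflop/s for an n x n dense multiply."""
    return 2 * n**3 / seconds / 1e6


# (rows, wall seconds) copied from the "Mult" rows of the table.
HAMA_MULT_RUNS = [(300, 12), (1000, 20), (3000, 124), (5000, 912)]


def summarize(runs):
    """Return (n, Mflop/s rounded to one decimal) for each run."""
    return [(n, round(dense_multiply_mflops(n, s), 1)) for n, s in runs]


if __name__ == "__main__":
    for n, rate in summarize(HAMA_MULT_RUNS):
        print(f"{n:>6} x {n:<6} {rate:>8.1f} Mflop/s")
```

Under these assumptions the 5,000 by 5,000 run on 2 nodes works out to roughly 274 Mflop/s end to end.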