Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/11/13 15:02:07 UTC

[GitHub] jiajinyu edited a comment on issue #12958: Improve dot(csr, rsp) on CPU by 10x

jiajinyu edited a comment on issue #12958: Improve dot(csr, rsp) on CPU by 10x 
URL: https://github.com/apache/incubator-mxnet/pull/12958#issuecomment-438274161
 
 
   I updated the benchmark script and compared perf across different rsp and csr densities. The conclusion is that the perf gain diminishes as the csr matrix gets denser or when the rsp is very sparse. My 2 cents is that when the csr gets denser, the binary search doesn't matter because of the cache locality of `data_r`.
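
To make the binary search remark concrete, here is a minimal pure-Python sketch of a dot(csr, rsp) that looks up each csr column index in the sorted rsp row-id array with a binary search. It is only an illustration of the lookup pattern, not the MXNet kernel: the function name, argument names, and the tiny example data are all hypothetical.

```python
from bisect import bisect_left

def dot_csr_rsp(indptr, indices, data, rsp_rows, rsp_data, out_cols):
    """Illustrative dot(csr, rsp) returning a dense list of rows.

    csr is given as (indptr, indices, data) in the usual CSR layout.
    rsp is given as rsp_rows (sorted ids of the non-zero rows) and
    rsp_data (rsp_data[i] is the dense row stored for rsp_rows[i]).
    All names are hypothetical, chosen for this sketch only.
    """
    num_rows = len(indptr) - 1
    out = [[0.0] * out_cols for _ in range(num_rows)]
    for i in range(num_rows):
        for k in range(indptr[i], indptr[i + 1]):
            col = indices[k]
            # Binary search: is row `col` stored in the rsp, and where?
            pos = bisect_left(rsp_rows, col)
            if pos == len(rsp_rows) or rsp_rows[pos] != col:
                continue  # that rsp row is all zeros, nothing to add
            row = rsp_data[pos]
            for j in range(out_cols):
                out[i][j] += data[k] * row[j]
    return out

# 2x4 csr: row 0 has entries at columns 0 and 2, row 1 at column 3.
indptr, indices, data = [0, 2, 3], [0, 2, 3], [1.0, 2.0, 3.0]
# 4x2 rsp with only rows 2 and 3 stored.
rsp_rows, rsp_data = [2, 3], [[1.0, 1.0], [2.0, 0.5]]

result = dot_csr_rsp(indptr, indices, data, rsp_rows, rsp_data, out_cols=2)
print(result)
```

In this sketch one search is issued per csr non-zero, so the number of searches grows with csr density while the cost of each search grows with the number of stored rsp rows.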
   
   
   ## comparison
   
   For hash dim = 2^28 (roughly 0.27G) and csr density 0.0001, if we interpret each row of the csr as a training example, we have about 26k features per example. The perf comparison (using an Intel Xeon E5-2650 with OMP_NUM_THREADS=24) is the following:
   
   | rsp density | old run in secs | new run in secs |
   | --- | --- | --- |
   | 0.01 | 0.3695 | 0.2037 |
   | 0.05 | 1.4380 | 0.4995 |
   | 0.15 | 4.7707 | 0.9360 |
   | 0.25 | 6.3904 | 1.5058 |
   | 0.50 | 11.4553 | 1.7401 |
   | 0.75 | 14.7599 | 2.9364 |
   
   For csr density 0.001, we have about 260k features per example, and the perf comparison is the following:
   
   | rsp density | old run in secs | new run in secs |
   | --- | --- | --- |
   | 0.01 | 0.6384 | 0.9936 |
   | 0.05 | 1.9556 | 2.5660 |
   | 0.15 | 4.6588 | 4.4983 |
   | 0.25 | 7.2088 | 6.0316 |
   | 0.50 | 11.1837 | 8.4714 |
   | 0.75 | 14.7599 | 8.1321 |
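
To read the two tables as ratios, the short script below divides the old run time by the new run time for each row. The timings are copied from the tables above; the variable and function names are just for this sketch.

```python
# (rsp density, old seconds, new seconds), copied from the tables above.
csr_1e4 = [
    (0.01, 0.3695, 0.2037),
    (0.05, 1.4380, 0.4995),
    (0.15, 4.7707, 0.9360),
    (0.25, 6.3904, 1.5058),
    (0.50, 11.4553, 1.7401),
    (0.75, 14.7599, 2.9364),
]
csr_1e3 = [
    (0.01, 0.6384, 0.9936),
    (0.05, 1.9556, 2.5660),
    (0.15, 4.6588, 4.4983),
    (0.25, 7.2088, 6.0316),
    (0.50, 11.1837, 8.4714),
    (0.75, 14.7599, 8.1321),
]

def speedups(rows):
    """Return (rsp density, old/new) pairs; a ratio above 1 means the new code is faster."""
    return [(density, old / new) for density, old, new in rows]

for label, rows in (("csr density 0.0001", csr_1e4), ("csr density 0.001", csr_1e3)):
    print(label)
    for density, ratio in speedups(rows):
        print(f"  rsp density {density:.2f}: old/new = {ratio:.2f}x")
```

By this calculation the gain at csr density 0.0001 peaks at roughly 6.6x (rsp density 0.50), while at csr density 0.001 the new code is slower for rsp density 0.01 and 0.05 and faster from 0.15 upward.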
   
   The perf is worse when the csr density is 1%, which means we would have 2.6M features per example for dim = 2^28, but I'm not sure that makes sense.
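
The features-per-example figures quoted in this comment follow from multiplying the csr density by the hash dimension. A small sketch of that arithmetic, assuming the non-zeros are spread evenly across rows (the names are hypothetical):

```python
HASH_DIM = 2 ** 28  # 268,435,456 columns, i.e. roughly 0.27G

def features_per_example(csr_density, dim=HASH_DIM):
    """Expected non-zeros per csr row, assuming they are spread evenly across rows."""
    return csr_density * dim

for density in (0.0001, 0.001, 0.01):
    count = features_per_example(density)
    print(f"csr density {density}: about {count:,.0f} features per example")
```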
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services