Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/11/14 11:33:39 UTC

[GitHub] ZiyueHuang opened a new pull request #8647: sparse embedding operator, gpu implementation

URL: https://github.com/apache/incubator-mxnet/pull/8647
 
 
   ## Description ##
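   The benchmark was run on a machine with 20 CPU cores and a 1080 GPU with 8 GB of memory. The GPU was shared at the time: others were running a small model on it (occupying 200 MB of GPU memory). The first log below is the CPU run (`use_gpu=False`); the second is the GPU run (`--use-gpu`).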
   ```
   python matrix_factorization.py --dummy-iter --batch-size 1024
   INFO:root:Namespace(batch_size=1024, dummy_iter=True, factor_size=128, num_epoch=3, print_every=100, use_dense=False, use_gpu=False)
   Preparing data iterators for ./ml-10M100K/r1.train ...
   Preparing data iterators for ./ml-10M100K/r1.test ...
   INFO:root:Training started ...
   INFO:root:Epoch[0] Batch [100]  Speed: 76509.67 samples/sec mse=1.515494
   INFO:root:Epoch[0] Batch [200]  Speed: 71561.11 samples/sec mse=0.168003
   INFO:root:Epoch[0] Batch [300]  Speed: 66174.41 samples/sec mse=0.147147
   INFO:root:Epoch[0] Batch [400]  Speed: 67082.92 samples/sec mse=0.128949
   INFO:root:Epoch[0] Batch [500]  Speed: 50821.51 samples/sec mse=0.099553
   INFO:root:Epoch[0] Batch [600]  Speed: 54877.48 samples/sec mse=0.060373
   INFO:root:Epoch[0] Batch [700]  Speed: 61079.30 samples/sec mse=0.033278
   INFO:root:Epoch[0] Batch [800]  Speed: 58887.36 samples/sec mse=0.019830
   INFO:root:Epoch[0] Batch [900]  Speed: 51305.48 samples/sec mse=0.012637
   INFO:root:Epoch[0] Batch [1000] Speed: 53329.86 samples/sec mse=0.008292
   INFO:root:Epoch[0] Batch [1100] Speed: 64218.71 samples/sec mse=0.005527
   INFO:root:Epoch[0] Batch [1200] Speed: 55143.71 samples/sec mse=0.003761
   INFO:root:Epoch[0] Batch [1300] Speed: 59557.81 samples/sec mse=0.002630
   INFO:root:Epoch[0] Batch [1400] Speed: 58418.85 samples/sec mse=0.001896
   INFO:root:Epoch[0] Batch [1500] Speed: 61707.86 samples/sec mse=0.001410
   INFO:root:Epoch[0] Batch [1600] Speed: 58582.34 samples/sec mse=0.001079
   INFO:root:Epoch[0] Batch [1700] Speed: 35937.64 samples/sec mse=0.000848
   INFO:root:Epoch[0] Batch [1800] Speed: 56355.98 samples/sec mse=0.000685
   ```
   
   ```
   python matrix_factorization.py --use-gpu --dummy-iter --batch-size 1024
   INFO:root:Namespace(batch_size=1024, dummy_iter=True, factor_size=128, num_epoch=3, print_every=100, use_dense=False, use_gpu=True)
   Preparing data iterators for ./ml-10M100K/r1.train ...
   Preparing data iterators for ./ml-10M100K/r1.test ...
   INFO:root:Training started ...
   INFO:root:Epoch[0] Batch [100]  Speed: 880708.20 samples/sec  mse=1.526374
   INFO:root:Epoch[0] Batch [200]  Speed: 972044.14 samples/sec  mse=0.168121
   INFO:root:Epoch[0] Batch [300]  Speed: 1067356.36 samples/sec mse=0.145816
   INFO:root:Epoch[0] Batch [400]  Speed: 705763.69 samples/sec  mse=0.126807
   INFO:root:Epoch[0] Batch [500]  Speed: 699850.30 samples/sec  mse=0.098362
   INFO:root:Epoch[0] Batch [600]  Speed: 494053.74 samples/sec  mse=0.061932
   INFO:root:Epoch[0] Batch [700]  Speed: 861395.03 samples/sec  mse=0.034591
   INFO:root:Epoch[0] Batch [800]  Speed: 1112933.79 samples/sec mse=0.020170
   INFO:root:Epoch[0] Batch [900]  Speed: 1134550.92 samples/sec mse=0.012432
   INFO:root:Epoch[0] Batch [1000] Speed: 1092055.60 samples/sec mse=0.007880
   INFO:root:Epoch[0] Batch [1100] Speed: 874503.66 samples/sec  mse=0.005157
   INFO:root:Epoch[0] Batch [1200] Speed: 1078110.87 samples/sec mse=0.003475
   INFO:root:Epoch[0] Batch [1300] Speed: 1109618.57 samples/sec mse=0.002401
   INFO:root:Epoch[0] Batch [1400] Speed: 1099466.59 samples/sec mse=0.001703
   INFO:root:Epoch[0] Batch [1500] Speed: 1100398.99 samples/sec mse=0.001242
   INFO:root:Epoch[0] Batch [1600] Speed: 1125609.93 samples/sec mse=0.000931
   INFO:root:Epoch[0] Batch [1700] Speed: 1112657.00 samples/sec mse=0.000719
   INFO:root:Epoch[0] Batch [1800] Speed: 1095104.58 samples/sec mse=0.000572
   ```
   cc @eric-haibin-lin 
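
To compare the two runs, the `Speed:` values can be pulled out of the logs and averaged. The following is only a minimal sketch: the helper name `speeds` and the choice of the first three batches of each run are illustrative assumptions, not part of this PR, and because the GPU was shared during the benchmark the resulting ratio should be read as a rough indication.

```python
import re
from statistics import mean

# Matches the throughput figure in lines such as
# "INFO:root:Epoch[0] Batch [100]  Speed: 76509.67 samples/sec mse=1.515494"
SPEED_RE = re.compile(r"Speed:\s*([0-9.]+)\s*samples/sec")


def speeds(log_text):
    """Return every samples/sec value found in the log text, in order."""
    return [float(m.group(1)) for m in SPEED_RE.finditer(log_text)]


# First three batches of each run, copied from the logs above.
cpu_log = """
INFO:root:Epoch[0] Batch [100]  Speed: 76509.67 samples/sec mse=1.515494
INFO:root:Epoch[0] Batch [200]  Speed: 71561.11 samples/sec mse=0.168003
INFO:root:Epoch[0] Batch [300]  Speed: 66174.41 samples/sec mse=0.147147
"""

gpu_log = """
INFO:root:Epoch[0] Batch [100]  Speed: 880708.20 samples/sec  mse=1.526374
INFO:root:Epoch[0] Batch [200]  Speed: 972044.14 samples/sec  mse=0.168121
INFO:root:Epoch[0] Batch [300]  Speed: 1067356.36 samples/sec mse=0.145816
"""

cpu_mean = mean(speeds(cpu_log))
gpu_mean = mean(speeds(gpu_log))
speedup = gpu_mean / cpu_mean

print(f"CPU mean: {cpu_mean:.2f} samples/sec")
print(f"GPU mean: {gpu_mean:.2f} samples/sec")
print(f"GPU/CPU ratio: {speedup:.1f}x")
```

Pasting the full logs into `cpu_log` and `gpu_log` gives the averages over all reported batches instead of just the first three.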
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Passed code style checking (`make lint`)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] For user-facing API changes, API doc string has been updated. For new C++ functions in header files, their functionalities and arguments are well-documented. 
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be made.
   - Interesting edge cases to note here
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services