Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/15 23:47:08 UTC

[GitHub] [incubator-mxnet] apeforest opened a new issue #17331: [mxnet 2.0] Turning on large tensor support by default

apeforest opened a new issue #17331: [mxnet 2.0] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331
 
 
   ## Description
   Currently, MXNet only supports tensors with fewer than 2^31 elements. To use larger tensors, users need to recompile MXNet with the USE_INT64_TENSOR_SIZE compiler flag set to ON.
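   A minimal build sketch: the USE_INT64_TENSOR_SIZE flag name comes from this issue, while the clone URL and the remaining CMake options are illustrative and may need adjusting for your environment.

```shell
# Build MXNet from source with large tensor (int64 index) support enabled.
# USE_INT64_TENSOR_SIZE is the flag discussed in this issue; other options
# shown here are illustrative defaults, not a recommended configuration.
git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
cd mxnet
cmake -B build -DUSE_INT64_TENSOR_SIZE=ON -DUSE_CUDA=OFF
cmake --build build -j"$(nproc)"
```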
   
   Large tensors are common in applications such as recommendation systems with sparse embedding matrices and in graph neural network frameworks such as DGL.
   
   To provide a better user experience, we would like to turn this compiler flag on by default so that official MXNet binary releases support large tensors.
   
   RFC: https://lists.apache.org/thread.html/df53b8c26e9e0433378dd803baba9fec4dd922728a5ce9135dc164b3@%3Cdev.mxnet.apache.org%3E 
   
   ## Current Status:
   Large tensor support is already implemented in the MXNet backend and C API. Over 80 operators have been tested, and more are being tested.
   
   Performance degradation was found in a few operators, such as transpose, and has been fixed (https://github.com/apache/incubator-mxnet/pull/16104).
   
   ## TODO
   - update the MXNet development doc and FAQ for adding new operators (@ChaiBapchya)
   - turn on nightly tests for large tensors (@access2rohit)
   - add end-to-end tests for a list of models (TBD)
   - set the flag to ON and clean up
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582689067
 
 
   Training benchmarks comparing the LT_MKL build with the MKL-only build.
   Speed is measured in seconds per epoch.
   GPU memory is measured in MB.
   
   Note: samples/second runs in the opposite direction (higher is better), so I have multiplied those percentages by -1. A quick explanation: without the flip, a positive percentage change means fewer samples/second and a negative change means more samples/second; flipping the sign makes positive consistently mean better.
   
   
   Model | Speed P50 LT | Speed P50 No LT | GPU Memory LT | GPU Memory No LT | Samples/Second P50 LT | Samples/Second P50 no LT | Speed Percentage Change | GPU Memory Percentage Change | Samples/Second Percentage Change
   -- | -- | -- | -- | -- | -- | -- | -- | -- | --
   xception | 19247.12517 | 18935.02989 | 15304 | 15320 | 67.51961 | 68.61849 | -1.65% | 0.10% | -1.60%
   resnet50_v2 | 4342.953992 | 4342.899322 | 6892 | 6762 | 299.0174 | 299.1728 | 0.00% | -1.92% | -0.05%
   gnmt | N/A | N/A | 4244 | 4112 | 7.65 | 7.675 |   | -3.21% | -0.33%
   vgg16 | 5680.658345 | 5641.058277 | 9480 | 9496 | 228.4218 | 230.0739 | -0.70% | 0.17% | -0.72%
   bert | 20.66 | 16.8 | 4684 | 4050 | 38.1 | 46.7 | -22.98% | -15.65% | -18.42%
   yolo3_darknet53_custom | 517.4205 | 454.908 | 7304 | 12436 | 31.6145 | 40.65 | -13.74% | 41.27% | -22.23%
   inceptionv3 | 5765.122603 | 5723.867063 | 8318 | 8304 | 225.4025 | 227.1884 | -0.72% | -0.17% | -0.79%
   se_resnet152_v1 | 10497.33863 | 10465.23692 | 11290 | 10568 | 123.7371 | 124.1493 | -0.31% | -6.83% | -0.33%
   word_language_model | 141.125 | 142.3 | 8846 | 7426 | 15651.19 | 15524.71 | 0.83% | -19.12% | 0.81%
   mobilenet0.25_cifar10 | 56.6609205 | 60.5992765 | 1234 | 1134 | N/A | N/A | 6.50% | -8.82% |  
   resnet101_v1 | 7354.353666 | 7329.202738 | 8118 | 8022 | 176.6355 | 177.3132 | -0.34% | -1.20% | -0.38%
   squeezenet1.0 | 1677.752777 | 1678.684668 | 3770 | 3590 | 790.7722 | 790.1395 | 0.06% | -5.01% | 0.08%
   mobilenetv2_0.75 | 1938.194231 | 1968.429737 | 5078 | 5008 | 680.4143 | 672.2202 | 1.54% | -1.40% | 1.22%
   ssd | 424.28 | 254.9485 | 4702 | 4592 | 66.2365 | 67.56 | -66.42% | -2.40% | -1.96%
   
   Average Percentage Change:
   Speed:  -7.53%
   GPU Memory: -1.73% 
   Samples / Second: -3.44%
   
   
   


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580034113
 
 
   Add LT support to ops found via OpPerf
   https://github.com/apache/incubator-mxnet/pull/17444
   Random, Sample, PDF ops : https://github.com/apache/incubator-mxnet/pull/17445 [Merged]


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582702619
 
 
   @eric-haibin-lin Yes, I am calculating this as 1 - (<LT, MKL value> / <MKL value>).
   For the samples/sec column I multiply the result by -1.


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253
   [OpPerf] : Neural Network Loss Ops https://github.com/apache/incubator-mxnet/pull/17482 [Merged]
   [OpPerf] : Consolidate array manipulation related operators #17487 


[GitHub] [incubator-mxnet] apeforest commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-589829033
 
 
   Thanks to @JonTanS for running the profiler; we have pinpointed the performance degradation to the operators `broadcast_axis` (from 138 ms to 177 ms) and `MXNDArraySyncCopyToCPU` (from 592 ms to 679 ms).
   
   Running the operator-level profiler, we can also reproduce the performance drop in `broadcast_axis` alone.
   
   w/o USE_INT64_TENSOR_SIZE flag:
   ```[{'broadcast_axis': [{'inputs': {'data': (1, 1024, 1), 'axis': (0, 2), 'size': (1024, 8)}, 'max_storage_mem_alloc_gpu/0': 16777.2168, 'avg_time_forward_broadcast_axis': 2.7753}]}]```
   
   w/ USE_INT64_TENSOR_SIZE flag:
   ```[{'broadcast_axis': [{'inputs': {'data': (1, 1024, 1), 'axis': (0, 2), 'size': (1024, 8)}, 'max_storage_mem_alloc_gpu/0': 16777.2168, 'avg_time_forward_broadcast_axis': 6.3178}]}]```
   
   Also, looking into the implementation of the `broadcast_axis` operator, it performs many modulo and multiplication operations on the indices. The next step is to find a more optimal implementation of `broadcast_axis` that reduces this index arithmetic (ALU operations) in the kernel.
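   The index arithmetic involved can be sketched in Python. This is a hedged illustration of how a broadcast kernel typically maps output elements back to input elements, not MXNet's actual kernel code; the shapes match the profiled case above.

```python
# A broadcast kernel computes, for each output element, the corresponding
# input element using one div/mod per axis. With USE_INT64_TENSOR_SIZE those
# become 64-bit integer operations, which are more expensive on GPU ALUs.
def src_index(dst_idx, in_shape, out_shape):
    """Map a flat output index to the flat input index of a broadcast."""
    idx, stride = 0, 1
    for axis in reversed(range(len(out_shape))):
        coord = dst_idx % out_shape[axis]   # coordinate along this axis
        dst_idx //= out_shape[axis]
        if in_shape[axis] != 1:             # broadcast axes read coordinate 0
            idx += coord * stride
            stride *= in_shape[axis]
    return idx

# Broadcasting (1, 1024, 1) over axes (0, 2) with sizes (1024, 8),
# i.e. to output shape (1024, 1024, 8), as in the profiled run:
in_shape, out_shape = (1, 1024, 1), (1024, 1024, 8)
assert src_index(0, in_shape, out_shape) == 0
assert src_index(8, in_shape, out_shape) == 1       # j = 1 -> input element 1
assert src_index(8 * 1024, in_shape, out_shape) == 0  # i = 1 -> still input 0
```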


[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582752477
 
 
   @apeforest Oh sorry, I am multiplying only the samples/second column by -1, to keep its meaning consistent with everything else. The rest of the columns already read correctly: positive percentage means improvement, negative means degradation.
   
   For example, if MKL_LT gives 66 samples/sec and MKL gives 70 samples/sec, that is 1 - (66/70), or about 6%. Because it is positive, it looks like an improvement, but it is actually worse: throughput has gone down.
   
   On the other hand, if MKL_LT gives 74 samples/sec and MKL gives 70 samples/sec, that is 1 - (74/70), or about -6%. Because it is negative, it looks worse, but it is actually better: throughput has gone up.
   
   So I multiply by -1 to give it the same meaning as the rest of the percentages, where positive is better and negative is worse.
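   The convention described above can be sketched as follows (hedged: the function and parameter names are mine, not from the benchmark scripts):

```python
def pct_change(lt_value, baseline_value, higher_is_better=False):
    """Percentage change of the LT build vs. the baseline, as described above:
    1 - (LT / baseline), with throughput-style metrics (higher is better)
    flipped by -1 so a positive result always means the LT build is better."""
    change = 1.0 - (lt_value / baseline_value)
    return -change if higher_is_better else change

# Throughput of 66 samples/sec (LT) vs 70 (baseline) is a regression...
assert pct_change(66, 70, higher_is_better=True) < 0
# ...while 74 vs 70 is an improvement, matching the examples above.
assert pct_change(74, 70, higher_is_better=True) > 0
```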


[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580034113
 
 
   Add LT support to ops found via OpPerf
   https://github.com/apache/incubator-mxnet/pull/17444
   https://github.com/apache/incubator-mxnet/pull/17445


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   
   MXNet Type | Model | Mode | average | p50 | p90 | p99 | std-dev | p50 Improvement | p50 Improvement Percentage
   -- | -- | -- | -- | -- | -- | -- | -- | -- | --
   mxnet_LT_MKL | resnext101_64x4d | gluon | 50.1521455 | 47.3425388 | 53.8454056 | 191.146851 | 25.6092029 | 2.12430954 | 4%
   mxnet_MKL | resnext101_64x4d | gluon | 50.6807831 | 49.4668484 | 53.098917 | 192.113161 | 26.3168013 |   |  
   LT_MKL / MKL |   | 0.98956927 | 0.95705589 | 1.01405845 | 0.9949701 | 0.9731123 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnext101_64x4d | module | 33.3508749 | 28.8367271 | 35.504818 | 151.783705 | 27.1059643 | -0.3488064 | -1%
   mxnet_MKL | resnext101_64x4d | module | 32.4659094 | 28.4879208 | 36.0739231 | 99.5042324 | 20.7594173 |   |  
   LT_MKL / MKL |   | 1.0272583 | 1.01224401 | 0.98422392 | 1.52539948 | 1.30571894 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnext50 | gluon | 18.7125455 | 17.1453953 | 21.2440491 | 23.4313011 | 5.93459749 | 0.91052055 | 5%
   mxnet_MKL | resnext50 | gluon | 19.1066013 | 18.0559158 | 20.870924 | 24.1959095 | 7.32521894 |   |  
   LT_MKL / MKL |   | 0.97937593 | 0.94957218 | 1.01787775 | 0.96839927 | 0.81015974 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnext50 | module | 11.056515 | 10.0550652 | 12.2823715 | 13.5610104 | 6.54063042 | -0.4184246 | -4%
   mxnet_MKL | resnext50 | module | 10.7395838 | 9.63664055 | 12.3991966 | 13.109684 | 6.55145709 |   |  
   LT_MKL / MKL |   | 1.02951057 | 1.04342017 | 0.99057801 | 1.03442695 | 0.99834744 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | nin | gluon | 2.59861518 | 2.57444382 | 2.81214714 | 3.3082962 | 1.0480078 | 0.03361702 | 1%
   mxnet_MKL | nin | gluon | 2.67292721 | 2.60806084 | 2.96282768 | 3.25322151 | 1.40067806 |   |  
   LT_MKL / MKL |   | 0.97219826 | 0.98711034 | 0.949143 | 1.01692928 | 0.74821462 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | nin | module | 2.44113064 | 2.43210793 | 2.50315666 | 2.71081924 | 1.25599202 | 0.30565262 | 11%
   mxnet_MKL | nin | module | 2.78057171 | 2.73776054 | 2.78377533 | 3.11684608 | 2.75584319 |   |  
   LT_MKL / MKL |   | 0.877924 | 0.8883567 | 0.89919493 | 0.86973151 | 0.45575598 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet18 | gluon | 3.86981446 | 3.89575958 | 4.06169891 | 4.43696976 | 2.31413846 | -0.2574921 | -7%
   mxnet_MKL | resnet18 | gluon | 3.69065023 | 3.63826752 | 3.96156311 | 4.49037552 | 1.61035003 |   |  
   LT_MKL / MKL |   | 1.04854544 | 1.07077326 | 1.02527684 | 0.98810662 | 1.43704065 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet18 | module | 2.99924937 | 2.95495987 | 3.18932533 | 3.74364853 | 3.15199014 | 0.22792816 | 7%
   mxnet_MKL | resnet18 | module | 3.20062987 | 3.18288803 | 3.33738327 | 3.64685059 | 1.98630643 |   |  
   LT_MKL / MKL |   | 0.93708098 | 0.92838951 | 0.95563652 | 1.02654289 | 1.58685996 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | wavernn | gluon | 299.9031 | 262.938976 | 365.684748 | 861.40275 | 121.629071 | -6.3843727 | -2%
   mxnet_MKL | wavernn | gluon | 280.279419 | 256.554604 | 327.608585 | 785.755396 | 113.065038 |   |  
   LT_MKL / MKL |   | 1.0700147 | 1.02488504 | 1.11622456 | 1.09627341 | 1.0757443 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | caffenet | gluon | 2.94373734 | 2.93087959 | 3.27038765 | 3.6482811 | 2.00530845 | 0.15687943 | 5%
   mxnet_MKL | caffenet | gluon | 3.28968997 | 3.08775902 | 3.67426872 | 4.28771973 | 2.42602299 |   |  
   LT_MKL / MKL |   | 0.89483732 | 0.94919311 | 0.89007852 | 0.85086744 | 0.82658262 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | caffenet | module | 3.23916318 | 3.16953659 | 3.4327507 | 3.99780273 | 3.02630778 | 0.05578995 | 2%
   mxnet_MKL | caffenet | module | 3.3244319 | 3.22532654 | 3.74293327 | 4.31966782 | 2.66768369 |   |  
   LT_MKL / MKL |   | 0.97435089 | 0.98270254 | 0.91712848 | 0.92548846 | 1.13443276 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | vgg19 | gluon | 18.4855731 | 14.1830444 | 22.5963593 | 36.3841057 | 11.3180528 | -0.2920628 | -2%
   mxnet_MKL | vgg19 | gluon | 18.1985553 | 13.8909817 | 22.2861767 | 31.3508511 | 9.6094407 |   |  
   LT_MKL / MKL |   | 1.01577146 | 1.02102535 | 1.01391816 | 1.16054603 | 1.17780557 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | vgg19 | module | 17.794488 | 13.8015747 | 21.9786167 | 44.4116592 | 12.0074793 | 0.53334236 | 4%
   mxnet_MKL | vgg19 | module | 19.0417649 | 14.3349171 | 22.4690437 | 52.0129204 | 14.4256122 |   |  
   LT_MKL / MKL |   | 0.93449783 | 0.96279418 | 0.97817321 | 0.85385821 | 0.83237225 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | maskrcnn | gluon | 2388.43867 | 2340.85274 | 3201.93601 | 3775.1441 | 580.870266 | 50.8880615 | 2%
   mxnet_MKL | maskrcnn | gluon | 2450.09868 | 2391.7408 | 3336.97629 | 3986.08613 | 629.958779 |   |  
   LT_MKL / MKL |   | 0.97483367 | 0.97872342 | 0.95953214 | 0.94708041 | 0.92207663 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | maskrcnn | module | 1908.69608 | 1943.51578 | 2560.54378 | 3126.19233 | 507.120654 | -17.136097 | -1%
   mxnet_MKL | maskrcnn | module | 1907.48754 | 1926.37968 | 2591.58731 | 3203.3329 | 522.315238 |   |  
   LT_MKL / MKL |   | 1.00063358 | 1.00889549 | 0.98802142 | 0.97591865 | 0.97090917 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | superres | gluon | 18.7808693 | 17.3916817 | 20.9929943 | 23.9422321 | 4.32081492 | 0.6172657 | 3%
   mxnet_MKL | superres | gluon | 19.6252466 | 18.0089474 | 21.9612122 | 27.826786 | 6.95733346 |   |  
   LT_MKL / MKL |   | 0.95697495 | 0.9657245 | 0.95591237 | 0.86040235 | 0.62104468 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | superres | module | 18.2737037 | 16.9847012 | 20.7872391 | 23.021698 | 4.47125837 | 0.27728081 | 2%
   mxnet_MKL | superres | module | 18.8063287 | 17.261982 | 21.007061 | 22.7613449 | 5.01140401 |   |  
   LT_MKL / MKL |   | 0.97167842 | 0.98393691 | 0.98953581 | 1.01143839 | 0.8922167 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet101 | gluon | 20.8401657 | 18.7370777 | 23.0050087 | 30.2422047 | 10.8608305 | -0.2958775 | -2%
   mxnet_MKL | resnet101 | gluon | 20.0479187 | 18.4412003 | 22.9201317 | 25.1522064 | 11.6235338 |   |  
   LT_MKL / MKL |   | 1.03951767 | 1.01604437 | 1.00370316 | 1.20236786 | 0.93438284 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet101 | module | 17.0350233 | 16.6659355 | 18.3875561 | 24.0023136 | 13.8060545 | -1.8820763 | -13%
   mxnet_MKL | resnet101 | module | 16.1102734 | 14.7838593 | 18.3670521 | 21.1808681 | 8.89543366 |   |  
   LT_MKL / MKL |   | 1.05740125 | 1.12730615 | 1.00111635 | 1.13320726 | 1.55203838 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | vgg16 | gluon | 15.2813014 | 12.403965 | 19.0222263 | 27.8377533 | 9.67333654 | 3.85713577 | 24%
   mxnet_MKL | vgg16 | gluon | 15.9297859 | 16.2611008 | 19.2034245 | 26.6752243 | 9.02387587 |   |  
   LT_MKL / MKL |   | 0.95929107 | 0.76279984 | 0.99056428 | 1.04358085 | 1.07197137 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | vgg16 | module | 16.2116596 | 17.9307461 | 18.8512802 | 32.9666138 | 10.2862342 | -6.0946941 | -51%
   mxnet_MKL | vgg16 | module | 14.2450004 | 11.8360519 | 18.8217163 | 24.3728161 | 7.03325239 |   |  
   LT_MKL / MKL |   | 1.13805961 | 1.51492628 | 1.00157073 | 1.35259765 | 1.46251458 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | yolov3 | gluon | 24.9501066 | 22.9668617 | 28.6231041 | 35.8681679 | 6.52313189 | 0.0462532 | 0%
   mxnet_MKL | yolov3 | gluon | 25.000014 | 23.0131149 | 27.7833939 | 41.8856144 | 9.79219567 |   |  
   LT_MKL / MKL |   | 0.99800371 | 0.99799014 | 1.03022346 | 0.8563362 | 0.6661562 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | yolov3 | module | 20.7894616 | 18.5782909 | 23.6134529 | 31.6655636 | 7.58152619 | 1.47676468 | 7%
   mxnet_MKL | yolov3 | module | 20.6255135 | 20.0550556 | 22.567749 | 33.4455967 | 10.3772935 |   |  
   LT_MKL / MKL |   | 1.0079488 | 0.92636447 | 1.04633621 | 0.94677825 | 0.73058801 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | ssd | gluon | 18.7769867 | 17.1740055 | 20.4992294 | 37.5475883 | 7.72973113 | -0.4370213 | -3%
   mxnet_MKL | ssd | gluon | 18.3511962 | 16.7369843 | 20.431757 | 27.3761749 | 8.54720124 |   |  
   LT_MKL / MKL |   | 1.02320233 | 1.02611111 | 1.00330233 | 1.37154253 | 0.90435815 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | ssd | module | 15.136122 | 13.9861107 | 17.1849728 | 18.4459686 | 5.43104704 | 0.02145767 | 0%
   mxnet_MKL | ssd | module | 15.6105257 | 14.0075684 | 17.4417496 | 22.4208832 | 6.18526616 |   |  
   LT_MKL / MKL |   | 0.96961001 | 0.99846814 | 0.98527804 | 0.82271374 | 0.87806198 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | rnn | gluon | 27.4836681 | 28.2740593 | 29.2725563 | 46.2668419 | 6.14907033 | 0.64611435 | 2%
   mxnet_MKL | rnn | gluon | 29.8045689 | 28.9201736 | 29.4413567 | 48.0003834 | 4.62636332 |   |  
   LT_MKL / MKL |   | 0.92212936 | 0.9776587 | 0.99426656 | 0.96388484 | 1.32913693 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | rnn | module | 24.3197663 | 19.3209648 | 28.6319256 | 72.7839231 | 14.3442244 | 9.31382179 | 33%
   mxnet_MKL | rnn | module | 31.4545027 | 28.6347866 | 30.5602551 | 84.2146158 | 12.0897264 |   |  
   LT_MKL / MKL |   | 0.77317281 | 0.67473752 | 0.93690074 | 0.86426712 | 1.18648048 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | a3c | gluon | 0.9640653 | 0.92840195 | 1.01304054 | 1.15251541 | 0.81502039 | 0.01382828 | 1%
   mxnet_MKL | a3c | gluon | 0.95647019 | 0.94223022 | 0.98323822 | 1.03831291 | 0.51344909 |   |  
   LT_MKL / MKL |   | 1.00794077 | 0.98532389 | 1.03031038 | 1.10998852 | 1.5873441 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | a3c | module | 0.69630376 | 0.67305565 | 0.82182884 | 0.88357925 | 0.32450653 | 0.18548965 | 22%
   mxnet_MKL | a3c | module | 0.83829322 | 0.8585453 | 0.91028214 | 0.95915794 | 0.49133127 |   |  
   LT_MKL / MKL |   | 0.83062077 | 0.7839489 | 0.90282871 | 0.92120308 | 0.66046381 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | squeezenetv10 | gluon | 4.18813938 | 4.07266617 | 4.47583199 | 5.00273705 | 1.21134245 | 0.17929077 | 4%
   mxnet_MKL | squeezenetv10 | gluon | 4.29468324 | 4.25195694 | 4.69303131 | 5.25665283 | 0.99087141 |   |  
   LT_MKL / MKL |   | 0.97519169 | 0.95783335 | 0.95371876 | 0.9516963 | 1.22250217 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | squeezenetv10 | module | 3.81177734 | 3.68618965 | 4.15277481 | 4.50253487 | 2.61364197 | 0.13208389 | 3%
   mxnet_MKL | squeezenetv10 | module | 3.94122422 | 3.81827354 | 4.25291061 | 4.68373299 | 0.9315793 |   |  
   LT_MKL / MKL |   | 0.96715566 | 0.96540743 | 0.97645476 | 0.96131331 | 2.80560331 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet152 | gluon | 29.1919005 | 25.8705616 | 32.4988365 | 40.3711796 | 16.6486174 | 1.78384781 | 6%
   mxnet_MKL | resnet152 | gluon | 31.515808 | 27.6544094 | 33.4124565 | 97.1477032 | 21.910896 |   |  
   LT_MKL / MKL |   | 0.92626216 | 0.935495 | 0.97265631 | 0.41556494 | 0.7598328 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet152 | module | 23.2507938 | 20.5206871 | 25.9268284 | 36.5798473 | 13.8642909 | 0.51188469 | 2%
   mxnet_MKL | resnet152 | module | 24.1650592 | 21.0325718 | 26.6683102 | 50.9016514 | 16.7789512 |   |  
   LT_MKL / MKL |   | 0.96216581 | 0.97566229 | 0.97219615 | 0.71863773 | 0.82629068 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet34 | gluon | 6.91427969 | 6.97827339 | 7.44247437 | 8.56328011 | 2.6140542 | 0.1885891 | 3%
   mxnet_MKL | resnet34 | gluon | 7.04208988 | 7.16686249 | 7.45677948 | 8.05544853 | 2.40589756 |   |  
   LT_MKL / MKL |   | 0.98185053 | 0.97368596 | 0.9980816 | 1.063042 | 1.08651933 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet34 | module | 5.63637546 | 5.69367409 | 6.05964661 | 6.46090508 | 2.77143384 | -0.0398159 | -1%
   mxnet_MKL | resnet34 | module | 5.50023902 | 5.65385818 | 6.098032 | 6.44731522 | 2.77023209 |   |  
   LT_MKL / MKL |   | 1.024751 | 1.00704225 | 0.99370528 | 1.00210783 | 1.00043381 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | squeezenetv11 | gluon | 3.21247446 | 3.03792953 | 3.57890129 | 3.698349 | 1.24227484 | 0.12779236 | 4%
   mxnet_MKL | squeezenetv11 | gluon | 3.24508834 | 3.16572189 | 3.45659256 | 3.79419327 | 1.13932456 |   |  
   LT_MKL / MKL |   | 0.98994977 | 0.95963247 | 1.03538419 | 0.97473922 | 1.0903608 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | squeezenetv11 | module | 2.89250727 | 2.89011002 | 3.02529335 | 3.27515602 | 2.52591732 | 0.09322166 | 3%
   mxnet_MKL | squeezenetv11 | module | 2.99215784 | 2.98333168 | 3.11279297 | 3.41248512 | 1.21078614 |   |  
   LT_MKL / MKL |   | 0.96669608 | 0.9687525 | 0.97189032 | 0.95975686 | 2.08617957 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnext101 | gluon | 31.5375153 | 29.1929245 | 33.946991 | 47.6675034 | 19.3077627 | -1.541853 | -6%
   mxnet_MKL | resnext101 | gluon | 29.9642972 | 27.6510715 | 33.664465 | 37.4267101 | 12.3515484 |   |  
   LT_MKL / MKL |   | 1.05250309 | 1.05576106 | 1.00839241 | 1.27362259 | 1.56318561 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnext101 | module | 18.182181 | 15.9804821 | 20.3022957 | 23.8859653 | 15.396635 | 1.53660774 | 9%
   mxnet_MKL | resnext101 | module | 20.17889 | 17.5170898 | 21.5396881 | 35.5322361 | 18.4735895 |   |  
   LT_MKL / MKL |   | 0.90104961 | 0.91227951 | 0.94255291 | 0.67223367 | 0.83344035 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | bert | gluon | 54.0329981 | 44.3267822 | 66.2891865 | 128.249407 | 19.7570149 | -0.5500317 | -1%
   mxnet_MKL | bert | gluon | 53.967942 | 43.7767506 | 66.133976 | 121.48881 | 17.0734991 |   |  
   LT_MKL / MKL |   | 1.00120546 | 1.01256447 | 1.00234691 | 1.0556479 | 1.15717433 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | bert | module | 54.0380961 | 43.8582897 | 66.2884712 | 145.177603 | 20.7744165 | 1.52826309 | 3%
   mxnet_MKL | bert | module | 56.1151934 | 45.3865528 | 67.3725605 | 140.049696 | 20.0013855 |   |  
   LT_MKL / MKL |   | 0.96298512 | 0.96632784 | 0.98390904 | 1.03661491 | 1.03864887 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet50 | gluon | 11.2203212 | 10.3917122 | 12.6955509 | 13.9648914 | 5.03671757 | -0.079155 | -1%
   mxnet_MKL | resnet50 | gluon | 10.9665512 | 10.3125572 | 12.5668049 | 13.5102272 | 4.35633152 |   |  
   LT_MKL / MKL |   | 1.02314037 | 1.00767559 | 1.01024493 | 1.03365334 | 1.15618326 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | resnet50 | module | 9.15425666 | 9.35149193 | 10.0131035 | 10.9114647 | 4.46406407 | -1.0385513 | -12%
   mxnet_MKL | resnet50 | module | 9.09212911 | 8.3129406 | 10.1752281 | 10.9033585 | 5.2183346 |   |  
   LT_MKL / MKL |   | 1.00683311 | 1.12493188 | 0.98406673 | 1.00074346 | 0.85545761 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | fasterrcnn | gluon | 1062.97674 | 1041.80741 | 1534.40261 | 2087.55994 | 343.28198 | 19.7241306 | 2%
   mxnet_MKL | fasterrcnn | gluon | 1121.82226 | 1061.53154 | 1640.0001 | 2182.19948 | 359.84167 |   |  
   LT_MKL / MKL |   | 0.9475447 | 0.98141918 | 0.93561129 | 0.95663112 | 0.95398062 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | fasterrcnn | module | 815.56046 | 702.314138 | 1247.55859 | 1681.26941 | 303.60249 | 1.4090538 | 0%
   mxnet_MKL | fasterrcnn | module | 816.149793 | 703.723192 | 1259.8958 | 1691.53523 | 304.535044 |   |  
   LT_MKL / MKL |   | 0.99927791 | 0.99799772 | 0.99020776 | 0.99393106 | 0.99693778 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | inception | gluon | 8.67918424 | 7.93433189 | 9.6976757 | 10.4568005 | 3.16472061 | 0.78010559 | 9%
   mxnet_MKL | inception | gluon | 8.90045151 | 8.71443748 | 9.84025002 | 10.6611252 | 3.43431553 |   |  
   LT_MKL / MKL |   | 0.97513977 | 0.91048125 | 0.98551111 | 0.9808346 | 0.92149967 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | inception | module | 5.41892739 | 5.17892838 | 5.85389137 | 6.43467903 | 4.06128674 | 0.1847744 | 3%
   mxnet_MKL | inception | module | 5.60380957 | 5.36370277 | 5.88989258 | 6.49142265 | 3.96470572 |   |  
   LT_MKL / MKL |   | 0.96700777 | 0.96555096 | 0.99388763 | 0.99125868 | 1.0243602 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | drmm | gluon | 972.36726 | 837.11791 | 1294.71469 | 1675.07172 | 223.987835 | -222.74709 | -36%
   mxnet_MKL | drmm | gluon | 701.621772 | 614.370823 | 923.994303 | 1309.37815 | 167.384486 |   |  
   LT_MKL / MKL |   | 1.38588524 | 1.36256131 | 1.40121501 | 1.27928797 | 1.33816365 |   |  
     |   |   |   |   |   |   |   |   |  
   mxnet_LT_MKL | drmm | module | 964.856578 | 830.979586 | 1274.74833 | 1633.81004 | 217.101817 | -223.33002 | -37%
   mxnet_MKL | drmm | module | 692.376242 | 607.649565 | 893.066168 | 1312.29162 | 164.837249 |   |  
   LT_MKL / MKL |   | 1.39354374 | 1.36753095 | 1.42738396 | 1.24500531 | 1.3170677 |   |  
   Summary cells carried over from the spreadsheet:
   
   Raw Numbers | Averages
   -- | --
   32 | 32
   17 | 17
   50.8880615 | 33%
   -223.33002 | -51%
   3.12097371 | 6%
   -28.40432 | -11%
   
   


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   % Diff is computed as P50 with LT divided by P50 without LT.
   
   Model Name | Type | P50 w/LT | P50 no LT | % Diff
   -- | -- | -- | -- | --
   resnext101_64x4d | gluon | 47.3425388 | 49.4668484 | 96%
   resnext101_64x4d | module | 28.8367271 | 28.4879208 | 101%
   resnext50 | gluon | 17.1453953 | 18.0559158 | 95%
   resnext50 | module | 10.0550652 | 9.63664055 | 104%
   nin | gluon | 2.57444382 | 2.60806084 | 99%
   nin | module | 2.43210793 | 2.73776054 | 89%
   resnet18 | gluon | 3.89575958 | 3.63826752 | 107%
   resnet18 | module | 2.95495987 | 3.18288803 | 93%
   wavernn | gluon | 262.938976 | 256.554604 | 102%
   caffenet | gluon | 2.93087959 | 3.08775902 | 95%
   caffenet | module | 3.16953659 | 3.22532654 | 98%
   vgg19 | gluon | 14.1830444 | 13.8909817 | 102%
   vgg19 | module | 13.8015747 | 14.3349171 | 96%
   maskrcnn | gluon | 2340.85274 | 2391.7408 | 98%
   maskrcnn | module | 1943.51578 | 1926.37968 | 101%
   superres | gluon | 17.3916817 | 18.0089474 | 97%
   superres | module | 16.9847012 | 17.261982 | 98%
   resnet101 | gluon | 18.7370777 | 18.4412003 | 102%
   resnet101 | module | 16.6659355 | 14.7838593 | 113%
   vgg16 | gluon | 12.403965 | 16.2611008 | 76%
   vgg16 | module | 17.9307461 | 11.8360519 | 151%
   yolov3 | gluon | 22.9668617 | 23.0131149 | 100%
   yolov3 | module | 18.5782909 | 20.0550556 | 93%
   ssd | gluon | 17.1740055 | 16.7369843 | 103%
   ssd | module | 13.9861107 | 14.0075684 | 100%
   rnn | gluon | 28.2740593 | 28.9201736 | 98%
   rnn | module | 19.3209648 | 28.6347866 | 67%
   a3c | gluon | 0.92840195 | 0.94223022 | 99%
   a3c | module | 0.67305565 | 0.8585453 | 78%
   squeezenetv10 | gluon | 4.07266617 | 4.25195694 | 96%
   squeezenetv10 | module | 3.68618965 | 3.81827354 | 97%
   resnet152 | gluon | 25.8705616 | 27.6544094 | 94%
   resnet152 | module | 20.5206871 | 21.0325718 | 98%
   resnet34 | gluon | 6.97827339 | 7.16686249 | 97%
   resnet34 | module | 5.69367409 | 5.65385818 | 101%
   squeezenetv11 | gluon | 3.03792953 | 3.16572189 | 96%
   squeezenetv11 | module | 2.89011002 | 2.98333168 | 97%
   resnext101 | gluon | 29.1929245 | 27.6510715 | 106%
   resnext101 | module | 15.9804821 | 17.5170898 | 91%
   bert | gluon | 44.3267822 | 43.7767506 | 101%
   bert | module | 43.8582897 | 45.3865528 | 97%
   resnet50 | gluon | 10.3917122 | 10.3125572 | 101%
   resnet50 | module | 9.35149193 | 8.3129406 | 112%
   fasterrcnn | gluon | 1041.80741 | 1061.53154 | 98%
   fasterrcnn | module | 702.314138 | 703.723192 | 100%
   inception | gluon | 7.93433189 | 8.71443748 | 91%
   inception | module | 5.17892838 | 5.36370277 | 97%
   drmm | gluon | 837.11791 | 614.370823 | 136%
   drmm | module | 830.979586 | 607.649565 | 137%


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   % Diff is calculated as 1 - (P50 with LT / P50 without LT).
   A positive number means a speedup; a negative number means a slowdown.
   
   Model | Mode | P50 w/ LT | P50 No LT | Percentage Difference
   -- | -- | -- | -- | --
   resnext101_64x4d | gluon | 47.34253883 | 49.46685 | 4.29%
   resnext101_64x4d | module | 28.83672714 | 28.48792 | -1.22%
   resnext50 | gluon | 17.14539528 | 18.05592 | 5.04%
   resnext50 | module | 10.05506516 | 9.636641 | -4.34%
   nin | gluon | 2.574443817 | 2.608061 | 1.29%
   nin | module | 2.432107925 | 2.737761 | 11.16%
   resnet18 | gluon | 3.895759583 | 3.638268 | -7.08%
   resnet18 | module | 2.954959869 | 3.182888 | 7.16%
   wavernn | gluon | 262.9389763 | 256.5546 | -2.49%
   caffenet | gluon | 2.930879593 | 3.087759 | 5.08%
   caffenet | module | 3.169536591 | 3.225327 | 1.73%
   vgg19 | gluon | 14.18304443 | 13.89098 | -2.10%
   vgg19 | module | 13.80157471 | 14.33492 | 3.72%
   maskrcnn | gluon | 2340.852737 | 2391.741 | 2.13%
   maskrcnn | module | 1943.515778 | 1926.38 | -0.89%
   superres | gluon | 17.39168167 | 18.00895 | 3.43%
   superres | module | 16.98470116 | 17.26198 | 1.61%
   resnet101 | gluon | 18.73707771 | 18.4412 | -1.60%
   resnet101 | module | 16.66593552 | 14.78386 | -12.73%
   vgg16 | gluon | 12.403965 | 16.2611 | 23.72%
   vgg16 | module | 17.93074608 | 11.83605 | -51.49%
   yolov3 | gluon | 22.96686172 | 23.01311 | 0.20%
   yolov3 | module | 18.57829094 | 20.05506 | 7.36%
   ssd | gluon | 17.17400551 | 16.73698 | -2.61%
   ssd | module | 13.98611069 | 14.00757 | 0.15%
   rnn | gluon | 28.2740593 | 28.92017 | 2.23%
   rnn | module | 19.32096481 | 28.63479 | 32.53%
   a3c | gluon | 0.928401947 | 0.94223 | 1.47%
   a3c | module | 0.673055649 | 0.858545 | 21.61%
   squeezenetv10 | gluon | 4.072666168 | 4.251957 | 4.22%
   squeezenetv10 | module | 3.686189651 | 3.818274 | 3.46%
   resnet152 | gluon | 25.8705616 | 27.65441 | 6.45%
   resnet152 | module | 20.5206871 | 21.03257 | 2.43%
   resnet34 | gluon | 6.978273392 | 7.166862 | 2.63%
   resnet34 | module | 5.693674088 | 5.653858 | -0.70%
   squeezenetv11 | gluon | 3.037929535 | 3.165722 | 4.04%
   squeezenetv11 | module | 2.890110016 | 2.983332 | 3.12%
   resnext101 | gluon | 29.1929245 | 27.65107 | -5.58%
   resnext101 | module | 15.9804821 | 17.51709 | 8.77%
   bert | gluon | 44.32678223 | 43.77675 | -1.26%
   bert | module | 43.85828972 | 45.38655 | 3.37%
   resnet50 | gluon | 10.39171219 | 10.31256 | -0.77%
   resnet50 | module | 9.351491928 | 8.312941 | -12.49%
   fasterrcnn | gluon | 1041.807413 | 1061.532 | 1.86%
   fasterrcnn | module | 702.3141384 | 703.7232 | 0.20%
   inception | gluon | 7.934331894 | 8.714437 | 8.95%
   inception | module | 5.178928375 | 5.363703 | 3.44%
   drmm | gluon | 837.1179104 | 614.3708 | -36.26%
   drmm | module | 830.9795856 | 607.6496 | -36.75%
   
   Average Percentage Change over all numbers:
   Gluon: 0.69%
   Module: -0.37%


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253 [Merged]
   [OpPerf] : Neural Network Loss Ops https://github.com/apache/incubator-mxnet/pull/17482 [Merged]
   [OpPerf] : Consolidate array manipulation related operators #17487 


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580034113
 
 
   Add LT support to ops found via OpPerf
   NN optimizers and 1 activation https://github.com/apache/incubator-mxnet/pull/17444
   Random, Sample, PDF ops : https://github.com/apache/incubator-mxnet/pull/17445 [Merged]


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253
   [OpPerf] : Neural Network Loss Ops https://github.com/apache/incubator-mxnet/pull/17482


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253
   [OpPerf] : Neural Network Loss Ops https://github.com/apache/incubator-mxnet/pull/17482
   [OpPerf] : Consolidate array manipulation related operators


[GitHub] [incubator-mxnet] apeforest commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582749284
 
 
   > @eric-haibin-lin Yes I am calculating this by: 1 - (LT, MKL value / MKL value).
   > For the samples/sec I do the above and then multiply by -1.
   
   In your description, "A negative percentage change means there are more samples/second." Doesn't that mean a negative percentage is faster?
   
   
   


[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656772572


   > Has large tensor support been added for numpy arrays?
   
   
   We looked at the numpy files inside MXNet: their own kernels already use index_t for iterating over elements, and the rest reuse the NDArray kernels, where we ensured index_t is used where required. Kernels that call BLAS will be updated in the same PR that makes the MXNet openBLAS wrappers int64-compatible.





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-622665471


   @szha @eric-haibin-lin @apeforest 
   
   With current master and new broadcast_axis changes on p3.16xl single GPU training run. 
   
   Bert Run Command:
   ```
   python3 run_pretraining.py --data='./part-0000.train' --data_eval='./part-0000.train' --num_steps 100 --lr 1e-4 --optimizer lamb --accumulate 1 --raw --gpus 0 --num_dataset_workers 2 --num_batch_workers 1 --circle_length 1 --total_batch_size 4 --total_batch_size_eval 4 --log_interval 10
   ```
   
   Results:
   
   
   | Code Version | avg (samples/sec) | p50    | p90    | total time (training only, ignoring evaluation steps) |
   |--------------|-------------------|--------|--------|-------------------------------------------------------|
   | master LT    | 24.38k            | 25.50k | 28.47k | 134.8 sec                                             |
   | master       | 25.90k            | 25.90k | 27.82k | 131.9 sec                                             |
   | new LT       | 25.87k            | 25.80k | 28.00k | 127.3 sec                                             |
   | new          | 25.92k            | 25.80k | 27.80k | 131.5 sec                                             |
   
   
   "new" refers to mxnet code with optimized broadcast_axis.
   "master" refers to mxnet master branch code
   "LT" refers to of the build was done after enabling large tensor or not.





[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582702619
 
 
   @eric-haibin-lin Yes I am calculating this by: 1 - (LT, MKL value / MKL value).
   For the samples/sec I do the above and then multiply by -1.
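   
   In other words (a small sketch of the computation described above; the helper name and sample numbers are made up):
   
   ```python
   def pct_change(lt_value, base_value, higher_is_better=False):
       # 1 - (LT value / base value), as a percentage; for throughput
       # metrics (samples/sec), where higher is better, the sign is
       # flipped, so a negative result always means the LT build did worse.
       change = (1 - lt_value / base_value) * 100
       return -change if higher_is_better else change
   
   # Hypothetical latency: 12 ms with LT vs 10 ms without -> slower, negative.
   print(round(pct_change(12.0, 10.0), 1))                            # -20.0
   # Hypothetical throughput: 900 vs 1000 samples/sec -> slower, negative.
   print(round(pct_change(900.0, 1000.0, higher_is_better=True), 1))  # -10.0
   ```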


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580034113
 
 
   Add LT support to ops found via OpPerf
   NN optimizers and 1 activation https://github.com/apache/incubator-mxnet/pull/17444 [Merged]
   Random, Sample, PDF ops : https://github.com/apache/incubator-mxnet/pull/17445 [Merged]


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   % Diff is calculated as P50 with LT divided by P50 without LT.
   Thus, less than 100% is a speed improvement and greater than 100% is a speed degradation.
   
   Model Name | Type | P50 w/LT | P50 no LT | % Diff
   -- | -- | -- | -- | --
   resnext101_64x4d | gluon | 47.3425388 | 49.4668484 | 96%
   resnext101_64x4d | module | 28.8367271 | 28.4879208 | 101%
   resnext50 | gluon | 17.1453953 | 18.0559158 | 95%
   resnext50 | module | 10.0550652 | 9.63664055 | 104%
   nin | gluon | 2.57444382 | 2.60806084 | 99%
   nin | module | 2.43210793 | 2.73776054 | 89%
   resnet18 | gluon | 3.89575958 | 3.63826752 | 107%
   resnet18 | module | 2.95495987 | 3.18288803 | 93%
   wavernn | gluon | 262.938976 | 256.554604 | 102%
   caffenet | gluon | 2.93087959 | 3.08775902 | 95%
   caffenet | module | 3.16953659 | 3.22532654 | 98%
   vgg19 | gluon | 14.1830444 | 13.8909817 | 102%
   vgg19 | module | 13.8015747 | 14.3349171 | 96%
   maskrcnn | gluon | 2340.85274 | 2391.7408 | 98%
   maskrcnn | module | 1943.51578 | 1926.37968 | 101%
   superres | gluon | 17.3916817 | 18.0089474 | 97%
   superres | module | 16.9847012 | 17.261982 | 98%
   resnet101 | gluon | 18.7370777 | 18.4412003 | 102%
   resnet101 | module | 16.6659355 | 14.7838593 | 113%
   vgg16 | gluon | 12.403965 | 16.2611008 | 76%
   vgg16 | module | 17.9307461 | 11.8360519 | 151%
   yolov3 | gluon | 22.9668617 | 23.0131149 | 100%
   yolov3 | module | 18.5782909 | 20.0550556 | 93%
   ssd | gluon | 17.1740055 | 16.7369843 | 103%
   ssd | module | 13.9861107 | 14.0075684 | 100%
   rnn | gluon | 28.2740593 | 28.9201736 | 98%
   rnn | module | 19.3209648 | 28.6347866 | 67%
   a3c | gluon | 0.92840195 | 0.94223022 | 99%
   a3c | module | 0.67305565 | 0.8585453 | 78%
   squeezenetv10 | gluon | 4.07266617 | 4.25195694 | 96%
   squeezenetv10 | module | 3.68618965 | 3.81827354 | 97%
   resnet152 | gluon | 25.8705616 | 27.6544094 | 94%
   resnet152 | module | 20.5206871 | 21.0325718 | 98%
   resnet34 | gluon | 6.97827339 | 7.16686249 | 97%
   resnet34 | module | 5.69367409 | 5.65385818 | 101%
   squeezenetv11 | gluon | 3.03792953 | 3.16572189 | 96%
   squeezenetv11 | module | 2.89011002 | 2.98333168 | 97%
   resnext101 | gluon | 29.1929245 | 27.6510715 | 106%
   resnext101 | module | 15.9804821 | 17.5170898 | 91%
   bert | gluon | 44.3267822 | 43.7767506 | 101%
   bert | module | 43.8582897 | 45.3865528 | 97%
   resnet50 | gluon | 10.3917122 | 10.3125572 | 101%
   resnet50 | module | 9.35149193 | 8.3129406 | 112%
   fasterrcnn | gluon | 1041.80741 | 1061.53154 | 98%
   fasterrcnn | module | 702.314138 | 703.723192 | 100%
   inception | gluon | 7.93433189 | 8.71443748 | 91%
   inception | module | 5.17892838 | 5.36370277 | 97%
   drmm | gluon | 837.11791 | 614.370823 | 136%
   drmm | module | 830.979586 | 607.649565 | 137%


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   % Diff is calculated as 1 - (P50 with LT / P50 without LT).
   A positive number means a speed increase; a negative number means a speed decrease.
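   
   As a sanity check, the metric can be reproduced for individual rows of the table (helper name hypothetical):
   
   ```python
   def pct_diff(p50_lt, p50_no_lt):
       # 1 - (P50 with LT / P50 without LT), expressed as a percentage.
       return (1 - p50_lt / p50_no_lt) * 100
   
   # vgg16/gluon: positive -> the LT build was faster in this run.
   print(round(pct_diff(12.403965, 16.2611), 2))      # 23.72
   # drmm/gluon: negative -> the LT build was slower.
   print(round(pct_diff(837.1179104, 614.3708), 2))   # -36.26
   ```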
   
   Model | Mode | P50 w/ LT | P50 No LT | Percentage Difference
   -- | -- | -- | -- | --
   resnext101_64x4d | gluon | 47.34253883 | 49.46685 | 4.29%
   resnext101_64x4d | module | 28.83672714 | 28.48792 | -1.22%
   resnext50 | gluon | 17.14539528 | 18.05592 | 5.04%
   resnext50 | module | 10.05506516 | 9.636641 | -4.34%
   nin | gluon | 2.574443817 | 2.608061 | 1.29%
   nin | module | 2.432107925 | 2.737761 | 11.16%
   resnet18 | gluon | 3.895759583 | 3.638268 | -7.08%
   resnet18 | module | 2.954959869 | 3.182888 | 7.16%
   wavernn | gluon | 262.9389763 | 256.5546 | -2.49%
   caffenet | gluon | 2.930879593 | 3.087759 | 5.08%
   caffenet | module | 3.169536591 | 3.225327 | 1.73%
   vgg19 | gluon | 14.18304443 | 13.89098 | -2.10%
   vgg19 | module | 13.80157471 | 14.33492 | 3.72%
   maskrcnn | gluon | 2340.852737 | 2391.741 | 2.13%
   maskrcnn | module | 1943.515778 | 1926.38 | -0.89%
   superres | gluon | 17.39168167 | 18.00895 | 3.43%
   superres | module | 16.98470116 | 17.26198 | 1.61%
   resnet101 | gluon | 18.73707771 | 18.4412 | -1.60%
   resnet101 | module | 16.66593552 | 14.78386 | -12.73%
   vgg16 | gluon | 12.403965 | 16.2611 | 23.72%
   vgg16 | module | 17.93074608 | 11.83605 | -51.49%
   yolov3 | gluon | 22.96686172 | 23.01311 | 0.20%
   yolov3 | module | 18.57829094 | 20.05506 | 7.36%
   ssd | gluon | 17.17400551 | 16.73698 | -2.61%
   ssd | module | 13.98611069 | 14.00757 | 0.15%
   rnn | gluon | 28.2740593 | 28.92017 | 2.23%
   rnn | module | 19.32096481 | 28.63479 | 32.53%
   a3c | gluon | 0.928401947 | 0.94223 | 1.47%
   a3c | module | 0.673055649 | 0.858545 | 21.61%
   squeezenetv10 | gluon | 4.072666168 | 4.251957 | 4.22%
   squeezenetv10 | module | 3.686189651 | 3.818274 | 3.46%
   resnet152 | gluon | 25.8705616 | 27.65441 | 6.45%
   resnet152 | module | 20.5206871 | 21.03257 | 2.43%
   resnet34 | gluon | 6.978273392 | 7.166862 | 2.63%
   resnet34 | module | 5.693674088 | 5.653858 | -0.70%
   squeezenetv11 | gluon | 3.037929535 | 3.165722 | 4.04%
   squeezenetv11 | module | 2.890110016 | 2.983332 | 3.12%
   resnext101 | gluon | 29.1929245 | 27.65107 | -5.58%
   resnext101 | module | 15.9804821 | 17.51709 | 8.77%
   bert | gluon | 44.32678223 | 43.77675 | -1.26%
   bert | module | 43.85828972 | 45.38655 | 3.37%
   resnet50 | gluon | 10.39171219 | 10.31256 | -0.77%
   resnet50 | module | 9.351491928 | 8.312941 | -12.49%
   fasterrcnn | gluon | 1041.807413 | 1061.532 | 1.86%
   fasterrcnn | module | 702.3141384 | 703.7232 | 0.20%
   inception | gluon | 7.934331894 | 8.714437 | 8.95%
   inception | module | 5.178928375 | 5.363703 | 3.44%
   drmm | gluon | 837.1179104 | 614.3708 | -36.26%
   drmm | module | 830.9795856 | 607.6496 | -36.75%
   
   Average Percentage Change over all numbers:
   Gluon: 0.69%
   Module: -0.37%


[GitHub] [incubator-mxnet] szha commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
szha commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-587090121
 
 
   The slowdown for BERT (-22.98%) is quite significant. We will need to mitigate this before moving forward.


[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-580146186
 
 
   [OpPerf] : Indexing Ops https://github.com/apache/incubator-mxnet/pull/16253
   [OpPerf] : Neural Network Loss Ops https://github.com/apache/incubator-mxnet/pull/17482
   [OpPerf] : Consolidate array manipulation related operators #17487 


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
jonatan1626 edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582591625
 
 
   Inference benchmarks comparing the LT_MKL build with the MKL-only build.
   All times are in ms.
   % Diff is calculated as 1 - (P50 with LT / P50 without LT).
   A positive number means a speedup; a negative number means a slowdown.
   
   Model | Mode | P50 w/ LT | P50 No LT | Percentage Difference
   -- | -- | -- | -- | --
   resnext101_64x4d | gluon | 47.34253883 | 49.46685 | 4.29%
   resnext101_64x4d | module | 28.83672714 | 28.48792 | -1.22%
   resnext50 | gluon | 17.14539528 | 18.05592 | 5.04%
   resnext50 | module | 10.05506516 | 9.636641 | -4.34%
   nin | gluon | 2.574443817 | 2.608061 | 1.29%
   nin | module | 2.432107925 | 2.737761 | 11.16%
   resnet18 | gluon | 3.895759583 | 3.638268 | -7.08%
   resnet18 | module | 2.954959869 | 3.182888 | 7.16%
   wavernn | gluon | 262.9389763 | 256.5546 | -2.49%
   caffenet | gluon | 2.930879593 | 3.087759 | 5.08%
   caffenet | module | 3.169536591 | 3.225327 | 1.73%
   vgg19 | gluon | 14.18304443 | 13.89098 | -2.10%
   vgg19 | module | 13.80157471 | 14.33492 | 3.72%
   maskrcnn | gluon | 2340.852737 | 2391.741 | 2.13%
   maskrcnn | module | 1943.515778 | 1926.38 | -0.89%
   superres | gluon | 17.39168167 | 18.00895 | 3.43%
   superres | module | 16.98470116 | 17.26198 | 1.61%
   resnet101 | gluon | 18.73707771 | 18.4412 | -1.60%
   resnet101 | module | 16.66593552 | 14.78386 | -12.73%
   vgg16 | gluon | 12.403965 | 16.2611 | 23.72%
   vgg16 | module | 17.93074608 | 11.83605 | -51.49%
   yolov3 | gluon | 22.96686172 | 23.01311 | 0.20%
   yolov3 | module | 18.57829094 | 20.05506 | 7.36%
   ssd | gluon | 17.17400551 | 16.73698 | -2.61%
   ssd | module | 13.98611069 | 14.00757 | 0.15%
   rnn | gluon | 28.2740593 | 28.92017 | 2.23%
   rnn | module | 19.32096481 | 28.63479 | 32.53%
   a3c | gluon | 0.928401947 | 0.94223 | 1.47%
   a3c | module | 0.673055649 | 0.858545 | 21.61%
   squeezenetv10 | gluon | 4.072666168 | 4.251957 | 4.22%
   squeezenetv10 | module | 3.686189651 | 3.818274 | 3.46%
   resnet152 | gluon | 25.8705616 | 27.65441 | 6.45%
   resnet152 | module | 20.5206871 | 21.03257 | 2.43%
   resnet34 | gluon | 6.978273392 | 7.166862 | 2.63%
   resnet34 | module | 5.693674088 | 5.653858 | -0.70%
   squeezenetv11 | gluon | 3.037929535 | 3.165722 | 4.04%
   squeezenetv11 | module | 2.890110016 | 2.983332 | 3.12%
   resnext101 | gluon | 29.1929245 | 27.65107 | -5.58%
   resnext101 | module | 15.9804821 | 17.51709 | 8.77%
   bert | gluon | 44.32678223 | 43.77675 | -1.26%
   bert | module | 43.85828972 | 45.38655 | 3.37%
   resnet50 | gluon | 10.39171219 | 10.31256 | -0.77%
   resnet50 | module | 9.351491928 | 8.312941 | -12.49%
   fasterrcnn | gluon | 1041.807413 | 1061.532 | 1.86%
   fasterrcnn | module | 702.3141384 | 703.7232 | 0.20%
   inception | gluon | 7.934331894 | 8.714437 | 8.95%
   inception | module | 5.178928375 | 5.363703 | 3.44%
   drmm | gluon | 837.1179104 | 614.3708 | -36.26%
   drmm | module | 830.9795856 | 607.6496 | -36.75%
   
   Average Percentage Change over all numbers:
   Gluon: 0.69%
   Module: -0.37%


[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
eric-haibin-lin commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-582694592
 
 
   @jonatan1626 thanks for the update. Does `-22.98%` mean 22.98% slower? 


[GitHub] [incubator-mxnet] leezu commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656788437


   > NOTE: currently openBLAS works correctly with smaller inputs (within the range of int32) but will truncate parameters passed with higher values, and hence will result in either SIGSEGV (mostly) or garbage values (which will eventually cause SIGSEGV in a bigger script)
   
   I'm a little concerned that we don't have a correct integration of BLAS and LAPACK: BLAS kernels will get potential crashes or corrupt results. But I think @sandeep-krishnamurthy's point
   
   > Make openBLAS compatible with Large Tensor support and merge the PR for Enabling Large Tensor Support so that default PyPi users of MXNet can already benefit from the new capability. This will actually cover the largest user base of MXNet.
   
   refers to fixing this? If so, I'm fine with the order of execution. Thank you @access2rohit for the hard work on this feature.
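   
   For illustration, the int32 truncation described in the quoted note can be reproduced in pure Python (sketch; `as_int32` is a hypothetical helper mimicking a 32-bit BLAS integer parameter):
   
   ```python
   def as_int32(n):
       # Reinterpret the low 32 bits of n as a signed 32-bit integer,
       # as happens when a 64-bit tensor size is passed to a BLAS routine
       # built with 32-bit integer arguments.
       n &= 0xFFFFFFFF
       return n - (1 << 32) if n >= (1 << 31) else n
   
   print(as_int32(2**31 - 1))  # 2147483647 -- largest size that survives
   print(as_int32(2**31))      # -2147483648 -- silently becomes negative
   print(as_int32(2**32))      # 0 -- the size of a 2^16 x 2^16 tensor vanishes
   ```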





[GitHub] [incubator-mxnet] szha commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
szha commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656898183


   I think the numpy frontend doesn't support large tensors yet. I started working on it here https://github.com/apache/incubator-mxnet/pull/18368 but I haven't found the time to finish migrating all the tests. @access2rohit would you be able to help out and take that over?





[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit edited a comment on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-622665471


   @szha @eric-haibin-lin @apeforest 
   
   With the current master and the new broadcast_axis changes, on a p3.16xl single-GPU training run:
   
   Bert Run Command:
   ```
   python3 run_pretraining.py --data='./part-0000.train' --data_eval='./part-0000.train' --num_steps 100 --lr 1e-4 --optimizer lamb --accumulate 1 --raw --gpus 0 --num_dataset_workers 2 --num_batch_workers 1 --circle_length 1 --total_batch_size 4 --total_batch_size_eval 4 --log_interval 10
   ```
   
   Results:
   
   
   | Code Version | avg (samples/sec) | p50    | p90    | Total time (training only, ignoring evaluation steps) |
   |--------------|-------------------|--------|--------|-------------------------------------------------------|
   | master LT    | 24.38k            | 25.50k | 28.47k | 134.8 sec                                             |
   | master       | 25.90k            | 25.90k | 27.82k | 131.9 sec                                             |
   | new LT       | 25.87k            | 25.80k | 28.00k | 127.3 sec                                             |
   | new          | 25.92k            | 25.80k | 27.80k | 131.5 sec                                             |
   
   
   "new" refers to MXNet code with the optimized broadcast_axis.
   "master" refers to the MXNet master branch code.
   "LT" means the build was done with large tensor support enabled.





[GitHub] [incubator-mxnet] szha commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
szha commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656750948


   Has large tensor support for numpy arrays been added yet?





[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
sandeep-krishnamurthy commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656757856


   @access2rohit can correct me, but a few of them are supported, as they use the same kernels under the hood. The scope of this issue was mainly NDArray when it got started. After these are done, the remaining numpy ops will also be supported.





[GitHub] [incubator-mxnet] apeforest edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
apeforest edited a comment on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-624456645


   @access2rohit This result is a little surprising. In the earlier benchmark results provided by @JonTanS, there is a ~18% degradation in BERT training when large tensor (LT) compiler flag is turned on:
   
   
   | Model | w/ LT | w/o LT | Change  |
   |-------|-------|--------|---------|
   | bert  | 38.1  | 46.7   | -18.42% |
   
   However, from your result, even without your latest speedup in the broadcast_axis operator, there is very little difference with the LT flag on:
   
   
   | Build     | avg    | p50    | p90    | Total time |
   |-----------|--------|--------|--------|------------|
   | master LT | 24.38k | 25.50k | 28.47k | 134.8 sec  |
   | master    | 25.90k | 25.90k | 27.82k | 131.9 sec  |
   
   Could you provide more insights?
   
   
   
   
   
   





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656790489


   > Upon investigation: openBLAS needs to be built with a specific flag to support int64_t signatures, and MKL supports long long int signatures (in which case reinterpret_cast<>() is needed when casting pointers, since int64_t* resolves to long int* on LP64 platforms while MKL expects long long int*). Additionally, the LAPACK and BLAS wrappers need to be updated from int to int64_t.
   
   @leezu yes, that's what I meant.





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656769897


   > Make openBLAS compatible with Large Tensor support and merge the PR for Enabling Large Tensor Support so that default PyPI users of MXNet can already benefit from the new capability. This will actually cover the largest user base of MXNet.
   
   yes





[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
sandeep-krishnamurthy commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656725373


   Thanks @access2rohit for the summary.
   
   Is the plan for enabling Large Tensor Support in the following order?
   1. Make openBLAS compatible with Large Tensor support and merge the PR for Enabling Large Tensor Support so that default PyPI users of MXNet can already benefit from the new capability. This will actually cover the largest user base of MXNet.
   2. Next, we work on enabling MKL bindings capable of Large Tensor Support, as a separate PR. So users building custom MXNet builds with MKL as BLAS will get the Large Tensor functionality.
   3. We need to debate on ATLAS and Accelerate BLAS support and we can pick up this discussion once we get above 2 major steps done.
   
   Does this order of execution look okay to you, @access2rohit @leezu @szha @zheng-da?





[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit edited a comment on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656772572


   > Has large tensor support for numpy arrays been added yet?
   
   
   Upon inspecting the numpy files inside MXNet: they use index_t for iterating over elements in their own kernels, and reuse the NDArray kernels for the rest, where we already ensured index_t is used where required. For the kernels that call BLAS, I will update them in the same PR that makes MXNet's openBLAS wrappers int64-compatible.





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-625542495


   @apeforest The profiling done by @JonTanS was done long ago, using mxnet-1.6 in November. These results use the current master branch of MXNet, and the BERT scripts have changed too. If there are newer settings for running BERT on a single node, they are not available on the GluonNLP site. If @eric-haibin-lin or @szhengac can verify whether my BERT setup is correct and provide proper tuning params for running BERT on a single node, I will re-run the benchmarks and update the results here.





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-650509034


   PR https://github.com/apache/incubator-mxnet/pull/17882 fixes the regression in SSD. The following are the new results for the SSD run:
   
   | Code | SSD 1-epoch time (sec) | % speedup/slowdown w.r.t. master (large tensor disabled) |
   |------|------------------------|----------------------------------------------------------|
   | Master (large tensor disabled) | 226 | 0 |
   | Master (large tensor enabled)  | 335 | 48.23% slowdown |
   | Master + CPU-optimized broadcast_axis (large tensor disabled) | 130 | 42.5% speedup |
   | Master + CPU-optimized broadcast_axis (large tensor enabled)  | 184 | 18.5% speedup |
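   For reference, the percentage column is just the relative change in epoch time against the 226-second baseline; a quick sketch of the arithmetic:
   
   ```python
   # Percentage change in SSD epoch time relative to the
   # master (large tensor disabled) baseline of 226 seconds.
   baseline = 226.0
   
   def pct_change(epoch_time):
       """Absolute percentage change in epoch time vs. the baseline."""
       return abs(epoch_time - baseline) / baseline * 100
   
   print(f"{pct_change(335):.2f}% slowdown")  # 48.23% slowdown
   print(f"{pct_change(130):.2f}% speedup")   # 42.48% speedup (42.5% in the table)
   print(f"{pct_change(184):.2f}% speedup")   # 18.58% speedup (18.5% in the table)
   ```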
   
   





[GitHub] [incubator-mxnet] access2rohit commented on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-622660352


   [new_bert_train.log](https://github.com/apache/incubator-mxnet/files/4567005/new_bert_train.log)
   [new_lt_bert_train.log](https://github.com/apache/incubator-mxnet/files/4567006/new_lt_bert_train.log)
   [master_bert_train.log](https://github.com/apache/incubator-mxnet/files/4567007/master_bert_train.log)
   [master_lt_bert_train.log](https://github.com/apache/incubator-mxnet/files/4567008/master_lt_bert_train.log)
   





[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit edited a comment on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-650509402


   @apeforest @sandeep-krishnamurthy @szha @zheng-da 
   
   The PRs to enable Large Tensor Support by default on master are divided into two stages:
   Stage 1: Unix CPU/GPU and Windows CPU/GPU https://github.com/apache/incubator-mxnet/pull/18625
   Stage 2: All remaining platforms https://github.com/apache/incubator-mxnet/pull/18626
   
   Once the above two PRs are merged, MXNet will support large tensors on CPU/GPU (depending on global memory) on master.





[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #17331: [mxnet 2.0] [item 2.4] Turning on large tensor support by default

Posted by GitBox <gi...@apache.org>.
access2rohit edited a comment on issue #17331:
URL: https://github.com/apache/incubator-mxnet/issues/17331#issuecomment-656499453


   Currently, Large Tensor Support works for all operators implemented in MXNet, and MKLDNN also supports int64. CUDA kernels written inside MXNet, both generic (cpu/gpu) and specific (gpu only), support large tensors depending on device memory.
   
   BLAS and LAPACK libs were not considered while defining the scope of the project. Currently, the following BLAS and LAPACK implementations are supported inside MXNet:
   
   openBLAS (Default)
   MKL
   ATLAS
   Apple Accelerate
   
   Upon investigation: openBLAS needs to be built with a specific flag to support int64_t signatures, and MKL supports long long int signatures (in which case reinterpret_cast<>() is needed when casting pointers, since int64_t* resolves to long int* on LP64 platforms while MKL expects long long int*). Additionally, the LAPACK and BLAS wrappers need to be updated from int to int64_t.
   
   Initially, openBLAS can be supported since it is used by default and in the PyPI wheels as well, thus not breaking any default behaviour for customers. Users attempting to use large tensors with other BLAS and LAPACK implementations won't face issues as long as they don't actually use large tensors. Additional error messages will be added for the case where a large tensor is used while the BLAS implementation is not openBLAS, until that BLAS library is made to work with MXNet's large tensor support.
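   A minimal sketch of what such a guard could look like (the function name, threshold constant, and error text are hypothetical illustrations, not the actual MXNet implementation):
   
   ```python
   # Hypothetical sketch: refuse BLAS calls on tensors beyond the int32 range
   # unless the linked BLAS library has an int64-compatible interface.
   INT32_MAX = 2**31 - 1
   
   def check_blas_large_tensor(num_elements, blas_impl):
       """Raise a clear error instead of letting a 32-bit BLAS truncate sizes."""
       if num_elements > INT32_MAX and blas_impl != "openblas":
           raise RuntimeError(
               f"Tensor with {num_elements} elements exceeds the int32 range; "
               f"large tensor support is not available with BLAS '{blas_impl}'."
           )
   
   check_blas_large_tensor(2**31 + 5, "openblas")  # fine once openBLAS is int64-ready
   ```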
   
   NOTE: currently openBLAS works correctly with smaller inputs (within int32 range) but will truncate parameters passed with larger values, resulting in either a SIGSEGV (mostly) or garbage values (which will eventually cause a SIGSEGV later in a bigger script)
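   The truncation itself is easy to reproduce outside MXNet; a tiny illustration of what happens when a size just past the int32 range is squeezed through a 32-bit parameter:
   
   ```python
   import ctypes
   
   # A dimension just past the int32 range, as a 32-bit BLAS parameter would see it.
   n = 2**31 + 5
   truncated = ctypes.c_int32(n).value  # silently wraps around (two's complement)
   
   print(n, "->", truncated)  # 2147483653 -> -2147483643
   ```
   
   A negative size like this is what then triggers the SIGSEGV or garbage reads downstream.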
   
   @sandeep-krishnamurthy @leezu @szha @zheng-da 

