Posted to commits@mxnet.apache.org by jx...@apache.org on 2018/05/02 17:41:10 UTC

[incubator-mxnet] branch master updated: update perf. (#10761)

This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
     new ebd8a6b  update perf. (#10761)
ebd8a6b is described below

commit ebd8a6bd35a34af427d8ee100788447eb564bcf1
Author: Da Zheng <zh...@gmail.com>
AuthorDate: Wed May 2 10:40:59 2018 -0700

    update perf. (#10761)
---
 docs/faq/perf.md | 231 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 119 insertions(+), 112 deletions(-)

diff --git a/docs/faq/perf.md b/docs/faq/perf.md
index b5d73f6..ce74391 100644
--- a/docs/faq/perf.md
+++ b/docs/faq/perf.md
@@ -29,65 +29,70 @@ Note that _MXNet_ treats all CPUs on a single machine as a single device.
 So whether you specify `cpu(0)` or `cpu()`, _MXNet_ will use all CPU cores on the machine.
 
 ### Scoring results
-The following table shows performance,
+The following table shows performance of [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz),
 namely number of images that can be predicted per second.
 We used [example/image-classification/benchmark_score.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/benchmark_score.py)
 to measure the performance on different AWS EC2 machines.
 
-AWS EC2 C4.8xlarge:
-
-| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-| --- | --- | --- | --- | --- | --- | --- |
-|   1 |  119.57 | 34.23 |  111.36 |  54.42 |  42.83 | 19.51 |
-|   2 | 210.58 | 51.63 |  137.10 |  67.30 |  57.54 | 23.56 |
-|   4 | 318.54 | 70.00 |  187.21 |  76.53 |  63.64 | 25.80 |
-|   8 | 389.34 | 77.39 |  211.90 |  84.26 |  63.89 | 28.11 |
-|  16 | 489.12 | 85.26 |  220.52 |  82.00 |  63.93 | 27.08 |
-|  32 | 564.04 | 87.15 |  208.21 |  83.05 |  62.19 | 25.76 |
-
-AWS EC2 C4.4xlarge:
-
-| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-| --- | --- | --- | --- | --- | --- | --- |
-|   1 |  109.96 | 23.00 |  71.82 |  28.10 |  30.66 | 11.81 |
-|   2 | 124.56 | 24.86 |  81.61 |  31.32 |  32.73 | 12.82 |
-|   4 | 157.01 | 26.60 |  86.77 |  32.94 |  33.32 | 13.16 |
-|   8 | 178.40 | 30.67 |  88.58 |  33.52 |  33.32 | 13.32 |
-|  16 | 189.52 | 35.61 |  90.36 |  33.63 |  32.94 | 13.18 |
-|  32 | 196.61 | 38.98 |  105.27 |  33.77 |  32.65 | 13.00 |
-
-AWS EC2 C4.2xlarge:
-
-| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-| --- | --- | --- | --- | --- | --- | --- |
-|   1 |  70.75 | 12.87 |  42.86 |  16.53 |  18.14 | 7.01 |
-|   2 | 71.53 | 13.08 |  45.66 |  17.38 |  18.53 | 7.18 |
-|   4 | 84.72 | 15.38 |  47.50 |  17.80 |  18.96 | 7.35 |
-|   8 | 93.44 | 18.33 |  48.08 |  17.93 |  18.99 | 7.40 |
-|  16 | 97.03 | 20.12 |  55.73 |  18.00 |  18.91 | 7.36 |
-|  32 | 113.90 | 21.10 |  62.54 |  17.98 |  18.80 | 7.33 |
-
-AWS EC2 C4.xlarge:
-
-| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-| --- | --- | --- | --- | --- | --- | --- |
-|   1 |  37.92 | 6.57 |  23.09 |  8.79 |  9.65 | 3.73 |
-|   2 | 36.77 | 7.31 |  24.00 |  9.00 |  9.84 | 3.78 |
-|   4 | 43.18 | 8.94 |  24.42 |  9.12 |  9.91 | 3.83 |
-|   8 | 47.05 | 10.01 |  28.32 |  9.13 |  9.88 | 3.83 |
-|  16 | 55.74 | 10.61 |  31.96 |  9.14 |  9.86 | 3.80 |
-|  32 | 65.05 | 10.91 |  33.86 |  9.34 |  10.31 | 3.86 |
-
-AWS EC2 C4.large:
-
-| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-| --- | --- | --- | --- | --- | --- | --- |
-|   1 |  19.86 | 3.67 |  12.20 |  4.59 |  5.11 | 1.97 |
-|   2 | 19.37 | 4.24 |  12.41 |  4.64 |  5.15 | 1.98 |
-|   4 | 22.64 | 4.89 |  14.34 |  4.66 |  5.16 | 2.00 |
-|   8 | 27.19 | 5.25 |  16.17 |  4.66 |  5.16 | 1.99 |
-|  16 | 31.82 | 5.46 |  17.24 |  4.76 |  5.35 | OOM |
-|  32 | 34.67 | 5.55 |  17.64 |  4.88 |  OOM | OOM |
+AWS EC2 C5.18xlarge:
+
+| Batch | Alexnet | VGG    | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|--------|--------------|--------------|-----------|------------|
+| 1     | 390.53  | 81.57  | 124.13       | 62.26        | 76.22     | 32.92      |
+| 2     | 596.45  | 100.84 | 206.58       | 93.36        | 119.55    | 46.80      |
+| 4     | 710.77  | 119.04 | 275.55       | 127.86       | 148.62    | 59.36      |
+| 8     | 921.40  | 120.38 | 380.82       | 157.11       | 167.95    | 70.78      |
+| 16    | 1018.43 | 115.30 | 411.67       | 168.71       | 178.54    | 75.13      |
+| 32    | 1290.31 | 107.19 | 483.34       | 179.38       | 193.47    | 85.86      |
+
+
+AWS EC2 C5.9xlarge:
+
+| Batch | Alexnet | VGG   | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|-------|--------------|--------------|-----------|------------|
+| 1     | 257.77  | 50.61 | 130.99       | 66.95        | 75.38     | 32.33      |
+| 2     | 410.60  | 63.02 | 195.14       | 87.84        | 102.67    | 41.57      |
+| 4     | 462.59  | 62.64 | 263.15       | 109.87       | 127.15    | 50.69      |
+| 8     | 573.79  | 63.95 | 309.99       | 121.36       | 140.84    | 59.01      |
+| 16    | 709.47  | 67.79 | 350.19       | 128.26       | 147.41    | 64.15      |
+| 32    | 831.46  | 69.58 | 354.91       | 129.92       | 149.18    | 64.25      |
+
+
+AWS EC2 C5.4xlarge:
+
+| Batch | Alexnet | VGG   | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|-------|--------------|--------------|-----------|------------|
+| 1     | 214.15  | 29.32 | 114.97       | 47.96        | 61.01     | 23.92      |
+| 2     | 310.04  | 34.81 | 150.09       | 60.89        | 71.16     | 27.92      |
+| 4     | 330.69  | 34.56 | 186.63       | 74.15        | 86.86     | 34.37      |
+| 8     | 378.88  | 35.46 | 204.89       | 77.05        | 91.10     | 36.93      |
+| 16    | 424.00  | 36.49 | 211.55       | 78.39        | 91.23     | 37.34      |
+| 32    | 481.95  | 37.23 | 213.71       | 78.23        | 91.68     | 37.26      |
+
+
+AWS EC2 C5.2xlarge:
+
+| Batch | Alexnet | VGG   | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|-------|--------------|--------------|-----------|------------|
+| 1     | 131.01  | 15.67 | 78.75        | 31.12        | 37.30     | 14.75      |
+| 2     | 182.29  | 18.01 | 98.59        | 39.13        | 45.98     | 17.84      |
+| 4     | 189.31  | 18.25 | 110.26       | 41.35        | 49.21     | 19.32      |
+| 8     | 211.75  | 18.57 | 115.46       | 42.53        | 49.98     | 19.81      |
+| 16    | 236.06  | 19.11 | 117.18       | 42.59        | 50.20     | 19.92      |
+| 32    | 261.13  | 19.46 | 116.20       | 42.72        | 49.95     | 19.80      |
+
+
+AWS EC2 C5.xlarge:
+
+| Batch | Alexnet | VGG  | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|------|--------------|--------------|-----------|------------|
+| 1     | 36.64   | 3.93 | 27.06        | 10.09        | 12.98     | 5.06       |
+| 2     | 49.21   | 4.49 | 29.67        | 10.80        | 12.94     | 5.14       |
+| 4     | 50.12   | 4.50 | 30.31        | 10.83        | 13.17     | 5.19       |
+| 8     | 54.71   | 4.58 | 30.22        | 10.89        | 13.19     | 5.20       |
+| 16    | 60.23   | 4.70 | 30.20        | 10.91        | 13.23     | 5.19       |
+| 32    | 66.37   | 4.76 | 30.10        | 10.90        | 13.22     | 5.15       |
+
 
 ## Other CPU
 
@@ -101,88 +106,90 @@ We suggest always checking to make sure that a recent cuDNN version is used.
 
 Setting the environment `export MXNET_CUDNN_AUTOTUNE_DEFAULT=1` sometimes also helps.
 
-We show results when using various GPUs including K80 (EC2 p2.2xlarge), M40,
-and P100 (DGX-1).
+We show results when using various GPUs including K80 (EC2 p2.2xlarge), M60 (EC2 g3.4xlarge),
+and V100 (EC2 p3.2xlarge).
 
 ### Scoring results
 
 Based on
 [example/image-classification/benchmark_score.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/benchmark_score.py)
-and MXNet commit `0a03417`, with cuDNN 5.1
+and  [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz), with cuDNN 7.0.5
 
 - K80 (single GPU)
 
-  | Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-  | --- | --- | --- | --- | --- | --- | --- |
-  |   1 | 202.66  | 70.76 | 74.91  | 42.61  | 70.94 | 24.87 |
-  |   2 | 233.76  | 63.53 | 119.60  | 60.09  | 92.28 | 34.23 |
-  |   4 | 367.91  | 78.16 | 164.41  | 72.30  | 116.68 | 44.76 |
-  |   8 | 624.14  | 119.06 | 195.24  | 79.62  | 129.37 | 50.96 |
-  |  16 | 1071.19 | 195.83 | 256.06  | 99.38  | 160.40 | 66.51 |
-  |  32 | 1443.90 | 228.96 | 287.93  | 106.43  | 167.12 | 69.73 |
-
-- M40
-
-  | Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-  | --- | --- | --- | --- | --- | --- | --- |
-  |   1 | 412.09 | 142.10 | 115.89  | 64.40  | 126.90 | 46.15 |
-  |   2 | 743.49 | 212.21 | 205.31  | 108.06  | 202.17 | 75.05 |
-  |   4 | 1155.43 | 280.92 | 335.69  | 161.59  | 266.53 | 106.83 |
-  |   8 | 1606.87 | 332.76 | 491.12  | 224.22  | 317.20 | 128.67 |
-  |  16 | 2070.97 | 400.10 | 618.25  | 251.87  | 335.62 | 134.60 |
-  |  32 | 2694.91 | 466.95 | 624.27  | 258.59  | 373.35 | 152.71 |
-
-- P100
-
-  | Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
-  | --- | --- | --- | --- | --- | --- | --- |
-  |   1 | 624.84 | 294.6 | 139.82  | 80.17  | 162.27 | 58.99 |
-  |   2 | 1226.85 | 282.3 | 267.41  | 142.63  | 278.02 | 102.95 |
-  |   4 | 1934.97 | 399.3 | 463.38  | 225.56  | 423.63 | 168.91 |
-  |   8 | 2900.54 | 522.9 | 709.30  | 319.52  | 529.34 | 210.10 |
-  |  16 | 4063.70 | 755.3 | 949.22  | 444.65  | 647.43 | 270.07 |
-  |  32 | 4883.77 | 854.4 | 1197.74  | 493.72  | 713.17 | 294.17 |
+| Batch | Alexnet | VGG    | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|--------|--------------|--------------|-----------|------------|
+| 1     | 243.93  | 43.59  | 68.62        | 35.52        | 67.41     | 23.65      |
+| 2     | 338.16  | 49.14  | 113.41       | 56.29        | 93.35     | 33.88      |
+| 4     | 478.92  | 53.44  | 159.61       | 74.43        | 119.18    | 45.23      |
+| 8     | 683.52  | 70.50  | 190.49       | 86.23        | 131.32    | 50.54      |
+| 16    | 1004.66 | 109.01 | 254.20       | 105.70       | 155.40    | 62.55      |
+| 32    | 1238.55 | 114.98 | 285.49       | 116.79       | 159.42    | 64.99      |
+
+- M60
+
+| Batch | Alexnet | VGG    | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|--------|--------------|--------------|-----------|------------|
+| 1     | 243.49  | 59.95  | 101.97       | 48.30        | 95.46     | 39.29      |
+| 2     | 491.04  | 69.14  | 170.35       | 80.27        | 142.61    | 60.17      |
+| 4     | 711.54  | 78.94  | 257.89       | 123.09       | 182.36    | 76.51      |
+| 8     | 1077.73 | 109.34 | 343.42       | 152.82       | 208.74    | 87.27      |
+| 16    | 1447.21 | 144.93 | 390.25       | 166.32       | 220.73    | 92.41      |
+| 32    | 1797.66 | 151.86 | 416.69       | 176.56       | 230.19    | 97.03      |
+
+
+- V100
+
+| Batch | Alexnet | VGG    | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
+|-------|---------|--------|--------------|--------------|-----------|------------|
+| 1     | 659.51  | 205.16 | 136.91       | 76.54        | 162.15    | 61.38      |
+| 2     | 1248.21 | 265.40 | 261.85       | 144.23       | 293.74    | 116.30     |
+| 4     | 2122.41 | 333.97 | 477.22       | 270.03       | 479.14    | 195.17     |
+| 8     | 3894.30 | 420.26 | 831.09       | 450.68       | 699.39    | 294.19     |
+| 16    | 5815.58 | 654.16 | 1332.26      | 658.97       | 947.45    | 398.79     |
+| 32    | 7906.09 | 708.43 | 1784.23      | 817.33       | 1076.81   | 451.82     |
+
 
 ### Training results
 
 Based on
 [example/image-classification/train_imagenet.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/train_imagenet.py)
-and MXNet commit `0a03417`, with CUDNN 5.1. The benchmark script is available at
+and  [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz), with CUDNN 7.0.5. The benchmark script is available at
 [here](https://github.com/mli/mxnet-benchmark/blob/master/run_vary_batch.sh),
-where the batch size for Alexnet is increased by 8x.
+where the batch size for Alexnet is increased by 16x.
 
 - K80 (single GPU)
 
   | Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
   | --- | --- | --- | --- |
-  |   1 | 230.69 | 9.81  | 13.83 |
-  |   2 | 348.10 | 15.31 | 21.85 |
-  |   4 | 457.28 | 20.48 | 29.58 |
-  |   8 | 533.51 | 24.47 | 36.83 |
-  |  16 | 582.36 | 28.46 | 43.60 |
-  |  32 | 483.37 | 29.62 | 45.52 |
+  |   1 | 300.30 | 10.48 | 15.61 |
+  |   2 | 406.08 | 16.00 | 23.88 |
+  |   4 | 461.01 | 22.10 | 32.26 |
+  |   8 | 484.00 | 26.80 | 39.42 |
+  |  16 | 490.45 | 31.62 | 46.69 |
+  |  32 | 414.72 | 33.78 | 49.48 |
 
-- M40
+- M60
 
-  | Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
+  | Batch | Alexnet(\*16) | Inception-v3 | Resnet 50 |
   | --- | --- | --- | --- |
-  |   1 | 405.17  | 14.35 | 21.56 |
-  |   2 | 606.32  | 23.96 | 36.48 |
-  |   4 | 792.66  | 37.38 | 52.96 |
-  |   8 | 1016.51 | 52.69 | 70.21 |
-  |  16 | 1105.18 | 62.35 | 83.13 |
-  |  32 | 1046.23 | 68.87 | 90.74 |
+  |   1 | 380.96 | 14.06 | 20.55 |
+  |   2 | 530.53 | 21.90 | 32.65 |
+  |   4 | 600.17 | 31.96 | 45.57 |
+  |   8 | 633.60 | 40.58 | 54.92 |
+  |  16 | 639.37 | 46.88 | 64.44 |
+  |  32 | 576.54 | 50.05 | 68.34 |
 
-- P100
+- V100
 
-  | Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
+  | Batch | Alexnet(\*16) | Inception-v3 | Resnet 50 |
   | --- | --- | --- | --- |
-  |   1 | 809.94  | 15.14  | 27.20  |
-  |   2 | 1202.93 | 30.34  | 49.55  |
-  |   4 | 1631.37 | 50.59  | 78.31  |
-  |   8 | 1882.74 | 77.75  | 122.45 |
-  |  16 | 2012.04 | 111.11 | 156.79 |
-  |  32 | 1869.69 | 129.98 | 181.53 |
+  |   1 | 1629.52 | 21.83 | 34.54 |
+  |   2 | 2359.73 | 40.11 | 65.01 |
+  |   4 | 2687.89 | 72.79 | 113.49 |
+  |   8 | 2919.02 | 118.43 | 174.81 |
+  |  16 | 2994.32 | 173.15 | 251.22 |
+  |  32 | 2585.61 | 214.48 | 298.51 |
 
 ## Multiple Devices
 
