Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/22 03:05:11 UTC

[GitHub] [incubator-mxnet] zixuanweeei opened a new pull request #15621: [WIP] MKL-DNN LBR-GRU Inference Integration (FP32 LBR-GRU)

zixuanweeei opened a new pull request #15621: [WIP] MKL-DNN LBR-GRU Inference Integration (FP32 LBR-GRU)
URL: https://github.com/apache/incubator-mxnet/pull/15621
 
 
   ## Description ##
   We integrated the MKL-DNN Linear-Before-Reset (LBR) GRU into MXNet. Currently, it supports FP32 inference only. @pengzhao-intel @ciyongch @TaoLv 
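
   For reference, the cell this PR integrates differs from the vanilla GRU in where the reset gate is applied. Below is a minimal NumPy sketch of one LBR-GRU step; the function name, shapes, and gate order are illustrative assumptions, not the actual MXNet/MKL-DNN API.

```python
# Sketch of a single Linear-Before-Reset (LBR) GRU step in NumPy.
# Names, shapes, and gate order are illustrative, not the MXNet/MKL-DNN API.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lbr_gru_step(x, h, Wx, Wh, bx, bh):
    """One LBR-GRU step.

    x:  (I,) input vector; h: (H,) previous hidden state.
    Wx: (3H, I), Wh: (3H, H) stacked gate weights, assumed here to be in
        (update, reset, candidate) order.
    bx, bh: (3H,) input-side and hidden-side biases.

    Unlike the vanilla GRU, the reset gate multiplies the *linear* hidden
    projection (Wh_c @ h + bh_c) rather than h itself -- hence
    "linear before reset".
    """
    H = h.shape[0]
    gx = Wx @ x + bx                      # input-side projections
    gh = Wh @ h + bh                      # hidden-side projections
    u = sigmoid(gx[:H] + gh[:H])          # update gate
    r = sigmoid(gx[H:2*H] + gh[H:2*H])    # reset gate
    c = np.tanh(gx[2*H:] + r * gh[2*H:])  # reset applied after the linear part
    return u * h + (1.0 - u) * c          # new hidden state
```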
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete. FP32 inference for all the RNN variants supported by MXNet is ready in this PR. 
   
   ### Changes ###
   - [x] LBR-GRU inference goes directly to the MKL-DNN RNN forward primitive by default.
   - [x] Moved `mkldnn::memory` objects into a struct.
   
   ### Performance ###
   We tested the performance of FusedRNN with `mode='gru'` using the same dimensions as in PR #14713, i.e. seq_length = 300, batch_size = 20, input_size = 800, hidden_size = 800.
   <table border="0">
    <tr height=19 style='height:14.4pt'>
     <td rowspan=2 height=38 class=xl95 align=center width=26>mode</td>
     <td rowspan=2 class=xl95 width=39 align=center>Layer</td>
     <td rowspan=2 class=xl95 width=61 align=center>Direction</td>
     <td colspan=2 class=xl90 width=241 align=center>MXNET_USE_MKLDNN_RNN=0</td>
     <td colspan=2 class=xl90 width=241 align=center>MXNET_USE_MKLDNN_RNN=1</td>
     <td colspan=2 class=xl90 width=172 align=center>SpeedUp</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl90 align=center>Throughput (samples/sec)</td>
     <td class=xl90 align=center>Latency (ms)</td>
     <td class=xl90 align=center>Throughput (samples/sec)</td>
     <td class=xl90 align=center>Latency (ms)</td>
  <td class=xl90 align=center>Throughput</td>
     <td class=xl90 align=center>Latency</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl70>gru</td>
     <td class=xl70 align=right>1</td>
     <td class=xl70 align=right>1</td>
     <td class=xl71 align=right>430.03</td>
     <td class=xl72 align=right>20.43</td>
     <td class=xl71 align=right>806.27</td>
     <td class=xl72 align=right>4.28</td>
     <td class=xl78 align=right>1.87</td>
     <td class=xl78 align=right>4.78</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl70>gru</td>
     <td class=xl70 align=right>1</td>
     <td class=xl70 align=right>2</td>
     <td class=xl71 align=right>218.58</td>
     <td class=xl72 align=right>119.50</td>
     <td class=xl71 align=right>416.55</td>
     <td class=xl72 align=right>8.58</td>
     <td class=xl78 align=right>1.91</td>
     <td class=xl78 align=right>13.93</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl70>gru</td>
     <td class=xl70 align=right>5</td>
     <td class=xl70 align=right>1</td>
     <td class=xl71 align=right>89.47</td>
     <td class=xl72 align=right>100.07</td>
     <td class=xl71 align=right>177.52</td>
     <td class=xl72 align=right>21.20</td>
     <td class=xl78 align=right>1.98</td>
     <td class=xl78 align=right>4.72</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl70>gru</td>
     <td class=xl70 align=right>5</td>
     <td class=xl70 align=right>2</td>
     <td class=xl71 align=right>39.68</td>
     <td class=xl72 align=right>611.38</td>
     <td class=xl71 align=right>71.15</td>
     <td class=xl72 align=right>46.45</td>
     <td class=xl78 align=right>1.79</td>
     <td class=xl78 align=right>13.16</td>
    </tr>
   </table>
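
The SpeedUp columns above are derived from the raw measurements: the throughput ratio is MKL-DNN over baseline, and the latency ratio is baseline over MKL-DNN. As a quick sanity check, the first row's ratios can be recomputed (small differences come from rounding in the reported measurements):

```python
# Recompute the SpeedUp columns of the first table row from the raw numbers.
# Throughput speed-up = new / old; latency speed-up = old / new.
base_tp, base_lat = 430.03, 20.43      # MXNET_USE_MKLDNN_RNN=0
mkldnn_tp, mkldnn_lat = 806.27, 4.28   # MXNET_USE_MKLDNN_RNN=1

tp_speedup = mkldnn_tp / base_tp       # ~1.87, matching the table
lat_speedup = base_lat / mkldnn_lat    # ~4.77, vs. 4.78 in the table (rounding)
```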
   
   We also compared the performance of this PR with that of the previously integrated LSTM, vRNN-Tanh, and vRNN-ReLU on the master branch. There appears to be a distinct regression with `mode='lstm'`. 
   
   <table border="0">
    <tr height=19 style='height:14.4pt'>
     <td rowspan=2 height=38 class=xl95 width=60 align=center>mode</td>
     <td rowspan=2 class=xl95 width=39 align=center>Layer</td>
     <td rowspan=2 class=xl95 width=61 align=center>Direction</td>
     <td colspan=2 class=xl97 width=241 align=center>eec0fb4</td>
     <td colspan=2 class=xl97 width=241 align=center>This PR (c186863)</td>
     <td colspan=2 class=xl90 width=172 align=center>Gap</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl96 align=center>Throughput (samples/sec)</td>
     <td class=xl96 align=center>Latency (ms)</td>
     <td class=xl96 align=center>Throughput (samples/sec)</td>
     <td class=xl96 align=center>Latency (ms)</td>
     <td class=xl78 align=center>Throughput</td>
     <td class=xl78 align=center>Latency</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>lstm</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>675.24</td>
     <td class=xl73 align=right>4.98</td>
     <td class=xl73 align=right>654.61</td>
     <td class=xl73 align=right>5.71</td>
     <td class=xl73 align=right>0.97</td>
     <td class=xl73 align=right>0.87</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>lstm</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>343.99</td>
     <td class=xl73 align=right>9.86</td>
     <td class=xl73 align=right>333.13</td>
     <td class=xl73 align=right>11.65</td>
     <td class=xl73 align=right>0.97</td>
     <td class=xl73 align=right>0.85</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>lstm</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>141.30</td>
     <td class=xl73 align=right>24.03</td>
     <td class=xl73 align=right>138.59</td>
     <td class=xl73 align=right>28.39</td>
     <td class=xl73 align=right>0.98</td>
     <td class=xl73 align=right>0.85</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>lstm</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>55.67</td>
     <td class=xl73 align=right>53.16</td>
     <td class=xl73 align=right>54.11</td>
     <td class=xl73 align=right>61.29</td>
     <td class=xl73 align=right>0.97</td>
     <td class=xl73 align=right>0.87</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_tanh</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>1617.27</td>
     <td class=xl73 align=right>2.46</td>
     <td class=xl73 align=right>1541.13</td>
     <td class=xl73 align=right>2.60</td>
     <td class=xl73 align=right>0.95</td>
     <td class=xl73 align=right>0.94</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_tanh</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>851.16</td>
     <td class=xl73 align=right>4.82</td>
     <td class=xl73 align=right>828.10</td>
     <td class=xl73 align=right>5.01</td>
     <td class=xl73 align=right>0.97</td>
     <td class=xl73 align=right>0.96</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_tanh</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>390.48</td>
     <td class=xl73 align=right>11.66</td>
     <td class=xl73 align=right>376.38</td>
     <td class=xl73 align=right>12.27</td>
     <td class=xl73 align=right>0.96</td>
     <td class=xl73 align=right>0.95</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_tanh</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>164.11</td>
     <td class=xl73 align=right>25.72</td>
     <td class=xl73 align=right>156.64</td>
     <td class=xl73 align=right>26.74</td>
     <td class=xl73 align=right>0.95</td>
     <td class=xl73 align=right>0.96</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_relu</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>1582.22</td>
     <td class=xl73 align=right>2.65</td>
     <td class=xl73 align=right>1508.54</td>
     <td class=xl73 align=right>2.59</td>
     <td class=xl73 align=right>0.95</td>
     <td class=xl73 align=right>1.02</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_relu</td>
     <td class=xl78 align=right>1</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>824.18</td>
     <td class=xl73 align=right>5.20</td>
     <td class=xl73 align=right>803.40</td>
     <td class=xl73 align=right>5.04</td>
     <td class=xl73 align=right>0.97</td>
     <td class=xl73 align=right>1.03</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_relu</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>1</td>
     <td class=xl73 align=right>381.53</td>
     <td class=xl73 align=right>12.58</td>
     <td class=xl73 align=right>366.71</td>
     <td class=xl73 align=right>12.09</td>
     <td class=xl73 align=right>0.96</td>
     <td class=xl73 align=right>1.04</td>
    </tr>
    <tr height=19 style='height:14.4pt'>
     <td height=19 class=xl78>rnn_relu</td>
     <td class=xl78 align=right>5</td>
     <td class=xl78 align=right>2</td>
     <td class=xl73 align=right>153.11</td>
     <td class=xl73 align=right>27.57</td>
     <td class=xl73 align=right>153.67</td>
     <td class=xl73 align=right>27.06</td>
     <td class=xl73 align=right>1.00</td>
     <td class=xl73 align=right>1.02</td>
    </tr>
   </table>
   
   ## Comments ##
   - The possible regression needs to be resolved before this PR is merged.
   - The gate orders of GRU differ between MXNet and MKL-DNN, so there is extra overhead when preparing `mkldnn::memory` objects with `mode='gru'`.  
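
To illustrate the gate-order point: the fused GRU weights are stored as stacked per-gate blocks, and if the framework and the library disagree on the block order, the blocks must be permuted when the `mkldnn::memory` is prepared. A minimal NumPy sketch of such a reordering, assuming (for illustration only, not taken from this PR) a source order of (reset, update, new) and a target order of (update, reset, new):

```python
# Reorder stacked GRU gate weights from an assumed (reset, update, new)
# source order to an assumed (update, reset, new) target order.
import numpy as np

def reorder_gru_gates(w, hidden_size):
    """w: (3*hidden_size, input_size) stacked gate weight matrix.
    Returns a copy with the reset and update gate blocks swapped."""
    r = w[:hidden_size]                   # reset-gate block
    u = w[hidden_size:2 * hidden_size]    # update-gate block
    n = w[2 * hidden_size:]               # new/candidate-gate block
    return np.concatenate([u, r, n], axis=0)
```

This per-block copy is the kind of extra preparation cost the comment above refers to, on top of any layout conversion the primitive itself requires.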

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services