Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/30 07:30:36 UTC
[GitHub] [incubator-mxnet] sxjscience opened a new issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
sxjscience opened a new issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826
The CPU version of `mx.npx.leaky_relu(x, act_type='gelu')` produces results that differ from PyTorch's GELU beyond a 1e-4 tolerance.
A minimal reproducible example:
```python
import math
import mxnet as mx
from numpy.testing import assert_allclose
mx.npx.set_np()
a = mx.np.random.normal(0, 1, (10000,))
b = mx.npx.leaky_relu(a, act_type='gelu')
# Reference: exact GELU written via the error function
c = a * 0.5 * (1.0 + mx.npx.erf(a / math.sqrt(2.0)))

import torch
a_torch = torch.from_numpy(a.asnumpy())
b_torch = torch.nn.functional.gelu(a_torch)
assert_allclose(b_torch.cpu().numpy(), c.asnumpy(), 1E-4, 1E-4)
assert_allclose(b_torch.cpu().numpy(), b.asnumpy(), 1E-4, 1E-4)  # fails on CPU
```
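A plausible source of a discrepancy of this size (hypothesis only, not confirmed in this thread) is one implementation using the exact erf-based GELU while the other uses the common tanh approximation. A plain NumPy sketch of the gap between the two forms:

```python
import math
import numpy as np

def gelu_exact(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF via erf
    return x * 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Widely used tanh-based approximation of GELU (Hendrycks & Gimpel)
    return 0.5 * x * (1.0 + np.tanh(math.sqrt(2.0 / math.pi)
                                    * (x + 0.044715 * x ** 3)))

x = np.linspace(-4.0, 4.0, 1001)
max_abs_diff = np.max(np.abs(gelu_exact(x) - gelu_tanh(x)))
print(max_abs_diff)  # a few 1e-4 -- comparable to the mismatch reported below
```

The approximation error peaks around |x| ~ 2-3, which is consistent with a worst-case absolute difference on the order of 1e-4 over normally distributed inputs.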
The GPU version has no issue:
```python
import math
import mxnet as mx
from numpy.testing import assert_allclose
mx.npx.set_np()
a = mx.np.random.normal(0, 1, (10000,), ctx=mx.gpu())
b = mx.npx.leaky_relu(a, act_type='gelu')
c = a * 0.5 * (1.0 + mx.npx.erf(a / math.sqrt(2.0)))

import torch
a_torch = torch.from_numpy(a.asnumpy()).cuda()
b_torch = torch.nn.functional.gelu(a_torch)
assert_allclose(b_torch.cpu().numpy(), c.asnumpy(), 1E-4, 1E-4)
assert_allclose(b_torch.cpu().numpy(), b.asnumpy(), 1E-4, 1E-4)  # passes
```
@pengzhao-intel @ciyongch
Error:
```
<ipython-input-48-6f3377797f65> in <module>
9 b_torch = torch.nn.functional.gelu(a_torch)
10 assert_allclose(b_torch.cpu().numpy(), c.asnumpy(), 1E-4, 1E-4)
---> 11 assert_allclose(b_torch.cpu().numpy(), b.asnumpy(), 1E-4, 1E-4)
~/.local/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_allclose(actual, desired, rtol, atol, equal_nan, err_msg, verbose)
1526 header = 'Not equal to tolerance rtol=%g, atol=%g' % (rtol, atol)
1527 assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
-> 1528 verbose=verbose, header=header, equal_nan=equal_nan)
1529
1530
~/.local/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
838 verbose=verbose, header=header,
839 names=('x', 'y'), precision=precision)
--> 840 raise AssertionError(msg)
841 except ValueError:
842 import traceback
AssertionError:
Not equal to tolerance rtol=0.0001, atol=0.0001
Mismatched elements: 2258 / 10000 (22.6%)
Max absolute difference: 0.0004735
Max relative difference: 0.8255573
x: array([ 0.684651, 0.508604, -0.165598, ..., 1.706593, 0.288036,
1.006167], dtype=float32)
y: array([ 0.68455 , 0.508554, -0.165716, ..., 1.706508, 0.288026,
1.005966], dtype=float32)
```
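For context on reading this output: a large "Max relative difference" next to a small absolute difference is not contradictory. `assert_allclose` checks `|actual - desired| <= atol + rtol * |desired|`, so elements near zero can show a huge relative difference while the comparison is decided almost entirely by `atol`. A small sketch with hypothetical values:

```python
import numpy as np

# The combined criterion is |actual - desired| <= atol + rtol * |desired|.
desired = np.array([1e-5, 1.0])
actual = np.array([5e-5, 1.00005])
rel = np.abs(actual - desired) / np.abs(desired)
print(rel)  # first element has relative difference 4.0
# Still passes: the 4e-5 absolute gap is absorbed by atol=1e-4.
np.testing.assert_allclose(actual, desired, rtol=1e-4, atol=1e-4)
```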
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-mxnet] sxjscience closed issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
Posted by GitBox <gi...@apache.org>.
sxjscience closed issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826
[GitHub] [incubator-mxnet] sxjscience commented on issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
sxjscience commented on issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826#issuecomment-666100930
[GitHub] [incubator-mxnet] sxjscience commented on issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
sxjscience commented on issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826#issuecomment-670336760
Yes, it's solved.
[GitHub] [incubator-mxnet] TaoLv commented on issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
TaoLv commented on issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826#issuecomment-666098440
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #18826: [Activation] GELU precision mismatch between MXNet and PyTorch in the CPU version
pengzhao-intel commented on issue #18826:
URL: https://github.com/apache/incubator-mxnet/issues/18826#issuecomment-670334816
> @TaoLv Sorry, missed some imports.
>
> ```python
> import mxnet as mx
> import math
> from numpy.testing import assert_allclose
> mx.npx.set_np()
> a = mx.np.random.normal(0, 1, (10000,))
> b = mx.npx.leaky_relu(a, act_type='gelu')
> c = a * 0.5 * (1.0 + mx.npx.erf(a / math.sqrt(2.0)))
>
> import torch
> a_torch = torch.from_numpy(a.asnumpy())
> b_torch = torch.nn.functional.gelu(a_torch)
> assert_allclose(b_torch.cpu().numpy(), c.asnumpy(), 1E-4, 1E-4)
> assert_allclose(b_torch.cpu().numpy(), b.asnumpy(), 1E-4, 1E-4)
> ```
>
> (Compiling MXNet takes some time for me so it will be helpful if you can check that...)
Does the issue still exist after Tao's PR?