Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/07/27 12:36:57 UTC

[GitHub] [incubator-mxnet] matteosal opened a new issue #20471: Wrong gradients on Windows-GPU

matteosal opened a new issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471


   [sym.zip](https://github.com/apache/incubator-mxnet/files/6885370/sym.zip)
   I only see this on Windows. Download the symbol file and run this script:
   ```
   import mxnet as mx
   
   json_path = 'sym.json'
   sym = mx.sym.load(json_path)
   
   def run_example(ctx, reqs):
   	ex = sym._bind(
   		ctx,
   		{
   			'.Inputs.Input': mx.ndarray.array([[1, 2, 3]], ctx=ctx),
   			'.Inputs.Target': mx.ndarray.array([[4, 5, 6]], ctx=ctx),
   			'seq_715248120': mx.ndarray.array([3], ctx=ctx)
   		},
   		args_grad={
   			'.Inputs.Input': mx.ndarray.zeros([1, 3], ctx=ctx),
   			'.Inputs.Target': mx.ndarray.zeros([1, 3], ctx=ctx),
   			'seq_715248120': mx.ndarray.zeros([1], ctx=ctx)
   		},
   		grad_req=dict(zip(['.Inputs.Input', '.Inputs.Target', 'seq_715248120'], reqs))
   	)
   
   	ex.forward()
   	ex.backward(out_grads=[mx.ndarray.array([1], ctx=ctx), mx.ndarray.array([1], ctx=ctx)])
   
   	print(ex.grad_dict)
   
   print('Input + Target gradient, CPU (OK):')
   run_example(mx.cpu(), ['write', 'write', 'null'])
   print('\n')
   print('Input + Target gradient, GPU (OK):')
   run_example(mx.gpu(), ['write', 'write', 'null'])
   print('\n')
   print('Target gradient only, CPU (OK):')
   run_example(mx.cpu(), ['null', 'write', 'null'])
   print('\n')
   print('Target gradient only, GPU (WRONG):')
   run_example(mx.gpu(), ['null', 'write', 'null'])
   ```
   Output is:
   ```
   Input + Target gradient, CPU (OK):
   {'.Inputs.Input':
   [[-0.33333334 -0.33333334 -0.33333334]]
   <NDArray 1x3 @cpu(0)>, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @cpu(0)>, 'seq_715248120': None}
   
   
   Input + Target gradient, GPU (OK):
   {'.Inputs.Input':
   [[-0.33333334 -0.33333334 -0.33333334]]
   <NDArray 1x3 @gpu(0)>, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @gpu(0)>, 'seq_715248120': None}
   
   
   Target gradient only, CPU (OK):
   {'.Inputs.Input': None, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @cpu(0)>, 'seq_715248120': None}
   
   
   Target gradient only, GPU (WRONG):
   {'.Inputs.Input': None, '.Inputs.Target':
   [[-0.33333334 -0.33333334 -0.33333334]]
   <NDArray 1x3 @gpu(0)>, 'seq_715248120': None}
   ```
   The `Target` gradient has the sign flipped in the last example.
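When two backends disagree on a gradient, a central finite-difference check is one independent way to see which sign is plausible. The sketch below is pure Python and does not touch MXNet. The `loss` function is a hypothetical stand-in, chosen only because its gradients happen to match the ±1/3 values printed above; it is not taken from `sym.json`, whose actual graph is not shown in this thread.

```python
# Hypothetical stand-in loss (an assumption, NOT read from sym.json):
# mean(target - input).
def loss(inp, target):
    return sum(t - x for x, t in zip(inp, target)) / len(inp)


def numeric_grad(f, values, eps=1e-3):
    """Central finite-difference gradient of f with respect to `values`."""
    grad = []
    for i in range(len(values)):
        plus = list(values)
        plus[i] += eps
        minus = list(values)
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


inp = [1.0, 2.0, 3.0]
target = [4.0, 5.0, 6.0]

# Differentiate with respect to one argument while holding the other fixed.
grad_input = numeric_grad(lambda x: loss(x, target), inp)
grad_target = numeric_grad(lambda t: loss(inp, t), target)

print("d loss / d input :", grad_input)
print("d loss / d target:", grad_target)
```

For this stand-in loss the numeric gradient with respect to the target is positive, which agrees with the CPU result and with the GPU result when both gradients are requested.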


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] barry-jin commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
barry-jin commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-970485257


   @matteosal What build settings should we use to reproduce this issue? 




[GitHub] [incubator-mxnet] TristonC commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
TristonC commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-890569291


   @matteosal Thanks for the update. @leezu Do you have a Windows platform to help triage the problem?




[GitHub] [incubator-mxnet] chinakook commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
chinakook commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-895792069


   I've tested with a 2.0 version modified by myself on Windows, and it's OK.
   ```python
   Input + Target gradient, CPU (OK):
   {'.Inputs.Input': 
   [[-0.33333334 -0.33333334 -0.33333334]]
   <NDArray 1x3 @cpu(0)>, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @cpu(0)>, 'seq_715248120': None}
   
   
   Input + Target gradient, GPU (OK):
   {'.Inputs.Input': 
   [[-0.33333334 -0.33333334 -0.33333334]]
   <NDArray 1x3 @gpu(0)>, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @gpu(0)>, 'seq_715248120': None}
   
   
   Target gradient only, CPU (OK):
   {'.Inputs.Input': None, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @cpu(0)>, 'seq_715248120': None}
   
   
   Target gradient only, GPU (WRONG):
   {'.Inputs.Input': None, '.Inputs.Target':
   [[0.33333334 0.33333334 0.33333334]]
   <NDArray 1x3 @gpu(0)>, 'seq_715248120': None}
   ```




[GitHub] [incubator-mxnet] matteosal edited a comment on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal edited a comment on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-942456505


   A ping on this. Can anyone please investigate?




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-908498534


   A ping on this 
   @chinakook what modification are you talking about? Can you reproduce the problem on a plain v2.0 build?




[GitHub] [incubator-mxnet] TristonC commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
TristonC commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-887891710


   @matteosal Which version of MXNet did you use?




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-889772128


   I am using version 2.0, built from source at commit fabcd145cd496628791f9f2ea813048360ac33ca
   I have tried the same example on Linux (building from the same commit) and the results are good there. This issue only affects Windows.




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-942456505


   A ping on this. Can anyone investigate?




[GitHub] [incubator-mxnet] barry-jin commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
barry-jin commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-1009369384


   @matteosal The current workaround is to replace 'elemwise_sub' with '_npi_subtract'. There are probably some issues in the legacy subtract operator.
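One way to apply this workaround without regenerating the symbol is to rewrite the operator name in the symbol JSON before loading it. The sketch below assumes the usual layout of an MXNet symbol file, a top-level `nodes` list whose entries carry an `op` field. The helper name `replace_op` and the tiny example graph are hypothetical, and whether any node attributes also need adjusting for `_npi_subtract` has not been checked here.

```python
import json


def replace_op(symbol_json, old_op="elemwise_sub", new_op="_npi_subtract"):
    """Return (patched JSON text, number of nodes whose op was renamed)."""
    graph = json.loads(symbol_json)
    replaced = 0
    for node in graph.get("nodes", []):
        if node.get("op") == old_op:
            node["op"] = new_op
            replaced += 1
    return json.dumps(graph, indent=2), replaced


# Tiny hand-written graph for illustration only; it is not the content of sym.json.
example = json.dumps({
    "nodes": [
        {"op": "null", "name": "a", "inputs": []},
        {"op": "null", "name": "b", "inputs": []},
        {"op": "elemwise_sub", "name": "diff", "inputs": [[0, 0, 0], [1, 0, 0]]},
    ],
    "arg_nodes": [0, 1],
    "heads": [[2, 0, 0]],
})

patched, count = replace_op(example)
print("nodes patched:", count)
```

To use it on the real file, read `sym.json` as text, pass it through `replace_op`, write the result to a new file, and load that file with `mx.sym.load` as in the reproduction script.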
   




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-1006802234


   @barry-jin any news on this? I have rebuilt with VC2019 in order to fix [this issue](https://github.com/apache/incubator-mxnet/issues/20675) but I still see this problem here




[GitHub] [incubator-mxnet] leezu commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-891328336


   I'm not a Windows user. @yajiedesign is a Windows expert.




[GitHub] [incubator-mxnet] leezu edited a comment on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
leezu edited a comment on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-891328336


   I'm not a Windows user, so it's very hard for me to get MXNet running on Windows. @yajiedesign is a Windows expert; maybe he can help.




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-972906812


   @barry-jin here they are:
   ```
   cmake -G"Visual Studio 15 2017 Win64" -T host=x64 ^
    %= GENERAL FLAGS =% ^
    -DCMAKE_INSTALL_PREFIX=%output_dir% ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_SKIP_BUILD_RPATH=On ^
    -DUSE_OPENCV=OFF ^
    -DUSE_F16C=Off %= float16 support =%^
    -DUSE_INT64_TENSOR_SIZE=ON ^
    -DCMAKE_C_FLAGS="-D_WIN32" ^
    -DCMAKE_CXX_FLAGS="-D_WIN32" ^
    -DCMAKE_C_FLAGS_RELEASE="/MT -DNDEBUG" ^
    -DCMAKE_CXX_FLAGS_RELEASE="/MT -DNDEBUG" ^
    -DMXNET_FORCE_SHARED_CRT=OFF %= link statically to C runtime =%^
    -DCMAKE_SHARED_LINKER_FLAGS="/DELAYLOAD:nvcuda.dll delayimp.lib" ^
    -DUSE_MXNET_LIB_NAMING=OFF ^
    %= MATH BACKENDS =% ^
    -DBLAS=MKL ^
    -DUSE_LAPACK=OFF ^
    -DUSE_ONEDNN=OFF ^
    -DBLA_VENDOR="Intel10_64ilp" ^
    -DBLA_STATIC=OFF ^
    -DMKL_USE_SINGLE_DYNAMIC_LIBRARY=OFF ^
    -DMKL_INCLUDE_DIR=%mkl_dir% ^
    -DBLAS_LIBRARIES="%mkl_dir%/libiomp5md.lib;%mkl_dir%/mkl_core_dll.lib;%mkl_dir%/mkl_intel_ilp64_dll.lib;%mkl_dir%/mkl_intel_thread_dll.lib" ^
    %= OPENMP =% ^
    -DUSE_OPENMP=ON ^
    -DOpenMP_C_FLAGS="-I%mkl_dir%" ^
    -DOpenMP_C_LIB_NAMES="libiomp5" ^
    -DOpenMP_CXX_FLAGS="-I%mkl_dir%" ^
    -DOpenMP_CXX_LIB_NAMES="libiomp5" ^
    -DOpenMP_libiomp5_LIBRARY="%mkl_dir%/libiomp5md.lib" ^
    %= CUDA =% ^
    -DUSE_CUDA=ON ^
    -DUSE_CUDNN=ON ^
    -DCUDNN_LIBRARY=%home_dir:\=/%cuDNN/lib/cudnn64_8.lib ^
    -DCUDNN_INCLUDE=%home_dir:\=/%cuDNN/include ^
    -DUSE_NCCL=OFF ^
    -DUSE_NVML=OFF ^
    -DCUDNN_ROOT=%home_dir:\=/%cuDNN ^
    -DMXNET_CUDA_ARCH="3.7"\;"5.0"\;"6.0"\;"7.0"\;"8.0+PTX" %= see Readme =%^
    -DCUDAToolkit_ROOT=%cuda_dir% ^
    -DCMAKE_CUDA_COMPILER="%cuda_dir%/bin/nvcc.exe" -I"%cuda_dir%/include" -L"%cuda_dir%/lib/x64"  ^
    -DUSE_SPLIT_ARCH_DLL=OFF ^
    %mxnet_dir%
   ```
   
   MKL version is 2019.4 and CUDA version is 11.4.0




[GitHub] [incubator-mxnet] TristonC commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
TristonC commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-887903332


   With your sym3 example, here is what I got with MXNet 1.9. 
   ```
   Input1 + Input2 gradient, CPU (OK):
   {'.Inputs.Input1': 'write', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   [23:37:54] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input1 + Input2 gradient, GPU (OK):
   {'.Inputs.Input1': 'write', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   [23:38:01] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @gpu(0)>
   
   
   Input2 gradient only, CPU (OK):
   {'.Inputs.Input1': 'null', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input2 gradient only, GPU (WRONG):
   {'.Inputs.Input1': 'null', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @gpu(0)>
   ```






[GitHub] [incubator-mxnet] barry-jin commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
barry-jin commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-1008556449


   Sorry, I'm still triaging this issue. I built with the settings in [build_windows.py](https://github.com/apache/incubator-mxnet/blob/master/ci/build_windows.py) and can also reproduce this issue.




[GitHub] [incubator-mxnet] TristonC commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
TristonC commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-901512651


   @chinakook  What did you modify? Is it related to this gradient issue? Could you share it with @matteosal?




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-969169132


   @szha @leezu another ping on this :)








[GitHub] [incubator-mxnet] TristonC edited a comment on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
TristonC edited a comment on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-887903332


   With your sym3 example, here is what I got with MXNet 1.9 on Linux. I'm not sure if this issue only occurs on Windows. @matteosal Did you try it on Linux?
   ```
   Input1 + Input2 gradient, CPU (OK):
   {'.Inputs.Input1': 'write', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   [23:37:54] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input1 + Input2 gradient, GPU (OK):
   {'.Inputs.Input1': 'write', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   [23:38:01] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @gpu(0)>
   
   
   Input2 gradient only, CPU (OK):
   {'.Inputs.Input1': 'null', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input2 gradient only, GPU (WRONG):
   {'.Inputs.Input1': 'null', '.Inputs.Input2': 'write', '.Inputs.Input3': 'null'}
   
   [[[-3. -2. -3. -3.]
     [ 0. -3. -1. -3.]]]
   <NDArray 1x2x4 @gpu(0)>
   ```




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-887596932


   I see the same sign flip with this other symbol (which can be fed to the same script as above)
   [sym2.zip](https://github.com/apache/incubator-mxnet/files/6886594/sym2.zip)
   
   And with this one
   [sym3.zip](https://github.com/apache/incubator-mxnet/files/6886596/sym3.zip)
   Which goes with this script:
   ```
   import numpy as np
   import mxnet as mx
   
   json_path = 'sym3.json'
   sym = mx.sym.load(json_path)
   
   input_1 = np.random.rand(1, 2, 3, 4).tolist()
   input_2 = np.random.rand(1, 2, 4).tolist()
   input_3 = np.random.rand(1, 2).tolist()
   
   def run_example(ctx, reqs):
   	ex = sym._bind(
   		ctx,
   		{
   			'.Inputs.Input1': mx.ndarray.array(input_1, ctx=ctx),
   			'.Inputs.Input2': mx.ndarray.array(input_2, ctx=ctx),
   			'.Inputs.Input3': mx.ndarray.array(input_3, ctx=ctx)
   		},
   		args_grad={
   			'.Inputs.Input1': mx.ndarray.zeros([1, 2, 3, 4], ctx=ctx),
   			'.Inputs.Input2': mx.ndarray.zeros([1, 2, 4], ctx=ctx),
   			'.Inputs.Input3': mx.ndarray.zeros([1, 2], ctx=ctx)
   		},
   		grad_req=dict(zip(['.Inputs.Input1', '.Inputs.Input2', '.Inputs.Input3'], reqs))
   	)
   
   	ex.forward()
   	ex.backward(out_grads=[mx.ndarray.ones([1, 2, 3, 4], ctx=ctx)])
   
   	print(ex.grad_dict['.Inputs.Input2'])
   
   print('Input1 + Input2 gradient, CPU (OK):')
   run_example(mx.cpu(), ['write', 'write', 'null'])
   print('\n')
   print('Input1 + Input2 gradient, GPU (OK):')
   run_example(mx.gpu(), ['write', 'write', 'null'])
   print('\n')
   print('Input2 gradient only, CPU (OK):')
   run_example(mx.cpu(), ['null', 'write', 'null'])
   print('\n')
   print('Input2 gradient only, GPU (WRONG):')
   run_example(mx.gpu(), ['null', 'write', 'null'])
   ```
   Output is
   ```
   Input1 + Input2 gradient, CPU (OK):
   
   [[[-3. -2. -3. -2.]
     [ 0. -2. -2. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input1 + Input2 gradient, GPU (OK):
   
   [[[-3. -2. -3. -2.]
     [ 0. -2. -2. -3.]]]
   <NDArray 1x2x4 @gpu(0)>
   
   
   Input2 gradient only, CPU (OK):
   
   [[[-3. -2. -3. -2.]
     [ 0. -2. -2. -3.]]]
   <NDArray 1x2x4 @cpu(0)>
   
   
   Input2 gradient only, GPU (WRONG):
   
   [[[3. 2. 3. 2.]
     [0. 2. 2. 3.]]]
   <NDArray 1x2x4 @gpu(0)>
   ```
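The four runs can also be compared programmatically instead of by eye. The following is a minimal sketch in pure Python that classifies a candidate gradient as matching the reference, being its exact negation, or differing in some other way. The helper names `flatten` and `compare_gradients` are hypothetical, and the sample values are copied from the output above.

```python
def flatten(values):
    """Yield the scalars of an arbitrarily nested list or tuple as floats."""
    if isinstance(values, (list, tuple)):
        for item in values:
            yield from flatten(item)
    else:
        yield float(values)


def compare_gradients(reference, candidate, tol=1e-5):
    """Classify `candidate` relative to `reference`."""
    ref = list(flatten(reference))
    cand = list(flatten(candidate))
    if len(ref) != len(cand):
        return "shape mismatch"
    if all(abs(r - c) <= tol for r, c in zip(ref, cand)):
        return "match"
    # An exact negation means r + c is (close to) zero for every element.
    if all(abs(r + c) <= tol for r, c in zip(ref, cand)):
        return "sign flipped"
    return "different"


# Values copied from the reported output for '.Inputs.Input2'.
cpu_grad = [[[-3.0, -2.0, -3.0, -2.0], [0.0, -2.0, -2.0, -3.0]]]
gpu_grad_only_input2 = [[[3.0, 2.0, 3.0, 2.0], [0.0, 2.0, 2.0, 3.0]]]

print(compare_gradients(cpu_grad, cpu_grad))
print(compare_gradients(cpu_grad, gpu_grad_only_input2))
```

With real executors, the nested lists could be obtained from the NDArrays in `ex.grad_dict` via `.asnumpy().tolist()`.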
   




[GitHub] [incubator-mxnet] matteosal commented on issue #20471: Wrong gradients on Windows-GPU

Posted by GitBox <gi...@apache.org>.
matteosal commented on issue #20471:
URL: https://github.com/apache/incubator-mxnet/issues/20471#issuecomment-1010934607


   @barry-jin thank you, I have verified that swapping the operator fixes the problem

