Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/03/20 14:03:53 UTC

[GitHub] CXMA479 opened a new issue #10173: Unpredictable nan in Array

URL: https://github.com/apache/incubator-mxnet/issues/10173
 
 
   ## Description
   Hi there,
   I applied some `mx.nd` operations to my data, but the returned array seems to *randomly* contain a `nan` element at one particular index.
   
   ## Environment info (Required)
   
   ```
   ----------Python Info----------
   ('Version      :', '2.7.12')
   ('Compiler     :', 'GCC 5.4.0 20160609')
   ('Build        :', ('default', 'Nov 19 2016 06:48:10'))
   ('Arch         :', ('64bit', 'ELF'))
   ----------MXNet Info-----------
   ('Version      :', '1.2.0')
   ('Directory    :', '/home/chen-k/workspace/incubator-mxnet/python/mxnet')
   ----------System Info----------
   ('Platform     :', 'Linux-4.4.0-96-generic-x86_64-with-Ubuntu-16.04-xenial')
   ('system       :', 'Linux')
   ('release      :', '4.4.0-96-generic')
   ```
   
   Package used (Python/R/Scala/Julia):
   Python
   
   MXNet commit hash:
   97570916f844bcb4515d972c75fb0a75da345d97
   
   ## Error Message:
   ```python
   # `Ctrl+F` to find the place where `nan` occurs
   # $ python batch_loss_bug.py 
   [[ 0.54881352  0.59284461]
    [ 0.71518934  0.84426576]
    [ 0.60276335  0.85794562]
    [ 0.54488319  0.84725171]
    [ 0.42365479  0.62356371]]
   [[ 0.79172504  0.81216872]
    [ 0.5288949   0.47997716]
    [ 0.56804454  0.3927848 ]
    [ 0.92559665  0.83607876]
    [ 0.07103606  0.33739617]]
   --------------------
   [[ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [        nan -0.01117295]     #     NaN occurs here!
    [ 0.38071346 -0.01117295]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]]
   --------------------
   [[ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [ 0.24291152  0.21932411]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.18629444 -0.3642886 ]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [-0.03471881 -0.46516082]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [ 0.38071346 -0.01117295]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]
    [-0.35261875 -0.28616753]]
   ```
   
   ## Minimum reproducible example
   Run the following code several times (fewer than 10 runs were enough in my experiments) and there is a chance of observing the faulty output.
   ```python
   import mxnet as mx
   import numpy as np
   def gen_e1_e2_e3(D, P):
       P_x, P_y = [P.T[i].T.reshape(shape=(0,0,1)) for i in range(2) ] # (b, Dest_N,1)
       D_x, D_y = [D.T[i].T.reshape(shape=(0,0,1)) for i in range(2) ]
       e3_P_x, e3_P_y = [mx.nd.repeat(x,axis=-1, repeats=P.shape[-2]) for x in [P_x, P_y] ]
       e3_D_x, e3_D_y = [mx.nd.repeat(x,axis=-1, repeats=P.shape[-2]) for x in [D_x, D_y] ]
       e3_x, e3_y = [ ( e3_P_z-e3_D_z ).reshape( (0,-1,1) )
                       for e3_P_z, e3_D_z in [ (e3_P_x, e3_D_x), (e3_P_y, e3_D_y) ]]
       e3  = mx.nd.concat(e3_x, e3_y, dim=-1) 
       return e3
   
   def bench_mark(D_slice, P_slice):
       n= D_slice.shape[0]
       e3 = mx.nd.empty((n*n,2))
       for i in xrange(n):
           e3[n*i:n*(i+1)]  = (e3[n*i:n*(i+1)]*0+1)*( P_slice[i] - D_slice[i])
       return e3
   x = mx.nd.random.uniform(shape=(2,5,2))
   y = mx.nd.random.uniform(shape=(2,5,2))
   e3 = gen_e1_e2_e3(x,y)
   e3_bm = bench_mark(x[0], y[0])
   
   print x[0].asnumpy()
   print y[0].asnumpy()
   print '-'*20
   print (e3_bm).asnumpy()
   print '-'*20
   print (e3[0]).asnumpy()
   ```
   
   ## What have you tried to solve it?
   Maybe `bench_mark` should be replaced by a different implementation (but I still think the problem is worth reporting).
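
One possible explanation, stated as an assumption rather than a confirmed diagnosis: `mx.nd.empty` returns an array without initializing its entries, and `bench_mark` multiplies those entries by 0 before adding 1. Under IEEE 754 arithmetic, `nan * 0` and `inf * 0` are both `nan`, so any stale `nan` or `inf` left in the buffer would survive the "reset". The sketch below reproduces only that arithmetic on plain Python floats; the `stale` list and both helper names are hypothetical stand-ins, not MXNet code.

```python
import math

def reset_by_multiply(buffer, value):
    """Mimic `(e3[...]*0 + 1) * value` elementwise on a plain list."""
    return [(v * 0 + 1) * value for v in buffer]

def reset_by_assign(buffer, value):
    """Overwrite every element directly, ignoring the old contents."""
    return [value for _ in buffer]

# Hypothetical stale contents standing in for an uninitialized buffer.
stale = [0.0, 3.5, float("nan"), float("inf")]
diff = 0.38071346  # one of the differences from the output above

multiplied = reset_by_multiply(stale, diff)
assigned = reset_by_assign(stale, diff)

# Finite stale values are reset correctly; nan and inf both end up as nan.
print(multiplied)
# Direct assignment never reads the stale values, so no nan can leak through.
print(assigned)
```

If this is indeed the cause, allocating with `mx.nd.zeros((n*n, 2))` instead of `mx.nd.empty`, or assigning the difference without reading the old contents, would be expected to avoid the `nan`; neither change has been tested here.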
