Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/11 09:02:40 UTC

[GitHub] [incubator-mxnet] wshuail opened a new issue #15512: possible bug in nd.gather_nd

URL: https://github.com/apache/incubator-mxnet/issues/15512
 
 
   When I use nd.gather_nd on the GPU, an unexpected error sometimes occurs. It usually raises something like `RuntimeError: cuda runtime error (77) : an illegal memory access was encountered`.
   
   The error "terminate called without an active exception" can also happen with the code below.
   
   ```python
   import os
   import sys
   import mxnet as mx
   from mxnet import nd
   sys.path.insert(0, os.path.expanduser('~/gluon_detector'))

   ctx = mx.gpu(0)

   for i in range(10000):
       if i % 100 == 0:
           print(i)

       batch_size = 2
       num_classes = 2
       width, height = 5, 5
       k = 3

       hm_origin = nd.random.uniform(0, 10, (batch_size, num_classes, width, height), ctx=ctx)

       hm = nd.reshape(hm_origin, (0, 0, -1))

       topk_scores, topk_idx = nd.topk(hm, k=k, ret_typ='both')

       topk_x_idx = nd.floor(topk_idx / width)
       topk_y_idx = topk_idx % height

       batch_idx = nd.repeat(nd.arange(batch_size), repeats=num_classes * k).reshape((1, -1))
       batch_idx = batch_idx.as_in_context(ctx)
       class_idx = nd.repeat(nd.arange(num_classes), repeats=batch_size * k).reshape((1, -1))
       class_idx = class_idx.as_in_context(ctx)

       topk_x_idx = nd.reshape(topk_x_idx, (1, -1))
       topk_y_idx = nd.reshape(topk_y_idx, (1, -1))

       indices = nd.concat(batch_idx, class_idx, topk_x_idx, topk_y_idx, dim=0)

       results = nd.gather_nd(hm_origin, indices)
       results = nd.reshape(results, (batch_size, num_classes, k))
   ```
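   For context, the core semantics of gather_nd can be sketched in NumPy (a simplified sketch, ignoring trailing slice dimensions): each column of `indices` is one coordinate tuple into `data`. NumPy raises an IndexError for out-of-range coordinates, whereas a GPU kernel may not bounds-check at all, which would match the illegal-memory-access symptom above.

   ```python
   import numpy as np

   def gather_nd(data, indices):
       # Column j of `indices` is one coordinate tuple into `data`:
       # output[j] = data[indices[0, j], ..., indices[M-1, j]]
       return data[tuple(indices.astype(int))]

   data = np.arange(24).reshape(2, 3, 4)
   indices = np.array([[0, 1],
                       [2, 0],
                       [3, 1]])
   print(gather_nd(data, indices))  # [11 13]
   ```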
   
   When I add nd.waitall() at the end, it works fine.
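   (For background: MXNet's engine executes operations asynchronously, so an error raised inside a GPU kernel can surface later at an unrelated call; nd.waitall(), or any blocking call such as asnumpy(), forces synchronization so the failure is reported at a deterministic point. A minimal sketch, using mx.cpu() only so it runs anywhere:)

   ```python
   import mxnet as mx
   from mxnet import nd

   ctx = mx.cpu()  # the same idea applies to mx.gpu(0)

   x = nd.random.uniform(0, 1, (2, 3), ctx=ctx)
   y = x * 2     # enqueued on the engine; the Python call returns immediately
   nd.waitall()  # block until every pending operation has completed
   ```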
   
   Any suggestions?
   
   BTW, does MXNet have a function like torch.gather? gather_nd comes close, but it is not as convenient.
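   (For what it's worth, torch.gather's semantics, indexing along a single axis with an output shaped like the index array, correspond to NumPy's np.take_along_axis, which may be the closest mental model; MXNet's nd.pick covers some of the same one-axis cases. A sketch in NumPy:)

   ```python
   import numpy as np

   a = np.array([[1, 2],
                 [3, 4]])
   idx = np.array([[0, 0],
                   [1, 0]])

   # torch.gather(a, dim=1, index=idx) gathers along axis 1:
   # out[i, j] = a[i, idx[i, j]]
   out = np.take_along_axis(a, idx, axis=1)
   print(out)  # [[1 1]
               #  [4 3]]
   ```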

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services