Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/11/06 03:57:42 UTC

[GitHub] reminisce opened a new pull request #8558: slice operator supporting arbitrary values of step

reminisce opened a new pull request #8558: slice operator supporting arbitrary values of step
URL: https://github.com/apache/incubator-mxnet/pull/8558
 
 
   ## Description ##
   - Re-implemented the slice operator using `Kernel::Launch`.
   - Added support for arbitrary values of `step` along with `begin` and `end`, i.e. `slice(data, begin, end, step)`, where `step` is an optional parameter (a short usage sketch follows below).
   
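   A minimal usage sketch (not part of the PR itself; the `step` keyword is assumed from the signature given above):
   ```python
    import mxnet as mx

    x = mx.nd.arange(24).reshape((4, 6))

    # Existing behaviour: contiguous slice of rows 1..2, all columns.
    y = mx.nd.slice(x, begin=(1, 0), end=(3, 6))

    # With this PR: the optional step, e.g. every other column.
    z = mx.nd.slice(x, begin=(0, 0), end=(4, 6), step=(1, 2))

    print(y.shape, z.shape)  # expected: (2, 6) (4, 3)
   ```
   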
   ## Checklist ##
   ### Essentials ###
   - [x] Passed code style checking (`make lint`)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] For user-facing API changes, API doc string has been updated.
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ## Comments ##
   - The kernel implementation is based on the parallelization approach of the slice operator in mshadow, with additional support for arbitrary values of `step`.
   - This operator will be used in https://github.com/apache/incubator-mxnet/pull/8246 to support slicing an `NDArray` with `step != 1`. Currently, that PR uses the `gather_nd` op to realize the functionality, which incurs heavy overhead from constructing index `NDArray`s out of the slices (see the sketch below this list).
   - Mini-benchmark comparing `slice_v1` (**mshadow version**) and `slice` (**Kernel::Launch version**):
   
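   As an informal illustration only (not code from either PR), the difference roughly looks like this; the `gather_nd` route must first materialize an index `NDArray`, which is the overhead being avoided:
   ```python
    import mxnet as mx
    import numpy as np

    data = mx.nd.arange(24).reshape((6, 4))

    # Workaround: build an index NDArray of the wanted rows (0, 2, 4)
    # and gather them. Constructing these indices is the extra cost.
    rows = mx.nd.array(np.arange(0, 6, 2)).reshape((1, -1))
    via_gather = mx.nd.gather_nd(data, rows)

    # With this PR, the same rows come from a single stepped slice.
    via_slice = mx.nd.slice(data, begin=(0,), end=(6,), step=(2,))

    assert (via_gather.asnumpy() == via_slice.asnumpy()).all()
   ```
   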
   **Hardware**
   p2.xlarge (4 omp threads)
   
   **Commit**
   [ba9de66](https://github.com/reminisce/mxnet/commit/ba9de669a752f232dcb21b0c11f9a4f91533af23)
   
   **GPU build**
   mx.nd.slice_v1: 10000 repeats costs 0.561281 seconds
   mx.nd.slice:      10000 repeats costs 0.562561 seconds
   
   **CPU-only build**
   mx.nd.slice_v1: 10000 repeats costs 1.049587 seconds
   mx.nd.slice:      10000 repeats costs 1.130866 seconds
   
   **Benchmark script**
   ```python
    import mxnet as mx
    import numpy as np
    import time
    from mxnet.test_utils import same
    
    #ctx = mx.gpu(0)
    ctx = mx.cpu(0)
    
    repeat = 10000
    shape = (16, 16, 16, 16)
    a = mx.nd.arange(np.prod(shape), ctx=ctx).reshape(shape=shape)
    begin = (None, 1, 2)
    end = (shape[0], shape[1], shape[2])
    
    # warm up
    for i in range(100):
        b = mx.nd.slice_v1(a, begin=begin, end=end)
        c = mx.nd.slice(a, begin=begin, end=end)
    
    mx.nd.waitall()
     # Time slice_v1 (mshadow-based implementation).
     start = time.time()
    for i in range(repeat):
        c = mx.nd.slice_v1(a, begin=begin, end=end)
    mx.nd.waitall()
    elapsed = time.time() - start
    print("mx.nd.slice_v1: %d repeats costs %f seconds" % (repeat, elapsed))
    
     # Time slice (Kernel::Launch-based implementation).
     start = time.time()
    for i in range(repeat):
        b = mx.nd.slice(a, begin=begin, end=end)
    mx.nd.waitall()
    elapsed = time.time() - start
    print("mx.nd.slice: %d repeats costs %f seconds" % (repeat, elapsed))
    
    assert same(c.asnumpy(), b.asnumpy())
   ```
   @piiswrong @eric-haibin-lin @anirudh2290 @rahul003 @cjolivier01 
   
