Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/03/06 23:39:16 UTC

[GitHub] [incubator-mxnet] ptrendx opened a new pull request #14352: Optimize NMS part 2

URL: https://github.com/apache/incubator-mxnet/pull/14352
 
 
   ## Description ##
   This PR moves the `batch_start` calculation in the BoxNMSForward op to a custom kernel, which is much faster than the mshadow-generated one. In the MaskRCNN model it reduces the runtime of that part from 20 ms to 2 us, speeding up single-GPU training by 20% in fp16 mode.
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ## Comments ##
   - I'm pretty sure that on the CPU path a simple for loop would also be much faster than the mshadow-generated kernel, but since I have no experimental data, I did not change it. FYI @zhreshold
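
For illustration only, the "simple for loop" idea might look roughly like the sketch below. It is not code from this PR. It assumes that `batch_start[b]` holds the number of valid boxes whose batch index is less than `b` (so it has `num_batch + 1` entries), which the description above does not spell out. The function name `ComputeBatchStart` and its parameters are hypothetical.

```cpp
#include <vector>

// Hypothetical helper, not part of MXNet's API.
// batch_ids[i] is the batch index of the i-th valid box.
// Returns num_batch + 1 offsets, where entry b is the number of boxes
// whose batch index is less than b (assumed meaning of batch_start).
std::vector<int> ComputeBatchStart(const std::vector<int>& batch_ids,
                                   int num_batch) {
  std::vector<int> batch_start(num_batch + 1, 0);
  // First pass: count the boxes in each batch, stored one slot to the right.
  for (int id : batch_ids) {
    // Out-of-range ids are skipped here; this is an assumption of the sketch.
    if (id >= 0 && id < num_batch) {
      ++batch_start[id + 1];
    }
  }
  // Second pass: a running sum turns per-batch counts into start offsets.
  for (int b = 0; b < num_batch; ++b) {
    batch_start[b + 1] += batch_start[b];
  }
  return batch_start;
}
```

This takes one pass over the boxes plus one pass over the batches, rather than one reduction over all boxes per batch index. Whether it actually beats the mshadow-generated kernel on CPU would still need to be measured.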
