You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/31 01:39:54 UTC

[GitHub] [incubator-mxnet] eric-haibin-lin opened a new issue #17495: Initial inspections for singleton thread safety in MXNet

eric-haibin-lin opened a new issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495
 
 
   I roughly skim-through the code base and searched for classes with `::Get` method. Some of these class objects are global singletons and have potential thread safety issues. I list the initial assessment: 
   
   ### thread safe classes
   These classes are thread safe
   - ::dmlc::Registry<TVMOpConfig> 
   - Engine
   - Storage
   - StorageManager
     - CPUSharedStorageManager
     - GPUPooledStorageManager
     - GPUPooledRoundedStorageManager
     - PinnedMemoryStorage
     - GPUDeviceStorage/NaiveStorageManager
     - CPUDeviceStorage
   - Profiler
   - CustomOpProfiler
   - LazyAllocArray
   
   ### is a thread_local object or contains thread local variables
   The thread safety depends on the lifecycle of the thread. Are there alternative ways to avoid them? 
   - ResourceManager
   - MKLDNNStream @PatricZhao @TaoLv
   - Imperative 
   
   ### classes that are not thread safe
   These classes are not thread safe and may cause bugs: 
   - engine::OpenMP 
   - DeviceStorageProfiler 
   - SubgraphBackendRegistry @samskalicky 
   
   I didn't look into C APIs. And there are lots of other thread_local objects spreading around in the code base as well. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #17495: Initial inspections for singleton thread safety in MXNet

Posted by GitBox <gi...@apache.org>.
eric-haibin-lin commented on issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495#issuecomment-589827805
 
 
   @leezu thanks for the followup. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #17495: Initial inspections for singleton thread safety in MXNet

Posted by GitBox <gi...@apache.org>.
eric-haibin-lin commented on issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495#issuecomment-585461965
 
 
   There are a couple of thread local variables in the python level, too:
   ```
   python/mxnet/attribute.py:    _current = threading.local()
   python/mxnet/context.py:    _default_ctx = threading.local()
   python/mxnet/gluon/block.py:    _current = threading.local()
   python/mxnet/name.py:    _current = threading.local()
   python/mxnet/util.py:    _current = threading.local()
   ```
   they suggest that if the frontend python thread is switched, some of these contexts are lost. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on issue #17495: Initial inspections for singleton thread safety in MXNet

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495#issuecomment-589460965
 
 
   @eric-haibin-lin I raised the question on the Python Bug Tracker.
   Python maintainers recommend to refactor MXNet to make state management pluggable / customizable.
   
   Adding callbacks to contextvars is infeasible: 
   
   > For extra context: context switches occur on every callback invocation in asyncio and there can be thousands of them per seconds (or even more). Adding any extra code to context switching code will noticeably degrade the performance.
   
   Reference: https://bugs.python.org/issue39660#msg362370

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] eric-haibin-lin edited a comment on issue #17495: Initial inspections for singleton thread safety in MXNet

Posted by GitBox <gi...@apache.org>.
eric-haibin-lin edited a comment on issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495#issuecomment-585461965
 
 
   There are a couple of thread local variables in the python level, too:
   - [AttrScope](python/mxnet/symbol/contrib.py) for symbols. Used in control flow ops. 
   - python/mxnet/context.py:    _default_ctx = threading.local()
   - python/mxnet/gluon/block.py:    _current = threading.local()
   - python/mxnet/name.py:    _current = threading.local()
   - [_NumpyArrayScope](python/mxnet/util.py) used to declare numpy array creation
   
   
   They suggest that if the frontend python thread is switched, some of these contexts are lost. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on issue #17495: Initial inspections for singleton thread safety in MXNet

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17495: Initial inspections for singleton thread safety in MXNet
URL: https://github.com/apache/incubator-mxnet/issues/17495#issuecomment-585466432
 
 
   Specifically, we'd need to use something like https://docs.python.org/3/library/contextvars.html
   But Contextvar in it's current form is not sufficient, as it doesn't allow us to hook in C API calls.
   This feature should be added to Python standard library, and on the MXNet side we should use a patched version of Contextvar

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services