Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/12/04 03:21:07 UTC

[GitHub] YutingZhang opened a new issue #13521: Gluon DataLoader cannot release the processes in the pool

URL: https://github.com/apache/incubator-mxnet/issues/13521
 
 
   https://github.com/apache/incubator-mxnet/blob/f2dcd7c7b8676b55d912997fc3f9c62c55915307/python/mxnet/gluon/data/dataloader.py#L532-L533
   
   Logically, when a `DataLoader` is garbage-collected, its `_worker_pool` should be collected as well, and the pool's `terminate()` method should be called immediately. However, that does not happen.
   
   Each time I delete a `DataLoader`, it leaves the worker processes dangling.
   I guess it is a bug in Python's `multiprocessing.Pool`. Either way, I think we can patch it by explicitly calling `_worker_pool.terminate()`.
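
The proposed patch could look roughly like the sketch below. It assumes a hypothetical `PoolOwner` class standing in for `DataLoader` (it is not MXNet's actual code) and only illustrates the idea of terminating the pool explicitly instead of relying on the pool's own cleanup.

```python
import multiprocessing


class PoolOwner:
    """Hypothetical stand-in for an object that owns a worker pool."""

    def __init__(self, num_workers):
        self._worker_pool = multiprocessing.Pool(num_workers)

    def close(self):
        """Terminate the worker processes; safe to call more than once."""
        pool = getattr(self, "_worker_pool", None)
        if pool is not None:
            pool.terminate()  # stop the workers without finishing queued work
            pool.join()       # wait until the worker processes have exited
            self._worker_pool = None

    def __del__(self):
        # Explicit cleanup when the owner is collected.
        self.close()
```

Calling `close()` directly gives deterministic cleanup. The `__del__` hook is only a fallback, since Python does not guarantee when (or whether) it runs.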
   
   Minimal code to reproduce the error:
   ```python
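   import mxnet as mx
   import numpy as np

   # Any array-like dataset works; the shape is arbitrary.
   A = np.random.rand(999, 2000)
   D = mx.gluon.data.DataLoader(A, batch_size=8, num_workers=2)
   the_iter = iter(D)
   next(the_iter)  # fetch one batch through the worker processes
   # Explicit termination, as proposed above.
   D._worker_pool.terminate()
   del the_iter
   del D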
   ```
   
   I recorded a video demo for this bug: https://drive.google.com/open?id=1q4CmU_F1vAtxoZ_KUmrIEfVRk3RsQfv8
   
   Environment: today's MXNet from pip, Python 3.6 on p3
   
