Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/09/16 07:36:38 UTC

[GitHub] [incubator-mxnet] kohillyang opened a new issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

kohillyang opened a new issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159


   ## Description
   Hello, I'm using Flask with MXNet to write a server. Since it is a web app, we want the GPU memory to be fully statically allocated.
   However, as the title says, I found that GPU memory usage keeps increasing when the MXNet version is 1.6.0post0 or 1.7.0, while with MXNet 1.5.1 everything works fine. Since Flask debug mode uses multi-threading, I think it may be caused by some calls that are not thread-safe.
   ![x](https://user-images.githubusercontent.com/7751507/93306377-5a952b00-f832-11ea-9573-ce6b306e3c90.gif)
   
   
   ## To Reproduce
   This is a naive Flask server:
   ```python
   import os
   
   # Configure the MXNet memory pool and cuDNN autotuning before any GPU work is done.
   os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"
   os.environ["MXNET_GPU_MEM_POOL_TYPE"] = "Round"
   
   import mxnet as mx
   import numpy as np
   
   
   class Predictor(object):
       def __init__(self):
           ctx = mx.gpu(0)
           net = mx.gluon.model_zoo.vision.resnet50_v1()
           net.initialize()
           net.collect_params().reset_ctx(ctx)
           net.hybridize(active=True)
           # Warm up with the largest expected input so the cached graph is built once.
           max_h = 768
           max_w = 768
           _ = net(mx.nd.zeros(shape=(1, 3, max_h, max_w), ctx=ctx))
           self.ctx = ctx
           self.net = net
   
       def __call__(self, *args, **kwargs):
           # Run inference on a randomly sized input, as in the real workload.
           max_h = 768
           max_w = 768
           x_h = np.random.randint(100, max_h)
           x_w = np.random.randint(100, max_w)
           xx = np.random.randn(1, 3, x_h, x_w)
           y = self.net(mx.nd.array(xx, ctx=self.ctx))
           return y.asnumpy().sum()
   
   
   if __name__ == '__main__':
       import logging
   
       import flask
       import tornado.httpserver
       import tornado.ioloop
       import tornado.wsgi
       from flask_cors import CORS
   
       DEBUG = True
       PORT = 21500
       app = flask.Flask(__name__)
       CORS(app, supports_credentials=True)
       predictor = Predictor()
   
       @app.route('/test', methods=['POST'])
       def net_forward():
           try:
               r = predictor()
               return flask.jsonify(float(r))
           except Exception as e:
               logging.exception(e)
               print("failed")
               return flask.jsonify(str(e)), 400
   
       print("starting webserver...")
       if DEBUG:
           app.run(debug=True, host='0.0.0.0', port=PORT)
       else:
           http_server = tornado.httpserver.HTTPServer(
               tornado.wsgi.WSGIContainer(app))
           http_server.listen(PORT, address="0.0.0.0")
           tornado.ioloop.IOLoop.instance().start()
   ```
   
   And just run the following code to request the server:
   ```python
   import json
   
   import requests
   
   
   def remote_call(url):
       register_data = {"Pic": "xx"}
       data = json.dumps(register_data)
       return requests.post(url, data)
   
   
   if __name__ == '__main__':
       register_url = 'http://127.0.0.1:21500/test'
       # Request the server in an endless loop so the GPU memory usage
       # can be watched over time (e.g. with nvidia-smi).
       while True:
           try:
               remote_call(register_url)
           except Exception as e:
               print(e)
   ```
   
   ## Environment
   I'm using flask 1.0.2 and tornado 5.1, but I think the issue is independent of the Flask and Tornado versions.
   We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
   ```
   curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python
   ```
   
   ```
   /data2/kohill/anaconda3/bin/python /data2/kohill/mx-detection/diagnose.py
   ----------Python Info----------
   Version      : 3.7.0
   Compiler     : GCC 7.2.0
   Build        : ('default', 'Jun 28 2018 13:15:42')
   Arch         : ('64bit', '')
   ------------Pip Info-----------
   Version      : 20.2.2
   Directory    : /data2/kohill/anaconda3/lib/python3.7/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.7.0
   Directory    : /data2/kohill/anaconda3/lib/python3.7/site-packages/mxnet
   Commit Hash   : 64f737cdd59fe88d2c5b479f25d011c5156b6a8a
   Library      : ['/data2/kohill/anaconda3/lib/python3.7/site-packages/mxnet/libmxnet.so']
   Build features:
   ? CUDA
   ? CUDNN
   ? NCCL
   ? CUDA_RTC
   ? TENSORRT
   ? CPU_SSE
   ? CPU_SSE2
   ? CPU_SSE3
   ? CPU_SSE4_1
   ? CPU_SSE4_2
   ? CPU_SSE4A
   ? CPU_AVX
   ? CPU_AVX2
   ? OPENMP
   ? SSE
   ? F16C
   ? JEMALLOC
   ? BLAS_OPEN
   ? BLAS_ATLAS
   ? BLAS_MKL
   ? BLAS_APPLE
   ? LAPACK
   ? MKLDNN
   ? OPENCV
   ? CAFFE
   ? PROFILER
   ? DIST_KVSTORE
   ? CXX14
   ? INT64_TENSOR_SIZE
   ? SIGNAL_HANDLER
   ? DEBUG
   ? TVM_OP
   ----------System Info----------
   Platform     : Linux-4.15.0-117-generic-x86_64-with-debian-stretch-sid
   system       : Linux
   node         : ubuntu
   release      : 4.15.0-117-generic
   version      : #118~16.04.1-Ubuntu SMP Sat Sep 5 23:35:06 UTC 2020
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                48
   On-line CPU(s) list:   0-47
   Thread(s) per core:    2
   Core(s) per socket:    12
   Socket(s):             2
   NUMA node(s):          2
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 63
   Model name:            Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
   Stepping:              2
   CPU MHz:               1200.672
   CPU max MHz:           3300.0000
   CPU min MHz:           1200.0000
   BogoMIPS:              5002.04
   Virtualization:        VT-x
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              256K
   L3 cache:              30720K
   NUMA node0 CPU(s):     0-11,24-35
   NUMA node1 CPU(s):     12-23,36-47
   Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0060 sec, LOAD: 1.4688 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1272 sec, LOAD: 1.2150 sec.
   Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1045)>, DNS finished in 0.10556268692016602 sec.
   Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0053 sec, LOAD: 1.4548 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0048 sec, LOAD: 11.7945 sec.
   Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.005016326904296875 sec.
   ----------Environment----------
   ```
   


[GitHub] [incubator-mxnet] szha commented on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
szha commented on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-695837707


   @leezu could this be related to the issue fixed by #18328 #18363?


[GitHub] [incubator-mxnet] leezu removed a comment on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
leezu removed a comment on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-696475563


   You're creating a predictor inside `def net_forward()`. I don't know how often that is called.


[GitHub] [incubator-mxnet] wkcn edited a comment on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
wkcn edited a comment on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-693464974


   Hi @kohillyang, I think it is not related to MXNet.
   
   When there is a new connection, Flask will create a new Python process to handle it, which creates a new copy of the MXNet `predictor` instance.
   
   To validate this, you can print the id of the predictor with ~`print(id(r))`~ `print(id(predictor))` in the function `def net_forward():`.
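   
   A minimal sketch of that check, assuming the `app`, `predictor`, and `net_forward` names from the repro above (the log format is made up here): it prints the process id and the object id on every request, so you can see whether a new process or a new `Predictor` instance shows up per connection.
   ```python
   import os
   
   @app.route('/test', methods=['POST'])
   def net_forward():
       # If either value changes between requests, a new process or a new
       # Predictor instance is being created for each connection.
       print("pid:", os.getpid(), "predictor id:", id(predictor))
       r = predictor()
       return flask.jsonify(float(r))
   ```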


[GitHub] [incubator-mxnet] kohillyang commented on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
kohillyang commented on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-696461731


   Why do you think I'm creating a new predictor in each call? Apparently there is only one Predictor instance.


[GitHub] [incubator-mxnet] kohillyang commented on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
kohillyang commented on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-695749604


   @wkcn the predictor is created by `predictor = Predictor()`, not by `r = predictor()`, since its `__call__` method is overridden.


[GitHub] [incubator-mxnet] leezu commented on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-696317102


   @kohillyang so you are creating a new predictor in every HTTP call? If so, then yes: a new Block is created in every HTTP call, and due to https://github.com/apache/incubator-mxnet/pull/18328 the parameters of the Block won't be deallocated.
   
   https://github.com/apache/incubator-mxnet/pull/18328/files only contains Python changes. Would you like to try applying the changes to your MXNet installation and see if the memory leak goes away? Thank you.
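   
   One rough way to check whether Block parameters are being released (a sketch, not from the thread; the loop count and input size are arbitrary): `mx.context.gpu_memory_info` reports free/total device memory in bytes, and `ctx.empty_cache()` returns pooled blocks to the driver so the number is meaningful.
   ```python
   import gc
   import mxnet as mx
   
   ctx = mx.gpu(0)
   
   def free_mib():
       # (free, total) in bytes for the given device id.
       free, _total = mx.context.gpu_memory_info(ctx.device_id)
       return free / 1024 ** 2
   
   for i in range(5):
       # Create and drop a fresh Block each iteration.
       net = mx.gluon.model_zoo.vision.resnet50_v1()
       net.initialize(ctx=ctx)
       net.hybridize(static_alloc=True)
       _ = net(mx.nd.zeros((1, 3, 224, 224), ctx=ctx))
       mx.nd.waitall()
       del net
       gc.collect()
       ctx.empty_cache()  # release cached, unused blocks back to the CUDA driver
       print("iteration", i, "free MiB:", free_mib())
   ```
   If the free memory keeps shrinking across iterations even after `empty_cache()`, the parameters are indeed not being deallocated.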


[GitHub] [incubator-mxnet] kohillyang edited a comment on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
kohillyang edited a comment on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-695749604


   @wkcn the predictor is created by `predictor = Predictor()`, not by `r = predictor()`, since its `__call__` method is overridden. And the memory usage grows slowly; it seems the memory allocated by the line `mx.nd.zeros(shape=(1, 3, max_h, max_w), ctx=ctx)` is not freed.
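   
   One way to tell whether this is the memory pool holding on to differently sized buffers or a real leak (a sketch; only meaningful if requests are handled one at a time, and note the later comment that `empty_cache()` itself is not thread-safe): drop the cache after each request and watch `nvidia-smi`.
   ```python
   # Inside Predictor.__call__, purely for diagnosis (this slows inference down):
   y = self.net(mx.nd.array(xx, ctx=self.ctx))
   out = y.asnumpy().sum()
   self.ctx.empty_cache()  # return cached, unused blocks to the CUDA driver
   return out
   ```
   If usage drops back after each request, the growth is pool caching; if it keeps climbing, something really isn't being freed.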


[GitHub] [incubator-mxnet] kohillyang edited a comment on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
kohillyang edited a comment on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-693581070


   @wkcn but even if Flask created a new process, the GPU memory should be freed once that process ends. Also, the predictor is created in the main function, which should run only once, so there is only one predictor instance. On the other hand, if the main process has already initialized a CUDA context, MXNet in a subprocess will fail at inference, because CUDA file descriptors cannot be shared between the main process and the subprocess.
   
   BTW, the pid of the process and the id of the predictor remain unchanged. I print them using the following code:
   ```python
           print(id(self))
           print(os.getpid())
   ```
   
   PS: `ctx.empty_cache()` is also not thread-safe. If you call it from two threads, the program can crash in some cases.
   
   Thread safety matters because sometimes you need to implement a Block with `asnumpy`, and it is too hard to implement every block as a HybridBlock in a fully asynchronous way. In PyTorch this is not a problem because of DataParallel, which starts a thread per device and gathers the results, but that pattern is not officially supported by MXNet, since there are issues like <https://github.com/apache/incubator-mxnet/issues/13199> that need workarounds.
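   
   A workaround consistent with the thread-safety concern above, for Flask's threaded debug server, is to serialize all MXNet calls behind a lock. This is only a sketch built around the `Predictor` class from the repro; the `LockedPredictor` name and the module-level lock are made up here.
   ```python
   import threading
   
   _mx_lock = threading.Lock()  # hypothetical lock serializing all MXNet work
   
   class LockedPredictor(Predictor):
       def __call__(self, *args, **kwargs):
           # Only one Flask worker thread touches the MXNet engine at a time.
           with _mx_lock:
               return super().__call__(*args, **kwargs)
   ```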


[GitHub] [incubator-mxnet] leezu commented on issue #19159: GPU memory usage keeps increasing even hybridize with static_alloc when used in flask debug mode after mxnet 1.6.0post0.

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #19159:
URL: https://github.com/apache/incubator-mxnet/issues/19159#issuecomment-696475563


   You're creating a predictor inside `def net_forward()`. I don't know how often that is called.

