Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/03/29 21:42:07 UTC

[GitHub] marcoabreu opened a new issue #10324: test_gluon_model_zoo_gpu.test_inference out of memory

URL: https://github.com/apache/incubator-mxnet/issues/10324
 
 
   ```
   ======================================================================
   ERROR: test_gluon_model_zoo_gpu.test_inference
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/usr/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
       self.test(*self.arg)
     File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new
       orig_test(*args, **kwargs)
     File "/work/mxnet/tests/python/gpu/test_gluon_model_zoo_gpu.py", line 90, in test_inference
       gpu_max_val = np.max(np.abs(gpu_out.asnumpy()))
     File "/work/mxnet/python/mxnet/ndarray/ndarray.py", line 1826, in asnumpy
       ctypes.c_size_t(data.size)))
     File "/work/mxnet/python/mxnet/base.py", line 149, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [17:02:37] src/storage/./pooled_storage_manager.h:108: cudaMalloc failed: out of memory
   
   Stack trace returned 10 entries:
   [bt] (0) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace()+0x42) [0x7f260c0384e2]
   [bt] (1) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x18) [0x7f260c038a88]
   [bt] (2) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::storage::GPUPooledStorageManager::Alloc(mxnet::Storage::Handle*)+0x1ab) [0x7f260ec629eb]
   [bt] (3) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::StorageImpl::Alloc(mxnet::Storage::Handle*)+0x4c) [0x7f260ec64e0c]
   [bt] (4) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::NDArray::CheckAndAlloc() const+0x194) [0x7f260c0de3f4]
   [bt] (5) /work/mxnet/python/mxnet/../../lib/libmxnet.so(+0x334a2b9) [0x7f260e77e2b9]
   [bt] (6) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x1f8) [0x7f260e7a13d8]
   [bt] (7) /work/mxnet/python/mxnet/../../lib/libmxnet.so(+0x381e603) [0x7f260ec52603]
   [bt] (8) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x572) [0x7f260ec4de02]
   [bt] (9) /work/mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::engine::ThreadedEnginePerDevice::GPUWorker<(dmlc::ConcurrentQueueType)0>(mxnet::Context, bool, mxnet::engine::ThreadedEnginePerDevice::ThreadWorkerBlock<(dmlc::ConcurrentQueueType)0>*, std::shared_ptr<dmlc::ManualEvent> const&)+0xdb) [0x7f260ec5cbdb]
   
   -------------------- >> begin captured logging << --------------------
   common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1776805682 to reproduce.
   urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): data.mxnet.io
   urllib3.connectionpool: DEBUG: http://data.mxnet.io:80 "GET /data/val-5k-256.rec HTTP/1.1" 200 150874780
   root: INFO: downloaded http://data.mxnet.io/data/val-5k-256.rec into data/val-5k-256.rec successfully
   common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1823912549 to reproduce.
   --------------------- >> end captured logging << ---------------------
   ```
   
   http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10313/2/pipeline/594
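
The captured log names two seed variables "to reproduce". As a rough sketch, the seeds could be pulled out of the log and turned into a re-run command along the following lines. The helper names, the `nosetests` invocation, and the repository-relative test path are assumptions made for illustration, not part of the CI setup; since the failure is a `cudaMalloc` out-of-memory error, it may depend on GPU memory state rather than on the seeds, so a re-run is not guaranteed to fail the same way.

```python
import re
import shlex

# Matches the seed hints the test harness writes to the captured log,
# e.g. "use MXNET_TEST_SEED=1823912549 to reproduce".
SEED_PATTERN = re.compile(r"(MXNET_(?:MODULE|TEST)_SEED)=(\d+)")


def extract_seeds(log_text):
    """Return {env_var_name: seed_string} for every seed hint in the log."""
    return dict(SEED_PATTERN.findall(log_text))


def build_repro_command(log_text, test_target):
    """Build a shell command that re-runs test_target with the logged seeds.

    Hypothetical helper: assumes the test is run with nose, as the
    traceback (nose/case.py) suggests.
    """
    seeds = extract_seeds(log_text)
    env_prefix = " ".join(f"{name}={value}" for name, value in sorted(seeds.items()))
    return f"{env_prefix} nosetests --verbose {shlex.quote(test_target)}".strip()


# The two seed lines copied from the captured logging section of the report.
captured_log = """\
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1776805682 to reproduce.
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1823912549 to reproduce.
"""

if __name__ == "__main__":
    # Assumed path, relative to the repository root.
    target = "tests/python/gpu/test_gluon_model_zoo_gpu.py:test_inference"
    print(build_repro_command(captured_log, target))
```

The printed command is meant to be run from the repository root on a machine with a GPU build of MXNet.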
