Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/12/03 18:56:59 UTC

[GitHub] [incubator-mxnet] leezu opened a new issue #19623: out of memory during compilation on CI

leezu opened a new issue #19623:
URL: https://github.com/apache/incubator-mxnet/issues/19623


   CI is subject to out-of-memory errors when an object is not in ccache and has to be compiled. Similar to https://github.com/apache/incubator-mxnet/issues/18501
   
   ```
   [2020-12-03T18:13:57.250Z] FAILED: CMakeFiles/mxnet.dir/src/operator/numpy/linalg/np_norm_forward.cc.o 
   
   [2020-12-03T18:13:57.251Z] /usr/local/bin/ccache /usr/bin/c++  -DDMLC_CORE_USE_CMAKE -DDMLC_LOG_FATAL_THROW=1 -DDMLC_LOG_STACK_TRACE_SIZE=0 -DDMLC_MODERN_THREAD_LOCAL=0 -DDMLC_STRICT_CXX11 -DDMLC_USE_CXX11 -DDMLC_USE_CXX11=1 -DDMLC_USE_CXX14 -DMKL_ILP64 -DMSHADOW_INT64_TENSOR_SIZE=1 -DMSHADOW_IN_CXX11 -DMSHADOW_USE_CBLAS=0 -DMSHADOW_USE_CUDA=0 -DMSHADOW_USE_MKL=1 -DMSHADOW_USE_SSE -DMXNET_USE_BLAS_MKL=1 -DMXNET_USE_INTGEMM=1 -DMXNET_USE_LAPACK=1 -DMXNET_USE_LIBJPEG_TURBO=0 -DMXNET_USE_OPENCV=1 -DMXNET_USE_OPENMP=1 -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_SIGNAL_HANDLER=1 -DMXNET_USE_TVM_OP=1 -D__USE_XOPEN2K8 -Dmxnet_EXPORTS -I/work/mxnet/include -I/work/mxnet/src -I/work/mxnet/3rdparty/tvm/nnvm/include -I/work/mxnet/3rdparty/tvm/include -I/work/mxnet/3rdparty/dmlc-core/include -I/work/mxnet/3rdparty/dlpack/include -I/work/mxnet/3rdparty/mshadow -I3rdparty/intgemm -I/work/mxnet/3rdparty/intgemm -I/work/mxnet/3rdparty/miniz -I3rdparty/dmlc-core/include -isystem /opt/intel/mkl/include -isystem /usr/include/opencv4 -D_GLIBCXX_ASSERTIONS  -Wall -Wno-sign-compare -O3 -g -fopenmp -O2 -g -DNDEBUG -fPIC   -Werror -Wno-error=unused-variable -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -fopenmp -std=gnu++17 -MD -MT CMakeFiles/mxnet.dir/src/operator/numpy/linalg/np_norm_forward.cc.o -MF CMakeFiles/mxnet.dir/src/operator/numpy/linalg/np_norm_forward.cc.o.d -o CMakeFiles/mxnet.dir/src/operator/numpy/linalg/np_norm_forward.cc.o -c /work/mxnet/src/operator/numpy/linalg/np_norm_forward.cc
   
   [2020-12-03T18:13:57.251Z] c++: fatal error: Killed signal terminated program cc1plus
   
   [2020-12-03T18:13:57.251Z] compilation terminated.
   
   [2020-12-03T18:13:57.251Z] [363/888] Building CXX object CMakeFiles/mxnet.dir/src/operator/numpy/np_broadcast_reduce_op_value.cc.o
   
   [2020-12-03T18:13:57.251Z] FAILED: CMakeFiles/mxnet.dir/src/operator/numpy/np_broadcast_reduce_op_value.cc.o 
   
   [2020-12-03T18:13:57.251Z] /usr/local/bin/ccache /usr/bin/c++  -DDMLC_CORE_USE_CMAKE -DDMLC_LOG_FATAL_THROW=1 -DDMLC_LOG_STACK_TRACE_SIZE=0 -DDMLC_MODERN_THREAD_LOCAL=0 -DDMLC_STRICT_CXX11 -DDMLC_USE_CXX11 -DDMLC_USE_CXX11=1 -DDMLC_USE_CXX14 -DMKL_ILP64 -DMSHADOW_INT64_TENSOR_SIZE=1 -DMSHADOW_IN_CXX11 -DMSHADOW_USE_CBLAS=0 -DMSHADOW_USE_CUDA=0 -DMSHADOW_USE_MKL=1 -DMSHADOW_USE_SSE -DMXNET_USE_BLAS_MKL=1 -DMXNET_USE_INTGEMM=1 -DMXNET_USE_LAPACK=1 -DMXNET_USE_LIBJPEG_TURBO=0 -DMXNET_USE_OPENCV=1 -DMXNET_USE_OPENMP=1 -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_SIGNAL_HANDLER=1 -DMXNET_USE_TVM_OP=1 -D__USE_XOPEN2K8 -Dmxnet_EXPORTS -I/work/mxnet/include -I/work/mxnet/src -I/work/mxnet/3rdparty/tvm/nnvm/include -I/work/mxnet/3rdparty/tvm/include -I/work/mxnet/3rdparty/dmlc-core/include -I/work/mxnet/3rdparty/dlpack/include -I/work/mxnet/3rdparty/mshadow -I3rdparty/intgemm -I/work/mxnet/3rdparty/intgemm -I/work/mxnet/3rdparty/miniz -I3rdparty/dmlc-core/include -isystem /opt/intel/mkl/include -isystem /usr/include/opencv4 -D_GLIBCXX_ASSERTIONS  -Wall -Wno-sign-compare -O3 -g -fopenmp -O2 -g -DNDEBUG -fPIC   -Werror -Wno-error=unused-variable -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -fopenmp -std=gnu++17 -MD -MT CMakeFiles/mxnet.dir/src/operator/numpy/np_broadcast_reduce_op_value.cc.o -MF CMakeFiles/mxnet.dir/src/operator/numpy/np_broadcast_reduce_op_value.cc.o.d -o CMakeFiles/mxnet.dir/src/operator/numpy/np_broadcast_reduce_op_value.cc.o -c /work/mxnet/src/operator/numpy/np_broadcast_reduce_op_value.cc
   
   [2020-12-03T18:13:57.251Z] c++: fatal error: Killed signal terminated program cc1plus
   ```
   
   https://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-19588/12/pipeline
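
To see which translation units were affected in a log like the one above, a small script can pair each `FAILED:` line with a following `Killed signal terminated program cc1plus` message. This is only a minimal sketch: it assumes the log format shown in this issue, and the function name `killed_targets` is made up for illustration, not part of the MXNet CI tooling.

```python
import re

# Matches the build target named on a "FAILED:" line, as in the log above.
FAILED_RE = re.compile(r"FAILED:\s+(\S+)")
# Message printed when the compiler process cc1plus is killed.
KILLED_MARKER = "Killed signal terminated program cc1plus"


def killed_targets(log_text):
    """Return targets whose FAILED block contains the cc1plus kill message."""
    targets = []
    current = None  # most recent FAILED target not yet attributed to a cause
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            current = match.group(1)
        elif KILLED_MARKER in line and current is not None:
            targets.append(current)
            current = None
    return targets
```

Applied to the excerpt above, this would report the `np_norm_forward.cc.o` and `np_broadcast_reduce_op_value.cc.o` targets. Failures caused by ordinary compile errors are skipped because they lack the kill message.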


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] leezu commented on issue #19623: out of memory during compilation on CI

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #19623:
URL: https://github.com/apache/incubator-mxnet/issues/19623#issuecomment-738503383


   It's possible to work around this error by retriggering CI a few times.




[GitHub] [incubator-mxnet] mseth10 commented on issue #19623: out of memory during compilation on CI

Posted by GitBox <gi...@apache.org>.
mseth10 commented on issue #19623:
URL: https://github.com/apache/incubator-mxnet/issues/19623#issuecomment-747088098


   Object file sizes (>10 MB) on a macOS CPU build:
   ```
    11M	./operator/numpy/linalg/np_norm_backward.cc.o
    11M	./operator/numpy/np_kron.cc.o
    11M	./operator/numpy/random/np_location_scale_op.cc.o
    12M	./operator/numpy/np_insert_op_slice.cc.o
    12M	./operator/numpy/np_insert_op_tensor.cc.o
    13M	./operator/numpy/np_elemwise_broadcast_op_extended_sec.cc.o
    13M	./operator/numpy/np_elemwise_unary_op_basic.cc.o
    13M	./operator/numpy/np_percentile_op.cc.o
    14M	./operator/numpy/np_matrix_op.cc.o
    14M	./operator/numpy/np_moments_op.cc.o
    14M	./operator/numpy/np_where_op.cc.o
    15M	./operator/numpy/np_einsum_op.cc.o
    16M	./operator/numpy/np_elemwise_broadcast_op_extended.cc.o
    21M	./operator/numpy/np_broadcast_reduce_op_value.cc.o
    22M	./operator/numpy/linalg/np_norm_forward.cc.o
    24M	./operator/numpy/np_elemwise_broadcast_op.cc.o
    34M	./operator/numpy/np_elemwise_broadcast_logic_op.cc.o
   ```
   We can start by splitting the corresponding .cc files (largest first) to reduce the compiler's memory footprint.
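
A listing like the one above can be regenerated after each split to check progress. The following is a minimal sketch, assuming the object files live under a single build directory. The function name `large_objects` and the 10 MiB default threshold are illustrative choices, not taken from the MXNet build scripts. Note that on-disk object size is only a rough proxy for the compiler's peak memory use.

```python
import os


def large_objects(build_dir, min_bytes=10 * 1024 * 1024, suffix=".o"):
    """Return (size_in_bytes, relative_path) pairs for object files larger
    than min_bytes, sorted smallest first like the listing above."""
    found = []
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(root, name)
                size = os.path.getsize(path)
                if size > min_bytes:
                    found.append((size, os.path.relpath(path, build_dir)))
    return sorted(found)
```

For example, `for size, path in large_objects("build"): print(f"{size / 2**20:.0f}M\t{path}")` would print one line per large object, where `"build"` is a placeholder for the actual build directory.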

