Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/12/11 07:38:49 UTC

[GitHub] [incubator-mxnet] leezu opened a new issue #17045: Relocation truncation issues

URL: https://github.com/apache/incubator-mxnet/issues/17045
 
 
   ## Description
   Depending on compile options, `libmxnet.so` becomes so large that linking fails. This was observed before on CI with test coverage functionality enabled (https://github.com/apache/incubator-mxnet/issues/15971), but it can also happen with non-test-coverage builds, such as a `-DUSE_INT64_TENSOR_SIZE=ON` build.
   
   I first observed this in #17031 (http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-gpu/branches/PR-17031/runs/6/nodes/52/steps/84/log/?start=0), but it can easily be reproduced on the master branch when building with GCC 7.4.
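
For context, relocations such as `R_X86_64_PC32` store a signed 32-bit displacement (symbol address plus addend minus the patched location), so the linker reports a truncation once that distance no longer fits in roughly ±2 GiB. A minimal sketch of that range check follows; the function names and addresses are hypothetical and only illustrate the arithmetic, not MXNet's actual memory layout.

```python
# Range check behind "relocation truncated to fit" for R_X86_64_PC32.
# All names and addresses here are illustrative assumptions.

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def pc32_displacement(symbol_addr: int, addend: int, place: int) -> int:
    """Value the relocation wants to store: S + A - P."""
    return symbol_addr + addend - place


def fits_pc32(symbol_addr: int, addend: int, place: int) -> bool:
    """True if the displacement fits in a signed 32-bit field."""
    return INT32_MIN <= pc32_displacement(symbol_addr, addend, place) <= INT32_MAX


# A target close to the patched instruction is fine.
near_ok = fits_pc32(0x0040_1000, -4, 0x0040_0000)

# A target more than 2 GiB away cannot be encoded and would be truncated.
far_ok = fits_pc32(0x9000_0000, -4, 0x0040_0000)

print(near_ok, far_ok)
```

When the combined size of the sections placed between the referencing code and its target grows past that limit, every such reference fails in the same way.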
   
   ### Error Message
   From the CI
   
   ```
   /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o: In function `_init':
   (.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o: In function `deregister_tm_clones':
   crtstuff.c:(.text+0x3): relocation truncated to fit: R_X86_64_PC32 against `.tm_clone_table'
   crtstuff.c:(.text+0xa): relocation truncated to fit: R_X86_64_PC32 against symbol `__TMC_END__' defined in .nvFatBinSegment section in libmxnet.so
   crtstuff.c:(.text+0x1e): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `_ITM_deregisterTMCloneTable'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o: In function `register_tm_clones':
   crtstuff.c:(.text+0x43): relocation truncated to fit: R_X86_64_PC32 against `.tm_clone_table'
   crtstuff.c:(.text+0x4a): relocation truncated to fit: R_X86_64_PC32 against symbol `__TMC_END__' defined in .nvFatBinSegment section in libmxnet.so
   crtstuff.c:(.text+0x6b): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `_ITM_registerTMCloneTable'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o: In function `__do_global_dtors_aux':
   crtstuff.c:(.text+0x92): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   crtstuff.c:(.text+0x9c): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `__cxa_finalize@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
   crtstuff.c:(.text+0xaa): relocation truncated to fit: R_X86_64_PC32 against symbol `__dso_handle' defined in .data.rel.local section in /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o
   crtstuff.c:(.text+0xbb): additional relocation overflows omitted from the output
   libmxnet.so: PC-relative offset overflow in PLT entry for `_ZN5mxnet2op8mxnet_op6KernelINS0_9pick_gradILi3ELb0EEEN7mshadow3gpuEE6LaunchIJPdS9_PfiiNS5_5ShapeILi3EEESC_EEEvPNS5_6StreamIS6_EEiDpT_'
   collect2: error: ld returned 1 exit status
   FAILED: : && /tmp/ccache-redirects/g++  -mf16c -Wall -Wno-unknown-pragmas -Wno-sign-compare -O3 -msse3 -std=c++11 -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -fopenmp -std=c++0x -O3 -DNDEBUG   tests/CMakeFiles/mxnet_unit_tests.dir/cpp/engine/engine_shutdown_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/engine/thread_local_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/engine/threaded_engine_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/kvstore/gpu_topology_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/libinfo_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/activation_perf.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/batchnorm_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/coreop_perf.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/dropout_perf.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/fully_conn_perf.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/krprod_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/mkldnn_operator_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/mkldnn_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/runner/core_op_runner_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/slice_channel_perf.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/operator/tune/operator_tune_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/storage/storage_test.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cpp/test_main.cc.o tests/CMakeFiles/mxnet_unit_tests.dir/cmake_device_link.o  -o tests/mxnet_unit_tests -L/usr/local/cuda/lib64  -L/work/build/3rdparty/tvm  -L/usr/local/cuda/targets/x86_64-linux/lib -Wl,-rpath,/usr/local/cuda/lib64:/work/build/3rdparty/openmp/runtime/src:/work/build/3rdparty/tvm lib/libgtest.a -Wl,--whole-archive libmxnet.a -Wl,--no-whole-archive 3rdparty/dmlc-core/libdmlc.a /usr/local/cuda/lib64/libnvToolsExt.so 
/usr/lib/libopenblas.so /usr/lib/x86_64-linux-gnu/librt.so /usr/lib/x86_64-linux-gnu/libjemalloc.so /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9 /usr/lib/x86_64-linux-gnu/libopencv_imgproc.so.2.4.9 3rdparty/openmp/runtime/src/libomp.so -lpthread -llapack /usr/lib/x86_64-linux-gnu/libjemalloc.so /usr/lib/x86_64-linux-gnu/libcudnn.so -lcublas -lcufft -lcusolver -lcurand -lnvrtc -lcuda /usr/lib/x86_64-linux-gnu/libprotobuf.so /usr/lib/x86_64-linux-gnu/libzmq.so 3rdparty/ps-lite/libpslite.a -lprotobuf -ltvm_runtime /usr/lib/x86_64-linux-gnu/libzmq.so 3rdparty/ps-lite/libpslite.a -lprotobuf -lrt -lpthread -llapack /usr/lib/x86_64-linux-gnu/libcudnn.so -lcublas -lcufft -lcusolver -lcurand -lnvrtc -lcuda /usr/lib/x86_64-linux-gnu/libprotobuf.so /usr/lib/x86_64-linux-gnu/libzmq.so -lprotobuf -ltvm_runtime /usr/lib/x86_64-linux-gnu/libzmq.so -lprotobuf -ltvm_runtime /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4.9 -ldl -lpthread -lcudadevrt -lcudart_static -lrt -lpthread -ldl && :
   /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.eh_frame+0x20): relocation truncated to fit: R_X86_64_PC32 against `.text'
   /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o: In function `_init':
   (.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o: In function `deregister_tm_clones':
   crtstuff.c:(.text+0x8): relocation truncated to fit: R_X86_64_32S against `.tm_clone_table'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o: In function `register_tm_clones':
   crtstuff.c:(.text+0x49): relocation truncated to fit: R_X86_64_32S against `.tm_clone_table'
   /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o: In function `__do_global_dtors_aux':
   crtstuff.c:(.text+0x82): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   crtstuff.c:(.text+0x95): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   tests/CMakeFiles/mxnet_unit_tests.dir/cpp/engine/engine_shutdown_test.cc.o: In function `EngineShutdown_stop_without_crashing_Test::TestBody()':
   engine_shutdown_test.cc:(.text+0xf8): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   engine_shutdown_test.cc:(.text+0x130): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   engine_shutdown_test.cc:(.text+0x137): relocation truncated to fit: R_X86_64_PC32 against `.bss'
   engine_shutdown_test.cc:(.text+0x15d): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `__pthread_key_create@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libpthread.so.0
   engine_shutdown_test.cc:(.text+0x18d): additional relocation overflows omitted from the output
   tests/mxnet_unit_tests: PC-relative offset overflow in PLT entry for `nvrtcGetPTX@@libnvrtc.so.10.1'
   collect2: error: ld returned 1 exit status
   ```
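
To get a quick overview of which relocation types overflow in a log like the one above, the messages can be tallied. This is a rough sketch: the helper name `count_truncated_relocations` is made up for illustration, and the sample lines are copied from the CI output above.

```python
import re
from collections import Counter

# Matches the relocation type in ld's "relocation truncated to fit" messages.
RELOC_RE = re.compile(r"relocation truncated to fit: (R_X86_64_\w+)")


def count_truncated_relocations(log_text: str) -> Counter:
    """Count truncated relocations per relocation type in linker output."""
    return Counter(RELOC_RE.findall(log_text))


# A few lines taken from the CI log above.
sample_log = """\
(.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
crtstuff.c:(.text+0x3): relocation truncated to fit: R_X86_64_PC32 against `.tm_clone_table'
crtstuff.c:(.text+0x92): relocation truncated to fit: R_X86_64_PC32 against `.bss'
crtstuff.c:(.text+0x8): relocation truncated to fit: R_X86_64_32S against `.tm_clone_table'
"""

counts = count_truncated_relocations(sample_log)
for reloc_type, n in counts.most_common():
    print(f"{reloc_type}: {n}")
```

Note that ld stops printing after a fixed number of messages ("additional relocation overflows omitted from the output"), so the counts are a lower bound.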
   
   Compiling the master version with GCC on Ubuntu 18.04 (Deep Learning AMI) gives an equivalent error message, though with slightly different wording due to GCC vs Clang.
   
   ## To Reproduce
   `cmake -DUSE_SIGNAL_HANDLER=ON -DUSE_CUDA=ON -DUSE_CUDNN=ON -DPython3_EXECUTABLE=/usr/bin/python3 -DUSE_MKL_IF_AVAILABLE=OFF -DUSE_MKLDNN=OFF -DUSE_DIST_KVSTORE=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH_NAME=Manual -DCUDA_ARCH_BIN=52,70 -DUSE_INT64_TENSOR_SIZE=ON ..`
   
   on Ubuntu 18.04 (GCC 7.4, ld 2.30), where the CMake options are taken from the `build_ubuntu_gpu_large_tensor` CI run.
   
   ## Environment
   Environment used to reproduce the error with the master version of MXNet.
   
   ```
   ----------Python Info----------
   Version      : 3.8.0
   Compiler     : GCC 7.4.0
   Build        : ('default', 'Dec  8 2019 08:07:09')
   Arch         : ('64bit', 'ELF')
   ------------Pip Info-----------
   Version      : 19.2.3
   Directory    : /home/ubuntu/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.6.0
   Directory    : /home/ubuntu/src/mxnet-dc/python/mxnet
   Num GPUs     : 0
   Hashtag not found. Not installed from pre-built package.
   ----------System Info----------
   Platform     : Linux-4.15.0-1056-aws-x86_64-with-glibc2.27
   system       : Linux
   node         : ip-172-31-26-35
   release      : 4.15.0-1056-aws
   version      : #58-Ubuntu SMP Tue Nov 26 15:14:34 UTC 2019
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:        x86_64
   CPU op-mode(s):      32-bit, 64-bit
   Byte Order:          Little Endian
   CPU(s):              96
   On-line CPU(s) list: 0-95
   Thread(s) per core:  2
   Core(s) per socket:  24
   Socket(s):           2
   NUMA node(s):        2
   Vendor ID:           GenuineIntel
   CPU family:          6
   Model:               85
   Model name:          Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
   Stepping:            7
   CPU MHz:             3600.024
   BogoMIPS:            6000.00
   Hypervisor vendor:   KVM
   Virtualization type: full
   L1d cache:           32K
   L1i cache:           32K
   L2 cache:            1024K
   L3 cache:            36608K
   NUMA node0 CPU(s):   0-23,48-71
   NUMA node1 CPU(s):   24-47,72-95
   Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0021 sec, LOAD: 0.3891 sec.
   Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0003 sec, LOAD: 0.3134 sec.
   Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0450 sec, LOAD: 0.0738 sec.
   Timing for D2L: http://d2l.ai, DNS: 0.0034 sec, LOAD: 0.0103 sec.
   Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0159 sec, LOAD: 0.1406 sec.
   Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0432 sec, LOAD: 0.3530 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0021 sec, LOAD: 0.0701 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0313 sec, LOAD: 0.1727 sec.
   
   ```
   
