Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2018/05/03 18:12:00 UTC

[jira] [Resolved] (KUDU-2422) Subprocess hang at startup in TSAN builds

     [ https://issues.apache.org/jira/browse/KUDU-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-2422.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.8.0

> Subprocess hang at startup in TSAN builds
> -----------------------------------------
>
>                 Key: KUDU-2422
>                 URL: https://issues.apache.org/jira/browse/KUDU-2422
>             Project: Kudu
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 1.8.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 1.8.0
>
>
> I'm seeing a small percentage of test timeouts caused by a hang during subprocess startup. I managed to catch one in the act and attach to the subprocess in gdb. It's hung before 'exec' here:
> {code}
> * 1    Thread 0x7f1feebdb440 (LWP 9453) "disk_failure-it" __sanitizer::internal_sched_yield () at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/sanitizer_common/sanitizer_linux.cc:414
> (gdb) bt
> #0  __sanitizer::internal_sched_yield () at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/sanitizer_common/sanitizer_linux.cc:414
> #1  0x00000000004c39db in Do (this=<synthetic pointer>) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_mutex.cc:195
> #2  __tsan::Mutex::Lock (this=this@entry=0x600001000000) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_mutex.cc:235
> #3  0x00000000004c72ed in GenericScopedLock (mu=0x600001000000, this=<synthetic pointer>) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_mutex.h:189
> #4  TraceSwitch (thr=<optimized out>) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_rtl.cc:552
> #5  __tsan::__tsan_trace_switch () at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_rtl.cc:581
> #6  0x00000000004da3df in __tsan_trace_switch_thunk () at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_rtl_amd64.S:53
> #7  0x00000000004cfabc in TraceAddEvent (thr=<optimized out>, addr=0, typ=__tsan::EventTypeFuncExit, fs=...) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_rtl.h:845
> #8  FuncExit (thr=<optimized out>) at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_rtl.cc:997
> #9  __tsan_func_exit () at /data/1/todd/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_inl.h:108
> #10 0x00007f1fe06a3ee5 in safe_strtou32_base (str=<optimized out>, value=<optimized out>, base=<optimized out>) at ../../src/kudu/gutil/strings/numbers.cc:717
> #11 0x00007f1fe06a4426 in safe_strtou32 (str=0xfffffffffffc04c0 <error: Cannot access memory at address 0xfffffffffffc04c0>, value=0x1fffff) at ../../src/kudu/gutil/strings/numbers.cc:773
> #12 0x00007f1fe288e42f in kudu::(anonymous namespace)::CloseNonStandardFDs (fd_dir=0x7bb400000000) at ../../src/kudu/util/subprocess.cc:144
> #13 0x00007f1fe288d3b4 in kudu::Subprocess::Start (this=<optimized out>) at ../../src/kudu/util/subprocess.cc:418
> #14 0x00007f1fee0ddcf0 in kudu::cluster::ExternalDaemon::StartProcess (this=<optimized out>, user_flags=...) at ../../src/kudu/mini-cluster/external_mini_cluster.cc:821
> {code}
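
For readers following the backtrace above: the forked child is yielding in a loop on a TSAN-internal mutex, reached from an instrumented call (safe_strtou32) made by CloseNonStandardFDs before 'exec'. The sketch below only illustrates what that fd-closing step does, using a small local parser instead of a call into another library. ParseFdName and CloseFdsAboveStderr are hypothetical names, not Kudu's API, and this is not necessarily the change that resolved the issue. It assumes the directory stream (for example /proc/self/fd) was opened before fork(). Code built with TSAN is still instrumented, and readdir() is not formally async-signal-safe, so treat this as a sketch rather than a fix.

```cpp
#include <dirent.h>
#include <unistd.h>
#include <climits>

// Hypothetical helper: parse a non-negative decimal fd number from a
// directory entry name. Returns false for empty strings, any non-digit
// character (so "." and ".." are rejected), or values that overflow int.
bool ParseFdName(const char* name, int* fd) {
  if (name == nullptr || *name == '\0') return false;
  int value = 0;
  for (const char* p = name; *p != '\0'; ++p) {
    if (*p < '0' || *p > '9') return false;
    const int digit = *p - '0';
    // Reject before multiplying so that value * 10 + digit cannot overflow.
    if (value > (INT_MAX - digit) / 10) return false;
    value = value * 10 + digit;
  }
  *fd = value;
  return true;
}

// Hypothetical helper: close every fd above stderr that is listed in an
// already-open directory stream, skipping the stream's own fd.
// Returns the number of descriptors successfully closed.
int CloseFdsAboveStderr(DIR* fd_dir) {
  int closed = 0;
  const int dir_fd = dirfd(fd_dir);
  struct dirent* ent;
  while ((ent = readdir(fd_dir)) != nullptr) {
    int fd;
    if (!ParseFdName(ent->d_name, &fd)) continue;
    if (fd <= STDERR_FILENO || fd == dir_fd) continue;
    if (close(fd) == 0) ++closed;
  }
  return closed;
}
```

In a fork/exec wrapper, the child would call CloseFdsAboveStderr(dir) on a stream the parent opened beforehand and then call exec immediately, keeping the work done between fork() and exec to a minimum.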


