You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Jiri Daněk (Jira)" <ji...@apache.org> on 2020/08/22 11:09:00 UTC

[jira] [Updated] (PROTON-2273) Test c-threaderciser can get stuck

     [ https://issues.apache.org/jira/browse/PROTON-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiri Daněk updated PROTON-2273:
-------------------------------
    Description: 
As observed in comments on PROTON-2225, the c-threaderciser test can occasionally get stuck.

The following is output after I set \{{}} in the test source, and run the test under valgrind (as it seems to be more likely to get stuck when running under valgrind).
{noformat}
while true; do
    /usr/bin/python "/home/jdanek/repos/qpid/qpid-proton/scripts/env.py" "--" "TEST_EXE_PREFIX=/usr/bin/valgrind --tool=memcheck --leak-check=full --error-exitcode=42 --quiet --suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/usr/bin/valgrind" "--tool=memcheck" "--leak-check=full" "--error-exitcode=42" "--quiet" --vgdb=yes "--suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/home/jdanek/repos/qpid/qpid-proton/cmake-build-debug/c/tests/c-threaderciser" | tee cthreadlog.txt; done
[...]
threaderciser start: threads=8, time=1, actions=[listen, close-listen, connect, close-connect, wake, timeout, cancel-timeout]
(4e2ab80) [0x4e2e220] listen
(6a2e700) user_thread start
(6a2e700) [0x4e2d140] timeout
(722f700) proactor_thread start
(8a32700) user_thread start
(722f700) [0x4e2e220] listening on 0.0.0.0:36767
(722f700) [0x4e31170] connect
(8231700) proactor_thread start
(7a30700) user_thread start
(6a2e700) [0x4e3a480] connect
(5a2c700) user_thread start
(4e2ab80) shut down
(5a2c700) user_thread end
(8a32700) [0x4e30840] listen
(622d700) proactor_thread start
(8231700) [0x4e39b50] listen
(7a30700) [0x4e3b360] listen
(9233700) proactor_thread start
(6a2e700) [0x4e3a480] wake
(622d700) [0x4e30840] listening on 0.0.0.0:39411
(622d700) [0x4e44e40] connect
(8a32700) user_thread end
(6a2e700) [0x4e2d140] timeout
(9233700) [0x4e3b360] listening on 0.0.0.0:35627
(9233700) [0x4e4d3f0] connect
(6a2e700) user_thread end
(7a30700) user_thread end
(4e2ab80) disconnect
(8231700) [0x4e39b50] listening on 0.0.0.0:45417
(8231700) proactor_thread end

[... I've let it run for over an hour at this point, but got no further output ...]{noformat}

GDB says

{noformat}

(gdb) thread apply all bt

Thread 4 (Thread 137904):
 #0 0x0000000004a1dd49 in epoll_wait (epfd=4, events=0x4e2db50, maxevents=16, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1 0x00000000048595ef in proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2623
 #2 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
 #3 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
 #4 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 3 (Thread 137900):
 #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e30304) at ../sysdeps/nptl/futex-internal.h:183
 #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e302b0, cond=0x4e302d8) at pthread_cond_wait.c:508
 #2 __pthread_cond_wait (cond=cond@entry=0x4e302d8, mutex=mutex@entry=0x4e302b0) at pthread_cond_wait.c:638
 #3 0x0000000004859bd4 in suspend (ts=0x4e302b0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
 #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
 #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
 #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
 #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
 #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 2 (Thread 137898):
 #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e43520) at ../sysdeps/nptl/futex-internal.h:183
 #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e434d0, cond=0x4e434f8) at pthread_cond_wait.c:508
 #2 __pthread_cond_wait (cond=cond@entry=0x4e434f8, mutex=mutex@entry=0x4e434d0) at pthread_cond_wait.c:638
 #3 0x0000000004859bd4 in suspend (ts=0x4e434d0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
 #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
 #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
 #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
 #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
 #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 1 (Thread 137869):
 #0 __pthread_clockjoin_ex (threadid=102946560, thread_return=0x1ffeffda48, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
 #1 0x000000000010b5ff in main (argc=argc@entry=1, argv=0x1ffeffda50, argv@entry=0x1ffeffdc98) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:583
 #2 0x000000000491d51d in __libc_start_main (main=0x10b225 <main>, argc=1, argv=0x1ffeffdc98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x1ffeffdc88) at ../csu/libc-start.c:308
 #3 0x000000000010a3ce in _start ()
{noformat}

  was:
As observed in comments on PROTON-2225, the c-threaderciser test can occasionally get stuck.

The following is output after I set \{{}} in the test source, and run the test under valgrind (as it seems to be more likely to get stuck when running under valgrind).

{noformat}
while true; do
    /usr/bin/python "/home/jdanek/repos/qpid/qpid-proton/scripts/env.py" "--" "TEST_EXE_PREFIX=/usr/bin/valgrind --tool=memcheck --leak-check=full --error-exitcode=42 --quiet --suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/usr/bin/valgrind" "--tool=memcheck" "--leak-check=full" "--error-exitcode=42" "--quiet" --vgdb=yes "--suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/home/jdanek/repos/qpid/qpid-proton/cmake-build-debug/c/tests/c-threaderciser" | tee cthreadlog.txt; done
[...]
threaderciser start: threads=8, time=1, actions=[listen, close-listen, connect, close-connect, wake, timeout, cancel-timeout]
(4e2ab80) [0x4e2e220] listen
(6a2e700) user_thread start
(6a2e700) [0x4e2d140] timeout
(722f700) proactor_thread start
(8a32700) user_thread start
(722f700) [0x4e2e220] listening on 0.0.0.0:36767
(722f700) [0x4e31170] connect
(8231700) proactor_thread start
(7a30700) user_thread start
(6a2e700) [0x4e3a480] connect
(5a2c700) user_thread start
(4e2ab80) shut down
(5a2c700) user_thread end
(8a32700) [0x4e30840] listen
(622d700) proactor_thread start
(8231700) [0x4e39b50] listen
(7a30700) [0x4e3b360] listen
(9233700) proactor_thread start
(6a2e700) [0x4e3a480] wake
(622d700) [0x4e30840] listening on 0.0.0.0:39411
(622d700) [0x4e44e40] connect
(8a32700) user_thread end
(6a2e700) [0x4e2d140] timeout
(9233700) [0x4e3b360] listening on 0.0.0.0:35627
(9233700) [0x4e4d3f0] connect
(6a2e700) user_thread end
(7a30700) user_thread end
(4e2ab80) disconnect
(8231700) [0x4e39b50] listening on 0.0.0.0:45417
(8231700) proactor_thread end

[... I've let it run for over an hour at this point, but got no further output ...]{noformat}

GDB says

{nofomat}
(gdb) thread apply all bt

Thread 4 (Thread 137904):
#0  0x0000000004a1dd49 in epoll_wait (epfd=4, events=0x4e2db50, maxevents=16, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00000000048595ef in proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2623
#2  0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
#3  0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
#4  0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
#5  0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 3 (Thread 137900):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e30304) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e302b0, cond=0x4e302d8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x4e302d8, mutex=mutex@entry=0x4e302b0) at pthread_cond_wait.c:638
#3  0x0000000004859bd4 in suspend (ts=0x4e302b0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
#4  proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
#5  0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
#6  0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
#7  0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 2 (Thread 137898):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e43520) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e434d0, cond=0x4e434f8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x4e434f8, mutex=mutex@entry=0x4e434d0) at pthread_cond_wait.c:638
#3  0x0000000004859bd4 in suspend (ts=0x4e434d0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
#4  proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
#5  0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
#6  0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
#7  0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6

Thread 1 (Thread 137869):
#0  __pthread_clockjoin_ex (threadid=102946560, thread_return=0x1ffeffda48, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
#1  0x000000000010b5ff in main (argc=argc@entry=1, argv=0x1ffeffda50, argv@entry=0x1ffeffdc98) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:583
#2  0x000000000491d51d in __libc_start_main (main=0x10b225 <main>, argc=1, argv=0x1ffeffdc98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x1ffeffdc88) at ../csu/libc-start.c:308
#3  0x000000000010a3ce in _start ()
{noformat}


> Test c-threaderciser can get stuck
> ----------------------------------
>
>                 Key: PROTON-2273
>                 URL: https://issues.apache.org/jira/browse/PROTON-2273
>             Project: Qpid Proton
>          Issue Type: Bug
>    Affects Versions: proton-c-0.32.0
>            Reporter: Jiri Daněk
>            Priority: Major
>
> As observed in comments on PROTON-2225, the c-threaderciser test can occasionally get stuck.
> The following is output after I set \{{}} in the test source, and run the test under valgrind (as it seems to be more likely to get stuck when running under valgrind).
> {noformat}
> while true; do
>     /usr/bin/python "/home/jdanek/repos/qpid/qpid-proton/scripts/env.py" "--" "TEST_EXE_PREFIX=/usr/bin/valgrind --tool=memcheck --leak-check=full --error-exitcode=42 --quiet --suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/usr/bin/valgrind" "--tool=memcheck" "--leak-check=full" "--error-exitcode=42" "--quiet" --vgdb=yes "--suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/home/jdanek/repos/qpid/qpid-proton/cmake-build-debug/c/tests/c-threaderciser" | tee cthreadlog.txt; done
> [...]
> threaderciser start: threads=8, time=1, actions=[listen, close-listen, connect, close-connect, wake, timeout, cancel-timeout]
> (4e2ab80) [0x4e2e220] listen
> (6a2e700) user_thread start
> (6a2e700) [0x4e2d140] timeout
> (722f700) proactor_thread start
> (8a32700) user_thread start
> (722f700) [0x4e2e220] listening on 0.0.0.0:36767
> (722f700) [0x4e31170] connect
> (8231700) proactor_thread start
> (7a30700) user_thread start
> (6a2e700) [0x4e3a480] connect
> (5a2c700) user_thread start
> (4e2ab80) shut down
> (5a2c700) user_thread end
> (8a32700) [0x4e30840] listen
> (622d700) proactor_thread start
> (8231700) [0x4e39b50] listen
> (7a30700) [0x4e3b360] listen
> (9233700) proactor_thread start
> (6a2e700) [0x4e3a480] wake
> (622d700) [0x4e30840] listening on 0.0.0.0:39411
> (622d700) [0x4e44e40] connect
> (8a32700) user_thread end
> (6a2e700) [0x4e2d140] timeout
> (9233700) [0x4e3b360] listening on 0.0.0.0:35627
> (9233700) [0x4e4d3f0] connect
> (6a2e700) user_thread end
> (7a30700) user_thread end
> (4e2ab80) disconnect
> (8231700) [0x4e39b50] listening on 0.0.0.0:45417
> (8231700) proactor_thread end
> [... I've let it run for over an hour at this point, but got no further output ...]{noformat}
> GDB says
> {noformat}
> (gdb) thread apply all bt
> Thread 4 (Thread 137904):
>  #0 0x0000000004a1dd49 in epoll_wait (epfd=4, events=0x4e2db50, maxevents=16, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
>  #1 0x00000000048595ef in proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2623
>  #2 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
>  #3 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
>  #4 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
>  #5 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6
> Thread 3 (Thread 137900):
>  #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e30304) at ../sysdeps/nptl/futex-internal.h:183
>  #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e302b0, cond=0x4e302d8) at pthread_cond_wait.c:508
>  #2 __pthread_cond_wait (cond=cond@entry=0x4e302d8, mutex=mutex@entry=0x4e302b0) at pthread_cond_wait.c:638
>  #3 0x0000000004859bd4 in suspend (ts=0x4e302b0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
>  #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
>  #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
>  #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
>  #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
>  #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6
> Thread 2 (Thread 137898):
>  #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e43520) at ../sysdeps/nptl/futex-internal.h:183
>  #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e434d0, cond=0x4e434f8) at pthread_cond_wait.c:508
>  #2 __pthread_cond_wait (cond=cond@entry=0x4e434f8, mutex=mutex@entry=0x4e434d0) at pthread_cond_wait.c:638
>  #3 0x0000000004859bd4 in suspend (ts=0x4e434d0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512
>  #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740
>  #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802
>  #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474
>  #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477
>  #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6
> Thread 1 (Thread 137869):
>  #0 __pthread_clockjoin_ex (threadid=102946560, thread_return=0x1ffeffda48, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
>  #1 0x000000000010b5ff in main (argc=argc@entry=1, argv=0x1ffeffda50, argv@entry=0x1ffeffdc98) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:583
>  #2 0x000000000491d51d in __libc_start_main (main=0x10b225 <main>, argc=1, argv=0x1ffeffdc98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x1ffeffdc88) at ../csu/libc-start.c:308
>  #3 0x000000000010a3ce in _start ()
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org