Posted to bugs@apr.apache.org by bu...@apache.org on 2014/01/25 17:22:33 UTC

[Bug 56063] New: testcond deadlocks on Tru64 V4.0F (PK8)

https://issues.apache.org/bugzilla/show_bug.cgi?id=56063

            Bug ID: 56063
           Summary: testcond deadlocks on Tru64 V4.0F (PK8)
           Product: APR
           Version: 1.5.0
          Hardware: DEC
                OS: OSF/1
            Status: NEW
          Severity: normal
          Priority: P2
         Component: APR test
          Assignee: bugs@apr.apache.org
          Reporter: urs.traber@gmail.com

Created attachment 31254
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31254&action=edit
same test based directly on pthreads

APR was built with:

-bash-3.2$ CFLAGS='-w2 -c99 -g' ./configure --enable-threads --disable-ipv6 --with-egd=/var/run/egd-pool


Attaching the debugger shows the following:

-bash-3.2$ ladebug -pid 26335 testall
Welcome to the Ladebug Debugger Version 4.0-49
------------------ 
object file name: testall 
Reading symbolic information ...done
Attached to process id 26335  ....


Interrupt (for process)

Stopping process localhost:26335 (testall).
Thread received signal INT
stopped at [<opaque> __nxm_thread_block(...) 0x3ff805b1028]    
(ladebug) show thread  
  Thread Name                      State           Substate    Policy       Pri
  ------ ------------------------- --------------- ----------- ------------ ---
>*    -1 manager thread            blk SCS                     SCHED_RR     19
       1 default thread            blocked         join 122    SCHED_OTHER  19
      -2 null thread for VP 1      running VP 1                null thread  -1
     122 <anonymous>               blocked         mut 108     SCHED_OTHER  19

Information:  An <opaque> type was presented during execution of the previous
command.  For complete type information on this symbol, recompilation of the
program will be necessary.  Consult the compiler man pages for details on
producing full symbol table information using the -g (and -gall for cxx) flags.

(ladebug) where thread 1  
Stack trace for thread 1
#0  0x3ff805b0e1c in __hstTransferRegisters(0x3ffc01b1470, 0x3ff8058d248,
0x3ffc01b1470, 0x3ff80598f04, 0x3ffc01b1470, 0x100000000) in
/usr/shlib/libpthread.so
#1  0x3ff80590514 in __hstTransferContext(0x3ffc01b1470, 0x3ff8058d248,
0x3ffc01b1470, 0x3ff80598f04, 0x3ffc01b1470, 0x100000000) in
/usr/shlib/libpthread.so
#2  0x3ff8058d3d8 in __dspDispatch(0x3ffc01b1470, 0x3ff8058d248, 0x3ffc01b1470,
0x3ff80598f04, 0x3ffc01b1470, 0x100000000) in /usr/shlib/libpthread.so
#3  0x3ff805a4780 in __pthread_join(0x3ffc01b1470, 0x3ff8058d248,
0x3ffc01b1470, 0x3ff80598f04, 0x3ffc01b1470, 0x100000000) in
/usr/shlib/libpthread.so
#4  0x3ffbfff2a88 in apr_thread_join(retval=0x11ffffb50, thd=0x140594e38)
"threadproc/unix/thread.c":217
#5  0x12002bef0 in nested_wait(tc=0x11ffffba0, data=0x11ffffbe0)
"testcond.c":340
#6  0x120008960 in abts_run_test(ts=0x14006e6c0, f=0x12002bcb8,
value=0x11ffffbe0) "abts.c":171
#7  0x12002d0f0 in testcond(suite=0x14006e6c0) "testcond.c":662
#8  0x120009620 in main(argc=2, argv=0x11ffffc48) "abts.c":429
#9  0x120008368 in __start(0x3ffc01b1470, 0x3ff8058d248, 0x3ffc01b1470,
0x3ff80598f04, 0x3ffc01b1470, 0x100000000) in testall

(ladebug) where thread 122  
Stack trace for thread 122
#0  0x3ff805b0e1c in __hstTransferRegisters(0x3ff80597744, 0x3ff8058d248,
0x140103880, 0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so
#1  0x3ff80590514 in __hstTransferContext(0x3ff80597744, 0x3ff8058d248,
0x140103880, 0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so
#2  0x3ff8058d3d8 in __dspDispatch(0x3ff80597744, 0x3ff8058d248, 0x140103880,
0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so
#3  0x3ff8058c780 in __cvWaitPrim(0x3ff80597744, 0x3ff8058d248, 0x140103880,
0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so
#4  0x3ff8058a368 in __pthread_cond_wait(0x3ff80597744, 0x3ff8058d248,
0x140103880, 0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so
#5  0x3ffbffe3334 in apr_thread_cond_wait(cond=0x140594e08, mutex=0x140594dd0)
"locks/unix/thread_cond.c":68
#6  0x12002bac8 in nested_lock_and_wait(box=0x11ffffb58) "testcond.c":275
#7  0x12002afb0 in thread_routine(thd=0x140594e38, data=0x11ffffb58)
"testcond.c":97
#8  0x3ffbfff2804 in dummy_worker(opaque=0x140594e38)
"threadproc/unix/thread.c":142
#9  0x3ff805a5eac in __thdBase(0x3ff80597744, 0x3ff8058d248, 0x140103880,
0x3ff80599200, 0x140103880, 0x0) in /usr/shlib/libpthread.so

(ladebug) show mutex 108  
Mutex  Name                      State Owner  Pri Type     Waiters (+Count)
------ ------------------------- ----- ------ --- -------- --------------------
   108 <anonymous>               Lock     122     Recurs   122
(ladebug) show condition with state == wait  
Cond   Name                      Mutex  Type  Waiters (+Count)
------ ------------------------- ------ ----- ---------------------------------
(ladebug) show condition  
Cond   Name                      Mutex  Type  Waiters (+Count)
------ ------------------------- ------ ----- ---------------------------------
     1 _exc_read_mutex+72(0x3ffc
     2 _exc_read_mutex+112(0x3ff
     3 .bss+72(0x3ffc0512f98)
     4 mq_server_data+72(0x3ffc0
    10 <anonymous>
(ladebug) show condition 10  
Cond   Name                      Mutex  Type  Waiters (+Count)
------ ------------------------- ------ ----- ---------------------------------
    10 <anonymous>
(ladebug) 


From the stack traces you can see that thread 1 is blocked in "nested_wait"
while joining thread 122. Thread 122 is inside pthread_cond_wait() but is shown
as blocked on mutex 108, which it owns itself. Nobody is waiting for any
condition variable.

The reason for this behaviour is the pthread implementation on Tru64 V4.0F and
probably other V4.0x versions. (The test runs fine on V5.1B (PK7) without
recompilation.) I've adapted and attached a little C program based on pthreads
only that does pretty much the same as the "nested_wait" test in testcond.c. It
shows the same behaviour on V4.0F and runs fine on V5.1B.
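
For readers without access to the attachment, the shape of such a test can be
sketched as follows. This is not the attached program: the struct, the function
names, the retry loop and the timeout are all assumptions made for
illustration. A worker locks a recursive mutex twice and then waits on a
condition variable, while the main thread signals without taking the mutex.
Unlike testcond.c, the sketch gives up after a few seconds instead of hanging
in the join.

```c
#define _XOPEN_SOURCE 600
#include <pthread.h>
#include <time.h>

/* Illustrative shared state; none of these names come from the attachment. */
struct box {
    pthread_mutex_t mutex;      /* recursive mutex under test */
    pthread_cond_t  cond;
    pthread_mutex_t flag_lock;  /* plain mutex guarding 'woken' */
    int woken;
};

static struct box b;            /* static so a stuck worker never sees freed memory */

static void pause_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Worker: take the recursive mutex twice, then wait on the condition. */
static void *nested_lock_and_wait(void *arg)
{
    struct box *bx = arg;

    pthread_mutex_lock(&bx->mutex);
    pthread_mutex_lock(&bx->mutex);            /* nested lock, depth 2 */
    /* No predicate on purpose: any return from the wait counts as "woken". */
    pthread_cond_wait(&bx->cond, &bx->mutex);

    pthread_mutex_lock(&bx->flag_lock);
    bx->woken = 1;
    pthread_mutex_unlock(&bx->flag_lock);

    pthread_mutex_unlock(&bx->mutex);
    pthread_mutex_unlock(&bx->mutex);
    return NULL;
}

/* Returns 0 if the worker woke up and was joined, 1 on setup failure,
 * 2 if the worker never came back from pthread_cond_wait(). */
int run_nested_wait(void)
{
    pthread_mutexattr_t attr;
    pthread_t thread;
    int i, woken = 0;

    b.woken = 0;
    if (pthread_mutexattr_init(&attr) != 0)
        return 1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0
        || pthread_mutex_init(&b.mutex, &attr) != 0
        || pthread_mutex_init(&b.flag_lock, NULL) != 0
        || pthread_cond_init(&b.cond, NULL) != 0)
        return 1;
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&thread, NULL, nested_lock_and_wait, &b) != 0)
        return 1;

    /* Signal repeatedly so that a signal sent before the worker waits
     * is not lost; stop after roughly three seconds. */
    for (i = 0; i < 300 && !woken; i++) {
        pthread_cond_signal(&b.cond);
        pause_ms(10);
        pthread_mutex_lock(&b.flag_lock);
        woken = b.woken;
        pthread_mutex_unlock(&b.flag_lock);
    }
    if (!woken)
        return 2;               /* joining here would hang as in the report */

    pthread_join(thread, NULL);
    return 0;
}
```

A small main() that returns run_nested_wait() is enough to run it. Note that
POSIX advises against combining recursive mutexes with condition variables,
because the implicit unlock in pthread_cond_wait() may not actually release a
mutex that has been locked more than once.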

Any ideas on how to disable recursive mutexes on Tru64 V4.0x? I found
APR_CHECK_PTHREAD_RECURSIVE_MUTEX.
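
For illustration, a probe of the kind a configure check could compile and run
might look roughly like this. It is only a sketch: the function name is made
up, and it is not claimed to be what APR_CHECK_PTHREAD_RECURSIVE_MUTEX actually
executes.

```c
#define _XOPEN_SOURCE 600
#include <pthread.h>

/* Returns 1 if a recursive mutex can be created and locked twice by the
 * same thread, 0 otherwise. The name is hypothetical. */
int have_recursive_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int ok = 0;

    if (pthread_mutexattr_init(&attr) != 0)
        return 0;

    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) == 0
        && pthread_mutex_init(&m, &attr) == 0) {
        if (pthread_mutex_lock(&m) == 0) {
            /* trylock, so that a mutex that is not really recursive
             * fails with EBUSY instead of deadlocking the probe */
            if (pthread_mutex_trylock(&m) == 0) {
                ok = 1;
                pthread_mutex_unlock(&m);
            }
            pthread_mutex_unlock(&m);
        }
        pthread_mutex_destroy(&m);
    }
    pthread_mutexattr_destroy(&attr);
    return ok;
}
```

A probe like this only shows that a recursive mutex can be created and
relocked. It would presumably still succeed on V4.0F, since the debugger lists
mutex 108 with type "Recurs", and it does not exercise pthread_cond_wait() at
all.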

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@apr.apache.org
For additional commands, e-mail: bugs-help@apr.apache.org


[Bug 56063] testcond deadlocks on Tru64 V4.0F (PK8)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56063

--- Comment #2 from Urs Traber <ur...@gmail.com> ---
Created attachment 31622
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31622&action=edit
using APR_THREAD_MUTEX_DEFAULT in apr_thread_pool

- apr_thread_pool.c: replace APR_THREAD_MUTEX_NESTED
                     with APR_THREAD_MUTEX_DEFAULT
- provide testthreadpool.c to make sure that the apr_thread_pool works with the
  replaced mutex attribute.
- fix multiple Compaq cc compiler warnings when building with -w2



[Bug 56063] testcond deadlocks on Tru64 V4.0F (PK8)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56063

--- Comment #3 from Urs Traber <ur...@gmail.com> ---
I have worked on a --disable-mutex-recursive switch to configure APR without
recursive mutexes. They remain enabled by default on platforms that support
them.

Of course, when APR is configured with --disable-mutex-recursive, all existing
uses of APR_THREAD_MUTEX_NESTED have to be found and dealt with. APR-util, for
example, uses it in thread_pool_construct(). I have replaced this mutex
attribute there and provided a test. Test coverage of apr_thread_pool could be
better, though.
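
As a general illustration of why a default mutex can be enough, code that would
otherwise relock its own mutex can be split into a public function that locks
and an internal helper that expects the lock to be held. The sketch below uses
plain pthreads so that it builds without APR. The pool structure and all
function names are hypothetical, and the thread does not say whether
apr_thread_pool ever actually relocked its mutex.

```c
#include <pthread.h>

/* Hypothetical task counter standing in for a pool's internal state. */
struct pool {
    pthread_mutex_t lock;   /* default (non-recursive) mutex */
    int tasks;
};

int pool_init(struct pool *p)
{
    p->tasks = 0;
    return pthread_mutex_init(&p->lock, NULL);
}

/* Internal helper: the caller must already hold p->lock. */
static void pool_add_locked(struct pool *p, int n)
{
    p->tasks += n;
}

/* Public entry point: takes the lock exactly once. */
void pool_add(struct pool *p, int n)
{
    pthread_mutex_lock(&p->lock);
    pool_add_locked(p, n);
    pthread_mutex_unlock(&p->lock);
}

/* Needs two steps under one lock. With a nested mutex it could call
 * pool_add() while holding the lock; with a default mutex it calls the
 * _locked helper instead, so the lock is never taken twice. */
void pool_add_pair(struct pool *p, int a, int b)
{
    pthread_mutex_lock(&p->lock);
    pool_add_locked(p, a);
    pool_add_locked(p, b);
    pthread_mutex_unlock(&p->lock);
}

int pool_count(struct pool *p)
{
    int n;

    pthread_mutex_lock(&p->lock);
    n = p->tasks;
    pthread_mutex_unlock(&p->lock);
    return n;
}
```

The cost of this pattern is that every internal call path has to be audited
for which variant it may use, which is where better test coverage helps.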

The two patches have been tested on:

o Tru64 UNIX V5.1B (Rev. 2650) PK7 (Alpha EV67)
 - Compaq C V6.5-303
 - APR: --recursive-mutex-enabled [ok]
 -   APR-util: [ok]
 - APR: --recursive-mutex-disabled [ok] 
 -   APR-util: [ok] 

o Digital UNIX V4.0F (Rev. 1229) PK8 (Alpha EV6)
 - Compaq C V6.5-303
 - APR: --recursive-mutex-disabled [ok] 
 -   APR-util: [ok] 

o Debian/Linux Wheezy (amd64)
 - gcc 4.7.2
 - APR: --recursive-mutex-enabled [ok]
 -   APR-util: [ok] 
 - APR: --recursive-mutex-disabled [ok] 
 -   APR-util: [ok] 

o Debian/Linux Sarge (Alpha EV56)
 - kernel 2.6.21
 - gcc 3.3.5
 - APR: --recursive-mutex-enabled [ok]
 -   APR-util: [testxml segfaults because of a broken libexpat 1.95.8-3] 
 - APR: --recursive-mutex-disabled [ok]
 -   APR-util: [testxml segfaults because of a broken libexpat 1.95.8-3]  

httpd also runs with these modifications on Tru64 V4.0F and V5.1B.

Regards
Urs



[Bug 56063] testcond deadlocks on Tru64 V4.0F (PK8)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56063

--- Comment #1 from Urs Traber <ur...@gmail.com> ---
Created attachment 31621
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31621&action=edit
--disable-mutex-recursive ; fixes issues on Tru64

- provide --disable-mutex-recursive to remove APR_THREAD_MUTEX_RECURSIVE
  from APR's interface where recursive mutexes are not supported (e.g. Tru64
  V4.0x)
- sockaddr.c: fixes Bug 14589: parse_ip() causes an "unaligned access"
              warning on Tru64 when configured with --enable-ipv6
- shm.c: added error checks where missing after mmap calls
- socket_util.c: fix compiler error/warning (has been fixed in APR 1.5.1)
- fix multiple Compaq cc compiler warnings when building with -w2
- fix some autotool warnings 
- apr.h.in: enable building DEC C++ code against APR
- testrand.c: added a test for apr_generate_random_bytes,
              apr_random_add_entropy and apr_random_insecure_ready to detect
              an endless loop in httpd under Tru64 when configured with
              --with-egd=/dev/random
