Posted to dev@apr.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2003/08/22 08:20:29 UTC

RE: unix/thread_mutex.c 1.19 and --enable-pool-debug=yes core dump

My thought is that the existing code is resilient enough without any
cas logic; but I have been waiting for others to take the time to prove
that to themselves before we back down to simple inc/dec/assignment.
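
As a minimal sketch of what backing down would look like (illustrative
type and field names, not the actual thread_mutex.c source): only the
owning thread ever touches the ref count, and a non-owner that reads a
stale owner id simply fails the ownership test and falls through to a
real lock acquisition, so plain reads and writes suffice.

--
apr_status_t nested_lock(my_mutex_t *m)
{
    apr_os_thread_t me = apr_os_thread_current();

    /* No CAS needed: only the owner writes owner/owner_ref while it
     * holds the lock, and a racing non-owner that reads a stale owner
     * id just fails this test and blocks below as usual.
     */
    if (m->nested && apr_os_thread_equal(m->owner, me)) {
        m->owner_ref++;              /* simple increment */
        return APR_SUCCESS;
    }

    /* ... acquire the underlying OS mutex here ... */

    m->owner = me;                   /* simple assignment, under lock */
    m->owner_ref = 1;
    return APR_SUCCESS;
}
--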

My other question - why is it necessary to explicitly use nested locks
within the pool code?  Do we have a nesting issue that could be fixed
and return to more optimal 'default' thread mutexes for pools?

Bill



At 11:13 PM 8/21/2003, Sander Striker wrote:
>> From: Blair Zajac [mailto:blair@orcaware.com]
>> Sent: Thursday, August 21, 2003 11:49 PM
>
>Hi Blair!
>
>> I tracked my core dump down to the following combination.
>> 
>> Using apr CVS HEAD with --enable-pool-debug=yes and with thread_mutex
>> 1.19 or greater causes the coredumps in apr/test/testall I reported in
>> my messages with the subject "testall core dump with apr HEAD".  Backing
>> thread_mutex.c down to 1.18 has no problems with --enable-pool-debug=yes.
>
>The introduction of atomics in the thread_mutex code.  Let's see...
>
>Program received signal SIGSEGV, Segmentation fault.
>[Switching to Thread 1074592640 (LWP 16810)]
>0x4003356e in apr_atomic_cas (mem=0x80c10b8, with=0, cmp=0) at apr_atomic.c:166
>166         apr_thread_mutex_t *lock = hash_mutex[ATOMIC_HASH(mem)];
>(gdb) bt
>#0  0x4003356e in apr_atomic_cas (mem=0x80c10b8, with=0, cmp=0)
>    at apr_atomic.c:166
>#1  0x40019944 in _r_debug ()
>   from /export/home1/blair/Code/Apache/2.0/h/srclib/apr/.libs/libapr-0.so.0
>#2  0x4002fe0a in apr_thread_mutex_lock (mutex=0x0) at thread_mutex.c:129
>#3  0x4003219a in apr_pool_create_ex_debug (newpool=0xbfffed40,
>    parent=0x80c1048, abort_fn=0, allocator=0x0,
>    file_line=0x40035a83 "start.c:96") at apr_pools.c:1540
>#4  0x4002e512 in apr_initialize () at start.c:96
>
>Right.  The core dump happens when apr_atomic_init() hasn't been called yet.
>
>From apr_initialize()  [misc/unix/start.c]
>--
>    if ((status = apr_pool_initialize()) != APR_SUCCESS)
>        return status;
>
>    if (apr_pool_create(&pool, NULL) != APR_SUCCESS) {
>        return APR_ENOPOOL;
>    }
>
>    apr_pool_tag(pool, "apr_initialize");
>
>    if ((status = apr_atomic_init(pool)) != APR_SUCCESS) {
>        return status;
>--
>
>Since pools depend on apr_thread_mutex-es, the atomics change might
>have introduced a circular dependency, and the debug code is where
>we're getting bitten by it.  Stepping through the code should
>pinpoint where the thread mutexes are first used before
>apr_atomic_init() is called.  If you get around to that before I do,
>please post ;) :).
> 
>> I'm on RedHat 9 and I've used the RedHat gcc 3.2.2 and my own gcc 3.3.1
>> and they both have the same behavior.
>> 
>> I haven't received any response to my previous messages, so I'd
>> appreciate some advice on getting a response from the APR
>> developers.  It seems that messages from APR newbies who ask
>> questions implying they did no homework on the APR documentation
>> get more of a response than somebody who has provided a decent
>> amount of feedback on a core dump from an APR-supplied program,
>> which isn't obviously operator error.  So the question is: what's
>> missing from my reports to get a response?
>
>Nothing.  Sorry for the late reply.
> 
>> I do like to contribute, so anything to make this easier for me
>> would be appreciated.
>> 
>> I'm hoping I've narrowed this problem down to a small enough
>> changeset where somebody who's familiar with the locking code
>> can make quick progress on this bug.
>> 
>> BTW, I'm using --enable-pool-debug=yes with APR as part of
>> Subversion.
>
>
>Sander
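
For reference, here is a rough sketch of the generic mutex-based CAS
fallback implied by the backtrace above.  It is reconstructed from the
names visible at apr_atomic.c:166, not copied from the source, but it
shows why a call before apr_atomic_init() blows up: the hash_mutex
table is only populated by apr_atomic_init(), so the looked-up lock is
not yet valid.

--
apr_uint32_t apr_atomic_cas(volatile apr_uint32_t *mem,
                            apr_uint32_t with,
                            apr_uint32_t cmp)
{
    apr_uint32_t prev;

    /* hash_mutex[] is set up by apr_atomic_init(); before that this
     * lookup yields an invalid (NULL) mutex and the lock call below
     * faults, matching frames #0-#2 of the backtrace.
     */
    apr_thread_mutex_t *lock = hash_mutex[ATOMIC_HASH(mem)];

    apr_thread_mutex_lock(lock);
    if ((prev = *mem) == cmp) {
        *mem = with;
    }
    apr_thread_mutex_unlock(lock);

    return prev;
}
--

And because thread_mutex.c 1.19 makes apr_thread_mutex_lock() itself
call into the atomics, the first pool-debug mutex operation in
apr_initialize() (from apr_pool_create_ex_debug, before
apr_atomic_init() has run) lands on exactly this path: pools need
mutexes, the mutexes now need atomics, and the atomics need a pool.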



RE: unix/thread_mutex.c 1.19 and --enable-pool-debug=yes core dump

Posted by Sander Striker <st...@apache.org>.
> From: William A. Rowe, Jr. [mailto:wrowe@rowe-clan.net]
> Sent: Friday, August 22, 2003 8:20 AM

> My thought is that the existing code is resilient enough without any
> cas logic; but I have been waiting for others to take the time to prove
> that to themselves before we back down to simple inc/dec/assignment.
> 
> My other question - why is it necessary to explicitly use nested locks
> within the pool code?  Do we have a nesting issue that could be fixed
> and return to more optimal 'default' thread mutexes for pools?

This is only for the pool debug code.  From apr_pool_create_ex_debug:

        /* No matter what the creation flags say, always create
         * a lock.  Without it integrity_check and apr_pool_num_bytes
         * blow up (because they traverse pools child lists that
         * possibly belong to another thread, in combination with
         * the pool having no lock).  However, this might actually
         * hide problems like creating a child pool of a pool
         * belonging to another thread.
         */
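
Condensed, the debug variant unconditionally does something like the
following (illustrative, not the literal apr_pools.c code; the NESTED
flag is the nesting Bill is asking about):

--
#if APR_HAS_THREADS
    /* Created no matter what the creation flags say, so that
     * integrity_check() and apr_pool_num_bytes() can safely walk
     * child lists that may belong to another thread.
     */
    if ((rv = apr_thread_mutex_create(&pool->mutex,
                                      APR_THREAD_MUTEX_NESTED,
                                      pool)) != APR_SUCCESS) {
        return rv;
    }
#endif /* APR_HAS_THREADS */
--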


Sander