Posted to dev@apr.apache.org by Stas Bekman <st...@stason.org> on 2004/05/20 11:18:40 UTC

segfault in apr_bucket_delete

Doing just:

   apr_brigade_create(p, ba);

and leaving here segfaults:

#0  0x4017a81f in apr_brigade_cleanup (data=0x93e7110) at apr_brigade.c:47
47              apr_bucket_delete(e);
(gdb) bt
#0  0x4017a81f in apr_brigade_cleanup (data=0x93e7110) at apr_brigade.c:47
#1  0x4017a7d6 in brigade_cleanup (data=0x93e7110) at apr_brigade.c:33
#2  0x40277e4e in run_cleanups (cref=0x93d9500) at apr_pools.c:1997
#3  0x402775eb in apr_pool_destroy (pool=0x93d94f0) at apr_pools.c:763
#4  0x402775ad in apr_pool_clear (pool=0x93d34d8) at apr_pools.c:723
#5  0x080d9a76 in child_main (child_num_arg=1) at prefork.c:528
#6  0x080d9dbb in make_child (s=0x81420c0, slot=1) at prefork.c:703
#7  0x080d9e30 in startup_children (number_to_start=1) at prefork.c:721
#8  0x080da235 in ap_mpm_run (_pconf=0x813d0a8, plog=0x81851c8, s=0x81420c0)
     at prefork.c:940
#9  0x080e0ea9 in main (argc=9, argv=0xbffff264) at main.c:619
(gdb) p e

Expanding apr_bucket_delete inline in apr_brigade_cleanup gives:

APU_DECLARE(apr_status_t) apr_brigade_cleanup(void *data)
{
     apr_bucket_brigade *b = data;
     apr_bucket *e;

     while (!APR_BRIGADE_EMPTY(b)) {
         e = APR_BRIGADE_FIRST(b);
         APR_RING_UNSPLICE((e), (e), link);
         (e)->type->destroy((e)->data);
         (e)->free(e);
         //apr_bucket_delete(e);
     }
     /*
      * We don't need to free(bb) because it's allocated from a pool.
      */
     return APR_SUCCESS;
}

brings us to APR_RING_UNSPLICE, which segfaults doing:

   APR_RING_NEXT(APR_RING_PREV((ep1), link), link) = ...;

gdb> p *e
$2 = {link = {next = 0x0, prev = 0x9402a60}, type = 0x4048c600, length = 2,
   start = 0, data = 0x9403180, free = 0x8067014, list = 0x6c24202c}
...
gdb> p (e->link.prev)->link
$5 = {next = 0x0, prev = 0x93d7660}
(gdb) p (e->link.prev)->link.next
$6 = (struct apr_bucket *) 0x0

which translates to:

    0x0 = ...;

boom, segfault. Not sure where the right place is to add a check without a
speed penalty.

And please add this case to the apr test suite. It's painful to discover this
kind of segfault when trying to test the glue code. Thanks.

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: segfault in apr_bucket_delete

Posted by Cliff Woolley <jw...@virginia.edu>.
On Sun, 23 May 2004, Noah Misch wrote:

> Perhaps a ``configure'' option for bucket debugging is in order?  For that
> matter, why not make it the default; most folks who build their own APU do so
> for a development project, so intuitive failure modes will trump efficiency in
> early use.  I can roll a patch, if appropriate.

I feel like I went to implement that at some point and ran into trouble...
but at this point I don't remember what that would have been.  Maybe
something to do with certain flags being shared between apr and apr-util.
Anyway if you want to do it, feel free.  :)

--Cliff

Re: segfault in apr_bucket_delete

Posted by Noah Misch <no...@cs.caltech.edu>.
On Sat, May 22, 2004 at 04:21:02PM -0400, Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
> 
> > I understand all that, but I guess I fail to pass the point across. It is not
> > a problem that I encounter in my code. On the contrary I'm writing tests that
> > exercise, both valid and invalid ways the API can be called. API that hangs
> > when called in invalid way is a problem. Don't you think?
> >
> >    APR_BUCKET_INSERT_BEFORE(fb, db);
> 
> The thing is, it would not be this macro that hangs.  All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation).  The only way to detect
> that such corruption will occur is to check the entire ring...  that's a
> linear time checking operation tacked on to a constant time insertion
> operation... not acceptable.  :)  However, if you compile with bucket
> debugging turned on, those validity checks WILL be done.

Perhaps a ``configure'' option for bucket debugging is in order?  For that
matter, why not make it the default; most folks who build their own APU do so
for a development project, so intuitive failure modes will trump efficiency in
early use.  I can roll a patch, if appropriate.

Re: segfault in apr_bucket_delete

Posted by Cliff Woolley <jw...@virginia.edu>.
On Sat, 22 May 2004, Stas Bekman wrote:

> > that brigade will be corrupted by this operation).
>
> Do you suggest that the sample program that I posted doesn't hang in that
> macro, but after it?

That should be correct, yes.  You'll end up creating a loop in the
brigade, and walking through the brigade will thus hang.  The first time
you walk through the brigade in that code is when you clean it up.

> I guess that works for me. If in the future someone reports a problem, I can
> suggest to them what you've prescribed above. It's just that there could be
> other reasons for the hanging, which is usually hard to figure out w/o being
> in the user's shoes.
>
> Thanks Cliff and Joe.

You're welcome.  :)

--Cliff

Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
> 
> 
>>I understand all that, but I guess I fail to pass the point across. It is not
>>a problem that I encounter in my code. On the contrary I'm writing tests that
>>exercise, both valid and invalid ways the API can be called. API that hangs
>>when called in invalid way is a problem. Don't you think?
>>
>>   APR_BUCKET_INSERT_BEFORE(fb, db);
> 
> 
> The thing is, it would not be this macro that hangs.  All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation). 

Do you suggest that the sample program that I posted doesn't hang in that
macro, but after it? I didn't step through to check; I just saw that when I
remove it or fix the order, things work just fine, so that could well be the
case. I need to check that.

> The only way to detect
> that such corruption will occur is to check the entire ring...  that's a
> linear time checking operation tacked on to a constant time insertion
> operation... not acceptable.  :)  

Absolutely!

> However, if you compile with bucket
> debugging turned on, those validity checks WILL be done.

I guess that works for me. If in the future someone reports a problem, I can 
suggest to them what you've prescribed above. It's just that there could be 
other reasons for the hanging, which is usually hard to figure out w/o being 
in the user's shoes.

Thanks Cliff and Joe.


Re: segfault in apr_bucket_delete

Posted by Cliff Woolley <jw...@virginia.edu>.
On Fri, 21 May 2004, Stas Bekman wrote:

> I understand all that, but I guess I fail to pass the point across. It is not
> a problem that I encounter in my code. On the contrary I'm writing tests that
> exercise, both valid and invalid ways the API can be called. API that hangs
> when called in invalid way is a problem. Don't you think?
>
>    APR_BUCKET_INSERT_BEFORE(fb, db);

The thing is, it would not be this macro that hangs.  All this macro can
do is segfault (if one of the pointers is null, meaning the brigade was
previously corrupted), or do what it's supposed to do (though in doing so
it could potentially corrupt some other brigade, which is what happens
here -- if the bucket being inserted is still in a brigade, as db is, then
that brigade will be corrupted by this operation).  The only way to detect
that such corruption will occur is to check the entire ring...  that's a
linear time checking operation tacked on to a constant time insertion
operation... not acceptable.  :)  However, if you compile with bucket
debugging turned on, those validity checks WILL be done.
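
To illustrate why such a check costs linear time, here is a minimal,
self-contained sketch. It does not use APR and is not APR's actual debugging
code: `struct node`, `node_init`, `node_insert_before` and `ring_check` are
hypothetical names that only model a ring with a sentinel head. The insert
touches four pointers regardless of ring size, while the check has to visit
every element to confirm that each forward link has a matching backward link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket's ring link; not APR's real types. */
struct node {
    struct node *next;
    struct node *prev;
};

/* A sentinel of an empty ring, or an unlinked element, points at itself. */
void node_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

/* Constant-time insert of n immediately before pos: four assignments. */
void node_insert_before(struct node *pos, struct node *n)
{
    assert(pos != NULL && n != NULL);
    n->next = pos;
    n->prev = pos->prev;
    pos->prev->next = n;
    pos->prev = n;
}

/*
 * Walk the ring from the sentinel and verify that every forward link is
 * matched by the corresponding backward link. max_nodes bounds the walk so
 * that a ring which never returns to the sentinel is reported instead of
 * looping forever. Returns 1 if consistent, 0 otherwise. Cost: O(n).
 */
int ring_check(const struct node *head, size_t max_nodes)
{
    const struct node *cur = head;
    size_t i;

    for (i = 0; i <= max_nodes; i++) {
        const struct node *nxt = cur->next;

        if (nxt == NULL || nxt->prev != cur) {
            return 0;               /* broken or mismatched link */
        }
        if (nxt == head) {
            return 1;               /* came back to the sentinel */
        }
        cur = nxt;
    }
    return 0;                       /* more elements than allowed */
}
```

Running `ring_check` before and after every insertion would catch an element
that is still linked elsewhere, but only at the price of a full walk per
operation, which is the trade-off described above.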

--Cliff

Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
> 
> 
>>Joe Orton wrote:
>>
>>>On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
>>>
>>>
>>>>       fb = apr_bucket_flush_create(ba);
>>>>       db = apr_bucket_transient_create("aaa", 3, ba);
>>>>       APR_BRIGADE_INSERT_HEAD(bb, db);
>>>>       APR_BUCKET_INSERT_BEFORE(fb, db);
>>>
>>>The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
>>>for me with the arguments switched.
>>
>>right, but why does it hang when reversed.
> 
> 
> APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:
> 
>     APR_BUCKET_NEXT(db) = fb;
>     APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
>     APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
>     APR_BUCKET_PREV(fb) = db;
> 
> Obviously for this to work, all that has to happen is that fb's prev
> pointer and the next pointer of that bucket must correctly point to each
> other.  Everything else is arbitrarily overwritten.  Did you try running
> this with bucket debugging turned on like I suggested?  If you do that,
> then a bunch of ring consistency checks will be run for you at strategic
> times that might help you discern when it is that your brigade gets
> corrupted.
> 
> 
>>Shouldn't it work both ways? If
>>not, then it should produce an error and not hang.
> 
> 
> No... it's just a macro manipulating some pointers.  Error handling would
> be difficult (given the number of layers of macros) and expensive.

I understand all that, but I guess I'm failing to get the point across. It is
not a problem that I encounter in my code. On the contrary, I'm writing tests
that exercise both valid and invalid ways the API can be called. An API that
hangs when called in an invalid way is a problem. Don't you think?

   APR_BUCKET_INSERT_BEFORE(fb, db);

is not the most intuitive API, and it's very easy to mix up the arguments
(since both are of the same type). I have to pause every time and think hard
to see whether I've got it right.

Granted, if I were passing NULL or a corrupted reference and getting a
segfault, then it would be my problem. But how do you suggest that we protect
users from making mistakes and, more importantly, how do we point out those
mistakes in an error message? Otherwise each user submits a bug report and we
waste hours trying to understand what the problem is, just to discover that
the user got the arguments in the wrong order.

I suppose if APR doesn't do validation, we will be forced to write wrappers
which will do the validation :( I understand that this validation may slow
things down and is therefore undesirable. I'm not sure what the happy
compromise is here.
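
One possible shape for such a wrapper, as a minimal sketch only. It does not
use APR: `struct elem` and all function names are hypothetical. It also relies
on a convention that is an assumption of this model, namely that an element
outside any ring points at itself and that removal restores that state.
Whether a removed APR bucket is left in that state would need to be verified
before applying the same idea to real buckets. Under that assumption the check
is constant time, so it avoids walking the ring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket's ring link; not APR's real types. */
struct elem {
    struct elem *next;
    struct elem *prev;
};

/* Put an element into the "unlinked" state: it points at itself. */
void elem_init(struct elem *e)
{
    e->next = e;
    e->prev = e;
}

/* Constant-time test for the unlinked state. */
int elem_is_unlinked(const struct elem *e)
{
    return e->next == e && e->prev == e;
}

/* Unlink e from its ring and restore the unlinked state (model convention). */
void elem_remove(struct elem *e)
{
    assert(e != NULL);
    e->prev->next = e->next;
    e->next->prev = e->prev;
    elem_init(e);
}

/*
 * Insert e before pos, but refuse if e is still linked somewhere.
 * Returns 0 on success and -1 on misuse, leaving both rings untouched
 * in the error case.
 */
int checked_insert_before(struct elem *pos, struct elem *e)
{
    if (pos == NULL || e == NULL || pos == e || !elem_is_unlinked(e)) {
        return -1;
    }
    e->next = pos;
    e->prev = pos->prev;
    pos->prev->next = e;
    pos->prev = e;
    return 0;
}
```

With this wrapper, swapping the two arguments after the data element is
already in a ring yields an error code that the glue code can turn into a
useful message, instead of a silently corrupted ring.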


Re: segfault in apr_bucket_delete

Posted by Cliff Woolley <jw...@virginia.edu>.
On Fri, 21 May 2004, Stas Bekman wrote:

> Joe Orton wrote:
> > On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> >
> >>        fb = apr_bucket_flush_create(ba);
> >>        db = apr_bucket_transient_create("aaa", 3, ba);
> >>        APR_BRIGADE_INSERT_HEAD(bb, db);
> >>        APR_BUCKET_INSERT_BEFORE(fb, db);
> >
> > The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> > for me with the arguments switched.
>
> right, but why does it hang when reversed.

APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:

    APR_BUCKET_NEXT(db) = fb;
    APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
    APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
    APR_BUCKET_PREV(fb) = db;

Obviously for this to work, all that has to happen is that fb's prev
pointer and the next pointer of that bucket must correctly point to each
other.  Everything else is arbitrarily overwritten.  Did you try running
this with bucket debugging turned on like I suggested?  If you do that,
then a bunch of ring consistency checks will be run for you at strategic
times that might help you discern when it is that your brigade gets
corrupted.

> Shouldn't it work both ways? If
> not, then it should produce an error and not hang.

No... it's just a macro manipulating some pointers.  Error handling would
be difficult (given the number of layers of macros) and expensive.
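
To make the expansion above concrete, here is a minimal, self-contained
sketch that replays the four assignments on a toy ring. It does not use APR:
`struct link` and the helper names are made up for illustration, and `bb`,
`fb` and `db` merely mirror the names in the quoted snippet. A bounded walk
stands in for brigade traversal, so the case that would hang is reported as
-1 instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket's ring link; not APR's real types. */
struct link {
    struct link *next;
    struct link *prev;
};

/* A fresh element (or an empty ring's sentinel) points at itself. */
void link_init(struct link *e)
{
    e->next = e;
    e->prev = e;
}

/* The four assignments from the expansion: put e right before pos. */
void link_insert_before(struct link *pos, struct link *e)
{
    assert(pos != NULL && e != NULL);
    e->next = pos;
    e->prev = pos->prev;
    pos->prev->next = e;
    pos->prev = e;
}

/*
 * Count the elements seen while walking from the sentinel back to the
 * sentinel, giving up after limit steps. Returns -1 if the walk never
 * comes back, which an unbounded walk would experience as a hang.
 */
long steps_to_sentinel(const struct link *head, long limit)
{
    const struct link *cur = head->next;
    long steps = 0;

    while (cur != head) {
        if (++steps > limit) {
            return -1;
        }
        cur = cur->next;
    }
    return steps;
}

/* Replay the sequence from the thread and return the walk result. */
long replay(int reversed)
{
    struct link bb, fb, db;     /* brigade sentinel, flush, data */

    link_init(&bb);
    link_init(&fb);
    link_init(&db);

    link_insert_before(bb.next, &db);   /* db becomes the first element */
    if (reversed) {
        link_insert_before(&fb, &db);   /* db is relinked next to fb only */
    } else {
        link_insert_before(&db, &fb);   /* fb goes in front of db */
    }
    return steps_to_sentinel(&bb, 100);
}
```

In this model `replay(0)` finds two elements, while `replay(1)` returns -1:
after the reversed call, db and fb point only at each other, the sentinel
still points at db, and nothing points back to the sentinel.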

--Cliff

Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> 
>>        fb = apr_bucket_flush_create(ba);
>>        db = apr_bucket_transient_create("aaa", 3, ba);
>>        APR_BRIGADE_INSERT_HEAD(bb, db);
>>        APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> for me with the arguments switched.

Right, but why does it hang when reversed? Shouldn't it work both ways? If
not, then it should produce an error and not hang.


Re: segfault in apr_bucket_delete

Posted by Joe Orton <jo...@manyfish.co.uk>.
On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
>         fb = apr_bucket_flush_create(ba);
>         db = apr_bucket_transient_create("aaa", 3, ba);
>         APR_BRIGADE_INSERT_HEAD(bb, db);
>         APR_BUCKET_INSERT_BEFORE(fb, db);

The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
for me with the arguments switched.

joe

Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Ok, here is a mod_perl handler that reliably segfaults:

sub handler {
     my $r = shift;

     my $ba = $r->connection->bucket_alloc;

     my $d1 = APR::Bucket->new("d1");
     my $f1 = APR::Bucket::flush_create($ba);

     my $bb = APR::Brigade->new($r->pool, $ba);
     $bb->insert_head($d1);

     # d1->f1
     $f1->insert_before($d1);

     0;
}

I'm writing all kinds of tests to exercise various insertion techniques and
make sure each one works or fails with a useful error message, rather than
segfaulting. In this case I create a bucket brigade, one data bucket and one
flush bucket. I insert the data bucket at the head of bb, and then try to
insert that data bucket before the flush bucket, thus, I think, linking
bb->db->fb. It segfaults as reported before (though the circumstances are
right this time).

However, when I try to convert it to an equivalent C program, it hangs. Here
is a small program I've used to try to reproduce the problem. It's not exactly
the same as the Perl case, where a custom bucket type is used. But I can't get
the C one to run, and hopefully it gives you a test case:

#include <stdio.h>
#include <stdlib.h>

#include "apr_general.h"
#include "apr_hooks.h"
#include "apr_buckets.h"
#include "apr_pools.h"

int main(void)
{
     apr_status_t rv;

     apr_initialize();

     if (apr_hook_global_pool == NULL) {
         apr_pool_t *global_pool;
         rv = apr_pool_create(&global_pool, NULL);
         if (rv != APR_SUCCESS) {
             fprintf(stderr, "failed to create pool");
             exit(1);
         }
         apr_hook_global_pool = global_pool;
     }

     {
         apr_pool_t *pool;
         apr_bucket_alloc_t *ba;
         apr_bucket_brigade *bb;
         apr_bucket *fb, *db;

         rv = apr_pool_create(&pool, apr_hook_global_pool);
         if (rv != APR_SUCCESS) {
             fprintf(stderr, "failed to create pool");
             exit(1);
         }

         ba = apr_bucket_alloc_create(pool);
         bb = apr_brigade_create(pool, ba);

         fb = apr_bucket_flush_create(ba);
         db = apr_bucket_transient_create("aaa", 3, ba);
         APR_BRIGADE_INSERT_HEAD(bb, db);
         APR_BUCKET_INSERT_BEFORE(fb, db);

         apr_pool_clear(pool);

     }

     apr_terminate();

     exit(0);
}

I've built it as:

gcc -I/home/stas/httpd/prefork/include -Wall -L/home/stas/httpd/prefork/lib 
-lapr-0 -lrt -lm -lcrypt -lnsl -lpthread -ldl -laprutil-0 -lgdbm -ldb-4.0 
-lexpat bb.c -o bb

It hangs in APR_BUCKET_INSERT_BEFORE(fb, db);

...
set_thread_area({entry_number:-1 -> 6, base_addr:0x4030ea20, limit:1048575, 
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, 
seg_not_present:0, useable:1}) = 0
munmap(0x40018000, 86655)               = 0
set_tid_address(0x4030ea68)             = 28339
rt_sigaction(SIGRTMIN, {0x400cb650, [], SA_RESTORER|SA_SIGINFO, 0x400d2210}, 
NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
brk(0)                                  = 0x804a000
brk(0x806b000)                          = 0x806b000
brk(0)                                  = 0x806b000

Any idea why?


Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Trying to reproduce it in a standalone C program didn't work; it's more
complex than I thought. I will keep you posted when I get a reproducible
test case.


Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
> 
>>Stas Bekman wrote:
>>
>>>Doing just:
>>>
>>> apr_brigade_create(p, ba);
>>>
>>>and leaving here segfaults:
>>
>>[...]
> 
> 
> With what 'p' and 'ba'?

r->pool
r->connection->bucket_alloc

> Can you post the complete test case?

Well, it's coming from mod_perl. Is there something similar to mod_example.c 
that I can throw the C code in to give you a reproducible case? I guess I 
could use mod_example for that :)

If you have mod_perl 2 around just write this simple handler:

   use APR::Brigade ();
   use Apache::Connection ();
   use Apache::RequestRec ();
   sub handler {
     my $r = shift;
     my $bb = APR::Brigade->new($r->pool, $r->connection->bucket_alloc);
     0;
   }

it translates to:

   apr_brigade_create(r->pool, r->connection->bucket_alloc)

I think the segfault happens when r->pool's cleanup is run.


Re: segfault in apr_bucket_delete

Posted by Joe Orton <jo...@manyfish.co.uk>.
On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
> Stas Bekman wrote:
> >Doing just:
> >
> >  apr_brigade_create(p, ba);
> >
> >and leaving here segfaults:
> [...]

With what 'p' and 'ba'? Can you post the complete test case?

joe

Re: segfault in apr_bucket_delete

Posted by Cliff Woolley <jw...@virginia.edu>.
On Thu, 20 May 2004, Stas Bekman wrote:

> #0  0x4037db83 in mallopt () from /lib/tls/libc.so.6
> #1  0x4037b8ba in free () from /lib/tls/libc.so.6

A segfault in free() more or less always means heap corruption has
previously occurred.  You might try enabling bucket debugging at compile
time -- that will help check for double-frees and so forth.  Or just do
what Joe said and post the whole test case.  :)

Okay, I'm hittin the road.  Talk to you guys from Corbin, KY.

Re: segfault in apr_bucket_delete

Posted by Stas Bekman <st...@stason.org>.
Stas Bekman wrote:
> Doing just:
> 
>   apr_brigade_create(p, ba);
> 
> and leaving here segfaults:
[...]

I think my manual expansion of multiple nested macros went wrong somewhere.
The real backtrace for the CVS version is:

#0  0x4037db83 in mallopt () from /lib/tls/libc.so.6
#1  0x4037b8ba in free () from /lib/tls/libc.so.6
#2  0x4017a847 in apr_brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:50
#3  0x4017a7d6 in brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:33
#4  0x40277e4e in run_cleanups (cref=0x93f88f0) at apr_pools.c:1997
#5  0x402775eb in apr_pool_destroy (pool=0x93d90f0) at apr_pools.c:763
#6  0x402775ad in apr_pool_clear (pool=0x93d30d8) at apr_pools.c:723
#7  0x080d9a76 in child_main (child_num_arg=1) at prefork.c:528
#8  0x080d9dbb in make_child (s=0x81420c0, slot=1) at prefork.c:703
#9  0x080d9e30 in startup_children (number_to_start=1) at prefork.c:721
#10 0x080da235 in ap_mpm_run (_pconf=0x813d0a8, plog=0x81851c8, s=0x81420c0)
     at prefork.c:940
#11 0x080e0ea9 in main (argc=9, argv=0xbffff264) at main.c:619

So I guess it segfaults elsewhere inside the multiple nested macros. I hope
you have a clear head; I don't, and I'm heading to sleep. Just write that
one-liner from above to reproduce. Thanks!


