Posted to dev@apr.apache.org by Stas Bekman <st...@stason.org> on 2004/05/20 11:18:40 UTC
segfault in apr_bucket_delete
Doing just:
apr_brigade_create(p, ba);
and leaving here segfaults:
#0 0x4017a81f in apr_brigade_cleanup (data=0x93e7110) at apr_brigade.c:47
47 apr_bucket_delete(e);
(gdb) bt
#0 0x4017a81f in apr_brigade_cleanup (data=0x93e7110) at apr_brigade.c:47
#1 0x4017a7d6 in brigade_cleanup (data=0x93e7110) at apr_brigade.c:33
#2 0x40277e4e in run_cleanups (cref=0x93d9500) at apr_pools.c:1997
#3 0x402775eb in apr_pool_destroy (pool=0x93d94f0) at apr_pools.c:763
#4 0x402775ad in apr_pool_clear (pool=0x93d34d8) at apr_pools.c:723
#5 0x080d9a76 in child_main (child_num_arg=1) at prefork.c:528
#6 0x080d9dbb in make_child (s=0x81420c0, slot=1) at prefork.c:703
#7 0x080d9e30 in startup_children (number_to_start=1) at prefork.c:721
#8 0x080da235 in ap_mpm_run (_pconf=0x813d0a8, plog=0x81851c8, s=0x81420c0)
at prefork.c:940
#9 0x080e0ea9 in main (argc=9, argv=0xbffff264) at main.c:619
(gdb) p e
opening up apr_bucket_delete in apr_brigade_cleanup gives:
APU_DECLARE(apr_status_t) apr_brigade_cleanup(void *data)
{
    apr_bucket_brigade *b = data;
    apr_bucket *e;

    while (!APR_BRIGADE_EMPTY(b)) {
        e = APR_BRIGADE_FIRST(b);
        APR_RING_UNSPLICE((e), (e), link);
        (e)->type->destroy((e)->data);
        (e)->free(e);
        //apr_bucket_delete(e);
    }
    /*
     * We don't need to free(bb) because it's allocated from a pool.
     */
    return APR_SUCCESS;
}
brings us to APR_RING_UNSPLICE, which segfaults doing:
APR_RING_NEXT(APR_RING_PREV((ep1), link), link) = ...;
gdb> p *e
$2 = {link = {next = 0x0, prev = 0x9402a60}, type = 0x4048c600, length = 2,
start = 0, data = 0x9403180, free = 0x8067014, list = 0x6c24202c}
...
gdb> p (e->link.prev)->link
$5 = {next = 0x0, prev = 0x93d7660}
(gdb) p (e->link.prev)->link.next
$6 = (struct apr_bucket *) 0x0
which translates to:
0x0 = ...;
boom, segfault. I'm not sure where the right place is to add a check w/o a
speed penalty.
And please add this case to the apr test suite. It's painful to discover these
kinds of segfaults when trying to test the glue code. Thanks.
--
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
Re: segfault in apr_bucket_delete
Posted by Cliff Woolley <jw...@virginia.edu>.
On Sun, 23 May 2004, Noah Misch wrote:
> Perhaps a ``configure'' option for bucket debugging is in order? For that
> matter, why not make it the default; most folks who build their own APU do so
> for a development project, so intuitive failure modes will trump efficiency in
> early use. I can roll a patch, if appropriate.
I feel like I went to implement that at some point and ran into trouble...
but at this point I don't remember what that would have been. Maybe
something to do with certain flags being shared between apr and apr-util.
Anyway if you want to do it, feel free. :)
--Cliff
Re: segfault in apr_bucket_delete
Posted by Noah Misch <no...@cs.caltech.edu>.
On Sat, May 22, 2004 at 04:21:02PM -0400, Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
> > I understand all that, but I guess I fail to pass the point across. It is not
> > a problem that I encounter in my code. On the contrary I'm writing tests that
> > exercise, both valid and invalid ways the API can be called. API that hangs
> > when called in invalid way is a problem. Don't you think?
> >
> > APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The thing is, it would not be this macro that hangs. All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation). The only way to detect
> that such corruption will occur is to check the entire ring... that's a
> linear time checking operation tacked on to a constant time insertion
> operation... not acceptable. :) However, if you compile with bucket
> debugging turned on, those validity checks WILL be done.
Perhaps a ``configure'' option for bucket debugging is in order? For that
matter, why not make it the default; most folks who build their own APU do so
for a development project, so intuitive failure modes will trump efficiency in
early use. I can roll a patch, if appropriate.
Re: segfault in apr_bucket_delete
Posted by Cliff Woolley <jw...@virginia.edu>.
On Sat, 22 May 2004, Stas Bekman wrote:
> > that brigade will be corrupted by this operation).
>
> Do you suggest that the sample program that I posted doesn't hang in that
> macro, but after it?
That should be correct, yes. You'll end up creating a loop in the
brigade, and walking through the brigade will thus hang. The first time
you walk through the brigade in that code is when you clean it up.
> I guess that works for me. If in the future someone reports a problem, I can
> suggest to them what you've prescribed above. It's just that there could be
> other reasons for the hanging, which is usually hard to figure out w/o being
> in the user's shoes.
>
> Thanks Cliff and Joe.
You're welcome. :)
--Cliff
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
>
>>I understand all that, but I guess I fail to pass the point across. It is not
>>a problem that I encounter in my code. On the contrary I'm writing tests that
>>exercise, both valid and invalid ways the API can be called. API that hangs
>>when called in invalid way is a problem. Don't you think?
>>
>> APR_BUCKET_INSERT_BEFORE(fb, db);
>
>
> The thing is, it would not be this macro that hangs. All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation).
Do you suggest that the sample program that I posted doesn't hang in that
macro, but after it? I didn't step through to check, just saw that when I
remove it or fix the order things work just fine, so it could be just so. I
need to check that.
> The only way to detect
> that such corruption will occur is to check the entire ring... that's a
> linear time checking operation tacked on to a constant time insertion
> operation... not acceptable. :)
Absolutely!
> However, if you compile with bucket
> debugging turned on, those validity checks WILL be done.
I guess that works for me. If in the future someone reports a problem, I can
suggest to them what you've prescribed above. It's just that there could be
other reasons for the hanging, which is usually hard to figure out w/o being
in the user's shoes.
Thanks Cliff and Joe.
Re: segfault in apr_bucket_delete
Posted by Cliff Woolley <jw...@virginia.edu>.
On Fri, 21 May 2004, Stas Bekman wrote:
> I understand all that, but I guess I fail to pass the point across. It is not
> a problem that I encounter in my code. On the contrary I'm writing tests that
> exercise, both valid and invalid ways the API can be called. API that hangs
> when called in invalid way is a problem. Don't you think?
>
> APR_BUCKET_INSERT_BEFORE(fb, db);
The thing is, it would not be this macro that hangs. All this macro can
do is segfault (if one of the pointers is null, meaning the brigade was
previously corrupted), or do what it's supposed to do (though in doing so
it could potentially corrupt some other brigade, which is what happens
here -- if the bucket being inserted is still in a brigade, as db is, then
that brigade will be corrupted by this operation). The only way to detect
that such corruption will occur is to check the entire ring... that's a
linear time checking operation tacked on to a constant time insertion
operation... not acceptable. :) However, if you compile with bucket
debugging turned on, those validity checks WILL be done.
--Cliff
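The linear-time check described above can be illustrated on a tiny stand-in ring. This is only a sketch under stated assumptions, not APR's debugging code: struct node, node_init, insert_before and ring_is_consistent are hypothetical names, and a fresh element is assumed to point at itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Assumption of this model: a fresh node points at itself. */
static void node_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

/* Constant-time insertion of elem immediately before pos, no checks. */
static void insert_before(struct node *pos, struct node *elem)
{
    elem->next = pos;
    elem->prev = pos->prev;
    pos->prev->next = elem;
    pos->prev = elem;
}

/* Walk the whole ring once, starting at head. Returns 1 if every
 * forward link has a matching back link and the walk comes back to
 * head, 0 otherwise. The cost grows with the number of elements,
 * which is why it does not belong in a constant-time insert. The
 * walk stops on the first mismatch, so it cannot spin forever on
 * a looped ring. */
static int ring_is_consistent(const struct node *head)
{
    const struct node *n = head;

    assert(head != NULL);
    do {
        if (n->next == NULL || n->next->prev != n) {
            return 0;
        }
        n = n->next;
    } while (n != head);
    return 1;
}
```

On a ring built as head -> db, the check returns 1. After insert_before(&fb, &db), the reversed call from the test case, it returns 0 for head's ring, because db's back link no longer points at head.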
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
>
>>Joe Orton wrote:
>>
>>>On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
>>>
>>>
>>>> fb = apr_bucket_flush_create(ba);
>>>> db = apr_bucket_transient_create("aaa", 3, ba);
>>>> APR_BRIGADE_INSERT_HEAD(bb, db);
>>>> APR_BUCKET_INSERT_BEFORE(fb, db);
>>>
>>>The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
>>>for me with the arguments switched.
>>
>>right, but why does it hang when reversed.
>
>
> APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:
>
> APR_BUCKET_NEXT(db) = fb;
> APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
> APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
> APR_BUCKET_PREV(fb) = db;
>
> Obviously for this to work, all that has to happen is that fb's prev
> pointer and the next pointer of that bucket must correctly point to each
> other. Everything else is arbitrarily overwritten. Did you try running
> this with bucket debugging turned on like I suggested? If you do that,
> then a bunch of ring consistency checks will be run for you at strategic
> times that might help you discern when it is that your brigade gets
> corrupted.
>
>
>>Shouldn't it work both ways? If
>>not, then it should produce an error and not hang.
>
>
> No... it's just a macro manipulating some pointers. Error handling would
> be difficult (given the number of layers of macros) and expensive.
I understand all that, but I guess I fail to pass the point across. It is not
a problem that I encounter in my code. On the contrary I'm writing tests that
exercise, both valid and invalid ways the API can be called. API that hangs
when called in invalid way is a problem. Don't you think?
APR_BUCKET_INSERT_BEFORE(fb, db);
is not the most intuitive API, and it's very easy to mix the arguments (since
both are of the same type). I have to pause every time and think hard to see
whether I've got it right.
Granted, if I were passing NULL or a corrupted reference and getting a
segfault, then it would be my problem. But how do you suggest that we protect
users from making mistakes and, more importantly, how do we point out those
mistakes in an error message, rather than having each user submit a bug report
and us waste hours trying to understand what the problem is, just to discover
that the user got the arguments in the wrong order?
I suppose if APR doesn't do validation, we will be forced to write wrappers
which will do the validation :( I understand that this validation may slow
things down and is therefore undesirable. I'm not sure what the happy
compromise is here.
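One possible shape for such a validating wrapper, sketched on a stand-in ring rather than on APR itself. Everything here is hypothetical (struct node, node_init, node_remove, checked_insert_before). The check relies on two assumptions of this model: a fresh element points at itself, and removal resets the links the same way. The real ring macros may not reset links on removal, so a wrapper around them would likely need its own bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Model assumption: an unlinked node points at itself. */
static void node_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

/* Unlink n from its ring and reset it to the unlinked state. */
static void node_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

/* Insert elem before pos, but refuse (return -1) when elem is NULL,
 * is pos itself, or is still linked into some ring. The test is
 * constant time; it only works under the self-link convention. */
static int checked_insert_before(struct node *pos, struct node *elem)
{
    assert(pos != NULL);
    if (elem == NULL || elem == pos) {
        return -1;
    }
    if (elem->next != elem || elem->prev != elem) {
        return -1; /* elem already belongs to a ring */
    }
    elem->next = pos;
    elem->prev = pos->prev;
    pos->prev->next = elem;
    pos->prev = elem;
    return 0;
}
```

In this model, with db already at the head of a ring, checked_insert_before(&fb, &db) is refused with -1 and nothing is modified, while checked_insert_before(&db, &fb) succeeds and yields head -> fb -> db.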
Re: segfault in apr_bucket_delete
Posted by Cliff Woolley <jw...@virginia.edu>.
On Fri, 21 May 2004, Stas Bekman wrote:
> Joe Orton wrote:
> > On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> >
> >> fb = apr_bucket_flush_create(ba);
> >> db = apr_bucket_transient_create("aaa", 3, ba);
> >> APR_BRIGADE_INSERT_HEAD(bb, db);
> >> APR_BUCKET_INSERT_BEFORE(fb, db);
> >
> > The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> > for me with the arguments switched.
>
> right, but why does it hang when reversed.
APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:
APR_BUCKET_NEXT(db) = fb;
APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
APR_BUCKET_PREV(fb) = db;
Obviously for this to work, all that has to happen is that fb's prev
pointer and the next pointer of that bucket must correctly point to each
other. Everything else is arbitrarily overwritten. Did you try running
this with bucket debugging turned on like I suggested? If you do that,
then a bunch of ring consistency checks will be run for you at strategic
times that might help you discern when it is that your brigade gets
corrupted.
> Shouldn't it work both ways? If
> not, then it should produce an error and not hang.
No... it's just a macro manipulating some pointers. Error handling would
be difficult (given the number of layers of macros) and expensive.
--Cliff
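The four assignments quoted above can be tried on a tiny stand-in ring to see why the reversed call leads to a hang later on. This is a sketch, not APR code: struct node, node_init, insert_before, bounded_walk and ring_after_insert are hypothetical names, and a fresh element is assumed to point at itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Assumption of this model: a fresh node points at itself. */
static void node_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

/* The same four pointer assignments as the quoted expansion:
 * link elem into the ring immediately before pos. */
static void insert_before(struct node *pos, struct node *elem)
{
    assert(pos != NULL && elem != NULL);
    elem->next = pos;
    elem->prev = pos->prev;
    pos->prev->next = elem;
    pos->prev = elem;
}

/* Walk forward from head. Returns the number of elements seen, or
 * -1 if head is not reached again within limit steps. */
static int bounded_walk(const struct node *head, int limit)
{
    const struct node *n = head->next;
    int count = 0;

    while (n != head) {
        if (count++ >= limit) {
            return -1;
        }
        n = n->next;
    }
    return count;
}

/* Put db at the head of an empty ring, then add fb either the
 * intended way or with the arguments swapped, and walk the ring. */
static int ring_after_insert(int reversed)
{
    struct node head, db, fb;

    node_init(&head);
    node_init(&db);
    node_init(&fb);
    insert_before(head.next, &db);   /* head -> db */
    if (reversed) {
        insert_before(&fb, &db);     /* relinks db, which head still points to */
    } else {
        insert_before(&db, &fb);     /* head -> fb -> db */
    }
    return bounded_walk(&head, 100);
}
```

In this model ring_after_insert(0) returns 2. ring_after_insert(1) returns -1: head still points at db, but db and fb now point only at each other, so a walk that waits to get back to head never finishes without the step limit.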
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
>
>> fb = apr_bucket_flush_create(ba);
>> db = apr_bucket_transient_create("aaa", 3, ba);
>> APR_BRIGADE_INSERT_HEAD(bb, db);
>> APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> for me with the arguments switched.
right, but why does it hang when reversed. Shouldn't it work both ways? If
not, then it should produce an error and not hang.
Re: segfault in apr_bucket_delete
Posted by Joe Orton <jo...@manyfish.co.uk>.
On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> fb = apr_bucket_flush_create(ba);
> db = apr_bucket_transient_create("aaa", 3, ba);
> APR_BRIGADE_INSERT_HEAD(bb, db);
> APR_BUCKET_INSERT_BEFORE(fb, db);
The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
for me with the arguments switched.
joe
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Ok, here is a mod_perl handler that reliably segfaults:
sub handler {
    my $r = shift;
    my $ba = $r->connection->bucket_alloc;
    my $d1 = APR::Bucket->new("d1");
    my $f1 = APR::Bucket::flush_create($ba);
    my $bb = APR::Brigade->new($r->pool, $ba);
    $bb->insert_head($d1);
    # d1->f1
    $f1->insert_before($d1);
    0;
}
I'm writing all kinds of tests to exercise various insertion techniques and
make sure each one works or fails with a useful error message, rather than
segfaulting. In this case I create a bucket brigade, one data bucket and one
flush bucket. Now I insert the data bucket into the head of bb, and then try
to insert that data bucket before the flush bucket, thus I think linking
bb->db->fb. It segfaults as reported before (though the circumstances are
right this time).
Though when I try to convert it to an equivalent C program, it hangs. Here is
a small program I've used to try to reproduce the problem. It's not exactly
the same as a perl case, where a custom bucket type is used. But I can't get
the C one to run and hopefully give you a test case:
#include <stdio.h>   /* fprintf */
#include <stdlib.h>
#include "apr_general.h"
#include "apr_hooks.h"
#include "apr_buckets.h"
#include "apr_pools.h"

int main(void)
{
    apr_status_t rv;

    apr_initialize();

    /* set up the global pool if nobody has done so yet */
    if (apr_hook_global_pool == NULL) {
        apr_pool_t *global_pool;
        rv = apr_pool_create(&global_pool, NULL);
        if (rv != APR_SUCCESS) {
            fprintf(stderr, "failed to create pool");
            exit(1);
        }
        apr_hook_global_pool = global_pool;
    }

    {
        apr_pool_t *pool;
        apr_bucket_alloc_t *ba;
        apr_bucket_brigade *bb;
        apr_bucket *fb, *db;

        rv = apr_pool_create(&pool, apr_hook_global_pool);
        if (rv != APR_SUCCESS) {
            fprintf(stderr, "failed to create pool");
            exit(1);
        }

        ba = apr_bucket_alloc_create(pool);
        bb = apr_brigade_create(pool, ba);

        fb = apr_bucket_flush_create(ba);
        db = apr_bucket_transient_create("aaa", 3, ba);

        APR_BRIGADE_INSERT_HEAD(bb, db);
        APR_BUCKET_INSERT_BEFORE(fb, db);   /* the call under test */

        /* clearing the pool runs the brigade cleanup */
        apr_pool_clear(pool);
    }

    apr_terminate();
    exit(0);
}
I've built it as:
gcc -I/home/stas/httpd/prefork/include -Wall -L/home/stas/httpd/prefork/lib
-lapr-0 -lrt -lm -lcrypt -lnsl -lpthread -ldl -laprutil-0 -lgdbm -ldb-4.0
-lexpat bb.c -o bb
It hangs in APR_BUCKET_INSERT_BEFORE(fb, db);
...
set_thread_area({entry_number:-1 -> 6, base_addr:0x4030ea20, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1,
seg_not_present:0, useable:1}) = 0
munmap(0x40018000, 86655) = 0
set_tid_address(0x4030ea68) = 28339
rt_sigaction(SIGRTMIN, {0x400cb650, [], SA_RESTORER|SA_SIGINFO, 0x400d2210},
NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
brk(0) = 0x804a000
brk(0x806b000) = 0x806b000
brk(0) = 0x806b000
Any idea why?
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Trying to reproduce it in a standalone C program didn't work. It's more
complex than I thought. I will keep you posted when I get a reproducible
test case.
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
>
>>Stas Bekman wrote:
>>
>>>Doing just:
>>>
>>> apr_brigade_create(p, ba);
>>>
>>>and leaving here segfaults:
>>
>>[...]
>
>
> With what 'p' and 'ba'?
r->pool
r->connection->bucket_alloc
> Can you post the complete test case?
Well, it's coming from mod_perl. Is there something similar to mod_example.c
that I can throw the C code in to give you a reproducible case? I guess I
could use mod_example for that :)
If you have mod_perl 2 around just write this simple handler:
use APR::Brigade ();
use Apache::Connection ();
use Apache::RequestRec ();
sub handler {
    my $r = shift;
    my $bb = APR::Brigade->new($r->pool, $r->connection->bucket_alloc);
    0;
}
it translates to:
apr_brigade_create(r->pool, r->connection->bucket_alloc)
I think the segfault happens, when r->pool's cleanup is run.
Re: segfault in apr_bucket_delete
Posted by Joe Orton <jo...@manyfish.co.uk>.
On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
> Stas Bekman wrote:
> >Doing just:
> >
> > apr_brigade_create(p, ba);
> >
> >and leaving here segfaults:
> [...]
With what 'p' and 'ba'? Can you post the complete test case?
joe
Re: segfault in apr_bucket_delete
Posted by Cliff Woolley <jw...@virginia.edu>.
On Thu, 20 May 2004, Stas Bekman wrote:
> #0 0x4037db83 in mallopt () from /lib/tls/libc.so.6
> #1 0x4037b8ba in free () from /lib/tls/libc.so.6
A segfault in free() more or less always means heap corruption has
previously occurred. You might try enabling bucket debugging at compile
time -- that will help check for double-frees and so forth. Or just do
what Joe said and post the whole test case. :)
Okay, I'm hittin the road. Talk to you guys from Corbin, KY.
Re: segfault in apr_bucket_delete
Posted by Stas Bekman <st...@stason.org>.
Stas Bekman wrote:
> Doing just:
>
> apr_brigade_create(p, ba);
>
> and leaving here segfaults:
[...]
I think my manual expansion of multiple nested macros went wrong somewhere,
the real backtrace for the cvs version is:
#0 0x4037db83 in mallopt () from /lib/tls/libc.so.6
#1 0x4037b8ba in free () from /lib/tls/libc.so.6
#2 0x4017a847 in apr_brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:50
#3 0x4017a7d6 in brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:33
#4 0x40277e4e in run_cleanups (cref=0x93f88f0) at apr_pools.c:1997
#5 0x402775eb in apr_pool_destroy (pool=0x93d90f0) at apr_pools.c:763
#6 0x402775ad in apr_pool_clear (pool=0x93d30d8) at apr_pools.c:723
#7 0x080d9a76 in child_main (child_num_arg=1) at prefork.c:528
#8 0x080d9dbb in make_child (s=0x81420c0, slot=1) at prefork.c:703
#9 0x080d9e30 in startup_children (number_to_start=1) at prefork.c:721
#10 0x080da235 in ap_mpm_run (_pconf=0x813d0a8, plog=0x81851c8, s=0x81420c0)
at prefork.c:940
#11 0x080e0ea9 in main (argc=9, argv=0xbffff264) at main.c:619
So I guess it segfaults elsewhere inside the multiple nested macros. I hope
you have a clear head, I don't, I'm heading to sleep. Just write that one
liner from above to reproduce. Thanks!