Posted to dev@perl.apache.org by "Philippe M. Chiasson" <go...@ectoplasm.org> on 2009/01/08 06:06:26 UTC

Re: [mp2] frequent segfaults in APR::Table

On 31/12/08 05:39, Tupshin Harper wrote:
> Some comments inline, but the really useful stuff at the bottom.

Same here, keep reading ;-)

> Thank you very much for your help.
>
> -Tupshin
>
> Philippe M. Chiasson wrote:
>> On 29/12/08 18:56, Tupshin Harper wrote:
>>> 1. Problem Description:
>>>
>>> I'm attempting to upgrade one of the largest (measured both by users and
>>> lines of code, I suspect) mod_perl sites from mod_perl 1 to mod_perl 2,
>>> and also from 32 bit OS to 64 bit at the same time. I converted our
>>> calls to use the new API, and basic functionality started working.
>>> However, I am experiencing frequent segfaults in APR::Table (stack trace
>>> below) when loading pages.
>> Just out of curiosity, are you handling APR::Table objects directly ?
> The only places where APR is ever mentioned are:
> http://code.livejournal.org/trac/livejournal/browser/branches/modernize/cgi-bin/LJ/Request/Apache2.pm
> (comment line 76)
> and
> http://code.livejournal.org/trac/bml/browser/branches/modernize/lib/Apache/BML.pm
> (the two "use" statements, but those appear to actually be unused).

Sometimes, when you call something in mod_perl land, you might get an APR::Table back, say:

my $o = $r->headers_out;

# $o is an APR::Table now, and you need to
use APR::Table;

# before you can call
$o->get([...])

> So no.

Okay, so this probably means you are hitting a mod_perl bug, not something
evil your code is doing with APR::Table objects.

>>> Somewhere between 1 out of every 2-4 page
>>> loads will cause it. Identical problem occurs on:
>>> 64 bit Debian Lenny with stock mod_perl 2.0.4
>>> 64 bit Debian Lenny with hand-built mod_perl 2.0.5-dev from latest
>>> source.
>>> 64 bit Centos 5.2 with stock mod_perl 2.0.2.
>>>
>>> Let me know if there is any other information you need.
>> See below. Of course, a shorter, reproducible test case would be the
>> ideal.
> Agreed, but given the complexity of the entire system and the fact that
> we are using a home-brewed templating system (bml), makes it quite
> difficult. I'll work on that if nothing else proves fruitful.

I know, it's sometimes tricky to boil down problems that only appear sometimes
in the wild and in a large system.

>>> I have not yet
>>> tried it with mod_perl 2 on a 32-bit OS.
>>>
>>> [...]
>>>
>>> Method it crashes in:
>>>
>>> /* Try to shortcut apr_table_get by fetching the key using the current
>>>    * iterator (unless it's inactive or points at different key).
>>>    */
>>> static MP_INLINE const char *mpxs_APR__Table_FETCH(pTHX_ SV *tsv,
>>>                                                      const char *key)
>>> {
>>>       SV* rv = modperl_hash_tied_object_rv(aTHX_ "APR::Table", tsv);
>>>       const int i = mpxs_apr_table_iterix(rv);
>>>       apr_table_t *t = INT2PTR(apr_table_t *, SvIVX(SvRV(rv)));
>> Possibly smells like a 64 bit issue to me.
> My next step will be to confirm this theory by bringing it up on a 32
> bit instance.

Any change or update on that?

>>>       const apr_array_header_t *arr = apr_table_elts(t);
>>>       apr_table_entry_t *elts = (apr_table_entry_t *)arr->elts; <--- crashing line 186
>> Can you get a little more information out of the current local variables.
>>
>> i.e. I'd be interested in seeing the value of:
>>
>> i
>> *t
>> *arr
>>
>> Which you can easily do from within gdb with
>>
>> (gdb) display *t
>> (gdb) display *arr
>>
> "i" is never anything but zero in the cases I'm looking at
> a typical value for "t" is "(apr_table_t *) 0x956bfa0"
> but printing *t always generates <incomplete type>
> however, there is useful wrongness in "arr" and "elts".
>
> A quick addendum to my previous report:
>
> Sometimes it crashes directly on line 186, and in those cases, arr =
> 0x4f5349203a746573 (or something similar), and printing *arr reasonably
> says "Cannot access memory at address 0x4f5349203a746573"
>
> In other cases, it crashes within the apr_table_get(t, key) call on line
> 192. In those cases, "arr" is more reasonable, e.g.
> (const apr_array_header_t *) 0x956bfa0
> but *arr is:
>   {pool = 0x636f6c2f7273752f, elt_size = 1932487777, nelts = 980314466,
> nalloc = 1920169263,
>    elts = 0x2f3a6e69622f6c61 <Address 0x2f3a6e69622f6c61 out of bounds>}
> elts is:
> (apr_table_entry_t *) 0x2f3a6e69622f6c61
> and *elts is:
> Cannot access memory at address 0x2f3a6e69622f6c61
>
> So, to summarize, when it crashes on line 186, *arr is a bad pointer,
> and when it crashes when calling apr_table_get from line 192, *elts is a
> bad pointer.

This is starting to smell more and more like bad pointer mangling on 64-bit.

Forgot to ask, but can you dump the SV *rv and *tsv like so:

(gdb) call sv_dump(rv)
(gdb) call sv_dump(tsv)

Thanks. In the meantime, I am trying to visualize why this might be
happening. Everything so far looks like it's using the correct macros
to safely convert between IV and pointer. Hrm.

One way to dig into this further would be to add extra debugging to
  modperl_hash_tied_object
  modperl_hash_tie

when classname=="APR::Table"

and see how the void * is converted back and forth.

-- 
Philippe M. Chiasson     GPG: F9BFE0C2480E7680 1AE53631CB32A107 88C3A5A5
http://gozer.ectoplasm.org/       m/gozer\@(apache|cpan|ectoplasm)\.org/
