Posted to dev@apr.apache.org by "John L. Poole" <jl...@gmail.com> on 2011/07/22 06:42:01 UTC

Segment Fault: apr_palloc() in libapr-1.so.0

My instance of Apache has started to segfault; the problem first appeared 
around June 30, 2011.

Here's my system:
plug src # uname -a
Linux plug 2.6.33.5 #3 PREEMPT Thu Sep 2 07:47:34 PDT 2010 armv5tel 
Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux
plug src #

Apache version:
     Installed versions:  2.2.17(2)(07:46:19 PM 07/21/2011)... [listing of modules]

I've tried running Apache without any OPTS by commenting out my 
APACHE2_OPTS variable; this way it is a simple instance of pure Apache 
with no modules:
#APACHE2_OPTS="${APACHE2_OPTS} ....."

and I get the same result in the gdb session (see below) with or without 
APACHE2_OPTS, so it
looks like the problem is in apr.

Here's what I have installed:
  dev-libs/apr

      plug src # eix -I apr
     [I] dev-libs/apr
          Available versions:  (1) 1.4.4!t (~)1.4.5!t
             {doc elibc_FreeBSD older-kernels-compatibility +urandom +uuid}
          Installed versions:  1.4.5(1)!t(07:20:04 PM 07/21/2011)(urandom uuid -doc -elibc_FreeBSD -older-kernels-compatibility)
          Homepage:            http://apr.apache.org/
          Description:         Apache Portable Runtime Library

     [I] dev-libs/apr-util
          Available versions:  (1) 1.3.11!t
             {berkdb doc freetds gdbm ldap mysql odbc postgres sqlite sqlite3}
          Installed versions:  1.3.11(1)!t(12:54:25 PM 07/04/2011)(berkdb gdbm ldap mysql postgres -doc -freetds -odbc -sqlite -sqlite3)
          Homepage:            http://apr.apache.org/
          Description:         Apache Portable Runtime Utility Library

Note: I had installed 1.4.4 and had the same problem, so I allowed the 
"unstable" version for the ARM
platform, 1.4.5, to install.

When apr was compiling, I got this warning at the end:

  * QA Notice: The following files contain runtime text relocations
  *  Text relocations force the dynamic linker to perform extra
  *  work at startup, waste system resources, and may pose a security
  *  risk.  On some architectures, the code may not even function
  *  properly, if at all.
  *  For more information, see http://hardened.gentoo.org/pic-fix-guide.xml
  *  Please include the following list of files in your report:
  * TEXTREL usr/lib/libapr-1.so.0.4.4

I've tried going outside of Gentoo's package system and I installed APR 
directly.
I ran ./configure, make test, and make install.  "make test" gave me: 
"All tests passed." before
testing some database tasks which also passed, for the most part.

  Here's my session in gdb which suggests "apr_palloc" is
  where the problem occurs:

     plug apr # gdb /usr/sbin/apache2
     GNU gdb (Gentoo 7.2 p1) 7.2
     Copyright (C) 2010 Free Software Foundation, Inc.
     License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
     This is free software: you are free to change and redistribute it.
     There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
     and "show warranty" for details.
     This GDB was configured as "armv5tel-softfloat-linux-gnueabi".
     For bug reporting instructions, please see:
     <http://bugs.gentoo.org/>...
     Reading symbols from /usr/sbin/apache2...(no debugging symbols found)...done.
     (gdb) run
     Starting program: /usr/sbin/apache2
     [Thread debugging using libthread_db enabled]

     Program received signal SIGSEGV, Segmentation fault.
     0x4029c334 in apr_palloc () from /usr/lib/libapr-1.so.0
     (gdb) quit


  Here's the build log for Apache: http://pastebin.com/U6uj4FHj
  Here's the build log for apr: http://pastebin.com/XRYAtmqu

  I have saved the complete build staging area and can provide any 
portion thereof.

  Unfortunately, back-stepping to older versions of apr is very 
difficult as the
  maintainer basically removed the older versions from being available 
via emerge.

  What can I provide and/or do to help uncover what is causing this 
problem?  Should
  I log a bug?

-- John

Re: Segment Fault: apr_palloc() in libapr-1.so.0

Posted by "John L. Poole" <jl...@gmail.com>.
On 7/22/2011 1:38 AM, Nick Kew wrote:
> On 22 Jul 2011, at 05:42, John L. Poole wrote:
>
> [asides]
> I have httpd running on a different ARM platform, but I don't recollect the
> APR version, and I think it might be 1.3.x.
>
> How does that "-elibc_FreeBSD -older-kernels-compatibility" fit with
> your Linux platform?
> [/asides]
>
>> What can I provide and/or do to help uncover what is causing this problem?  Should
>> I log a bug?
> I take it this is way too unpredictable to reproduce to order, so an interactive gdb
> session would be pointless?
>
> If so, then a traceback would be helpful.  Compiling apr with "-g -O0" will help you
> get that from a coredump, or (since it's happening with httpd) Jeff's mod_backtrace
> from http://people.apache.org/~trawick/ might be an alternative to the core (though
> it'll still need the symbolic information).
>
> And yes, by all means log a bug.  But you might still have to poke this list a bit,
> especially if none of the active devs is in a position to reproduce the problem!
> If you can identify specific details of your scenario that would help reproduce it,
> that improves your chances of a diagnosis and fix.
>
I'll address some of the issues that you had questions about.

The Gentoo USE flag "older-kernels-compatibility" is described as 
"Enable binary compatibility with older kernels" 
(http://www.gentoo.org/dyn/use-index.xml).  Just for kicks, I re-emerged 
[compiled] dev-libs/apr-1.4.5 and then apache with the 
"older-kernels-compatibility" flag set in the apr build to see if 
anything changed in the behavior.  After rebuilding apr, but not apache, 
I ran gdb and got the same result.  I then rebuilt apache, ran gdb 
again, and still got the same result.

The Gentoo USE flag "elibc_FreeBSD" appears to be undocumented.  I've 
logged a bug with Gentoo: 
https://bugs.gentoo.org/show_bug.cgi?id=375995.  When I try to perform a 
pretend ("-p") compile, elibc_FreeBSD is omitted automatically 
despite my specifically including it.  Note that I also included a 
nonexistent USE variable, "garbageVarTest", just to see how the package 
installer reacts.  It appears that if the variable is not recognized, it 
is simply ignored -- there is no error trap for unexpected USE 
variables.  Based on this behavior, I'm guessing elibc_FreeBSD is 
accidentally included in the package description.

==========
     plug portage # FEATURES="keepwork" USE="older-kernels-compatibility doc elibc_FreeBSD garbageVarTest"  emerge dev-libs/apr -p
     * WARNING: The FEATURES variable contains one or more values that
     * should be disabled under normal circumstances: keepwork

     These are the packages that would be merged, in order:

     Calculating dependencies... done!
     [ebuild   R   ~] dev-libs/apr-1.4.5  USE="doc* older-kernels-compatibility*"
     plug portage #
============

The segmentation fault always occurs immediately at startup and is 
reproducible.  The interactive gdb sessions are reproducible as well.  
Apache does not even get to the point of writing to its own logs.
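
Because the crash is reproducible, a backtrace can be captured 
non-interactively.  The following is only a rough sketch: the helper 
name crash_backtrace and the GDB override variable are hypothetical and 
introduced purely for illustration, while the gdb options used (-batch, 
-ex, --args) are standard.  A useful trace also assumes apr has been 
built with debugging symbols (e.g. "-g -O0").

```shell
# Hypothetical helper: run a program under gdb in batch mode and print a
# backtrace when it stops (for example on SIGSEGV).
# Set GDB to another command to preview the invocation instead of
# actually starting gdb.
crash_backtrace() {
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}
```

Usage would look something like: crash_backtrace /usr/sbin/apache2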

I've run apache and apr on this platform for years, so I'm confident 
something recently introduced into the system (through Gentoo's portage) 
or a setting I altered is causing this.

Also, I'm wondering about the Gentoo "TEXTREL" warning I received when I 
emerged the package:
================
  * QA Notice: The following files contain runtime text relocations
  *  Text relocations force the dynamic linker to perform extra
  *  work at startup, waste system resources, and may pose a security
  *  risk.  On some architectures, the code may not even function
  *  properly, if at all.
  *  For more information, see http://hardened.gentoo.org/pic-fix-guide.xml
  *  Please include the following list of files in your report:
  * TEXTREL usr/lib/libapr-1.so.0.4.5
================

Would TEXTREL have any bearing here?  One of the fixes, according to 
http://www.gentoo.org/proj/en/hardened/pic-fix-guide.xml, is: "Just 
review the build output and see if the command to compile it was invoked 
with -fPIC. If not, go fix the build system as you do not need to dig 
into the source."
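
The review of the build output that the guide describes could be 
roughly automated as below.  This is a heuristic sketch, not something 
taken from the guide: the function name compiles_without_pic is 
hypothetical, and it assumes each compile command sits on a single log 
line and uses a standalone "-c" option.  Build systems that 
deliberately compile a second, non-PIC copy of each object for a static 
library will show up here too.

```shell
# Hypothetical helper: read a build log on stdin and print the compile
# commands (lines containing a standalone -c) that carry neither -fPIC
# nor -fpic.
compiles_without_pic() {
    grep -E '(^|[[:space:]])-c([[:space:]]|$)' | grep -vi -e '-fpic' || true
}
```

Usage would look something like: compiles_without_pic < build.log 
(where build.log is a placeholder for a saved copy of the build output).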

It turns out the TEXTREL warnings were, indeed, indicative of the problem.

I modified my CFLAGS variable (compiler flags in Gentoo) to include 
"-fPIC" and then re-emerged [compiled] apr and apache.  Now apache works.

Conclusion: the TEXTRELs in apr and Apache were the cause of the 
segfault, and building both with the '-fPIC' flag resolved it.
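
For anyone wanting to confirm whether a built library still carries 
text relocations, a minimal sketch follows.  The helper name 
has_textrel is hypothetical; it only inspects the text that 
"readelf -d" (from binutils) prints for the dynamic section, where a 
TEXTREL entry or a TEXTREL bit in FLAGS marks the problem.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and succeed
# (exit status 0) if the dynamic section mentions TEXTREL.
has_textrel() {
    grep -q 'TEXTREL'
}

# Example, using the library path from the QA notice:
#   readelf -d /usr/lib/libapr-1.so.0.4.5 | has_textrel && echo "TEXTREL present"
```

After a rebuild with "-fPIC" in CFLAGS, the expectation is that this 
check no longer reports anything for the library.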

Re: Segment Fault: apr_palloc() in libapr-1.so.0

Posted by Nick Kew <ni...@apache.org>.
On 22 Jul 2011, at 05:42, John L. Poole wrote:

[asides]
I have httpd running on a different ARM platform, but I don't recollect the
APR version, and I think it might be 1.3.x.

How does that "-elibc_FreeBSD -older-kernels-compatibility" fit with
your Linux platform?
[/asides]

> What can I provide and/or do to help uncover what is causing this problem?  Should
> I log a bug?

I take it this is way too unpredictable to reproduce to order, so an interactive gdb
session would be pointless?

If so, then a traceback would be helpful.  Compiling apr with "-g -O0" will help you
get that from a coredump, or (since it's happening with httpd) Jeff's mod_backtrace
from http://people.apache.org/~trawick/ might be an alternative to the core (though
it'll still need the symbolic information).

And yes, by all means log a bug.  But you might still have to poke this list a bit,
especially if none of the active devs is in a position to reproduce the problem!
If you can identify specific details of your scenario that would help reproduce it,
that improves your chances of a diagnosis and fix.

-- 
Nick Kew

Available for work, contract or permanent
http://www.webthing.com/~nick/cv.html