Posted to users@httpd.apache.org by David Guyot <da...@europecamions-interactive.com> on 2012/05/16 10:47:24 UTC

[users@httpd] Random Apache httpd 2.2.16-6+squeeze7 + PHP 5.3.3-7+squeeze9 crashes without any consistent backtrace

Hello, everybody.

First, I must warn you that English isn't my mother tongue, so please
excuse my possible language errors.

I manage a web server that runs under a heavy flow of requests (CPU
around 50% on a dual quad-core machine), and, over the last few months,
we have encountered random crashes of the web server. At first, only
apache2 crashed (around one crash a day), but, after a few weeks, the
crashes began to affect the operating system (Debian 6.0.4, amd64
arch), i.e. the OS hung and the whole server had to be power-cycled.

At this point, given that neither the system logs nor the Apache logs
showed any error, failure or notice before the crashes, that no data
was damaged, and that no irregular session opening or suspect activity
was detected, we considered two possibilities: a software bug or a
hardware problem. As we rent our physical server, we tried to eliminate
the first possibility by manually upgrading Apache to the latest 2.2
version available, that is to say 2.2.22. We quickly found out that
this did not solve the problem, as crashes were as frequent as before
the upgrade.

We informed the server supplier that his machine was behaving strangely
and asked for a full hardware diagnostic, but, despite three system
crashes within a single day, he didn't seem to find it worrying. We
insisted, and after a single SMART and RAID (1) self-test, he advised
us that the server would have to be shut down for several hours to
allow a full hardware diagnostic. Such a website downtime was
impossible for us, so I wanted to be absolutely sure that these crashes
had nothing to do with our code before requesting the hardware
diagnostic. I therefore activated the Apache CrashDump feature in order
to find a pattern within the crashed processes. Because I am not an
expert in software debugging, I followed a procedure found on the
Internet: install gdb and the debugging symbols of both Apache and PHP,
wait for a crash to occur, then execute the following command on the
CrashDump file:
gdb /usr/sbin/apache2 $CrashDump --batch --quiet -ex "thread apply all bt full" > $CrashDump.log
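
For readers who want to repeat this on many dumps at once, the command
above can be wrapped in a small loop. This is only a sketch under
assumptions not stated in this mail: the function name backtrace_all,
the core* file naming and the GDB override variable are all
hypothetical, and the dump directory must be passed in explicitly.

```shell
#!/bin/sh
# Hypothetical helper: run the gdb command from this mail on every
# core file found in one directory.
# Assumptions: dumps are named core* (adjust the glob otherwise);
# $1 is the dump directory, $2 the binary (defaults to the path
# used in this mail); GDB may be set to point at another gdb.
backtrace_all() {
    bt_dir=$1
    bt_bin=${2:-/usr/sbin/apache2}
    for bt_core in "$bt_dir"/core*; do
        # Skip the literal pattern when nothing matches.
        [ -f "$bt_core" ] || continue
        # Skip logs written by a previous run.
        case $bt_core in *.log) continue ;; esac
        "${GDB:-gdb}" "$bt_bin" "$bt_core" --batch --quiet \
            -ex "thread apply all bt full" > "$bt_core.log" 2>&1
    done
}
```

Calling backtrace_all with the dump directory as its only argument
leaves one .log file next to each dump, in the same format as the logs
attached here.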

Over the past month, I collected 18 backtraces, all analyzed with the
above command and attached to this mail. I wasn't able to find any
relevant pattern in these crash dumps, but, as I said, I don't even
know whether my backtrace procedure was right.
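
One rough way to look for a pattern across such logs is to count which
function appears as the innermost frame (#0) of each thread. The sketch
below is a heuristic only, and the function name top_frames is
hypothetical. It assumes that gdb frame lines start at column 0 with
"#0", as they normally do in "bt full" output.

```shell
#!/bin/sh
# Hypothetical helper: count the innermost frame of every thread
# across the gdb logs given as arguments, most frequent first.
top_frames() {
    # Frame lines look like "#0  0x... in func (args) at file:line"
    # or "#0  func (args) at file:line"; keep the function name only.
    grep -h '^#0 ' "$@" \
        | sed -E 's/^#0 +(0x[0-9a-f]+ in )?([^ (]+).*/\2/' \
        | sort | uniq -c | sort -rn
}
```

Running top_frames on all the .log files prints one line per function,
with its count in front. A function that dominates the list is a
candidate for a closer look, although the innermost frame is often a
libc routine and the real cause may sit a few frames higher.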

I ran several SMART self-tests myself, as I did not trust the
supplier's, and none of them showed any abnormal value or any data I/O
error.

As the crashes continued, the server supplier finally agreed that there
might be a hardware problem, built an identical machine and moved our
RAID cluster to the new machine. The httpd server, which had been
crashing every day, ran without problems for a week, and we even
noticed decreased resource consumption and latency, lending credit to
the hardware problem theory.

Nevertheless, after this one-week honeymoon period, the httpd crashes
came back at the usual rate of one crash a day. I then tried removing
an old PHP Xcache module, which had been outdated for years, but this
did not solve the problem. I tried upgrading all the PHP 5.3
components, i.e. the Apache module and the installed php* packages, to
version 5.3.13-1~dotdeb.0, but our code quickly broke with this
version, and I was forced to downgrade the PHP packages to the previous
5.3.3-7+squeeze9.

Please find attached PHP and Apache versions and config files.

I hope that I have given all the necessary information. If anything is
missing, just ask me and I will provide it, as far as possible.

I know that the problem may look trivial to somebody who knows the
ins and outs of Apache httpd, but it has been present for months. We
have already checked our code but, without a decent backtrace, it is
impossible to find the bug at the source of the crashes, and it is even
worse if the problem comes from an Apache httpd or PHP bug.

Thank you in advance for your help.

Regards.