Posted to modperl@perl.apache.org by Adam Worrall <ab...@yahoo.com> on 2001/10/12 09:47:54 UTC

Variant of Monday-morning bug with Apache::DBI + DBD::Oracle

Just thought I'd report on a puzzling bug ... for us it was caused by a
firewall, but I can imagine you'd get the same behaviour if packets to
your Oracle box started falling into a hole.

Symptom:
 Apache children hanging for almost exactly 12 minutes on DB
 transactions, usually early weekday mornings.

Underlying cause:
 A firewall was breaking idle TCP sessions, including connections
 between Apache & Oracle, causing packets to be 'mysteriously' dropped.

Problem:
 Apache::DBI's ping check worked fine, but when the dbh was ejected from
 the cache (and so went out of scope), something in the DESTROY stack
 was blocking, and holding the child up for 12m. I'm guessing the
 underlying DBD::Oracle code was trying to do a nice shutdown on the
 dbh, but obviously couldn't.

Quick hack:
 Tweak Apache::DBI to keep the ejected dbh in scope, in a global @array
 or something, and perform a daily/weekly restart on apache.

My guess at a proper solution:
 Some way of flagging the DBH as broken, so that underlying DBD::*
 drivers don't try to use it during a DESTROY call, or something.

Our solution:
 Reconfigure the firewall ;)
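
A minimal sketch of the quick hack above, not a patch against the real
Apache::DBI source. The names cached_handle, %cache and @parked are
made up for illustration, and the cache is a stand-in for Apache::DBI's
own. InactiveDestroy is a genuine DBI attribute that asks the driver to
skip its shutdown work in DESTROY; whether it actually avoids the hang
described here is an assumption that has not been tested.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Apache::DBI's per-child handle cache.
our %cache;
# Handles parked here stay referenced, so Perl does not run their
# DESTROY while the child keeps serving requests.
our @parked;

# $connect is a coderef returning a fresh handle, e.g.
#   sub { DBI->connect($dsn, $user, $pass) }
sub cached_handle {
    my ($key, $connect) = @_;
    my $dbh = $cache{$key};
    return $dbh if $dbh && eval { $dbh->ping };

    if ($dbh) {
        # Ask DBI to skip the driver-level shutdown when the handle is
        # eventually destroyed (assumption: this helps in this case).
        $dbh->{InactiveDestroy} = 1;
        # Park the dead handle instead of letting it go out of scope.
        push @parked, delete $cache{$key};
    }
    return $cache{$key} = $connect->();
}
```

The parked handles are still destroyed when the child finally exits,
which is why the hack is paired with a scheduled restart.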

 - Adam

Re: Variant of Monday-morning bug with Apache::DBI + DBD::Oracle

Posted by Perrin Harkins <pe...@elem.com>.

>     PH> Another solution is to have the child process exit if the ping
>     PH> fails. You get one failed request, but you clear out the messed
>     PH> up processes quickly and replace them with new ones that can
>     PH> connect safely.
>
> Yeah, good point. Although our poor little WAP service (for that is what
> it was for) gets so few hits, they'll all be getting failed if we do that ;>

Okay, try never letting the connections go idle.  Set up a cron to send in a
request every minute or so, or whatever it takes to keep your Oracle
connections active.
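
A rough sketch of such a keepalive script, assuming a cheap URL whose
handler runs a trivial query through the persistent handle. The URL,
the script path in the crontab line and the keepalive() helper are all
made up for illustration. Each request only exercises whichever child
answers it, so with several children a single request per minute may
not touch every connection; the count argument is there for that.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Fetch $url $count times; returns the number of successful responses.
# $http is optional and only exists so a stub can be passed in.
sub keepalive {
    my ($url, $count, $http) = @_;
    $http ||= HTTP::Tiny->new(timeout => 10);
    my $ok = 0;
    for (1 .. $count) {
        my $res = $http->get($url);
        $ok++ if $res->{success};
    }
    return $ok;
}

# Usage: keepalive.pl URL [COUNT]
# Example crontab entry (path and URL are assumptions):
#   * * * * * /usr/local/bin/keepalive.pl http://localhost/keepalive 5
keepalive($ARGV[0], $ARGV[1] || 1) if @ARGV;
```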
- Perrin

Re: Variant of Monday-morning bug with Apache::DBI + DBD::Oracle

Posted by Adam Worrall <ab...@yahoo.com>.

>>>>> "PH" == Perrin Harkins <pe...@elem.com> writes:

    PH> Are you loading the Oracle driver in the parent process (with
    PH> startup.pl)? I think I remember this sometimes causing problems
    PH> with re-connecting.

No, but we did hand load it in a module called from perl.conf with
PerlModule. We had a good reason for that, too (to do with the Oracle
driver overloading alarm(), so that the second alarm in

   alarm(20);
   $dbh = DBI->connect("dbi:Oracle:...", ...);
   alarm(0);

cleared a different type of alarm than the one the first call set ...)
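
For reference, the usual way to wrap the alarm calls so that a timeout
turns into an exception looks roughly like this. with_timeout is a
made-up helper name, and this is the generic Perl idiom, not anything
specific to DBD::Oracle. Given the alarm() interference described
above, and the fact that newer perls defer signal handlers until the
current operation returns, it may well not interrupt a connect that is
stuck inside the Oracle client library.

```perl
use strict;
use warnings;

# Run $code with a wall-clock limit of $seconds.
# Returns the code's scalar result, or dies with "timeout\n".
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        $result = $code->();
        alarm(0);
        1;
    };
    my $err = $@;
    alarm(0);    # make sure no alarm is left pending on any exit path
    die $err unless $ok;
    return $result;
}

# Usage, with the connect arguments elided as in the original:
#   my $dbh = with_timeout(20, sub { DBI->connect("dbi:Oracle:...", ...) });
```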

    PH> Another solution is to have the child process exit if the ping
    PH> fails. You get one failed request, but you clear out the messed
    PH> up processes quickly and replace them with new ones that can
    PH> connect safely.

Yeah, good point. Although our poor little WAP service (for that is what
it was for) gets so few hits, they'll all be getting failed if we do that ;>

 - Adam

Re: Variant of Monday-morning bug with Apache::DBI + DBD::Oracle

Posted by Perrin Harkins <pe...@elem.com>.

>  Apache::DBI's ping check worked fine, but when the dbh was ejected from
>  the cache (and so went out of scope), something in the DESTROY stack
>  was blocking, and holding the child up for 12m. I'm guessing the
>  underlying DBD::Oracle code was trying to do a nice shutdown on the
>  dbh, but obviously couldn't.
>
> Quick hack:
>  Tweak Apache::DBI to keep the ejected dbh in scope, in a global @array
>  or something, and perform a daily/weekly restart on apache.

Are you loading the Oracle driver in the parent process (with startup.pl)?
I think I remember this sometimes causing problems with re-connecting.

Another solution is to have the child process exit if the ping fails.  You
get one failed request, but you clear out the messed up processes quickly
and replace them with new ones that can connect safely.
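
A small sketch of that idea, assuming mod_perl 1.x, where the request
object provides child_terminate to end the child once the current
request has finished. check_or_retire is a made-up name and the caller
is expected to fail the request when it returns false. The stale handle
is still destroyed when the child exits, so this sketch does not address
whether the exit itself stalls.

```perl
use strict;
use warnings;

# $dbh: the cached database handle.
# $r:   the request object (Apache->request under mod_perl 1.x).
# Returns true if the handle answers a ping, false otherwise.
sub check_or_retire {
    my ($dbh, $r) = @_;
    return 1 if $dbh && eval { $dbh->ping };

    # Ask Apache to end this child after the current request; the
    # parent then starts a replacement that can connect afresh.
    $r->child_terminate;
    return 0;
}
```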

- Perrin