Posted to users@spamassassin.apache.org by Peter Farrell <pe...@gmail.com> on 2007/06/18 15:39:32 UTC

Troubleshooting SA: regex & time_t 3 min delays

Hi all.

I was trying to shave down the 7+ minutes it takes for
Postfix/amavisd/SA to process a single message today <ahem> and
wondered about the two biggest choke points I could identify.

*feeding a test message to spamassassin:
# su - vscan -c 'spamassassin -D <sample-nonspam.txt 2>&1' | timestamp

** versions:
FC4
SpamAssassin version 3.1.7
running on Perl version 5.8.3
amavisd-new-2.4.3

There is a roughly 3-minute delay at each of two points: one while
processing the regex rules and one at a step called 'time_t'.

Any advice or links to push me in the right direction? Is it normal?

Thanks.
-Peter

REGEX
=============================
13:57:28.380 65.608 0.002 [12521] dbg: config: adding redirector
regex: m'^http:/*(?:\w+\.)?google(?:\.\w{2,3}){1,2}/translate\?.*?(?<=[?&])u=(.*?)(?:$|[&#])'i

14:00:02.774 220.003 154.395 [12521] dbg: plugin:
Mail::SpamAssassin::Plugin::ReplaceTags=HASH(0xa8b0720) implements
'finish_parsing_end'
=============================

TIME_T
=============================
14:00:17.937 235.166 0.000 [12521] dbg: eval: time_t from
date=987801124, rcvd= 20 Apr 2001 17:12:04 -0400
14:03:17.105 414.333 179.168 [12521] dbg: eval: all '*To' addrs:
foo@foo.com tbtf@facteur.std.com tbtf@world.std.com
=============================
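
The third column printed by timestamp in the excerpts above appears to be the time since the previous line (220.003 - 65.608 = 154.395), so a large value there marks work done after the preceding debug line. A minimal Python sketch, assuming that three-column prefix, for pulling the slow steps out of a long -D log; the function name find_gaps and the 10-second threshold are illustrative and not part of SpamAssassin:

```python
import re

# Prefix as seen in the excerpts: wall-clock time, seconds since start,
# seconds since the previous line, then the debug message itself.
STAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(.*)$")

def find_gaps(lines, threshold=10.0):
    """Return (delta, previous_message, current_message), slowest first."""
    entries = []  # each entry is [delta_seconds, message_text]
    for raw in lines:
        line = raw.rstrip("\n")
        m = STAMP.match(line)
        if m:
            entries.append([float(m.group(3)), m.group(4)])
        elif entries and line.strip():
            # Long debug lines get wrapped in mail; rejoin the continuation.
            entries[-1][1] += " " + line.strip()
    gaps = []
    for i, (delta, message) in enumerate(entries):
        if delta >= threshold:
            previous = entries[i - 1][1] if i else ""
            gaps.append((delta, previous, message))
    return sorted(gaps, key=lambda g: g[0], reverse=True)
```

Calling find_gaps(open('sa-debug.log')) on a saved log (the file name is hypothetical) lists the slowest steps first, each paired with the line that was logged just before the pause.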

Re: Troubleshooting SA: regex & time_t 3 min delays

Posted by Mark Martinec <Ma...@ijs.si>.
Peter,

> I blew away SA today and am re-installing via CPAN - I think it may be
> something to do w/ my Perl installation as a whole... Plausible???

Can't say, my first suspects would be DNS resolver or complex regexps.
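
One way to check the resolver suspicion is to time a few lookups from the scanning host. A minimal Python sketch using only the standard socket module; the function name time_lookup is made up for illustration, and which host names to try is left open:

```python
import socket
import time

def time_lookup(hostname):
    """Resolve hostname and return (sorted_addresses, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        # Resolution failed; the elapsed time is still worth looking at.
        addresses = []
    return addresses, time.perf_counter() - start
```

A lookup that takes many seconds, or that fails only after a long wait, would point at the resolver configuration rather than at the rules.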

> I've reinstalled 3 times w/ the same appalling results 10-15 minute
> scanning... the SA and Amavis builds are by the book! Plus I've got
> other working machines that provide the basis of the limited
> configuration options... I'm just about at the end of my tether...

Try the following patch (adds some debug logging) and repeat
your exercise with:
  su vscan -c 'spamassassin -t -D <test.msg' 2>&1 | timestamp



--- Mail/SpamAssassin/Plugin/Check.pm~	Fri Jun  8 14:55:28 2007
+++ Mail/SpamAssassin/Plugin/Check.pm	Wed Jun 13 18:23:59 2007
@@ -578,4 +578,5 @@
         }
       }
+      dbg("rules: finished run body rule '.$rulename.'");
       ';
     }
@@ -891,4 +892,7 @@
       $self->{test_log_msgs} = ();
     ';
+    $evalstr .= '
+      dbg("rules: about to run eval rule $rulename");
+    '  if would_log('dbg');
  
     # only need to set current_rule_name for plugin evals



Is the message suspicious in any way (like: very long,
or many addresses in a mail header, ...)?

  Mark

Re: Troubleshooting SA: regex & time_t 3 min delays

Posted by Peter Farrell <pe...@gmail.com>.
Thanks for the response - unfortunately - there aren't any local, custom rules.
I even removed all of the RulesDuJour whilst testing.

I blew away SA today and am re-installing via CPAN - I think it may be
something to do w/ my Perl installation as a whole... Plausible???
I've reinstalled 3 times w/ the same appalling results 10-15 minute
scanning... the SA and Amavis builds are by the book! Plus I've got
other working machines that provide the basis of the limited
configuration options... I'm just about at the end of my tether...

I remember when I was settling dependencies for Amavisd, I had lots of
problems w/ Math::Pari, bignum, all the RSA stuff and did a few
'forced' installs in the build directory. I've been fighting w/ these
machines for 3 weeks now and it's the only variable that I've not
explored...

RE: the 66mhz - no, it's a PowerEdge PIII w/ 512 of RAM - all it does
is filter with SA, act as a backup Squid proxy, an infrequent SSL Apache
pass-through and backup MX.

 Thanks again, all the best.

 -Peter Farrell

> On 18/06/07, Loren Wilton <lw...@earthlink.net> wrote:
> > Three minutes for regex processing is very much NOT normal, unless you are
> > running on a 66mhz box or the like.
> >
> > First question: are you thrashing?  That is the number one reason for slow
> > SA processing, you have run out of memory for one reason or another.
> >
> > If the 3 minutes is CPU time and you aren't thrashing, you have a bad regex
> > that is getting looped up.  Probably something with a number of *'s and
> > backtracking in it.  While it is possible this could be a release or SARE
> > rule that has found some creative way to fail on your system, I would be
> > more inclined to suspect a locally-crafted rule.
> >
> > There is some technique that can be used to time the individual rules, but
> > I'm not sure what it is.
> >
> >         Loren

Re: Troubleshooting SA: regex & time_t 3 min delays

Posted by Loren Wilton <lw...@earthlink.net>.
Three minutes for regex processing is very much NOT normal, unless you are 
running on a 66mhz box or the like.

First question: are you thrashing?  That is the number one reason for slow 
SA processing, you have run out of memory for one reason or another.
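
On Linux, one rough way to check for thrashing is to compare the cumulative swap counters before and after a test scan. A minimal Python sketch, assuming the kernel exposes pswpin and pswpout in /proc/vmstat (2.6-era kernels do); the helper names are illustrative:

```python
def swap_counters(vmstat_text):
    """Extract cumulative swap-in/swap-out page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters

def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {key: after.get(key, 0) - before.get(key, 0)
            for key in ("pswpin", "pswpout")}
```

Read /proc/vmstat once before the spamassassin run and once after, pass both texts through swap_counters, and look at swap_delta: counters that climb by thousands of pages during a single scan suggest the box is swapping, while a delta near zero points back at CPU time.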

If the 3 minutes is CPU time and you aren't thrashing, you have a bad regex 
that is getting looped up.  Probably something with a number of *'s and 
backtracking in it.  While it is possible this could be a release or SARE 
rule that has found some creative way to fail on your system, I would be 
more inclined to suspect a locally-crafted rule.
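
To illustrate the kind of backtracking meant here, a small Python sketch follows. It uses Python's re module rather than the Perl engine that SA actually runs on (Perl may well optimize this particular case), and the two patterns are textbook examples, not real SA rules:

```python
import re
import time

def time_search(pattern, text):
    """Return (match_or_None, elapsed_seconds) for one regex search."""
    compiled = re.compile(pattern)
    start = time.perf_counter()
    result = compiled.search(text)
    return result, time.perf_counter() - start

# Nested quantifiers: before giving up, the engine tries every way of
# splitting the run of "a" characters between the inner and outer "+".
RISKY = r"^(a+)+$"
# Matches the same strings, but there is only one way to match them.
SAFE = r"^a+$"

def compare(n):
    """Time both patterns on n "a" characters followed by a mismatch."""
    text = "a" * n + "!"  # the trailing "!" forces the overall match to fail
    _, risky_seconds = time_search(RISKY, text)
    _, safe_seconds = time_search(SAFE, text)
    return risky_seconds, safe_seconds
```

With Python's engine the time for RISKY roughly doubles with each extra character, so input that is only a few dozen characters long can keep it busy for minutes, while SAFE stays fast.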

There is some technique that can be used to time the individual rules, but 
I'm not sure what it is.
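
Whatever SA itself offers for this, the general idea of timing rules one at a time can be sketched outside SA. A minimal Python version, assuming the rules are available as plain name/pattern pairs; the rule names used with it would be hypothetical, and Python's regex engine will not behave exactly like Perl's:

```python
import re
import time

def time_rules(rules, text):
    """Time each named pattern against text; return slowest first.

    rules maps a rule name to a regex string. Each result is a tuple
    (name, elapsed_seconds, matched).
    """
    timings = []
    for name, pattern in rules.items():
        compiled = re.compile(pattern)
        start = time.perf_counter()
        matched = compiled.search(text) is not None
        timings.append((name, time.perf_counter() - start, matched))
    return sorted(timings, key=lambda t: t[1], reverse=True)
```

Run against the body of the slow test message, the first entries of the result are the patterns worth reading closely.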

        Loren