You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by Munroe Sollog <so...@digiraticonsulting.com> on 2012/01/22 22:49:18 UTC

Identifying actual performance on rules

I am trying to locate reason for performance spikes.  I have read the various wiki pages, and they suggest solutions but not a way to identify the bottleneck.  Is there way to increase logging so that I can begin to identify or rule out the actual performance bottlenecks?


Munroe Sollog
Digirati Consulting
sollog@digiraticonsulting.com
(610) 332-7234 x805


Re: Identifying actual performance on rules

Posted by John Hardin <jh...@impsec.org>.
On Tue, 24 Jan 2012, Karsten Bräckelmann wrote:

> On Mon, 2012-01-23 at 08:03 -0800, John Hardin wrote:
>> On Sun, 22 Jan 2012, Munroe Sollog wrote:
>>
>>> I am trying to locate reason for performance spikes.  I have read the
>>> various wiki pages, and they suggest solutions but not a way to identify
>>> the bottleneck.  Is there way to increase logging so that I can begin to
>>> identify or rule out the actual performance bottlenecks?
>
> Using debugging as mentioned by John will definitely be helpful to
> pinpoint a specific sub-system or rule causing higher load. However,
> this pretty much requires sample messages causing the load -- basically
> implied by running the stand-alone 'spamassassin' script.
>
> Enabling debug mode on a production system should be done with care and
> while closely monitoring the system -- it will result in GOBS of debug
> info being logged. Mentioning this, since I understand the OP to not yet
> have samples, but generally noticing "performance spikes".
>
> Thus, step one should be to find some samples causing an increased load,
> and then to investigate further.

Yes, very good point. Thanks for clarifying that, and apologies on my part 
for assuming you had problematic messages already identified.

SA should log per-message scan times to the system log. From that you 
should be able to determine a message recipient and message-ID for 
messages that take a long time to scan.

You may also be experiencing inbound traffic volume spikes that may be 
overloading your system.

So, if you can track when the processing spikes occur, two things to 
correlate are the number of messages you're processing at that time (you 
may have too many spamd children defined, or not enough memory - are you 
hitting swap?), and any specific messages received at that time that take 
an unusually long time to scan (they may be exercising weaknesses in the 
rules).

One other thing: what version of SA are you using? Older versions have had 
rules that were known to perform poorly in certain situations and that 
have been fixed in newer versions.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Vista is at best mildly annoying and at worst makes you want to
   rush to Redmond, Wash. and rip somebody's liver out.      -- Forbes
-----------------------------------------------------------------------
  Today: John Moses Browning's 157th Birthday

Re: Identifying actual performance on rules

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Mon, 2012-01-23 at 08:03 -0800, John Hardin wrote:
> On Sun, 22 Jan 2012, Munroe Sollog wrote:
> 
> > I am trying to locate reason for performance spikes.  I have read the 
> > various wiki pages, and they suggest solutions but not a way to identify 
> > the bottleneck.  Is there way to increase logging so that I can begin to 
> > identify or rule out the actual performance bottlenecks?

Using debugging as mentioned by John will definitely be helpful to
pinpoint a specific sub-system or rule causing higher load. However,
this pretty much requires sample messages causing the load -- basically
implied by running the stand-alone 'spamassassin' script.

Enabling debug mode on a production system should be done with care and
while closely monitoring the system -- it will result in GOBS of debug
info being logged. Mentioning this, since I understand the OP to not yet
have samples, but generally noticing "performance spikes".

Thus, step one should be to find some samples causing an increased load,
and then to investigate further.


> Sure. Put this into a .pre file:
> 
>    loadplugin HitFreqsRuleTiming ./HitFreqsRuleTiming.pm
> 
> (set the path appropriately for where the .pm file actually lives)
> 
> And then run your SpamAssassin in debug mode, for example:
> 
>    spamassassin -L -t --debug area=all,rules < "$testmsg"
> 
> You should then see logging output like:
> 
>    Jan 13 06:27:45.701 [30309] dbg: rules: [...] rulename ovl(s) max(s) #run %tot
>    Jan 13 06:27:45.701 [30309] dbg: rules: [...] AWL 0.1862 0.1862 1 35.98%
>    Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_FULL 0.0485 0.0485 1 9.38%
>    Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_RAWBODY 0.0077 0.0077 1 1.48%
>    Jan 13 06:27:45.701 [30309] dbg: rules: [...] T_LONG_HEADER_LINE_80 0.0052 0.0052 1 1.01%

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Identifying actual performance on rules

Posted by John Hardin <jh...@impsec.org>.
On Sun, 22 Jan 2012, Munroe Sollog wrote:

> I am trying to locate reason for performance spikes.  I have read the 
> various wiki pages, and they suggest solutions but not a way to identify 
> the bottleneck.  Is there way to increase logging so that I can begin to 
> identify or rule out the actual performance bottlenecks?

Sure. Put this into a .pre file:

   loadplugin HitFreqsRuleTiming ./HitFreqsRuleTiming.pm

(set the path appropriately for where the .pm file actually lives)

And then run your SpamAssassin in debug mode, for example:

   spamassassin -L -t --debug area=all,rules < "$testmsg"

You should then see logging output like:

   Jan 13 06:27:45.701 [30309] dbg: rules: [...] rulename ovl(s) max(s) #run %tot
   Jan 13 06:27:45.701 [30309] dbg: rules: [...] AWL 0.1862 0.1862 1 35.98%
   Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_FULL 0.0485 0.0485 1 9.38%
   Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_RAWBODY 0.0077 0.0077 1 1.48%
   Jan 13 06:27:45.701 [30309] dbg: rules: [...] T_LONG_HEADER_LINE_80 0.0052 0.0052 1 1.01%


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Any time law enforcement becomes a revenue center, the system
   becomes corrupt.
-----------------------------------------------------------------------
  Today: John Moses Browning's 157th Birthday