You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by Nick I <ni...@gmail.com> on 2014/04/22 21:31:57 UTC

high cpu load

Hi,

I use SpamAssassin version 3.3.1 running on Perl version 5.10.1,
amavisd-new-2.8.0-8.el6 as before-queue filter.
Today for unknown reason i noticed high load on my server. Mail flow is as
usual. About 8k in hour checked by amavisd.

Here is timing from amavis. tests_pri_0 is around 90% all the time :
amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 (0.3%),
extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 (0.8%),
tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%), tests_pri_-900:
1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), *tests_pri_0: 668 (88.0%)*,
check_dkim_signature: 9 (1.2%), check_spf: 364 (47.9%), poll_dns_idle: 338
(44.5%), tests_pri_500: 7 (1.0%), get_report: 0.99 (0.1%)

In my another test i see async completed from 0.016 s till 0.174 s.

How can i disable/speed up tests_pri_0 ?
What is inside tests_pri_0?
Thanks.

Re: high cpu load

Posted by Axb <ax...@gmail.com>.
On 04/24/2014 04:28 PM, John Hardin wrote:
> On Thu, 24 Apr 2014, Axb wrote:
>
>> On 04/24/2014 04:16 PM, John Hardin wrote:
>>>  On Thu, 24 Apr 2014, Nick I wrote:
>>>
>>> >  Finally i found message caused high load.
>>> > >  It looks like one message sent all the time from ticket system.
>>> >  Message size is ~4M. We scan messages with this size in amavis.
>>> > >  Content of message is complex and has multiple items
>>> >  Content-Type: image/gif
>>> >  Content-Transfer-Encoding: base64
>>> >  Content-Type: application/pdf
>>> > >  Results from debug, with % > 1:
>>> > dbg: rules:  timing: Total time: 131.6748 s
>>> > dbg: rules:  [...] rulename ovl(s) max(s) #run %tot
>>> > dbg: rules:  [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>>> > dbg: rules:  [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>>> > dbg: rules:  [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1
>>> 7.66%
>>> > dbg: rules:  [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>>> > dbg: rules:  [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>>> > dbg: rules:  [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>>>
>>>  That's not too surprising if the content is 4MB.
>>>
>>>  Would you be willing to share it with me so that I can try to find the
>>>  problem with the FILLFORM rules?
>>>
>>>  Alternatively, you might want to configure your system to not scan
>>> mails
>>>  from the ticket system (which I assume is internal and trusted).
>>
>> Why does this smell like replace_tag noise? (I hate that stuff :)
>
> The FILLFORM rules are complex and inherently repetitive - there's
> really no other way to detect a form.
>
> They've had some minor problems with boundedness in the past, but I
> thought I had fixed them. Apparently here's another bad case.

seems like a LOT of work going on for the results :)

score FILL_THIS_FORM                        1.456 0.001 1.456 0.001

I'd +1 to start from scratch with simpler conditions


Re: high cpu load

Posted by John Hardin <jh...@impsec.org>.
On Thu, 24 Apr 2014, Axb wrote:

> On 04/24/2014 04:16 PM, John Hardin wrote:
>>  On Thu, 24 Apr 2014, Nick I wrote:
>> 
>> >  Finally i found message caused high load.
>> > 
>> >  It looks like one message sent all the time from ticket system.
>> >  Message size is ~4M. We scan messages with this size in amavis.
>> > 
>> >  Content of message is complex and has multiple items
>> >  Content-Type: image/gif
>> >  Content-Transfer-Encoding: base64
>> >  Content-Type: application/pdf
>> > 
>> >  Results from debug, with % > 1:
>> > dbg: rules:  timing: Total time: 131.6748 s
>> > dbg: rules:  [...] rulename ovl(s) max(s) #run %tot
>> > dbg: rules:  [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>> > dbg: rules:  [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>> > dbg: rules:  [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
>> > dbg: rules:  [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>> > dbg: rules:  [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>> > dbg: rules:  [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>>
>>  That's not too surprising if the content is 4MB.
>>
>>  Would you be willing to share it with me so that I can try to find the
>>  problem with the FILLFORM rules?
>>
>>  Alternatively, you might want to configure your system to not scan mails
>>  from the ticket system (which I assume is internal and trusted).
>
> Why does this smell like replace_tag noise? (I hate that stuff :)

The FILLFORM rules are complex and inherently repetitive - there's really 
no other way to detect a form.

They've had some minor problems with boundedness in the past, but I 
thought I had fixed them. Apparently here's another bad case.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Yet another example of a Mexican doing a job Americans are
   unwilling to do.   -- Reno Sepulveda, on UniVision reporters asking
                         President Obama some pointed questions about
                         the BATFE Fast and Furious scandal.
-----------------------------------------------------------------------
  693 days since the first successful private support mission to ISS (SpaceX)

Re: high cpu load

Posted by Axb <ax...@gmail.com>.
On 04/24/2014 04:16 PM, John Hardin wrote:
> On Thu, 24 Apr 2014, Nick I wrote:
>
>> Finally i found message caused high load.
>>
>> It looks like one message sent all the time from ticket system.
>> Message size is ~4M. We scan messages with this size in amavis.
>>
>> Content of message is complex and has multiple items
>> Content-Type: image/gif
>> Content-Transfer-Encoding: base64
>> Content-Type: application/pdf
>>
>> Results from debug, with % > 1:
>> dbg: rules: timing: Total time: 131.6748 s
>> dbg: rules: [...] rulename ovl(s) max(s) #run %tot
>> dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>> dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>> dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
>> dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>> dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>> dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>
> That's not too surprising if the content is 4MB.
>
> Would you be willing to share it with me so that I can try to find the
> problem with the FILLFORM rules?
>
> Alternatively, you might want to configure your system to not scan mails
> from the ticket system (which I assume is internal and trusted).

Why does this smell like replace_tag noise? (I hate that stuff :)


Re: high cpu load

Posted by John Hardin <jh...@impsec.org>.
On Thu, 24 Apr 2014, Nick I wrote:

> Finally i found message caused high load.
>
> It looks like one message sent all the time from ticket system.
> Message size is ~4M. We scan messages with this size in amavis.
>
> Content of message is complex and has multiple items
> Content-Type: image/gif
> Content-Transfer-Encoding: base64
> Content-Type: application/pdf
>
> Results from debug, with % > 1:
> dbg: rules: timing: Total time: 131.6748 s
> dbg: rules: [...] rulename ovl(s) max(s) #run %tot
> dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
> dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
> dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
> dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
> dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
> dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%

That's not too surprising if the content is 4MB.

Would you be willing to share it with me so that I can try to find the 
problem with the FILLFORM rules?

Alternatively, you might want to configure your system to not scan mails 
from the ticket system (which I assume is internal and trusted).

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Yet another example of a Mexican doing a job Americans are
   unwilling to do.   -- Reno Sepulveda, on UniVision reporters asking
                         President Obama some pointed questions about
                         the BATFE Fast and Furious scandal.
-----------------------------------------------------------------------
  693 days since the first successful private support mission to ISS (SpaceX)

Re: high cpu load

Posted by Nick I <ni...@gmail.com>.
Finally i found message caused high load.

It looks like one message sent all the time from ticket system.
Message size is ~4M. We scan messages with this size in amavis.

Content of message is complex and has multiple items
Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Type: application/pdf

Results from debug, with % > 1:
 dbg: rules: timing: Total time: 131.6748 s
 dbg: rules: [...] rulename ovl(s) max(s) #run %tot
 dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
 dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
 dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
 dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
 dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
 dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%


 1.8 FUZZY_XPILL            BODY: Attempt to obfuscate words in spam
 0.0 FUZZY_CPILL            BODY: Attempt to obfuscate words in spam
 0.5 FUZZY_VPILL            BODY: Attempt to obfuscate words in spam
 0.8 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image area
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 LOTS_OF_MONEY          Huge... sums of money

Thanks all for the help!


2014-04-24 1:39 GMT+03:00 John Hardin <jh...@impsec.org>:

> On Wed, 23 Apr 2014, Nick I wrote:
>
>  Another interesting thing. Today when daily cron executed at 5 am load
>> calmed to ~0. As it was before. Sa-update executed at that time.
>> Amavisd has been reloaded at 7 am and load raised back again.
>> Also i see that some messages checked 150329 ms, 158742 ms. But most
>> messages checked ~400ms.
>>
>> I used @debug_recipient_maps and sa_debug but did not see any userful
>> info.
>> Can anyone suggest how to look inside tests_pri_0 ?
>>
>
> The first thing you need to do is capture one of the messages that took a
> very long time to scan, so that it can be tested in a controlled
> environment. There are tools that will allow you to capture timing data for
> every rule, and if the message is a spam you could provide it to us for
> analysis.
>
> --
>  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
>  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>  Today: Max Planck's 156th birthday
>

Re: high cpu load

Posted by John Hardin <jh...@impsec.org>.
On Wed, 23 Apr 2014, Nick I wrote:

> Another interesting thing. Today when daily cron executed at 5 am load
> calmed to ~0. As it was before. Sa-update executed at that time.
> Amavisd has been reloaded at 7 am and load raised back again.
> Also i see that some messages checked 150329 ms, 158742 ms. But most
> messages checked ~400ms.
>
> I used @debug_recipient_maps and sa_debug but did not see any userful info.
> Can anyone suggest how to look inside tests_pri_0 ?

The first thing you need to do is capture one of the messages that took a 
very long time to scan, so that it can be tested in a controlled 
environment. There are tools that will allow you to capture timing data 
for every rule, and if the message is a spam you could provide it to us 
for analysis.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Today: Max Planck's 156th birthday

Re: high cpu load

Posted by Benny Pedersen <me...@junc.eu>.
Nick I skrev den 2014-04-23 19:14:

> Can anyone suggest how to look inside tests_pri_0 ?

disable BodyEVAL plugin as a first step, if load is not low then its not 
that plugin that causing its problem

to take it all, one could test with just check plugin enabled, and add 
one by one to see with plugin the time killer is in

also it was a thing i remember some mails took extramly long scan here, 
it was at that time a slow regex that was not bounded, so it scanned one 
big file without limits

show spamassassin 2>&1 -D --lint so maillist see if and what perl you 
have installed and there most other her knows more in detail where to 
search more for solving

there should not be a amavisd problem

Re: high cpu load

Posted by Nick I <ni...@gmail.com>.
Another interesting thing. Today when daily cron executed at 5 am load
calmed to ~0. As it was before. Sa-update executed at that time.
Amavisd has been reloaded at 7 am and load raised back again.
Also i see that some messages checked 150329 ms, 158742 ms. But most
messages checked ~400ms.

I used @debug_recipient_maps and sa_debug but did not see any userful info.
Can anyone suggest how to look inside tests_pri_0 ?


2014-04-23 14:49 GMT+03:00 RW <rw...@googlemail.com>:

> On Wed, 23 Apr 2014 02:43:25 +0200
> Karsten Bräckelmann wrote:
>
> > On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> > >
> > > Here is timing from amavis. tests_pri_0 is around 90% all the time :
> > > amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92
> > > (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6
> >
> I'm not entirely sure about that amavis log-line, but it most likely
> > maps to the SA priorities rules are run at.
>
>
>
> The text after "TIMING-SA" looks to be SA's own _TIMING_ tag.
>
>
>
>

Re: high cpu load

Posted by RW <rw...@googlemail.com>.
On Wed, 23 Apr 2014 02:43:25 +0200
Karsten Bräckelmann wrote:

> On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> >
> > Here is timing from amavis. tests_pri_0 is around 90% all the time :
> > amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92
> > (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6
>
I'm not entirely sure about that amavis log-line, but it most likely
> maps to the SA priorities rules are run at.



The text after "TIMING-SA" looks to be SA's own _TIMING_ tag.




Re: high cpu load

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> I use SpamAssassin version 3.3.1 running on Perl version 5.10.1,
> amavisd-new-2.8.0-8.el6 as before-queue filter.
> Today for unknown reason i noticed high load on my server. Mail flow
> is as usual. About 8k in hour checked by amavisd.
> 
> Here is timing from amavis. tests_pri_0 is around 90% all the time :
> amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 (0.3%),
> extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 (0.8%),
> tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%),
> tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), tests_pri_0:
> 668 (88.0%), check_dkim_signature: 9 (1.2%), check_spf: 364 (47.9%),
> poll_dns_idle: 338 (44.5%), tests_pri_500: 7 (1.0%), get_report: 0.99
> (0.1%)
> 
> In my another test i see async completed from 0.016 s till 0.174 s.
> 
> How can i disable/speed up tests_pri_0 ?
> What is inside tests_pri_0?

The letter question better be raised before asking how to *disable*
something... ;)

I'm not entirely sure about that amavis log-line, but it most likely
maps to the SA priorities rules are run at. Lower (including negative)
priorities are run prior to higher ones, with 0 being the default
priority level. Thus, this also is the priority of the bulk of the regex
based, CPU-bound general rules.

(a) Check your logs, if there is an sa-update run with a close timely
relation to the increase of CPU load. Both SA update channels, as well
as any third-party update channel (if any).

(b) Carefully review all your custom rules. Recent changes need special
attention.

Even if (in both cases) no changes correlate with the time of load
increase, it still might be a regex based rule in backtrace hell -- now
tripping on a pattern not observed in previous mail flow.


FWIW, custom rules with runaway regex patterns usually is the culprit
for such behavior. Stock SA update channel rules are unlikely, unless
others observe the same load increase.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: high cpu load

Posted by Adam Moffett <ad...@plexicomm.net>.
Have you tried compiling the rules with sa-compile.  It speeds up 
everything.
I'm afraid I don't know the answer to what that specific test does.

> Hi,
>
> I use SpamAssassin version 3.3.1 running on Perl version 5.10.1, 
> amavisd-new-2.8.0-8.el6 as before-queue filter.
> Today for unknown reason i noticed high load on my server. Mail flow 
> is as usual. About 8k in hour checked by amavisd.
>
> Here is timing from amavis. tests_pri_0 is around 90% all the time :*
> *amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 
> (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 
> (0.8%), tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%), 
> tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), 
> *tests_pri_0: 668 (88.0%)*, check_dkim_signature: 9 (1.2%), check_spf: 
> 364 (47.9%), poll_dns_idle: 338 (44.5%), tests_pri_500: 7 (1.0%), 
> get_report: 0.99 (0.1%)
>
> In my another test i see async completed from 0.016 s till 0.174 s.
>
> How can i disable/speed up tests_pri_0 ?
> What is inside tests_pri_0?
> Thanks.