You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by Nick I <ni...@gmail.com> on 2014/04/22 21:31:57 UTC
high cpu load
Hi,
I use SpamAssassin version 3.3.1 running on Perl version 5.10.1,
amavisd-new-2.8.0-8.el6 as before-queue filter.
Today for unknown reason i noticed high load on my server. Mail flow is as
usual. About 8k in hour checked by amavisd.
Here is timing from amavis. tests_pri_0 is around 90% all the time :
amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 (0.3%),
extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 (0.8%),
tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%), tests_pri_-900:
1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), *tests_pri_0: 668 (88.0%)*,
check_dkim_signature: 9 (1.2%), check_spf: 364 (47.9%), poll_dns_idle: 338
(44.5%), tests_pri_500: 7 (1.0%), get_report: 0.99 (0.1%)
In my another test i see async completed from 0.016 s till 0.174 s.
How can i disable/speed up tests_pri_0 ?
What is inside tests_pri_0?
Thanks.
Re: high cpu load
Posted by Axb <ax...@gmail.com>.
On 04/24/2014 04:28 PM, John Hardin wrote:
> On Thu, 24 Apr 2014, Axb wrote:
>
>> On 04/24/2014 04:16 PM, John Hardin wrote:
>>> On Thu, 24 Apr 2014, Nick I wrote:
>>>
>>> > Finally i found message caused high load.
>>> > > It looks like one message sent all the time from ticket system.
>>> > Message size is ~4M. We scan messages with this size in amavis.
>>> > > Content of message is complex and has multiple items
>>> > Content-Type: image/gif
>>> > Content-Transfer-Encoding: base64
>>> > Content-Type: application/pdf
>>> > > Results from debug, with % > 1:
>>> > dbg: rules: timing: Total time: 131.6748 s
>>> > dbg: rules: [...] rulename ovl(s) max(s) #run %tot
>>> > dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>>> > dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>>> > dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1
>>> 7.66%
>>> > dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>>> > dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>>> > dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>>>
>>> That's not too surprising if the content is 4MB.
>>>
>>> Would you be willing to share it with me so that I can try to find the
>>> problem with the FILLFORM rules?
>>>
>>> Alternatively, you might want to configure your system to not scan
>>> mails
>>> from the ticket system (which I assume is internal and trusted).
>>
>> Why does this smell like replace_tag noise? (I hate that stuff :)
>
> The FILLFORM rules are complex and inherently repetitive - there's
> really no other way to detect a form.
>
> They've had some minor problems with boundedness in the past, but I
> thought I had fixed them. Apparently here's another bad case.
seems like a LOT of work going on for the results :)
score FILL_THIS_FORM 1.456 0.001 1.456 0.001
I'd +1 to start from scratch with simpler conditions
Re: high cpu load
Posted by John Hardin <jh...@impsec.org>.
On Thu, 24 Apr 2014, Axb wrote:
> On 04/24/2014 04:16 PM, John Hardin wrote:
>> On Thu, 24 Apr 2014, Nick I wrote:
>>
>> > Finally i found message caused high load.
>> >
>> > It looks like one message sent all the time from ticket system.
>> > Message size is ~4M. We scan messages with this size in amavis.
>> >
>> > Content of message is complex and has multiple items
>> > Content-Type: image/gif
>> > Content-Transfer-Encoding: base64
>> > Content-Type: application/pdf
>> >
>> > Results from debug, with % > 1:
>> > dbg: rules: timing: Total time: 131.6748 s
>> > dbg: rules: [...] rulename ovl(s) max(s) #run %tot
>> > dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>> > dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>> > dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
>> > dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>> > dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>> > dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>>
>> That's not too surprising if the content is 4MB.
>>
>> Would you be willing to share it with me so that I can try to find the
>> problem with the FILLFORM rules?
>>
>> Alternatively, you might want to configure your system to not scan mails
>> from the ticket system (which I assume is internal and trusted).
>
> Why does this smell like replace_tag noise? (I hate that stuff :)
The FILLFORM rules are complex and inherently repetitive - there's really
no other way to detect a form.
They've had some minor problems with boundedness in the past, but I
thought I had fixed them. Apparently here's another bad case.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Yet another example of a Mexican doing a job Americans are
unwilling to do. -- Reno Sepulveda, on UniVision reporters asking
President Obama some pointed questions about
the BATFE Fast and Furious scandal.
-----------------------------------------------------------------------
693 days since the first successful private support mission to ISS (SpaceX)
Re: high cpu load
Posted by Axb <ax...@gmail.com>.
On 04/24/2014 04:16 PM, John Hardin wrote:
> On Thu, 24 Apr 2014, Nick I wrote:
>
>> Finally i found message caused high load.
>>
>> It looks like one message sent all the time from ticket system.
>> Message size is ~4M. We scan messages with this size in amavis.
>>
>> Content of message is complex and has multiple items
>> Content-Type: image/gif
>> Content-Transfer-Encoding: base64
>> Content-Type: application/pdf
>>
>> Results from debug, with % > 1:
>> dbg: rules: timing: Total time: 131.6748 s
>> dbg: rules: [...] rulename ovl(s) max(s) #run %tot
>> dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
>> dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
>> dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
>> dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
>> dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
>> dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
>
> That's not too surprising if the content is 4MB.
>
> Would you be willing to share it with me so that I can try to find the
> problem with the FILLFORM rules?
>
> Alternatively, you might want to configure your system to not scan mails
> from the ticket system (which I assume is internal and trusted).
Why does this smell like replace_tag noise? (I hate that stuff :)
Re: high cpu load
Posted by John Hardin <jh...@impsec.org>.
On Thu, 24 Apr 2014, Nick I wrote:
> Finally i found message caused high load.
>
> It looks like one message sent all the time from ticket system.
> Message size is ~4M. We scan messages with this size in amavis.
>
> Content of message is complex and has multiple items
> Content-Type: image/gif
> Content-Transfer-Encoding: base64
> Content-Type: application/pdf
>
> Results from debug, with % > 1:
> dbg: rules: timing: Total time: 131.6748 s
> dbg: rules: [...] rulename ovl(s) max(s) #run %tot
> dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
> dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
> dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
> dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
> dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
> dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
That's not too surprising if the content is 4MB.
Would you be willing to share it with me so that I can try to find the
problem with the FILLFORM rules?
Alternatively, you might want to configure your system to not scan mails
from the ticket system (which I assume is internal and trusted).
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Yet another example of a Mexican doing a job Americans are
unwilling to do. -- Reno Sepulveda, on UniVision reporters asking
President Obama some pointed questions about
the BATFE Fast and Furious scandal.
-----------------------------------------------------------------------
693 days since the first successful private support mission to ISS (SpaceX)
Re: high cpu load
Posted by Nick I <ni...@gmail.com>.
Finally i found message caused high load.
It looks like one message sent all the time from ticket system.
Message size is ~4M. We scan messages with this size in amavis.
Content of message is complex and has multiple items
Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Type: application/pdf
Results from debug, with % > 1:
dbg: rules: timing: Total time: 131.6748 s
dbg: rules: [...] rulename ovl(s) max(s) #run %tot
dbg: rules: [...] __FILL_THIS_FORM_LONG2 26.3811 26.3811 1 20.04%
dbg: rules: [...] __FILL_THIS_FORM_SHORT2 26.3050 26.3050 1 19.98%
dbg: rules: [...] __FILL_THIS_FORM_FRAUD_PHISH1 10.0878 10.0878 1 7.66%
dbg: rules: [...] __FILL_THIS_FORM_LOAN1 7.2766 7.2766 1 5.53%
dbg: rules: [...] __FILL_THIS_FORM_SHORT1 2.3360 2.3360 1 1.77%
dbg: rules: [...] __FILL_THIS_FORM_LONG1 2.3051 2.3051 1 1.75%
1.8 FUZZY_XPILL BODY: Attempt to obfuscate words in spam
0.0 FUZZY_CPILL BODY: Attempt to obfuscate words in spam
0.5 FUZZY_VPILL BODY: Attempt to obfuscate words in spam
0.8 HTML_IMAGE_RATIO_02 BODY: HTML has a low ratio of text to image area
0.0 HTML_MESSAGE BODY: HTML included in message
0.0 LOTS_OF_MONEY Huge... sums of money
Thanks all for the help!
2014-04-24 1:39 GMT+03:00 John Hardin <jh...@impsec.org>:
> On Wed, 23 Apr 2014, Nick I wrote:
>
> Another interesting thing. Today when daily cron executed at 5 am load
>> calmed to ~0. As it was before. Sa-update executed at that time.
>> Amavisd has been reloaded at 7 am and load raised back again.
>> Also i see that some messages checked 150329 ms, 158742 ms. But most
>> messages checked ~400ms.
>>
>> I used @debug_recipient_maps and sa_debug but did not see any userful
>> info.
>> Can anyone suggest how to look inside tests_pri_0 ?
>>
>
> The first thing you need to do is capture one of the messages that took a
> very long time to scan, so that it can be tested in a controlled
> environment. There are tools that will allow you to capture timing data for
> every rule, and if the message is a spam you could provide it to us for
> analysis.
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
> Today: Max Planck's 156th birthday
>
Re: high cpu load
Posted by John Hardin <jh...@impsec.org>.
On Wed, 23 Apr 2014, Nick I wrote:
> Another interesting thing. Today when daily cron executed at 5 am load
> calmed to ~0. As it was before. Sa-update executed at that time.
> Amavisd has been reloaded at 7 am and load raised back again.
> Also i see that some messages checked 150329 ms, 158742 ms. But most
> messages checked ~400ms.
>
> I used @debug_recipient_maps and sa_debug but did not see any userful info.
> Can anyone suggest how to look inside tests_pri_0 ?
The first thing you need to do is capture one of the messages that took a
very long time to scan, so that it can be tested in a controlled
environment. There are tools that will allow you to capture timing data
for every rule, and if the message is a spam you could provide it to us
for analysis.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Today: Max Planck's 156th birthday
Re: high cpu load
Posted by Benny Pedersen <me...@junc.eu>.
Nick I skrev den 2014-04-23 19:14:
> Can anyone suggest how to look inside tests_pri_0 ?
disable BodyEVAL plugin as a first step, if load is not low then its not
that plugin that causing its problem
to take it all, one could test with just check plugin enabled, and add
one by one to see with plugin the time killer is in
also it was a thing i remember some mails took extramly long scan here,
it was at that time a slow regex that was not bounded, so it scanned one
big file without limits
show spamassassin 2>&1 -D --lint so maillist see if and what perl you
have installed and there most other her knows more in detail where to
search more for solving
there should not be a amavisd problem
Re: high cpu load
Posted by Nick I <ni...@gmail.com>.
Another interesting thing. Today when daily cron executed at 5 am load
calmed to ~0. As it was before. Sa-update executed at that time.
Amavisd has been reloaded at 7 am and load raised back again.
Also i see that some messages checked 150329 ms, 158742 ms. But most
messages checked ~400ms.
I used @debug_recipient_maps and sa_debug but did not see any userful info.
Can anyone suggest how to look inside tests_pri_0 ?
2014-04-23 14:49 GMT+03:00 RW <rw...@googlemail.com>:
> On Wed, 23 Apr 2014 02:43:25 +0200
> Karsten Bräckelmann wrote:
>
> > On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> > >
> > > Here is timing from amavis. tests_pri_0 is around 90% all the time :
> > > amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92
> > > (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6
> >
> I'm not entirely sure about that amavis log-line, but it most likely
> > maps to the SA priorities rules are run at.
>
>
>
> The text after "TIMING-SA" looks to be SA's own _TIMING_ tag.
>
>
>
>
Re: high cpu load
Posted by RW <rw...@googlemail.com>.
On Wed, 23 Apr 2014 02:43:25 +0200
Karsten Bräckelmann wrote:
> On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> >
> > Here is timing from amavis. tests_pri_0 is around 90% all the time :
> > amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92
> > (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6
>
I'm not entirely sure about that amavis log-line, but it most likely
> maps to the SA priorities rules are run at.
The text after "TIMING-SA" looks to be SA's own _TIMING_ tag.
Re: high cpu load
Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2014-04-22 at 22:31 +0300, Nick I wrote:
> I use SpamAssassin version 3.3.1 running on Perl version 5.10.1,
> amavisd-new-2.8.0-8.el6 as before-queue filter.
> Today for unknown reason i noticed high load on my server. Mail flow
> is as usual. About 8k in hour checked by amavisd.
>
> Here is timing from amavis. tests_pri_0 is around 90% all the time :
> amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 (0.3%),
> extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 (0.8%),
> tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%),
> tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), tests_pri_0:
> 668 (88.0%), check_dkim_signature: 9 (1.2%), check_spf: 364 (47.9%),
> poll_dns_idle: 338 (44.5%), tests_pri_500: 7 (1.0%), get_report: 0.99
> (0.1%)
>
> In my another test i see async completed from 0.016 s till 0.174 s.
>
> How can i disable/speed up tests_pri_0 ?
> What is inside tests_pri_0?
The letter question better be raised before asking how to *disable*
something... ;)
I'm not entirely sure about that amavis log-line, but it most likely
maps to the SA priorities rules are run at. Lower (including negative)
priorities are run prior to higher ones, with 0 being the default
priority level. Thus, this also is the priority of the bulk of the regex
based, CPU-bound general rules.
(a) Check your logs, if there is an sa-update run with a close timely
relation to the increase of CPU load. Both SA update channels, as well
as any third-party update channel (if any).
(b) Carefully review all your custom rules. Recent changes need special
attention.
Even if (in both cases) no changes correlate with the time of load
increase, it still might be a regex based rule in backtrace hell -- now
tripping on a pattern not observed in previous mail flow.
FWIW, custom rules with runaway regex patterns usually is the culprit
for such behavior. Stock SA update channel rules are unlikely, unless
others observe the same load increase.
--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: high cpu load
Posted by Adam Moffett <ad...@plexicomm.net>.
Have you tried compiling the rules with sa-compile. It speeds up
everything.
I'm afraid I don't know the answer to what that specific test does.
> Hi,
>
> I use SpamAssassin version 3.3.1 running on Perl version 5.10.1,
> amavisd-new-2.8.0-8.el6 as before-queue filter.
> Today for unknown reason i noticed high load on my server. Mail flow
> is as usual. About 8k in hour checked by amavisd.
>
> Here is timing from amavis. tests_pri_0 is around 90% all the time :*
> *amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92
> (0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6
> (0.8%), tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%),
> tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%),
> *tests_pri_0: 668 (88.0%)*, check_dkim_signature: 9 (1.2%), check_spf:
> 364 (47.9%), poll_dns_idle: 338 (44.5%), tests_pri_500: 7 (1.0%),
> get_report: 0.99 (0.1%)
>
> In my another test i see async completed from 0.016 s till 0.174 s.
>
> How can i disable/speed up tests_pri_0 ?
> What is inside tests_pri_0?
> Thanks.