Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2005/06/02 20:40:39 UTC

Re: 3.0.3 uses all CPUs after tie

can you repro this reliably?  if so, output from -D and/or an "strace
-f -p $spamdpid" would be helpful.

where does "tie" come in? (from the subj line).

--j.

Matthew Daubenspeck writes:
> I am using SpamAssassin 3.0.3 on a Gentoo AMD64 system with exim and
> exiscan. This has worked VERY well for months without a single issue.
> All of a sudden spamd eventually uses all of both CPUs and nearly
> locks the machine. I have tried downgrading to 3.0.2 with the same
> result. I have been using several of the RulesDuJour rule sets and first
> started to suspect those.
> 
> I removed all of the files from /etc/mail/spamassassin except for the
> following local.cf:
> 
> required_hits                   5
> skip_rbl_checks                 0
> use_bayes                       0
> score HELO_DYNAMIC_IPADDR       2
> score ALL_TRUSTED               0
> use_auto_whitelist              0
> 
> When spamd is running normally its processes look as such:
> 
> # ps aux | grep spamd
> root     29434  0.0  1.6  66712 33828 ?        Ss   21:13   0:00 /usr/sbin/spamd -d -r /var/run/spamd.pid -m 5 -c -H
> root     29442  0.1  1.8  69712 37152 ?        S    21:13   0:00 spamd child
> root     29443  0.0  1.7  68852 36300 ?        S    21:13   0:00 spamd child
> root     29444  0.0  1.7  68444 35904 ?        S    21:13   0:00 spamd child
> root     29445  0.0  1.7  68124 35584 ?        S    21:13   0:00 spamd child
> root     29446  0.0  1.7  68160 35600 ?        S    21:13   0:00 spamd child
> 
> When both CPUs are pegged at 100%, they look like this:
> 
> # ps aux | grep spamd
> root     10097  0.2  5.6 152336 117208 ?       Ss   10:32   0:06 /usr/sbin/spamd -d -r /var/run/spamd.pid -m 5 -c -H
> root     10378  0.9  6.8 176116 141012 ?       S    10:32   0:19 spamd child
> root     10379  1.0  6.6 170452 136024 ?       S    10:32   0:22 spamd child
> root     10380  0.9  6.8 174528 140080 ?       S    10:32   0:19 spamd child
> nobody   10381 27.1 38.0 818616 783476 ?       R    10:32   9:20 spamd child
> root     10382  0.7  6.4 167376 133004 ?       S    10:32   0:16 spamd child
> 
> I'm sure pasting that to a message screwed everything up, so you can
> also see them at http://daubnet.dyndns.org:3000/foo/spamassassin
> 
> For some reason, one of the processes switches from being owned by root
> to owned by nobody. Its state also changes from S to R. The only way I
> can clear this is by killing all spamd processes and restarting the
> service. I was initially using bayes, but thought that might have
> something to do with it so I disabled it. This made no change. 
> 
> I've tried everything I can think of but nothing makes any difference. I
> have searched the archives and can't seem to find a solution. I know the
> list has heard this a million times, but I have changed nothing as far
> as settings in months :)
> 
> Any suggestions?


Re: 3.0.3 uses all CPUs after tie

Posted by Thomas Jacob <ja...@internet24.de>.
> It randomly happens after an hour or so of use. Next time it happens I
> will try both and send it to the list.

To follow up on the Debian thread with the same problem:

Since this seems to be happening to several people over the last few days,
could it be that this is not in fact exim/exiscan related, but some sort of
bug in, or attack on, spamassassin/perl through spam containing certain
triggers, causing buffer overflows?

I tried analyzing our scanning logs a bit today, from the times when
the memory usage exploded, and there was nothing unusual about the
size or number of scanned mails.

Re: 3.0.3 uses all CPUs after tie

Posted by Matthew Daubenspeck <ma...@oddprocess.org>.
On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
> can you repro this reliably?  if so, output from -D and/or an "strace
> -f -p $spamdpid" would be helpful.

It randomly happens after an hour or so of use. Next time it happens I
will try both and send it to the list.

> where does "tie" come in? (from the subj line).

Whoops. That should have been time :)

Re: 3.0.3/4 uses all CPUs after tie (uuencoded attachments)?

Posted by Thomas Jacob <ja...@internet24.de>.
> Yes, a size limit is *required*.   It's very important to limit
> the size of messages scanned by SpamAssassin.

Well, we're limiting the size of emails that spamd sees now, and maybe
that will "solve" the problem. Of course it's generally sensible to
do this, as there isn't really much spam larger than, let's say, 250k.
But still, when scanning a single 10 MB mail makes the spamd process dealing
with that mail eat >2 gigabytes of main memory until all of it is exhausted,
that doesn't seem like "normal" program behaviour, does it?

What could it possibly do with that much memory for a 10 MB mail? ;)
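
The size limit discussed above can be illustrated with a minimal sketch. This
is not how exim, exiscan, or spamd implement it; the function name and the
MAX_SCAN_BYTES constant are hypothetical, and the threshold is simply the
"250k" figure mentioned in this message.

```python
# Hypothetical pre-filter: decide whether a message should be handed to spamd.
# MAX_SCAN_BYTES is an assumed threshold taken from the "250k" figure in the
# thread, not a value read from any real configuration.
MAX_SCAN_BYTES = 250 * 1024


def should_scan(raw_message: bytes, limit: int = MAX_SCAN_BYTES) -> bool:
    """Return True only if the message is small enough to be scanned."""
    return len(raw_message) <= limit
```

A mail transport would call something like should_scan() before passing the
message on, and deliver oversized mail unscanned. In practice the limit would
be set in the MTA or scanner configuration rather than in a separate script.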

Re: 3.0.3/4 uses all CPUs after tie (uuencoded attachments)?

Posted by Thomas Jacob <ja...@internet24.de>.
It seems that, for us at least, this is caused by SpamAssassin scanning
larger (>1 MB) mails containing uuencoded files, without MIME attachment
headers or anything.

But this only seems to happen sometimes or when spamd has been running
for a little while, for if we feed an email that appears to have caused
the memory problem into a restarted spamd, nothing happens.

When spamd chokes on such a mail, it slowly but constantly increases its
memory usage, eating up all the system's memory.

We haven't been using a size limit for exiscan/exim up till now, but
that can hardly be the root cause of the problem, for why would
spamd need gigabytes of memory when processing, let's say, a 10 MB
mail?
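
To check whether a suspect mail fits the pattern described above (uuencoded
data sitting directly in the body, with no MIME headers), a rough sketch like
the following could be run over the raw message text. It only looks for the
conventional "begin <mode> <name>" and "end" marker lines; the function name is
hypothetical and this is not how SpamAssassin itself parses messages.

```python
import re

# A uuencoded block conventionally starts with "begin <octal mode> <filename>"
# and finishes with a line containing only "end".
UU_BEGIN = re.compile(r"^begin [0-7]{3,4} \S.*$")


def find_uuencoded_blocks(text):
    """Return (begin_line, line_count) for each begin...end block in text.

    line_count is the number of lines between the "begin" and "end" markers.
    An unterminated block (no "end" line) is ignored.
    """
    blocks = []
    begin_line = None
    count = 0
    for line in text.splitlines():
        if begin_line is None:
            if UU_BEGIN.match(line):
                begin_line = line
                count = 0
        elif line == "end":
            blocks.append((begin_line, count))
            begin_line = None
        else:
            count += 1
    return blocks
```

A large line_count on a message with no MIME structure would make it a
candidate for the behaviour described in this message.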

Re: 3.0.3 uses all CPUs after tie

Posted by Michael Parker <pa...@pobox.com>.
Matthew Daubenspeck wrote:

>On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
>
>>can you repro this reliably? if so, output from -D and/or an "strace
>>-f -p $spamdpid" would be helpful.
>
>
>From top:
>
>28702 nobody 25 0 781m 714m 1796 R 99.9 35.5 4:11.72 spamd
>
>That's the "runaway process."
>
># strace -f -p 28702
>Process 28702 attached - interrupt to quit
>
>That's all it does. I never see anything else. It then continues to chew
>up both processors until I killall and restart spamd. If I kill just
>that PID, another spamd PID takes over and uses 100% CPU.
>
>About the only thing I can do is run a cron script that kills all of
>spamd and restarts it. However, that is a VERY ugly fix :)
>
>Thanks.
>
Exim? If so, are you limiting the size of msgs sent to spamd?

Michael


Re: 3.0.3 uses all CPUs after tie

Posted by Matthew Daubenspeck <ma...@oddprocess.org>.
On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
> can you repro this reliably?  if so, output from -D and/or an "strace
> -f -p $spamdpid" would be helpful.

From top:

28702 nobody    25   0  781m 714m 1796 R 99.9 35.5   4:11.72 spamd

That's the "runaway process."

# strace -f -p 28702
Process 28702 attached - interrupt to quit

That's all it does. I never see anything else. It then continues to chew
up both processors until I killall and restart spamd. If I kill just
that PID, another spamd PID takes over and uses 100% CPU.

About the only thing I can do is run a cron script that kills all of
spamd and restarts it. However, that is a VERY ugly fix :)

Thanks.
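
Short of a blanket cron restart, the runaway child can at least be picked out
automatically from "ps aux" output. The sketch below is only an illustration:
the function name and both thresholds are assumptions, it relies on the usual
11-column "ps aux" layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME
COMMAND), and it reports processes without killing anything.

```python
def find_runaway(ps_lines, max_rss_kb=500_000, max_cpu=20.0):
    """Return (pid, user, cpu, rss_kb) for spamd processes over a threshold.

    ps_lines: iterable of "ps aux" output lines.
    max_rss_kb and max_cpu are assumed limits, chosen only for illustration.
    """
    hits = []
    for line in ps_lines:
        # Split into at most 11 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or "spamd" not in fields[10]:
            continue
        user = fields[0]
        pid = int(fields[1])
        cpu = float(fields[2])
        rss_kb = int(fields[5])
        if rss_kb > max_rss_kb or cpu > max_cpu:
            hits.append((pid, user, cpu, rss_kb))
    return hits


# Two lines taken from the "pegged" ps listing earlier in the thread.
SAMPLE = [
    "root     10378  0.9  6.8 176116 141012 ?       S    10:32   0:19 spamd child",
    "nobody   10381 27.1 38.0 818616 783476 ?       R    10:32   9:20 spamd child",
]
```

Calling find_runaway(SAMPLE) flags only PID 10381, the child owned by nobody.
As noted above, killing just that PID did not help here, so this is useful for
logging which child and which message are involved rather than as a fix.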