You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by Stefano Catani <ca...@units.it> on 2005/01/18 11:56:22 UTC

spamassassin process a single message for 10 minutes !

here is the message:
http://mail.units.it/6474

it contains a lot of email addresses and stops our mailserver

these are the times on a dual PIII 1GHz (SpamAssassin 3.0.2)

time spamc < 6474
real    9m59.995s
user    0m0.000s
sys     0m0.000s


any suggestions ???

Thanks

Stefano


Re: spamassassin process a single message for 10 minutes !

Posted by Loren Wilton <lw...@earthlink.net>.
I seem to remember there was a problem or design feature where something
like Net::DNS was using about 4 file handles per URL that it looked up.
Must have been about 250 URLs in that spam, at a guess.

Personally, I'd be inclined to submit a bug.  :-)

        Loren

----- Original Message ----- 
From: "Christian Recktenwald" <sp...@citecs.de>
To: "Stefano Catani" <ca...@units.it>
Cc: <us...@spamassassin.apache.org>
Sent: Tuesday, January 18, 2005 8:48 AM
Subject: Re: spamassassin process a single message for 10 minutes !


> On Tue, Jan 18, 2005 at 10:56:22AM +0000, Stefano Catani wrote:
> > here is the message:
> > http://mail.units.it/6474
> >
> > it contains a lot of email addresses and stops our mailserver
> >
> > these are the times on a dual PIII 1GHz (SpamAssassin 3.0.2)
> >
> > time spamc < 6474
> > real    9m59.995s
> > user    0m0.000s
> > sys     0m0.000s
>
> similar result here:
>
> real    10m0.067s
> user    0m0.010s
> sys     0m0.000s
>
> single PIII 1GHz 750MB SA 3.0.0
>
> spamd (according to top) does not eat significantly CPU.
> I called strace on the spamd process:
>
> ...
> select(0, NULL, NULL, NULL, {1, 20000}) = 0 (Timeout)
> open("/etc/protocols", O_RDONLY)        = -1 EMFILE (Too many open files)
> open("/var/lib/misc/protocols.db", O_RDWR|O_LARGEFILE) = -1 EMFILE (Too
many open files)
> ...
>
> this is reported endlessly
>
> so there seems to be a file handle problem.
>
> According to lsof:
>
> lsof | grep ^spamd | awk '{print $1,$2}'  | sort | uniq -c
>   NrOF         PID
>      37 spamd 20696
>     126 spamd 20698
>     129 spamd 20699
>     130 spamd 20700
>    1055 spamd 20701
>      38 spamd 26284
>
> This surely is insane.
>
> Process 20701 which is the actually scanning child process
> has openend 933 UDP sockets:
> spamd     20701     root 1023u  IPv4     555058                UDP *:38796
> and 85 handles on bayes_toks:
> spamd     20701     root  136u   REG       58,2 5226496 656011
/home/chris/.spamassassin/bayes_toks
>
> I'd guess the UDP sockets are from DNS lookups f. sender verify.
>
> HTH, Chris
>
> -- 
> Christian Recktenwald      :                         :
> citecs GmbH                : spamassassin-talk-dist@citecs.de
> Unternehmensberatung fuer  : voice +49 711 601 2090  : Boeblinger Strasse
189
> EDV und Telekommunikation  : fax   +49 711 601 2092  : D-70199 Stuttgart


Re: spamassassin process a single message for 10 minutes !

Posted by Keith Whyte <ke...@media-solutions.ie>.
It seems ok on my system
linux stock 2.2.17
perl 5.8.6
dunno what other libs might make a difference.

i ran it like this:
wget -O - http://mail.units.it/6474 |spamassassin -t -D

it scored thus:
X-Spam-Status: Yes, score=15.1 required=9.5 tests=BAYES_50,J_CHICKENPOX_26,
        J_CHICKENPOX_31,J_CHICKENPOX_32,J_CHICKENPOX_53,J_CHICKENPOX_65,
        J_CHICKENPOX_81,J_CHICKENPOX_93,MSGID_FROM_MTA_ID,NIGERIAN_BODY1,
        NIGERIAN_BODY2,NIGERIAN_BODY3,NIGERIAN_BODY4,RISK_FREE,TO_EMPTY,
        URG_BIZ,US_DOLLARS_3 autolearn=no version=3.0.2

i didn't time it with a clock, but the last of the debug output from 
URIDNSBL might be of interest
debug: URIDNSBL: query for start.no took 14 seconds to look up 
(sbl.spamhaus.org.:201.0.159.195)
debug: URIDNSBL: queries completed: 11 started: 0
debug: URIDNSBL: queries active:  at Tue Jan 18 21:11:33 2005
debug: done waiting for URIDNSBL lookups to complete

maybe you could try that on your system and see if it is the URIDNSBL 
queries that are taking so long.

also, try running it through spamassassin as opposed to spamc, this 
might let us know if the problem is with spamd using too many handles as 
Christian suggests.

good luck,
keith.


Re: spamassassin process a single message for 10 minutes !

Posted by Dave Goodrich <ld...@tls.net>.
Christian Recktenwald wrote:
> On Tue, Jan 18, 2005 at 10:56:22AM +0000, Stefano Catani wrote:
> 
>>here is the message:
>>http://mail.units.it/6474
>>
>>it contains a lot of email addresses and stops our mailserver
>>
>>these are the times on a dual PIII 1GHz (SpamAssassin 3.0.2)
>>
>>time spamc < 6474
>>real    9m59.995s
>>user    0m0.000s
>>sys     0m0.000s
> 

I have spamd on a Sparc Enterprize, I believe it is dual 400 with 2gb 
ram, spamc is running on dial 3.2ghz box and FreeBSD. SA 3.0.1.

X-Spam-Status: Yes, hits=13.037 tagged_above=-999 required=5 tests=AWL,
  MISSING_SUBJECT, MSGID_FROM_MTA_ID, NIGERIAN_BODY1, NIGERIAN_BODY2,
  NIGERIAN_BODY3, NIGERIAN_BODY4, RCVD_IN_BL_SPAMCOP_NET, RISK_FREE,
  SPF_HELO_PASS, SPF_PASS, URG_BIZ, URIBL_SBL, URIBL_WS_SURBL, US_DOLLARS_3
X-Spam-Level: *************
X-Spam-Flag: YES
Subject: ***SPAM***

real    0m0.174s
user    0m0.001s
sys     0m0.003s

DAve

> 
> similar result here:
> 
> real    10m0.067s
> user    0m0.010s  
> sys     0m0.000s
> 
> single PIII 1GHz 750MB SA 3.0.0
> 
> spamd (according to top) does not eat significantly CPU.
> I called strace on the spamd process:
> 
> ...
> select(0, NULL, NULL, NULL, {1, 20000}) = 0 (Timeout)
> open("/etc/protocols", O_RDONLY)        = -1 EMFILE (Too many open files)
> open("/var/lib/misc/protocols.db", O_RDWR|O_LARGEFILE) = -1 EMFILE (Too many open files)
> ...
> 
> this is reported endlessly
> 
> so there seems to be a file handle problem.
> 
> According to lsof:
> 
> lsof | grep ^spamd | awk '{print $1,$2}'  | sort | uniq -c
>   NrOF         PID
>      37 spamd 20696
>     126 spamd 20698
>     129 spamd 20699
>     130 spamd 20700
>    1055 spamd 20701
>      38 spamd 26284
> 
> This surely is insane.
> 
> Process 20701 which is the actually scanning child process
> has openend 933 UDP sockets: 
> 	spamd     20701     root 1023u  IPv4     555058                UDP *:38796
> and 85 handles on bayes_toks:
> 	spamd     20701     root  136u   REG       58,2 5226496 656011 /home/chris/.spamassassin/bayes_toks
> 
> I'd guess the UDP sockets are from DNS lookups f. sender verify.
> 
> HTH, Chris
> 


-- 
Systems Administrator
http://www.tls.net
Get rid of Unwanted Emails...get TLS Spam Blocker!


Re: spamassassin process a single message for 10 minutes !

Posted by Christian Recktenwald <sp...@citecs.de>.
On Tue, Jan 18, 2005 at 10:56:22AM +0000, Stefano Catani wrote:
> here is the message:
> http://mail.units.it/6474
> 
> it contains a lot of email addresses and stops our mailserver
> 
> these are the times on a dual PIII 1GHz (SpamAssassin 3.0.2)
> 
> time spamc < 6474
> real    9m59.995s
> user    0m0.000s
> sys     0m0.000s

similar result here:

real    10m0.067s
user    0m0.010s  
sys     0m0.000s

single PIII 1GHz 750MB SA 3.0.0

spamd (according to top) does not eat significantly CPU.
I called strace on the spamd process:

...
select(0, NULL, NULL, NULL, {1, 20000}) = 0 (Timeout)
open("/etc/protocols", O_RDONLY)        = -1 EMFILE (Too many open files)
open("/var/lib/misc/protocols.db", O_RDWR|O_LARGEFILE) = -1 EMFILE (Too many open files)
...

this is reported endlessly

so there seems to be a file handle problem.

According to lsof:

lsof | grep ^spamd | awk '{print $1,$2}'  | sort | uniq -c
  NrOF         PID
     37 spamd 20696
    126 spamd 20698
    129 spamd 20699
    130 spamd 20700
   1055 spamd 20701
     38 spamd 26284

This surely is insane.

Process 20701 which is the actually scanning child process
has openend 933 UDP sockets: 
	spamd     20701     root 1023u  IPv4     555058                UDP *:38796
and 85 handles on bayes_toks:
	spamd     20701     root  136u   REG       58,2 5226496 656011 /home/chris/.spamassassin/bayes_toks

I'd guess the UDP sockets are from DNS lookups f. sender verify.

HTH, Chris

-- 
Christian Recktenwald      :                         :
citecs GmbH                : spamassassin-talk-dist@citecs.de
Unternehmensberatung fuer  : voice +49 711 601 2090  : Boeblinger Strasse 189
EDV und Telekommunikation  : fax   +49 711 601 2092  : D-70199 Stuttgart