Posted to users@spamassassin.apache.org by "Henry, Austin MSER:EX" <Au...@gov.bc.ca> on 2005/05/20 01:03:23 UTC

RE: Simple question TRUE or FALSE (More data to answer this question)

>> I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that
>> server usually takes 2 or 3 seconds per message.
>> Like already posted, some of your rulesets are unnecessary because they
>> are included in SA (standard rulesets or SURBL).
>> Did you check 'cat messages | spamassassin -D' to see what part takes most
>> time? DNS time-outs can take a lot of time for example (also checkable
>> with tcpdump port 53).
>> Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail
>> but I use postfix (and amavisd-new) and I think it's quite memory and CPU
>> efficient.
>> 
>
>Please don't take this as me doubting you - but how in the world are you 
>able to scan a message in 2-3 seconds?  I assume you're running some of 
>the network tests, like other people that have posted 2-3 second message 
>processing times, is that correct?
>
>My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi
>drives can only scan a message in 4-5 seconds.  At least that was my scan
>time with a completely default setup, running spamd/spamass-milter, SA 3.0.1,
>RedHat FC2, and sendmail 8.13.1.  I haven't checked in a while (since I
>updated SA, the milter, and sendmail), but I have a good feeling most of my
>processing time was spent waiting for DNS responses.
>
>Any input into my situation would be appreciated.  I'd love to be able to get
>down to 2-3 seconds, basically cutting my processing time in half!
>
>.jon

I'll describe my setup, and that may give you some insight. It's almost
certainly what you think: network tests.

My setup uses Compaq ML570s, each with four 700MHz Xeon CPUs, 2G of RAM, and
RAID 0+1 disk arrays.  They do virus scanning, spam scanning, and various
other mail related tasks which all (of course) take resources.  These
machines rarely go above 700M of memory consumed, and only really run more
than 50% busy (over a several minute window) on Monday morning, or when a
spammer has decided that it would be a wonderful idea to hit every single
address of ours that they have in rapid succession.

The sa-stats routine returns the following data: Based on yesterday's logs,
the average scan time was 1.44s, average ham scan time 1.11s, average spam
scan time 1.62s.  The total number of messages scanned was 225,850.  It
would be much higher, but we don't scan outbound email, and also block mail
using a sendmail milter derived from rbl-milter, which blocks when 2 (or
more) of the RBLs that we use agree.
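
To illustrate the "2 (or more) RBLs agree" rule, here is a rough Python
sketch.  It is not the actual milter (that one is derived from rbl-milter);
the zone names, function names, and the threshold parameter are all made up
for the example.  It relies only on the usual DNSBL convention: reverse the
octets of the client IP, append the zone, and treat any answer as "listed".

```python
import socket

# Hypothetical zone names; substitute the RBLs you actually use.
ZONES = ["rbl-one.example", "rbl-two.example", "rbl-three.example"]


def rbl_query_name(ip, zone):
    """Build the DNSBL query name: octets reversed, then the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True if the zone returns any A record for this IP."""
    try:
        resolve(rbl_query_name(ip, zone))
        return True
    except OSError:
        # No record (or a lookup failure): treat as "not listed".
        return False


def should_block(ip, zones=ZONES, threshold=2, resolve=socket.gethostbyname):
    """Block only when at least `threshold` lists agree."""
    hits = sum(1 for zone in zones if is_listed(ip, zone, resolve))
    return hits >= threshold
```

Requiring agreement between lists means a single false listing on one RBL is
not enough to reject mail outright; anything listed on only one still goes
through to SpamAssassin for scoring.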

To speed up the network tests, we take advantage of any RBL provider that
offers rsync access to their lists (njabl, dsbl, surbl, others), and then
(almost) only use those ones.  Our scan times went up after I added a few
others (sbl-xbl, and bl.spamcop), but those ones are really fast anyway.
Each machine runs a local caching DNS server, and the locally hosted RBLs
are served by an rbldnsd server.  Conveniently, rbldnsd makes it easy to run
a private URIBL, which is occasionally nice.
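
If you want to find out which lists are worth mirroring locally, one option
is to time the lookups per zone.  The following is only a sketch under that
assumption: the function names are invented, the zones are whatever you pass
in, and the timings include whatever your resolver has already cached.

```python
import socket
import time


def time_lookup(name, resolve=socket.gethostbyname):
    """Return the seconds spent resolving `name`; failed lookups are timed too."""
    start = time.perf_counter()
    try:
        resolve(name)
    except OSError:
        pass  # "no such host" just means not listed; only latency matters here
    return time.perf_counter() - start


def slowest_zones(ip, zones, resolve=socket.gethostbyname):
    """Return (seconds, zone) pairs for one test IP, slowest zone first."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    timings = [(time_lookup(reversed_ip + "." + zone, resolve), zone)
               for zone in zones]
    return sorted(timings, reverse=True)
```

Run it with a handful of IPs taken from recent mail logs rather than a single
address, since one lookup can be skewed by caching or a momentary hiccup.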

Our site-wide bayes database lives in SQL, because it's more convenient to
share among multiple machines that way, and has the added benefit of being
faster.  I don't run Razor or DCC or Pyzor.  A pile of custom rules, and
SARE rulesets finish the setup.  I've probably forgotten something, but
those are the important things.

Anyway, I hope that helps someone :)  The setup works nicely, with nary a
hitch, thanks to everyone who makes it possible!

- Austin.