Posted to users@spamassassin.apache.org by ed <ed...@lennon.nu> on 2004/08/17 02:56:07 UTC

SA is processing messages slowly

For the past 2 weeks we've been trying to get our new mail server, with SpamAssassin, up and running.  I've read all the FAQs and help files and am posting here as a last resort.  

We're running 1 gig of memory with a P4 2.4 Prescott processor on Fedora Core 2 with SpamAssassin version 2.63.  We're running spamd from an init script with /usr/bin/spamd -o -f.  The options for spamd are: SPAMDOPTIONS="-d -c -a -m5 -H".

We've been having a problem with email being processed very slowly by SA (averaging 20 to 30 seconds per message).  I had been running several add-on rules until today.  After removing all add-on rules, SpamAssassin picked up speed considerably.  It now processes an email message in about 1 to 2 seconds.  But at this point we only have a handful of boxes active. 

 I have three questions that I need help finding the answers for:

1.)  What hardware is recommended to run SquirrelMail with Spam Assassin for about 1000 users.  We have an old domain and receive about 50,000 to 75,000 emails a day.

2.)  How many and which "add on rules" are recommended.  If I start adding them one by one, are there some "must have rules" that I should start with first?

3.)  Any suggestions on how to keep the memory usage down.  Sometimes if SA is processing 5 emails at the same time for more than 10 seconds, the memory usage climbs to almost 100%.

Any help or suggestions are greatly appreciated. Local.cf and procmailrc files are listed below for further info.

Thanks,

Ed


*** The spamassassin local.cf file:

required_hits 5.0
rewrite_subject 1
subject_tag [SPAM]

*** The /etc/procmailrc file:

DROPPRIVS=yes

:0fw:
* < 200000
| /usr/bin/spamassassin

:0fw:
* ! ^X-Spam-Level:.*
* < 200000
| /usr/bin/spamassassin

:0:
* ^X-Spam-Status: Yes
$HOME/mail/mail/Spam

Re: General Comment - SA is processing messages slowly

Posted by Jason Haar <Ja...@trimble.co.nz>.
On Thu, Aug 19, 2004 at 01:48:21PM +1200, Simon Byrnand wrote:
> Actually named *does* cache negative lookups, so the original poster that 
> said failed lookups aren't cached was wrong...

Yeah - he's correct - or at least correct for other people's environments :-(

I've just checked. I have dnscache installed on our SA servers, and they
*ARE NOT* caching negative values. But they forward to our official
site-wide dnscache servers, and they *ARE* caching the negative values....

I've got to figure that one out :-(


-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1

Re: General Comment - SA is processing messages slowly

Posted by Simon Byrnand <si...@igrin.co.nz>.
At 13:41 19/08/2004, Justin Mason wrote:

>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>
>Jason Haar writes:
> > On Tue, Aug 17, 2004 at 08:31:41AM -0400, Jeff Koch wrote:
> > > I question your statement that these DNSRBL can handle the load. Our
> > > mailservers are handling over 10K messages per hour - but to be
> > > conservative assume there are a million SA boxes checking 1.0K 
> messages per
> > > hour. Is it reasonable to assume that each DNSRBL can handle a billion
> > > queries an hour?
> >
> > We really need negative caching for DNS lookups. DNS TTLs are great for
> > caching *successful* lookups - but failed lookups aren't cached.
> >
> > This is the problem with the RBL style. It has retro-fitted DNS to do a job
> > it wasn't designed to do. Another example of a product with the same issues
> > is the Squid proxy server. They designed negative DNS caching into Squid to
> > reduce the amount of network DNS calls Squid makes.
> >
> > Has anyone looked into adding a DNS cache component into SA? You could 
> cache
> > both positive and negative lookups for (say) 5-10 minutes without really
> > causing any bad side effects...
>
>We were considering it, since it'd be doable now that we prefork and keep
>a spamd process running for a few hundred messages.   However, the other
>devs were pretty sure that a local caching "named" process would probably
>do the trick nicely enough.  (me, I'm not quite convinced ;)
>
>So a local caching named won't cache negative lookups?  That *could*
>be quite an improvement if that's the case...

Actually named *does* cache negative lookups, so the original poster that 
said failed lookups aren't cached was wrong...

There was a little bit of discussion on the SURBL list just the other day 
suggesting the possibility of making the negative TTL much shorter than the 
normal TTL - so entries that are listed in surbl will cache for a longer 
time (say a few hours) while entries that aren't currently listed won't be 
negatively cached for too long, to make newly added entries usable sooner...
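
The resolver can also cap how long it keeps negative answers, whatever the zone asks for. A sketch for BIND 9's named.conf follows; max-ncache-ttl is a standard BIND option, but the 300-second value is only an assumption borrowed from the 5-10 minute figure quoted above, not a tested recommendation:

```
options {
    // Keep "not listed" (NXDOMAIN) answers for at most 5 minutes.
    max-ncache-ttl 300;
};
```

The zone-side counterpart is the negative TTL derived from the zone's SOA record (RFC 2308), which is the value the SURBL discussion is about.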

Regards,
Simon


Re: General Comment - SA is processing messages slowly

Posted by Jason Haar <Ja...@trimble.co.nz>.
On Tue, Aug 17, 2004 at 08:31:41AM -0400, Jeff Koch wrote:
> I question your statement that these DNSRBL can handle the load. Our 
> mailservers are handling over 10K messages per hour - but to be 
> conservative assume there are a million SA boxes checking 1.0K messages per 
> hour. Is it reasonable to assume that each DNSRBL can handle a billion 
> queries an hour?

We really need negative caching for DNS lookups. DNS TTLs are great for
caching *successful* lookups - but failed lookups aren't cached. 

This is the problem with the RBL style. It has retro-fitted DNS to do a job
it wasn't designed to do. Another example of a product with the same issues
is the Squid proxy server. They designed negative DNS caching into Squid to
reduce the amount of network DNS calls Squid makes.

Has anyone looked into adding a DNS cache component into SA? You could cache
both positive and negative lookups for (say) 5-10 minutes without really
causing any bad side effects...

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1

Re: General Comment - SA is processing messages slowly

Posted by Jeff Chan <je...@surbl.org>.
On Wednesday, August 18, 2004, 7:42:38 PM, Daniel Quinlan wrote:
> Jeff Koch <je...@intersessions.com> writes:

[Quibble: this seems to be a private message from Jeff to Daniel.]

>> I question your statement that these DNSRBL can handle the load. Our 
>> mailservers are handling over 10K messages per hour

> Sites with high volume generally use local copies of blacklists.  That
> is considered a best practice.  It doesn't require any code changes on
> our part, though.  10K messages per hour is not very big, but you might
> want to think about it.

10k an hour is roughly 240k per day, which is well within the
range most RBLs prescribe for local mirroring of their zone
files.  Please strongly consider setting up rbldnsd and rsyncing
the RBL zones that you use instead of querying the public name
servers.


  http://www.spamhaus.org/xbl/index.lasso

> Data Feed: Zone Transfers (rsync) for ISPs
> 
> Internet Service Providers and large corporate mail services
> with an incoming traffic of over 200,000 emails per day should
> use zone transfers of the Spamhaus DNSBLs to a local DNS server
> on your network. To submit a request for zone transfers see:
> Data Feed Application Form.   http://www.spamhaus.org/datafeed/index.html


  http://dsbl.org/faq-help#secondary

> Can I run my own DSBL DNS secondary?
>
> DSBL provides direct rsync access to our zones. To retrieve one
> of the zones, install rsync and run the following command: 
> 
>       rsync -t rsync.dsbl.org::dsbl/<file> .
> 
> where <file> is one of:
> 
>       rbldns-list.dsbl.org
>       rbldns-multihop.dsbl.org
>       rbldns-unconfirmed.dsbl.org


  http://www.ordb.org/faq/?&setlang=pl#zone_transfer

> Is DNS Zone transfer for our own use possible?
>
> The short answer: No
> 
> The "complete" answer: We do not allow anonymous zone
> transfers, and unless you receive lots of email (hundreds of
> thousands a day), the current setup shouldn't be a problem. We
> have quite a few well connected nameservers on almost all
> continents. 
> 
> However:
> As of late, we have begun allowing zone transfers via rsync. If
> you are interested in this, and you are willing to sign a
> reasonably strict non-disclosure agreement, feel free to get in
> touch with secondaries at ordb dot org. 


  http://www.surbl.org/rsync-signup.html

> SURBL Rsync Access Request
>
> For systems processing large volumes of inbound messages, for
> example more than 100,000 per day, we recommend setting up a
> local caching name server for mirroring SURBL and other RBL
> zone files. Doing so offers multiple advantages: 
> 
>    1. Significantly improve the performance of your mail servers
>    2. Reduce your network traffic
>    3. Help reduce public DNS server traffic

(Our site has several Howtos about setting up rbldnsd and rsync.)
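
The arithmetic behind the mirroring advice can be checked quickly. A minimal sketch in POSIX shell, using the 10K per hour figure from the thread and the per-day thresholds quoted above from Spamhaus (200,000) and SURBL (100,000); the variable names are only illustrative:

```shell
#!/bin/sh
# Hourly volume quoted in the thread, scaled to a full day.
per_hour=10000
per_day=$((per_hour * 24))
echo "messages per day: $per_day"

# Per-day thresholds quoted from the Spamhaus and SURBL pages.
for entry in spamhaus:200000 surbl:100000; do
    name=${entry%%:*}
    limit=${entry##*:}
    if [ "$per_day" -gt "$limit" ]; then
        echo "$name: over $limit per day, local mirror suggested"
    else
        echo "$name: under $limit per day"
    fi
done
```

By the same comparison, a site receiving 50,000 to 75,000 messages a day (the original poster's figure) would fall under both thresholds.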


Etc.

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/


Re: General Comment - SA is processing messages slowly

Posted by Daniel Quinlan <qu...@pathname.com>.
Jeff Koch <je...@intersessions.com> writes:

> We have had delay problems with SBL, RCFI, DYNABLOCK, NJABL - even SPAMCOP 
> gets overloaded on occasion. Once we start getting network test delays the 
> problems cascade very quickly and load averages escalate to the stratosphere.

Sure, occasionally that does happen.  SA now times out dead blacklists
very quickly on each query.  The major delays are from the MX test
(before 3.0.0-rc1), Pyzor, DCC, and Razor2 -- as I told you already.
If you're using those remotely (no local server/mirror), then you're
wasting your time worrying about DNSBL performance.

> I question your statement that these DNSRBL can handle the load. Our 
> mailservers are handling over 10K messages per hour

Sites with high volume generally use local copies of blacklists.  That
is considered a best practice.  It doesn't require any code changes on
our part, though.  10K messages per hour is not very big, but you might
want to think about it.

> - but to be conservative assume there are a million SA boxes checking
> 1.0K messages per hour. Is it reasonable to assume that each DNSRBL
> can handle a billion queries an hour?

A lot of people don't use network tests, plus a lot of sites download
local copies of some, most, or all of the DNSBLs they are using.  I
believe we generate about 520 million queries per hour.

> I also believe that SA should provide a mechanism for logging network test 
> response times. We need to be able to quickly turn off tests that create 
> delays.

We might be willing to consider logging any slow/slower responses in
spamd.  We'd want to see what the performance impact would be.  I tried
tracking per-DNSBL performance between spamd processes and it was *way*
too expensive and slowed things down on average.  The current timeout
mechanism is very fast and efficient.  Read the rbl_timeout
documentation.
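
A local.cf sketch of the knobs mentioned here. rbl_timeout is the documented option; the value shown is an assumption, not a recommendation from this thread. The use_* lines assume the 2.6x option names for switching off the checksum tests and should be checked against the Mail::SpamAssassin::Conf documentation for the installed version:

```
# Give up on slow DNSBLs after 5 seconds (value is an assumption)
rbl_timeout 5

# Switch off remote checksum tests if no local server/mirror is available
use_dcc     0
use_razor2  0
use_pyzor   0
```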

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: General Comment - SA is processing messages slowly

Posted by Jeff Koch <je...@intersessions.com>.
Thanks for the reply Daniel.

We have had delay problems with SBL, RCFI, DYNABLOCK, NJABL - even SPAMCOP 
gets overloaded on occasion. Once we start getting network test delays the 
problems cascade very quickly and load averages escalate to the stratosphere.

I question your statement that these DNSRBL can handle the load. Our 
mailservers are handling over 10K messages per hour - but to be 
conservative assume there are a million SA boxes checking 1.0K messages per 
hour. Is it reasonable to assume that each DNSRBL can handle a billion 
queries an hour?
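
For scale, the hypothetical works out as follows. This is only a sketch, and it assumes one query per message to each DNSBL; the real number of queries SA makes per message depends on the message headers and the rules in use:

```shell
#!/bin/sh
# Hypothetical figures from the paragraph above.
boxes=1000000          # SA installations
msgs_per_hour=1000     # messages checked per box per hour

# Assumption: each message triggers exactly one query per DNSBL.
queries_per_hour=$((boxes * msgs_per_hour))
queries_per_second=$((queries_per_hour / 3600))

echo "queries per hour per DNSBL:   $queries_per_hour"
echo "queries per second per DNSBL: $queries_per_second"
```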

I also believe that SA should provide a mechanism for logging network test 
response times. We need to be able to quickly turn off tests that create 
delays.


At 12:06 AM 8/17/2004, Daniel Quinlan wrote:
>Jeff Koch <je...@intersessions.com> writes:
>
> > In our experience we have traced the problem to the large number of RBL's,
> > DNSBL's, and other network tests (SPF, SURL, SORB) that seem to be
> > multiplying in SA.
>
>DNSBLs are the fastest network tests in reasonably recent SA versions.
>SPF (in 3.0), the MX check (in 2.6x), and distributed checksum tests
>(DCC, Razor2, Pyzor) are generally the slow network tests.
>
> > Not counting DCC and RAZOR there must be over a dozen that are now
> > consulted for every email. If you consider the hundreds of thousands
> > of boxes now using SA and the billions of emails processed by SA these
> > RBL's are just getting overloaded.
>
>Millions of boxes, and not really.
>
> > Has anybody thought about that? And is anybody considering response time
> > and load capacity of an RBL before adding it to SA? No wonder we're 
> getting
> > time-outs on some of these tests and our servers are overloading with SA
> > processes sitting around waiting for RBL results.
>
>Yes, we've thought about it and yes, we test DNSBL speed before adding
>them and check with the people running them before publishing with
>them.  All obvious stuff.
>
> > I think before any other network tests are added we need some logging 
> added
> > to SA that would show the response speed of each network test. Then 
> we'd be
> > able to quickly turn off those that are timing-out instead of the trial 
> and
> > error method we use now.
>
>If a DNSBL is timing out, SA doesn't wait on it.  It's the other tests
>that are problematic.  Hopefully, the plugin architecture will help
>improve DCC, Razor2, and Pyzor by the time 3.1 rolls around.  SPF is
>going to be a tough cookie to speed up.
>
>Daniel
>
>--
>Daniel Quinlan
>http://www.pathname.com/~quinlan/

Best Regards,

Jeff Koch, Intersessions 


Re: General Comment - SA is processing messages slowly

Posted by Daniel Quinlan <qu...@pathname.com>.
Jeff Koch <je...@intersessions.com> writes:

> In our experience we have traced the problem to the large number of RBL's, 
> DNSBL's, and other network tests (SPF, SURL, SORB) that seem to be 
> multiplying in SA.

DNSBLs are the fastest network tests in reasonably recent SA versions.
SPF (in 3.0), the MX check (in 2.6x), and distributed checksum tests
(DCC, Razor2, Pyzor) are generally the slow network tests.

> Not counting DCC and RAZOR there must be over a dozen that are now
> consulted for every email. If you consider the hundreds of thousands
> of boxes now using SA and the billions of emails processed by SA these
> RBL's are just getting overloaded.

Millions of boxes, and not really.

> Has anybody thought about that? And is anybody considering response time 
> and load capacity of an RBL before adding it to SA? No wonder we're getting 
> time-outs on some of these tests and our servers are overloading with SA 
> processes sitting around waiting for RBL results.

Yes, we've thought about it and yes, we test DNSBL speed before adding
them and check with the people running them before publishing with
them.  All obvious stuff.

> I think before any other network tests are added we need some logging added 
> to SA that would show the response speed of each network test. Then we'd be 
> able to quickly turn off those that are timing-out instead of the trial and 
> error method we use now.

If a DNSBL is timing out, SA doesn't wait on it.  It's the other tests
that are problematic.  Hopefully, the plugin architecture will help
improve DCC, Razor2, and Pyzor by the time 3.1 rolls around.  SPF is
going to be a tough cookie to speed up.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: General Comment - SA is processing messages slowly

Posted by Evan Platt <ev...@espphotography.com>.
At 08:30 PM 8/16/2004, you wrote:
>We have also experienced slowness by SA where it takes 20 to 30 seconds to 
>process a message. This might be OK for a stand alone Linux box but it is 
>unacceptable for a production mailserver handling several thousand email 
>accounts.

What version of SA are you running?

Evan 


General Comment - SA is processing messages slowly

Posted by Jeff Koch <je...@intersessions.com>.
Hi:

We have also experienced slowness by SA where it takes 20 to 30 seconds to 
process a message. This might be OK for a stand alone Linux box but it is 
unacceptable for a production mailserver handling several thousand email 
accounts.

In our experience we have traced the problem to the large number of RBL's, 
DNSBL's, and other network tests (SPF, SURL, SORB) that seem to be 
multiplying in SA. Not counting DCC and RAZOR there must be over a dozen 
that are now consulted for every email. If you consider the hundreds of 
thousands of boxes now using SA and the billions of emails processed by SA 
these RBL's are just getting overloaded.

Has anybody thought about that? And is anybody considering response time 
and load capacity of an RBL before adding it to SA? No wonder we're getting 
time-outs on some of these tests and our servers are overloading with SA 
processes sitting around waiting for RBL results.

I think before any other network tests are added we need some logging added 
to SA that would show the response speed of each network test. Then we'd be 
able to quickly turn off those that are timing-out instead of the trial and 
error method we use now.

Just my two cents.


Best Regards,

Jeff Koch 


Re: SA is processing messages slowly

Posted by LuKreme <kr...@kreme.com>.
On 16 Aug 2004, at 18:56, ed wrote:
> 1.)  What hardware is recommended to run SquirrelMail with Spam 
> Assassin for about 1000 users.  We have an old domain and receive 
> about 50,000 to 75,000 emails a day.

That's a pretty medium load.  I'll say anything over 1GHz would be more 
than sufficient.  Maybe even a speedy PIII.

> 2.)  How many and which "add on rules" are recommended.  If I start 
> adding them one by one, are there some "must have rules" that I should 
> start with first?

Surbl I think.  Also, consider greylisting (I use postgrey myself) as 
that will dramatically reduce the load on Spamassassin.

> 3.)  Any suggestions on how to keep the memory usage down.  Sometimes 
> if SA is processing 5 emails at the same time for more than 10 
> seconds, the memory usage climbs to almost 100%.

That seems odd.  Are you running BigEvil?

> *** The /etc/procmailrc file:
>
> DROPPRIVS=yes
>
> :0fw:
> * < 200000
> | /usr/bin/spamassassin
>
> :0fw:
> * ! ^X-Spam-Level:.*
> * < 200000
> | /usr/bin/spamassassin

Why are you running every non-spam through SA twice?
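
A single-pass version of the quoted procmailrc might look like the sketch below. The size limit and Spam folder are taken from the original file. Piping through spamc (the client for the already-running spamd) instead of /usr/bin/spamassassin is an assumption made here, not something suggested in this thread, and /usr/bin/spamc is an assumed install path:

```
DROPPRIVS=yes

# Filter each message once; skip anything of 200000 bytes or more
:0fw
* < 200000
| /usr/bin/spamc

# File tagged messages into the Spam folder
:0:
* ^X-Spam-Status: Yes
$HOME/mail/mail/Spam
```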

-- 
sometimes ascii is the best use of bandwidth... Tonya Engst



Re: SA is processing messages slowly

Posted by Daniel Quinlan <qu...@pathname.com>.
Matt Kettler <mk...@comcast.net> writes:

> Hmm. Those are problematic.. Don't specify -H unless you specify a 
> directory after it.

My spamd options:

  /usr/sbin/spamd -c -m 4 -H -d --pidfile=/var/run/spamd.pid

Works for me.  The path after the -H is optional.
 
> Really the "big winners" add-ons to SA that I've used are:
>          1) SURBL
>          2) installing Net::DNS to enable DNSBLs
>          3) a well trained bayes DB
>          4) DCC (note: I hack the default score down a bit to 2.0, 
> occasional FP problems but in general very good)
>          5) antidrug.cf (I wrote it, so I am biased here)
>          6) backhair.cf
>          7) 70_sare_random.cf

Most of that is in 3.0 and tuned to run with everything else.  :-)

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: SA is processing messages slowly

Posted by Matt Kettler <mk...@comcast.net>.
At 08:56 PM 8/16/2004 -0400, ed wrote:
>We're running 1 gig of memory with a P4 2.4 Prescott processor on Fedora 
>Core 2 with SpamAssassin version 2.63.

Upgrade to 2.64 ASAP.. 2.63 is subject to a DoS attack from malformed MIME 
messages.

>   We're using SPAMD being called from an init script with /usr/bin/spamd 
> -o -f   The options for spamd are: SPAMDOPTIONS="-d -c -a -m5 -H".

Hmm. Those are problematic.. Don't specify -H unless you specify a 
directory after it.

Personally, I'd ditch -a as well, but that's really a matter of personal taste.


>We've been having a problem with email files being processed very slowly 
>by SA (averaging between 20 - 30 sec per message to process).  I had been 
>running several rules until today.  After removing all add on rules, 
>SpamAssassin picked up speed considerably.  It now processes an email 
>message in about 1 to 2 seconds.  But at this point we only have a handful 
>of boxes active.
>
>  I have three questions that I need help finding the answers for:
>
>1.)  What hardware is recommended to run SquirrelMail with Spam Assassin 
>for about 1000 users.  We have an old domain and receive about 50,000 to 
>75,000 emails a day.

I can't help you here, this isn't in my expertise.

>
>2.)  How many and which "add on rules" are recommended.  If I start adding 
>them one by one, are there some "must have rules" that I should start with 
>first?

First, I'd actually start off with none of the add-on rules. I'd start with 
surbl first, via the Mail::SpamCopURI plugin.

Really the "big winners" add-ons to SA that I've used are:
         1) SURBL
         2) installing Net::DNS to enable DNSBLs
         3) a well trained bayes DB
         4) DCC (note: I hack the default score down a bit to 2.0, 
occasional FP problems but in general very good)
         5) antidrug.cf (I wrote it, so I am biased here)
         6) backhair.cf
         7) 70_sare_random.cf

Of course, your experience may differ, and I'd suggest adding things in a 
"one at a time" approach to start with so you can keep an eye on memory 
load, processing time, and hit-rate impacts of each.

>
>3.)  Any suggestions on how to keep the memory usage down.  Sometimes if 
>SA is processing 5 emails at the same time for more than 10 seconds, the 
>memory usage climbs to almost 100%.

1) Don't use any add-on rulesets that are "large" (ie: >128k in .cf file 
format)
2) Don't use a bayes_expiry_max_db_size over the default of 150,000 (if 
bayes is enabled)
3) As a "fail-safe backup" measure, run sa-learn --force-expire as a daily 
cron job. This will make sure the bayes DB can expire properly and keep it 
from being grossly huge.
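
Points 2 and 3 could be written down roughly as follows. This is a sketch only: the 03:15 schedule and the /usr/bin/sa-learn path are assumptions, and the cron job needs to run as whichever user owns the bayes database:

```
# local.cf: leave the bayes token limit at the default
bayes_expiry_max_db_size 150000

# crontab entry: force a bayes expiry run once a day at 03:15
15 3 * * * /usr/bin/sa-learn --force-expire
```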