Posted to users@spamassassin.apache.org by "Frank M. Cook" <fc...@acsplus.com> on 2005/07/26 17:22:09 UTC

rbl checking

I just posted a message about spamd not keeping up.  I'm asking this
question separately to keep the answers distinct.

my spamd hasn't been keeping up and so I turned on rbl checking ahead of
spamassassin.  that reduced the load but it also seems to be catching more
spam than spamassassin was on its own.

do I need to rbl check ahead of spamd or should I just edit local.cf to
rbl check there?  if I do move the checking to spamd, is there a file
somewhere that sets which rbl's are used?

my other computer can check the sender's address against the black lists but
I understand that spamd can also check url's found in the body of messages
which would be helpful.  if local.cf sets rbl to 0, does that turn off both
kinds of checks or are the body checks controlled elsewhere?  can all this
be adjusted in the cf files or do I need to tweak rules?  if the latter, are
the rules documented anywhere or is it all in you experts' heads?

I'd think that it would be inefficient to check the lists twice.  shouldn't
I get the best throughput by having spamd check both the sender and any
url's in the body against the lists I find most reliable?

Frank M. Cook


Re: rbl checking

Posted by Matt Kettler <mk...@evi-inc.com>.
Frank M. Cook wrote:
> I just posted a message about spamd not keeping up.  I'm asking this
> question separately to keep the answers distinct.
> 
> my spamd hasn't been keeping up and so I turned on rbl checking ahead of
> spamassassin.  that reduced the load but it also seems to be catching more
> spam than spamassassin was on its own.

True, it will catch more spam. Blocking more spam is easy.. want 100% of your
spam blocked? unplug your server :)


In general, if you do RBLs at the SMTP layer as a one-shot block, you'll likely
have a higher overall false-positive rate. Adding any additional measure is going
to increase the chances that a legitimate message will get blocked.

You'll have to trade off between how much spam sneaks by and what chances there
are of a legit message getting blocked and make your own decisions.


> 
> do I need to rbl check ahead of spamd or should I just edit local.cf to
> rbl check there?  if I do move the checking to spamd, is there a file
> somewhere that sets which rbl's are used?

By default, MANY of them are used (provided perl's Net::DNS is installed) ...
see /usr/share/spamassassin/20_dnsbl.cf. SBL, XBL, DSBL, SORBS, RFCI, spamcop
and NJABL are all enabled by default in SA 3.0.x, all with various scores.

You can turn various RBLs on and off by changing the score of the rule. Setting
it to a score of 0 disables the rule entirely, and it will not be queried.

You can check the default score of a rule by grepping for it in
/usr/share/spamassassin/50_scores.cf.
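
If grep isn't handy, the same lookup can be sketched in a few lines of Python.
This is only an illustration: it assumes score lines look like
"score RULE_NAME n.nn [n.nn n.nn n.nn]", and the SAMPLE text and its numbers
are made-up placeholders, not the real defaults from any SA release.

```python
# Made-up excerpt in the style of a scores file. The values are placeholders
# for illustration only, NOT the defaults shipped with SpamAssassin.
SAMPLE = """\
# an example comment line
score RCVD_IN_SBL 0 1.0 0 1.5
score URIBL_SBL 0 2.0 0 2.5
score SOME_OTHER_RULE 0.5
"""


def find_scores(text, rule):
    """Return the score values on the 'score <rule> ...' line, or None."""
    for line in text.splitlines():
        # Drop anything after a '#' comment marker, then split on whitespace.
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 3 and parts[0] == "score" and parts[1] == rule:
            return [float(p) for p in parts[2:]]
    return None


if __name__ == "__main__":
    # To try it on a real install, read the file mentioned above instead:
    #   text = open("/usr/share/spamassassin/50_scores.cf").read()
    print(find_scores(SAMPLE, "RCVD_IN_SBL"))
```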

You can also look at STATISTICS-set3.txt that comes with the SA tarball to see
the overall, ham, and spam hit rates, as well as the S/O (spam vs overall)
ratio. S/O is generally a good number to work with, as it indicates what
fraction of the messages the rule hit were actually spam (in a hand-sorted
corpus). A good RBL should have an S/O of 0.95 or greater.
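
To make the S/O number concrete, here is a rough sketch that takes the
description above literally: spam hits divided by all hits. The function name
and the hit counts are invented for the example, and the real STATISTICS files
may normalize for corpus size differently.

```python
def s_over_o(spam_hits, ham_hits):
    """Fraction of a rule's hits that were spam (S/O as described above)."""
    total = spam_hits + ham_hits
    if total == 0:
        raise ValueError("rule never hit, S/O is undefined")
    return spam_hits / total


def looks_like_good_rbl(spam_hits, ham_hits, threshold=0.95):
    """Apply the 0.95 rule of thumb from the text to a rule's hit counts."""
    return s_over_o(spam_hits, ham_hits) >= threshold


if __name__ == "__main__":
    # Hypothetical counts: the list hit 970 spam and 30 ham messages.
    print(s_over_o(970, 30))
    print(looks_like_good_rbl(970, 30))
```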

> 
> my other computer can check the sender's address against the black lists but
> I understand that spamd can also check url's found in the body of messages
> which would be helpful.  if local.cf sets rbl to 0, does that turn off both
> kinds of checks or are the body checks controlled elsewhere?

Erm, there is no "rbl 0" setting. There is however "skip_rbl_checks 1". AFAIK
setting this will only turn off normal Received: header RBL queries, and
will leave the body URI checks in place.

You can disable URIBLs by removing the loadplugin statement for them from
/etc/mail/spamassassin/init.pre.

As for using a score of 0 to disable an RBL, the body URL checks are done as a
completely separate rule, so you'd have to zero them both.

For example, SBL is the only RBL queried for both. Normal queries are
implemented by RCVD_IN_SBL and URL queries by URIBL_SBL.

You'd have to zero both rules to disable them both.


> can all this
> be adjusted in the cf files or do I need to tweak rules?

You can adjust it all in /etc/mail/spamassassin/*.cf (your choice of files, but
most people just use local.cf)

> if the latter, are
> the rules documented anywhere or is it all in you experts' heads?

The main rules are in /usr/share/spamassassin/*.cf. Most are just regexes of
various sorts. I'd advise against editing them: when you upgrade SA, this
whole directory gets rm -f'ed.


>
> I'd think that it would be inefficient to check the lists twice.

It would be slightly inefficient, however, if your DNS server is worth anything
it will cache the answer to the first query and the second will be lightweight.

> shouldn't
> I get the best throughput by having spamd check both the sender and any
> url's in the body against the lists I find most reliable?

Generally, yes. But there is the speed gain of blocking a message completely and
not scanning it at all to consider. Checking at both layers causes unblocked
messages to be checked twice, but the penalty is small, and the overhead savings
on blocked messages are great.

If you can accept the false-blocks caused by the RBLs, this will generally
improve overall performance if the RBL hits at least 1% of your inbound mail.
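
The trade-off above can be sketched as a back-of-the-envelope cost model. Every
number and name below is an assumption for illustration: the cost of an
SMTP-layer RBL lookup is guessed at 1% of a full SA scan, which is what would
put the break-even point near the 1% figure mentioned above. Measure your own
system before relying on it.

```python
def cost_without_prefilter(scan_cost):
    """Average cost per message when every message gets a full SA scan."""
    return scan_cost


def cost_with_prefilter(rbl_cost, scan_cost, blocked_fraction):
    """Average cost per message when an SMTP-layer RBL check runs first.

    Every message pays for the RBL lookup; only the messages that are not
    blocked go on to pay for the full scan.
    """
    return rbl_cost + (1.0 - blocked_fraction) * scan_cost


def break_even_fraction(rbl_cost, scan_cost):
    """Blocked fraction above which the pre-filter lowers the average cost."""
    return rbl_cost / scan_cost


if __name__ == "__main__":
    # Hypothetical relative costs: a full scan is 1.0, an RBL lookup is 0.01.
    rbl_cost, scan_cost = 0.01, 1.0
    print("break-even blocked fraction:", break_even_fraction(rbl_cost, scan_cost))
    for blocked in (0.0, 0.01, 0.05, 0.30):
        print(blocked, cost_with_prefilter(rbl_cost, scan_cost, blocked))
```

In this model the pre-filter only costs extra when it blocks almost nothing,
and the savings grow linearly with the share of mail it rejects.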