Posted to users@spamassassin.apache.org by "David F. Skoll" <df...@roaringpenguin.com> on 2011/12/13 20:00:42 UTC

DNS list inclusion policy (was Re: DNSWL will be disabled by default as of tomorrow)

Hi,

Is there a formal policy for including (or excluding) DNS-based lists
from SpamAssassin's default rules?  I think that SpamAssassin should
not enable any DNS-based list by default unless the list owners have a
clear policy that the list is free to use by SpamAssassin users,
regardless of query volume [*], commercial vs. non-commercial use,
etc. and a promise not to change the policy without reasonable (say,
six months) notice.

I realize this means practically no DNS lists will qualify to be
enabled by default.  I believe SA should ship all the rules for
various DNS lists and have clear instructions (maybe even a tool for
newbies that runs automatically from the makefile) to enable them.
But I'm not sure it should serve as a marketing tool for DNSBL
operators who want to charge money.

(/me dons flame-proof suit...)

Regards,

David.

[*] DNSBLs that force you to use a (free) rsync service if your query volume
is too high are OK.

Re: DNS list inclusion policy (was Re: DNSWL will be disabled by default as of tomorrow)

Posted by Martin Gregorie <ma...@gregorie.org>.
On Tue, 2011-12-13 at 15:33 -0500, David F. Skoll wrote:
> I think something like this would be good:
> 
> $ tar xvfz Mail-SpamAssassin-xxx.tar.gz
> $ cd Mail-SpamAssassin-xxx
> $ perl Makefile.PL
> [...]
> 
> The following RBLs have certain usage restrictions.  Would you like to
> enable them?  If unsure, say Y.  (Y/N): [Y]
> [... a list of RBLs, preferably with links to their policy pages ...]
> 
> so that at installation time, SA users are at least aware of the issues.
> 
The only problem with this that I can see is, unfortunately, a biggish
one: it won't work for the package installs that I and, at a guess, a
lot of small installations use. I run the Fedora distro, and so the
version of SA I'm running is that installed and periodically updated by
yum.

I like the idea of the tool though, particularly if it can link to
descriptions of each RBL and, hopefully, to the masscheck results for
it. From my POV a post-install tool, and especially one that can be
re-run periodically, would be a lot more useful.

Martin



Re: DNS list inclusion policy (was Re: DNSWL will be disabled by default as of tomorrow)

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Tue, 13 Dec 2011 15:24:34 -0500
darxus@chaosreigns.com wrote:

> If you can come up with a way to disable all network tests that have
> a free use limit without crippling spamassassin, please tell us.
> That would be lovely.

I think something like this would be good:

$ tar xvfz Mail-SpamAssassin-xxx.tar.gz
$ cd Mail-SpamAssassin-xxx
$ perl Makefile.PL
[...]

The following RBLs have certain usage restrictions.  Would you like to
enable them?  If unsure, say Y.  (Y/N): [Y]
[... a list of RBLs, preferably with links to their policy pages ...]

so that at installation time, SA users are at least aware of the issues.

For package maintainers, you'd want a noninteractive way to build SpamAssassin
and then a post-install configuration script that asks the RBL question.
(This is easily done on Debian; not so easy with RPM-based systems that
don't allow interaction during installation.)
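A rough sketch of what such a post-install step could look like, written
in Python purely for illustration (SA's own build is Perl). The list
table and the helper names are hypothetical, and the rule names shown are
examples that should be checked against the installed rule set. It relies
only on the standard "score RULE 0" way of disabling a rule in local.cf,
and an empty answer means Y, as in the prompt above.

```python
# Illustrative table only: which lists count as "restricted" and which
# rules belong to them are assumptions, not SpamAssassin's actual policy.
RESTRICTED_LISTS = {
    "DNSWL": ["RCVD_IN_DNSWL_LOW", "RCVD_IN_DNSWL_MED", "RCVD_IN_DNSWL_HI"],
    "URIBL": ["URIBL_BLACK"],
}

PROMPT = (
    "The following RBLs have certain usage restrictions.  Would you like to\n"
    "enable them?  If unsure, say Y.  (Y/N): [Y] "
)


def parse_answer(text, default=True):
    """Interpret a Y/N reply; an empty reply selects the default (Y)."""
    text = text.strip().lower()
    if not text:
        return default
    return text.startswith("y")


def config_lines(enable):
    """Return local.cf lines; nothing is needed when the lists stay enabled."""
    if enable:
        return []
    lines = []
    for name, rules in RESTRICTED_LISTS.items():
        lines.append(f"# {name}: disabled at install time")
        # A score of 0 disables a rule in SpamAssassin's configuration.
        lines.extend(f"score {rule} 0" for rule in rules)
    return lines


def choose(noninteractive, ask=input):
    """Package builds pass noninteractive=True and never prompt."""
    if noninteractive:
        return True
    return parse_answer(ask(PROMPT))


# Example: what would be written if the admin answered "n".
print("\n".join(config_lines(parse_answer("n"))))
```

A package maintainer would call choose(True) at build time and run the
interactive path from a separate post-install script, which could also be
re-run later.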

Regards,

David.

Re: DNS list inclusion policy (was Re: DNSWL will be disabled by default as of tomorrow)

Posted by da...@chaosreigns.com.
On 12/13, Kevin A. McGrail wrote:
>  Is there a formal policy for including (or excluding) DNS-based lists

                                               what    s
>    There is formal consensus from the PMC that ^ work^ for most installations is
>    adequate for default inclusion once the merits of the BL are shown

I wanted to point out the logic behind this.  SpamAssassin is heavily
dependent on network tests for accuracy.  Disabling them results in
something like 5 times as many emails mis-classified, both false positives
and false negatives (with the default threshold of 5).  

So the default configuration is set up to need minimal adjustment for small
installations likely to have the least ability to do those adjustments, and
require more adjustment for very large installations which are inevitably
going to need to change stuff anyway.

If you can come up with a way to disable all network tests that have a free
use limit without crippling spamassassin, please tell us.  That would be
lovely.

And I do think it's appropriate to discuss this in terms of all network
tests, not just DNS tests.


These are the statistics from... 2009.  Would be nice to get newer ones.
Why aren't these updated?

Set 0, no net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set0.txt?view=markup
# False positives:       238  1.12%
# False negatives:      9678  21.93%

Set 1, with net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set1.txt?view=markup
# False positives:        30  0.14%
# False negatives:      1381  3.13%


Without the network tests, it got 7.0x as many spams wrong (false
negatives), and 7.9x as many hams wrong (false positives).
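For anyone who wants to check the arithmetic, a small Python sketch using
only the four counts quoted above (the variable names are mine). A false
positive is ham scored as spam; a false negative is spam that got through.

```python
# Counts copied from the STATISTICS-set0.txt / STATISTICS-set1.txt
# figures quoted above.
set0 = {"false_positives": 238, "false_negatives": 9678}  # no net, no bayes
set1 = {"false_positives": 30, "false_negatives": 1381}   # with net, no bayes


def ratio(kind):
    """How many times more errors of this kind occur without network tests."""
    return set0[kind] / set1[kind]


ham_ratio = ratio("false_positives")    # ham wrongly flagged as spam
spam_ratio = ratio("false_negatives")   # spam wrongly let through

print(f"ham misclassified:  {ham_ratio:.1f}x")
print(f"spam misclassified: {spam_ratio:.1f}x")
```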


(Opened bug for those files being out of
date, I think the data gets generated daily:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6726 )

-- 
"I love God. He's so deliciously evil." - Stewie Griffin, Family Guy 2x02
http://www.ChaosReigns.com

Re: DNS list inclusion policy (was Re: DNSWL will be disabled by default as of tomorrow)

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 12/13/2011 2:00 PM, David F. Skoll wrote:
> Hi,
>
> Is there a formal policy for including (or excluding) DNS-based lists
> from SpamAssassin's default rules?  I think that SpamAssassin should
> not enable any DNS-based list by default unless the list owners have a
> clear policy that the list is free to use by SpamAssassin users,
> regardless of query volume [*], commercial vs. non-commercial use,
> etc. and a promise not to change the policy without reasonable (say,
> six months) notice.
>
> I realize this means practically no DNS lists will qualify to be
> enabled by default.  I believe SA should ship all the rules for
> various DNS lists and have clear instructions (maybe even a tool for
> newbies that runs automatically from the makefile) to enable them.
> But I'm not sure it should serve as a marketing tool for DNSBL
> operators who want to charge money.
>
> (/me dons flame-proof suit...)
>
> Regards,
>
> David.
>
> [*] DNSBLs that force you to use a (free) rsync service if your query volume
> is too high are OK.
You are a brave person!

There is formal consensus from the PMC that what works for most
installations is adequate for default inclusion once the merits of the
BL are shown, combined with discussion of whether the infrastructure can
withstand the query onslaught.

Beyond that, we have the "new" issue of responding with purposefully 
wrong answers which is a no-no for default inclusion.  However, this 
issue is NOT new.  This issue was discussed by the PMC when URIBL was 
being considered for inclusion:
"NOTE: URIBL fails with false positives once a threshold is exceeded.
This will not be acceptable for a default enabled SA test." and the
response: "UPDATE FROM DALLAS: We don't return positive to anymore.  We
moved our heavy hitter blocking to the top level, so blocked users will
timeout trying to reach our mirrors."

But the PMC consensus is clear: free-for-some licensing with query
volume limits is acceptable.

From my notes, a daily query limit and free non-commercial use have
been acceptable to SA when considering a list for default enabling.  Of
course, we've had a LOT of discussions on this matter and Spamhaus, for
example, changed their query and SMTP limits based on discussions with SA.

So I know we try and be fair but flexible so we are not closed to new 
BLs.  I also personally try and support BLs by offering nameserver 
resources.


In the end, your idea of an easier way for admins to review rules/BLs
for inclusion/removal has merit.  I've had similar ideas, at least about
making the list of rules that are disabled for these types of reasons
more prominent.  The biggest con I can see is that rules and releases
are purposefully separated.  If masscheck could be augmented, having
LOTS of different sa-update channels might be a good idea, but that
won't enable the plugins.


Now what's more interesting is that I took on the crazy idea of writing
a policy for RBL inclusion.  Let me formalize my notes and see if I can
post something today about that.  It'll really be for the PMC to decide,
but user/dev/RBL operator input is always good.

Regards,
KAM