
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2011/10/03 05:26:24 UTC

[Bug 6668] New: DNSWL is lacking a rule to communicate excessive use to users

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

             Bug #: 6668
           Summary: DNSWL is lacking a rule to communicate excessive use
                    to users
           Product: Spamassassin
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: Darxus@ChaosReigns.com
    Classification: Unclassified


In bug #6220 it was discussed that Spam Eating Monkey has a way to trigger
SpamAssassin to intentionally cause false positives by returning a value of
127.0.0.255 in cases where people are abusing their service with excessive
load.

DNSWL.org has had this kind of problem recently, with some folks who have been
particularly difficult to contact about it, and has resorted to returning a
trust value of "HI" to all queries from the problematic users.

I'd like to provide DNSWL with a better option, to handle a return value of
127.*.*.255, and instead of hitting "RCVD_IN_DNSWL_HI", hit a rule that
explains that there is a problem with abusive levels of load on the DNSWL
servers.

How was that implemented for Spam Eating Monkey?  There doesn't seem to be a
rule to match *.255.


Should I create a rule like this?

# I figure getting it noticed quick is best for everybody?
score RCVD_IN_DNSWL_ABUSE -100

##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval

ifplugin Mail::SpamAssassin::Plugin::DNSEval
header   RCVD_IN_DNSWL_ABUSE  eval:check_rbl_sub('dnswl-firsttrusted', '^127\.0\.\d+\.255$')
describe RCVD_IN_DNSWL_ABUSE  You are using a DNS server that is placing too high a load on the DNSWL.org DNS servers without a subscription, please see https://subscription.dnswl.org/
tflags   RCVD_IN_DNSWL_ABUSE  nice net
endif
##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
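
To show which return values the proposed subtest pattern would catch, here is a minimal Python sketch. It assumes the pattern is applied to the dotted-quad answer as a plain string; the helper name is_overlimit and the sample answers are illustrative only, not real DNSWL data. Note that the pattern as written covers 127.0.*.255, which is narrower than the 127.*.*.255 mentioned in the text above.

```python
import re

# Pattern copied from the proposed RCVD_IN_DNSWL_ABUSE subtest.
OVERLIMIT = re.compile(r'^127\.0\.\d+\.255$')

def is_overlimit(answer: str) -> bool:
    """Return True if a DNS answer matches the proposed over-limit pattern."""
    return OVERLIMIT.match(answer) is not None

# Illustrative answers only (hypothetical, not taken from DNSWL).
for answer in ("127.0.0.255", "127.0.5.255", "127.0.5.3", "127.1.0.255"):
    print(answer, is_overlimit(answer))
```

The last sample has a non-zero second octet, so it would not be caught unless the pattern were widened.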


Returning _HI for everything is resulting in many false negatives for the
abusing users, and thinking about ideal scores for this kind of situation, I
think maybe a large negative score should be used for things like SEM as well,
because not filtering out spam is always a much better failure mode than
filtering too much as spam.

Also, I think it's really irresponsible for SpamAssassin to expose users to
this kind of punitive activity without actually warning them of the usage
thresholds of the services involved, as Warren lists here: 
http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html


The DNSWL folks who started making this use of _HI are probably not aware of
this option, and I just heard this was happening for the first time, so I'm
going to go point them to this bug now.  (For those who may be new, I'm a DNSWL
admin.)

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #29 from Matthias Leisi <ma...@leisi.net> 2011-12-12 20:37:39 UTC ---
(In reply to comment #26)

> I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are
> in Theo's sandbox which is where they have been living for a while.

Since 3.2.0 (with an error first),
http://www.dnswl.org/news/archives/1-dnswl.org-data-and-SpamAssassin-3.2.0.html


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #13 from Darxus <Da...@ChaosReigns.com> 2011-10-03 21:39:19 UTC ---
(In reply to comment #12)
> > Would you like to recommend another URL?
> 
> The URL I wrote was PURELY a place-holder.  I could have and should have
> written something that implied more firmly that status such as
> www.sa.org/foobar.  
> 
> The actual link should be determined in the patch.  It likely should go to a
> wiki article discussing RBL errors.  But it definitely shouldn't be a link to a
> vendor, shouldn't say abuse and should be kept non specific.  It likely should
> have information on disabling the RBL rules as well.

Agreed.

> > I don't see how that is at all conflicting with what I have suggested.
> 
> I'm trying to keep my answers too short.  I'll rephrase:
> 
> You have suggested that disabling network tests causes FNs because emails then
> slip through unmarked. I don't consider those true FNs because I consider them
> FNs caused by a misconfiguration of SA.  

I disagree.  That's a false negative, even if it's due to configuration.  

> SA works best with network tests and
> we aren't recommending they are disabled.  But some of them need consideration
> before they are enabled post-installation.

Yep.

> We will have to agree to disagree.  
> 
> I am 100% convinced it is inappropriate to intentionally affect scores to get
> the attention of admins.  It is the very definition of collateral damage and
> something I would strongly advocate against.  

Okay.

> But again, I am one vote and this
> is my opinion.  

"Votes on code modifications follow a different model. In this scenario, a
negative vote constitutes a veto, which cannot be overridden."
"...the proposal requires three positive votes and no negative ones in order to
pass..."
- http://www.apache.org/foundation/voting.html

By our rules, it's enough on its own to make this not happen.

> > RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched
> > the return value of 127.0.0.255 with a score of -5 and a description that was
> > helpful in resolving the situation that could not be construed as advertising?
> 
> This needs more thought but I would veto it unless the following points are
> met:
> 
>  - the NET result of the rules for the RBL in question in total add up to zero
> (or substantially similar e.g. 0.0001, etc.) So if there is a positive score and
> a negative score, the two together = 0.  In other words, an RBL can't issue a
> response that incorrectly affects scores on purpose due to limits, technical
> errors, etc.

I believe that requirement would eliminate dnswl.org's interest.  Since you're
willing to veto without it, I think that's sufficient to consider this thread
dead.

> > Another possibility I brought up 6 months ago in bug #6220 was, when receiving
> > a return value of 127.*.*.255, disabling that rule.  No more load on the
> > provider, no skewed score for the user, no advertising.
> 
> You are mentioning ideas that need to be adopted by RBLs more so than SA

I don't understand why you say that.  It's just another way of handling a
127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's
still just an implementation of providing a .255 response for users who are
over limit.

As an example, say an email provider is using spamassassin to filter millions
of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL)
cause queries to zen.spamhaus.org.  That being over their free use
threshold, they start returning (only) 127.0.0.255 for all queries, to indicate
the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and
stops running all rules that hit zen.spamhaus.org.
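
The behaviour described in that example can be sketched in a few lines of Python. This is only an illustration of the idea, not how SpamAssassin is implemented: the ZoneGuard class and its method names are hypothetical, and the ordinary answer value is an arbitrary sample.

```python
OVERLIMIT_ANSWER = "127.0.0.255"  # over-limit code discussed in this thread

class ZoneGuard:
    """Hypothetical helper: tracks zones that have signalled over-limit."""

    def __init__(self):
        self.disabled = set()

    def should_query(self, zone: str) -> bool:
        # Rules that depend on a disabled zone would simply not run.
        return zone not in self.disabled

    def record_answer(self, zone: str, answer: str) -> None:
        # Stop querying a zone once it reports the over-limit code.
        if answer == OVERLIMIT_ANSWER:
            self.disabled.add(zone)

guard = ZoneGuard()
guard.record_answer("zen.spamhaus.org", "127.0.0.4")    # sample ordinary answer
print(guard.should_query("zen.spamhaus.org"))           # True
guard.record_answer("zen.spamhaus.org", "127.0.0.255")  # over limit
print(guard.should_query("zen.spamhaus.org"))           # False
```

The sketch leaves out how long a zone stays disabled and how the state would be shared between processes; both would need deciding in a real patch.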

> but
> this sounds a bit like a DoS ready to happen AND it's a case where the rule
> that implemented this likely couldn't be on by default as shipped by SA.  If
> they are smart enough to turn on the feature, they likely know enough about RBL
> queries to perform local caching, rsync, etc.

How is that a DoS ready to happen?  Are we having another misunderstanding
here?

> I run quite a number of RBL public nameservers.  I don't consider the traffic
> to be that big a deal and I can blackhole queries quite easily.

Are they RBLs that spamassassin has enabled by default?  I run one dnswl.org
mirror, and the only reason I can do that is my provider is willing to overlook
my bandwidth limit due to a belief that dnswl is worth supporting.  Mirroring
dnswl.org causes almost all of my bandwidth usage.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #10 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 19:34:17 UTC ---
> That sounds great to me as well.  Although I'd prefer something in the wiki for
> maintainability, I don't know, maybe
> http://wiki.apache.org/spamassassin/XBLAbuse ?

Again, generic and I don't consider this necessarily "abuse".  That's a very
strong word to many people. 

To me, it's an error requiring administrative attention with a landing page to
help them try and resolve the issue.  Nothing more, nothing less.

> You mis-read what I said.  I never suggested false positives (in fact I
> suggested it was bad that SEM intentionally caused false positives).  I was
> talking about causing false negatives (spam being marked as non-spam).  

That is not the correct definition of a FN in my opinion.  By your definition,
any email that got through SA for any reason is a False Negative.

We have to ship SA in a way that is safe for the vast majority of users which
might not be the most effective for blocking all Spam.


> So the question is, what are acceptable methods of enforcing those thresholds? 
> Blocking queries resulting in delay of email is acceptable to you.  I don't
> know how effective that is in getting people to stop querying, and it doesn't
> provide any feedback to indicate that there is a problem.  Is it acceptable to
> cause false-negatives, spam being marked as non-spam, with clear indication
> (via a matching rule and description) of what the problem is?

PRIMARILY, I want to see a method which doesn't artificially change the SA
scoring up or down substantially.  

An RBL that starts returning ALL true or ALL false for over-limit issues is
artificially changing the scores.  No answer or an answer handled as an error
would be acceptable.  

At worst, the queries can be stopped by blackholing the requests from overlimit
IPs. So this is really a matter for the RBL to handle.

However, some RBLs want to convert those over-limit users into customers and
they do so through harmful techniques to get the admin's attention.

Regards,
KAM


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #22 from Matthias Leisi <ma...@leisi.net> 2011-12-11 21:07:27 UTC ---
(self-correction)

> * Some big hosting provider resolvers: softlayer.com, dimenoc.com,
> theplanet.com, bluehost.com, dyndns.com, netline.net.uk (multi-million queries

dyndns.com was moved from "listed, high trust" back to simple "refuse" in
April 2011, because they answered and promised to fix the situation.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #28 from Matthias Leisi <ma...@leisi.net> 2011-12-12 20:33:53 UTC ---
(In reply to comment #27)
> (In reply to comment #26)

> > Overall, the efficacy of DNSWL outside of the FP scores is well established
> > and that's not a barrier to the re-enabling of the scores.

It should be noted that the policy contested in this bug does not cause FPs. It
does cause FNs for a small number of users (where other attempts to rectify an
unaccepted situation failed). On the other hand, removing the rules will lead
to a higher risk of FPs for 99.something % of users.

> http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
> Well established based on what?  Has anyone looked at the statistics to prove
> that this situation has not changed?

Well established based on http://www.chaosreigns.com/dnswl/. While the stats
have a lot of fluctuation which makes it sometimes hard to interpret individual
data points, it generally shows the "usefulness" of dnswl.org rules/data.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #12 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 21:11:59 UTC ---
> Would you like to recommend another URL?

The URL I wrote was PURELY a place-holder.  I could have and should have
written something that implied more firmly that status such as
www.sa.org/foobar.  

The actual link should be determined in the patch.  It likely should go to a
wiki article discussing RBL errors.  But it definitely shouldn't be a link to a
vendor, shouldn't say abuse and should be kept non specific.  It likely should
have information on disabling the RBL rules as well.


> Er, yeah, that sounds like a pretty good definition to me.  Especially if SA
> actually slaps on a "X-Spam-Status: No" header, which would be the case here.
> 
> > We have to ship SA in a way that is safe for the vast majority of users which
> > might not be the most effective for blocking all Spam.
> 
> I don't see how that is at all conflicting with what I have suggested.

I'm trying to keep my answers too short.  I'll rephrase:

You have suggested that disabling network tests causes FNs because emails then
slip through unmarked. I don't consider those true FNs because I consider them
FNs caused by a misconfiguration of SA.  SA works best with network tests and
we aren't recommending they are disabled.  But some of them need consideration
before they are enabled post-installation.


> I don't claim to know the intentions of the owners of DNSWL and SEM.  But I'm
> not convinced that it's inappropriate to intentionally affect scores
> (preferably with false negatives instead of false positives) in order to get
> the attention of an administrator to explain the problem and get them to either
> stop sending millions of queries a day, or start sending money.

We will have to agree to disagree.  

I am 100% convinced it is inappropriate to intentionally affect scores to get
the attention of admins.  It is the very definition of collateral damage and
something I would strongly advocate against.  But again, I am one vote and this
is my opinion.  

> RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched
> the return value of 127.0.0.255 with a score of -5 and a description that was
> helpful in resolving the situation that could not be construed as advertising?

This needs more thought but I would veto it unless the following points are
met:

 - the NET result of the rules for the RBL in question in total add up to zero
(or substantially similar e.g. 0.0001, etc.) So if there is a positive score and
a negative score, the two together = 0.  In other words, an RBL can't issue a
response that incorrectly affects scores on purpose due to limits, technical
errors, etc.

 - The description in the Rule was generic, suitable for all RBLs and pointed
to a URL under SA's control.  Perhaps even just one rule for all the RBLs that
can give an error code response.
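
The first condition amounts to a simple arithmetic check, sketched here in Python. The function name net_is_neutral, the tolerance of 0.001, and the sample scores are all assumptions made for illustration; nothing like this exists in SA.

```python
import math

def net_is_neutral(scores, tolerance=0.001):
    """True if the scores of all rules fired by an error response sum to ~0."""
    return math.isclose(sum(scores), 0.0, abs_tol=tolerance)

# Hypothetical rule scores triggered by one over-limit response.
print(net_is_neutral([-5.0, 5.0]))  # offsetting pair
print(net_is_neutral([0.0001]))     # informational-only rule
print(net_is_neutral([-5.0]))       # a lone -5 hit would not qualify
```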

> Another possibility I brought up 6 months ago in bug #6220 was, when receiving
> a return value of 127.*.*.255, disabling that rule.  No more load on the
> provider, no skewed score for the user, no advertising.

You are mentioning ideas that need to be adopted by RBLs more so than SA but
this sounds a bit like a DoS ready to happen AND it's a case where the rule
that implemented this likely couldn't be on by default as shipped by SA.  If
they are smart enough to turn on the feature, they likely know enough about RBL
queries to perform local caching, rsync, etc.

I run quite a number of RBL public nameservers.  I don't consider the traffic
to be that big a deal and I can blackhole queries quite easily.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |

--- Comment #19 from Kevin A. McGrail <km...@pccc.com> 2011-12-11 15:45:58 UTC ---
As noted by Darxus, this is PURPOSEFUL behavior to return true statements for
what they consider abuse.

"DNSWL announced this behavior here: 
http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html"

If they had chosen to add a time-delay, block answers or return a false answer
that did not trigger the rules, I would support it.

1 - Do we have any other RBLs enabled by default that return False Positives
once a threshold is hit?

2 - IMO they need to be disabled by default -OR- documented far more strongly.


Unless someone steps up with some ideas, I'm going to disable DNSWL by default
very shortly.


Re: [Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by Karsten Bräckelmann <gu...@rudersport.de>.

> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668
> 
> Kevin A. McGrail <km...@pccc.com> changed:

> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases.  Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.
> 
> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here: 
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
> 
> I agree.  Which RBLs have this issue?  I will immediately work to disable them
> in a default SA installation for the 3.4.0 release.

Merely having glanced over this bug report and discussion...

I do not agree in the general case. I do agree, however, in the case of
RBLs returning FP hits -- as opposed to anything harmless like a reply
never causing a hit, or even blocking the DNS queries.

This has been discussed many times before, and the bottom line is: We do
include RBLs like Spamhaus' lists by default, even though they require
subscription for really large sites. One of the strongest arguments is,
that this will by default use the RBLs in question, benefiting the vast
majority of SA users -- those, who would not have to sign up for a
subscription.

These typically smaller, and often really small installations do NOT
have the resources or knowledge to configure all these tiny thingies and
options, to get the best result. Whereas the really large sites DO have
the admin resources, and SHOULD have the knowledge, to either disable
them, or sign up for the subscription.


As I have done before, I pro-actively vote -1 on removing such RBLs.
Those who deliberately return FPs, on the other hand, should be pulled
from vanilla SA.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #1 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 12:27:55 UTC ---
> ##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> 
> ifplugin Mail::SpamAssassin::Plugin::DNSEval
> header   RCVD_IN_DNSWL_ABUSE  eval:check_rbl_sub('dnswl-firsttrusted', '^127\.0\.\d+\.255$')
> describe RCVD_IN_DNSWL_ABUSE  You are using a DNS server that is placing too high a load on the DNSWL.org DNS servers without a subscription, please see https://subscription.dnswl.org/
> tflags   RCVD_IN_DNSWL_ABUSE  nice net
> endif
> ##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval

I would personally veto this immediately.  We are not an advertising service
for RBLs.

If an RBL is submitted for inclusion for SA, it should not have policies that
would affect anything but the most extreme cases.  Any URLs should point to an
SA page such as a wiki letting them know to disable the rules.

> Also, I think it's really irresponsible for SpamAssassin to expose users to
> this kind of punitive activity without actually warning them of the usage
> thresholds of the services involved, as Warren lists here: 
> http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html

I agree.  Which RBLs have this issue?  I will immediately work to disable them
in a default SA installation for the 3.4.0 release.

regards,
KAM


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #11 from Darxus <Da...@ChaosReigns.com> 2011-10-03 20:01:09 UTC ---
(In reply to comment #10)
> > That sounds great to me as well.  Although I'd prefer something in the wiki for
> > maintainability, I don't know, maybe
> > http://wiki.apache.org/spamassassin/XBLAbuse ?
> 
> Again, generic and I don't consider this necessarily "abuse".  That's a very
> strong word to many people. 

I have no objection to using a word other than "abuse", I just thought "rbl"
was both too generic (there are lots of potential subjects relating to RBL) and
too specific to blacklists, whereas this bug is specifically related to a
whitelist.  Although if you really want to over-generalize the term RBL to
include whitelists (as I think a relevant RFC has done) I wouldn't argue. 
Would you like to recommend another URL?

> To me, it's an error requiring administrative attention with a landing page to
> help them try and resolve the issue.  Nothing more, nothing less.

Yep.

> > You mis-read what I said.  I never suggested false positives (in fact I
> > suggested it was bad that SEM intentionally caused false positives).  I was
> > talking about causing false negatives (spam being marked as non-spam).  
> 
> That is not the correct definition of a FN in my opinion.  By your definition,
> any email that got through SA for any reason is a False Negative.

Er, yeah, that sounds like a pretty good definition to me.  Especially if SA
actually slaps on a "X-Spam-Status: No" header, which would be the case here.

> We have to ship SA in a way that is safe for the vast majority of users which
> might not be the most effective for blocking all Spam.

I don't see how that is at all conflicting with what I have suggested.

> PRIMARILY, I want to see a method which doesn't artificially change the SA
> scoring up or down substantially.  
> 
> An RBL that starts returning ALL true or ALL false for over-limit issues is
> artificially changing the scores.  No answer or an answer handled as an error
> would be acceptable.  
> 
> At worst, the queries can be stopped by blackholing the requests from overlimit
> IPs. So this is really a matter for the RBL to handle.

That is certainly one option.

> However, some RBLs want to convert those over-limit users into customers and
> they do so through harmful techniques to get the admin's attention.

I don't claim to know the intentions of the owners of DNSWL and SEM.  But I'm
not convinced that it's inappropriate to intentionally affect scores
(preferably with false negatives instead of false positives) in order to get
the attention of an administrator to explain the problem and get them to either
stop sending millions of queries a day, or start sending money.


RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched
the return value of 127.0.0.255 with a score of -5 and a description that was
helpful in resolving the situation that could not be construed as advertising?


Another possibility I brought up 6 months ago in bug #6220 was, when receiving
a return value of 127.*.*.255, disabling that rule.  No more load on the
provider, no skewed score for the user, no advertising.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #36 from drmres@yahoo.com 2011-12-14 03:36:28 UTC ---
Please note: Comments by Darxus should be disregarded in this report as he is
an active DNSWL admin with direct personal gain interest in this issue.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #34 from Steve Freegard <st...@stevefreegard.com> 2011-12-13 11:46:39 UTC ---
(In reply to comment #33)
> I can tell you that I have nothing on my public NS for URIBL that gives out FP
> answers.  I do have the rbldnsd ACL implemented which I believe does interfere
> but only in a blocking/pretend there is no data way.

As the web site says - it uses 'Split Horizon' to do this, so the mirrors
wouldn't see who was being blocked and when, as it's being done upstream by
supplying different NS records to blocked senders, which in turn return the
positive replies.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Darxus <Da...@ChaosReigns.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Darxus@ChaosReigns.com

--- Comment #3 from Darxus <Da...@ChaosReigns.com> 2011-10-03 17:53:51 UTC ---
(In reply to comment #1)
> I would personally veto this immediately.  We are not an advertising service
> for RBLs.

I find that statement kind of interesting, when shutting off network tests,
many of which require payment over some threshold (often around 100,000 hits a
day), makes SpamAssassin five times less accurate.  5.35x the false positives,
and 4.25x the false negatives, based on the 2011-03-24 score generation.  And
that's if SA *knows* the network tests aren't working.  What if it's expecting
the tests to work, and the major ones aren't because of going over their (free
use) thresholds?  Probably bad.

I'm not happy about it, but SA seems pretty dependent on things like RBLs
which, under some circumstances, charge money.


From the Ubuntu SpamAssassin 3.3.1 package:

/usr/share/doc/spamassassin/rules/STATISTICS-set0.txt.gz (no bayes, no net)
# SUMMARY for threshold 5.0:
# False positives:       238  1.12%
# False negatives:      9678  21.93%

/usr/share/doc/spamassassin/rules/STATISTICS-set1.txt.gz (no bayes, net enabled)
# SUMMARY for threshold 5.0:
# False positives:        30  0.14%
# False negatives:      1381  3.13%

7.93x the false positives, 7.01x the false negatives, without network tests.  
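
Those ratios follow directly from the counts in the two STATISTICS summaries quoted above; a short Python check, using only those four numbers (the variable names are arbitrary):

```python
# Counts taken from the STATISTICS files quoted above (threshold 5.0).
no_net = {"fp": 238, "fn": 9678}   # set0: no bayes, no net
with_net = {"fp": 30, "fn": 1381}  # set1: no bayes, net enabled

fp_ratio = no_net["fp"] / with_net["fp"]
fn_ratio = no_net["fn"] / with_net["fn"]

print(f"{fp_ratio:.2f}x the false positives")  # 7.93x the false positives
print(f"{fn_ratio:.2f}x the false negatives")  # 7.01x the false negatives
```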

> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases.  Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.

I think the cases where DNSWL has done this are likely to qualify as "most extreme".

> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here: 
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
> 
> I agree.  Which RBLs have this issue?  I will immediately work to disable them
> in a default SA installation for the 3.4.0 release.

According to Michael Scheidell, Spamhaus's (providers of ZEN, SBL, PBL, XBL,
included in SA by default) policy of blocking queries results in "10 and 20 min
delays in inbound email" - bug #6220.  You could call that DOSing email
providers, instead of disabling spam filtration, both with the same goal of
getting the provider to disable the relevant network tests.  Which is worse?

Should the Spamhaus rules be removed from the default SA rule set because they
will DOS email providers for querying them for over 100,000 emails per day?

SEM (bug #6220) is the only one I know of that affects scores.  And by a
mechanism that seemed to have the approval of SpamAssassin folks.  Should that
bug be closed, and the rules not included in SA by default, because of that
mechanism?


I think it would be great if SpamAssassin, by default, didn't include any
network rules that have limits on free use.  Although it would probably require
more work to improve the accuracy, which I don't really see happening.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #27 from Warren Togami <wt...@gmail.com> 2011-12-12 19:41:17 UTC ---
(In reply to comment #26)
> (In reply to comment #25)
> > Should these rules be put in a sandbox so they continue to be monitored?  They
> > could also be left enabled with informational scores so reuse could be used,
> > but I doubt that would be worthwhile.
> 
> I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are
> in Theo's sandbox which is where they have been living for a while.
> 
> As of 3.3.0, I believe, we were publishing hand-generated scores that are
> higher than masscheck auto-determined.

We were publishing hand-generated scores for DNSWL long before 3.3.0.

http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
I led the charge to manually reduce these hard-coded scores prior to 3.3.0 due
to this issue.  Subsequently I went further and suggested that we should
reduce DNSWL and IADB scores again, as they don't seem to have automatic
means of enforcement in place and we see consistent FPs in our tests.  More
recently I suggested that we should set all whitelists to -0.01 informational
during GA scoring, as they have nothing to do with the performance of positive
scoring rules and thus can improperly throw off the scoring.

> 
> Overall, the efficacy of DNSWL outside of the FP scores is well established
> and that's not a barrier to the re-enabling of the scores.

http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
Well established based on what?  Has anyone looked at the statistics to prove
that this situation has not changed?


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #6 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 19:00:49 UTC ---
> If the RBL provides a documented "You are overusing the free service" return
> code, what is the problem with recognizing that and hitting a non-scoring
> (0.001, neither FP nor FN) rule with an informative description? It doesn't
> need to contain a link to the RBL's TOS or subscription page (advertising), but
> telling the admin _why_ they're getting an unusable response from the RBL is
> polite.
> 
> I think that's a much better approach than either removing one of the most
> effective antispam techniques by default, or having the RBL suddenly mark
> _everything_ as spam because we don't interpret the "overuse" code correctly.

I don't have a problem with this concept.  My veto statement above had to do
with the actual URL and the description which were a direct link and
advertisement for a vendor.

A more generic message such as this would be fine and +1'd by me:

The RBL responded with a failure code.  Visit www.spamassassin.org/rbl for more
information.

Regards,
KAM


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #9 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 19:24:01 UTC ---
(In reply to comment #8)
> (In reply to comment #5)
> > If the RBL provides a documented "You are overusing the free service" return
> > code, what is the problem with recognizing that and hitting a non-scoring
> > (0.001, neither FP nor FN) rule with an informative description?
> 
> The problem, from the perspective of DNSWL.org, is that it provides no
> incentive to stop sending millions of queries a day.

Then the RBL should be disabled by SA by default and the RBL should consider a
sign-up procedure for activation so they have an out of band contact method.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #20 from Darxus <Da...@ChaosReigns.com> 2011-12-11 18:58:23 UTC ---
(In reply to comment #19)
> 1 - Do we have any other RBLs enabled by default that return False Positives
> once a threshold is hit?

I believe we do not.

> Unless someone steps up with some ideas, I'm going to disable DNSWL by default
> very shortly.

I just forwarded this along to the DNSWL admins list so they are aware of the
situation.


Re: No drop in DNSWL queries while disabled

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.

On 12/19/2011 6:08 PM, darxus@chaosreigns.com wrote:
> I noticed disabling DNSWL in SA didn't seem to affect bandwidth.  Then
> Matthias pointed out the possibility it was because this lacked:
>
> meta __DNSBL_FOO  0
>
> Seems Karsten's ranting was correct :P
>
> Shame, would have been fun to see that data.
>
>
> I see http://wiki.apache.org/spamassassin/DnsBlocklists says you should
> use:
>
> score   __RCVD_IN_ZEN  0
>
> .."score" instead of "meta".  Is one better than the other?
>
>
> Should we open a bug to disable things like __RCVD_IN_DNSWL when all its
> subrules have a score of 0?  Current behavior seems not great.
I believe we have a bug open concerning an interface to enable/disable
DNSBLs, so this note might be good to document there.

No drop in DNSWL queries while disabled

Posted by da...@chaosreigns.com.

I noticed disabling DNSWL in SA didn't seem to affect bandwidth.  Then
Matthias pointed out the possibility it was because this lacked:

meta __DNSBL_FOO  0

Seems Karsten's ranting was correct :P

Shame, would have been fun to see that data.


I see http://wiki.apache.org/spamassassin/DnsBlocklists says you should
use:

score   __RCVD_IN_ZEN  0

.."score" instead of "meta".  Is one better than the other?


Should we open a bug to disable things like __RCVD_IN_DNSWL when all its
subrules have a score of 0?  Current behavior seems not great.
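
The behaviour asked about above can be sketched as follows.  This is only an
illustration in Python, not SpamAssassin's actual code: the needs_lookup
helper and the data layout are hypothetical, and the rule names are the ones
mentioned in this thread.

```python
# Hypothetical sketch: decide whether a DNS list lookup is still worth sending.
# Not SpamAssassin's implementation; helper name and data layout are assumptions.

def needs_lookup(parent, subrules, scores):
    """Return True if any sub-rule of `parent` has a non-zero score."""
    return any(scores.get(rule, 0) != 0 for rule in subrules.get(parent, []))


# Sub-rules that depend on the __RCVD_IN_DNSWL lookup (names from the thread).
subrules = {
    "__RCVD_IN_DNSWL": [
        "RCVD_IN_DNSWL_NONE",
        "RCVD_IN_DNSWL_LOW",
        "RCVD_IN_DNSWL_MED",
        "RCVD_IN_DNSWL_HI",
    ],
}

# Every sub-rule scored 0: the lookup could be skipped.
all_zero = {name: 0 for name in subrules["__RCVD_IN_DNSWL"]}

# One sub-rule still scored: the lookup is still needed.
one_active = dict(all_zero, RCVD_IN_DNSWL_HI=-5)
```

With all sub-rule scores at 0 the helper reports that the query is not needed,
which is the behaviour the question above is asking for.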


On 12/12, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668
> 
> Kevin A. McGrail <km...@pccc.com> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|REOPENED                    |RESOLVED
>          Resolution|                            |FIXED
> 
> --- Comment #24 from Kevin A. McGrail <km...@pccc.com> 2011-12-12 16:31:32 UTC ---
> > The reason for the special result code, as indicated in the posting referenced
> > above, is that REFUSED rcode will result in triple the amount of queries in
> > most cases. 
> 
> In the absence of a patch to implement your special return value (which I think
> needs to be outside of 127.X and should be discussed with other RBLs), I can
> only recommend that you simply blackhole the requests from servers in excess of
> 100K that you consider abusive.
> 
> Additionally, as with Joao, I am also happy to support your project with a
> public nameserver.
> 
> However, I can't support your policy that causes FPs in SA as I feel it is
> unrealistic to launch an RBL and not expect this type of problem.  
> 
> As of today, DNSWL will be disabled by default in SA's rules.  SA admins
> wishing to use it should add something like this to their local.cf:
> 
> #ENABLING DNSWL - BUG 6668
> score RCVD_IN_DNSWL_NONE 0 -0.0001 0 -0.0001
> score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
> score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
> score RCVD_IN_DNSWL_HI 0 -5 0 -5
> 
> This disabling will be effective with the next rules update.
> 
> However, please note that we are *very* open to discussing policy changes that
> will help maintain your project and its success as a spam test, and not cause
> FPs, so that it could be re-enabled by default.
> 
> Regards,
> KAM
> 
> svn commit -m 'Changing scores of DNSWL due to FPs caused by their nameservers
> anti-abuse policies - Bug 6668'
> Sending        rules/50_scores.cf
> Transmitting file data .
> Committed revision 1213299.
> 
> 

-- 
"Think, or I will set you on fire."
http://www.ChaosReigns.com

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #24 from Kevin A. McGrail <km...@pccc.com> 2011-12-12 16:31:32 UTC ---
> The reason for the special result code, as indicated in the posting referenced
> above, is that REFUSED rcode will result in triple the amount of queries in
> most cases. 

In the absence of a patch to implement your special return value (which I think
needs to be outside of 127.X and should be discussed with other RBLs), I can
only recommend that you simply blackhole the requests from servers in excess of
100K that you consider abusive.

Additionally, as with Joao, I am also happy to support your project with a
public nameserver.

However, I can't support your policy that causes FPs in SA as I feel it is
unrealistic to launch an RBL and not expect this type of problem.  

As of today, DNSWL will be disabled by default in SA's rules.  SA admins
wishing to use it should add something like this to their local.cf:

#ENABLING DNSWL - BUG 6668
score RCVD_IN_DNSWL_NONE 0 -0.0001 0 -0.0001
score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

This disabling will be effective with the next rules update.
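
For readers unfamiliar with the four-column score form used above:
SpamAssassin picks one of the four values depending on whether network tests
and Bayes are enabled.  The sketch below illustrates that selection in Python.
The pick_score helper is hypothetical and only mirrors the documented column
order (no net/no Bayes, net/no Bayes, no net/Bayes, net/Bayes).

```python
# Hypothetical helper mirroring SpamAssassin's four score sets.
# Column order: set 0 = no network, no Bayes; set 1 = network, no Bayes;
#               set 2 = no network, Bayes;   set 3 = network and Bayes.

def pick_score(scores, net_enabled, bayes_enabled):
    """Return the score that applies for the given configuration."""
    index = (1 if net_enabled else 0) + (2 if bayes_enabled else 0)
    return scores[index]


# Values copied from the local.cf lines above.
DNSWL_LOW = (0, -0.7, 0, -0.7)
DNSWL_HI = (0, -5, 0, -5)
```

This is why the lines above carry 0 in the first and third columns: a network
rule such as a DNSWL lookup only scores when network tests are enabled.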

However, please note that we are *very* open to discussing policy changes that
will help maintain your project and its success as a spam test, and not cause
FPs, so that it could be re-enabled by default.

Regards,
KAM

svn commit -m 'Changing scores of DNSWL due to FPs caused by their nameservers
anti-abuse policies - Bug 6668'
Sending        rules/50_scores.cf
Transmitting file data .
Committed revision 1213299.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

John Hardin <jh...@impsec.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jhardin@impsec.org

--- Comment #5 from John Hardin <jh...@impsec.org> 2011-10-03 18:55:04 UTC ---
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > I would personally veto this immediately.  We are not an advertising service
> > > for RBLs.
> > 
> > I find that statement kind of interesting, when shutting off network tests,
> > many of which require payment over some threshold (often around 100,000 hits a
> > day), makes SpamAssassin five times less accurate.  
> 
> IMO, ANY provider that gives FALSE positives under any circumstances should not
> be configured to be enabled by default with SA.
> 
> I have zero problem with them stopping their replies and zero problems with
> them charging for heavy usage.

If the RBL provides a documented "You are overusing the free service" return
code, what is the problem with recognizing that and hitting a non-scoring
(0.001, neither FP nor FN) rule with an informative description? It doesn't
need to contain a link to the RBL's TOS or subscription page (advertising), but
telling the admin _why_ they're getting an unusable response from the RBL is
polite.

I think that's a much better approach than either removing one of the most
effective antispam techniques by default, or having the RBL suddenly mark
_everything_ as spam because we don't interpret the "overuse" code correctly.
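
The idea in this comment can be sketched roughly as below.  This is a Python
illustration only, not SpamAssassin code: the rule names RBL_OVER_LIMIT,
RBL_LISTED and NOT_LISTED are hypothetical, and 127.0.0.255 is assumed as the
documented overuse code purely for the example.

```python
import re

# Assumed overuse code for this example; a real list would document its own.
OVERUSE_CODE = re.compile(r"^127\.0\.0\.255$")

# Non-scoring value suggested in the comment above.
INFORMATIONAL_SCORE = 0.001


def classify_answer(answer, listed_score):
    """Map a DNS list answer to a (rule_name, score) pair.

    `answer` is the returned A record as a string, or None for no answer.
    """
    if answer is None:
        return ("NOT_LISTED", 0.0)
    if OVERUSE_CODE.match(answer):
        # Explain the problem to the admin without moving the message score.
        return ("RBL_OVER_LIMIT", INFORMATIONAL_SCORE)
    return ("RBL_LISTED", listed_score)
```

The overuse answer is checked before the normal listing test, so it can never
be mistaken for a real hit.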


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #2 from AXB <ax...@gmail.com> 2011-10-03 12:58:45 UTC ---
(In reply to comment #1)
> > ##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> > 
> > ifplugin Mail::SpamAssassin::Plugin::DNSEval
> > header  RCVD_IN_DNSWL_ABUSE        eval:check_rbl_sub('dnswl-firsttrusted',
> > '^127\.0\.\d+\.255$')
> > describe RCVD_IN_DNSWL_ABUSE       You are using a DNS server that is placing
> > too high a load on the DNSWL.org DNS servers without a subscription, please see
> > https://subscription.dnswl.org/
> > tflags RCVD_IN_DNSWL_ABUSE         nice net
> > endif
> > ##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> 
> I would personally veto this immediately.  We are not an advertising service
> for RBLs.
> 
> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases.  Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.
> 
> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here: 
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
> 
> I agree.  Which RBLs have this issue?  I will immediately work to disable
> them in a default SA installation for the 3.4.0 release.

Warren forgot URIBL.com in his list.  AFAIK it also has a limit of 300k
queries/day which, like the others, is usually enough for the average site.

If the BL query limits hit ISPs/service providers/huge corps, they're freeriding
on (in most cases) donated resources and being cheap, so nobody should be
surprised if their queries are blocked/filtered.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Matthias Leisi <ma...@leisi.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matthias@leisi.net

--- Comment #21 from Matthias Leisi <ma...@leisi.net> 2011-12-11 20:59:48 UTC ---
(speaking for dnswl.org)

(In reply to comment #19)
> As noted by Darxus, this is PURPOSEFUL behavior to return true statements for
> what they consider abuse.
> 
> "DNSWL announced this behavior here: 
> http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html"

Currently, the following nameservers are getting a "listed, high trust"
answer (plus the reasons for blocking them):

* Google Public DNS servers (multi-million queries per 24 hours, no response
from Google contacts)
* Some big hosting provider resolvers: softlayer.com, dimenoc.com,
theplanet.com, bluehost.com, dyndns.com, netline.net.uk (multi-million queries
per 24 hours, no response/action from abuse@ and similar contacts)
* Five single hosts with multi-million queries per 24 hours with no
response/action from multiple contacts.

The reason for the special result code, as indicated in the posting referenced
above, is that REFUSED rcode will result in triple the amount of queries in
most cases. 

This is not used for those doing below one million queries per 24 hours
(aggregated over those IPs that can be identified as belonging to the same
organisation/user).
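
The threshold described above could be applied along these lines.  This is a
Python sketch under stated assumptions, not dnswl.org's actual tooling: the
over_threshold helper, the per-IP counts and the IP-to-organisation mapping
are all hypothetical, and only the one-million figure comes from the comment.

```python
from collections import Counter

# Queries per 24 hours, as stated in the comment above.
ABUSE_THRESHOLD = 1_000_000


def over_threshold(queries_by_ip, org_of_ip, threshold=ABUSE_THRESHOLD):
    """Return the organisations whose aggregated 24h query count reaches
    the threshold.  IPs with no known organisation are counted on their own.
    """
    totals = Counter()
    for ip, count in queries_by_ip.items():
        totals[org_of_ip.get(ip, ip)] += count
    return {org for org, total in totals.items() if total >= threshold}
```

Aggregating first matters: two resolvers at 600,000 queries each stay under
the limit individually but exceed it once attributed to the same organisation.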


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #26 from Kevin A. McGrail <km...@pccc.com> 2011-12-12 17:36:21 UTC ---
(In reply to comment #25)
> Should these rules be put in a sandbox so they continue to be monitored?  They
> could also be left enabled with informational scores so reuse could be used,
> but I doubt that would be worthwhile.

I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are
in Theo's sandbox which is where they have been living for a while.

As of 3.3.0, I believe, we were publishing hand-generated scores that are
higher than masscheck auto-determined.

Overall, the efficacy of DNSWL outside of the FP scores is well established and
that's not a barrier to the re-enabling of the scores.

Regards,
KAM


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Steve Freegard <st...@stevefreegard.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steve@stevefreegard.com

--- Comment #32 from Steve Freegard <st...@stevefreegard.com> 2011-12-12 23:49:14 UTC ---
Just to add my 2c....

KAM:  This was discussed before with regards to URIBL doing the same stuff, see
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6048

URIBL still has the ability to return 127.0.0.255 for all queries as per their
'abuse' page (see http://uribl.com/about.shtml#abuse), and they were returning
positive for queries from Google DNS about 3-4 weeks ago (AXB can probably
confirm this).

Personally I think it would be a shame to lose either list from the default
rulesets despite these practices.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

John Hardin <jh...@impsec.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drmres@yahoo.com

--- Comment #18 from John Hardin <jh...@impsec.org> 2011-12-10 18:07:16 UTC ---
*** Bug 6718 has been marked as a duplicate of this bug. ***


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #14 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 22:18:28 UTC ---
> > But again, I am one vote and this
> > is my opinion.  
> 
> "Votes on code modifications follow a different model. In this scenario, a
> negative vote constitutes a veto , which cannot be overridden."
> "...the proposal requires three positive votes and no negative ones in order to
> pass..."
> - http://www.apache.org/foundation/voting.html
>
> By our rules, it's enough on its own to make this not happen.

Good point.  Well I have not voted formally so I don't need to withdraw a vote.
So let's continue the discussion and get more votes and I won't submarine it if
others agree with you.

> >  - the NET result of the rules for the RBL in question in total add up to zero
> > (or subsequently similar e.g. 0.0001, etc.) So if there is a positive score and
> > a negative score, the two together = 0.  In other words, an RBL can't issue a
> > response that incorrectly affects scores on purpose due to limits, technical
> > errors, etc.
> 
> I believe that requirement would eliminate dnswl.org's interest.  Since you're
> willing to veto without it, I think that's sufficient to consider this thread
> dead.

I would strongly try and convince others it is wrong to purposefully give wrong
answers from an RBL that lead to skewed scoring.  If a patch you are proposing
skews the scores plus or minus, expect me to request for it to be revised to a
net 0.

If DNSWL only wants a case where the scores are skewed to gain attention from
admins/users, then it seems they want SA to be a sales lead generator.  This is
exactly what I want to prevent.

> I don't understand why you say that.  It's just another way of handling a
> 127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's
> still just an implementation of providing a .255 response for users who are
> over limit.

Because to me 255 is a legitimate bit mask for a valid response. 

- Do older versions of SA contain code that considers .255 as an invalid
response for an RBL?

- Is there agreement among RBLs that .255 is considered an error code?

I would support some standard for an error code but likely it should be
something in a different class c such as 192.168.255.X or something similar.

And I have more ideas on it I'll add below.



> As an example, say an email provider is using spamassassin to filter millions
> of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL)
> cause queries to zen.spamhaus.org.  That being over their free use
> threshold, they start returning (only) 127.0.0.255 for all queries, to indicate
> the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and
> stops running all rules that hit zen.spamhaus.org.

Zen, according to their docs, does not issue a .255. See
http://www.spamhaus.org/faq/answers.lasso?section=DNSBL%20Usage#200

But assuming they did: if your ISP uses an old version of SA and Zen responds
with .255, it's considered a true hit and legitimate email gets blocked.

In short, an error bitmask will have YEARS of lag in getting an error code in
place for RBLs.

The only way I see it could happen is to get an RBL to announce via
alternate names so querying zen.spamhaus.org would never give out .255 but
querying zenv2.spamhaus.org could implement an error code response that APIs
would know how to properly implement.

> > but
> > this sounds a bit like a DoS ready to happen AND it's a case where the rule
> > that implemented this likely couldn't be on by default as shipped by SA.  If
> > they are smart enough to turn on the feature, they likely know enough about RBL
> > queries to perform local caching, rsync, etc.
> 
> How is that a DoS ready to happen?  Are we having another misunderstanding
> here?

I just see that as an avenue to figure out how to trick your system into
getting a DNS response that changes SA not to query an RBL in order to get all
my Spam through.  With the number of DNS servers that change responses, this
doesn't sound that hard.

> > I run quite a number of RBL public nameservers.  I don't consider the traffic
> > to be that big a deal and I can blackhole queries quite easily.
> 
> Are they RBLs that spamassassin has enabled by default?  I run one dnswl.org
> mirror, and the only reason I can do that is my provider is willing to overlook
> my bandwidth limit due to a belief that dnswl is worth supporting.  Mirroring
> dnswl.org causes almost all of my bandwidth usage.

If DNSWL needs another public mirror, have them email me.  The solution to me
is to increase public mirrors not to harm the flow of email to try and get
people to use the service less.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

Darxus <Da...@ChaosReigns.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #17 from Darxus <Da...@ChaosReigns.com> 2011-10-17 20:24:11 UTC ---
Closing, not going anywhere.

An additional bit of information from Matthias:  In these "abuse" cases, he is
initially just blocking the queries.  "Due to the way some DNS resolvers work,
this may result in a *higher* query rate, since the resolver just tries it
again and again to get an answer."


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #8 from Darxus <Da...@ChaosReigns.com> 2011-10-03 19:17:39 UTC ---
(In reply to comment #5)
> If the RBL provides a documented "You are overusing the free service" return
> code, what is the problem with recognizing that and hitting a non-scoring
> (0.001, neither FP nor FN) rule with an informative description?

The problem, from the perspective of DNSWL.org, is that it provides no
incentive to stop sending millions of queries a day.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #15 from Darxus <Da...@ChaosReigns.com> 2011-10-04 21:13:11 UTC ---
(In reply to comment #14)
> > I don't understand why you say that.  It's just another way of handling a
> > 127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's
> > still just an implementation of providing a .255 response for users who are
> > over limit.
> 
> Because to me 255 is a legitimate bit mask for a valid response. 

I was providing an example (127.0.0.255), not suggesting that value always be
treated this way.  I think it would be necessary to create another eval thing
to define a regex for each RBL.
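
A per-list table of over-limit patterns, as suggested above, might look like
the following.  This is a Python sketch, not an existing SpamAssassin eval:
the is_over_limit helper and the zone names used as keys are assumptions for
illustration, while the patterns follow the values mentioned in this thread
(127.*.*.255 for DNSWL, 127.0.0.255 for URIBL).

```python
import re

# Hypothetical table: each list declares its own over-limit answer pattern.
# Lists without an entry (e.g. Zen, which per its docs issues no .255) never
# match, so .255 is not treated as an error code globally.
OVER_LIMIT_PATTERNS = {
    "list.dnswl.org": re.compile(r"^127\.\d+\.\d+\.255$"),
    "multi.uribl.com": re.compile(r"^127\.0\.0\.255$"),
}


def is_over_limit(zone, answer):
    """Return True if `answer` is the over-limit code declared for `zone`."""
    pattern = OVER_LIMIT_PATTERNS.get(zone)
    return bool(pattern and pattern.match(answer))
```

Keeping the pattern per list avoids treating 255 as special for lists that use
it as a legitimate bit mask.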

> > As an example, say an email provider is using spamassassin to filter millions
> > of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL)
> > cause queries to zen.spamhaus.org.  That being over their free use
> > threshold, they start returning (only) 127.0.0.255 for all queries, to indicate
> > the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and
> > stops running all rules that hit zen.spamhaus.org.
> 
> Zen, according to their docs, does not issue a .255. See
> http://www.spamhaus.org/faq/answers.lasso?section=DNSBL%20Usage#200

Right, just providing an example.

> In short, an error bitmask will have YEARS of lag in getting an error code in
> place for RBLs.

For all of them, yes.

> > How is that a DoS ready to happen?  Are we having another misunderstanding
> > here?
> 
> I just see that as an avenue to figure out how to trick your system into
> getting a DNS response that changes SA not to query an RBL in order to get all
> my Spam through.  With the number of DNS servers that change responses, this
> doesn't sound that hard.

Sounds hard to me (to use this to cause a DoS).

> If DNSWL needs another public mirror, have them email me.  

I'll let them know.


If I don't get any positive responses within a couple days, I'll close this (or
someone else can feel free).


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #31 from Warren Togami <wt...@gmail.com> 2011-12-12 22:03:48 UTC ---
Matthias, I join in asking you to please reconsider your approach to misuse
prevention.  Causing FNs is hardly an effective means of making the sysadmin
take notice, as they aren't losing any important mail the way they would with
FPs.

Blackholing is a superior approach as it causes DNS timeout delays in mail
delivery, which legitimately causes problems for the sysadmin and is more
likely to cause them to take notice that there is a problem.

Also, I too have the ability to host a high capacity mirror of DNSWL.  Would
you allow us to help?


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #25 from Darxus <Da...@ChaosReigns.com> 2011-12-12 17:27:10 UTC ---
Should these rules be put in a sandbox so they continue to be monitored?  They
could also be left enabled with informational scores so reuse could be used,
but I doubt that would be worthwhile.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

João Gouveia <jo...@anubisnetworks.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joao.gouveia@anubisnetworks
                   |                            |.com

--- Comment #23 from João Gouveia <jo...@anubisnetworks.com> 2011-12-11 21:09:39 UTC ---
Matthias,

If it helps, I can offer our support by adding DNSWL to our public mirrors.


[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #33 from Kevin A. McGrail <km...@pccc.com> 2011-12-13 00:08:32 UTC ---
(In reply to comment #32)
> Just to add my 2c....
> 
> KAM:  This was discussed before with regards to URIBL doing the same stuff, see
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6048
> 
> URIBL still has the ability to return 127.0.0.255 for all queries as per their
> 'abuse' page (see http://uribl.com/about.shtml#abuse), and they were returning
> positive for queries from Google DNS about 3-4 weeks ago (AXB can probably
> confirm this).
> 
> Personally I think it would be a shame to lose either list from the default
> rulesets despite these practices.

It's good to mention this because we need to implement it the same way for
URIBL.  My understanding, from about two years ago, was that URIBL changed to
blocking the query rather than returning false positives.

I can tell you that I have nothing on my public NS for URIBL that gives out FP
answers.  I do have the rbldnsd ACL implemented, which I believe does interfere,
but only by blocking, i.e. pretending there is no data.

Blocking/pretending no data for queries is considered acceptable, I believe.

AXB, can you confirm otherwise?

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #30 from Kevin A. McGrail <km...@pccc.com> 2011-12-12 21:12:31 UTC ---
> It should be noted that the policy contested in this bug does not cause FPs. It
> does cause FNs for a small number of users (where other attempts to rectify an
> unaccepted situation failed). On the other hand, removing the rules will lead
> to a higher risk of FPs for 99.something % of users.

To clarify, the DNSWL policy of returning positive answers to gain the
attention of administrators with SA installations sending DNS queries to DNSWL
through over-quota IPs causes misfiring on the DNSWL Rules.  It's a FP on the
Rule regardless of the negative or positive score the rule applies.  

The scoring effect (FP/FN or even a neutral) on the status of the email is
another discussion.  

However, I agree that an FP on a negative-scoring rule is likely to cause an FN
on an email that is spam.
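
To illustrate the distinction between a false positive on the rule and a false
negative on the message, here is a minimal sketch.  The threshold of 5.0 is
SpamAssassin's default required_score; the two spam rule names and all scores,
including the -5.0 used for RCVD_IN_DNSWL_HI, are assumptions chosen only for
the arithmetic.

```python
# Sketch only: rule names and scores below are illustrative assumptions.
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def is_spam(rule_scores, required=REQUIRED_SCORE):
    """Return True when the summed rule scores reach the threshold."""
    return sum(rule_scores.values()) >= required


# A spam message that two hypothetical rules correctly hit.
hits = {"HYPOTHETICAL_SPAM_RULE_A": 3.5, "HYPOTHETICAL_SPAM_RULE_B": 2.5}

# The same message when an over-quota resolver is answered with "HI",
# so the whitelist rule misfires (an FP on the rule).
with_misfire = dict(hits, RCVD_IN_DNSWL_HI=-5.0)

print(is_spam(hits))          # message scored on its own merits
print(is_spam(with_misfire))  # message scored with the misfiring rule
```

With these assumed numbers the message scores 6.0 without the misfire and 1.0
with it, so the rule-level FP turns into a message-level FN.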

Blackhole the requests instead, or add more public NS to protect the
infrastructure.

And in case you missed it, two experienced RBL infrastructure operators
(including myself) have offered to help with more public nameservers.

Regards,
KAM

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #7 from Darxus <Da...@ChaosReigns.com> 2011-10-03 19:16:27 UTC ---
(In reply to comment #6)
> I don't have a problem with this concept.  My veto statement above had to do
> with the actual URL and the description which were a direct link and
> advertisement for a vendor.

Ah, that was just a rough guess at how it should be implemented.  A url to a
spamassassin page certainly seems entirely appropriate to me.  

> A more generic message such as this would be fine and +1'd by me:
> 
> The RBL responded with a failure code.  Visit www.spamassassin.org/rbl for more
> information.

That sounds great to me as well.  Although I'd prefer something in the wiki for
maintainability, I don't know, maybe
http://wiki.apache.org/spamassassin/XBLAbuse ?


(In reply to comment #4)
> > getting the provider to disable the relevant network tests.  Which is worse?
> 
> False positives are worse from an anti-Spam perspective.  

You misread what I said.  I never suggested false positives (in fact I
suggested it was bad that SEM intentionally caused false positives).  I was
talking about causing false negatives (spam being marked as non-spam).

> That's unrealistic as there are great services that have reasonable thresholds
> for use. 

So the question is, what are acceptable methods of enforcing those thresholds? 
Blocking queries resulting in delay of email is acceptable to you.  I don't
know how effective that is in getting people to stop querying, and it doesn't
provide any feedback to indicate that there is a problem.  Is it acceptable to
cause false-negatives, spam being marked as non-spam, with clear indication
(via a matching rule and description) of what the problem is?

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

D. Stussy <so...@kd6lvw.ampr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |software+spamassassin@kd6lv
                   |                            |w.ampr.org

--- Comment #16 from D. Stussy <so...@kd6lvw.ampr.org> 2011-10-05 20:18:14 UTC ---
What DNSBLs should do is return a result which is not within the 127.0.0.0/8
subnet to indicate an answer which doesn't constitute listing -- especially if
they decide not to issue a DNS RC of "refused."  That way, there will be no
confusion should some other DNSBL define "127.0.0.255" as a valid reply.  It
also works in the case of a shut down DNSBL where a valid IP address from a
domain squatter is returned (especially by use of a wildcarded DNS response).

As to detecting an "excessive query" condition and scoring it with a value
sufficiently near zero (e.g. 0.001), I am in favor of such an approach.

Future queries to any DNS based list should not happen if a given DNS list
returns a "REFUSED" answer (until SA is restarted).  For classic lists, a query
returning an A record outside of 127/8 should also be interpreted as "refused."

If "127.0.0.255" is to be treated as a special case of "refused," it should be
handled by a rule on a per DNSBL basis.  In other words, I suggest that this
type of response is not preferred.
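
The interpretation rules suggested above could be sketched roughly as follows.
This is not SpamAssassin code: the function name, the result labels and the
over_quota_codes parameter are all hypothetical, and the caller is assumed to
have already resolved the query and extracted the A-record value.

```python
import ipaddress

# Classic DNS lists signal a listing with an address inside 127.0.0.0/8.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def classify_answer(answer, over_quota_codes=frozenset()):
    """Classify one A-record value returned by a DNS list.

    answer: dotted-quad string from the A record, or None for no data.
    over_quota_codes: per-list set of addresses that mean "excessive use"
        (hypothetical parameter; e.g. {"127.0.0.255"} for a list that
        documents that value as a failure code).
    """
    if answer is None:
        return "not_listed"
    addr = ipaddress.ip_address(answer)
    if addr not in LOOPBACK:
        # Outside 127/8: treat as "refused" (also covers a squatted or
        # wildcarded zone answering with a real address).
        return "refused"
    if answer in over_quota_codes:
        # Special-cased per list, never globally.
        return "over_quota"
    return "listed"
```

Because the over-quota set is passed per list, a list that defines 127.0.0.255
as a valid listing code is unaffected by another list's use of it as a failure
code.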

Since classic DNSBLs are all supposed to return "127.0.0.2" for a query for
IPv4 address 127.0.0.2, maybe upon SA startup, each DNSBL should be tested for
the value.  However, there is a good reason for not performing "unnecessary"
queries.  If the entire world rebooted at the same time, would the DNSBLs be
DOS'ed with a flood of queries?
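
For reference, the startup test query mentioned above would be built like any
other DNS list lookup: the octets of the address are reversed and prepended to
the list's zone.  A small sketch, where the function name and the zone
dnsbl.example.org are placeholders and the actual DNS lookup is left out:

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address on a DNS list."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Name to look up for the 127.0.0.2 test entry on a placeholder zone.
print(dnsbl_query_name("127.0.0.2", "dnsbl.example.org"))
# prints "2.0.0.127.dnsbl.example.org"
```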

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

bernd <dn...@grisu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dnswl@grisu.org

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #35 from Kevin A. McGrail <km...@pccc.com> 2011-12-13 12:09:15 UTC ---
(In reply to comment #34)
> (In reply to comment #33)
> > I can tell you that I have nothing on my public NS for URIBL that gives out FP
> > answers.  I do have the rbldnsd ACL implemented which I believe does interfere
> > but only in a blocking/pretend there is no data way.
> 
> As the web site says - it uses 'Split Horizon' to do this, so the mirrors
> wouldn't see who was being blocked and when, as it's being done upstream by
> supplying different NS records to blocked senders, which in turn return the
> positive replies.

I've asked the URIBL admins to comment.  But please open a different bug on the
URIBL issue.  You've sort of hijacked this DNSWL bug.

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #37 from Darxus <Da...@ChaosReigns.com> ---
(In reply to comment #36)
> Please note: Comments by Darxus should be disregarded in this report as he
> is an active DNSWL admin with direct personal gain interest in this issue.

I have no "direct personal gain interest in this issue."  My relationship with
dnswl.org was fully disclosed to everyone long ago.  It's no different from my
relationship with spamassassin.

[Bug 6668] DNSWL is lacking a rule to communicate excessive use to users

Posted by bu...@bugzilla.spamassassin.org.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6668

--- Comment #4 from Kevin A. McGrail <km...@pccc.com> 2011-10-03 18:24:58 UTC ---
(In reply to comment #3)
> (In reply to comment #1)
> > I would personally veto this immediately.  We are not an advertising service
> > for RBLs.
> 
> I find that statement kind of interesting, when shutting off network tests,
> many of which require payment over some threshold (often around 100,000 hits a
> day), makes SpamAssassin five times less accurate.  

IMO, ANY provider that gives FALSE positives under any circumstances should not
be configured to be enabled by default with SA.

I have zero problem with them stopping their replies and zero problems with
them charging for heavy usage.

> I'm not happy about it, but SA seems pretty dependent on things like RBLs
> which, under some circumstances, charge money.

RBLs have a good place in anti-spam work.  However, the concept that SA can be
deployed out of the box with zero config and work well is likely unattainable
due to the commercial realities of the world.

> According to Michael Scheidell, Spamhaus's (providers of ZEN, SBL, PBL, XBL,
> included in SA by default) policy of blocking queries results in "10 and 20 min
> delays in inbound email" - bug #6220.  You could call that DOSing email
> providers, instead of disabling spam filtration, both with the same goal of
> getting the provider to disable the relevant network tests.  Which is worse?

False positives are worse from an anti-Spam perspective.  

> Should the Spamhaus rules be removed from the default SA rule set because they
> will DOS email providers for querying them for over 100,000 emails per day?

I don't consider a delay a DoS.  It's not keeping the sendmail/spamc process
grinding for 10-20 minutes, is it?  It's just causing the mail to wait for a
second delivery attempt.  That's "normal" for email, as it is not a method of
IM.

The method to reduce the delay is simple: disable the RBL tests or pay for the
RBL provider's services, etc.

> SEM (bug #6220) is the only one I know of that affects scores.  And by a
> mechanism that seemed to have the approval of SpamAssassin folks.  Should that
> bug be closed, and the rules not included in SA by default, because of that
> mechanism?

IMO, the mechanism should be changed to point to a URL controlled by SA.

> I think it would be great if SpamAssassin, by default, didn't include any
> network rules that have limits on free use.  Although it would probably require
> more work to improve the accuracy, which I don't really see happening.

That's unrealistic as there are great services that have reasonable thresholds
for use. 

Regards,
KAM
