Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2006/10/12 13:21:31 UTC

[Bug 4628] remove some RFC-I rules

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4628


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |major
            Summary|rfc-ignorant.org blacklists |remove some RFC-I rules
                   |are overzealous             |
   Target Milestone|Undefined                   |3.2.0




------- Additional Comments From jm@jmason.org  2006-10-12 04:21 -------
some up-to-date data from the ruleqa system:
http://ruleqa.spamassassin.org/?daterev=20061007-r453869-n&s_defcorpus=on&rule=%2FRFC&srcpath=&s_zero=on&s_detail=+&g=Change

  MSECS    SPAM%     HAM%    S/O   RANK  SCORE  NAME
0.00000   3.7247   0.0540  0.986   0.85   2.60  DNS_FROM_RFC_DSN
0.00000   2.2447   0.1700  0.930   0.73   1.94  DNS_FROM_RFC_BOGUSMX
0.00000  15.1533   4.6068  0.767   0.51   1.45  DNS_FROM_RFC_POST
0.00000  18.6219   8.6003  0.684   0.49   1.71  DNS_FROM_RFC_ABUSE
0.00000   6.4258   4.0476  0.614   0.48   0.20  DNS_FROM_RFC_WHOIS

measured on a 260k spam, 63k ham corpus with 7 contributors.

I think we should keep DNS_FROM_RFC_DSN and DNS_FROM_RFC_BOGUSMX (98.6% and 93%
accurate respectively), but drop the other 3 rules.
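
For reference, the accuracy figures quoted here can be re-derived from the table. This is only a sketch: it assumes the second and third columns are the spam and ham hit rates in percent, and that the fourth column is spam% / (spam% + ham%). The `RATES` dict and the `s_o` helper are names made up for this illustration and are not part of the ruleqa code.

```python
# Hit rates copied from the ruleqa output above:
# rule name -> (percent of spam hit, percent of ham hit).
RATES = {
    "DNS_FROM_RFC_DSN": (3.7247, 0.0540),
    "DNS_FROM_RFC_BOGUSMX": (2.2447, 0.1700),
    "DNS_FROM_RFC_POST": (15.1533, 4.6068),
    "DNS_FROM_RFC_ABUSE": (18.6219, 8.6003),
    "DNS_FROM_RFC_WHOIS": (6.4258, 4.0476),
}


def s_o(spam_pct, ham_pct):
    """Assumed definition: the fraction of a rule's combined hit rate that is spam."""
    return spam_pct / (spam_pct + ham_pct)


if __name__ == "__main__":
    for name, (spam_pct, ham_pct) in RATES.items():
        # Print each rule with its ratio rounded to three places, as in the table.
        print(f"{name:22s} {s_o(spam_pct, ham_pct):.3f}")
```

Under that assumption the ratios match the fourth column of the table for all five rules, which is where the 98.6% and 93% figures come from.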

The other three have a very high ham hit-rate, but are still assigned relatively
high scores by the perceptron; I'd prefer if the perceptron didn't have the
option of being misled by them at all. As this bug and the mailing-list traffic
attest, they're a PITA to support.

2. an alternative would be a way to hint to the perceptron that these are
untrustworthy rules, "tflags weak" or whatever.  but that would require
additional code and I think we're better off dropping the rules.

3. actually, another alternative: make them into meta subrules, so that future
rules can use them in metas. that may be useful.  They're "free" anyway, since
DNS_FROM_RFC_* are all from one DNS lookup.


