Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2006/10/12 15:04:21 UTC

[Bug 4945] Add support for open-whois.org

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4945


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|Undefined                   |3.2.0




------- Additional Comments From jm@jmason.org  2006-10-12 06:04 -------
In recent mass-checks, these rules have been doing nicely:
http://ruleqa.spamassassin.org/?daterev=20061007-r453869-n&s_defcorpus=on&rule=&srcpath=sandbox%2Fjm%2F20_openwhois&s_zero=on&s_detail=checked+&g=Change

These are good:

MSECS    SPAM%                           HAM%                          S/O%   RANK  SCORE  NAME
0.00000  1.4394 3766 of 261633 messages  0.0000 0 messages             1.000  0.84  0.00   T_WHOIS_AITPRIV
0.00000  1.1310 2960 of 261633 messages  0.0064 5 of 62951 messages    0.994  0.81  0.00   T_WHOIS_PRIVPROT
0.00000  0.7407 1938 of 261633 messages  0.0032 3 of 62951 messages    0.996  0.78  0.00   T_DNS_FROM_OPENWHOIS
0.00000  0.2389 626 of 261633 messages   0.0000 0 messages             1.000  0.66  0.00   T_WHOIS_SECUREWHOIS
0.00000  0.1697 444 of 261633 messages   0.0016 1 of 62951 messages    0.991  0.62  0.00   T_WHOIS_WHOISGUARD
0.00000  0.1407 369 of 261633 messages   0.0000 0 messages             1.000  0.60  0.00   T_WHOIS_REGISTERFLY
0.00000  0.1173 307 of 261633 messages   0.0000 0 messages             1.000  0.58  0.00   T_WHOIS_NAMEKING

And these either don't hit enough spam or hit too much ham:

MSECS    SPAM%                           HAM%                          S/O%   RANK  SCORE  NAME
0.00000  0.1888 494 of 261633 messages   0.0683 43 of 62951 messages   0.734  0.57  0.00   T_WHOIS_DMNBYPROXY
0.00000  0.0520 137 of 261633 messages   0.0032 3 of 62951 messages    0.942  0.51  0.00   T_WHOIS_UNLISTED
0.00000  0.0084 22 of 261633 messages    0.0000 0 messages             1.000  0.45  0.00   T_WHOIS_PRIVACYPOST
0.00000  0.0031 9 of 261633 messages     0.0000 0 messages             1.000  0.45  0.00   T_WHOIS_GKGPROXY
0.00000  0.0023 7 of 261633 messages     0.0000 0 messages             1.000  0.44  0.00   T_WHOIS_WHOISPROT
0.00000  0.0019 5 of 261633 messages     0.0000 0 messages             1.000  0.44  0.00   T_WHOIS_SECINFOSERV
0.00000  0.0011 3 of 261633 messages     0.0000 0 messages             1.000  0.44  0.00   T_WHOIS_NOLDC
0.00000  0.0019 5 of 261633 messages     0.0016 1 of 62951 messages    0.546  0.44  0.00   T_WHOIS_MONIKER_PRIV
0.00000  0.0008 3 of 261633 messages     0.0000 0 messages             1.000  0.44  0.00   T_WHOIS_SPAMFREE
0.00000  0.0008 3 of 261633 messages     0.0000 0 messages             1.000  0.44  0.00   T_WHOIS_CONTACTPRIV
0.00000  0.0191 50 of 261633 messages    0.1048 66 of 62951 messages   0.154  0.38  0.00   T_WHOIS_MYPRIVREG
0.00000  0.1242 325 of 261633 messages   0.5449 344 of 62951 messages  0.186  0.31  0.00   T_WHOIS_NETSOLPR

I guess that's entirely to be expected. ;)
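
For reference, the S/O% column appears to be the share of a rule's hit rate that comes from spam, i.e. SPAM% / (SPAM% + HAM%). Below is a minimal Python sketch of that arithmetic under that assumption, using percentages copied from the tables above. The function name `s_o_ratio` is made up for illustration and is not part of SpamAssassin.

```python
def s_o_ratio(spam_pct, ham_pct):
    """Return spam_pct / (spam_pct + ham_pct), or None if the rule hit nothing.

    Assumption: this is how the S/O% column relates to SPAM% and HAM%.
    """
    total = spam_pct + ham_pct
    if total == 0:
        return None
    return spam_pct / total


# (name, SPAM%, HAM%) copied from the mass-check tables above
SAMPLES = [
    ("T_WHOIS_PRIVPROT", 1.1310, 0.0064),
    ("T_WHOIS_DMNBYPROXY", 0.1888, 0.0683),
    ("T_WHOIS_NETSOLPR", 0.1242, 0.5449),
]

for name, spam_pct, ham_pct in SAMPLES:
    print(f"{name}: {s_o_ratio(spam_pct, ham_pct):.3f}")
```

For these three rows the result agrees with the table to three decimals. Rows with only a handful of hits can differ in the last digit, since the percentages shown are already rounded.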

The high-scoring ones seem to correlate with specific spammers or spam
gangs; e.g. one of the rules overlaps noticeably with a specific URL format!

The next issue is one for us: we need to figure out how to publish these,
since we obviously can't promote all the rules.  Ideally, we could leave the
nightly rule-promotion code to do it.  However, right now it is not allowed to
auto-promote network rules, since they are too "heavyweight" in impact.

What I've done is just insert 'publish NAMEOFRULE' lines for the good ones
above.  That works, but will require occasional revisiting as hit rates change :(
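
As a rough illustration of picking the "good ones", here is a Python sketch that filters rules by spam hit rate and S/O and prints 'publish' lines. The thresholds (at least 0.1% of spam, S/O of at least 0.99) are assumptions, chosen only because they happen to reproduce the good/bad split listed above; they are not the criteria actually used, and this is not the nightly rule-promotion code. Rule names are taken as they appear in the mass-check output, and the names in the rules file itself may differ.

```python
# Assumed thresholds, picked to match the split in this comment.
MIN_SPAM_PCT = 0.1
MIN_S_O = 0.99

# (name, SPAM%, S/O) copied from the mass-check tables above
RULES = [
    ("T_WHOIS_AITPRIV", 1.4394, 1.000),
    ("T_WHOIS_PRIVPROT", 1.1310, 0.994),
    ("T_DNS_FROM_OPENWHOIS", 0.7407, 0.996),
    ("T_WHOIS_SECUREWHOIS", 0.2389, 1.000),
    ("T_WHOIS_WHOISGUARD", 0.1697, 0.991),
    ("T_WHOIS_REGISTERFLY", 0.1407, 1.000),
    ("T_WHOIS_NAMEKING", 0.1173, 1.000),
    ("T_WHOIS_DMNBYPROXY", 0.1888, 0.734),
    ("T_WHOIS_UNLISTED", 0.0520, 0.942),
    ("T_WHOIS_PRIVACYPOST", 0.0084, 1.000),
    ("T_WHOIS_GKGPROXY", 0.0031, 1.000),
    ("T_WHOIS_WHOISPROT", 0.0023, 1.000),
    ("T_WHOIS_SECINFOSERV", 0.0019, 1.000),
    ("T_WHOIS_NOLDC", 0.0011, 1.000),
    ("T_WHOIS_MONIKER_PRIV", 0.0019, 0.546),
    ("T_WHOIS_SPAMFREE", 0.0008, 1.000),
    ("T_WHOIS_CONTACTPRIV", 0.0008, 1.000),
    ("T_WHOIS_MYPRIVREG", 0.0191, 0.154),
    ("T_WHOIS_NETSOLPR", 0.1242, 0.186),
]


def publish_lines(rules, min_spam_pct=MIN_SPAM_PCT, min_s_o=MIN_S_O):
    """Return a 'publish NAME' line for each rule that clears both thresholds."""
    return [
        f"publish {name}"
        for name, spam_pct, s_o in rules
        if spam_pct >= min_spam_pct and s_o >= min_s_o
    ]


print("\n".join(publish_lines(RULES)))
```

Re-running something like this against fresh mass-check numbers would be one way to do the occasional revisiting by hand.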

Anyway, they're in now! Marking fixed...



