Posted to dev@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/04/16 03:44:51 UTC
Obsolete rules?
Looking at the results of the LW_BIG_AND_RED corpus run I just did, I
find a goodly number of rules that seem to be backfiring under SA 2.63,
i.e. hitting ham at a higher rate than spam. I know that rules are being
reevaluated for 3.0, and just wanted to make sure people were aware of these.
Corpus includes 3 years of ham and 4 months of spam.
Because they stand out like a sore thumb, I checked into the TONER ham
hits -- they're all advertisements at the bottom of YahooGroups mailing
list emails.
Bob Menschel
Section 3 -- Frequencies Log
(First numeric frequencies, followed by percentage frequencies)
OVERALL SPAM HAM S/O RANK SCORE NAME
111528 90720 20808 0.813 0.00 0.00 (all messages)
...
16 8 8 0.187 0.01 0.85 EXCUSE_13
4 2 2 0.187 0.01 0.07 REPLY_TO_EMPTY
2 1 1 0.187 0.01 0.20 MSGID_THREESIXSIX
2 1 1 0.187 0.01 1.87 FAKE_HELO_YAHOO
2 1 1 0.187 0.01 2.90 VERB_UP_TO_OR_MORES
2 1 1 0.187 0.01 1.18 MIME_BOUND_DASH_DIGIT
369 156 213 0.144 0.00 2.00 BLANK_LINES_70_80
5 2 3 0.133 0.00 2.12 FIND_ANYTHING
171 64 107 0.121 0.00 1.64 BLANK_LINES_80_90
3 1 2 0.103 0.00 2.38 FAKED_HOTMAIL_DAV
7 2 5 0.084 0.00 2.90 FRIEND_AT_PUBLIC
508 129 379 0.072 0.00 1.60 FROM_NO_LOWER
339 86 253 0.072 0.00 1.12 EXTRA_MPART_TYPE
60 14 46 0.065 0.00 0.72 FROM_AND_TO_SAME
2320 87 2233 0.009 0.00 0.56 TONER
9 1 8 0.028 0.00 2.70 BUGGY_CGI
359 1 358 0.001 0.00 2.90 FAKE_HELO_BIGFOOT
10 0 10 0.000 0.00 1.65 IDENT_NOBODY
1 0 1 0.000 0.00 0.04 SUBJ_NOW_ONLY
2 0 2 0.000 0.00 1.17 FAKE_HELO_HOTMAIL
18 0 18 0.000 0.00 1.92 FAKE_HELO_AOL
18 0 18 0.000 0.00 2.19 NO_RDNS_DOTCOM_HELO
5 0 5 0.000 0.00 1.64 THE_FOLLOWING_FORM
2 0 2 0.000 0.00 2.80 FAKE_HELO_USA_NET
4 0 4 0.000 0.00 1.70 AOL_USERS_LINK
OVERALL% SPAM% HAM% S/O RANK SCORE NAME
111528 90720 20808 0.813 0.00 0.00 (all messages)
100.000 81.3428 18.6572 0.813 0.00 0.00 (all messages as %)
...
0.014 0.0088 0.0384 0.187 0.01 0.85 EXCUSE_13
0.004 0.0022 0.0096 0.187 0.01 0.07 REPLY_TO_EMPTY
0.002 0.0011 0.0048 0.187 0.01 0.20 MSGID_THREESIXSIX
0.002 0.0011 0.0048 0.187 0.01 1.87 FAKE_HELO_YAHOO
0.002 0.0011 0.0048 0.187 0.01 2.90 VERB_UP_TO_OR_MORES
0.002 0.0011 0.0048 0.187 0.01 1.18 MIME_BOUND_DASH_DIGIT
0.331 0.1720 1.0236 0.144 0.00 2.00 BLANK_LINES_70_80
0.004 0.0022 0.0144 0.133 0.00 2.12 FIND_ANYTHING
0.153 0.0705 0.5142 0.121 0.00 1.64 BLANK_LINES_80_90
0.003 0.0011 0.0096 0.103 0.00 2.38 FAKED_HOTMAIL_DAV
0.006 0.0022 0.0240 0.084 0.00 2.90 FRIEND_AT_PUBLIC
0.455 0.1422 1.8214 0.072 0.00 1.60 FROM_NO_LOWER
0.304 0.0948 1.2159 0.072 0.00 1.12 EXTRA_MPART_TYPE
0.054 0.0154 0.2211 0.065 0.00 0.72 FROM_AND_TO_SAME
2.080 0.0959 10.7314 0.009 0.00 0.56 TONER
0.008 0.0011 0.0384 0.028 0.00 2.70 BUGGY_CGI
0.322 0.0011 1.7205 0.001 0.00 2.90 FAKE_HELO_BIGFOOT
0.009 0.0000 0.0481 0.000 0.00 1.65 IDENT_NOBODY
0.001 0.0000 0.0048 0.000 0.00 0.04 SUBJ_NOW_ONLY
0.002 0.0000 0.0096 0.000 0.00 1.17 FAKE_HELO_HOTMAIL
0.016 0.0000 0.0865 0.000 0.00 1.92 FAKE_HELO_AOL
0.016 0.0000 0.0865 0.000 0.00 2.19 NO_RDNS_DOTCOM_HELO
0.004 0.0000 0.0240 0.000 0.00 1.64 THE_FOLLOWING_FORM
0.002 0.0000 0.0096 0.000 0.00 2.80 FAKE_HELO_USA_NET
0.004 0.0000 0.0192 0.000 0.00 1.70 AOL_USERS_LINK
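
For anyone checking the arithmetic: the S/O values above appear consistent with taking the spam hit rate divided by the sum of the spam and ham hit rates, using the corpus totals from the "(all messages)" row. The sketch below is only an illustration of that reading, not the code that produced the log; the function name spam_over_overall and the handful of rows picked are my own choices.

```python
# Illustrative only: recompute S/O from the raw hit counts in the table.
# Assumption: S/O = spam_rate / (spam_rate + ham_rate), where each rate is
# hits divided by the number of messages of that class in the corpus.

TOTAL_SPAM = 90720   # from the "(all messages)" row
TOTAL_HAM = 20808


def spam_over_overall(spam_hits, ham_hits, total_spam=TOTAL_SPAM, total_ham=TOTAL_HAM):
    """Return the S/O ratio for one rule, or None if the rule never hit."""
    spam_rate = spam_hits / total_spam
    ham_rate = ham_hits / total_ham
    if spam_rate + ham_rate == 0:
        return None
    return spam_rate / (spam_rate + ham_rate)


# (spam hits, ham hits) copied from the numeric table above
rows = {
    "EXCUSE_13": (8, 8),
    "BLANK_LINES_70_80": (156, 213),
    "TONER": (87, 2233),
    "FAKE_HELO_AOL": (0, 18),
}

for name, (spam_hits, ham_hits) in rows.items():
    print(f"{name:20s} {spam_over_overall(spam_hits, ham_hits):.3f}")
```

Under that assumption the four rows come out at 0.187, 0.144, 0.009 and 0.000, matching the S/O column. It also shows why a rule with equal spam and ham counts (EXCUSE_13) still lands well below 0.5: the ham corpus is much smaller, so the same number of hits is a much higher ham rate.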