Posted to users@spamassassin.apache.org by Chris Santerre <cs...@MerchantsOverseas.com> on 2004/09/29 16:55:55 UTC

Why such a low score?

What was the reason WS got such a low score in SA 3.0??? .5 is a joke! Hell
BigEvil was scored a 3 and no one complained, and it is the same data!! I
don't understand. Did the mass check not go well?

Chris Santerre 
System Admin and SARE Ninja
http://www.rulesemporium.com
http://www.surbl.org
'It is not the strongest of the species that survives,
not the most intelligent, but the one most responsive to change.'
Charles Darwin 

Re: Why such a low score?

Posted by Matt Kettler <mk...@evi-inc.com>.
At 10:55 AM 9/29/2004, Chris Santerre wrote:
>What was the reason WS got such a low score in SA 3.0??? .5 is a joke! Hell
>BigEvil was scored a 3 and no one complained, and it is the same data!! I
>don't understand. Did the mass check not go well?

Chris... Calm down a sec.

The score assigned by the GA does not indicate how well or poorly a rule 
performs.

In this case WS probably got a low score due to a large amount of overlap 
with the other URIBL rules.

Remember, the GA tunes the scores so the optimal amount of spam and ham end 
up in the right baskets. This means that rule scores aren't a function of 
just the rule, but a function of how that rule interacts with other rules. 


Re: Why such a low score?

Posted by Raymond Dijkxhoorn <ra...@prolocation.net>.
Chris,

> What was the reason WS got such a low score in SA 3.0??? .5 is a joke! Hell
> BigEvil was scored a 3 and no one complained, and it is the same data!! I
> don't understand. Did the mass check not go well?

We pointed this out several times: the mass-check found far too many 
FPs, so SA scored it lower. It's 'our own' problem; we have 
to get those FPs out. The scoring will be redone for SA 3.1, so let's try 
to do better there...

And yes, I am also disappointed with this very low score; personally, I 
have raised it in my local.cf.

Bye,
Raymond

Re: Why such a low score?

Posted by Matt Kettler <mk...@evi-inc.com>.
At 10:55 AM 9/29/2004, Chris Santerre wrote:
>What was the reason WS got such a low score in SA 3.0??? .5 is a joke! Hell
>BigEvil was scored a 3 and no one complained, and it is the same data!! I
>don't understand. Did the mass check not go well?

Upon closer inspection, the WS mass-check went pretty well, but WS had the 
greatest number of nonspam hits of all the SURBL lists. It also hit the 
most spam, but the OB list hit nearly as much spam, and almost no nonspam.

Since the GA treats FPs as 100 times worse than FNs, the GA is going to 
heavily bias the score of any overlapping spam hits toward the rule that has 
the fewest nonspam hits. I suspect that in the spam cases, most of the WS hits 
also hit either OB or SC, which have better FP ratios, and the scores 
assigned reflect this.
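
The effect described above can be seen in a toy model. The sketch below is a brute-force grid search, not SpamAssassin's GA, and the corpus, the rule names CLEAN and NOISY, the base scores and the threshold are all made up for illustration. The only figure taken from this message is the 100:1 weighting of FPs against FNs.

```python
# Toy illustration only: a brute-force grid search, NOT SpamAssassin's GA.
# The corpus is invented; CLEAN and NOISY are hypothetical rule names.

THRESHOLD = 5.0   # a message is flagged as spam at or above this score
FP_WEIGHT = 100   # one false positive costs as much as 100 false negatives

# (is_spam, hits_clean, hits_noisy, base_score_from_other_rules, message_count)
CORPUS = [
    (True,  True,  True,  2.0, 70),  # spam hit by both lists (the overlap)
    (True,  True,  False, 2.0, 3),   # spam hit only by the clean list
    (True,  False, True,  2.0, 5),   # spam hit only by the noisy list
    (False, False, True,  3.0, 1),   # ham hit by the noisy list
]


def cost(clean_score, noisy_score):
    """Weighted error count for one candidate pair of rule scores."""
    total = 0
    for is_spam, hits_clean, hits_noisy, base, count in CORPUS:
        score = base + clean_score * hits_clean + noisy_score * hits_noisy
        flagged = score >= THRESHOLD
        if is_spam and not flagged:
            total += count                # false negatives
        elif not is_spam and flagged:
            total += FP_WEIGHT * count    # false positives, weighted 100x
    return total


GRID = [i * 0.5 for i in range(9)]  # candidate scores 0.0, 0.5, ..., 4.0
costs = {(c, n): cost(c, n) for c in GRID for n in GRID}
best_cost = min(costs.values())
best_pairs = sorted(pair for pair, value in costs.items() if value == best_cost)

print("lowest cost:", best_cost)
print("CLEAN score range:", min(c for c, _ in best_pairs), "to", max(c for c, _ in best_pairs))
print("NOISY score range:", min(n for _, n in best_pairs), "to", max(n for _, n in best_pairs))
```

In this toy corpus every lowest-cost solution gives CLEAN a score of 3.0 or more and keeps NOISY below 2.0, even though NOISY hits slightly more spam. The overlap lets CLEAN carry the shared spam, and a single ham hit outweighs the handful of spams that only NOISY catches.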

Admittedly the amount of nonspam WS hit is small (0.4%), but that's over 6 
times more nonspam than OB did, and 100 times more than SC did.

Thus WS got a lowish score not for being a bad rule, but for not doing as 
well as its neighbors that catch the same spams.

From STATISTICS-set1.txt:
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  10.497  15.8904   0.0008    1.000   0.98    2.01  URIBL_AB_SURBL
  18.019  27.2741   0.0046    1.000   0.97    3.90  URIBL_SC_SURBL
  49.029  74.1861   0.0654    0.999   0.74    2.00  URIBL_OB_SURBL
  51.999  78.4712   0.4756    0.994   0.45    0.54  URIBL_WS_SURBL
   0.010   0.0146   0.0012    0.927   0.39    0.84  URIBL_PH_SURBL

From STATISTICS-set3.txt:
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
   7.022  14.4233   0.0061    1.000   0.95    4.26  URIBL_SC_SURBL
  30.471  62.5514   0.0632    0.999   0.74    3.21  URIBL_OB_SURBL
   2.950   6.0208   0.0385    0.994   0.73    0.42  URIBL_AB_SURBL
  33.807  68.9994   0.4494    0.994   0.47    1.46  URIBL_WS_SURBL
   0.019   0.0390   0.0008    0.981   0.44    2.00  URIBL_PH_SURBL
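
The "times more nonspam" comparison can be checked directly against the HAM% column of the two excerpts. A minimal sketch, with the values copied from the tables in this message; the dictionary layout and the function name `ham_ratio` are just illustrative choices.

```python
# HAM% values copied from the STATISTICS-set1.txt and STATISTICS-set3.txt
# excerpts quoted in this message.
HAM_PCT = {
    "set1": {"WS": 0.4756, "OB": 0.0654, "SC": 0.0046},
    "set3": {"WS": 0.4494, "OB": 0.0632, "SC": 0.0061},
}


def ham_ratio(score_set, other):
    """How many times more ham WS hit than `other` in the given score set."""
    row = HAM_PCT[score_set]
    return row["WS"] / row[other]


for score_set in HAM_PCT:
    for other in ("OB", "SC"):
        print(f"{score_set}: WS/{other} = {ham_ratio(score_set, other):.1f}x")
```

On these numbers WS hits roughly 7 times as much ham as OB in both sets, and roughly 100 times as much as SC in set1 (closer to 74 times in set3).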

grep SURBL 50_scores.cf:
score URIBL_AB_SURBL 0 2.007 0 0.417
score URIBL_OB_SURBL 0 1.996 0 3.213
score URIBL_PH_SURBL 0 0.839 0 2.000
score URIBL_SC_SURBL 0 3.897 0 4.263
score URIBL_WS_SURBL 0 0.539 0 1.462
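
For reading the grep output: as far as I understand the SpamAssassin `score` documentation, a line with four values gives one score per score set, in the order set 0 (no Bayes, no network tests), set 1 (network tests only), set 2 (Bayes only), set 3 (Bayes and network tests). That matches the tables above, where the set1 and set3 SCORE columns are the second and fourth values rounded. A small parsing sketch; `parse_scores` and `SET_LABELS` are hypothetical names, not part of SpamAssassin.

```python
# The five score lines from the grep output in this message.
SCORE_LINES = """\
score URIBL_AB_SURBL 0 2.007 0 0.417
score URIBL_OB_SURBL 0 1.996 0 3.213
score URIBL_PH_SURBL 0 0.839 0 2.000
score URIBL_SC_SURBL 0 3.897 0 4.263
score URIBL_WS_SURBL 0 0.539 0 1.462
"""

# Assumed meaning of the four positions (see the lead-in above).
SET_LABELS = (
    "set0: no Bayes, no net",
    "set1: no Bayes, net",
    "set2: Bayes, no net",
    "set3: Bayes, net",
)


def parse_scores(text):
    """Map rule name -> tuple of four floats, one per score set."""
    scores = {}
    for line in text.splitlines():
        parts = line.split()
        # Only handle the four-value form: "score NAME s0 s1 s2 s3"
        if len(parts) == 6 and parts[0] == "score":
            scores[parts[1]] = tuple(float(v) for v in parts[2:])
    return scores


scores = parse_scores(SCORE_LINES)
for label, value in zip(SET_LABELS, scores["URIBL_WS_SURBL"]):
    print(f"URIBL_WS_SURBL {label}: {value}")
```

The zeros in positions one and three are consistent with the SURBL rules being network tests, which are not run in the two score sets without network checks.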