Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2005/07/07 19:55:39 UTC

Re: mass-check, reuse, scores and thoughts

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Peter Fritz writes:
> Results are better, feel BAYES_99 is a bit low, but a good start.  Time
> to double check for FN/FPs in my corpus.
> 
> # SUMMARY for threshold 5.0:
> # Correctly non-spam:    314  92.90%
> # Correctly spam:      14688  97.56%
> # False positives:        24  7.10%
> # False negatives:       368  2.44%
> # Average score for spam:  21.294    ham: 1.3
> # Average for false-pos:   7.230  false-neg: 3.0
> # TOTAL:               15394  100.00%
> score BAYES_00                       -2.599 # not mutable
> score BAYES_05                       -0.413 # not mutable
> score BAYES_40                       -1.096 # not mutable
> score BAYES_50                       0.001 # not mutable
> score BAYES_60                       0.372 # not mutable
> score BAYES_80                       2.087 # not mutable
> score BAYES_95                       2.063 # not mutable
> score BAYES_99                       1.886 # not mutable
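
For reference, the percentages in that summary are per class: the ham lines are fractions of all ham, and the spam lines are fractions of all spam. A quick sketch in Python that reproduces them from the raw counts (the function and key names are made up for illustration and are not part of mass-check):

```python
def summarize(ham_ok, spam_ok, false_pos, false_neg):
    """Recompute per-class percentages from raw message counts."""
    ham_total = ham_ok + false_pos      # every ham message, flagged or not
    spam_total = spam_ok + false_neg    # every spam message, caught or not
    return {
        "ham_correct_pct": round(100.0 * ham_ok / ham_total, 2),
        "false_pos_pct": round(100.0 * false_pos / ham_total, 2),
        "spam_correct_pct": round(100.0 * spam_ok / spam_total, 2),
        "false_neg_pct": round(100.0 * false_neg / spam_total, 2),
        "total": ham_total + spam_total,
    }


# Counts taken from the quoted summary for threshold 5.0.
stats = summarize(ham_ok=314, spam_ok=14688, false_pos=24, false_neg=368)
print(stats)
```

So the 7.10% false-positive figure is 24 out of 338 ham messages, not 24 out of the 15394 total.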

Quite often BAYES_99 fires on spam that's *really* spammy, i.e. mail that
already hits plenty of other rules.  In other words, many messages don't
need a high BAYES_99 score to be marked as spam, so the Perceptron has
little reason to assign one.
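
To illustrate, here is a minimal sketch in Python. The message scores and the 3.5 "locked" value are invented for the example; only the 5.0 threshold and the 1.886 BAYES_99 score come from the quoted output, and I'm assuming a message is marked as spam once its total reaches the threshold.

```python
THRESHOLD = 5.0
BAYES_99_SCORE = 1.886  # value from the quoted Perceptron output


def is_spam(other_rules_score, bayes_99_score=BAYES_99_SCORE):
    """Classify a message that hits BAYES_99 plus other rules."""
    return other_rules_score + bayes_99_score >= THRESHOLD


# Hypothetical spam messages that all hit BAYES_99; each number is the
# score contributed by every *other* rule the message hit.
other_scores = [21.3, 12.0, 7.5, 4.0, 2.5]

# Already over the threshold without any help from BAYES_99.
already_spam = [s for s in other_scores if s >= THRESHOLD]

# Below the threshold alone, but pushed over it by BAYES_99.
rescued = [s for s in other_scores if s < THRESHOLD and is_spam(s)]

# Still below the threshold even with BAYES_99 at 1.886.
still_missed = [s for s in other_scores if not is_spam(s)]

print(already_spam, rescued, still_missed)

# With a manually locked, higher score (3.5 is an arbitrary example),
# the remaining message is caught as well.
print(is_spam(2.5, bayes_99_score=3.5))
```

Most of the messages in this toy set are flagged regardless of what BAYES_99 scores, which is why an optimizer sees little pressure to raise it; a higher score only matters for the borderline mail.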

In my opinion, it may be worthwhile to lock the BAYES_99 scores to a
high value manually, rather than letting the Perceptron set them.
I've raised this as an issue at
http://bugzilla.spamassassin.org/show_bug.cgi?id=4467

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCzWybMJF5cimLx9ARAoC+AJ0ZX9ctruWyAw7wfT/e4HQch960ywCeOHfQ
xtt9gZ8L8qbaIIgrmEDilvc=
=uep9
-----END PGP SIGNATURE-----