Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/09/04 01:09:54 UTC
Re: shifting the midpoint between the average spam and average ham
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Joe Emenaker writes:
> Joe Flowers wrote:
>
> >> If your "spread" is good and it's just the threshold that needs
> >> adjusting, it would be trivial to make a rule that fires on every
> >> message and give it a score equal to the desired difference...
> >
> > Thanks Pierre. That may be what I have to do, if no one has a better idea.
>
> Actually, what this discussion has inspired me to do is to investigate
> the idea of having a script auto-adjust each user's spam_threshold.
>
> Currently, I've got a setup where users have two trash folders: one for
> spam, one for ham. Every hour, a cron job runs sa-learn on the contents
> of those folders. However, something *else* that it does is record
> each message to a "spamlog", which holds the SA spam score and whether
> the user felt that it was spam.
>
> Originally, I did it so that I could give users personalized values in a
> page which would look like this
> (http://fruitpie.blastpoint.com/~jemenake/spamreport.cgi). However,
> after reading this thread, I think I'm deciding that this isn't
> necessary. The user can just indicate what their desired level of
> false-positives or false-negatives is. Then my hourly script, after it
> runs sa-learn and updates the spamlog, could run some stats on the
> updated spamlog and figure out the best spam_threshold in order to
> achieve the user's desired FP or FN rate.
that sounds pretty cool. suggestion: get it to record what rules
hit and what those rules' scores were.
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFBOPnBQTcbUG5Y7woRAoQZAJ0VAJoCMTo6OgPI2cf+odoBzryOzgCgxU2t
P3fjukvVJf5EO/rCfn2Rn68=
=W71z
-----END PGP SIGNATURE-----
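The threshold auto-adjustment described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not show the spamlog format, so it is assumed here to be a list of (score, is_spam) pairs, and the function names error_rates and pick_threshold are hypothetical. A message is assumed to be flagged as spam when its score is at or above the threshold.

```python
def error_rates(spamlog, threshold):
    """Return (fp_rate, fn_rate) for a given threshold.

    spamlog is ASSUMED to be a list of (score, is_spam) pairs, where
    is_spam is the user's own verdict. The real log format is not
    shown in the thread.
    """
    ham = [s for s, is_spam in spamlog if not is_spam]
    spam = [s for s, is_spam in spamlog if is_spam]
    # A false positive is ham scoring at or above the threshold.
    fp = sum(1 for s in ham if s >= threshold)
    # A false negative is spam scoring below the threshold.
    fn = sum(1 for s in spam if s < threshold)
    fp_rate = fp / len(ham) if ham else 0.0
    fn_rate = fn / len(spam) if spam else 0.0
    return fp_rate, fn_rate


def pick_threshold(spamlog, max_fp_rate):
    """Return the lowest threshold whose FP rate is <= max_fp_rate.

    The lowest acceptable threshold is chosen because, for a fixed
    FP budget, it lets the least spam through.
    """
    if not spamlog:
        return None
    scores = sorted({s for s, _ in spamlog})
    # One extra candidate above every logged score flags nothing at all,
    # so the loop below always finds an answer.
    candidates = scores + [scores[-1] + 0.1]
    for t in candidates:
        fp_rate, _ = error_rates(spamlog, t)
        if fp_rate <= max_fp_rate:
            return t
    return None
```

The same loop could target a desired FN rate instead by scanning the candidates from highest to lowest and testing fn_rate. With only a handful of logged messages per user the chosen value will jump around, so some minimum log size before adjusting would probably be sensible.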
Re: shifting the midpoint between the average spam and average ham
Posted by Joe Emenaker <jo...@emenaker.com>.
Justin Mason wrote:
>that sounds pretty cool. suggestion: get it to record what rules
>hit and what those rules' scores were.
>
>
Actually, I'm already doing it for Bayes. When I turned off autolearning
and went solely with manual training of the Bayes db, I was interested
to see if Bayes alone was significantly more accurate than the entire
set of SA rules. If so, then I'd increase the weighting of Bayes
significantly.
- Joe
--
When freedom gives way to tyranny, it is not because tyranny comes
dressed as a wolf. Rather, it comes dressed as a shepherd,
pointing out other wolves. Go *read* the Patriot Act.
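
The Bayes-versus-full-ruleset comparison mentioned in the last message could be sketched along these lines. Everything about the data layout is an assumption: each log entry is taken to be a dict with a "score" (total SA score), a "bayes_prob" (the Bayes probability for that message) and "is_spam" (the user's verdict). Those field names, the accuracy and compare helpers, and the 0.5 Bayes cutoff are hypothetical and not taken from the thread.

```python
def accuracy(entries, predict):
    """Fraction of entries where predict(entry) matches the user's verdict."""
    if not entries:
        raise ValueError("no log entries to evaluate")
    correct = sum(1 for e in entries if predict(e) == e["is_spam"])
    return correct / len(entries)


def compare(entries, threshold=5.0, bayes_cutoff=0.5):
    """Return (full_ruleset_accuracy, bayes_only_accuracy).

    threshold is the score at which the full ruleset calls a message
    spam; bayes_cutoff is an ASSUMED probability above which Bayes
    alone would call it spam.
    """
    full = accuracy(entries, lambda e: e["score"] >= threshold)
    bayes = accuracy(entries, lambda e: e["bayes_prob"] >= bayes_cutoff)
    return full, bayes
```

If Bayes alone comes out clearly ahead on a reasonably large log, that would support raising the Bayes rule scores, as suggested above. Plain accuracy treats false positives and false negatives as equally bad, which may not match what a given user wants.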