
Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/09/04 01:09:54 UTC

Re: shifting the midpoint between the average spam and average ham

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Joe Emenaker writes:
> Joe Flowers wrote:
> 
> >> If your "spread" is good and it's just the threshold that needs 
> >> adjusting, it would be trivial to make a rule that fires on every 
> >> message and give it a score equal to the desired difference...
> >
> > Thanks Pierre. That may be what I have to do, if no one has a better idea.
> 
> Actually, what this discussion has inspired me to do is to investigate 
> the idea of having a script auto-adjust each user's spam_threshold.
> 
> Currently, I've got a setup where users have two trash folders: one for 
> spam, one for ham. Every hour, a cron job runs sa-learn on the contents 
> of those folders. It also records each message to a "spamlog", which
> holds the SA spam score and whether the user felt that it was spam.
> 
> Originally, I did it so that I could give users personalized values on a 
> page that would look like this 
> (http://fruitpie.blastpoint.com/~jemenake/spamreport.cgi). However, 
> after reading this thread, I don't think that's necessary. The user can 
> just indicate their desired level of false positives or false negatives. 
> Then my hourly script, after it runs sa-learn and updates the spamlog, 
> could run some stats on the updated spamlog and figure out the best 
> spam_threshold to achieve the user's desired FP or FN rate.
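
A rough sketch of the threshold-picking step described above, in Python. The spamlog format is not given in this thread, so the list of (score, user_said_spam) pairs below is an assumption, and error_rates and choose_threshold are hypothetical names, not part of SpamAssassin. It assumes a message counts as spam when its score is at or above the threshold, and it only handles a target FP rate; a target FN rate would work the same way in the other direction.

```python
def error_rates(entries, threshold):
    """Return (fp_rate, fn_rate) if messages scoring >= threshold are called spam.

    entries is a list of (score, user_said_spam) pairs (assumed spamlog format).
    """
    ham = [score for score, is_spam in entries if not is_spam]
    spam = [score for score, is_spam in entries if is_spam]
    false_pos = sum(1 for score in ham if score >= threshold)
    false_neg = sum(1 for score in spam if score < threshold)
    fp_rate = false_pos / len(ham) if ham else 0.0
    fn_rate = false_neg / len(spam) if spam else 0.0
    return fp_rate, fn_rate


def choose_threshold(entries, max_fp_rate):
    """Return the lowest observed score that keeps the FP rate within max_fp_rate."""
    candidates = sorted({score for score, _ in entries})
    if not candidates:
        raise ValueError("spamlog is empty")
    # Raising the threshold can only lower the FP rate, so the first
    # candidate that meets the target is also the one with the fewest FNs.
    for threshold in candidates:
        fp_rate, _ = error_rates(entries, threshold)
        if fp_rate <= max_fp_rate:
            return threshold
    # Even the highest observed score is a ham; go just above it.
    return candidates[-1] + 0.1


# Made-up log entries, for illustration only.
spamlog = [
    (1.0, False), (2.0, False), (4.5, False), (6.0, False),
    (3.0, True), (5.5, True), (8.0, True), (12.0, True),
]

if __name__ == "__main__":
    threshold = choose_threshold(spamlog, max_fp_rate=0.25)
    print(threshold, error_rates(spamlog, threshold))
```

The chosen value would then go into the user's prefs (required_hits in SA 2.x, required_score in 3.0). With only a handful of logged messages the result will jump around, so a minimum sample size before adjusting anything is probably wise.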

that sounds pretty cool.  suggestion: get it to record what rules
hit and what those rules' scores were.
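
One possible way to capture the rule names, sketched in Python: pull them out of the X-Spam-Status header that SA adds to each message. This assumes the default header layout ("score=" in 3.0, "hits=" in 2.x, then a comma-separated "tests=" list), and parse_spam_status is a hypothetical helper name. The header carries rule names only, so the per-rule scores would have to come from somewhere else, such as the report text or the score lines in the config.

```python
import re


def parse_spam_status(value):
    """Extract (score, [rule names]) from an X-Spam-Status header value.

    Assumes the default SpamAssassin layout; a customized header may differ.
    """
    # Long headers are folded across lines; collapse the whitespace first.
    flat = re.sub(r"\s+", " ", value.strip())

    score_match = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", flat)
    score = float(score_match.group(1)) if score_match else None

    tests_match = re.search(r"\btests=((?:\w+,\s*)*\w+)", flat)
    rules = []
    if tests_match:
        rules = [name.strip() for name in tests_match.group(1).split(",")]
        if rules == ["none"]:  # SA writes tests=none when nothing hit
            rules = []
    return score, rules


if __name__ == "__main__":
    header = ("Yes, score=15.2 required=5.0 tests=BAYES_99,HTML_MESSAGE,\n"
              "\tMIME_HTML_ONLY autolearn=no version=3.0.0")
    print(parse_spam_status(header))
```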

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBOPnBQTcbUG5Y7woRAoQZAJ0VAJoCMTo6OgPI2cf+odoBzryOzgCgxU2t
P3fjukvVJf5EO/rCfn2Rn68=
=W71z
-----END PGP SIGNATURE-----

Re: shifting the midpoint between the average spam and average ham

Posted by Joe Emenaker <jo...@emenaker.com>.

Justin Mason wrote:

>that sounds pretty cool.  suggestion: get it to record what rules
>hit and what those rules' scores were.
>  
>
Actually, I'm already doing it for Bayes. When I turned off autolearning 
and went solely with manual training of the Bayes db, I was interested 
to see whether Bayes alone was significantly more accurate than the 
entire set of SA rules. If so, I'd increase the weighting of Bayes 
significantly.

- Joe

-- 
When freedom gives way to tyranny, it is not because tyranny comes
dressed as a wolf. Rather, it comes dressed as a shepherd,
pointing out other wolves. Go *read* the Patriot Act.