Posted to users@spamassassin.apache.org by up...@3.am on 2005/02/09 01:05:52 UTC

Spam with "BAYES_00"

(Running 3.0.2.) Nearly all spam that gets through is being tagged as
"BAYES_00" since I started using sbl_xbl at the SMTP level (before that,
a lot more was hitting).

I've been using the same corpus with daily manual additions of my own, and
also using 70_sare_bayes_poison_nxm.cf to prevent this kind of thing, but
it looks like the auto-learn has been learning some of the wrong stuff.

I also run sa-learn --force-expire every night via cron.

Ideas?  I'm wondering if training a lot more spam than ham could cause this
(I'm still well over the minimum amount of ham).

James Smallacombe		      PlantageNet, Inc. CEO and Janitor
up@3.am							    http://3.am
=========================================================================


Re: Spam with "BAYES_00"

Posted by Matt Kettler <mk...@evi-inc.com>.
At 07:05 PM 2/8/2005, up@3.am wrote:
>I've been using the same corpus with daily manual additions of my own, and
>also using 70_sare_bayes_poison_nxm.cf to prevent this kind of thing, but
>it looks like the auto-learn has been learning some of the wrong stuff.

Yeah, I'm not a big fan of SA's default ham learning threshold.

Having a positive number here has always struck me as an extraordinarily bad
idea. I use a threshold of -0.01 and have a bunch of "learning comp" rules
with scores of -0.01.

This way learning as ham only happens if you hit one of my comp rules, or 
one of SA's very few negative scoring rules, but the comp rules are too 
weak to be abused by spammers as a whitelist.
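
A minimal sketch of the setup described above, assuming it lives in
local.cf. The rule name LOCAL_LEARN_COMP_LIST and the List-Id pattern are
hypothetical placeholders, not rules from this thread; only the -0.01
threshold and the -0.01 score come from the description above.

```
# Only auto-learn as ham when the message scores below -0.01.
bayes_auto_learn_threshold_nonspam -0.01

# Hypothetical "learning comp" rule: matches mail from a list you trust.
# Replace the pattern with something that identifies your own known-good mail.
header   LOCAL_LEARN_COMP_LIST  List-Id =~ /<users\.example\.org>/i
describe LOCAL_LEARN_COMP_LIST  Hypothetical learning compensation rule
score    LOCAL_LEARN_COMP_LIST  -0.01
```

With a score this small the rule has almost no effect on whether a message
is tagged as spam; it only nudges otherwise zero-scoring known-good mail
under the ham learning threshold.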




>I also run sa-learn --force-expire every night via cron.
>
>Ideas?  I'm wondering if training a lot more spam than ham could cause this
>(I'm still well over the minimum amount of ham).

No, training more spam than ham should actually cause an increase in 
average bayes score, not a decrease.