Posted to users@spamassassin.apache.org by "Chr. v. Stuckrad" <st...@mi.fu-berlin.de> on 2006/07/19 17:24:22 UTC
Re: Will bayes-db be 'skewed' by ... autolearning ham?
On Tue, 18 Jul 2006, Dirk Bonengel wrote:
> did you investigate auto-learning? This might let your system learn ham
> as well as spam. Works fine here (same situation - gateway server to a
> Lotus Notes system, no feedback loop possible)
Maybe I should change the autolearning thresholds
from the defaults? (I have never touched them so far.)
I just found *lots* of 'autolearn=ham' entries in my log,
and I cannot believe that so many are correct.
In the current log I see mail classified as:
  21805 ham
  11493 autolearned as ham (this seems suspiciously high?)
  85963 spam
  52977 autolearned as spam
So I fear the 'skew' in my database comes from autolearning
spammers' 'bayes-fodder' and not from 'skewed explicit learning'.
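
Counts like the ones above can be tallied from a log with a few lines of Python. This is only a rough sketch: it assumes each scanned message produces a log line containing an `autolearn=<status>` token, and the function name `count_autolearn` is made up for illustration. The rest of the log line format is not relied upon.

```python
import re
from collections import Counter

# Matches the autolearn=<status> token; the rest of the log line
# format is an assumption and is deliberately ignored here.
AUTOLEARN_RE = re.compile(r"autolearn=(\w+)")

def count_autolearn(lines):
    """Count autolearn statuses (e.g. ham, spam, no) across log lines."""
    counts = Counter()
    for line in lines:
        match = AUTOLEARN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

To use it on a real log, pass an open file object, for example `count_autolearn(open(path))` with `path` pointing at wherever the local setup writes its spamd log.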
What may make it even worse is that in-house mail (== ham) is
never learned, because it is never spam-checked (users complained
too much about the slowdown, so only mail from 'outside' goes through
the spam filter).
Stucki
--
Christoph von Stuckrad * * |nickname |<st...@mi.fu-berlin.de> \
Freie Universitaet Berlin |/_*|'stucki' |Tel(days):+49 30 838-5 57 78|
Mathematik & Informatik EDV |\ *|if online|Tel(else):+49 30 77 39 66 00|
Arnimallee 6 / 14195 Berlin * * |on IRCnet|Fax(alle):+49 30 838-75 454/
Re: Will bayes-db be 'skewed' by ... autolearning ham?
Posted by Paul Boven <p....@chello.nl>.
Hi all,
Loren Wilton wrote:
>> Maybe I should change the autolearning thresholds
>> from the defaults? (I have never touched them so far.)
>
> Yes. Set it to -0.1. If you have been doing a lot of autolearning
> without this you may have a moderately sick bayes db, and might want to
> consider starting over.
Seconded - otherwise spam that doesn't score any points gets autolearned
as ham. I have:
bayes_auto_learn_threshold_nonspam -0.1
So really only stuff that is whitelisted or hits ALL_TRUSTED (e.g.
outgoing mail) has any chance of being autolearned as ham.
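
To see why the sign of the threshold matters, here is a simplified Python sketch of the threshold comparison. It is an illustration, not SpamAssassin's actual code: the function name is hypothetical, and the real autolearner applies further conditions (it works on a score computed without the Bayes rules and has extra requirements before learning spam) that are left out here. The defaults of 0.1 and 12.0 are the documented defaults for the two threshold options.

```python
def autolearn_decision(score, nonspam_threshold=0.1, spam_threshold=12.0):
    """Simplified threshold check; omits SpamAssassin's extra conditions.

    Returns "ham", "spam", or "no" (not autolearned).
    """
    if score < nonspam_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return "no"

# A spam that hits no rules at all scores 0.0.
print(autolearn_decision(0.0))                          # default threshold 0.1
print(autolearn_decision(0.0, nonspam_threshold=-0.1))  # threshold from this thread
```

With the default, a message scoring 0.0 falls below 0.1 and is learned as ham. With -0.1, only a message pushed below zero by a negative-scoring rule qualifies.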
Regards, Paul Boven.
Re: Will bayes-db be 'skewed' by ... autolearning ham?
Posted by Loren Wilton <lw...@earthlink.net>.
> Maybe I should change the autolearning thresholds
> from the defaults? (I have never touched them so far.)
Yes. Set it to -0.1. If you have been doing a lot of autolearning without
this you may have a moderately sick bayes db, and might want to consider
starting over.
Loren