Posted to users@spamassassin.apache.org by Bram Mertens <br...@sofico.be> on 2007/02/25 20:11:30 UTC
Bayes DB maintenance
Hi
As I wrote in my previous post, SA's effectiveness has dropped
dramatically over the past couple of days.
I read something about "overtraining" Bayes databases a while ago and was
wondering whether that could be the issue here.
How can I check the status of my bayes DB? The output of sa-learn --dump
magic doesn't mean much to me.
Are there routines for cleaning up a Bayes DB and, if so, how often should
they be run?
I ran sa-learn --force-expire today, but it appears to have made little
difference in the output of sa-learn --dump magic:
before:
0.000 0 3 0 non-token data: bayes db version
0.000 0 12234 0 non-token data: nspam
0.000 0 115904 0 non-token data: nham
0.000 0 179531 0 non-token data: ntokens
0.000 0 1169633453 0 non-token data: oldest atime
0.000 0 1172424260 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1172399818 0 non-token data: last expiry atime
0.000 0 2764800 0 non-token data: last expire atime delta
0.000 0 1438 0 non-token data: last expire reduction count
after:
0.000 0 3 0 non-token data: bayes db version
0.000 0 12234 0 non-token data: nspam
0.000 0 115905 0 non-token data: nham
0.000 0 178179 0 non-token data: ntokens
0.000 0 1169659662 0 non-token data: oldest atime
0.000 0 1172426770 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1172427375 0 non-token data: last expiry atime
0.000 0 2764800 0 non-token data: last expire atime delta
0.000 0 1386 0 non-token data: last expire reduction count
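To make the two dumps easier to compare, here is a minimal Python sketch that parses them and prints the differences. It assumes that the value of each "non-token data" line sits in the third column and that the atime fields are Unix epoch seconds. The helper names (parse_magic, as_utc) are made up for illustration and are not part of SpamAssassin.

```python
from datetime import datetime, timezone

# The two dumps from this post, with wrapped lines joined.
BEFORE = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 12234 0 non-token data: nspam
0.000 0 115904 0 non-token data: nham
0.000 0 179531 0 non-token data: ntokens
0.000 0 1169633453 0 non-token data: oldest atime
0.000 0 1172424260 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1172399818 0 non-token data: last expiry atime
0.000 0 2764800 0 non-token data: last expire atime delta
0.000 0 1438 0 non-token data: last expire reduction count
"""

AFTER = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 12234 0 non-token data: nspam
0.000 0 115905 0 non-token data: nham
0.000 0 178179 0 non-token data: ntokens
0.000 0 1169659662 0 non-token data: oldest atime
0.000 0 1172426770 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1172427375 0 non-token data: last expiry atime
0.000 0 2764800 0 non-token data: last expire atime delta
0.000 0 1386 0 non-token data: last expire reduction count
"""

PREFIX = "non-token data: "


def parse_magic(text):
    """Map each 'non-token data' label to its integer value.

    Assumption: the value is the third whitespace-separated column.
    """
    values = {}
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 5 or not parts[4].startswith(PREFIX):
            continue  # skip anything that is not a magic line
        values[parts[4][len(PREFIX):].strip()] = int(parts[2])
    return values


def as_utc(epoch):
    """Format epoch seconds as a UTC timestamp string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


before = parse_magic(BEFORE)
after = parse_magic(AFTER)

tokens_removed = before["ntokens"] - after["ntokens"]
delta_days = after["last expire atime delta"] / 86400  # seconds per day
ham_per_spam = after["nham"] / after["nspam"]

print("tokens removed between dumps:", tokens_removed)
print("expire atime delta in days:  ", delta_days)
print("ham messages per spam message: %.1f" % ham_per_spam)
for key in ("oldest atime", "newest atime", "last expiry atime"):
    print("%-18s %s -> %s" % (key, as_utc(before[key]), as_utc(after[key])))
```

To use it on a live system, the output of sa-learn --dump magic could be passed to parse_magic in place of the pasted strings.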
Would it make sense to clear out the Bayes DB (using sa-learn --clear) and
retrain with recent ham/spam?
Regards
Bram