Posted to users@spamassassin.apache.org by Michael Monnerie <mi...@it-management.at> on 2006/05/15 09:05:44 UTC

how to check your settings?

Dear list,

I have 
bayes_expiry_max_db_size            2000000
set in local.cf, but tonight I got this message:

bayes: synced databases from journal in 4 seconds: 1690 unique entries 
(3283 total entries)
expired old bayes database entries in 180 seconds
209328 entries kept, 1347841 deleted
token frequency: 1-occurrence tokens: 41.01%
token frequency: less than 8 occurrences: 26.51%

It should keep 2M tokens, but kept only 209k. Is there a way to see 
which settings SA actually uses? In postfix there's a "postconf -n", 
which is very practical, but
spamassassin -D --lint 2>&1|less 
doesn't show me that.

Kind regards, zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660/4156531                          .network.your.ideas.
// PGP Key:   "lynx -source http://zmi.at/zmi3.asc | gpg --import"
// Fingerprint: 44A3 C1EC B71E C71A B4C2  9AA6 C818 847C 55CB A4EE
// Keyserver: www.keyserver.net                 Key-ID: 0x55CBA4EE

Re: how to check your settings?

Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, May 15, 2006 at 07:29:41PM +0200, Michael Monnerie wrote:
> > If the previous expire looks similar to this expire run, Bayes will
> > use an estimate for atime to do the expire.  This speeds things up,
> > but also could possibly be wrong (it is an estimate after all).  So if
> > an estimate was used, up to 1557169-100000 tokens (100k is a safety
> > net) could be expired.
> 
> I still don't understand the reason for that. Wouldn't it be bad to 
> throw away that many tokens?

The reason for the estimate, or ...?  It may or may not be bad to throw
away that many tokens; it depends.  A larger number of tokens in the DB
doesn't necessarily mean better accuracy.

-- 
Randomly Generated Tagline:
 Amy: Worms? Ew, pukatronic! 

Re: how to check your settings?

Posted by Michael Monnerie <mi...@it-management.at>.
On Monday, 15 May 2006 17:01 Theo Van Dinter wrote:
> Since 209328+1347841 < 2000000, an expire wouldn't have been
> attempted opportunistically.  However, running "sa-learn
> --force-expire" would have made it happen.  

Yes, this was the cause. I have
bayes_auto_expire 0
set, and therefore run a nightly --force-expire job.

> 209328+1347841 = 1557169. 
>  Bayes will try to expire tokens down to 75% of the max number.  So
> 2000000*0.75 = 1500000, which means it saw that it should try to
> expire 1557169-1500000 = 57169 tokens.

That is what I would have expected.

> If the previous expire looks similar to this expire run, Bayes will
> use an estimate for atime to do the expire.  This speeds things up,
> but also could possibly be wrong (it is an estimate after all).  So if
> an estimate was used, up to 1557169-100000 tokens (100k is a safety
> net) could be expired.

I still don't understand the reason for that. Wouldn't it be bad to 
throw away that many tokens?

Kind regards, zmi

Re: how to check your settings?

Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, May 15, 2006 at 09:05:44AM +0200, Michael Monnerie wrote:
> bayes: synced databases from journal in 4 seconds: 1690 unique entries 
> (3283 total entries)
> expired old bayes database entries in 180 seconds
> 209328 entries kept, 1347841 deleted
> 
> It should keep 2M tokens, but kept only 209k. Is there a way to see 
> which settings SA actually uses? In postfix there's a "postconf -n", 
> which is very practical, but
> spamassassin -D --lint 2>&1|less 
> doesn't show me that.

No, there is no way to see that.  You can see which files are being read, and
assuming there are no conf errors, all of the options should be used.
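
As a stopgap, one could at least list the bayes_* lines that are written in
the config files.  The following is only a rough sketch: it does not ask
SpamAssassin what it actually uses, it just scans config text.  The function
name bayes_settings and the sample text are made up for illustration, the
comment stripping is naive (it ignores escaped "\#"), and "a later line
replaces an earlier one" is an assumption about how such options are applied.

```python
def bayes_settings(config_text):
    """Collect 'bayes_*' options from config text; a later line replaces an earlier one."""
    settings = {}
    for raw in config_text.splitlines():
        # Naive comment stripping: cuts at the first '#', even an escaped one.
        line = raw.split("#", 1)[0].strip()
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("bayes_"):
            settings[parts[0]] = parts[1].strip()
    return settings


# Hypothetical excerpt; in practice, read the files that the debug output
# reports as being loaded and pass their contents in the same order.
sample = """\
# hypothetical local.cf excerpt
bayes_expiry_max_db_size            2000000
bayes_auto_expire 0
"""

for name, value in sorted(bayes_settings(sample).items()):
    print(name, value)
```

This only confirms what is written down, which is still useful for catching
a typo in an option name or a value overridden in a later file.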

Unfortunately, the output doesn't actually give enough details to figure
out what happened, but here's a theory:

Since 209328+1347841 < 2000000, an expire wouldn't have been attempted
opportunistically.  However, running "sa-learn --force-expire" would
have made it happen.  209328+1347841 = 1557169.  Bayes will try to expire
tokens down to 75% of the max number.  So 2000000*0.75 = 1500000, which
means it saw that it should try to expire 1557169-1500000 = 57169 tokens.
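
The arithmetic above, spelled out as a small sketch.  This is not
SpamAssassin's code; the variable names are made up, and the 75% figure is
simply the one quoted in this mail.

```python
MAX_DB_SIZE = 2_000_000   # bayes_expiry_max_db_size from local.cf
KEEP_FRACTION = 0.75      # expire down to 75% of the max, as described above

kept, deleted = 209_328, 1_347_841          # numbers from the reported log
total_before = kept + deleted               # tokens in the DB before the expire

# An opportunistic expire is only attempted once the DB exceeds the max size.
opportunistic = total_before > MAX_DB_SIZE

target_size = int(MAX_DB_SIZE * KEEP_FRACTION)
expected_to_expire = total_before - target_size

print(total_before, opportunistic, target_size, expected_to_expire)
# prints: 1557169 False 1500000 57169
```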

If the previous expire looks similar to this expire run, Bayes will
use an estimate for atime to do the expire.  This speeds things up,
but also could possibly be wrong (it is an estimate after all).  So if
an estimate was used, up to 1557169-100000 tokens (100k is a safety net)
could be expired.
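
A quick check of that bound against the reported numbers, again only as a
sketch with made-up variable names.  It shows that the observed deletion
count fits under the limit described above; it does not prove that an
estimate was actually used.

```python
SAFETY_NET = 100_000                 # the 100k safety net mentioned above

kept, deleted = 209_328, 1_347_841   # numbers from the reported log
total_before = kept + deleted

# Worst case when an estimated atime is used: everything but the safety net.
max_expirable = total_before - SAFETY_NET

print(max_expirable)                 # prints: 1457169
print(deleted <= max_expirable)      # prints: True
print(kept >= SAFETY_NET)            # prints: True
```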

-- 
Randomly Generated Tagline:
"What the hell is this?  For crying out loud, somebody throw a pie!"
         - Peter Griffin on Family Guy