Posted to users@spamassassin.apache.org by Steve Martin <st...@planomartins.com> on 2005/08/16 05:23:58 UTC

bayes expiration problems?

I noticed an email took over 300 seconds to process. The reason was
apparently opportunistic bayes expiry taking too long, as it ended up
aborting processing.

So, I tried sa-learn --force-expire -D and saw this in the output...

[1364] dbg: bayes: token count: 156544, final goal reduction size: 44044
[1364] dbg: bayes: first pass?  current: 1124161947, Last: 1124117250, atime: 2764800, count: 2658, newdelta: 166852, ratio: 16.5703536493604, period: 43200
[1364] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
[1364] dbg: bayes: expiry max exponent: 9
[1364] dbg: bayes: atime_token reduction
[1364] dbg: bayes: ========_===============
[1364] dbg: bayes: 43200_150537
[1364] dbg: bayes: 86400_145179
[1364] dbg: bayes: 172800_135718
[1364] dbg: bayes: 345600_125439
[1364] dbg: bayes: 691200_103991
[1364] dbg: bayes: 1382400_63770
[1364] dbg: bayes: 2764800_1714
[1364] dbg: bayes: 5529600_0
[1364] dbg: bayes: 11059200_0
[1364] dbg: bayes: 22118400_0
[1364] dbg: bayes: first pass decided on 2764800 for atime delta
[1364] dbg: locker: refresh_lock: refresh /etc/mail/spamassassin/bayes.lock

... lots of those...

[1364] dbg: bayes: untie-ing
[1364] dbg: bayes: untie-ing db_toks
[1364] dbg: bayes: untie-ing db_seen
[1364] dbg: bayes: files locked, now unlocking lock
[1364] dbg: locker: safe_unlock: unlink /etc/mail/spamassassin/bayes.lock
expired old bayes database entries in 180 seconds
154830 entries kept, 1714 deleted
token frequency: 1-occurence tokens: 62.73%
token frequency: less than 8 occurrences: 20.42%
[1364] dbg: bayes: expiry completed


I ran it again, just to see and got this...


[1381] dbg: bayes: expiry check keep size, 0.75 * max: 112500
[1381] dbg: bayes: token count: 154830, final goal reduction size: 42330
[1381] dbg: bayes: first pass?  current: 1124162215, Last: 1124162127, atime: 2764800, count: 1714, newdelta: 111950, ratio: 24.6966161026838, period: 43200
[1381] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
[1381] dbg: bayes: expiry max exponent: 9
[1381] dbg: bayes: atime_token reduction
[1381] dbg: bayes: ========_===============
[1381] dbg: bayes: 43200_148823
[1381] dbg: bayes: 86400_143465
[1381] dbg: bayes: 172800_134004
[1381] dbg: bayes: 345600_123725
[1381] dbg: bayes: 691200_102277
[1381] dbg: bayes: 1382400_62056
[1381] dbg: bayes: 2764800_0
[1381] dbg: bayes: 5529600_0
[1381] dbg: bayes: 11059200_0
[1381] dbg: bayes: 22118400_0
[1381] dbg: bayes: couldn't find a good delta atime, need more token difference, skipping expire
[1381] dbg: bayes: expiry completed
[1381] dbg: bayes: untie-ing
[1381] dbg: bayes: untie-ing db_toks
[1381] dbg: bayes: untie-ing db_seen
[1381] dbg: bayes: files locked, now unlocking lock
[1381] dbg: locker: safe_unlock: unlink /etc/mail/spamassassin/bayes.lock


So, the first time it only got rid of about 2000 tokens and is stuck?


[1364] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)

How can I figure out  what went wrong here?

[1381] dbg: bayes: couldn't find a good delta atime, need more token difference, skipping expire

and why did that happen on the second pass....


--
Steve Martin                              http://www.cheezmo.com/
Smart Calibration, LLC           http://www.smartcalibration.com/
The Widescreen Movie Center            http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html


Re: bayes expiration problems?

Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Aug 15, 2005 at 10:23:58PM -0500, Steve Martin wrote:
> 154830 entries kept, 1714 deleted

Ok.

> [1381] dbg: bayes: token count: 154830, final goal reduction size: 42330
> [1381] dbg: bayes: 1382400_62056
> 
> So, the first time it only got rid of about 2000 tokens and is stuck?

Yup.

> [1364] dbg: bayes: can't use estimation method for expiry, unexpected  
> result, calculating optimal atime delta (first pass)
> 
> How can I figure out  what went wrong here?

It's in the sa-learn docs.  Basically your last expiry is too different from
what it's trying to do now, so it can't estimate new values based on the old
values.
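
To make that concrete, here is a rough Python sketch of the estimation check,
fed with the numbers from the "first pass?" debug lines. It follows the expiry
logic as described in the sa-learn docs and is not SpamAssassin's actual code:
the function name, the constant names, and reading "too different" as a ratio
above 1.5 are assumptions made for illustration.

```python
EXPIRY_PERIOD = 43200      # 12 hours; the "period" value in the debug line
MIN_REDUCTION = 1000       # assumed: smaller expiries are not worth the effort
MAX_RATIO = 1.5            # assumed reading of "more than 50% different"
MAX_AGE = 30 * 86400       # assumed: last expiry must be under 30 days old


def estimate_atime_delta(now, last_expire, last_atime_delta,
                         last_reduction, goal_reduction):
    """Return (new_delta, ratio, usable) for the estimation pass."""
    if last_reduction == 0 or goal_reduction == 0:
        return 0, float("inf"), False
    # Scale the previous atime delta by last reduction / current goal.
    new_delta = last_atime_delta * last_reduction // goal_reduction
    # Larger count over smaller count, so the ratio is always >= 1.
    ratio = (max(last_reduction, goal_reduction)
             / min(last_reduction, goal_reduction))
    usable = (now - last_expire <= MAX_AGE
              and last_atime_delta >= EXPIRY_PERIOD
              and last_reduction >= MIN_REDUCTION
              and new_delta >= EXPIRY_PERIOD
              and ratio <= MAX_RATIO)
    return new_delta, ratio, usable


# Values from the first run: current, Last, atime, count, goal reduction.
print(estimate_atime_delta(1124161947, 1124117250, 2764800, 2658, 44044))
# Values from the second run.
print(estimate_atime_delta(1124162215, 1124162127, 2764800, 1714, 42330))
```

In both runs the ratio (about 16.6 and 24.7) is far above 1.5, so the estimate
is rejected and the full first pass over the database runs instead.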

> and why did that happen on the second pass....

Per the above, expiry wants to get rid of 42330 tokens, but the first
(smallest value > 0) atime difference is 62056 tokens, which means too many
would be removed, so it can't expire.
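
A small Python sketch of that selection step, using the two reduction tables
from the debug output. It is an illustration of the logic as the sa-learn docs
describe it, not the real implementation; the function names, the 1000-token
minimum, the 100000-token floor, and the max size of 150000 (inferred from
"0.75 * max: 112500") are assumptions.

```python
def goal_reduction(token_count, max_db_size=150000):
    """Tokens to remove: count minus the larger of 75% of max or 100000."""
    keep = max(int(0.75 * max_db_size), 100000)
    return token_count - keep


def choose_atime_delta(reductions, goal, min_reduction=1000):
    """Pick the atime delta that expires the most tokens without exceeding goal.

    reductions is a list of (atime_delta_seconds, tokens_that_would_expire).
    Returns the chosen delta, or None if the expiry has to be skipped.
    """
    nonzero = [(delta, count) for delta, count in reductions if count > 0]
    if not nonzero:
        return None                      # nothing would be removed at all
    # Only deltas whose reduction stays within the goal are acceptable.
    candidates = [(count, delta) for delta, count in nonzero if count <= goal]
    if not candidates:
        return None                      # even the smallest reduction is too big
    count, delta = max(candidates)
    if count < min_reduction:
        return None                      # not worth the effort
    return delta


deltas = [43200 * 2 ** n for n in range(10)]
first_run = list(zip(deltas, [150537, 145179, 135718, 125439, 103991,
                              63770, 1714, 0, 0, 0]))
second_run = list(zip(deltas, [148823, 143465, 134004, 123725, 102277,
                               62056, 0, 0, 0, 0]))

print(choose_atime_delta(first_run, goal_reduction(156544)))   # 2764800
print(choose_atime_delta(second_run, goal_reduction(154830)))  # None
```

In the second run every non-zero row is above the goal of 42330, so there is
no acceptable delta and the expiry is skipped.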

-- 
Randomly Generated Tagline:
"You can't run sausage backwards through a meat grinder and end up with
 a whole pig."
 - Tim Peoples talking about the irreversibility of UNIX password encoding