Posted to users@spamassassin.apache.org by "Rogers, Zoë A." <zo...@dns.co.uk> on 2004/08/18 16:25:26 UTC
RE: Bayes database, expiry not working
Thanks for posting this script; however, it didn't work for me. I did the following:
Synced the journal (sa-learn --rebuild)
Checked the db version (sa-learn --dump magic | head -1); it should be 2, and it was
Ran db-to-text2.pl -o bayes_toks > bayes_toks.txt
It changed the atime of hundreds of tokens:
Resetting atime of key in the future:
<key>H*c:alternative</key><ts>411973</ts><th>56913</th><atime>1735776000</atime>
......
Resetting atime of key in the future:
<key>listing</key><ts>1278</ts><th>2979</th><atime>1735776000</atime>
.................
Resetting atime of key in the future:
<key>94120-7334</key><ts>4</ts><th>1</th><atime>1735776000</atime>
..................................................................
Resetting atime of key in the future:
<key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
However, after I ran db-to-text2.pl -i bayes_toks < bayes_toks.txt and then ran the auto-expire, expiry still didn't work.
Expiry output after atime adjustment:
debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 60000
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 4179196, final goal reduction size: 4079196
debug: bayes: First pass? Current: 1092831331, Last: 1092826832, atime: 736899888, count: 0, newdelta: 0, ratio: 0
debug: bayes: something fishy, calculating atime (first pass)
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.
debug: Syncing complete.
debug: bayes: 32574 untie-ing
debug: bayes: 32574 untie-ing db_toks
debug: bayes: 32574 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 32574 unlink /usr/local/share/spamassassin/run/bayes.lock
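The sizes in the debug output above can be reproduced with simple arithmetic. The following is a minimal Python sketch, not SpamAssassin's actual code: the function name expiry_goal is hypothetical, and the maximum database size of 80000 is only inferred from the "75% of max: 60000" line.

```python
def expiry_goal(token_count, max_db_size, min_keep=100_000):
    """Reproduce the keep size and reduction goal shown in the debug log.

    Hypothetical helper: it mirrors the numbers in the log, not the
    real SpamAssassin implementation.
    """
    keep = int(max_db_size * 0.75)      # "expiry check keep size, 75% of max"
    if keep < min_keep:                 # "keep size too small, resetting to 100,000 tokens"
        keep = min_keep
    return keep, token_count - keep     # "final goal reduction size"


# max_db_size=80000 is an assumption inferred from 60000 / 0.75.
keep, reduction = expiry_goal(token_count=4179196, max_db_size=80000)
print(keep, reduction)
```

With these inputs the goal is to remove all but 100,000 of the 4,179,196 tokens, which matches the log. The expiry is skipped at a later step, when no usable atime delta is found.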
So I ran the script again:
Resetting atime of key in the future:
<key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
Resetting atime of key in the future:
<key>UD:htm</key><ts>42786</ts><th>26947</th><atime>1735776000</atime>
....................................................
bayes_toks: 4179192 keys copied
bayes_toks: 146 future-keys reset
The fact that it was finding atimes in the future on the second pass means that the import of bayes_toks.txt must not be working properly, because I could find no future atimes when doing a grep on bayes_toks.txt. Expiry still didn't work.
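A check for future atimes in the text dump can also be scripted rather than done with grep. The following is a minimal Python sketch; it assumes that bayes_toks.txt uses the same <key>/<ts>/<th>/<atime> record layout as the script output quoted in this message, which is not confirmed here. The function name find_future_atimes and the second sample record are hypothetical.

```python
import re
import time

# Assumed record layout, copied from the script output quoted in this message.
RECORD = re.compile(
    r"<key>(.*?)</key><ts>(\d+)</ts><th>(\d+)</th><atime>(\d+)</atime>"
)


def find_future_atimes(lines, now=None):
    """Return (key, atime) pairs whose atime is later than `now` (epoch seconds)."""
    if now is None:
        now = int(time.time())
    hits = []
    for line in lines:
        match = RECORD.search(line)
        if match and int(match.group(4)) > now:
            hits.append((match.group(1), int(match.group(4))))
    return hits


sample = [
    "<key>listing</key><ts>1278</ts><th>2979</th><atime>1735776000</atime>",
    # Hypothetical record with a plausible 2004 atime, for contrast.
    "<key>example</key><ts>1</ts><th>1</th><atime>1092800000</atime>",
]
# 1092836039 is the "last expiry atime" from the dump, used here as "now".
for key, atime in find_future_atimes(sample, now=1092836039):
    print(key, atime)
```

To scan a real dump, pass an open file object instead of the sample list, since the function only needs an iterable of lines.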
Running sa-learn --dump magic showed that the newest atime is in 2025 and the oldest is in 2000, both of which are incorrect:
0.000 0 2 0 non-token data: bayes db version
0.000 0 632798 0 non-token data: nspam
0.000 0 619738 0 non-token data: nham
0.000 0 4179196 0 non-token data: ntokens
0.000 0 952965268 0 non-token data: oldest atime
0.000 0 1735776000 0 non-token data: newest atime
0.000 0 1092825211 0 non-token data: last journal sync atime
0.000 0 1092836039 0 non-token data: last expiry atime
0.000 0 736899888 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
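The atime values in the dump above are Unix epoch seconds, so they can be converted to dates to see which ones are implausible. The following is a minimal Python sketch; the helper names parse_magic and to_utc are hypothetical, and the parser assumes the column layout shown above, with the value in the third column.

```python
from datetime import datetime, timezone


def parse_magic(lines):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in lines:
        fields, _, label = line.partition("non-token data:")
        parts = fields.split()
        if label and len(parts) >= 3:
            values[label.strip()] = int(parts[2])
    return values


def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


dump = [
    "0.000 0 952965268 0 non-token data: oldest atime",
    "0.000 0 1735776000 0 non-token data: newest atime",
    "0.000 0 1092825211 0 non-token data: last journal sync atime",
]
for label, value in parse_magic(dump).items():
    print(label, to_utc(value).isoformat())
```

Only the journal sync time falls in 2004, when this message was posted. The oldest and newest atimes land in 2000 and 2025, as described above.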
Any suggestions? The import back into the database reports no errors, so I'm not sure what is going wrong here.
Cheers,
Zoe
________________________________
From: Martin Schröder [mailto:ms@artcom-gmbh.de]
Sent: Tue 17/08/2004 17:59
To: spamassassin-users@incubator.apache.org
Subject: Re: Bayes database
On 2004-08-17 18:28:48 +0200, Andy Spiegl wrote:
> You don't have to delete your bayes database!
>
> In April I had the same problem and I ended up extending and fixing the tool
> http://spamassassin.taint.org/devel/db-to-text.pl.txt
> and posting it to the mailing list. I asked that someone puts it on the
THANKS!
expired old Bayes database entries in 661 seconds
145343 entries kept, 455855 deleted
:-))
Best regards
Martin
--
Martin Schröder, ms@artcom-gmbh.de
ArtCom GmbH, Lise-Meitner-Str 5, 28359 Bremen, Germany
Voice +49 421 20419-44 / Fax +49 421 20419-10
http://www.artcom-gmbh.de