Posted to users@spamassassin.apache.org by "Rogers, Zoë A." <zo...@dns.co.uk> on 2004/08/18 16:25:26 UTC

RE: Bayes database, expiry not working

Thanks for posting this script; unfortunately, it didn't work for me.  I did the following:
 
Synced the journal (sa-learn --rebuild)
Checked the db version (sa-learn --dump magic | head -1); it should be 2, and it was
Ran db-to-text2.pl -o bayes_toks > bayes_toks.txt

It changed the atime of hundreds of tokens:
 
Resetting atime of key in the future:
 <key>H*c:alternative</key><ts>411973</ts><th>56913</th><atime>1735776000</atime>
......
Resetting atime of key in the future:
 <key>listing</key><ts>1278</ts><th>2979</th><atime>1735776000</atime>
.................
Resetting atime of key in the future:
 <key>94120-7334</key><ts>4</ts><th>1</th><atime>1735776000</atime>
..................................................................
Resetting atime of key in the future:
 <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................

However, after I ran db-to-text2.pl -i bayes_toks < bayes_toks.txt and then ran the auto-expire, expiry still didn't work.
 
Expiry output after atime adjustment:
 
debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 60000
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 4179196, final goal reduction size: 4079196
debug: bayes: First pass?  Current: 1092831331, Last: 1092826832, atime: 736899888, count: 0, newdelta: 0, ratio: 0
debug: bayes: something fishy, calculating atime (first pass)
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.
debug: Syncing complete.
debug: bayes: 32574 untie-ing
debug: bayes: 32574 untie-ing db_toks
debug: bayes: 32574 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 32574 unlink /usr/local/share/spamassassin/run/bayes.lock
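
For reference, the reduction goal in the debug output above can be reproduced by hand. The following is only a sketch of the arithmetic visible in those log lines, not SpamAssassin's actual code; the function name goal_reduction and the constant MIN_KEEP are made up for illustration.

```python
# Sketch of the arithmetic shown in the debug lines; not SpamAssassin's own code.
# MIN_KEEP is the floor quoted in "resetting to 100,000 tokens".
MIN_KEEP = 100_000


def goal_reduction(token_count: int, keep_size: int, min_keep: int = MIN_KEEP) -> int:
    """Return how many tokens an expiry run would aim to remove."""
    # A keep size below the floor is raised to the floor, as the log reports.
    keep = max(keep_size, min_keep)
    # Nothing to remove if the database is already at or below the keep size.
    return max(token_count - keep, 0)


# Values taken from the log: 4179196 tokens, keep size 60000 (75% of max).
print(goal_reduction(4179196, 60000))
```

With the logged values this gives the same figure as the "final goal reduction size" line. The goal itself looks sane, so the failure appears to lie in the later atime delta calculation rather than in the size computation.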
 
So I ran it again:
 
Resetting atime of key in the future:
 <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
Resetting atime of key in the future:
 <key>UD:htm</key><ts>42786</ts><th>26947</th><atime>1735776000</atime>
....................................................
bayes_toks: 4179192 keys copied
bayes_toks: 146 future-keys reset
 
The fact that it found atimes in the future on the second pass suggests that the import of bayes_toks.txt is not working properly, because I could not find any future atimes when grepping bayes_toks.txt.  Expiry still didn't work.
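
A grep for a literal timestamp can miss future atimes that have a different value, so a numeric comparison is a more reliable check. The sketch below assumes that each token line in the text dump has the same <key>/<ts>/<th>/<atime> layout as the "Resetting atime" messages above; the real file layout may differ, and future_tokens is a hypothetical helper, not part of db-to-text2.pl.

```python
import re
import time

# Assumed line layout, copied from the "Resetting atime" messages in this post.
TOKEN_RE = re.compile(
    r"<key>(.*?)</key><ts>(\d+)</ts><th>(\d+)</th><atime>(\d+)</atime>"
)


def future_tokens(lines, now=None):
    """Return (key, atime) pairs whose atime is later than `now` (epoch seconds)."""
    now = time.time() if now is None else now
    hits = []
    for line in lines:
        match = TOKEN_RE.search(line)
        if match and int(match.group(4)) > now:
            hits.append((match.group(1), int(match.group(4))))
    return hits


sample = [
    " <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>",
    # Hypothetical token with a plausible atime, for contrast.
    " <key>example</key><ts>1</ts><th>1</th><atime>1092825211</atime>",
]
# `now` is pinned to the "last expiry atime" from the dump for repeatability.
print(future_tokens(sample, now=1092836039))
```

To check a real dump, pass an open file handle instead of the sample list, for example future_tokens(open("bayes_toks.txt", errors="replace")).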
 
Running sa-learn --dump magic showed that the newest atime is in 2025 and the oldest in 2000, both of which are incorrect:
 
0.000          0          2          0  non-token data: bayes db version
0.000          0     632798          0  non-token data: nspam
0.000          0     619738          0  non-token data: nham
0.000          0    4179196          0  non-token data: ntokens
0.000          0  952965268          0  non-token data: oldest atime
0.000          0 1735776000          0  non-token data: newest atime
0.000          0 1092825211          0  non-token data: last journal sync atime
0.000          0 1092836039          0  non-token data: last expiry atime
0.000          0  736899888          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
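
The atime values in the dump above are Unix epoch seconds, so they can be converted to calendar dates to confirm which ones are out of range. This is a minimal sketch using only the Python standard library; the dictionary simply repeats four values from the dump, and to_utc is an illustrative helper name.

```python
from datetime import datetime, timezone

# Epoch values copied from the "sa-learn --dump magic" output above.
magic = {
    "oldest atime": 952965268,
    "newest atime": 1735776000,
    "last journal sync atime": 1092825211,
    "last expiry atime": 1092836039,
}


def to_utc(epoch: int) -> str:
    """Format epoch seconds as a UTC timestamp string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


for label, value in magic.items():
    print(f"{label}: {to_utc(value)} UTC")
```

The newest atime falls on a round midnight boundary in January 2025, which looks more like a placeholder value than a real access time.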

Any suggestions?  No errors are reported after importing back into the database, so I'm not sure what is going wrong here.
 
Cheers,
Zoe



________________________________

From: Martin Schröder [mailto:ms@artcom-gmbh.de]
Sent: Tue 17/08/2004 17:59
To: spamassassin-users@incubator.apache.org
Subject: Re: Bayes database



On 2004-08-17 18:28:48 +0200, Andy Spiegl wrote:
> You don't have to delete your bayes database!
>
> In April I had the same problem and I ended up extending and fixing the tool
>  http://spamassassin.taint.org/devel/db-to-text.pl.txt
> and posting it to the mailing list.  I asked that someone puts it on the

THANKS!

expired old Bayes database entries in 661 seconds
145343 entries kept, 455855 deleted

:-))

Best regards
        Martin
--
               Martin Schröder, ms@artcom-gmbh.de
     ArtCom GmbH, Lise-Meitner-Str 5, 28359 Bremen, Germany
          Voice +49 421 20419-44 / Fax +49 421 20419-10
                    http://www.artcom-gmbh.de



