Posted to users@spamassassin.apache.org by "Jeremy M. Dolan" <jm...@pobox.com> on 2004/09/25 19:59:57 UTC
Bayes DB seemingly corrupted during v2 to v3 upgrade
Hi all. Hoping someone might be able to help me out here. Just
upgraded from 2.6x to 3.0.0 this morning, and, though I followed the
Bayes DB upgrade steps in the UPGRADE file to a T, my token names all
seem to be garbage now.
Here's a few lines of the output from "sa-learn --dump all":
0.560 21 3 1094789733 dc60473720
0.992 6 0 1090849205 20d2b3d689
0.958 1 0 1092129562 23c375c031
0.998 20 0 1095699812 cc75bc02df
That fifth field, which I remember as being the token name in 2.6x, is
ostensibly junk data now. I could be wrong, maybe it's supposed to now
look like that, but as the documentation says to check --dump output
and "make sure the data looks valid", that seems unlikely.
What happened? I don't see any similar reports in the list archives.
Where could I have gone wrong? I have backups of bayes_(toks|seen) if
you can suggest anything to try. I did run the --sync in 3.0.0 with
the -D flag though, and aside from the only slightly suspicious
debug: refresh: 22434 refresh /home/jmd/.spamassassin/bayes.lock
being printed about 150 times, everything seemed like your typical
debug mode output.
As I get a few hundred spams a day, I'm terrified of starting up
fetchmail again without SpamAssassin back and fully operational.
Help! :)
/jmd
PS: Great job on 3.0 folks--it looks great on paper/the web site, at
least. Hoping it will cut into the 2-3% of spam that was slipping by
2.6x. The new tests look promising.
--
Jeremy M. Dolan <ma...@pobox.com> <http://jmd.us/>
PGP: 1024D/3C68A1BA 9470 210C A476 FFBB 6D11 0223 0D1C ABFC 3C68 A1BA
Re: Bayes DB seemingly corrupted during v2 to v3 upgrade
Posted by Michael Parker <pa...@pobox.com>.
On Sat, Sep 25, 2004 at 12:59:57PM -0500, Jeremy M. Dolan wrote:
> Hi all. Hoping someone might be able to help me out here. Just
> upgraded from 2.6x to 3.0.0 this morning, and, though I followed the
> Bayes DB upgrade steps in the UPGRADE file to a T, my token names all
> seem to be garbage now.
>
> Here's a few lines of the output from "sa-learn --dump all":
>
> 0.560 21 3 1094789733 dc60473720
> 0.992 6 0 1090849205 20d2b3d689
> 0.958 1 0 1092129562 23c375c031
> 0.998 20 0 1095699812 cc75bc02df
>
We no longer store the raw token value in the database; instead, we
store a hashed value. There is a small blurb about this in UPGRADE.
The values in the dump are actually hex representations of the binary
values stored in the database.
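
As a rough illustration of why the fifth field now looks like that, here
is a minimal Python sketch. It is not SpamAssassin's code. The hash
function and truncation are assumptions (SHA-1, keeping the last 5 bytes),
chosen only because they produce 10 hex digits like the dump above; this
message does not say which hash is used. The names hashed_token and
looks_like_hashed_token are made up for the example.

```python
import hashlib


def hashed_token(token: str, keep_bytes: int = 5) -> str:
    """Hex form of a truncated digest of `token`.

    Assumption: SHA-1, keeping the last `keep_bytes` bytes. This only
    illustrates the idea of storing a hash instead of the raw token.
    """
    digest = hashlib.sha1(token.encode("utf-8")).digest()
    return digest[-keep_bytes:].hex()


def looks_like_hashed_token(field: str, keep_bytes: int = 5) -> bool:
    """True if `field` is exactly `keep_bytes` bytes written as hex."""
    if len(field) != 2 * keep_bytes:
        return False
    try:
        bytes.fromhex(field)
    except ValueError:
        return False
    return True


# The four lines quoted from "sa-learn --dump all" in this thread.
dump_lines = [
    "0.560 21 3 1094789733 dc60473720",
    "0.992 6 0 1090849205 20d2b3d689",
    "0.958 1 0 1092129562 23c375c031",
    "0.998 20 0 1095699812 cc75bc02df",
]

for line in dump_lines:
    token_field = line.split()[4]  # the fifth field of each dump line
    print(token_field, looks_like_hashed_token(token_field))

# A raw word goes in, a fixed-width hex string comes out.
print(hashed_token("example"))
```

Every fifth field in the quoted dump passes the check, which is what you
would expect from hex-encoded binary values rather than corrupted token
names. The hash is one-way, so the original words cannot be read back out
of the dump.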
So, relax, your database is fine.
Michael