Posted to users@spamassassin.apache.org by "C. Bensend" <be...@bennyvision.com> on 2006/03/25 20:55:32 UTC

Re: sa-learn --backup and --restore issue: duplicate key violations

>    After a number of these, it dies with:
>
> bayes: encountered too many errors (20) while parsing seen lines,
> reverting to empty database and exiting
>
> ERROR: Bayes restore returned an error, please re-run with -D for more
> information
>
>    .. which makes me sad.  So, my question - is there a way to
> fix this?  Or will I have to end up dumping my Bayes and starting
> over?  I really hope I don't have to do that, because my Bayes
> database is huge and really quite accurate.

   I haven't seen any responses to my question yet, so I did
some more experimenting.

   I ran the backup file through sort and uniq to drop the
duplicate lines, moved the version line back to the top, and ran
it through sa-learn --restore again.  This time, it completed
successfully:

[17678] dbg: bayes: parsed 522507 lines
[17678] dbg: bayes: created database with 117864 tokens based on 249654
spam messages and 155005 ham messages
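
   For anyone wanting to repeat the cleanup, here is a rough Python
sketch of the same idea.  It is not the sort/uniq pipeline described
above: it drops exact duplicate lines while keeping the original
order, so the version line never leaves the top.  It assumes the
backup is a plain text file with the version line first; the function
and file names are made up for illustration.  Like uniq, it only
removes lines that are byte-for-byte identical.

```python
def dedupe_backup_lines(lines):
    """Drop exact duplicate lines, keeping the first occurrence of each.

    Order is preserved, so whatever line was first (assumed here to be
    the version line) stays first.
    """
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


def dedupe_backup_file(src_path, dst_path):
    """Write a de-duplicated copy of src_path to dst_path.

    Both paths are hypothetical; point them at your own backup file
    and a scratch output file.
    """
    # surrogateescape lets arbitrary bytes round-trip unchanged.
    with open(src_path, "r", encoding="utf-8", errors="surrogateescape") as src:
        lines = src.readlines()
    with open(dst_path, "w", encoding="utf-8", errors="surrogateescape") as dst:
        dst.writelines(dedupe_backup_lines(lines))
```

   Something like dedupe_backup_file("backup.txt", "backup.dedup.txt")
would produce a file that could then be fed to sa-learn --restore.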

   So, is this an OK thing to have done?  Since the restore ran
without a single error, I'm guessing that changing the order of
the lines in the backup file (other than the version line)
doesn't hurt anything.  Is this correct?

   Also, any ideas how my Bayes database got duplicate tokens
in the first place?

Thanks,

Benny


-- 
"A computer lets you make more mistakes faster than any invention
in human history, with the possible exceptions of handguns and
tequila."                                          -- Found on usenet