Posted to users@spamassassin.apache.org by Hamish Marson <ha...@travellingkiwi.com> on 2006/08/08 12:42:11 UTC

Bayes errors...

I keep getting the following from SpamAssassin (running under amavisd
debug-sa). Any ideas what I've done wrong this time?

The database is MySQL. SpamAssassin is 3.1.4 (it also did the same
with 3.1.3).

[12172] dbg: bayes: database connection established
[12172] dbg: bayes: found bayes db version 3
[12172] dbg: bayes: Using userid: 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations
(latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for
operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?7?k?'
for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations
(latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for
operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-??%m!'
for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations
(latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for
operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?4??%'
for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations
(latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for
operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-l???'
for key 1

When it's not giving the above, it gives

[12254] dbg: bayes: database connection established
[12254] dbg: bayes: found bayes db version 3
[12254] dbg: bayes: Using userid: 1
[12254] dbg: bayes: corpus size: nspam = 7492, nham = 100752
[12254] dbg: bayes: tok_get_all: token count: 198
[12254] dbg: bayes: tok_get_all: SQL error: Illegal mix of collations
for operation ' IN '
[12254] dbg: bayes: cannot use bayes on this message; none of the
tokens were found in the database
[12254] dbg: bayes: not scoring message, returning undef

TIA
  Hamish.
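
A note on reading the errors above: in MySQL's coercibility terms,
IMPLICIT is what a column value carries and COERCIBLE is what a string
literal carries, so the message suggests a latin1 column being compared
with a utf8 literal. The Python sketch below is only an illustration of
pulling the two operands out of such a (mail-wrapped) log line; the
function name parse_collation_error is made up for this example and is
not part of SpamAssassin or MySQL.

```python
import re

# Matches each "(collation,COERCIBILITY)" operand in a MySQL
# "Illegal mix of collations" message.
OPERAND = re.compile(r"\((\w+),(\w+)\)")


def parse_collation_error(message):
    """Return a list of (collation, coercibility) pairs found in message.

    Hypothetical helper for illustration only.
    """
    # The log lines are wrapped in the mail, so collapse whitespace first.
    flat = " ".join(message.split())
    return OPERAND.findall(flat)


log = """tok_get: SQL error: Illegal mix of collations
(latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for
operation '='"""

for collation, coercibility in parse_collation_error(log):
    # The character set is the part of the collation name before the
    # first underscore, e.g. "latin1" in "latin1_swedish_ci".
    charset = collation.split("_", 1)[0]
    print(charset, collation, coercibility)
```

The shorter ' IN ' variant of the error in the second log carries no
operand details, so the helper returns an empty list for it.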


Re: Bayes errors...

Posted by Nigel Frankcom <ni...@blue-canoe.net>.
On Tue, 08 Aug 2006 12:02:04 +0100, Hamish Marson
<ha...@travellingkiwi.com> wrote:

>Nigel Frankcom wrote:
>> I'm not sure what you've done there, I didn't realise it was
>> possible to mix collation types in the same table. Have you checked
>> that all tables are the same type? MyISAM or InnoDB? If they are all
>> the same, I'd be inclined to pull it down, rebuild from the SA
>> supplied SQL and retrain.
>>
>> Did you merge 2 databases at any point?
>>
>
>Nope... I think it broke when I updated to 3.1.3 (but it was
>previously running 3.1.0 fine, IIRC).
>
>Maybe I'll try a rebuild of the bayes tables...
>
>H

It might be worth trying the MySQL Admin Tool and seeing if it can
repair the tables.

http://dev.mysql.com/downloads/administrator/1.1.html

Nigel

Re: Bayes errors...

Posted by Hamish Marson <ha...@travellingkiwi.com>.
Nigel Frankcom wrote:
> I'm not sure what you've done there, I didn't realise it was
> possible to mix collation types in the same table. Have you checked
> that all tables are the same type? MyISAM or InnoDB? If they are all
> the same, I'd be inclined to pull it down, rebuild from the SA
> supplied SQL and retrain.
>
> Did you merge 2 databases at any point?
>

Nope... I think it broke when I updated to 3.1.3 (but it was
previously running 3.1.0 fine, IIRC).

Maybe I'll try a rebuild of the bayes tables...

H


Re: Re: Bayes errors...

Posted by Nigel Frankcom <ni...@blue-canoe.net>.
On Tue, 08 Aug 2006 12:08:52 +0100, Hamish Marson
<ha...@travellingkiwi.com> wrote:

>Nigel Frankcom wrote:
>> I'm not sure what you've done there, I didn't realise it was
>> possible to mix collation types in the same table. Have you checked
>> that all tables are the same type? MyISAM or InnoDB? If they are all
>> the same, I'd be inclined to pull it down, rebuild from the SA
>> supplied SQL and retrain.
>>
>
>The tables are all MyISAM, but the collation is latin1_swedish_ci
>for some reason, which seems strange to me.

latin1_swedish_ci is MySQL's default collation for the latin1 character
set (it is not specific to MyISAM) and should be fine (mine are set that
way here).


>What collation do others have? And what should it be? (I'm assuming
>the problem is SA using utf8 & the database using latin1_swedish_ci).
>I'm looking now to see if it's possible to change the collation on the
>fly.

You can probably change it on the fly, but you shouldn't have to; I've
checked 3 servers here and they are all latin1_swedish_ci/MyISAM.

>
>> Did you merge 2 databases at any point?
>>

Yes, if you merged it's possible things went awry along the way.
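
The engine and collation checks discussed above can be done with
read-only SHOW statements. The sketch below just prints them for pasting
into the mysql client; the table names are assumed from SpamAssassin's
stock MySQL Bayes schema and the helper name inspection_statements is
hypothetical, so adjust both to the local setup.

```python
# Table names assumed from SpamAssassin's stock MySQL Bayes schema;
# change them if the local schema differs.
BAYES_TABLES = [
    "bayes_expire",
    "bayes_global_vars",
    "bayes_seen",
    "bayes_token",
    "bayes_vars",
]


def inspection_statements(tables=BAYES_TABLES):
    """Build read-only MySQL statements that report charset and collation.

    Hypothetical helper for illustration; it only builds strings and
    never connects to a database.
    """
    stmts = [
        # Server and connection character sets and collations.
        "SHOW VARIABLES LIKE 'character_set%'",
        "SHOW VARIABLES LIKE 'collation%'",
        # Storage engine and default collation of every table.
        "SHOW TABLE STATUS",
    ]
    # Per-column collation for each Bayes table.
    stmts += ["SHOW FULL COLUMNS FROM `%s`" % t for t in tables]
    return stmts


for stmt in inspection_statements():
    print(stmt + ";")
```

Comparing the connection collation with the column collation of
bayes_token is the quickest way to see whether the two sides of the
failing '=' really differ.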

Re: Bayes errors...

Posted by Hamish Marson <ha...@travellingkiwi.com>.
Nigel Frankcom wrote:
> I'm not sure what you've done there, I didn't realise it was
> possible to mix collation types in the same table. Have you checked
> that all tables are the same type? MyISAM or InnoDB? If they are all
> the same, I'd be inclined to pull it down, rebuild from the SA
> supplied SQL and retrain.
>

The tables are all MyISAM, but the collation is latin1_swedish_ci
for some reason, which seems strange to me.

What collation do others have? And what should it be? (I'm assuming
the problem is SA using utf8 and the database using latin1_swedish_ci.)
I'm looking now to see if it's possible to change the collation on the
fly.


> Did you merge 2 databases at any point?
>
> Nigel
>


Re: Bayes errors...

Posted by Nigel Frankcom <ni...@blue-canoe.net>.
I'm not sure what you've done there; I didn't realise it was possible
to mix collation types in the same table. Have you checked that all
tables are the same type (MyISAM or InnoDB)? If they are all the same,
I'd be inclined to pull it down, rebuild from the SA-supplied SQL and
retrain.

Did you merge 2 databases at any point?

Nigel


RE: Bayes errors...

Posted by "Gary W. Smith" <ga...@primeexalia.com>.
This is because your database is in UTF8 format.  As a result SA cannot
read it (though it can write it).

Drop the database, recreate it and the tables in latin1, and it will
work just fine. You will have to retrain after that, though.
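
One possible reason the character set matters here: Bayes tokens in this
database version appear to be stored as short binary hashes rather than
readable words (assumed below to be the last 5 bytes of a SHA-1 digest),
which would also explain the '?' characters in the Duplicate entry
messages. The Python sketch below is an illustration under that
assumption, not SpamAssassin's code; the helper names are made up. It
shows that such byte strings are usually not valid UTF-8, while a
single-byte character set accepts any byte.

```python
import hashlib


def bayes_token(word):
    """Assumed token form: the last 5 bytes of the SHA-1 digest of word."""
    return hashlib.sha1(word.encode("utf-8")).digest()[-5:]


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hash a batch of made-up words and count how many tokens UTF-8 rejects.
words = ["word%d" % i for i in range(200)]
tokens = [bayes_token(w) for w in words]
bad = sum(1 for t in tokens if not is_valid_utf8(t))
print("%d of %d tokens are not valid UTF-8" % (bad, len(tokens)))

# Python's latin-1 codec maps every byte value to one character, so
# decoding arbitrary 5-byte tokens with it never fails.
assert all(len(t.decode("latin-1")) == 5 for t in tokens)
```

If that assumption holds, any setup that treats the token column or the
connection as utf8 is comparing raw hash bytes as if they were text,
which fits the errors in the original post.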
