Posted to dev@spamassassin.apache.org by Ivan Pantovic <iv...@yu.net> on 2007/06/28 16:35:04 UTC
sa-learn --backup and bayes tokens depend on mysql version?
Hi all, I tried to work this one out myself, but it is taking much more
time than I thought it would.
The problem is:
I have a setup with a couple of mail machines running SA, all querying the
same MySQL database for Bayes.
Just recently I noticed the following: some of the newly installed SA
instances produce different Bayes scores than the older ones.
I did some debugging, and my conclusion is that although both machines are
indeed looking at the same database, they get different results depending
on which MySQL version is installed. How is this possible?
The working combination is SA 3.1.7, MySQL 4.1.14, and Perl 5.8.8; the one
that does not work properly has MySQL 4.1.22.
The Perl modules are the same.
Their configurations are the same, and even some of the results are the
same. Then I noticed this:
I looked at what the sa-learn --backup output looks like on the working
machine, and it dumps something like this:
v 3 db_version # this must be the first line!!!
v 3171 num_spam
v 1740 num_nonspam
t 10 0 1182854858 0000c354f7
t 0 1 1182599601 0002c9c814
t 1 0 1182847851 0003961cfc
t 1 0 1163020787 000413ba19
t 0 1 1182863710 00046b538f
t 0 1 1182864981 00048eb213
t 1 0 1182792337 00052e9096
But the non-working sa-learn, which sees the same database, dumps it like
this:
v 3 db_version # this must be the first line!!!
v 3171 num_spam
v 1740 num_nonspam
t 10 0 1182854858 0000c38354c3b7
t 0 1 1182599601 0002c389c38814
t 1 0 1182847851 0003e280931cc3bc
t 1 0 1163020787 000413c2ba19
t 0 1 1182863710 00046b53c28f
t 0 1 1182864981 0004c5bdc2b213
t 1 0 1182792337 00052ec290e28093
It is obviously the same database: the message counts are the same, and even
the atime values are there. But what is this extra information I am getting
when dumping the Bayes data from the machine where Bayes is not working
properly?
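One pattern in the two dumps above can be checked mechanically: each token
in the second dump looks like the matching token in the first after its raw
bytes were read as MySQL latin1 text (cp1252, with the five bytes that cp1252
leaves undefined mapped to the same code point) and re-encoded as UTF-8. The
Python sketch below is only an illustration of that observation on the seven
token pairs quoted here. The function name is made up for this example, and
the sketch says nothing about where in the MySQL or DBI stack such a
conversion would happen.

```python
# Hypothetical helper: re-encode raw token bytes as if they were
# latin1/cp1252 text being converted to UTF-8. This is an assumption made
# to explain the dump, not code taken from SpamAssassin or MySQL.
def latin1_to_utf8_hex(token_hex: str) -> str:
    raw = bytes.fromhex(token_hex)
    chars = []
    for b in raw:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # 0x81, 0x8d, 0x8f, 0x90, 0x9d are undefined in cp1252;
            # fall back to the code point with the same value.
            chars.append(chr(b))
    return "".join(chars).encode("utf-8").hex()


# (token from the working dump, token from the non-working dump)
PAIRS = [
    ("0000c354f7", "0000c38354c3b7"),
    ("0002c9c814", "0002c389c38814"),
    ("0003961cfc", "0003e280931cc3bc"),
    ("000413ba19", "000413c2ba19"),
    ("00046b538f", "00046b53c28f"),
    ("00048eb213", "0004c5bdc2b213"),
    ("00052e9096", "00052ec290e28093"),
]

for good, bad in PAIRS:
    converted = latin1_to_utf8_hex(good)
    status = "match" if converted == bad else "differs"
    print(f"{good} -> {converted} ({status})")
```

If the same pattern holds for the rest of the dump, the character set used
for the database connection on each machine may be worth comparing.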
I should add that some Bayes tokens do get hit when checking a message, but
far fewer than I should get.
Working Bayes debug:
[11508] dbg: bayes: tok_get_all: token count: 307
[11508] dbg: bayes: token 'H*r:mail.yu.net' => 0.999939851581825
[11508] dbg: bayes: token 'H*r:ip*194.247.192.231' => 0.99958038147139
[11508] dbg: bayes: token 'H*r:8.13.6' => 0.997298245614035
[11508] dbg: bayes: token 'Vam' => 0.995425742574258
[11508] dbg: bayes: token 'HX-Library:Indy' => 0.994923076923077
[11508] dbg: bayes: token 'HX-Library:8.0.25' => 0.994296296296296
[11508] dbg: bayes: token 'H*p:D*gmail.com' => 0.994296296296296
[11508] dbg: bayes: token 'nudimo' => 0.994296296296296
[11508] dbg: bayes: token 'H*F:D*gmail.com' => 0.994296296296296
[11508] dbg: bayes: token 'informacija' => 0.00644100651702229
[11508] dbg: bayes: token 'H*MI:smtpclu' => 0.993509555934965
[11508] dbg: bayes: token 'H*m:smtpclu' => 0.993509555934965
[11508] dbg: bayes: token 'mogucnost' => 0.993492957746479
[11508] dbg: bayes: token 'posaljite' => 0.992426229508197
[11508] dbg: bayes: token 'H*RT:mail.yu.net' => 0.991800275823836
[11508] dbg: bayes: token 'H*RT:sk:smtpclu' => 0.991800275823836
[11508] dbg: bayes: token 'srbije' => 0.0090549961351273
[11508] dbg: bayes: token 'obavestite' => 0.990941176470588
[11508] dbg: bayes: token 'preduzeca' => 0.990941176470588
[11508] dbg: bayes: token 'VAM' => 0.990941176470588
[11508] dbg: bayes: token 'DOSTAVITE' => 0.990941176470588
[11508] dbg: bayes: token 'saznajte' => 0.990941176470588
[11508] dbg: bayes: token 'Srbije' => 0.00931763810770952
[11508] dbg: bayes: token 'Vasa' => 0.988731707317073
[11508] dbg: bayes: token 'vasa' => 0.988731707317073
[11508] dbg: bayes: token 'cetiri' => 0.988731707317073
[11508] dbg: bayes: token 'narucivanje' => 0.988731707317073
[11508] dbg: bayes: token 'Vase' => 0.988731707317073
[11508] dbg: bayes: token 'kaze' => 0.014453270710345
[11508] dbg: bayes: token 'dlanu' => 0.985096774193548
[11508] dbg: bayes: token 'vasih' => 0.985096774193548
[11508] dbg: bayes: token 'proizvoda' => 0.985096774193548
[11508] dbg: bayes: token 'preduzecu' => 0.985096774193548
[11508] dbg: bayes: token 'adresar' => 0.985096774193548
[11508] dbg: bayes: token 'proizvod' => 0.985096774193548
[11508] dbg: bayes: token '990000' => 0.985096774193548
[11508] dbg: bayes: token 'postanski' => 0.985096774193548
[11508] dbg: bayes: token 'Vasih' => 0.985096774193548
[11508] dbg: bayes: token 'E-mail' => 0.985096774193548
[11508] dbg: bayes: token 'nekoliko' => 0.0156882143902964
[11508] dbg: bayes: token 'vrednost' => 0.0184329504297786
[11508] dbg: bayes: token 'H*RT:194.247.192.231' => 0.979696201682168
[11508] dbg: bayes: token 'Internetu' => 0.0215768525398059
[11508] dbg: bayes: token 'praznike' => 0.978
[11508] dbg: bayes: token 'Vasu' => 0.978
...
...
[11508] dbg: bayes: token 'novim' => 0.0438056888210111
[11508] dbg: bayes: token 'naziv' => 0.0444866337211565
[11508] dbg: bayes: token 'domena' => 0.0449915067301374
[11508] dbg: bayes: token 'koji' => 0.0452773463381254
[11508] dbg: bayes: token 'internetu' => 0.0453527611321541
[11508] dbg: bayes: token 'banke' => 0.0455685505490914
[11508] dbg: bayes: score = 0.994045530451266
[11508] dbg: bayes: DB expiry: tokens in DB: 121728, Expiry max size: 150000, Oldest atime: 1130853389, Newest atime: 1183041130, Last expire: 0, Current time: 1183041132
Not properly working SA Bayes debug:
[3145] dbg: bayes: tok_get_all: token count: 308
[3145] dbg: bayes: token 'TIM' => 0.993172413793104
[3145] dbg: bayes: token 'H*RU:sk:postpai' => 0.986543689320388
[3145] dbg: bayes: token 'HX-Spam-Relays-External:sk:postpai' => 0.986543689320388
[3145] dbg: bayes: token 'sk:wwwkon' => 0.986543689320388
[3145] dbg: bayes: token 'delatnost' => 0.0387147883969348
[3145] dbg: bayes: score = 0.781653777640385
[3145] dbg: bayes: DB expiry: tokens in DB: 121728, Expiry max size: 150000, Oldest atime: 1130853389, Newest atime: 1183041077, Last expire: 0, Current time: 1183041078
Any ideas?