Posted to users@spamassassin.apache.org by Lars Ringh <la...@bahnhof.net> on 2006/04/03 14:34:29 UTC
Moving bayes from bdb to MySQL
I'm about to move my bayes and auto-whitelist data from local db-files
on each server to a common MySQL-db.
I have 2+2 load balanced servers scanning mail using amavisd-new for
different kinds of customers, home and corporate users respectively, and
I was planning to keep their respective data in two separate db's since
they seem to be quite different.
Now, since in each case the source data can come from two different
servers scanning the same kind of mails, should I try to merge the
bayes-data from servers home1 and home2 into the same mysql-db and
then merge the data from corp1 and corp2 into the other mysql-db, or
should I pick my starting source data from only one server in each pair?
Would spamassassin benefit from having the greater source to look at, or
would I only be adding close-to-identical data which would then only be
expired faster than it took to merge it?
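
As a rough illustration of the question above, the sketch below merges two per-token count tables by summing them. It is a simplification, not SpamAssassin's actual Bayes code: the `BayesDB` class, the token names, the counts, and the naive probability formula are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BayesDB:
    """Hypothetical, simplified stand-in for one server's Bayes data."""
    nspam: int = 0                                   # spam messages learned
    nham: int = 0                                    # ham messages learned
    spam: Counter = field(default_factory=Counter)   # token -> spam message count
    ham: Counter = field(default_factory=Counter)    # token -> ham message count

    def merge(self, other: "BayesDB") -> "BayesDB":
        """Merge by summing message totals and per-token counts."""
        return BayesDB(
            nspam=self.nspam + other.nspam,
            nham=self.nham + other.nham,
            spam=self.spam + other.spam,
            ham=self.ham + other.ham,
        )

    def spam_probability(self, token: str) -> float:
        """Naive per-token estimate from relative frequencies."""
        s = self.spam[token] / self.nspam if self.nspam else 0.0
        h = self.ham[token] / self.nham if self.nham else 0.0
        return s / (s + h) if (s + h) else 0.5


# Made-up counts for two servers that scan the same kind of mail.
home1 = BayesDB(nspam=1000, nham=1000,
                spam=Counter({"viagra": 400, "meeting": 10}),
                ham=Counter({"viagra": 5, "meeting": 300}))
home2 = BayesDB(nspam=1000, nham=1000,
                spam=Counter({"viagra": 410, "meeting": 12}),
                ham=Counter({"viagra": 4, "meeting": 290}))

merged = home1.merge(home2)
for token in ("viagra", "meeting"):
    # Prints the single-server estimate next to the merged estimate.
    print(token,
          round(home1.spam_probability(token), 3),
          round(merged.spam_probability(token), 3))
```

In this toy model, merging near-identical data roughly doubles the counts but barely moves the per-token estimates.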
And out of curiosity, the "home servers" have about 165MB of bayes-data
and 335MB in auto-whitelist, while the "corporate servers" have it the
other way around, 335MB in bayes-db and 165MB in auto-whitelist. Could
anyone briefly enlighten me as to why? Is it as simple as that the
"home-servers" have fewer senders/recipients but more varied e-mails,
and the "corporate-servers" have more senders/recipients but fewer
distinct e-mails, or what?
//maccall
--
lars-dot-ringh-at-bahnhof-dot-net
Re: Moving bayes from bdb to MySQL
Posted by Lars Ringh <la...@bahnhof.net>.
Michael Monnerie wrote:
> On Monday, 3 April 2006 14:34 Lars Ringh wrote:
>
>>Now, since in each case the source data can come from two different
>>servers scanning the same kind of mails, should I try to merge the
>>bayes-data from servers home1 and home2 into the same mysql-db
>>and then merge the data from corp1 and corp2 into the other mysql-db,
>>or should I pick my starting source data from only one server in each
>>pair? Would spamassassin benefit from having the greater source to
>>look at, or would I only be adding close-to-identical data which
>>would then only be expired faster than it took to merge it?
>
>
> I believe you should *not* mix two different bayes DBs. Use just one,
> and the rest will fill up with the next SPAM jumping in...
Yes, I've done some more thinking myself and this must be the only
reasonable approach.
>>165MB...335MB
>
>
> Did you not enable bayes_auto_expire?
I was under the impression that I did, but since I've now imported some
of the data into mysql-dbs (where I can examine the data more easily
than when they are in bdb-files) I must say that I don't seem to...
A bit strange though, since the files reach this size from scratch in
quite a short time, and then the file size stays there, that is,
they don't grow any bigger. That's why I thought auto expire did
its work... One might suspect that I've given bayes_expiry_max_db_size
some really odd value but that's not the case either...
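
For reference, the SpamAssassin configuration documentation describes bayes_expiry_max_db_size as a token count rather than a file size (default 150000), and says that an expiry run keeps either 75% of that value or 100,000 tokens, whichever is larger. A minimal sketch of that arithmetic follows; the function name is hypothetical and this is not SpamAssassin's own code.

```python
def tokens_kept_after_expiry(max_db_size: int = 150_000) -> int:
    """Target token count after an expiry run, following the documented rule:
    the larger of 75% of bayes_expiry_max_db_size and 100,000 tokens."""
    return max(int(max_db_size * 0.75), 100_000)


for limit in (150_000, 500_000, 1_000_000):
    # Prints the configured limit next to the token count expiry aims for.
    print(limit, tokens_kept_after_expiry(limit))
```

Since the limit counts tokens, a file that levels off at a given size on disk is only an indirect sign that expiry is running.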
Well, anyway, thanks for your input.
//maccall
--
lars-dot-ringh-at-bahnhof-dot-net
Re: Moving bayes from bdb to MySQL
Posted by Michael Monnerie <m....@zmi.at>.
On Monday, 3 April 2006 14:34 Lars Ringh wrote:
> Now, since in each case the source data can come from two different
> servers scanning the same kind of mails, should I try to merge the
> bayes-data from servers home1 and home2 into the same mysql-db
> and then merge the data from corp1 and corp2 into the other mysql-db,
> or should I pick my starting source data from only one server in each
> pair? Would spamassassin benefit from having the greater source to
> look at, or would I only be adding close-to-identical data which
> would then only be expired faster than it took to merge it?
I believe you should *not* mix two different bayes DBs. Use just one,
and the rest will fill up with the next SPAM jumping in...
> 165MB...335MB
Did you not enable bayes_auto_expire?
Regards, zmi
--
// Michael Monnerie, Ing.BSc --- it-management Michael Monnerie
// http://zmi.at Tel: 0660/4156531 Linux 2.6.11
// PGP Key: "lynx -source http://zmi.at/zmi2.asc | gpg --import"
// Fingerprint: EB93 ED8A 1DCD BB6C F952 F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net Key-ID: 0x70545879