Posted to users@spamassassin.apache.org by Stefan Jakobs <st...@rus.uni-stuttgart.de> on 2006/06/08 13:56:22 UTC
size of bayes db
Hello list,
I'm using SA 3.1.2 with amavisd-new and postfix on a mail relay.
I turned on Bayes autolearning with the standard options, but my bayes_seen db
keeps growing; it is now at 1.1 GB.
Why doesn't SA reduce the size automatically?
What can I do to reduce the size of the db?
What is your experience with the Bayes db?
Thanks for your help.
Greetings
Stefan
Re: size of bayes db
Posted by Kris Deugau <kd...@vianet.ca>.
Stefan Jakobs wrote:
> I'm using SA 3.1.2 with amavisd-new and postfix on a mail relay.
> I turned on Bayes autolearning with the standard options, but my bayes_seen db
> keeps growing; it is now at 1.1 GB.
> Why doesn't SA reduce the size automatically?
Probably because its automatic expiry runs are getting interrupted by
amavisd-new. Check back in the list archives; quite a few people have
had this problem.
For *any* file-based sitewide Bayes setup, IMO, you should set the SA
options so it doesn't run automatic expiry, and set up a cron job to
manually run the expiry process on a regular basis (daily is probably
good for most sites; *really* high-traffic sites can probably go every
few hours but they should be using SQL-based Bayes anyway IMO <g>).
> What can I do to reduce the size of the db?
Right away, you can manually expire tokens by running sa-learn
--force-expire.
> What is your experience with the Bayes db?
One legacy system still running 2.64 has had a stable Bayes db around
40M for close to four years now. (Possibly 5 years. I don't recall
when I upgraded to 2.5x on that box.) Fairly early on, I disabled
automatic expiry and set up a daily cron job to run the expiry process
manually. I've *never* had trouble with the database inflating out of
control.
If you do set up a cron'ed expiry on your system, make sure it runs as
the same user amavisd-new is running as. Otherwise you'll end up with
file permission issues.
Check the man pages for your local SA install for the exact Bayes
options you need to tweak.
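A minimal sketch of the cron'ed expiry described above. Several things here are assumptions and do not come from this thread: that automatic expiry is switched off with "bayes_auto_expire 0" in the SA configuration amavisd-new reads (often /etc/mail/spamassassin/local.cf), that the daemon runs as a user called "amavis", and the script path and schedule in the crontab comment, which are hypothetical. Note also that the expiry pass prunes tokens; the thread does not confirm how much that shrinks bayes_seen itself.

```shell
#!/bin/sh
# Hypothetical wrapper for a daily Bayes expiry run from cron.
#
# Assumed companion settings (adjust to your install):
#   local.cf:            bayes_auto_expire 0
#   crontab of "amavis": 30 3 * * * /usr/local/sbin/bayes-expire.sh --run
#
# SA_LEARN may be overridden, e.g. with a full path to sa-learn.
SA_LEARN="${SA_LEARN:-sa-learn}"

expire_bayes() {
    # Force an expiry pass now instead of waiting for SA to decide one is due.
    "$SA_LEARN" --force-expire
}

# Only act when asked to, so the file can also be sourced without side effects.
if [ "${1:-}" = "--run" ]; then
    expire_bayes
fi
```

Installing the entry in the crontab of the user amavisd-new runs as (for example with "crontab -u amavis -e" as root) keeps the job under that user, which is what avoids the file permission issues mentioned above.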
-kgd
Re: size of bayes db
Posted by Kai Schaetzl <ma...@conactive.com>.
Stefan Jakobs wrote on Fri, 9 Jun 2006 11:06:47 +0200:
> It is a dbm db! The server processes ~80,000 mails per day, and the
> bayes_seen db is 5 months old.
If you count both dbs together, 1 GB might be what you end up with at this
volume and with no expiry. What's your "sa-learn --dump magic" output? That
will show you some statistics about your db. As an example, this is a dump of
a 42 MB dbm database. I let it expire with a threshold of about 1.5 million
tokens.
0.000 0 47588 0 non-token data: nspam
0.000 0 87524 0 non-token data: nham
0.000 0 1231268 0 non-token data: ntokens
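To compare such numbers over time, the dump can be reduced to name=value pairs. This is only a sketch: the function name bayes_magic is made up for illustration, and it assumes the layout shown in the lines above, where the value is the third whitespace-separated field and the name follows "non-token data: ".

```shell
#!/bin/sh
# bayes_magic: read `sa-learn --dump magic` output on stdin and print
# one name=value pair per "non-token data" line.
# Assumes the value is the third field, as in the sample lines above.
bayes_magic() {
    awk '/non-token data: / {
        name = $0
        sub(/^.*non-token data: /, "", name)   # keep only the field name
        gsub(/ /, "_", name)                   # make multi-word names one token
        print name "=" $3
    }'
}
```

Run as "sa-learn --dump magic | bayes_magic", the three lines above would come out as nspam=47588, nham=87524 and ntokens=1231268, one per line.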
With such a large db you may be better off, performance-wise, with an
SQL-based one, but expect it to take even more space. With the volume of mail
you get, I'd expire everything older than a month.
Kai
--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Re: size of bayes db
Posted by Stefan Jakobs <st...@rus.uni-stuttgart.de>.
Hello,
On Thursday, 8 June 2006 22:31, Kai Schaetzl wrote:
> Stefan Jakobs wrote on Thu, 8 Jun 2006 13:56:22 +0200:
> > I turned on Bayes autolearning with the standard options, but my
> > bayes_seen db keeps growing; it is now at 1.1 GB.
>
> That is indeed a lot. Is this a dbm db? (SQL databases are bigger because
> of indexing.) How much mail do you process per day?
It is a dbm db! The server processes ~80,000 mails per day, and the bayes_seen
db is 5 months old.
> Kai
Bye Stefan
Re: size of bayes db
Posted by Kai Schaetzl <ma...@conactive.com>.
Stefan Jakobs wrote on Thu, 8 Jun 2006 13:56:22 +0200:
> I turned on Bayes autolearning with the standard options, but my bayes_seen db
> keeps growing; it is now at 1.1 GB.
That is indeed a lot. Is this a dbm db? (SQL databases are bigger because of
indexing.) How much mail do you process per day?
Kai