Posted to users@spamassassin.apache.org by Matias Lopez Bergero <ml...@udesa.edu.ar> on 2007/11/07 20:21:52 UTC

large bayes db may cause downgrade in performance???

Hello,

I am dealing with slow scan times, and I suspect iowait. I am
using the standard db for bayes; could this be causing the higher scan times?

Here is my db directory:

> -rw-------    1 spamd    spamd           6 Oct  5  2006 auto-whitelist.mutex
> -rw-r--r--    1 root     root     438956192 Nov  7 15:51 bayes_backup_07112007
> -rw-------    1 spamd    spamd       24096 Nov  7 16:06 bayes_journal
> -rw-r--r--    1 spamd    spamd      124758 Nov  7 16:06 bayes.mutex
> -rw-r--r--    1 spamd    spamd    672612352 Nov  7 16:06 bayes_seen
> -rw-------    1 spamd    spamd    10629120 Nov  7 16:06 bayes_toks
> -rw-------    1 spamd    spamd    10182656 Feb 28  2007 bayes_toks.expire10134
> -rw-------    1 spamd    spamd     4472832 Feb 23  2007 bayes_toks.expire10399
> -rw-------    1 spamd    spamd    10629120 Oct 24 00:54 bayes_toks.expire1220
> -rw-------    1 spamd    spamd    10432512 Aug  7 15:01 bayes_toks.expire12662
> -rw-------    1 spamd    spamd     1097728 Feb 23  2007 bayes_toks.expire13340
> -rw-------    1 spamd    spamd     9506816 Sep 14 19:27 bayes_toks.expire1419
> -rw-------    1 spamd    spamd    10141696 Oct 23 12:45 bayes_toks.expire14302
> -rw-------    1 spamd    spamd    10047488 Feb 24  2007 bayes_toks.expire1514
> -rw-------    1 spamd    spamd     2371584 Feb 23  2007 bayes_toks.expire15994
> -rw-------    1 spamd    spamd     9445376 Sep 14 19:42 bayes_toks.expire1689
> -rw-------    1 spamd    spamd     9900032 Feb 28  2007 bayes_toks.expire17095
> -rw-------    1 spamd    spamd     9940992 Feb 28  2007 bayes_toks.expire17096
> -rw-------    1 spamd    spamd     6004736 May 30 23:40 bayes_toks.expire17408
> -rw-------    1 spamd    spamd     4780032 Aug  7 15:04 bayes_toks.expire19113
> -rw-------    1 spamd    spamd     5988352 May 30 23:56 bayes_toks.expire19120
> -rw-------    1 spamd    spamd      544768 Feb 23  2007 bayes_toks.expire19332
> -rw-------    1 spamd    spamd     4448256 Feb 23  2007 bayes_toks.expire19333
> -rw-------    1 spamd    spamd     4333568 Feb 23  2007 bayes_toks.expire19401
> -rw-------    1 spamd    spamd     5353472 Feb 28  2007 bayes_toks.expire19543
> -rw-------    1 spamd    spamd     5812224 May 30 23:51 bayes_toks.expire19987
> -rw-------    1 spamd    spamd     4472832 Sep 17 16:47 bayes_toks.expire21309
> -rw-------    1 spamd    spamd     9908224 Feb 28  2007 bayes_toks.expire21568
> -rw-------    1 spamd    spamd    10047488 Feb 28  2007 bayes_toks.expire21569
> -rw-------    1 spamd    spamd    10477568 Feb 27  2007 bayes_toks.expire21644
> -rw-------    1 spamd    spamd      540672 Feb 23  2007 bayes_toks.expire21686
> -rw-------    1 spamd    spamd     1073152 Feb 23  2007 bayes_toks.expire24464
> -rw-------    1 spamd    spamd     1056768 Feb 26  2007 bayes_toks.expire2596
> -rw-------    1 spamd    spamd    10375168 Feb 28  2007 bayes_toks.expire2628
> -rw-------    1 spamd    spamd     5500928 Feb  7  2007 bayes_toks.expire27617
> -rw-------    1 spamd    spamd    10072064 Feb 28  2007 bayes_toks.expire2906
> -rw-------    1 spamd    spamd     2215936 Feb 23  2007 bayes_toks.expire29415
> -rw-------    1 spamd    spamd     2199552 Feb 23  2007 bayes_toks.expire30847
> -rw-------    1 spamd    spamd    10563584 Nov  1 14:02 bayes_toks.expire31820
> -rw-------    1 spamd    spamd     5378048 Feb 28  2007 bayes_toks.expire334
> -rw-------    1 spamd    spamd      561152 Feb 20  2007 bayes_toks.expire354
> -rw-------    1 spamd    spamd     5115904 Feb 28  2007 bayes_toks.expire361
> -rw-------    1 spamd    spamd     4780032 Sep 14 19:22 bayes_toks.expire3773
> -rw-------    1 spamd    spamd     9445376 Dec 15  2006 bayes_toks.expire4088
> -rw-------    1 spamd    spamd     9428992 Sep 14 19:47 bayes_toks.expire442
> -rw-------    1 spamd    spamd     5009408 Nov 14  2006 bayes_toks.expire463
> -rw-------    1 spamd    spamd     5484544 Feb 28  2007 bayes_toks.expire501
> -rw-------    1 spamd    spamd     2158592 Feb 23  2007 bayes_toks.expire547
> -rw-------    1 spamd    spamd     2134016 Feb 23  2007 bayes_toks.expire548
> -rw-------    1 spamd    spamd     4972544 Feb 28  2007 bayes_toks.expire6264
> -rw-------    1 spamd    spamd     9723904 Sep 14 19:13 bayes_toks.expire662
> -rw-------    1 spamd    spamd      278528 Feb 23  2007 bayes_toks.expire6713
> -rw-------    1 spamd    spamd      561152 Feb 20  2007 bayes_toks.expire7497
> -rw-------    1 spamd    spamd     4456448 Feb 23  2007 bayes_toks.expire8117
> -rw-------    1 spamd    spamd     9965568 Feb 28  2007 bayes_toks.expire8626
> -rw-------    1 spamd    spamd     9322496 Sep 14 19:52 bayes_toks.expire8875
> -rw-------    1 spamd    spamd     9261056 Sep 14 19:57 bayes_toks.expire8912
> -rw-------    1 spamd    spamd      540672 Feb 23  2007 bayes_toks.expire8923
> -rw-------    1 spamd    spamd           0 Sep 15 20:33 __db.bayes_toks.expire1270.
> -rw-------    1 spamd    spamd           0 Sep 15 21:49 __db.bayes_toks.expire15897.
> -rw-------    1 spamd    spamd           0 Sep 16 00:58 __db.bayes_toks.expire16797.
> -rw-------    1 spamd    spamd           0 Sep 15 20:05 __db.bayes_toks.expire17778.
> -rw-------    1 spamd    spamd           0 Sep 15 22:15 __db.bayes_toks.expire20880.
> -rw-------    1 spamd    spamd           0 Sep 15 20:10 __db.bayes_toks.expire23514.
> -rw-------    1 spamd    spamd           0 Sep 15 20:08 __db.bayes_toks.expire26416.
> -rw-------    1 spamd    spamd           0 Sep 15 20:18 __db.bayes_toks.expire26421.
> -rw-------    1 spamd    spamd           0 Sep 16 01:59 __db.bayes_toks.expire26977.
> -rw-------    1 spamd    spamd           0 Sep 15 20:41 __db.bayes_toks.expire2739.
> -rw-------    1 spamd    spamd           0 Sep 15 20:07 __db.bayes_toks.expire27515.
> -rw-------    1 spamd    spamd           0 Sep 16 02:03 __db.bayes_toks.expire27877.
> -rw-------    1 spamd    spamd           0 Sep 15 20:12 __db.bayes_toks.expire28697.
> -rw-------    1 spamd    spamd           0 Sep 15 20:13 __db.bayes_toks.expire29089.
> -rw-------    1 spamd    spamd           0 Sep 16 02:22 __db.bayes_toks.expire30629.
> -rw-------    1 spamd    spamd           0 Sep 15 20:21 __db.bayes_toks.expire31427.
> -rw-------    1 spamd    spamd           0 Sep 15 23:48 __db.bayes_toks.expire4127.
> -rw-------    1 spamd    spamd           0 Sep 15 20:49 __db.bayes_toks.expire4501.
> -rw-------    1 spamd    spamd           0 Sep 15 20:30 __db.bayes_toks.expire660.
> -rw-------    1 spamd    spamd           0 Sep 15 20:31 __db.bayes_toks.expire661.
> -rw-------    1 spamd    spamd           0 Sep 16 02:34 __db.bayes_toks.expire714.
> -rw-------    1 spamd    spamd           0 Sep 16 03:13 __db.bayes_toks.expire7413.
> -rw-------    1 spamd    spamd           0 Sep 16 03:21 __db.bayes_toks.expire8458.
> -rw-------    1 spamd    spamd           0 Sep 15 21:14 __db.bayes_toks.expire9362.

As you can see, I stopped using AWL a while ago... but what are all
those bayes_toks.expire~ and __db.bayes_toks~~ files? It seems that I need
to clean this up a little bit :-D Could this be done with
sa-learn --sync? I am reading the wiki page "Setting up a cron job to force
Bayes Expiry", but I am not sure whether I need to apply it. Any comments?

I was told before on this list that, for faster performance, I would have
to move the bayes db to MySQL. Is the size of the bayes
db telling me that I should move to the MySQL setup?

I am receiving 50K messages daily on a Xeon 2.80GHz machine with 3GB of RAM
and ~3K accounts. I'm using sendmail, milters and spamd; clamav-milter/clamd
is also running, plus httpd, pop3d, etc.

Any comments are greatly appreciated :-)

Best Regards,
Matías.




Re: large bayes db may cause downgrade in performance???

Posted by "John D. Hardin" <jh...@impsec.org>.
On Wed, 7 Nov 2007, Matias Lopez Bergero wrote:

> John D. Hardin wrote:
> > On Wed, 7 Nov 2007, Matias Lopez Bergero wrote:
> > FAQ.
> > 
> > (1) turn off Bayes auto-expire. It's taking longer to clean your
> > database than spamd is willing to wait, so you're collecting lots of
> > aborted cleanup data and your database is not being cleaned up;
> > 
> > then,
> > 
> > (2) set up a cron job to do manual Bayes expiry daily or weekly or 
> > monthly, depending on your message traffic.
> 
> 
> Do I need to restart SA after sa-learn --force-expire?

No, you do not.

You can also safely delete all the bayes temporary files to recover 
disk space.
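
A minimal sketch of that cleanup, assuming the temporary files follow the two name patterns visible in the directory listing from the original post (bayes_toks.expireNNNN and __db.bayes_toks.expireNNNN.). The function name clean_bayes_tmp and the --delete switch are hypothetical, not part of SpamAssassin. It only lists matches unless --delete is given, and it relies on find's -maxdepth, which GNU and BSD find support. It does not check whether an expiry run is in progress, so it is probably best run when none is.

```shell
#!/bin/sh
# clean_bayes_tmp DIR [--delete]
# Hypothetical helper: lists (default) or deletes leftover Bayes expiry
# temp files in DIR. The patterns are taken from the listing in this thread;
# the live files (bayes_toks, bayes_seen, bayes_journal) do not match them.
clean_bayes_tmp() {
    dir=$1
    mode=${2:-}
    if [ "$mode" = "--delete" ]; then
        # Remove only regular files directly inside DIR that match.
        find "$dir" -maxdepth 1 -type f \
            \( -name 'bayes_toks.expire*' -o -name '__db.bayes_toks.expire*' \) \
            -exec rm -f -- {} +
    else
        # Dry run: print what would be removed.
        find "$dir" -maxdepth 1 -type f \
            \( -name 'bayes_toks.expire*' -o -name '__db.bayes_toks.expire*' \) \
            -print
    fi
}
```

Running clean_bayes_tmp on the Bayes directory first without --delete shows what would go; adding --delete then removes those files and leaves everything else in place.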

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The yardstick you should use when considering whether to support a
  given piece of legislation is "what if my worst enemy is chosen to
  administer this law?"
-----------------------------------------------------------------------
 4 days until Veterans Day


Re: large bayes db may cause downgrade in performance???

Posted by Matias Lopez Bergero <ml...@udesa.edu.ar>.
John D. Hardin wrote:
> On Wed, 7 Nov 2007, Matias Lopez Bergero wrote:
> FAQ.
> 
> (1) turn off Bayes auto-expire. It's taking longer to clean your
> database than spamd is willing to wait, so you're collecting lots of
> aborted cleanup data and your database is not being cleaned up;
> 
> then,
> 
> (2) set up a cron job to do manual Bayes expiry daily or weekly or 
> monthly, depending on your message traffic.


Do I need to restart SA after sa-learn --force-expire?

Thank you,
Matías.



Re: large bayes db may cause downgrade in performance???

Posted by "John D. Hardin" <jh...@impsec.org>.
On Wed, 7 Nov 2007, Matias Lopez Bergero wrote:

> -rw-------    1 spamd    spamd    10182656 Feb 28  2007 bayes_toks.expire10134
> -rw-------    1 spamd    spamd     4472832 Feb 23  2007 bayes_toks.expire10399
> -rw-------    1 spamd    spamd    10629120 Oct 24 00:54 bayes_toks.expire1220
> -rw-------    1 spamd    spamd    10432512 Aug  7 15:01 bayes_toks.expire12662

FAQ.

(1) turn off Bayes auto-expire. It's taking longer to clean your
database than spamd is willing to wait, so you're collecting lots of
aborted cleanup data and your database is not being cleaned up;

then,

(2) set up a cron job to do manual Bayes expiry daily or weekly or 
monthly, depending on your message traffic.
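
The two steps above could be sketched as follows. The setting bayes_auto_expire and the option sa-learn --force-expire are SpamAssassin's own; the config file location, the function name run_bayes_expire, the SA_LEARN override and the schedule are assumptions made for illustration.

```shell
#!/bin/sh
# Step (1), in the site config (often /etc/mail/spamassassin/local.cf;
# the exact location is an assumption and varies by install):
#     bayes_auto_expire 0
#
# Step (2), a hypothetical helper for cron to call. It should run as the
# user that owns the Bayes files (spamd in the listing from this thread).
# SA_LEARN is a made-up override so the binary path can be changed.
run_bayes_expire() {
    sa_learn=${SA_LEARN:-sa-learn}
    if "$sa_learn" --force-expire; then
        echo "bayes expiry finished"
    else
        echo "bayes expiry failed" >&2
        return 1
    fi
}
```

Saved in a script that ends with a call to run_bayes_expire, this could be scheduled with a crontab line such as "15 3 * * * /path/to/script" for a daily run; the time and path are placeholders, and a weekly or monthly schedule may suit lower traffic.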

Would it be reasonable to extend the SA lint tool to check for this 
and excessively large rule files and other performance FAQ subjects?

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Of the twenty-two civilizations that have appeared in history,
  nineteen of them collapsed when they reached the moral state the
  United States is in now.                          -- Arnold Toynbee
-----------------------------------------------------------------------
 4 days until Veterans Day