Posted to users@spamassassin.apache.org by Spiro Harvey <sp...@knossos.net.nz> on 2010/06/10 01:51:21 UTC

How do I get better processing/delivery times?

I maintain a mail cluster that gets about 70,000 messages a day per
node.

I'm just wondering if it's possible to decrease the scan times. In the
TOTALS section AvgTm is the average "scantime" in the spamassassin log
file:

(Delivered are messages that SA scores under 5, Spamboxed are scored
5+, but under 10, and Rejected are 10+)



# ./knl-spam-stats.awk /var/log/spamassassin.*

 TOTALS                                              
                              AvgTm  AvgThruput
             # Msgs   %/Total (sec) (bytes/sec)
             ~~~~~~ ~~~~~~~~~ ~~~~~ ~~~~~~~~~~~
  Delivered  176086 ( 17.80%) 15.22     2208.40
  Spamboxed   51194 (  5.17%) 19.92      550.14
  Rejected   762189 ( 77.03%) 19.30      537.56

  Total      989469 messages processed

  (70676/day; 2944.85/hr; 49.08/min; 0.82/sec)

 BLACKLIST HITS                                       
  Blacklist       Msgs   %/Total   %/Spam Avg Score
  ~~~~~~~~~~~~~ ~~~~~~ ~~~~~~~~~ ~~~~~~~~ ~~~~~~~~~
  Spamhaus SBL    2087 (  0.21%) (  0.26%)    16.15
  Spamhaus PBL  569825 ( 57.59%) ( 70.06%)    21.99
  Spamhaus XBL  497403 ( 50.27%) ( 61.15%)    21.98
  SBL URI       187292 ( 18.93%) ( 23.03%)    26.05
  NJABL           3544 (  0.36%) (  0.44%)    23.65
  SORBS         387539 ( 39.17%) ( 47.65%)    22.32
  Spamcop       513748 ( 51.92%) ( 63.16%)    22.48
  SURBL URI     360620 ( 36.45%) ( 44.34%)    27.25
  RFC Ignorant   29295 (  2.96%) (  3.60%)    20.68

 CUSTOM RULE HITS                                     
  Custom Rule     Msgs   %/Total Avg Score
  ~~~~~~~~~~~~~ ~~~~~~ ~~~~~~~~~ ~~~~~~~~~
  MIME/JPG          84 (  0.01%)    15.45
  ZIP file         741 (  0.07%)    18.44


Yet, on another mail cluster that only gets 4,000-5,000 messages a day per
node, the average scantimes are 4-5 seconds. Both have the same custom
rules, so any slowness in processing regexes should be noticeable on
both systems.

In the first case, we have started rsyncing Spamhaus' blacklists in the
hope that it would reduce scantimes by decreasing DNS lookup times.
It hasn't really made much difference, but my main concern is that
the messages seem to be taking so long regardless.

The boxes are running Sendmail 8.14 + ClamAV 0.96 + SA 3.3.1 + Razor


SPAMDOPTIONS="-d -x -c -m50 -H -s local2 /home/spamd -u spamd
--min-children=10 --min-spare=10"

Core 2 Duo @2.93GHz, 4GB RAM. Load averages typically sit at 5-7 during
the day.

Any advice on how I can tune the scantimes?


-- 
Spiro Harvey                  Knossos Networks Ltd
021-295-1923                  www.knossos.net.nz

Re: How do I get better processing/delivery times?

Posted by Spiro Harvey <sp...@knossos.net.nz>.
Matt Kettler <mk...@verizon.net> wrote:

> These settings:
> -m 50 --min-children=10 --min-spare=10
> 
> seem a bit high for a box with only 4GB of RAM... Is the box suffering
> from severe swap usage, and grinding itself to a halt when all 50 are
> up and running? (Try running "free"; what does it say?)

It's not that bad. I think that's how we came up with the number of
children in the first place. We just ramped them up until the server
started showing signs of a hernia, then backed them off:

# free
             total       used       free     shared    buffers     cached
Mem:       4148588    3565068     583520          0     173192    1955428
-/+ buffers/cache:    1436448    2712140
Swap:      1052248         88    1052160


Here's some output from vmstat (5 sec intervals):

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2     88 823612 176664 1780900    0    0    13  1604 1509  558 20  2 40 38  0
 0  3     88 816160 176664 1781056    0    0    13  1534 1421  443  7  1 47 45  0
 1  3     88 805348 176664 1784068    0    0     7  1952 1616  363 14  2 36 49  0
 0  3     88 787760 176684 1784272    0    0    10  1630 1388  573 40  2 21 37  0
10  2     88 719736 176684 1781208    0    0     7  1542 1829 1595 71  6 13 11  0
 0  2     88 695932 176684 1781392    0    0    23  1406 1476  771 55  2  9 34  0
 0  2     88 692472 176684 1781520    0    0    14  1346 1352  267  7  1 29 63  0
 0  2     88 686520 176684 1781584    0    0     6  1634 1381  400 10  2 44 45  0
 0  2     88 681904 176684 1781660    0    0     9  1540 1422  323  3  1 48 48  0
 0  2     88 826464 176684 1782428    0    0     0  1827 1781 1270 37  4 31 29  0
13  2     88 845932 176688 1782564    0    0    16  2036 1434  443 12  2 43 43  0
 0  2     88 830008 176688 1782548    0    0     2  1419 1397  319  9  1 45 45  0
 0  2     88 826136 176688 1782688    0    0    10  1392 1364  251  4  1 43 52  0
 0  3     88 827168 176688 1782728    0    0     2  1449 1420  367 12  2 39 48  0
 0  2     88 825376 176688 1781760    0    0     6  1954 1645 1500 75  4  9 12  0
21  3     88 813164 176688 1781840    0    0     7  1602 1443  368  6  2 48 45  0
 0  3     88 810728 176688 1781980    0    0    19  1379 1403  409  7  2 35 57  0
 0  2     88 797956 176688 1782056    0    0     4  1305 1341  214  4  1 44 50  0
 4  0     88 742540 176688 1782224    0    0     1  1780 1469  951 30  3 33 34  0
 0  1     88 748244 176688 1782132    0    0     0  1429 1424  362 29  2 36 33  0
18  2     88 725460 176688 1782276    0    0     4  1705 1519  593 24  3 37 36  0
 0  2     88 685828 176688 1782312    0    0     0  1108 1377  270 36  2 28 34  0
 0  3     88 673348 176688 1782564    0    0    18  1434 1456  373 13  2 40 46  0
 1  2     88 668980 176688 1783320    0    0     6  1354 1518  515  4  1 29 66  0
 0  2     88 673316 176728 1783220    0    0     2  1948 1393  494 11  2 30 58  0
 0  2     88 670124 176728 1783240    0    0     0  1663 1445  286  3  2 47 49  0
 3  0     88 614648 176736 1783216    0    0     2  1523 1590 1452 69  4 18  9  0
 0  2     88 582764 176736 1783544    0    0     7  1612 1617  944 26  3 36 36  0
 0  2     88 582876 176744 1783552    0    0     3  1774 1407  220  4  1 46 49  0
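
A dump like this is easier to read once the columns are averaged. The following is a minimal sketch, not part of the original post, that parses vmstat text and averages each column; the function name summarize_vmstat is made up for illustration. In the samples above, si and so are 0 in every row, while wa is often in the 30-60 range, which is the kind of pattern the averages make visible.

```python
def summarize_vmstat(text):
    """Average every numeric column of vmstat output, keyed by column name."""
    columns = None
    sums = {}
    count = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("procs"):
            continue  # skip blank lines and the banner line
        if fields[0] == "r":
            columns = fields  # this header row names the columns
            continue
        if columns is None or len(fields) != len(columns):
            continue  # ignore anything that does not look like a sample row
        try:
            values = [int(f) for f in fields]
        except ValueError:
            continue
        for name, value in zip(columns, values):
            sums[name] = sums.get(name, 0) + value
        count += 1
    if count == 0:
        return {}
    return {name: total / count for name, total in sums.items()}


# First two sample rows from the output above.
sample = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2     88 823612 176664 1780900    0    0    13  1604 1509  558 20  2 40 38  0
 0  3     88 816160 176664 1781056    0    0    13  1534 1421  443  7  1 47 45  0
"""
avg = summarize_vmstat(sample)
print("wa:", avg["wa"], "si:", avg["si"], "so:", avg["so"], "bo:", avg["bo"])
```

Feeding it the whole capture (for example the saved output of "vmstat 5") gives one figure per column. A high average wa with si and so at zero would point at disk writes rather than swapping, but that reading is an inference and worth confirming with other tools.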


> I might suggest something more like 10-20 max children with 4GB
> of RAM:
> -m 10 --min-children=5 --min-spare=1
> -m 20 --min-children=10 --min-spare=2
> Adding more children helps, but only if you have enough RAM to fit
> them all. Once you run out of RAM, performance suffers severely.

With those stats, I don't think we're running out of RAM. Maybe
temporarily, but not long term, and the memory is being freed back up.

I might run some stats to just extract the scantimes (maybe do some
hourly averages) and figure out if there are times when it's running
slower. Perhaps during busy periods those boxes are running into swap,
and that's what's killing the averages.
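
The hourly averages mentioned above could be sketched roughly as follows. This is an illustration only: it assumes syslog-style lines that start with "Mon DD HH:MM:SS" and contain a "scantime=N" field, as spamd result lines usually do. The sample lines, host name and the function name hourly_scantime are invented, not taken from the poster's logs.

```python
import re
from collections import defaultdict

# Assumed line shape (not from the thread):
#   Jun 10 01:51:21 host spamd[123]: spamd: result: Y 21 - RULES scantime=19.3,size=...
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s.*\bscantime=(\d+(?:\.\d+)?)"
)


def hourly_scantime(lines):
    """Return {hour_of_day: mean scantime in seconds} for matching lines."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of scantimes, count]
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a result line, or no scantime field
        bucket = totals[int(match.group(1))]
        bucket[0] += float(match.group(2))
        bucket[1] += 1
    return {hour: s / n for hour, (s, n) in sorted(totals.items())}


# Invented sample lines for demonstration.
sample_log = [
    "Jun 10 01:51:21 mx1 spamd[4021]: spamd: result: Y 21 - RCVD_IN_PBL,RCVD_IN_XBL scantime=19.0,size=2310,user=spamd",
    "Jun 10 01:58:02 mx1 spamd[4022]: spamd: result: . 1 - HTML_MESSAGE scantime=5.0,size=8841,user=spamd",
    "Jun 10 14:03:47 mx1 spamd[4030]: spamd: result: Y 12 - URIBL_SBL scantime=22.5,size=1977,user=spamd",
    "Jun 10 14:03:50 mx1 spamd[4030]: spamd: clean message (1.0/5.0) for spamd:501 in 5.0 seconds, 8841 bytes.",
]
averages = hourly_scantime(sample_log)
for hour, mean in averages.items():
    print(f"{hour:02d}:00  {mean:.2f}s")
```

For a real log, pass an open file object instead of the list. If slow hours line up with the busiest hours, that would support the load theory; if scantimes are uniformly slow, something like DNS or disk latency is the more likely suspect.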

Thanks


-- 
Spiro Harvey                  Knossos Networks Ltd
021-295-1923                  www.knossos.net.nz

Re: How do I get better processing/delivery times?

Posted by Matt Kettler <mk...@verizon.net>.
On 6/9/2010 7:51 PM, Spiro Harvey wrote:
> I maintain a mail cluster that gets about 70,000 messages a day per
> node.
>
> I'm just wondering if it's possible to decrease the scan times. In the
> TOTALS section AvgTm is the average "scantime" in the spamassassin log
> file:
>
> (Delivered are messages that SA scores under 5, Spamboxed are scored
> 5+, but under 10, and Rejected are 10+)
>
>
>
> # ./knl-spam-stats.awk /var/log/spamassassin.*
>
>  TOTALS                                              
>                               AvgTm  AvgThruput
>              # Msgs   %/Total (sec) (bytes/sec)
>              ~~~~~~ ~~~~~~~~~ ~~~~~ ~~~~~~~~~~~
>   Delivered  176086 ( 17.80%) 15.22     2208.40
>   Spamboxed   51194 (  5.17%) 19.92      550.14
>   Rejected   762189 ( 77.03%) 19.30      537.56
>
>   Total      989469 messages processed
>
>   (70676/day; 2944.85/hr; 49.08/min; 0.82/sec)
>
>  BLACKLIST HITS                                       
>   Blacklist       Msgs   %/Total   %/Spam Avg Score
>   ~~~~~~~~~~~~~ ~~~~~~ ~~~~~~~~~ ~~~~~~~~ ~~~~~~~~~
>   Spamhaus SBL    2087 (  0.21%) (  0.26%)    16.15
>   Spamhaus PBL  569825 ( 57.59%) ( 70.06%)    21.99
>   Spamhaus XBL  497403 ( 50.27%) ( 61.15%)    21.98
>   SBL URI       187292 ( 18.93%) ( 23.03%)    26.05
>   NJABL           3544 (  0.36%) (  0.44%)    23.65
>   SORBS         387539 ( 39.17%) ( 47.65%)    22.32
>   Spamcop       513748 ( 51.92%) ( 63.16%)    22.48
>   SURBL URI     360620 ( 36.45%) ( 44.34%)    27.25
>   RFC Ignorant   29295 (  2.96%) (  3.60%)    20.68
>
>  CUSTOM RULE HITS                                     
>   Custom Rule     Msgs   %/Total Avg Score
>   ~~~~~~~~~~~~~ ~~~~~~ ~~~~~~~~~ ~~~~~~~~~
>   MIME/JPG          84 (  0.01%)    15.45
>   ZIP file         741 (  0.07%)    18.44
>
>
> Yet, on another mail cluster that only gets 4,000-5,000 messages a day per
> node, the average scantimes are 4-5 seconds. Both have the same custom
> rules, so any slowness in processing regexes should be noticeable on
> both systems.
>
> In the first case, we have started rsyncing Spamhaus' blacklists in the
> hope that it would reduce scantimes by decreasing DNS lookup times.
> It hasn't really made much difference, but my main concern is that
> the messages seem to be taking so long regardless.
>
> The boxes are running Sendmail 8.14 + ClamAV 0.96 + SA 3.3.1 + Razor
>
>
> SPAMDOPTIONS="-d -x -c -m50 -H -s local2 /home/spamd -u spamd
> --min-children=10 --min-spare=10"
>
> Core 2 Duo @2.93GHz, 4GB RAM. Load averages typically sit at 5-7 during
> the day.
>
> Any advice on how I can tune the scantimes?
>
>
>   
These settings:
-m 50 --min-children=10 --min-spare=10

seem a bit high for a box with only 4GB of RAM... Is the box suffering
from severe swap usage, and grinding itself to a halt when all 50 are up
and running? (Try running "free"; what does it say?)


I might suggest something more like 10-20 max children with 4GB of RAM:

-m 10 --min-children=5 --min-spare=1
-m 20 --min-children=10 --min-spare=2

Adding more children helps, but only if you have enough RAM to fit them
all. Once you run out of RAM, performance suffers severely.

Or, as the manpage for -m says:

Note that if you run too many servers for the amount of free RAM
available, you run the danger of hurting performance by causing a high
swap load as server processes are swapped in and out continually.
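
As a rough sanity check on the child count, Little's law (average number in service = arrival rate x average time in service) can be applied to the TOTALS figures quoted above. This is a back-of-the-envelope sketch, not something from the thread; it only gives a daily average, so peak hours will need more, and the scantime itself may fall if contention is reduced.

```python
SECONDS_PER_DAY = 86400.0


def average_concurrency(msgs_per_day, avg_scan_seconds):
    """Little's law: mean scans in flight = arrival rate * mean scan time."""
    return (msgs_per_day / SECONDS_PER_DAY) * avg_scan_seconds


# (message count, AvgTm in seconds) for Delivered, Spamboxed, Rejected,
# copied from the TOTALS table in the original post.
rows = [(176086, 15.22), (51194, 19.92), (762189, 19.30)]
total_msgs = sum(n for n, _ in rows)
mean_scantime = sum(n * t for n, t in rows) / total_msgs

busy_children = average_concurrency(70676, mean_scantime)
print(f"mean scantime {mean_scantime:.1f}s, about {busy_children:.1f} scans in flight")

# The quieter cluster, taking the upper figures of 5000 msgs/day and 5 s.
quiet_children = average_concurrency(5000, 5.0)
print(f"quiet cluster: about {quiet_children:.2f} scans in flight")
```

With these numbers the busy nodes average somewhere around 15 scans in flight, which sits inside the suggested 10-20 range, while the quiet cluster averages well under one. Bringing the scantime down towards the quiet cluster's 4-5 seconds would shrink the required child count proportionally.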


Re: How do I get better processing/delivery times?

Posted by John Hardin <jh...@impsec.org>.
On Thu, 10 Jun 2010, Spiro Harvey wrote:

> I'm just wondering if it's possible to decrease the scan times. In the 
> TOTALS section AvgTm is the average "scantime" in the spamassassin log 
> file:
>
> (Delivered are messages that SA scores under 5, Spamboxed are scored 5+, 
> but under 10, and Rejected are 10+)
>
> BLACKLIST HITS
>  Blacklist       Msgs   %/Total   %/Spam Avg Score
>  ~~~~~~~~~~~~~ ~~~~~~ ~~~~~~~~~ ~~~~~~~~ ~~~~~~~~~
>  Spamhaus SBL    2087 (  0.21%) (  0.26%)    16.15
>  Spamhaus PBL  569825 ( 57.59%) ( 70.06%)    21.99
>  Spamhaus XBL  497403 ( 50.27%) ( 61.15%)    21.98

As these hits are being rejected anyway, promoting the ZEN DNSBL to an 
MTA-enforced SMTP-time check would reduce the load on SA and might reduce 
overall scan times if load is a contributing factor.
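
For reference, a DNSBL check of this kind is just a DNS A lookup on the client IP with its octets reversed and the list's zone appended; the MTA rejects the connection if the name resolves. In Sendmail this is normally configured with the dnsbl FEATURE in sendmail.mc rather than written by hand. The sketch below only illustrates the mechanism: the function names are made up, and a real check would also inspect which 127.0.0.x code comes back.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query: reversed IPv4 octets plus the list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="zen.spamhaus.org", resolver=socket.gethostbyname):
    """Return True if the DNSBL has an A record for this IP.

    Any resolver failure (including NXDOMAIN) is treated as "not listed",
    which is a simplification for this sketch.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


print(dnsbl_query_name("192.0.2.99"))
# Live lookup, needs network access and a resolver the list will answer:
# print(is_listed("127.0.0.2"))
```

The zone argument could equally point at a locally served copy of the list, such as one built from the rsynced data mentioned earlier in the thread.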

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   A well educated Electorate, being necessary to the liberty of a
   free State, the Right of the People to Keep and Read Books,
   shall not be infringed.
-----------------------------------------------------------------------
  244 days since President Obama won the Nobel "Not George W. Bush" prize