Posted to users@spamassassin.apache.org by Karsten Bräckelmann <gu...@rudersport.de> on 2011/11/18 01:09:47 UTC

Re: Performance Problems Upgrading From 3.2.5 to 3.3.1 on CentOS 5/6

On Thu, 2011-11-17 at 15:55 +0000, Tom wrote:
> SPAMDOPTIONS="-d -L -i 10.44.219.208 -A 10.44.217.0/20 -m 40 -q -x -u 
> spamd --min-children=40"

Do you really run a single spamd server, serving a /20 of potential SMTP
servers?

Also, you configured spamd to try hard and always keep exactly 40
children around -- both max and min are set to that value.


> Other info Bayes module disabled, compiled regexes being used, RBL 
> checks disabled, network checks disabled.

I have no numbers, but I believe the stock set of regex rules in the 3.3
branch uses more CPU, and probably more RAM, than the 3.2 set. That was
particularly true of some earlier iterations, which have long since been
fixed. Did you run sa-update to keep the rules fresh, re-compile the
rules and restart spamd?

Any chance you're hitting swap?

Since you're running in -L local mode, how long does processing a mail
take? You mentioned "up to 600 mails per minute", which is 10 per second.
I hope that's a beefy machine, but in local-only mode, keeping 40
children around for up to 10 messages a second seems excessive.

(All 40 children would only be busy at that rate if each message took 4
seconds. Without hitting swap, 4 seconds of pure CPU per message, with no
idle waiting for DNS, is too high.)
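
The arithmetic behind that "4 seconds" figure is just Little's law (busy
workers = arrival rate x time per message). A rough Python sketch follows;
the function names are made up for illustration and are not part of
SpamAssassin, and the 0.5 s per-message time is a placeholder assumption,
not a measurement:

```python
# Little's law: busy_children = arrival_rate * seconds_per_message.
# Function names are hypothetical, for illustration only.

def seconds_to_saturate(children, mails_per_minute):
    """Per-message time at which all `children` workers are busy on average."""
    arrival_rate = mails_per_minute / 60.0  # messages per second
    return children / arrival_rate


def busy_children(mails_per_minute, seconds_per_message):
    """Average number of workers busy at the same time."""
    return (mails_per_minute / 60.0) * seconds_per_message


# 40 children at 600 mails per minute (10 per second):
print(seconds_to_saturate(40, 600))

# Assumed 0.5 s per message (placeholder value):
print(busy_children(600, 0.5))
```

If the real per-message time is well below a second, only a handful of
children are ever busy at once at this rate.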


Besides, you mentioned both CentOS 5 and 6 machines. So, is that number
of messages per minute total, or per machine?


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Performance Problems Upgrading From 3.2.5 to 3.3.1 on CentOS 5/6

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2011-11-18 at 19:36 +0100, Karsten Bräckelmann wrote:
> On Fri, 2011-11-18 at 08:16 +0000, Tom wrote:
> > (apologies if the html doesn't end up translating well!)

Damn, sorry. My attempt at pruning the large tables seriously fucked up
the formatting. :/


> > output from top, after running spamassassin for a couple of minutes:
> > 
> > [root@spam_209 ~]# top
> > 
> > top - 08:01:24 up 23:27,  3 users,  load average: 31.22, 10.73, 3.84
> 
> These numbers don't match up. The above, "after 15 minutes" claims load
> average < 1, whereas this "after a couple minutes" skyrockets.
> 
> Also, I guess the "spam_209" hostname corresponds with the IP, so this
> machine is not included in the above stats.

Oh, wait -- that's probably 3.2 and 3.3 respectively.

Anyway, the load average is roughly the number of runnable processes,
averaged over time. With 40 busy, entirely CPU-bound processes running,
I'd expect the load to be in that range...
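
As a rough illustration of why the 1-minute figure can climb that fast:
Linux keeps the load average as an exponentially damped average of the
run queue. The sketch below is an idealised, continuous version of that,
assuming a constant number of runnable processes and a starting load of
zero (neither is exactly true for the host in this thread). The function
name is made up.

```python
import math


def load_after(seconds, runnable, period=60.0, start=0.0):
    """Idealised load average after `seconds` with a constant run queue.

    `period` is the averaging window in seconds (60 for the 1-minute value).
    """
    decay = math.exp(-seconds / period)
    return start * decay + runnable * (1.0 - decay)


# 40 CPU-bound processes, 1-minute load average over time:
for t in (30, 60, 120, 300):
    print(t, round(load_after(t, 40), 2))
```

With 40 processes permanently runnable, the value approaches 40 within a
few minutes, which fits the order of magnitude top reports.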


> > Tasks: 115 total,  40 running,  75 sleeping,   0 stopped,   0 zombie
> > Cpu(s): 99.0%us,  0.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> > Mem:   2055676k total,  1985400k used,    70276k free,    27704k buffers
> > Swap:  4128760k total,      344k used,  4128416k free,   470672k cached
> > 
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 18458 spamd     20   0  255m  58m 3136 R  2.6  2.9   0:02.90 spamd
> > 18463 spamd     20   0  255m  58m 3152 R  2.6  2.9   0:04.62 spamd
> [...]
> 
> Here we have 40 spamd processes, each getting a tiny share of the CPU
> resources.
> 
> I might be wrong, but since your -L local-only spamd processes are
> entirely CPU (and memory) bound, wouldn't it be more appropriate to run
> about twice as many processes as CPU cores, to avoid context switches?
> 
> As it is right now, there are 5 (or even 10?) spamd processes per CPU
> core, all busy burning CPU simultaneously.
> 
> Raising the number of concurrent spamd processes is useful to avoid idle
> waiting for DNS, etc -- which you don't have.



Re: Performance Problems Upgrading From 3.2.5 to 3.3.1 on CentOS 5/6

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2011-11-18 at 08:16 +0000, Tom wrote:
> Here's the stats from my cluster at the moment (8am) (these figures will
> ramp up considerably!) (apologies if the html doesn't end up
> translating well!)
> 
> Server         Load Avg  Processed/Min  Busy Child Proc  Proc Time
> 10.44.219.192  0.34      42             1                0.31
> 10.44.219.193  0.19      42             1                0.32
> [...]
> 10.44.219.208  0.55      164            1                0.16
> Total:         0.45      1,527          16               0.21
> 
> and 15 minutes later :)
> 
> Server         Load Avg  Processed/Min  Busy Child Proc  Proc Time
> 10.44.219.192  0.40      63             1                0.34
> 10.44.219.193  0.59      61             1                0.38
> [...]
> 10.44.219.208  0.84      224            1                0.15
> Total:         0.66      2,174          16               0.23
> 
> 
> output from top, after running spamassassin for a couple of minutes:
> 
> [root@spam_209 ~]# top
> 
> top - 08:01:24 up 23:27,  3 users,  load average: 31.22, 10.73, 3.84

These numbers don't match up. The above, "after 15 minutes" claims load
average < 1, whereas this "after a couple minutes" skyrockets.

Also, I guess the "spam_209" hostname corresponds with the IP, so this
machine is not included in the above stats.


> Tasks: 115 total,  40 running,  75 sleeping,   0 stopped,   0 zombie
> Cpu(s): 99.0%us,  0.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Mem:   2055676k total,  1985400k used,    70276k free,    27704k buffers
> Swap:  4128760k total,      344k used,  4128416k free,   470672k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 18458 spamd     20   0  255m  58m 3136 R  2.6  2.9   0:02.90 spamd
> 18463 spamd     20   0  255m  58m 3152 R  2.6  2.9   0:04.62 spamd
[...]

Here we have 40 spamd processes, each getting a tiny share of the CPU
resources.

I might be wrong, but since your -L local-only spamd processes are
entirely CPU (and memory) bound, wouldn't it be more appropriate to run
about twice as many processes as CPU cores, to avoid context switches?

As it is right now, there are 5 (or even 10?) spamd processes per CPU
core, all busy burning CPU simultaneously.

Raising the number of concurrent spamd processes is useful to avoid idle
waiting for DNS, etc -- which you don't have.
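
To put rough numbers on that rule of thumb, here is a small Python
sketch. The core counts of 4 and 8 are assumptions (they correspond to
the "10" and "5" processes per core guessed above), and the helper names
are made up for illustration:

```python
import os


def children_per_core(children, cores):
    """How many spamd children compete for each core."""
    return children / cores


def suggested_max_children(cores, per_core=2):
    """Rule of thumb from above: about two CPU-bound workers per core."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return cores * per_core


# Assumed core counts; the actual number for this host is not stated.
for cores in (4, 8):
    print(cores, children_per_core(40, cores), suggested_max_children(cores))

# On the machine itself, this reports the number of logical CPUs
# (it may return None if the count cannot be determined).
print(os.cpu_count())
```

The result would be the value to try for the existing -m (max children)
option, instead of 40.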




Re: Performance Problems Upgrading From 3.2.5 to 3.3.1 on CentOS 5/6

Posted by Tom <to...@t0mb.net>.
Hi Karsten,

Firstly, thanks for getting back to me and taking the time to really dig 
in to my mail!

I have 16 servers, a mix of Dell PowerEdge 1750s, 1850s and 1950s.  The
1750s with 1 GB of memory tend to process about 120 mails per minute,
and the 1950s with 4 GB of memory get up to 600 per minute when they're
working hard.  They all have dual dual-core or quad-core Xeons.  I'm
submitting mails from 42 different smtpin servers running postfix.  At
the moment, they're only doing 1300 mails per minute across the cluster,
as we're not at peak time here.  The average time to process a mail is
0.21 seconds and they're all running well, not working very hard at
all.  As load starts to ramp up, I'd expect to see them processing about
4k to 5.5k per minute across the whole cluster.

I actually removed most of the IP ranges that I'm specifying as options 
to spamd, as I don't imagine it would be helping diagnose the issue, but 
no, I'm not serving a /20.  I think one of my colleagues was possibly 
just being a bit lazy, specifying a /20 range beginning in the second 
/24 in a /22 range! :)

My previous tests with this issue have been on:

a) first tried running src rpm compiled from apache tarball of 3.3.1 on 
a 1950 + 4gb RAM
b) second time unplanned upgrade (general upgrade to centos-5.7) caused 
a mixture of machines to be upgraded, all exhibited the same problems
c) 3rd time yesterday, me testing on an ESX VM, running rhel6 w/ 2gb RAM.

I ran sa-update when upgrading to 3.3.1, so the ruleset was as fresh as 
it possibly could be.

I can confirm that I'm not hitting swap at any point, at all.  Not even
a megabyte of it.  I'll paste some output from top and vmstat below.

This is the timing output summary from spamassassin --lint -L -D on a 
3.3.1 machine

Nov 18 08:18:38.723 [18613] dbg: timing: total 2492 ms - init: 1860 
(74.6%), parse: 1.57 (0.1%), extract_message_metadata: 2 (0.1%), 
get_uri_detail_list: 1.92 (0.1%), tests_pri_-1000: 14 (0.6%), 
compile_gen: 234 (9.4%), compile_eval: 33 (1.3%), tests_pri_-950: 10 
(0.4%), tests_pri_-900: 11 (0.4%), tests_pri_0: 451 (18.1%), 
tests_pri_500: 135 (5.4%)
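
For anyone wanting to see where the time in that line goes, here is a
minimal Python sketch that parses it and ranks the phases. The parsing
regexes are an assumption based only on the line as pasted above, not on
any documented format. Note that the listed phases add up to more than
the total, so some of them presumably overlap.

```python
import re

# Timing line copied from the message above, joined onto one line.
LINE = (
    "timing: total 2492 ms - init: 1860 (74.6%), parse: 1.57 (0.1%), "
    "extract_message_metadata: 2 (0.1%), get_uri_detail_list: 1.92 (0.1%), "
    "tests_pri_-1000: 14 (0.6%), compile_gen: 234 (9.4%), "
    "compile_eval: 33 (1.3%), tests_pri_-950: 10 (0.4%), "
    "tests_pri_-900: 11 (0.4%), tests_pri_0: 451 (18.1%), "
    "tests_pri_500: 135 (5.4%)"
)


def parse_timing(line):
    """Return (total_ms, {phase_name: ms}) from a timing debug line."""
    total = float(re.search(r"total (\d+(?:\.\d+)?) ms", line).group(1))
    pairs = re.findall(r"([\w-]+): (\d+(?:\.\d+)?) \(", line)
    return total, {name: float(ms) for name, ms in pairs}


total, phases = parse_timing(LINE)

# Largest phases first.
ranked = sorted(phases.items(), key=lambda kv: kv[1], reverse=True)
for name, ms in ranked[:4]:
    print(f"{name:>16} {ms:8.2f} ms  {100 * ms / total:5.1f}%")

print("total minus init:", total - phases["init"], "ms")
```

Most of the wall time in this run is in init; the line alone does not
show how much of that a long-running spamd would pay per message.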

The same mail on 3.2.5 takes 1.109 seconds to run, and that's on a
machine that's busy.  I understand that things might take longer on
3.3.1 because it's filtering so much more effectively, so what do I
need to do to get it performing decently?

Here's the stats from my cluster at the moment (8am) (these figures will
ramp up considerably!) (apologies if the html doesn't end up translating
well!)

Server         Load Avg  Processed/Min  Busy Child Proc  Proc Time
10.44.219.192  0.34      42             1                0.31
10.44.219.193  0.19      42             1                0.32
10.44.219.194  0.40      42             1                0.31
10.44.219.195  0.17      42             1                0.22
10.44.219.196  0.53      85             1                0.23
10.44.219.197  0.29      84             1                0.22
10.44.219.198  0.53      85             1                0.23
10.44.219.199  0.46      83             1                0.22
10.44.219.200  0.99      84             1                0.22
10.44.219.201  0.50      92             1                0.24
10.44.219.202  0.40      84             1                0.21
10.44.219.203  0.12      0              0                0.00
10.44.219.204  0.41      94             1                0.18
10.44.219.205  0.55      169            1                0.18
10.44.219.206  0.81      169            1                0.18
10.44.219.207  0.49      166            1                0.14
10.44.219.208  0.55      164            1                0.16
Total:         0.45      1,527          16               0.21


and 15 minutes later :)

Server         Load Avg  Processed/Min  Busy Child Proc  Proc Time
10.44.219.192  0.40      63             1                0.34
10.44.219.193  0.59      61             1                0.38
10.44.219.194  0.25      64             1                0.30
10.44.219.195  0.47      63             1                0.26
10.44.219.196  0.55      122            1                0.23
10.44.219.197  0.71      115            1                0.21
10.44.219.198  0.82      125            1                0.24
10.44.219.199  0.46      114            1                0.22
10.44.219.200  0.97      127            1                0.26
10.44.219.201  0.79      116            1                0.26
10.44.219.202  0.54      113            1                0.25
10.44.219.203  0.04      0              0                0.00
10.44.219.204  0.50      117            1                0.20
10.44.219.205  1.35      254            1                0.21
10.44.219.206  1.04      247            1                0.21
10.44.219.207  0.93      249            1                0.17
10.44.219.208  0.84      224            1                0.15
Total:         0.66      2,174          16               0.23



output from top, after running spamassassin for a couple of minutes:

[root@spam_209 ~]# top

top - 08:01:24 up 23:27,  3 users,  load average: 31.22, 10.73, 3.84
Tasks: 115 total,  40 running,  75 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.0%us,  0.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   2055676k total,  1985400k used,    70276k free,    27704k buffers
Swap:  4128760k total,      344k used,  4128416k free,   470672k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18458 spamd     20   0  255m  58m 3136 R  2.6  2.9   0:02.90 spamd
18463 spamd     20   0  255m  58m 3152 R  2.6  2.9   0:04.62 spamd
18466 spamd     20   0  249m  53m 3536 R  2.6  2.6   0:02.86 spamd
18467 spamd     20   0  251m  54m 3544 R  2.6  2.7   0:03.65 spamd
18468 spamd     20   0  252m  55m 3532 R  2.6  2.8   0:03.52 spamd
18470 spamd     20   0  250m  54m 3548 R  2.6  2.7   0:03.29 spamd
18473 spamd     20   0  251m  54m 3540 R  2.6  2.7   0:03.23 spamd
18474 spamd     20   0  252m  56m 3564 R  2.6  2.8   0:02.99 spamd
18475 spamd     20   0  252m  55m 3528 R  2.6  2.8   0:03.02 spamd
18476 spamd     20   0  250m  53m 3564 R  2.6  2.7   0:02.72 spamd
18477 spamd     20   0  251m  55m 3524 R  2.6  2.7   0:02.82 spamd
18478 spamd     20   0  252m  56m 3528 R  2.6  2.8   0:03.09 spamd
18480 spamd     20   0  250m  53m 3544 R  2.6  2.7   0:03.86 spamd
18481 spamd     20   0  255m  58m 3500 R  2.6  2.9   0:03.14 spamd
18483 spamd     20   0  253m  56m 3528 R  2.6  2.8   0:03.31 spamd
18484 spamd     20   0  249m  52m 3524 R  2.6  2.6   0:02.73 spamd
18485 spamd     20   0  251m  55m 3540 R  2.6  2.8   0:02.64 spamd
18487 spamd     20   0  252m  55m 3532 R  2.6  2.8   0:03.74 spamd
18489 spamd     20   0  252m  56m 3540 R  2.6  2.8   0:02.68 spamd
18495 spamd     20   0  250m  53m 3532 R  2.6  2.7   0:02.60 spamd
18496 spamd     20   0  252m  56m 3516 R  2.6  2.8   0:02.41 spamd
18498 spamd     20   0  257m  60m 3552 R  2.6  3.0   0:02.65 spamd
18460 spamd     20   0  249m  53m 3528 R  2.3  2.7   0:03.93 spamd
18461 spamd     20   0  251m  55m 3512 R  2.3  2.7   0:02.80 spamd
18464 spamd     20   0  254m  57m 3548 S  2.3  2.9   0:03.30 spamd
18465 spamd     20   0  251m  55m 3508 R  2.3  2.7   0:02.85 spamd
18469 spamd     20   0  253m  56m 3536 R  2.3  2.8   0:02.66 spamd
18471 spamd     20   0  251m  54m 3544 R  2.3  2.7   0:02.80 spamd
18472 spamd     20   0  252m  55m 3136 R  2.3  2.8   0:02.82 spamd
18479 spamd     20   0  251m  54m 3540 R  2.3  2.7   0:02.94 spamd
18482 spamd     20   0  250m  53m 3556 R  2.3  2.7   0:02.99 spamd
18486 spamd     20   0  251m  54m 3548 R  2.3  2.7   0:03.24 spamd
18488 spamd     20   0  252m  55m 3548 R  2.3  2.8   0:03.07 spamd
18490 spamd     20   0  252m  55m 3112 R  2.3  2.7   0:03.38 spamd
18491 spamd     20   0  252m  56m 3536 R  2.3  2.8   0:02.81 spamd
18492 spamd     20   0  251m  55m 3532 R  2.3  2.7   0:03.31 spamd
18493 spamd     20   0  250m  54m 3528 R  2.3  2.7   0:02.86 spamd
18494 spamd     20   0  252m  55m 3528 R  2.3  2.8   0:02.86 spamd
18497 spamd     20   0  249m  53m 3528 R  2.3  2.7   0:02.89 spamd
18499 spamd     20   0  250m  54m 3532 R  2.3  2.7   0:02.67 spamd

This is me starting spamassassin and immediately running vmstat,
updating every second, until the load average gets above 30, as can be
seen below, at which point I kill it so that it can't start having an
impact on my mail cluster:

[root@spam_209 spamassassin]# /etc/init.d/spamassassin start ; vmstat 1
Starting spamd:                                            [  OK  ]
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 5  0      0 1199252  37252 575968    0    0     2    11   46   48  1  0 99  0  0
 2  0      0 1064052  37252 575996    0    0     0     0 1278  654 67 33  0  0  0
 3  0      0 1013956  37252 575996    0    0     0     0 1176  465 93  7  0  0  0
 4  0      0 968448  37264 575996    0    0     0    88 1244  565 91  9  0  0  0
 4  0      0 938820  37264 576000    0    0     0     0 1088  575 95  5  0  0  0
 6  0      0 902472  37264 576000    0    0     0     0 1142  618 93  7  0  0  0
 7  0      0 862544  37264 576036    0    0     0     0 1134  610 91  9  0  0  0
 9  0      0 832040  37264 576056    0    0     0     0 1099  635 94  6  0  0  0
 9  0      0 788260  37272 576016    0    0     0    36 1160  652 90 10  0  0  0
11  0      0 737536  37272 576020    0    0     0     0 1104  627 89 11  0  0  0
13  0      0 690540  37272 576020    0    0     0     0 1140  610 93  7  0  0  0
11  0      0 641188  37272 576012    0    0     0     0 1219  642 89 11  0  0  0
10  0      0 599344  37272 576016    0    0     0     0 1176  696 87 13  0  0  0
11  0      0 545380  37280 576060    0    0     0    56 1205  664 79 21  0  0  0
10  0      0 503972  37280 576092    0    0     0    44 1135  643 86 14  0  0  0
12  0      0 461316  37280 576024    0    0     0     0 1184  650 85 15  0  0  0
15  0      0 418800  37280 576024    0    0     0     0 1194  625 85 15  0  0  0
13  0      0 396604  37280 576028    0    0     0     0 1103  623 95  5  0  0  0
12  0      0 374656  37288 576028    0    0     0    72 1118  634 95  5  0  0  0
12  0      0 342912  37288 576032    0    0     0     0 1100  642 94  6  0  0  0
14  0      0 321452  37288 576036    0    0     0     0 1172  655 95  5  0  0  0
16  0      0 313012  37288 576036    0    0     0     0 1162  658 95  5  0  0  0
12  0      0 299620  37288 576040    0    0     0     0 1167  624 98  2  0  0  0
15  0      0 277920  37296 576040    0    0     0    12 1223  687 96  4  0  0  0
14  0      0 268744  37300 576324    0    0     0     0 1169  655 98  2  0  0  0
16  0      0 262048  37300 576344    0    0     0     0 1179  665 99  1  0  0  0
19  0      0 250392  37300 576176    0    0     0     0 1302  649 97  3  0  0  0
20  0      0 237364  37300 576208    0    0     0     0 1164  636 96  4  0  0  0
16  0      0 221120  37308 576208    0    0     0    48 1088  623 96  4  0  0  0
17  0      0 203264  37308 576208    0    0     0    64 1165  673 96  4  0  0  0
17  0      0 193964  37308 576212    0    0     0     0 1258  678 99  1  0  0  0
16  0      0 191236  37308 576336    0    0     0     0 1085  624 98  2  0  0  0
15  0      0 179952  37308 576356    0    0     0     0 1249  666 96  4  0  0  0
14  0      0 176604  37316 576196    0    0     0    68 1154  635 100  0  0  0  0
19  0      0 174248  37316 576184    0    0     0     4 1180  677 99  1  0  0  0
22  0      0 170892  37316 576192    0    0     0     0 1240  649 98  2  0  0  0
25  0      0 164568  37316 576192    0    0     0     0 1347  655 97  3  0  0  0
24  0      0 161344  37316 576200    0    0     0     0 1117  630 100  0  0  0  0
27  0      0 157244  37324 576508    0    0     0    56 1338  704 98  2  0  0  0
28  0      0 151168  37324 576524    0    0     0     0 1151  627 97  3  0  0  0
25  0      0 145588  37324 576528    0    0     0     0 1172  664 99  1  0  0  0
24  0      0 137156  37324 576212    0    0     0     0 1249  688 99  1  0  0  0
24  0      0 135172  37324 576216    0    0     0     0 1154  626 99  1  0  0  0
29  0      0 131272  37332 576092    0    0     0    60 1270  727 95  5  0  0  0
28  0      0 122500  37332 576096    0    0     0     8 1085  632 94  6  0  0  0
29  0      0 120764  37332 576096    0    0     0     0 1177  646 96  4  0  0  0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
29  0      0 112084  37332 576096    0    0     0     0 1123  615 98  2  0  0  0
31  0      0 109852  37332 576256    0    0     0     0 1110  639 98  2  0  0  0
31  0      0 102800  37340 576260    0    0     0    48 1139  660 99  1  0  0  0
32  0      0 101808  37340 576260    0    0     0    92 1156  651 98  2  0  0  0
30  0      0 101188  37340 576260    0    0     0     0 1109  622 99  1  0  0  0
33  0      0 100072  37340 576332    0    0     0     0 1172  654 98  2  0  0  0
34  0      0  97724  37340 576176    0    0     0     0 1348  664 99  1  0  0  0
30  0      0  88672  37348 576108    0    0     0    36 1126  648 99  1  0  0  0
29  0      0  86316  37348 576480    0    0     0     0 1105  640 96  4  0  0  0
29  0      0  81852  37348 576484    0    0     0     0 1226  652 100  0  0  0  0
28  0      0  80240  37348 576488    0    0     0     0 1134  639 99  1  0  0  0
33  0      0  78416  37348 576120    0    0     0     0 1300  630 98  2  0  0  0
33  0      0  74944  37348 576120    0    0     0     0 1043  521 96  4  0  0  0
33  0      0  70844  37356 576124    0    0     0   100 1149  584 99  1  0  0  0
37  0      0  68796  37356 576124    0    0     0     0 1136  567 97  3  0  0  0
40  0      0  68428  37356 576128    0    0     0     0 1109  563 99  1  0  0  0
40  0      0  68008  37356 576128    0    0     0     0 1050  559 99  1  0  0  0
40  0      0  66068  37356 576128    0    0     0     4 1129  558 99  1  0  0  0
37  0      0  70276  36848 567108    0    0     0    40 1097  606 95  5  0  0  0
41  0      0  66184  35716 562760    0    0     0     0 1128  597 96  4  0  0  0
41  0      0  74056  35196 554404    0    0     0     0 1112  599 97  3  0  0  0
40  0      0  75300  34956 552864    0    0     0     0 1119  572 97  3  0  0  0
41  0      0  75064  34956 552864    0    0     0     0 1147  604 99  1  0  0  0
41  0      0  74560  34956 552868    0    0     0     4 1157  589 100  0  0  0  0
41  0      0  72884  34964 552968    0    0    36    24 1227  647 97  3  0  0  0
39  0      0  66188  34964 552952    0    0     0     0 1120  564 99  1  0  0  0
41  0      0  66064  34964 552952    0    0     0     0 1095  602 100  0  0  0  0
40  0      0  64576  34964 552912    0    0     0     0 1180  620 98  2  0  0  0
41  0      0  75468  34872 541228    0    0     0     0 1220  583 98  2  0  0  0
41  0      0  75280  34880 541228    0    0     0    40 1132  576 99  1  0  0  0
41  0      0  73912  34880 541228    0    0     0     0 1201  585 98  2  0  0  0
41  0      0  73168  34880 541232    0    0     0     0 1126  557 100  0  0  0  0
41  0      0  69688  34880 541232    0    0     0     0 1139  565 99  1  0  0  0
40  0      0  65844  34880 541232    0    0     0     0 1151  584 99  1  0  0  0
39  0      0  64604  34888 541232    0    0     0    12 1116  629 100  0  0  0  0
41  0      0  69936  34888 535140    0    0     0     0 1067  599 98  2  0  0  0
40  0      0  75516  34580 529816    0    0     0     0 1109  627 98  2  0  0  0
40  0      0  70300  34580 529816    0    0     0     0 1127  632 99  1  0  0  0
40  0      0  65092  34580 529820    0    0     0    24 1131  613 99  1  0  0  0
39  0      0  75632  34532 518104    0    0     0    40 1144  646 98  2  0  0  0
39  0      0  69184  34532 518104    0    0     0     0 1235  632 98  2  0  0  0
40  0      0  67440  34532 518148    0    0     0     0 1113  619 99  1  0  0  0
41  0    100  69672  34520 512220    0   96     0    96 1087  618 99  1  0  0  0
40  0    132  74996  34512 502032    0   32     0    72 1205  657 97  3  0  0  0
40  0    132  74624  34520 502040    0    0     0    56 1153  621 100  0  0  0  0
40  0    132  69540  34520 502044    0    0     0     0 1058  590 98  2  0  0  0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
40  0    132  68548  34520 502044    0    0     0     0 1143  638 98  2  0  0  0
40  0    132  67064  34520 502068    0    0     0     0 1213  658 100  0  0  0  0
40  0    132  66692  34520 502084    0    0     0     0 1107  633 98  2  0  0  0
40  0    132  65948  34520 502088    0    0     0     0 1131  625 99  1  0  0  0
40  0    264  75612  33328 491516    0  128     0   168 1244  662 98  2  0  0  0
41  0    264  74124  33328 491520    0    0     0     0 1135  615 98  2  0  0  0
40  0    264  72512  33328 491464    0    0     0     0 1172  661 98  2  0  0  0
39  0    264  71520  33328 491472    0    0     0     0 1270  656 99  1  0  0  0
40  0    264  71024  33328 491492    0    0     0     0 1133  611 99  1  0  0  0
39  0    264  70776  33336 491500    0    0     0    48 1137  683 98  2  0  0  0
40  0    264  69512  33336 491696    0    0   188     0 1198  649 99  1  0  0  0
40  0    256  68868  33336 491884  148    0   304     0 1167  722 98  2  0  0  0
37  0    252  68628  33336 491824   32    0    32     0 1210  656 99  1  0  0  0
39  0    252  68132  33336 491784    0    0     0     0 1195  645 98  2  0  0  0
39  0    252  66892  33344 491784    0    0     0    64 1124  635 100  0  0  0  0
40  0    252  66520  33344 491784    0    0     0     0 1107  625 99  1  0  0  0
40  0    252  66288  33344 491788    0    0     0     0 1130  611 98  2  0  0  0
40  0    252  66296  33344 491788    0    0     0     0 1083  602 100  0  0  0  0
41  0    252  65612  33344 491788    0    0     0     0 1098  563 99  1  0  0  0
41  0    252  64612  33352 491788    0    0     0    12 1113  592 99  1  0  0  0
41  0    252  64496  33352 491792    0    0     0     0 1159  585 100  0  0  0  0
40  0    344  75788  32108 480748    0   92     0    92 1140  588 97  3  0  0  0
41  0    344  75656  32108 480748    0    0     0     0 1132  579 99  1  0  0  0
41  0    344  75408  32108 480748    0    0     0     0 1137  579 100  0  0  0  0
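
For reference, a small Python sketch of how output like this can be
summarised. It uses a handful of rows copied from the vmstat run above,
with each wrapped row joined onto one line. The column names follow the
vmstat header and the helper name is made up:

```python
# Column names as printed in the vmstat header above.
HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

# A few rows copied from the output above (wrapped lines joined).
SAMPLE = """\
 5  0      0 1199252  37252 575968    0    0     2    11   46   48  1  0 99  0  0
16  0      0 313012  37288 576036    0    0     0     0 1162  658 95  5  0  0  0
40  0      0  68428  37356 576128    0    0     0     0 1109  563 99  1  0  0  0
40  0    264  75612  33328 491516    0  128     0   168 1244  662 98  2  0  0  0
41  0    344  75408  32108 480748    0    0     0     0 1137  579 100  0  0  0  0
"""


def parse_vmstat(text):
    """Turn vmstat data rows into dicts keyed by column name; skip headers."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(HEADER) and fields[0].isdigit():
            rows.append(dict(zip(HEADER, map(int, fields))))
    return rows


rows = parse_vmstat(SAMPLE)
print("peak run queue (r):", max(row["r"] for row in rows))
print("peak swap used (swpd):", max(row["swpd"] for row in rows))
print("lowest idle % (id):", min(row["id"] for row in rows))
```

The same function should work on the full output once each wrapped row
is back on a single line.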



