Posted to users@spamassassin.apache.org by le...@srs.gov on 2005/05/12 15:07:07 UTC

SA Performance under Solaris -w- Sendmail

I've been experiencing and documenting a pretty severe performance problem 
with SA versions 3.0.1 through 3.1x (nightly) under Solaris 8 and 9, with 
Perl 5.8.3.

We're running Sendmail 8.12.11, and I've tried milters MimeDefang and 
Spamass-milter. 

I initially thought this problem was related to the "round-robin" forking 
of spamd, but I find that the 3.1 nightly exhibits the same behavior 
(using the new pre-forking algorithm), regardless of the number of spamd 
children (which MimeDefang doesn't appear to use anyway).
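
For context, the child count here is whatever spamd is started with; a 
minimal sketch of the invocation I've been varying (the path is 
illustrative for my install):

    # daemonize spamd with a fixed number of scanning children
    /usr/local/bin/spamd -d --max-children=5 -r /var/run/spamd.pid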

My problem is that when running a test load of about 350 messages through 
a test box (serially, so we're only talking one message at a time), I see 
the CPU pegged roughly 90% of the time.  At other times during this 
cycle, the CPU load is between 30 and 70%, which I'd call acceptable.  In 
our production environment, where processing is not serial (multiple 
sendmail threads running), the CPU load kills the machine dead in short 
order.

When MimeDefang is used as the milter, I see a fairly even spread of CPU 
usage between user, system, and wait times.  With the spamass-milter, I 
see almost all of the CPU consumed by user processing.
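
For reference, the user/system/wait split I'm describing is the one the 
stock Solaris tools report, e.g.:

    # five 5-second samples; columns are %usr, %sys, %wio, %idle
    sar -u 5 5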

The SA results look much "better" when we use spamass-milter / spamd, as I 
think MimeDefang doesn't round the scores up.

I'm wondering if there's some kind of Perl or Solaris "tuning" that I 
might need to do in order to not kill the CPU so badly.  I've tried 
"nice"ing spamd, but that really didn't do much for the problem.

Anyone have any ideas or suggestions of places to look?

Thanks!

Re: SA Performance under Solaris -w- Sendmail

Posted by Alex S Moore <as...@edge.net>.
leonard.gray@srs.gov wrote:
> The production server we are trying to run on only has 128MB of memory.
> I can't believe we got a machine with that little, but it happened.  I
> might try running only two spamd children, refreshing the processes
> every 5 messages or so, to see if that will work, but I'd say the
> machine is a little light on the horsepower.

Glad to hear that you found the problem.  I do not know about 
horsepower, but 128MB of memory sounds hopeless, or at least very limiting. :>

Alex

Re: SA Performance under Solaris -w- Sendmail

Posted by le...@srs.gov.
Thanks for your efforts and tests.  I think I found the problem.

The production server we are trying to run on only has 128MB of memory.  I 
can't believe we got a machine with that little, but it happened.  I might 
try running only two spamd children, refreshing the processes every 5 
messages or so, to see if that will work, but I'd say the machine is a 
little light on the horsepower.
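
A minimal sketch of what I have in mind, assuming the 3.1 pre-forking 
spamd's options (the path is illustrative):

    # two children, each recycled after handling 5 connections,
    # so a leaky child can't grow unbounded on a 128MB box
    /usr/local/bin/spamd -d --max-children=2 --max-conn-per-child=5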

Thanks again!

Re: SA Performance under Solaris -w- Sendmail

Posted by Alex S Moore <as...@edge.net>.
leonard.gray@srs.gov wrote:
> 
> I've been experiencing and documenting a pretty severe performance 
> problem with SA versions 3.0.1 through 3.1x (nightly)  under Solaris 8 
> and 9, Perl 5.8.3.

I ran the test differently and got different results: I sent the 573 
messages from a different host.  Both the send and the processing in MD 
finished in a fraction of the time required for my earlier test. 
Everything still seemed fine.  What do you think?

----
highest and typical prstat output:
[amoore@mcsrv5 tmp]$ prstat -n 18 2
    PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  13757 defang     40M   32M run     33    0   0:00:16 7.0% mimedefang-mult/1
   9387 defang     41M   33M run     32    0   0:00:28 5.6% mimedefang-mult/1
  13734 defang     40M   32M run     32    0   0:00:17 5.6% mimedefang-mult/1
  13732 defang     40M   31M run     33    0   0:00:17 5.2% mimedefang-mult/1
  13738 defang     40M   31M run     55    0   0:00:17 4.7% mimedefang-mult/1
  13835 defang     40M   31M run     29    0   0:00:15 4.7% mimedefang-mult/1
  13736 defang     40M   31M run     32    0   0:00:16 4.4% mimedefang-mult/1
  13851 defang     40M   31M run     39    0   0:00:14 4.3% mimedefang-mult/1
  13854 defang     39M   31M cpu0    49    0   0:00:11 3.3% mimedefang-mult/1
  15675 defang     38M   28M sleep   53    0   0:00:02 3.2% mimedefang-mult/1
  28308 defang     15M   11M sleep   59    0   0:00:21 2.3% clamd/4
  16192 root     2280K 1064K sleep   59    0   0:28:27 0.4% nfsd/5
   7466 root       14M 3184K run     30    0   0:00:02 0.4% sendmail/1
    232 root     3816K 1376K sleep   59    0   0:03:27 0.4% syslogd/15
  17701 root     6544K 5216K sleep   59    0   0:00:22 0.3% authdaemond/1
   8345 defang     37M   31M sleep   59    0   0:00:23 0.3% mimedefang-mult/1
  15620 amoore   4672K 4360K cpu1    59    0   0:00:00 0.2% prstat/1
   6097 root     6136K 4744K sleep   59    0   0:02:16 0.2% fam/1
Total: 138 processes, 297 lwps, load averages: 6.47, 2.95, 1.21
[amoore@mcsrv5 tmp]$ prstat -n 18 2
    PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   9387 defang     44M   36M cpu0    10    0   0:00:59  22% mimedefang-mult/1
  28308 defang     15M   11M sleep   59    0   0:00:23 1.0% clamd/4
  16062 root       15M 5368K sleep   47    0   0:00:01 0.9% sendmail/1
   8357 defang   4632K 2152K sleep   59    0   0:01:30 0.5% mimedefang/3
  13854 defang     41M   32M sleep   59    0   0:00:15 0.3% mimedefang-mult/1
  13757 defang     40M   32M sleep   59    0   0:00:20 0.3% mimedefang-mult/1
  13835 defang     40M   32M sleep   59    0   0:00:19 0.2% mimedefang-mult/1
  13851 defang     40M   31M sleep   59    0   0:00:17 0.2% mimedefang-mult/1
  13738 defang     40M   32M sleep   59    0   0:00:20 0.2% mimedefang-mult/1
  13732 defang     40M   32M sleep   59    0   0:00:21 0.2% mimedefang-mult/1
  13736 defang     41M   32M sleep   59    0   0:00:19 0.2% mimedefang-mult/1
  16048 amoore   4672K 4368K cpu1    59    0   0:00:00 0.2% prstat/1
  13734 defang     40M   32M sleep   59    0   0:00:21 0.2% mimedefang-mult/1
  28314 defang   6688K 2336K sleep   59    0   0:00:03 0.2% clamav-milter/3
  15675 defang     38M   29M sleep   59    0   0:00:05 0.2% mimedefang-mult/1
   6097 root     6136K 4744K sleep   59    0   0:02:16 0.1% fam/1
  17700 root     4056K 2728K sleep   59    0   0:00:03 0.1% authdaemond/1
   8345 defang     37M   31M sleep   59    0   0:00:23 0.1% mimedefang-mult/1
Total: 120 processes, 262 lwps, load averages: 3.26, 3.02, 1.42
[amoore@mcsrv5 tmp]$

----
log for MD process that did most of the work:
May 12 10:47:05 mcsrv5 mimedefang-multiplexor[8345]: [ID 638987 mail.info] Slave 1 resource usage: req=500, scans=500, user=233.180, sys=10.460, nswap=0, majflt=0, minflt=0, maxrss=0, bi=0, bo=0

----

Alex




Re: SA Performance under Solaris -w- Sendmail

Posted by Alex S Moore <as...@edge.net>.
leonard.gray@srs.gov wrote:
> 
> I've been experiencing and documenting a pretty severe performance 
> problem with SA versions 3.0.1 through 3.1x (nightly)  under Solaris 8 
> and 9, Perl 5.8.3.

What is the simplest way for me to see this problem?  I use CSW packages 
for sendmail, MD, SA, perl, and others, running Solaris 9 on a small 
V210 with dual SPARC CPUs and 2GB RAM.  I have not seen any large spikes 
in CPU usage, but my volumes may be too low.

I would like to simulate your test with MD.  I have spamass-milter 
available if needed.  Also, I have a single 360MHz SPARC Solaris 8 or 10 
box with plenty of RAM available for testing, but it may be too far from 
production horsepower.  I also have a dual 450MHz SPARC Solaris 10 box 
with plenty of RAM that I can use; actually, that one may be the 
simplest for me to use for a test.

Alex

Re: SA Performance under Solaris -w- Sendmail

Posted by Alex S Moore <as...@edge.net>.
leonard.gray@srs.gov wrote:
> 
> I've been experiencing and documenting a pretty severe performance 
> problem with SA versions 3.0.1 through 3.1x (nightly)  under Solaris 8 
> and 9, Perl 5.8.3.
> 

This may not be much help.  I put 573 messages in a subfolder and ran 
the following script while watching `prstat -n 10 2`; the typical and 
highest outputs follow.  It seemed fine to me.  The script, MD, and 
everything else ran on a V210, which is also my courier-imap server and 
exports the home directories.  While the script was dumping messages, 
courier was fine.  Also, clamav was running from clamav-milter, and MD 
also runs clamav using clamd.sock.  Should I disable clamav for this test?

Are you running MD with embedded perl?  That may not help for your 
tests, but it should help in production.  Have you considered using the 
CSW packages from www.blastwave.org?  Everything that you need should be 
available there, and I do recommend MIMEDefang instead of spamass-milter. 
Maybe the CSW packages are compiled differently from what you have.
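
For reference, a rough sketch of what I mean by embedded perl; the flags 
and paths are illustrative, so check mimedefang-multiplexor(8) for your 
build:

    # -E embeds a persistent Perl interpreter in the slaves instead of
    # re-exec'ing mimedefang.pl for every scan; -m/-x bound the slave
    # count, -r recycles a slave after 500 requests
    /opt/csw/bin/mimedefang-multiplexor -E -m 2 -x 10 -r 500 \
        -s /var/spool/MIMEDefang/mimedefang-multiplexor.sock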

Also, you did not give a summary of your hardware.  My small V210 is 
dual-CPU with 2GB RAM and RAID1 for most directories, including 
/export/home, using Solaris Volume Manager.

---
The script:
#!/bin/sh
# replay each saved message through the local sendmail, one at a time
cd /export/home/amoore/Maildir/.Mail.Hold/cur || exit 1
for file in *
do
        /opt/csw/lib/sendmail -f amoore@localhost sunuser2@localhost < "$file"
done

---
A couple of outputs from prstat:
[root@mcsrv5 /]# prstat -n 10 2
    PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   7039 defang     47M   40M cpu1    30    0   0:01:36  20% mimedefang-mult/1
   6097 root     6136K 4744K sleep   59    0   0:01:56 0.5% fam/1
  28308 defang     15M   10M sleep   59    0   0:00:11 0.3% clamd/3
    217 root      124M  123M sleep   59    0   1:26:55 0.2% automountd/2
  10506 root     4656K 4336K cpu0    59    0   0:00:00 0.2% prstat/1
   7466 root       14M 3184K sleep   49    0   0:00:01 0.2% sendmail/1
   8345 defang     37M   31M sleep   59    0   0:00:21 0.2% mimedefang-mult/1
  11068 root      108M  106M sleep   59    0   0:05:30 0.1% nscd/22
  11022 amoore   7264K 4336K sleep   54    0   0:00:00 0.1% sendmail/1
  17701 root     6360K 5032K sleep   59    0   0:00:20 0.1% authdaemond/1
Total: 117 processes, 253 lwps, load averages: 0.67, 0.46, 0.25
[root@mcsrv5 /]# prstat -n 10 2
    PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   7039 defang     48M   40M run      9    0   0:01:41  21% mimedefang-mult/1
   9387 defang     38M   28M run     11    0   0:00:01 3.8% mimedefang-mult/1
  11066 defang     38M   26M sleep   50    0   0:00:00 1.6% mimedefang-mult/1
   6097 root     6136K 4744K sleep   59    0   0:01:56 0.6% fam/1
  28308 defang     15M   10M sleep   59    0   0:00:11 0.5% clamd/3
  11172 defang     37M   17M cpu0    29    0   0:00:00 0.4% mimedefang-mult/1
  11047 root       15M 5352K sleep   59    0   0:00:00 0.2% sendmail/1
   7466 root       14M 3184K sleep   38    0   0:00:01 0.2% sendmail/1
   8357 defang   4240K 1344K sleep   59    0   0:01:22 0.2% mimedefang/6
  11168 amoore   7248K 4320K sleep   39    0   0:00:00 0.1% sendmail/1
Total: 130 processes, 271 lwps, load averages: 0.88, 0.51, 0.27
[root@mcsrv5 /]#

---
The MD process that seemed to do most of the work:
May 12 10:00:09 mcsrv5 mimedefang-multiplexor[8345]: [ID 638987 mail.info] Slave 0 resource usage: req=500, scans=500, user=184.580, sys=8.680, nswap=0, majflt=0, minflt=0, maxrss=0, bi=0, bo=0

--

Alex