Posted to users@spamassassin.apache.org by "Rosenbaum, Larry M." <ro...@ornl.gov> on 2007/08/24 18:55:27 UTC

spamd children killed but don't die?

This morning on one of our servers, spamd was having problems.  There
were 8 spamd children running, but "top" showed only two of them were
using any CPU time even though there was a backlog of messages to be
processed.  The log file included lines like this:

Aug 24 08:49:22 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:23 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:32 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:41 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:49 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:51 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:49:59 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:50:00 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:50:04 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:50:06 localhost spamd[21051]: prefork: child states: BBBKKKKK
Aug 24 08:50:12 localhost spamd[21051]: prefork: child states: BBBKKKKK

which I think means 3 children busy, 5 children waiting to die.  This
(the multiple "K" children) had been going on for a few hours, which
prevented new children from being spawned to handle the load.
Restarting spamd via "kill -HUP" restored normal operation.
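
To check how long a pattern like this persisted in a log, a small script can tally the state letters on each "prefork: child states" line. The following is only a minimal Python sketch, not part of spamd. It assumes the syslog line format shown above and the reading given here (B = busy, K = killed but not yet gone); any other letter is simply counted. The function names tally_states and stuck_lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the state string in lines such as
# "... spamd[21051]: prefork: child states: BBBKKKKK"
STATE_RE = re.compile(r"prefork: child states: (\S+)")


def tally_states(line):
    """Return a Counter of state letters, or None if the line has no state string."""
    m = STATE_RE.search(line)
    return Counter(m.group(1)) if m else None


def stuck_lines(lines, min_killed=1):
    """Yield (line, counts) for lines with at least min_killed children in state K."""
    for line in lines:
        counts = tally_states(line)
        if counts is not None and counts["K"] >= min_killed:
            yield line, counts


if __name__ == "__main__":
    sample = [
        "Aug 24 08:49:22 localhost spamd[21051]: prefork: child states: BBBKKKKK",
        "Aug 24 08:50:12 localhost spamd[21051]: prefork: child states: BBBKKKKK",
    ]
    for line, counts in stuck_lines(sample):
        # Print the timestamp prefix and the per-state counts.
        print(line[:15], dict(counts))
```

Feeding the whole log file through stuck_lines would show the first and last timestamps at which any child was in state K, which is one way to confirm the "few hours" estimate.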

Why were the killed processes not dying?

System information:

SunOS email 5.9 Generic_118558-39 sun4u sparc SUNW,Sun-Fire-V210
SpamAssassin Server version 3.2.3
  running on Perl 5.8.8
  with SSL support (IO::Socket::SSL 0.97)
  with zlib support (Compress::Zlib 1.41)

Process information (combination of "top" and "ps"):

Fri Aug 24 08:56:52 2007  last pid: 23996;  load averages:  0.55,  0.54,  0.52
192 processes: 190 sleeping, 2 on cpu
CPU states: 88.4% idle,  4.2% user,  3.4% kernel,  3.9% iowait,  0.0% swap
Memory: 2048M real, 342M free, 1323M swap in use, 6109M swap free

USER       PID   PPID       STIME     TIME    STATE    SIZE     RES      CPU
spamd    23740  21051    08:55:39     0:17    sleep     72M     59M   11.40%
spamd    20459  21051    08:16:13     7:08    cpu/1     77M     66M    6.27%
root     21051      1    09:47:26     2:17    sleep     66M     58M    0.03%
spamd    27830  21051    03:39:24     5:28    sleep     81M     70M    0.00%
spamd    27926  21051    03:39:37     0:26    sleep     76M     63M    0.00%
spamd    14411  21051    00:53:09     0:11    sleep     70M     51M    0.00%
spamd    22780  21051    02:37:46     0:06    sleep     71M     57M    0.00%
spamd    22775  21051    02:37:32     0:04    sleep     70M     56M    0.00%
spamd    22776  21051    02:37:32     0:01    sleep     68M     54M    0.00%
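
To line this listing up against the log, one could pick out the children of the parent (PID 21051) that show no CPU use. The sketch below is a rough illustration only: it assumes each process occupies a single line with the nine columns named in the header, and the function name idle_children is hypothetical. The listing does not show spamd's internal child state, so this only narrows down candidates; it does not by itself say which children are the "K" ones.

```python
def idle_children(rows, parent_pid):
    """Return PIDs whose PPID is parent_pid and whose CPU column reads 0.00%."""
    idle = []
    for row in rows:
        fields = row.split()
        # Expected columns: USER PID PPID STIME TIME STATE SIZE RES CPU
        if len(fields) != 9 or fields[1] == "PID":
            continue  # skip the header and anything that is not a process row
        pid, ppid, cpu = int(fields[1]), int(fields[2]), fields[8]
        if ppid == parent_pid and float(cpu.rstrip("%")) == 0.0:
            idle.append(pid)
    return idle


if __name__ == "__main__":
    listing = [
        "USER       PID   PPID       STIME     TIME    STATE    SIZE     RES      CPU",
        "spamd    23740  21051    08:55:39     0:17    sleep     72M     59M   11.40%",
        "root     21051      1    09:47:26     2:17    sleep     66M     58M    0.03%",
        "spamd    27830  21051    03:39:24     5:28    sleep     81M     70M    0.00%",
    ]
    print(idle_children(listing, 21051))
```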

spamd startup command:

ulimit -n 256
spamd -d -u spamd -r $pidfile -x -m 8 --syslog=local2 \
      --syslog-socket=inet -i -A $me,$em1,$em2,$em3,$em4