Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/09/27 15:46:11 UTC
[Bug 3828] New: spamd parent stops accepting requests
http://bugzilla.spamassassin.org/show_bug.cgi?id=3828
Summary: spamd parent stops accepting requests
Product: Spamassassin
Version: 3.0.0
Platform: Other
OS/Version: other
Status: NEW
Severity: major
Priority: P5
Component: spamc/spamd
AssignedTo: dev@spamassassin.apache.org
ReportedBy: dallase@nmgi.com
I'm currently trying to track down another spamd problem that seems to be
pretty random. It happens about once a day on a 4-box spamd cluster. One of
the boxes will just drop to 0 load and spamd will apparently be hung.
Restarting spamd makes it take right off again. Which box it occurs on is
random.
Also, over the weekend I caught another box on another network doing similar
things. I have some ps and strace info from each, though I'm not sure how much
it will help until I can debug this further. There is nothing in the spamd log
that indicates it has stopped. The weird thing I always see when it happens is
that port 783/tcp shows as filtered until I restart spamd.
Here is a little info. First, a process listing, 15 children. Parent is 17752.
[root@spamd2 root]# ps auxwww | grep spamd
root 1217 0.0 0.0 1340 60 ? S Jun23 0:00 supervise spamd
root 17752 0.0 2.8 23652 21888 ? S Sep22 0:05 /usr/bin/perl -T -w /usr/bin/spamd --syslog-socket=none -A 127.0.0.0/8,24.225.0.0/24 -i 0.0.0.0 -q -x -m 15 --max-conn-per-child 25
root 28432 0.0 3.6 29472 27596 ? S Sep22 0:07 spamd child
root 28511 0.0 3.2 26916 25136 ? S Sep22 0:08 spamd child
root 28520 0.0 3.2 26252 24532 ? S Sep22 0:06 spamd child
root 28702 0.0 3.6 29756 27884 ? S Sep22 0:07 spamd child
root 28832 0.0 3.2 27016 25220 ? S Sep22 0:04 spamd child
root 28842 0.0 3.2 26892 25100 ? S Sep22 0:02 spamd child
root 28882 0.0 3.0 25116 23400 ? S Sep22 0:01 spamd child
root 26097 0.0 3.2 26780 24916 ? S 04:11 0:06 spamd child
root 26142 0.0 3.1 25848 24052 ? S 04:11 0:03 spamd child
root 26147 0.0 3.5 28636 26840 ? S 04:11 0:07 spamd child
root 26148 0.0 3.2 26416 24540 ? S 04:11 0:08 spamd child
root 26213 0.0 3.3 27600 25804 ? S 04:12 0:04 spamd child
root 26238 0.0 3.1 26132 24284 ? S 04:12 0:02 spamd child
root 26283 0.0 3.0 25292 23580 ? S 04:12 0:02 spamd child
root 26308 0.0 3.1 26032 24300 ? S 04:13 0:03 spamd child
root 28492 0.0 0.0 1764 592 pts/4 R 08:12 0:00 grep spamd
Running echo | spamc never returns. Stracing the parent shows it sitting in
pause().
[root@spamd2 root]# strace -p 17752
pause(
Checking the TCP socket shows it as filtered?
[root@spamd2 root]# nmap -sT -p783 localhost
Interesting ports on localhost.localdomain (127.0.0.1):
Port State Service
783/tcp filtered hp-alarm-mgr
Restarting spamd makes it available again:
[root@spamd2 root]# qmailctl restart
[root@spamd2 root]# nmap -sT -p783 localhost
Interesting ports on localhost.localdomain (127.0.0.1):
Port State Service
783/tcp open hp-alarm-mgr
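The nmap check above can be approximated with a plain TCP connect, which may be handy for a monitoring script on the cluster. This is only a minimal sketch: the function name probe_port and the 3-second timeout are made up for illustration, and the open/closed/filtered labels are a rough imitation of nmap's connect-scan states, not its exact logic.

```python
import socket

def probe_port(host, port, timeout=3.0):
    """Classify a TCP port from a single connect attempt.

    Returns 'open' if the connection is accepted, 'closed' if it is
    actively refused, and 'filtered' if the attempt times out or fails
    in some other way (hypothetical labels, loosely following nmap).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing is listening.
        return "closed"
    except OSError:
        # Timeout, unreachable host, and similar: no usable answer.
        return "filtered"

if __name__ == "__main__":
    # 783 is the spamd port used in this report.
    print(probe_port("localhost", 783))
```

A 'filtered' result on localhost while the spamd parent is still running would match the state shown in the first nmap run.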
-------------------------------------------
And another, although this one looks a little different: very long run times
on the spamd procs (looks like a DoS again?).
11:55pm up 175 days, 3:50, 1 user, load average: 2.52, 2.34, 2.29
142 processes: 136 sleeping, 4 running, 2 zombie, 0 stopped
CPU states: 99.0% user, 0.9% system, 0.0% nice, 0.0% idle
Mem: 253876K av, 232444K used, 21432K free, 0K shrd, 40336K buff
Swap: 522104K av, 177936K used, 344168K free 64336K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
31797 root 15 0 26480 196 52 R 46.9 0.0 16781m spamd
23746 root 14 0 26052 36 4 R 46.7 0.0 940:22 spamd
echo | spamc returns nothing:
[root@email root]# echo | spamc
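Since a hung spamd makes echo | spamc block indefinitely, the same check could be wrapped in a timeout so a script can detect the hang instead of waiting forever. The following is a rough sketch, not anything spamc or spamd ships with: the helper name responds_in_time and the 10-second default are assumptions, and it only reports whether the command returned in time.

```python
import subprocess

def responds_in_time(cmd, timeout=10.0, stdin_data=b"\n"):
    """Return True if cmd exits within `timeout` seconds, False if it hangs.

    stdin_data is fed to the command on standard input; the default of a
    single newline mimics `echo | spamc`. The child is killed on timeout.
    Raises FileNotFoundError if the command is not installed.
    """
    try:
        subprocess.run(
            cmd,
            input=stdin_data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return True
    except subprocess.TimeoutExpired:
        return False

# Example use on a box with spamc installed:
#     if not responds_in_time(["spamc"]):
#         print("spamc did not return; spamd may be hung")
```

Note that a return within the timeout only shows that spamc did not hang. It does not by itself confirm that spamd actually processed the message.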
Process listing, running with 2 children:
[root@email root]# ps auxwww | grep spamd
root 8153 0.0 0.3 25024 904 ? S Aug26 0:49 /usr/bin/perl -T -w /usr/bin/spamd --syslog-socket=none -q -x -m 4 --max-conn-per-child 25
root 10541 0.0 2.7 25384 6920 ? S 20:03 0:00 spamd child
root 25227 16.1 8.1 27432 20664 ? S 23:56 0:05 spamd child
Stracing the parent shows it in wait4(); the children are in read().
[root@email root]# strace -p 8153
wait4(-1,
[root@email root]# strace -p 25227
read(6, <unfinished ...>
[root@email root]# strace -p 10541
read(6,
Nothing shows up in the spamd debug log until spamd is restarted.
--------------------------------------------
Hopefully I'll discover more as it happens again.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.