Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/09/27 15:46:11 UTC

[Bug 3828] New: spamd parent stops accepting requests

http://bugzilla.spamassassin.org/show_bug.cgi?id=3828

           Summary: spamd parent stops accepting requests
           Product: Spamassassin
           Version: 3.0.0
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: major
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: dallase@nmgi.com


I'm currently trying to track down another spamd problem (or problems) that 
seems to be pretty random.  It happens about once a day on a 4-box spamd 
cluster.  One of the boxes will just drop to 0 load and spamd will apparently 
be hung.  Restarting spamd makes it take right off again.  Which box it occurs 
on is random.

Also, over the weekend I caught another box on another network doing similar 
things.  I have some ps and strace info from each, though I'm not sure how much 
it will help until I can debug this further.  There is nothing in the spamd 
log that indicates it has stopped.  The weird thing I always see when it 
happens is that port 783/tcp shows as filtered until I restart spamd.

Here is a little info.  First, a process listing, 15 children.  Parent is 17752.


[root@spamd2 root]# ps auxwww  | grep spamd
root      1217  0.0  0.0  1340   60 ?        S    Jun23   0:00 supervise spamd
root     17752  0.0  2.8 23652 21888 ?       S    Sep22   0:05 /usr/bin/perl -T -w /usr/bin/spamd --syslog-socket=none -A 127.0.0.0/8,24.225.0.0/24 -i 0.0.0.0 -q -x -m 15 --max-conn-per-child 25
root     28432  0.0  3.6 29472 27596 ?       S    Sep22   0:07 spamd child
root     28511  0.0  3.2 26916 25136 ?       S    Sep22   0:08 spamd child
root     28520  0.0  3.2 26252 24532 ?       S    Sep22   0:06 spamd child
root     28702  0.0  3.6 29756 27884 ?       S    Sep22   0:07 spamd child
root     28832  0.0  3.2 27016 25220 ?       S    Sep22   0:04 spamd child
root     28842  0.0  3.2 26892 25100 ?       S    Sep22   0:02 spamd child
root     28882  0.0  3.0 25116 23400 ?       S    Sep22   0:01 spamd child
root     26097  0.0  3.2 26780 24916 ?       S    04:11   0:06 spamd child
root     26142  0.0  3.1 25848 24052 ?       S    04:11   0:03 spamd child
root     26147  0.0  3.5 28636 26840 ?       S    04:11   0:07 spamd child
root     26148  0.0  3.2 26416 24540 ?       S    04:11   0:08 spamd child
root     26213  0.0  3.3 27600 25804 ?       S    04:12   0:04 spamd child
root     26238  0.0  3.1 26132 24284 ?       S    04:12   0:02 spamd child
root     26283  0.0  3.0 25292 23580 ?       S    04:12   0:02 spamd child
root     26308  0.0  3.1 26032 24300 ?       S    04:13   0:03 spamd child
root     28492  0.0  0.0  1764  592 pts/4    R    08:12   0:00 grep spamd

Running echo | spamc never returns.  Stracing the parent shows it sitting in 
pause():

[root@spamd2 root]# strace -p 17752
pause(

Checking the TCP port with nmap shows it as filtered, which is odd:

[root@spamd2 root]# nmap -sT -p783 localhost
Interesting ports on localhost.localdomain (127.0.0.1):
Port       State       Service
783/tcp    filtered    hp-alarm-mgr

Restarting spamd makes it available again:

[root@spamd2 root]# qmailctl restart
[root@spamd2 root]# nmap -sT -p783 localhost
Interesting ports on localhost.localdomain (127.0.0.1):
Port       State       Service
783/tcp    open        hp-alarm-mgr
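
The same check can be scripted without nmap.  What follows is only a minimal
Python sketch and not part of the setup described in this report: probe_port
and its 3-second default timeout are hypothetical choices.  It mimics a
connect() scan: a completed connect is "open", a refused one is "closed", and
no answer at all before the timeout is "filtered", which is the state seen
above while spamd is hung.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port by the outcome of a plain connect().

    "open"     - the connection was established
    "closed"   - the peer refused the connection
    "filtered" - no answer arrived before the timeout
    Any other OSError (for example, an unreachable host) is left to
    propagate to the caller.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    finally:
        # Always release the socket, whatever the outcome was.
        sock.close()
```

Calling probe_port("localhost", 783) from a periodic job and logging the result
with a timestamp would show when the port goes from open to filtered.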


-------------------------------------------

And another case, although this one looks a little different: very long run 
times on the spamd processes (looks like a DoS again?).

 11:55pm  up 175 days,  3:50,  1 user,  load average: 2.52, 2.34, 2.29
142 processes: 136 sleeping, 4 running, 2 zombie, 0 stopped
CPU states: 99.0% user,  0.9% system,  0.0% nice,  0.0% idle
Mem:   253876K av,  232444K used,   21432K free,       0K shrd,   40336K buff
Swap:  522104K av,  177936K used,  344168K free                   64336K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
31797 root      15   0 26480  196    52 R    46.9  0.0 16781m spamd
23746 root      14   0 26052   36     4 R    46.7  0.0 940:22 spamd

echo | spamc returns nothing:

[root@email root]# echo | spamc
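
Since the hang shows up as a client call that produces nothing, it can be
detected by putting a deadline on that call.  The sketch below is an
illustration only, not something taken from the affected boxes:
responds_within is a hypothetical helper, and the timeout is whatever the
caller picks.  It runs a command with a newline on stdin, as echo | spamc
does, and reports whether the command finished in time.

```python
import subprocess


def responds_within(cmd, timeout, stdin_data=b"\n"):
    """Run cmd with stdin_data on its stdin.

    Returns True if the command exits within `timeout` seconds and
    False if it was still running at the deadline (subprocess.run
    kills the child before raising TimeoutExpired).
    The exit status of the command is deliberately ignored here.
    """
    try:
        subprocess.run(
            cmd,
            input=stdin_data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return True
    except subprocess.TimeoutExpired:
        return False
```

For example, responds_within(["spamc"], timeout=30) returning False would be
the scripted equivalent of the hung prompt above, and could be used to trigger
collecting the ps and strace output before spamd is restarted.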

Process listing, running with 2 children:

[root@email root]# ps auxwww | grep spamd
root      8153  0.0  0.3 25024  904 ?        S    Aug26   0:49 /usr/bin/perl -T -w /usr/bin/spamd --syslog-socket=none -q -x -m 4 --max-conn-per-child 25
root     10541  0.0  2.7 25384 6920 ?        S    20:03   0:00 spamd child
root     25227 16.1  8.1 27432 20664 ?       S    23:56   0:05 spamd child

Stracing the parent shows it in wait4(); the children are in read():

[root@email root]# strace -p 8153
wait4(-1,

[root@email root]# strace -p 25227
read(6,  <unfinished ...>

[root@email root]# strace -p 10541
read(6,

Nothing appears in the spamd debug log until spamd is restarted.

--------------------------------------------

Hopefully I'll discover more when it happens again.


