Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/12/13 16:04:58 UTC

[Bug 4026] New: spamd spawns children endlessly

http://bugzilla.spamassassin.org/show_bug.cgi?id=4026

           Summary: spamd spawns children endlessly
           Product: Spamassassin
           Version: 3.0.1
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: Alexander.Gretencord@gedoplan.de


I don't know exactly what triggers it, but it is happening again. Spamd will
not accept any new connections because the original spamd instance is spawning
one child after another, and each child seems to die instantly without
processing anything. The following is the output of a ptrace of the parent
spamd process.
 
--- 
fork()                                  = 26236 
wait4(26236, [WIFEXITED(s) && WEXITSTATUS(s) == 0], 0, NULL) = 26236 
rt_sigaction(SIGPIPE, {SIG_DFL}, {0x80933f0, [], SA_RESTART|0x4000000}, 8) = 0 
wait4(-1, 0xbffff658, WNOHANG, NULL)    = -1 ECHILD (No child processes) 
rt_sigaction(SIGCHLD, {0x80933f0, [], SA_RESTART|0x4000000}, {0x80933f0, [], 
SA_RESTART|0x4000000}, 8) = 0 
sigreturn()                             = ? (mask now []) 
--- SIGCHLD (Child exited) --- 
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_DFL}, 8) = 0 
rt_sigaction(SIGPIPE, {0x80933f0, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0 
rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0 
send(4, "<22>spamd[13358]: server hit by "..., 41, 0) = -1 ENOTCONN (Transport 
endpoint is not connected) 
fork()                                  = 26237 
wait4(26237, [WIFEXITED(s) && WEXITSTATUS(s) == 0], 0, NULL) = 26237 
rt_sigaction(SIGPIPE, {SIG_DFL}, {0x80933f0, [], SA_RESTART|0x4000000}, 8) = 0 
wait4(-1, 0xbffff658, WNOHANG, NULL)    = -1 ECHILD (No child processes) 
rt_sigaction(SIGCHLD, {0x80933f0, [], SA_RESTART|0x4000000}, {0x80933f0, [], 
SA_RESTART|0x4000000}, 8) = 0 
sigreturn()                             = ? (mask now []) 
--- SIGCHLD (Child exited) --- 
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_DFL}, 8) = 0 
rt_sigaction(SIGPIPE, {0x80933f0, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0 
rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0 
send(4, "<22>spamd[13358]: server hit by "..., 41, 0) = -1 ENOTCONN (Transport 
endpoint is not connected) 
fork()                                  = 26238 
--- 
 
No mails are scanned anymore and people start complaining about the amount of 
spam they get. This started after the upgrade from 2.63 to 3.0.1 and I will 
downgrade to 2.63 asap. Do you need any other information that could be 
helpful?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-13 14:12 -------
Ah ok thx, didn't know about that. Now I just have to wait until it happens
again, as reloading/restarting syslog did not "help" to reproduce the problem.
Perhaps the log rotation will. Anyway, time to go to bed :)



------- Additional Comments From spamass@oldach.net  2004-12-14 03:22 -------
Sorry, forgot: This is not related to syslogd.



------- Additional Comments From spamass@oldach.net  2004-12-20 02:46 -------
Created an attachment (id=2574)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=2574&action=view)
debug output for child process 8605




------- Additional Comments From spamass@oldach.net  2004-12-16 04:35 -------
I disabled razor2 (and checked that it isn't being used) - the issue still pops 
up. There might be another (undocumented?) 10 second timeout?



------- Additional Comments From spamass@oldach.net  2004-12-14 03:20 -------
I observe the same behaviour on a FreeBSD 4.10-STABLE box. Apparently, after
--max-conn-per-child has run out for *all* children, a new child is spawned for
*any* new request.



------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-13 13:45 -------
I couldn't see that because I didn't know what fd 4 was, but thx :) But then my
question is: why does spamd keep restarting its children instead of just
ignoring the fact that it can't log anymore?

In fact I would like to trace the children, but they are respawning too fast;
there are about 10 pids between two runs of ps ... suggestions?



------- Additional Comments From spamass@oldach.net  2004-12-15 21:51 -------
I don't think it is sensible to split into two bugs. What is missing in the 
original case are just the logging lines, so depending on whether you syslog or 
do a buffered write to a file it might well be the same issue. Let's keep it 
here for the moment to ease communication.

I did some further investigation: it appears that after running for some time,
the children just die away. See this syslog snippet:

Dec 16 03:33:25 merak spamd[85718]: server successfully spawned child process, 
pid 1705
Dec 16 03:33:25 merak spamd[1705]: connection from somehere.come [149.7.61.230] 
at port 1161
Dec 16 03:33:25 merak spamd[1705]: checking message <21...@web.de> for 
(unknown):27.
Dec 16 03:33:34 merak spamd[85718]: handled cleanup of child pid 1705

What strikes me is that the children die silently after exactly 10 seconds. I
am running a default configuration (no changes to local.cf) with razor2
installed. By coincidence, 10 seconds is the default for razor_timeout. Might
that be where we are losing them?

I will enable razor2 logging to see what is going on there.

Strangely, I noticed that this dying of children stops again after some random
time, and processing continues normally.
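
The lifetime of a child can be read off the log mechanically rather than by eye. The following is a minimal sketch, not part of spamd: it pairs the "successfully spawned child process" and "handled cleanup of child pid" lines from the snippet above (with the wrapped lines rejoined) and reports the seconds between them. The function names and the placeholder year are assumptions made for illustration. For the sample lines it yields 9 seconds; since syslog timestamps have one-second resolution, that only bounds the real lifetime to somewhere between 8 and 10 seconds.

```python
import re
from datetime import datetime

# Sample lines taken from the syslog snippet above, wrapped lines rejoined.
LOG = """\
Dec 16 03:33:25 merak spamd[85718]: server successfully spawned child process, pid 1705
Dec 16 03:33:25 merak spamd[1705]: connection from somehere.come [149.7.61.230] at port 1161
Dec 16 03:33:34 merak spamd[85718]: handled cleanup of child pid 1705
"""

SPAWN = re.compile(r"^(\w{3} +\d+ [\d:]{8}) .*successfully spawned child process, pid (\d+)")
CLEANUP = re.compile(r"^(\w{3} +\d+ [\d:]{8}) .*handled cleanup of child pid (\d+)")


def parse_ts(stamp):
    # syslog timestamps carry no year, so a fixed placeholder year is assumed.
    return datetime.strptime("2004 " + stamp, "%Y %b %d %H:%M:%S")


def child_lifetimes(text):
    """Map each child pid to the seconds between its spawn and cleanup lines."""
    spawned = {}
    lifetimes = {}
    for line in text.splitlines():
        match = SPAWN.match(line)
        if match:
            spawned[int(match.group(2))] = parse_ts(match.group(1))
            continue
        match = CLEANUP.match(line)
        if match and int(match.group(2)) in spawned:
            pid = int(match.group(2))
            delta = parse_ts(match.group(1)) - spawned.pop(pid)
            lifetimes[pid] = delta.total_seconds()
    return lifetimes


print(child_lifetimes(LOG))
```

Run over a full day of logs, a histogram of these values would show whether the children really cluster around one fixed timeout.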



spamass@oldach.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spamass@oldach.net





------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-20 09:22 -------
Created an attachment (id=2576)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=2576&action=view)
strace output for the master spamd process running with -D




------- Additional Comments From jm@jmason.org  2004-12-13 14:01 -------
doesn't ptrace support a "-f" switch, or similar, to follow through fork()
calls?  that's what linux strace does, and ISTR solaris ptrace having something
similar.



------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-20 09:25 -------
Created an attachment (id=2577)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=2577&action=view)
strace output for one child

Between starting the strace and hitting ctrl-c, 6 children were spawned. If I
let the strace run for a few minutes, the spamdtrace.pid file names would soon
no longer be unique. Hope this helps in some way. The system itself is a
standard Debian woody system.



------- Additional Comments From spamass@oldach.net  2004-12-14 13:12 -------
Yes, I see a different behaviour: The newly spawned processes exit IMMEDIATELY
after executing one SINGLE request, i.e. they do not observe max-conn-per-child
as the originally spawned processes do.



------- Additional Comments From jm@jmason.org  2004-12-13 10:30 -------
I think this is after syslogd has been restarted -- as you can see it's
attempting to syslog a message, getting a "transport endpoint is not connected".
 perhaps the children are attempting to log something?

1. try kill -HUP'ping the spamd after restarting syslogd, as you do with apache
et al.

2. a ptrace that includes the forked children would be helpful.
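
The ENOTCONN on the syslog send() suggests one general way a daemon can cope with a restarted syslogd: reconnect once and, if that fails, drop the message instead of treating it as fatal. The sketch below illustrates that idea only. It is not how spamd (which is written in Perl) handles logging, and `send`, `reconnect` and `log_with_reconnect` are hypothetical names introduced here.

```python
import errno


def log_with_reconnect(send, reconnect, message):
    """Try to log; on a dead socket reconnect once, then give up quietly.

    send(message) and reconnect() are hypothetical callables supplied by the
    caller. Returns True if the message was delivered, False if it was dropped.
    """
    for attempt in range(2):
        try:
            send(message)
            return True
        except OSError as exc:
            # ENOTCONN is the error seen in the trace above; the other two
            # codes are assumed to be plausible after a syslogd restart.
            if exc.errno not in (errno.ENOTCONN, errno.ECONNREFUSED, errno.EPIPE):
                raise
            if attempt == 0:
                try:
                    reconnect()
                except OSError:
                    return False
    return False
```

The point of the design is that a logging failure never propagates into the request-handling path; unrelated errors are still re-raised so they are not hidden.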



------- Additional Comments From jm@jmason.org  2004-12-14 15:03 -------
ok, could we split this double-bug into two separate bugs, then ;)   if they
turn out to be the same issue, it's easy to merge them again.  But in the
meantime, Helge, please open a new bug with your issue.



------- Additional Comments From felicity@kluge.net  2004-12-13 07:15 -------
Subject: Re:  New: spamd spawns children endlessly

On Mon, Dec 13, 2004 at 07:04:58AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> No mails are scanned anymore and people start complaining about the amount of 
> spam they get. This started after the upgrade from 2.63 to 3.0.1 and I will 
> downgrade to 2.63 asap. Do you need any other information that could be 
> helpful?

Did you read the UPGRADE file?  Do you have the required modules installed
as listed in the INSTALL doc?  Did "make test" pass?

It sounds like the children can't run because something is missing.  There's
no stdout/stderr/debug output in your report, so it's hard to know what is
going on.





------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-14 14:25 -------
It's worse for me. As I said, the processes respawn immediately _without_ doing
anything. They do not process even a single request. That is how this bug got
my attention: users complained about missing spamassassin headers. Our server
handles only a few emails per second at peak times and mostly sits idle, but
the spamd processes still die and get respawned.



------- Additional Comments From spamass@oldach.net  2004-12-20 03:04 -------
Created an attachment (id=2575)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=2575&action=view)
ktrace output (without binary read/write) of a child running -D




------- Additional Comments From jm@jmason.org  2004-12-16 17:09 -------
ok, please attach a ptrace/strace/truss of a child dying, from spamd running
with -D.



------- Additional Comments From Alexander.Gretencord@gedoplan.de  2004-12-13 07:18 -------
Ok sorry, I think I did not make this really clear. This problem only starts
after spamd has run for days without problems. A simple restart of spamd makes
the problem go away, until it starts again some time later.



------- Additional Comments From jm@jmason.org  2004-12-14 13:03 -------
Helge - I don't understand.

each child should exit once it hits the max-conn-per-child limit, and the server
will immediately respawn a new one to replace it.  are you seeing different
behaviour?
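
The expected bookkeeping can be pictured with a small simulation. This is a sketch under stated assumptions, not spamd's code: requests are handed out round-robin (an assumption made for simplicity), and `simulate`, `children` and `max_conn_per_child` are illustrative names. Each child serves requests until it reaches its quota, then exits and is replaced at once.

```python
def simulate(requests, children=2, max_conn_per_child=3):
    """Return (spawn_count, served_pids) for a simulated prefork pool."""
    next_pid = 1
    spawned = 0
    pool = []  # each entry is [pid, connections_handled]
    for _ in range(children):
        pool.append([next_pid, 0])
        next_pid += 1
        spawned += 1

    served = []
    for i in range(requests):
        slot = i % children  # round-robin dispatch (assumed)
        child = pool[slot]
        child[1] += 1
        served.append(child[0])
        if child[1] >= max_conn_per_child:
            # The child exits after its quota; the parent replaces it at once.
            pool[slot] = [next_pid, 0]
            next_pid += 1
            spawned += 1
    return spawned, served


print(simulate(6))                        # (4, [1, 2, 1, 2, 1, 2])
print(simulate(4, max_conn_per_child=1))  # (6, [1, 2, 3, 4])
```

With a quota of 3, six requests cost only two respawns. With an effective quota of 1, every request costs a respawn, which is the one-child-per-request pattern described elsewhere in this thread.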


