Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/10/08 11:35:48 UTC

[Bug 5671] New: spamd child processing timeout causes children to hang

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5671

           Summary: spamd child processing timeout causes children to hang
           Product: Spamassassin
           Version: 3.2.3
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: steve.freegard@fsl.com


I've noticed an issue with spamd: after a varying amount of time, a spamd child
appears to hang when it hits a processing timeout.  Eventually spamd runs out
of processing children, because they all end up in this state.

Here is 'ps' output showing the arguments spamd was started with.  The first
three children have accumulated a lot of CPU time.  All three of these
children are 'stuck':

 3561 ?        Ss     2:08 /usr/bin/spamd -d -c --timeout-child=60
--max-children=20 --min-children=10 -x -u spamd -r /var/run/spamd
 2421 ?        R    291:12  \_ spamd child
 2905 ?        R    249:15  \_ spamd child
10886 ?        R    216:33  \_ spamd child
 3655 ?        S      0:56  \_ spamd child
 4178 ?        S      2:08  \_ spamd child
16571 ?        S      0:36  \_ spamd child
16573 ?        S      0:19  \_ spamd child
16952 ?        S      0:07  \_ spamd child
17747 ?        S      0:09  \_ spamd child
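
As an illustration only, a listing like the one above could be screened for
stuck children with a short Python sketch.  It assumes BSD-style ps columns
(PID, TTY, STAT, TIME, COMMAND) with TIME given as minutes:seconds.  The
function name find_stuck_children and the 60-minute threshold are hypothetical
choices made for this example, not anything spamd provides.

```python
import re

# Matches lines such as " 2421 ?        R    291:12  \_ spamd child".
# Assumption: columns are PID, TTY, STAT, TIME (minutes:seconds), COMMAND.
PS_LINE = re.compile(r"^\s*(\d+)\s+\S+\s+(\S+)\s+(\d+):(\d\d)\s+(.*)$")


def find_stuck_children(ps_output, min_cpu_minutes=60):
    """Return PIDs of spamd children in state R with high cumulative CPU time.

    min_cpu_minutes is an illustrative threshold, not a spamd setting.
    """
    stuck = []
    for line in ps_output.splitlines():
        m = PS_LINE.match(line)
        if not m:
            continue
        pid, stat, minutes, _seconds, command = m.groups()
        # Only look at child processes, not the spamd parent.
        if "spamd child" not in command:
            continue
        if stat.startswith("R") and int(minutes) >= min_cpu_minutes:
            stuck.append(int(pid))
    return stuck


# Excerpt of the listing from the report (parent command line shortened).
SAMPLE = r"""
 3561 ?        Ss     2:08 /usr/bin/spamd -d -c --timeout-child=60
 2421 ?        R    291:12  \_ spamd child
 2905 ?        R    249:15  \_ spamd child
10886 ?        R    216:33  \_ spamd child
 3655 ?        S      0:56  \_ spamd child
 4178 ?        S      2:08  \_ spamd child
"""

print(find_stuck_children(SAMPLE))  # [2421, 2905, 10886]
```

A child that is legitimately busy would also be in state R, so the CPU-time
threshold is what separates it from a stuck one; the right value depends on
the site.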

strace/ltrace on any of the three processes yields no output.  The hang does
not appear to be Bayes-related: there are no Bayes lock files, and I have the
following set in local.cf:

lock_type flock
bayes_learn_to_journal 1
bayes_auto_expire 0

Searching the logs for the relevant PIDs shows the following:

Oct  7 14:10:40 securemail spamd[2421]: spamd: connection from
localhost.localdomain [127.0.0.1] at port 37022 
Oct  7 14:10:53 securemail spamd[2421]: spamd: checking message
<20...@nyms1.anonymizer.com> for (unknown):501 
Oct  7 14:11:41 securemail spamd[2421]: rules: failed to run BAYES_99 test,
skipping: 
Oct  7 14:11:41 securemail spamd[2421]:  (child processing timeout at
/usr/bin/spamd line 1246, <GEN15116> line 91271. 
Oct  7 14:11:41 securemail spamd[2421]: ) 

Oct  7 16:57:34 securemail spamd[10886]: spamd: checking message
<20...@nyms1.anonymizer.com> for (unknown):501 
Oct  7 16:58:17 securemail spamd[10886]: rules: failed to run BAYES_99 test,
skipping: 
Oct  7 16:58:17 securemail spamd[10886]:  (child processing timeout at
/usr/bin/spamd line 1246, <GEN15703> line 91271. 
Oct  7 16:58:17 securemail spamd[10886]: )
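
The log search above can be sketched in Python as follows.  This is a minimal
example that assumes each syslog entry is on a single line (the excerpts above
are wrapped by the mail client) and that the entry has the form
"... spamd[PID]: message".  The function name pids_with_child_timeout is
hypothetical.

```python
import re

# Matches the syslog tag "spamd[PID]:" and captures the PID and the message.
LOG_LINE = re.compile(r"spamd\[(\d+)\]:\s*(.*)$")


def pids_with_child_timeout(log_lines):
    """Return PIDs, in first-seen order, that logged a child processing timeout."""
    seen = []
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and "child processing timeout" in m.group(2):
            pid = int(m.group(1))
            if pid not in seen:
                seen.append(pid)
    return seen


# Entries taken from the excerpts above, joined back onto single lines.
SAMPLE_LOG = [
    "Oct  7 14:11:41 securemail spamd[2421]: rules: failed to run BAYES_99 test, skipping:",
    "Oct  7 14:11:41 securemail spamd[2421]:  (child processing timeout at /usr/bin/spamd line 1246, <GEN15116> line 91271.",
    "Oct  7 16:58:17 securemail spamd[10886]: rules: failed to run BAYES_99 test, skipping:",
    "Oct  7 16:58:17 securemail spamd[10886]:  (child processing timeout at /usr/bin/spamd line 1246, <GEN15703> line 91271.",
]

print(pids_with_child_timeout(SAMPLE_LOG))  # [2421, 10886]
```

The resulting PIDs can then be compared against the children that 'ps' shows
in state R.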



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5671] spamd child processing timeout causes children to hang

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5671





------- Additional Comments From achill@hotbox.ru  2008-03-04 17:05 -------
....

I get the "spamd[27226]: rules: failed to run BAYES_99 test, skipping:" error in
maillog, and spamd hangs.




------- Additional Comments From bugzilla.spamassassin.org@dale.us  2008-02-25 07:13 -------
Was anyone else using --round-robin?  I was using it as a workaround for bug
#4594 and never removed it.  I removed it 3 days ago and haven't had to manually
restart spamd since then.



------- Additional Comments From achill@hotbox.ru  2008-02-26 02:41 -------
To Dale Blount:

My spamd startup string:

/usr/bin/spamd -x -q --min-children=1 --max-children=30 --min-spare 1
--max-spare 3 --round-robin --max-conn-per-child=2 --timeout-child=30

It worked fine for some days, but yesterday spamd hung.

...any suggestions? 



------- Additional Comments From achill@hotbox.ru  2008-03-05 16:01 -------
(In reply to comment #11)
> I'm not sure our bugs are the same... I don't use BAYES at all.

But do you get an error (or anything unusual) in the maillog file (or another
spamd log file)?



------- Additional Comments From bugzilla.spamassassin.org@dale.us  2008-02-14 13:12 -------
This is also happening to me with 3.2.4.



------- Additional Comments From schulz@adi.com  2007-10-08 08:18 -------
Created an attachment (id=4144)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=4144&action=view)
Spamd log

I have just had the same type of hang.  This is the first time in many years
that I have had spamd hang.  Attached is a log showing the hang.  Note that
the first unusual message occurred at Oct  7 03:20:10.  The last log entry
occurred at Oct  7 04:06:37, after which no more mail was scanned.  I restarted
spamd at Oct  8 08:57:43.



------- Additional Comments From achill@hotbox.ru  2008-03-04 16:57 -------
Hi Dale!

I tried without --round-robin before:

/usr/bin/spamd -x -q --max-children=30


I have just tried to run without --round-robin:

/usr/bin/spamd -D -x -q --min-children=1 --max-children=30 --min-spare 1
--max-spare 3 --max-conn-per-child=2 --timeout-child=30

The result is the same:

I get the "spamd[27226]: rules: failed to run BAYES_99 test, skipping:" error in
maillog.

It seems the problem does not depend on the choice of options.



------- Additional Comments From bugzilla.spamassassin.org@dale.us  2008-03-03 06:35 -------
Vladimir,

I was going to suggest removing --round-robin, but spamd failed on me today
even without it.  It seems to be less prone to failure that way, though, so you
may still want to try it.



bugzilla.spamassassin.org@dale.us changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla.spamassassin.org@dale.us






------- Additional Comments From schulz@adi.com  2007-10-08 09:00 -------
Looking through old log files, I find the message 'timeout with empty $@ at
/opt/perl/lib/site_perl/5.8.5/Mail/SpamAssassin/Timeout.pm line 185.' a few
times a day, with apparently no ill effects.  This is the first time that I
have seen either 'rules: failed to run BAYES_99 test, skipping:' or
'(child processing timeout at /usr/bin/spamd line 1246, <GEN18161> line 29.'
Note that these two messages occur in pairs, and that the number after 'GEN'
and the final line number vary.



------- Additional Comments From bugzilla.spamassassin.org@dale.us  2008-03-05 05:31 -------
I'm not sure our bugs are the same... I don't use BAYES at all.



------- Additional Comments From protector66@mail.ru  2007-10-17 07:33 -------
This really looks like our bug 5679.
We were getting the message "prefork: server reached --max-children setting,
consider raising it", so we raised the setting, and the message no longer
appears in our log. Have you tried a larger value for --max-children? Maybe
that would make our situations more similar.



------- Additional Comments From achill@hotbox.ru  2008-02-17 17:34 -------
Sendmail 8.12 + spamass-milter 0.3.1 + spamassassin 3.2.4 (runs as spamd)

spamd overloads the CPU.

Symptoms are the same:

Feb 17 07:59:27 mailrelay spamd[7325]: rules: failed to run BAYES_99 test, skipping:
Feb 17 07:59:27 mailrelay spamd[7325]:  (child processing timeout at
/usr/bin/spamd line 1262, <GEN1417> line 2175.
Feb 17 07:59:27 mailrelay spamd[7325]: )

I tried both a file-based Bayes database and a MySQL Bayes database.  The
result is the same.

This happens to me up to 10 times per 24 hours (even at times when the mail
load is not high).



