You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by eventhorizon5 <ry...@onshore.com> on 2007/08/17 18:18:18 UTC
Possible Spamd bug causing spamc crashes
We've been running Spamassassin 3.1.7 (Debian Etch release) here for a while
now on a fairly high traffic mail server, and have been having intermittent
problems with the reliability of the spamd/spamc combination. We recently
found the actual error that's occurring in spamd (which is causing it to
break the connection with spamc, and spamc failing with an I/O error). I
turned off the Argus probes of the spamd port, tried switching to Unix
sockets, tried making a script force only 1 spamc connection at a time, etc
and nothing has worked. Here's an strace dump of one of the failed messages
(this might overlap with normal ones; I can't really tell where this one
starts or ends):
(the error is the illegal seek, a little over halfway through)
-----------------------------------------------------
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 alarm(0) = 0
16:51:40 time(NULL) = 1187301100
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 alarm(64) = 0
16:51:40 select(16, [5 6 7], NULL, [5 6 7], {2, 0}) = 1 (in [5], left {0,
580000})
16:51:41 alarm(0) = 63
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0) = 0
16:51:41 write(7, "A....\n", 6) = 6
16:51:41 read(7, 0x9419278, 6) = -1 EAGAIN (Resource temporarily
unavailable)
16:51:41 time(NULL) = 1187301101
16:51:41 select(8, [7], NULL, NULL, {300, 0}) = 1 (in [7], left {299,
992000})
16:51:41 read(7, "B\0\0\24W\n", 6) = 6
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(64) = 0
16:51:41 select(16, [5 6 7], NULL, [5 6 7], {2, 0}[5207] info: spamd:
connection from mail.onshore.net [127.0.0.1] at port 49606
ld: ldap-nss.c:1374: do_init: Assertion
`cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
) = 1 (in [7], left {1, 988000})
16:51:41 --- SIGCHLD (Child exited) @ 0 (0) ---
16:51:41 sigreturn() = ? (mask now [])
16:51:41 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
16:51:41 waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT}], WNOHANG)
= 5207
16:51:41 write(2, "[984] info: spamd: handled clean"..., 68[984] info:
spamd: handled cleanup of child pid 5207 due to SIGCHLD
) = 68
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 time([1187301101]) = 1187301101
16:51:41 time([1187301101]) = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 90) = 90
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 waitpid(-1, 0xbffb7f98, WNOHANG) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGCHLD, {0x80afde0, [], 0}, {0x80afde0, [], 0}, 8) =
0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
16:51:41 alarm(0) = 64
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0) = 0
16:51:41 read(7, "", 6) = 0
16:51:41 close(7) = 0
16:51:41 write(2, "[984] info: prefork: child state"..., 37[984] info:
prefork: child states: I) = 37
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 time([1187301101]) = 1187301101
16:51:41 time([1187301101]) = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 59) = 59
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 socketpair(PF_FILE, SOCK_STREAM, 0, [7, 8]) = 0
16:51:41 ioctl(7, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(7, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(7, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(7, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(8, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(8, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(8, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(8, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
16:51:41 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
16:51:41 fcntl64(8, F_GETFL) = 0x2 (flags O_RDWR)
16:51:41 fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
16:51:41 fcntl64(7, F_GETFL) = 0x2 (flags O_RDWR)
16:51:41 fcntl64(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [INT CHLD], NULL, 8) = 0
16:51:41 clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0xb7d82928) = 5278
16:51:41 write(2, "[984] info: spamd: server succes"..., 71[984] info:
spamd: server successfully spawned child process, pid 5278
) = 71
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 time([1187301101]) = 1187301101
16:51:41 time([1187301101]) = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 93) = 93
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 close(8) = 0
16:51:41 rt_sigprocmask(SIG_UNBLOCK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(64) = 0
16:51:41 select(16, [5 6 7], NULL, [5 6 7], {2, 0}) = 1 (in [7], left {2,
0})
16:51:41 alarm(0) = 64
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0) = 0
16:51:41 read(7, "I\0\0\24\236\n", 6) = 6
16:51:41 write(2, "[984] info: prefork: child state"..., 38[984] info:
prefork: child states: II) = 38
-----------------------------------------------------
A regular log shows this:
---------
Jul 9 14:47:32 mail spamd[22607]: spamd: connection from mail.onshore.net
[127.0.0.1] at port 52379
Jul 9 14:47:32 mail spamd[32346]: prefork: child 22607: entering state 2
Jul 9 14:47:32 mail spamd[32346]: prefork: new lowest idle kid: 22633
Jul 9 14:47:32 mail spamd[32346]: prefork: new lowest idle kid: 22633
Jul 9 14:47:32 mail spamd[32346]: spamd: handled cleanup of child pid 22607
due to SIGCHLD
Jul 9 14:47:32 mail spamd[32346]: prefork: child closed connection
Jul 9 14:47:32 mail spamd[32346]: prefork: child states: I
----------
The strace of spamc shows the I/O error:
17:54:38 recv(3, 0xbf8cd178, 1, 0) = -1 ECONNRESET (Connection reset by
peer)
I'm going to look over the spamd code and try to find where the problem's
occurring (already did that with spamc). I also found this other message
about the same problem:
http://mail-archives.apache.org/mod_mbox/spamassassin-users/200403.mbox/%3c5FF0BCBB99E4274FB88322DB28971157CFA637@mwjdc2.mweb.com%3e
So has anyone on this list experienced this before, and do you possibly know
where in the app this could be occurring? Thanks.
--
Ryan Thoryk
System Administrator
onShore Networks, LLC
1407 West Chicago Avenue
Chicago, Illinois 60622
www.onshore.com
--
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12202596
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.
Re: Possible Spamd bug causing spamc crashes
Posted by eventhorizon5 <ry...@onshore.com>.
Fixed it - just so others know how to fix this issue, it had to do with the
permissions of the /etc/libnss-ldap.conf file - the file must be
world-readable (we had it as 640; should be 644).
Ryan Thoryk
eventhorizon5 wrote:
>
> Thanks for that info - I didn't even notice that ldap-nss error (I was
> somewhat thinking that it was LDAP-related, since the machine became an
> LDAP client entirely a few months ago). I'll look on the Debian list and
> see if there's a patch.
>
> Ryan Thoryk
>
>
> Justin Mason wrote:
>>
>>
>> I think this is your problem:
>>
>> ld: ldap-nss.c:1374: do_init: Assertion
>> `cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
>>
>> That looks like a bug in some LDAP support library, possibly this one:
>> http://lists.alioth.debian.org/pipermail/pkg-cyrus-imapd-debian-devel/2007-January/001934.html
>> , that's Debian bug #399849.
>>
>> --j.
>>
>
>
--
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12206491
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.
Re: Possible Spamd bug causing spamc crashes
Posted by eventhorizon5 <ry...@onshore.com>.
Thanks for that info - I didn't even notice that ldap-nss error (I was
somewhat thinking that it was LDAP-related, since the machine became an LDAP
client entirely a few months ago). I'll look on the Debian list and see if
there's a patch.
Ryan Thoryk
Justin Mason wrote:
>
>
> I think this is your problem:
>
> ld: ldap-nss.c:1374: do_init: Assertion
> `cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
>
> That looks like a bug in some LDAP support library, possibly this one:
> http://lists.alioth.debian.org/pipermail/pkg-cyrus-imapd-debian-devel/2007-January/001934.html
> , that's Debian bug #399849.
>
> --j.
>
--
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12203064
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.