You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by eventhorizon5 <ry...@onshore.com> on 2007/08/17 18:18:18 UTC

Possible Spamd bug causing spamc crashes

We've been running Spamassassin 3.1.7 (Debian Etch release) here for a while
now on a fairly high traffic mail server, and have been having intermittent
problems with the reliability of the spamd/spamc combination.  We recently
found the actual error that's occurring in spamd (which is causing it to
break the connection with spamc, and spamc failing with an I/O error).  I
turned off the Argus probes of the spamd port, tried switching to Unix
sockets, tried making a script force only 1 spamc connection at a time, etc
and nothing has worked.  Here's an strace dump of one of the failed messages
(this might overlap with normal ones; I can't really tell where this one
starts or ends):

(the error is the illegal seek, a little over halfway through)
-----------------------------------------------------
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 alarm(0)                       = 0
16:51:40 time(NULL)                     = 1187301100
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:40 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:40 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:40 alarm(64)                      = 0
16:51:40 select(16, [5 6 7], NULL, [5 6 7], {2, 0}) = 1 (in [5], left {0,
580000})
16:51:41 alarm(0)                       = 63
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0)                       = 0
16:51:41 write(7, "A....\n", 6)         = 6
16:51:41 read(7, 0x9419278, 6)          = -1 EAGAIN (Resource temporarily
unavailable)
16:51:41 time(NULL)                     = 1187301101
16:51:41 select(8, [7], NULL, NULL, {300, 0}) = 1 (in [7], left {299,
992000})
16:51:41 read(7, "B\0\0\24W\n", 6)      = 6
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(64)                      = 0
16:51:41 select(16, [5 6 7], NULL, [5 6 7], {2, 0}[5207] info: spamd:
connection from mail.onshore.net [127.0.0.1] at port 49606
ld: ldap-nss.c:1374: do_init: Assertion
`cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
) = 1 (in [7], left {1, 988000})
16:51:41 --- SIGCHLD (Child exited) @ 0 (0) ---
16:51:41 sigreturn()                    = ? (mask now [])
16:51:41 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
16:51:41 waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT}], WNOHANG)
= 5207
16:51:41 write(2, "[984] info: spamd: handled clean"..., 68[984] info:
spamd: handled cleanup of child pid 5207 due to SIGCHLD
) = 68
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 time([1187301101])             = 1187301101
16:51:41 time([1187301101])             = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 90) = 90
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 waitpid(-1, 0xbffb7f98, WNOHANG) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0
16:51:41 rt_sigaction(SIGCHLD, {0x80afde0, [], 0}, {0x80afde0, [], 0}, 8) =
0
16:51:41 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
16:51:41 alarm(0)                       = 64
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0)                       = 0
16:51:41 read(7, "", 6)                 = 0
16:51:41 close(7)                       = 0
16:51:41 write(2, "[984] info: prefork: child state"..., 37[984] info:
prefork: child states: I) = 37
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 time([1187301101])             = 1187301101
16:51:41 time([1187301101])             = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 59) = 59
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 socketpair(PF_FILE, SOCK_STREAM, 0, [7, 8]) = 0
16:51:41 ioctl(7, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(7, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(7, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(7, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(8, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(8, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 ioctl(8, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb8068) = -1 EINVAL
(Invalid argument)
16:51:41 _llseek(8, 0, 0xbffb80b0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
16:51:41 fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
16:51:41 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
16:51:41 fcntl64(8, F_GETFL)            = 0x2 (flags O_RDWR)
16:51:41 fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
16:51:41 fcntl64(7, F_GETFL)            = 0x2 (flags O_RDWR)
16:51:41 fcntl64(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [INT CHLD], NULL, 8) = 0
16:51:41 clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0xb7d82928) = 5278
16:51:41 write(2, "[984] info: spamd: server succes"..., 71[984] info:
spamd: server successfully spawned child process, pid 5278
) = 71
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 time([1187301101])             = 1187301101
16:51:41 time([1187301101])             = 1187301101
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1279, ...})
= 0
16:51:41 select(8, [4], NULL, [4], {0, 0}) = 0 (Timeout)
16:51:41 write(4, "<174>Aug 16 16:51:41 spamd[984]:"..., 93) = 93
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [PIPE], [INT CHLD], 8) = 0
16:51:41 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
16:51:41 close(8)                       = 0
16:51:41 rt_sigprocmask(SIG_UNBLOCK, [INT CHLD], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {0x80afde0, [], 0}, {SIG_DFL}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(64)                      = 0
16:51:41 select(16, [5 6 7], NULL, [5 6 7], {2, 0}) = 1 (in [7], left {2,
0})
16:51:41 alarm(0)                       = 64
16:51:41 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
16:51:41 rt_sigaction(SIGALRM, {SIG_DFL}, {0x80afde0, [], 0}, 8) = 0
16:51:41 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:51:41 alarm(0)                       = 0
16:51:41 read(7, "I\0\0\24\236\n", 6)   = 6
16:51:41 write(2, "[984] info: prefork: child state"..., 38[984] info:
prefork: child states: II) = 38
-----------------------------------------------------

A regular log shows this:
---------
Jul  9 14:47:32 mail spamd[22607]: spamd: connection from mail.onshore.net
[127.0.0.1] at port 52379
Jul  9 14:47:32 mail spamd[32346]: prefork: child 22607: entering state 2
Jul  9 14:47:32 mail spamd[32346]: prefork: new lowest idle kid: 22633
Jul  9 14:47:32 mail spamd[32346]: prefork: new lowest idle kid: 22633
Jul  9 14:47:32 mail spamd[32346]: spamd: handled cleanup of child pid 22607
due to SIGCHLD
Jul  9 14:47:32 mail spamd[32346]: prefork: child closed connection
Jul  9 14:47:32 mail spamd[32346]: prefork: child states: I 
----------

The strace of spamc shows the I/O error:
17:54:38 recv(3, 0xbf8cd178, 1, 0)      = -1 ECONNRESET (Connection reset by
peer) 

I'm going to look over the spamd code and try to find where the problem's
occurring (already did that with spamc).  I also found this other message
about the same problem:
http://mail-archives.apache.org/mod_mbox/spamassassin-users/200403.mbox/%3c5FF0BCBB99E4274FB88322DB28971157CFA637@mwjdc2.mweb.com%3e

So has anyone on this list experienced this before, and do you possibly know
where in the app this could be occurring?  Thanks.

-- 
Ryan Thoryk
System Administrator
onShore Networks, LLC
1407 West Chicago Avenue
Chicago, Illinois 60622
www.onshore.com 

-- 
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12202596
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.


Re: Possible Spamd bug causing spamc crashes

Posted by eventhorizon5 <ry...@onshore.com>.
Fixed it - just so others know how to fix this issue, it had to do with the
permissions of the /etc/libnss-ldap.conf file - the file must be
world-readable (we had it as 640; should be 644).

Ryan Thoryk


eventhorizon5 wrote:
> 
> Thanks for that info - I didn't even notice that ldap-nss error (I was
> somewhat thinking that it was LDAP-related, since the machine became an
> LDAP client entirely a few months ago).  I'll look on the Debian list and
> see if there's a patch.
> 
> Ryan Thoryk
> 
> 
> Justin Mason wrote:
>> 
>> 
>> I think this is your problem:
>> 
>> ld: ldap-nss.c:1374: do_init: Assertion
>> `cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
>> 
>> That looks like a bug in some LDAP support library, possibly this one:
>> http://lists.alioth.debian.org/pipermail/pkg-cyrus-imapd-debian-devel/2007-January/001934.html
>> , that's Debian bug #399849.
>> 
>> --j.
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12206491
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.


Re: Possible Spamd bug causing spamc crashes

Posted by eventhorizon5 <ry...@onshore.com>.
Thanks for that info - I didn't even notice that ldap-nss error (I was
somewhat thinking that it was LDAP-related, since the machine became an LDAP
client entirely a few months ago).  I'll look on the Debian list and see if
there's a patch.

Ryan Thoryk


Justin Mason wrote:
> 
> 
> I think this is your problem:
> 
> ld: ldap-nss.c:1374: do_init: Assertion
> `cfg->ldc_uris[__session.ls_current_uri] != ((void *)0)' failed.
> 
> That looks like a bug in some LDAP support library, possibly this one:
> http://lists.alioth.debian.org/pipermail/pkg-cyrus-imapd-debian-devel/2007-January/001934.html
> , that's Debian bug #399849.
> 
> --j.
> 

-- 
View this message in context: http://www.nabble.com/Possible-Spamd-bug-causing-spamc-crashes-tf4286739.html#a12203064
Sent from the SpamAssassin - Dev mailing list archive at Nabble.com.