You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/03/23 20:29:37 UTC
[Bug 5388] New: socket errors in spamd, both ipv4 and ipv6
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
Summary: socket errors in spamd, both ipv4 and ipv6
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: PC
OS/Version: Linux
Status: NEW
Severity: blocker
Priority: P5
Component: spamc/spamd
AssignedTo: dev@spamassassin.apache.org
ReportedBy: mattw@nmgi.com
latest SVN build of spamassassin on latest everything on debian Etch, as of
yesterday:
with debug enabled, spamd is reporting the following errors very often.
Fri Mar 23 14:21:54 2007 [480] info: prefork: child states: II
Fri Mar 23 14:22:00 2007 [486] warn: spamd: error: failed to obtain port and ip
from socket at /usr/bin/spamd line 1196.
Fri Mar 23 14:22:00 2007 [486] warn: Transport endpoint is not connected,
continuing at /usr/bin/spamd line 1071.
I have tried 'apt-get remove libio-socket-inet6-perl', which causes it to report
a different error. Here is something similar to the error I see, recently
reported on spamassassin-users.
> Mar 23 11:50:48 spamfilter5 spamd[28398]: Use of uninitialized value
> in subroutine entry at
> /usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm line 370.
> Mar 23 11:50:48 spamfilter5 spamd[28398]: Bad arg length for
> Socket::unpack_sockaddr_in, length is 0, should be 16 at
> /usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm line 370.
> Mar 23 11:50:48 spamfilter5 spamd[28398]: spamd: error: Bad arg length
> for Socket::unpack_sockaddr_in, length is 0, should be 16 at
> /usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm
> line 370.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From mattw@nmgi.com 2007-03-27 10:06 -------
err oops, that
$orders == PF_PING_ORDER
should be
$orders eq PF_PING_ORDER
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
mattheww@nmgi.com changed:
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |5416
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From mattw@nmgi.com 2007-03-27 09:42 -------
Created an attachment (id=3888)
--> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3888&action=view)
fix for several interrelated parent/child/socket problems.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From jm@jmason.org 2007-04-13 03:59 -------
(In reply to comment #7 on bug 5416, which I'd prefer to keep here to keep the
thread in one place ;)
> FWIW/HTH: it's my theory that this is more likely to happen on slower machines,
> or spamd installations with lots of large custom rulesets (are custom rulesets
> loaded during the tests?), b/c from what I understand, the first spamc
> connection attempt from those test scripts *might* be accepted by the server in
> the very short period of time between the instant the last child forks and the
> instant the first forked child finishes reporting that it's idle (ready to
accept).
>
> So what my patch was trying to do was give SpamdForkScaling.pm a way to know if
> it was still in "startup" phase, which would have overriden the trigger for
> "overloaded" mode. But I botched the implementation as you can tell from
> Justin's commentary...
yeah -- you see, what I would expect to happen in that situation is that the
parent spamd would just fork *more* children. if the load then wasn't there to
keep that many kids active, they'd exit, and it'd settle down to a steady state
quickly enough.
You can fake the problem situation described above, by adding this JMD line to
update_child_status_idle() in SpamdForkScaling.pm:
sub update_child_status_idle {
my ($self) = @_;
warn "JMD faking slow machine with a sleep"; sleep 10;
# "I b1 b2 b3 b4 \n "
$self->report_backchannel_socket("I".pack("N",$self->{pid})."\n");
}
even with this, the spamd continues to work ok.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From mattw@nmgi.com 2007-03-23 12:46 -------
I should note that under both scenarios, spamd is accepting connections and
processing mail.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From jm@jmason.org 2007-03-27 12:12 -------
hmm.
@@ -178,10 +178,11 @@
foreach my $p (@pids) {
if ($l > $p) { $l = $p };
}
- $self->{lowest_idle_pid} = $l;
+ $self->{lowest_idle_pid} = $l if defined $l;
+ # make sure not to auto-vivify lowest_idle_pid, so the parent knows no child
has ever gone idle.
dbg("prefork: new lowest idle kid: ".
- ($self->{lowest_idle_pid} ? $self->{lowest_idle_pid} : 'none'));
+ ((exists $self->{lowest_idle_pid} && $self->{lowest_idle_pid}) ?
$self->{lowest_idle_pid} : 'none'));
}
If lowest_idle_pid is set to a value, then this is called and no child is idle,
the *previous* value will *be left intact*. that's a bad idea, since that
child probably isn't idle anymore....
@@ -296,7 +297,8 @@
if (!$self->order_idle_child_to_accept()) {
# dbg("prefork: no idle kids, noting overloaded");
# there are no idle kids! we're overloaded, mark that
- $self->{overloaded} = 1;
+ # Don't mark the server as overloaded if no child has ever gone idle.
+ $self->{overloaded} = 1 if exists $self->{lowest_idle_pid};
}
return;
}
So if the spamd server accepts a connection before any of the initial
children reports "idle", it starts another child? I'm inclined
to consider that a feature, to be honest ;) if load is that high,
another child is probably going to be needed.
@@ -532,7 +534,7 @@
chomp $line;
if (index ($line, "P") == 0) { # string starts with "P" = ping
dbg("prefork: periodic ping from spamd parent");
- next;
+ return PF_PING_ORDER;
}
if (index ($line, "A") == 0) { # string starts with "A" = accept
return PFORDER_ACCEPT;
Index: spamd/spamd.raw
===================================================================
--- spamd/spamd.raw (revision 522978)
+++ spamd/spamd.raw (working copy)
@@ -1059,7 +1059,12 @@
if ($scaling) {
$scaling->update_child_status_idle();
$orders = $scaling->wait_for_orders(); # and sleep...
-
+
+ if ($orders eq PF_PING_ORDER) {
+ $i--; #don't count this connection against the max; it's just a ping.
+ next;
+ }
+
if ($orders != PFORDER_ACCEPT) {
info("spamd: unknown order: $orders");
}
I don't get this either. AFAICS this just defers the skipping of the
ping request, up a level in the function call tree, and a few no-ops
later.
@@ -1165,7 +1170,7 @@
}
# Bah!
- if ( !$client ) {
+ if ( !$client || !$client->connected() ) {
# this can happen when interrupted by SIGCHLD on Solaris,
# perl 5.8.0, and some other platforms with -m.
OK, on the other hand, I'm happy to apply this change! It's in trunk now.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
mattw@nmgi.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #3888 is|0 |1
obsolete| |
------- Additional Comments From mattw@nmgi.com 2007-03-27 10:30 -------
Created an attachment (id=3889)
--> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3889&action=view)
corrected patch
Yes, those errors no longer occur because the children aren't continually told
anymore to accept non-existent connections. See a few comments above.
Here's the corrected patch.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
------- Additional Comments From jm@jmason.org 2007-03-27 10:25 -------
interesting patch! I'm a bit leery of applying it so late in the 3.2.0 dev
cycle though :(
does that fix have anything to do with the original 'failed to obtain port and ip
from socket' error issue? what are the symptoms of what it's fixing, if not?
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
jm@jmason.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED
------- Additional Comments From jm@jmason.org 2007-08-17 09:30 -------
I think this is fixed nowadays; please reopen if not
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
mattw@nmgi.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |INVALID
------- Additional Comments From mattw@nmgi.com 2007-03-24 23:32 -------
Figured out the issue with the ipv4 version - I have an additional local
loopback address that is part of an LVS/keepalived scheme, and that address was
accept()ing connections on tcp783 every second as part of LVS' tcpcheck. At the
initiation of these connections, the socket object returns a null ->peername()
in peer_info_from_socket(), which causes the error messages I listed above. As
a workaround, we will configure LVS/keepalived to perform health status checks
via the correct interface at both ends, but also teach it to ping spamd with a
proper ping header.
I have not tested it, but I assume the ipv6 version error messages are caused
similarly.
I would also like to take this opportunity to request an additional flag for
spamd (--bind-ipv4-only) that would override/condition the eval{} in the second
BEGIN{}, since in our case, I would like to keep the debian package
libio-socket-inet6-perl installed, but still force spamd to bind to ipv4 only.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
mattw@nmgi.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P5 |P2
------- Additional Comments From mattw@nmgi.com 2007-03-23 12:40 -------
to clarify, here is the exact error I see when removing ipv6 capability:
Fri Mar 23 14:39:25 2007 [606] info: logger: removing stderr method
Fri Mar 23 14:39:32 2007 [608] info: spamd: server started on port 783/tcp
(running version 3.2.0-pre3)
Fri Mar 23 14:39:32 2007 [608] info: spamd: server pid: 608
Fri Mar 23 14:39:32 2007 [608] info: spamd: server successfully spawned child
process, pid 612
Fri Mar 23 14:39:32 2007 [608] info: spamd: server successfully spawned child
process, pid 613
Fri Mar 23 14:39:32 2007 [612] warn: Use of uninitialized value in subroutine
entry at /usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:32 2007 [612] warn: spamd: error: Bad arg length for
Socket::unpack_sockaddr_in, length is 0, should be 16 at
/usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:32 2007 [612] warn: , continuing at /usr/bin/spamd line 1071.
Fri Mar 23 14:39:32 2007 [608] info: prefork: child states: BI
Fri Mar 23 14:39:32 2007 [608] info: prefork: child states: II
Fri Mar 23 14:39:34 2007 [612] warn: Use of uninitialized value in subroutine
entry at /usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:34 2007 [612] warn: spamd: error: Bad arg length for
Socket::unpack_sockaddr_in, length is 0, should be 16 at
/usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:34 2007 [612] warn: , continuing at /usr/bin/spamd line 1071.
Fri Mar 23 14:39:34 2007 [608] info: prefork: child states: II
Fri Mar 23 14:39:40 2007 [612] warn: Use of uninitialized value in subroutine
entry at /usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:40 2007 [612] warn: spamd: error: Bad arg length for
Socket::unpack_sockaddr_in, length is 0, should be 16 at
/usr/lib/perl/5.8/Socket.pm line 198.
Fri Mar 23 14:39:40 2007 [612] warn: , continuing at /usr/bin/spamd line 1071.
Fri Mar 23 14:39:40 2007 [608] info: prefork: child states: II
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
mattw@nmgi.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|INVALID |
------- Additional Comments From mattw@nmgi.com 2007-03-27 09:44 -------
This patch solves several problems with spamd parent/child relations and
sockets. Two problems:
1. On initial startup, server was being marked as overloaded b/c no children had
started up yet (so there were no idle children), which caused the parent to
continually direct the children to accept non-existent connections.
2. The "periodic ping" from the parent to all the children was leaving the
listening child socket in a state that sometimes caused the child to commit suicide.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 5388] socket errors in spamd, both ipv4 and ipv6
Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5388
Bug 5388 depends on bug 5416, which changed state.
Bug 5416 Summary: make test fails spamc_cf.t, spamd_plugin.t, spamd_unix.t, and spamd_unix_and_tcp.t
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5416
What |Old Value |New Value
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.