Posted to users@spamassassin.apache.org by Rick Macdougall <ri...@ummm-beer.com> on 2006/07/08 20:22:37 UTC

Strange problem

Hi,

I'm having a strange problem on one of my spamd servers since upgrading 
to 3.1.3.

After a while under heavy load, children are not exiting, i.e. the log shows 
BBBIII, and eventually it's all BBBBBBBBBBBBB.

I've started hupping the server every night but I do not have the same 
problem on another server.

Server with the problem

Fedora core 4 diskless server
3 gigs of ram (never more than around 1 gig used, no swap)
auto-learn is off



Server without the problem

FreeBSD 4.8
1.5 gigs of ram (and swap)
auto-learn is enabled.

Both servers have exactly the same config except for the auto-learn and 
bayes/user prefs are stored in mysql on the FreeBSD server.

Is anyone else seeing anything like this?

Regards,

Rick

Re: Strange problem

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Rick Macdougall wrote:
> Hi,
> 
> I'm having a strange problem on one of my spamd servers since upgrading 
> to 3.1.3.

What version were you running previously on this host?  Upgrade, or wipe 
and clean install?


> After a while under heavy load, children are not exiting, i.e. the log shows 
> BBBIII, and eventually it's all BBBBBBBBBBBBB.
> 
> I've started hupping the server every night but I do not have the same 
> problem on another server.
> 
> Server with the problem
> 
> Fedora core 4 diskless server
> 3 gigs of ram (never more than around 1 gig used, no swap)
> auto-learn is off

Diskless... are you logging via syslog to another host?

If so, and you've got enough room on the volume to run spamd with the -D 
flag, you should be able to tell what's causing this from the debug output.

Even if you can't tell, having some debug output might clue someone more 
familiar with how spamd works in to what's going on.

In the meantime, any other "prefork:" messages being logged?


Daryl
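
A minimal sketch for reading the child-state strings quoted above (BBBIII and so on), assuming that B marks a busy child and I an idle one; the thread does not spell this out. The helper name summarise_states is hypothetical and is not part of spamd. It only counts characters in a string you pass in, so it makes no assumption about the layout of the surrounding "prefork:" log line.

```shell
# Hypothetical helper, not part of spamd: summarise a child-state string.
# Assumption: 'B' marks a busy child and 'I' an idle one.
summarise_states() (
    states=$1
    # Count each state letter by deleting every other character.
    busy=$(printf '%s' "$states" | tr -cd 'B' | wc -c | tr -d ' ')
    idle=$(printf '%s' "$states" | tr -cd 'I' | wc -c | tr -d ' ')
    # Flag the case from the original report: every child is busy.
    if [ -n "$states" ] && [ "$busy" -eq "${#states}" ]; then
        echo "busy=$busy idle=$idle (no idle children left)"
    else
        echo "busy=$busy idle=$idle"
    fi
)
```

For example, `summarise_states BBBIII` prints `busy=3 idle=3`. Feeding it the state string from each logged line over time would show whether the busy count only ever climbs.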

Re: Strange problem

Posted by Rick Macdougall <ri...@ummm-beer.com>.
Dirk Bonengel wrote:
> Hi all,
> 
> Dallas, I think the problem isn't the request timing out - Rick says 
> 'the child never exits from processing the message'...how can this be??
> 
> Rick, as a workaround, raise the timeout limit (both my servers are 
> located near Cologne/Germany), and take a look at how fast DNS is 
> for you.
> 
> Dirk
> 
> Dallas Engelken schrieb:
>>> -----Original Message-----
>>> From: Rick Macdougall [mailto:rickm@ummm-beer.com]
>>> Sent: Monday, July 10, 2006 11:59
>>> To: dallase@uribl.com
>>> Cc: users@spamassassin.apache.org
>>> Subject: Re: Strange problem
>>>
>>> Sanford Whiteman wrote:
>>>>> Both servers have exactly the same config except for the auto-learn
>>>>> and bayes/user prefs are stored in mysql on the FreeBSD server.
>>>
>>> Thanks to all who replied.
>>>
>>> I found the problem and it's related to ixhash, the timeout doesn't 
>>> work correctly / work at all.
>>>
>>> I see
>>>
>>> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached at 
>>> /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
>>>
>>> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached at 
>>> /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
>>>
>>> In the logs and the child never exits from processing the message.
>>>
>>> I've cc'd Dallas to see if he has any insights into the problem.
>>>
>>>     
>>
>> the warns are being generated because the timeout value has been 
>> exceeded...
>>
>>
>>         my $timeout = $permsgstatus->{main}->{conf}->{'ixhash_timeout'} || 5;
>>         eval {
>>           Mail::SpamAssassin::Util::trap_sigalrm_fully(sub { die "ixhash timeout reached"; });
>>
>> the code is right.. you need to figure out why it times out.  have you
>> hardcoded ixhash_timeout to some other value?   have you tried manual
>> lookups from that box?
>>
>> # host -tA abc.ix.dnsbl.manitu.net
>> Host abc.ix.dnsbl.manitu.net not found: 3(NXDOMAIN)
>>
>> d
>>
>>   
> 

Hi,

Lookups are fast, very fast.  I'm running djb's dnscache server locally.

time host -tA abcd.ix.dnsbl.manitu.net
Host abcd.ix.dnsbl.manitu.net not found: 3(NXDOMAIN)

real    0m0.138s
user    0m0.003s
sys     0m0.005s

Like I said in my first message, it happens very rarely, so it's most 
likely a network glitch.  I haven't changed the timeout and it is still 
at 10.

I've turned it off on that box for now but I'm more than willing to turn 
it back on if someone wants me to test something for them.

As Dirk said, it should exit after the timeout is reached, but it never 
does; it just keeps logging errors about ixhash timing out.

Regards,

Rick
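
Since a single fast lookup does not rule out a rare stall, one option is to repeat the timing many times and report only the slow runs. The following is a rough sketch, not a tested diagnostic: the helper name report_slow_runs and its arguments are hypothetical, and it assumes GNU date for the non-portable %N (nanoseconds) format.

```shell
# Hypothetical helper: run a command COUNT times and report slow runs.
# Usage: report_slow_runs THRESHOLD_MS COUNT COMMAND [ARGS...]
# Assumption: GNU date, for the non-portable %N (nanoseconds) format.
report_slow_runs() (
    threshold_ms=$1
    count=$2
    shift 2
    i=1
    slow=0
    while [ "$i" -le "$count" ]; do
        start_ns=$(date +%s%N)
        "$@" >/dev/null 2>&1 || true   # only the duration matters here
        end_ns=$(date +%s%N)
        elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))
        if [ "$elapsed_ms" -gt "$threshold_ms" ]; then
            echo "run $i took ${elapsed_ms} ms"
            slow=$((slow + 1))
        fi
        i=$((i + 1))
    done
    echo "slow runs: $slow of $count"
)
```

It could be invoked as `report_slow_runs 1000 500 host -tA abcd.ix.dnsbl.manitu.net`. A local cache may answer repeated queries for the same name, so varying the queried name between runs may be needed to exercise the network path.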

Re: Strange problem

Posted by Dirk Bonengel <di...@bonengel.de>.
Hi all,

Dallas, I think the problem isn't the request timing out - Rick says 
'the child never exits from processing the message'...how can this be??

Rick, as a workaround, raise the timeout limit (both my servers are 
located near Cologne/Germany), and take a look at how fast DNS is 
for you.

Dirk

Dallas Engelken schrieb:
>> -----Original Message-----
>> From: Rick Macdougall [mailto:rickm@ummm-beer.com] 
>> Sent: Monday, July 10, 2006 11:59
>> To: dallase@uribl.com
>> Cc: users@spamassassin.apache.org
>> Subject: Re: Strange problem
>>
>> Sanford Whiteman wrote:
>>>> Both servers have exactly the same config except for the auto-learn
>>>> and bayes/user prefs are stored in mysql on the FreeBSD server.
>>
>> Thanks to all who replied.
>>
>> I found the problem and it's related to ixhash, the timeout 
>> doesn't work correctly / work at all.
>>
>> I see
>>
>> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached 
>> at /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
>>
>> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached 
>> at /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
>>
>> In the logs and the child never exits from processing the message.
>>
>> I've cc'd Dallas to see if he has any insights into the problem.
>>
>>     
>
> the warns are being generated because the timeout value has been exceeded...
>
>
>         my $timeout = $permsgstatus->{main}->{conf}->{'ixhash_timeout'} || 5;
>         eval {
>           Mail::SpamAssassin::Util::trap_sigalrm_fully(sub { die "ixhash timeout reached"; });
>
> the code is right.. you need to figure out why it times out.  have you
> hardcoded ixhash_timeout to some other value?   have you tried manual
> lookups from that box?
>
> # host -tA abc.ix.dnsbl.manitu.net
> Host abc.ix.dnsbl.manitu.net not found: 3(NXDOMAIN)
>
> d
>
>   


RE: Strange problem

Posted by Dallas Engelken <da...@uribl.com>.
> -----Original Message-----
> From: Rick Macdougall [mailto:rickm@ummm-beer.com] 
> Sent: Monday, July 10, 2006 11:59
> To: dallase@uribl.com
> Cc: users@spamassassin.apache.org
> Subject: Re: Strange problem
> 
> Sanford Whiteman wrote:
> >> Both servers have exactly the same config except for the auto-learn
> >> and bayes/user prefs are stored in mysql on the FreeBSD server.
> > 
> 
> Thanks to all who replied.
> 
> I found the problem and it's related to ixhash, the timeout 
> doesn't work correctly / work at all.
> 
> I see
> 
> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached 
> at /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
> 
> Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached 
> at /etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.
> 
> In the logs and the child never exits from processing the message.
> 
> I've cc'd Dallas to see if he has any insights into the problem.
> 

the warns are being generated because the timeout value has been exceeded...


        my $timeout = $permsgstatus->{main}->{conf}->{'ixhash_timeout'} || 5;
        eval {
          Mail::SpamAssassin::Util::trap_sigalrm_fully(sub { die "ixhash timeout reached"; });

the code is right.. you need to figure out why it times out.  have you
hardcoded ixhash_timeout to some other value?   have you tried manual
lookups from that box?

# host -tA abc.ix.dnsbl.manitu.net
Host abc.ix.dnsbl.manitu.net not found: 3(NXDOMAIN)

d


Re: Strange problem

Posted by Rick Macdougall <ri...@ummm-beer.com>.
Sanford Whiteman wrote:
>> Both  servers have exactly the same config except for the auto-learn
>> and bayes/user prefs are stored in mysql on the FreeBSD server.
> 

Thanks to all who replied.

I found the problem and it's related to ixhash, the timeout doesn't work 
correctly / work at all.

I see

Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached at 
/etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.

Jul 10 11:13:01 spa010 spamd[29830]: ixhash timeout reached at 
/etc/mail/spamassassin/ixhash.pm line 91, <GEN12101> line 2226.

in the logs, and the child never exits from processing the message.

I've cc'd Dallas to see if he has any insights into the problem.

Regards,

Rick

Re: Strange problem

Posted by Rick Macdougall <ri...@ummm-beer.com>.
jdow wrote:
> From: "Rick Macdougall" <ri...@ummm-beer.com>
> 
>> Hi,
>>
>> I'm having a strange problem on one of my spamd servers since 
>> upgrading to 3.1.3.
>>
>> After a while under heavy load, children are not exiting, i.e. the log 
>> shows BBBIII, and eventually it's all BBBBBBBBBBBBB.
>>
>> I've started hupping the server every night but I do not have the same 
>> problem on another server.
>>
>> Server with the problem
>>
>> Fedora core 4 diskless server
>> 3 gigs of ram (never more than around 1 gig used, no swap)
>> auto-learn is off
>>
>>
>>
>> Server without the problem
>>
>> FreeBSD 4.8
>> 1.5 gigs of ram (and swap)
>> auto-learn is enabled.
>>
>> Both servers have exactly the same config except for the auto-learn 
>> and bayes/user prefs are stored in mysql on the FreeBSD server.
>>
>> Anyone else seeing anything like this ?
> 
> I've not seen it but I have a hunch you are seeing a file locking
> problem. I think there are notes in the documentation loaded with
> SpamAssassin and on the wiki related to file locking. You might find
> your answer there.
> 
> {^_^}

Hi,

Thanks, but I'm not using any flat files or DBM-based databases; it's 
all MySQL-based.

Regards,

Rick



Re: Strange problem

Posted by jdow <jd...@earthlink.net>.
From: "Rick Macdougall" <ri...@ummm-beer.com>

> Hi,
> 
> I'm having a strange problem on one of my spamd servers since upgrading 
> to 3.1.3.
> 
> After a while under heavy load, children are not exiting, i.e. the log shows 
> BBBIII, and eventually it's all BBBBBBBBBBBBB.
> 
> I've started hupping the server every night but I do not have the same 
> problem on another server.
> 
> Server with the problem
> 
> Fedora core 4 diskless server
> 3 gigs of ram (never more than around 1 gig used, no swap)
> auto-learn is off
> 
> 
> 
> Server without the problem
> 
> FreeBSD 4.8
> 1.5 gigs of ram (and swap)
> auto-learn is enabled.
> 
> Both servers have exactly the same config except for the auto-learn and 
> bayes/user prefs are stored in mysql on the FreeBSD server.
> 
> Anyone else seeing anything like this ?

I've not seen it but I have a hunch you are seeing a file locking
problem. I think there are notes in the documentation loaded with
SpamAssassin and on the wiki related to file locking. You might find
your answer there.

{^_^}