Posted to users@spamassassin.apache.org by Michał Jęczalik <mi...@jeczalik.com> on 2007/03/12 13:33:22 UTC

Do you experience problems with 3.1.8?

Hello,

after upgrading from 3.1.7 I have numerous problems with my spamd. It
hangs under high load and becomes permanently unresponsive. Following
advice I found on the devel list, I'm now using --round-robin and it
hangs less often. But now I have a lot of
~/.spamassassin/bayes_toks.expire[pid] lockfiles that don't disappear and
quickly eat up the user's quota. Interestingly, on another host with
similar load everything works OK. Anyway - am I the only one
experiencing these problems? There's no word of it on the devel list and
none here - what's wrong? :) As things stand, 3.1.8 is quite
unusable for me and I'm thinking about downgrading. The only reason I have
not done it already is that I'm not sure it is a simple task - my
users won't stand another spamassassin blackout after the numerous spam
floods caused by those hang-ups in the past couple of days. ;-)
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04


Re: Do you experience problems with 3.1.8?

Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Mar 12, 2007 at 01:33:22PM +0100, Michał Jęczalik wrote:
> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear and 

Those aren't lock files, those are databases, which means that something is
killing SA before the expire finishes.
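
A minimal sketch of how the leftover files could be listed, assuming (as the
bayes_toks.expire[pid] naming quoted above suggests) that the numeric suffix
is the pid of the process that created the file. The function names are
hypothetical and nothing here is part of SpamAssassin. It only reports files
whose pid no longer exists and deletes nothing.

```python
import glob
import os
import re


def pid_is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def orphaned_expire_files(sa_dir, is_alive=pid_is_alive):
    """List bayes_toks.expire<pid> files in sa_dir whose pid no longer exists."""
    orphans = []
    for path in sorted(glob.glob(os.path.join(sa_dir, "bayes_toks.expire*"))):
        # Assumption: the file name ends in the pid of the expiring process.
        match = re.search(r"\.expire(\d+)$", path)
        if match and not is_alive(int(match.group(1))):
            orphans.append(path)
    return orphans
```

Calling orphaned_expire_files(os.path.expanduser("~/.spamassassin")) would
list the candidates for one user. Pids get reused, so a file can look "alive"
when its original process is long gone; treat the result as a hint only.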

> I invoke spamd with inetd and spamc with procmail, but the problem is in spamd itself.

Huh?  spamd is a daemon, it doesn't run from inetd.

-- 
Randomly Selected Tagline:
Chess is a moving experience.

Re: Do you experience problems with 3.1.8?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Michal Jeczalik wrote:
> On Mon, 12 Mar 2007, Daryl C. W. O'Shea wrote:
> 
>>> after upgrading from 3.1.7 I have numerous problems with my spamd. It 
>>> hangs up during high load and become permamently unresponsive. 
>>> According to advices I have found on devel list, I'm using 
>>> --round-robin now and it hangs less often. But now I have a lot of 
>>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't 
>>> disappear and quickly foul user's quota. It's interesting that on 
>>> another host with similar load conditions everything works ok. Anyway 
>>> - am I the only one experiencing these problems? There's no rumour on 
>>> the devel list, there's no rumour here - what's wrong? :) In this 
>>> situation 3.1.8 is quite unusable for me and I'm thinking about 
>>> downgrade. The only reason I have not done it already is that I'm not 
>>> sure if this is a simple task - my users won't stand another 
>>> spamassassin blackout, after numerous spam floods due to those 
>>> hang-ups in past couple of days. ;-)
>>
>> This has nothing to do with 3.1.8 specifically.  The same thing would 
>> happen with 3.1.7.  Reverting to an earlier SA version will do nothing 
>> for you.
>>
>> spamd isn't "hanging up", it's doing bayes expiries, as you can tell 
>> from having the bayes_toks.expire* lock files left after you kill off 
>> the child process(es) doing the expiry.  Since you're killing off the 
>> expiries before they complete, this will (of course) keep happening.
>>
>> If your system is too loaded to deal with bayes auto expiries, disable 
>> bayes_auto_expire and then schedule them to be done via a cron job 
>> using sa-learn --force-expire -u username.
> 
> OK, I'll try disabling autoexpire, but the fact is that I had no 
> problems with 3.1.7.

No changes were made to how expiries are done between 3.1.7 (and a lot 
further back) and 3.1.8.  It's most likely just that as time has gone on 
more of your users' databases have become ready for expiry, whereas 
before expiries were less frequent and thus manageable.

Daryl


Re: Do you experience problems with 3.1.8?

Posted by Michal Jeczalik <mi...@jeczalik.com>.
On Mon, 12 Mar 2007, Daryl C. W. O'Shea wrote:

>> after upgrading from 3.1.7 I have numerous problems with my spamd. It hangs 
>> up during high load and become permamently unresponsive. According to 
>> advices I have found on devel list, I'm using --round-robin now and it 
>> hangs less often. But now I have a lot of 
>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear and 
>> quickly foul user's quota. It's interesting that on another host with 
>> similar load conditions everything works ok. Anyway - am I the only one 
>> experiencing these problems? There's no rumour on the devel list, there's 
>> no rumour here - what's wrong? :) In this situation 3.1.8 is quite unusable 
>> for me and I'm thinking about downgrade. The only reason I have not done it 
>> already is that I'm not sure if this is a simple task - my users won't 
>> stand another spamassassin blackout, after numerous spam floods due to 
>> those hang-ups in past couple of days. ;-)
>
> This has nothing to do with 3.1.8 specifically.  The same thing would happen 
> with 3.1.7.  Reverting to an earlier SA version will do nothing for you.
>
> spamd isn't "hanging up", it's doing bayes expiries, as you can tell from 
> having the bayes_toks.expire* lock files left after you kill off the child 
> process(es) doing the expiry.  Since you're killing off the expiries before 
> they complete, this will (of course) keep happening.
>
> If your system is too loaded to deal with bayes auto expiries, disable 
> bayes_auto_expire and then schedule them to be done via a cron job using 
> sa-learn --force-expire -u username.

OK, I'll try disabling autoexpire, but the fact is that I had no problems 
with 3.1.7.
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04


Re: Do you experience problems with 3.1.8?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Michal Jeczalik wrote:
> On Mon, 12 Mar 2007, Daryl C. W. O'Shea wrote:
> 
>>> after upgrading from 3.1.7 I have numerous problems with my spamd. It 
>>> hangs up during high load and become permamently unresponsive. 
>>> According to advices I have found on devel list, I'm using 
>>> --round-robin now and it hangs less often. But now I have a lot of 
>>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't 
>>> disappear and quickly foul user's quota. It's interesting that on 
>>> another host with similar load conditions everything works ok. Anyway 
>>> - am I the only one experiencing these problems? There's no rumour on 
>>> the devel list, there's no rumour here - what's wrong? :) In this 
>>> situation 3.1.8 is quite unusable for me and I'm thinking about 
>>> downgrade. The only reason I have not done it already is that I'm not 
>>> sure if this is a simple task - my users won't stand another 
>>> spamassassin blackout, after numerous spam floods due to those 
>>> hang-ups in past couple of days. ;-)
>>
>> This has nothing to do with 3.1.8 specifically.  The same thing would 
>> happen with 3.1.7.  Reverting to an earlier SA version will do nothing 
>> for you.
>>
>> spamd isn't "hanging up", it's doing bayes expiries, as you can tell 
>> from having the bayes_toks.expire* lock files left after you kill off 
>> the child process(es) doing the expiry.  Since you're killing off the 
>> expiries before they complete, this will (of course) keep happening.
>>
>> If your system is too loaded to deal with bayes auto expiries, disable 
>> bayes_auto_expire and then schedule them to be done via a cron job 
>> using sa-learn --force-expire -u username.
> 
> BTW - if it hangs up, it hangs up *completely* until I restart it. If it 
> goes down at midnight, then spamd is unresposive until 8am when I get up 
> and do something. There are no log messages during this period. It's 
> *dead* in the full meaning of this word. :) So I'm not so sure as you 
> that it's only a matter of auto expire - would a single autoexpire task 
> lock up a frontend process for so long?!

If it's as busy as you said it was, "hangs up during high load", and 
all/most of the children are trying to do expiries, it could take months 
to complete -- especially if you don't have the physical memory to do it 
(read: a whole lot of RAM if multiple expiries are happening).

Disable auto expiry, do serialized expiries via cron, and see if the 
problem stops.  Actually, you don't even need to do the expiries to stop 
the problem, just disable auto expiries.  If spamd stops "hanging" then 
it's the auto expiries causing the problem.

Experience tells me that if the spamd children are actually using CPU 
time and they're not spewing errors all over your syslog, then it's an 
expiry issue.


Daryl
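
A rough sketch of the "serialized expiries via cron" idea from the message
above, assuming auto expiry has already been turned off with
"bayes_auto_expire 0" in the SpamAssassin configuration. The command line is
the one quoted in the thread (sa-learn --force-expire -u username). The
function names and the way the user list is supplied are hypothetical.

```python
import subprocess


def expire_command(username):
    """Build the sa-learn command line quoted in the thread for one user."""
    return ["sa-learn", "--force-expire", "-u", username]


def expire_serially(usernames, run=subprocess.call):
    """Expire each user's Bayes database in turn.

    Returns the list of users for which the command exited non-zero.
    """
    failed = []
    for name in usernames:
        # run() blocks until sa-learn exits, so only one expiry is ever
        # active at a time, however many users are listed.
        if run(expire_command(name)) != 0:
            failed.append(name)
    return failed
```

A small wrapper started from cron at a quiet hour could call
expire_serially(["alice", "bob"]) (placeholder names) and log whatever comes
back. Running the users one after another is the point: memory and disk load
stay at the level of a single expiry.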

Re: Do you experience problems with 3.1.8?

Posted by Michal Jeczalik <mi...@jeczalik.com>.
On Mon, 12 Mar 2007, Daryl C. W. O'Shea wrote:

>> after upgrading from 3.1.7 I have numerous problems with my spamd. It hangs 
>> up during high load and become permamently unresponsive. According to 
>> advices I have found on devel list, I'm using --round-robin now and it 
>> hangs less often. But now I have a lot of 
>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear and 
>> quickly foul user's quota. It's interesting that on another host with 
>> similar load conditions everything works ok. Anyway - am I the only one 
>> experiencing these problems? There's no rumour on the devel list, there's 
>> no rumour here - what's wrong? :) In this situation 3.1.8 is quite unusable 
>> for me and I'm thinking about downgrade. The only reason I have not done it 
>> already is that I'm not sure if this is a simple task - my users won't 
>> stand another spamassassin blackout, after numerous spam floods due to 
>> those hang-ups in past couple of days. ;-)
>
> This has nothing to do with 3.1.8 specifically.  The same thing would happen 
> with 3.1.7.  Reverting to an earlier SA version will do nothing for you.
>
> spamd isn't "hanging up", it's doing bayes expiries, as you can tell from 
> having the bayes_toks.expire* lock files left after you kill off the child 
> process(es) doing the expiry.  Since you're killing off the expiries before 
> they complete, this will (of course) keep happening.
>
> If your system is too loaded to deal with bayes auto expiries, disable 
> bayes_auto_expire and then schedule them to be done via a cron job using 
> sa-learn --force-expire -u username.

BTW - if it hangs up, it hangs up *completely* until I restart it. If it 
goes down at midnight, then spamd is unresponsive until 8am when I get up 
and do something. There are no log messages during this period. It's 
*dead* in the full meaning of the word. :) So I'm not as sure as you are 
that it's only a matter of auto expire - would a single autoexpire task 
lock up a frontend process for so long?!
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04


RE: Do you experience problems with 3.1.8?

Posted by Dave Koontz <dk...@mbc.edu>.
Oddly enough, I did have a similar problem when I first upgraded to 3.1.8.
What I noticed was a permissions failure message at the end of the
expiry cycle.  Same thing with a sa-learn --force sync.  I went back to
3.1.7 and everything worked as expected.  My second upgrade to 3.1.8 failed
the first time during expiry, but has worked flawlessly ever since.  I am
wondering if perhaps there was something in the bayes.db that 3.1.8 didn't
like.  I was going to report it until it started working automagically....
<g>


-----Original Message-----
From: Daryl C. W. O'Shea [mailto:spamassassin@dostech.ca] 
Sent: Monday, March 12, 2007 5:30 PM
To: Michal Jeczalik
Cc: users@spamassassin.apache.org
Subject: Re: Do you experience problems with 3.1.8?

Michał Jęczalik wrote:
>     Hello,
> 
> after upgrading from 3.1.7 I have numerous problems with my spamd. It 
> hangs up during high load and become permamently unresponsive. 
> According to advices I have found on devel list, I'm using 
> --round-robin now and it hangs less often. But now I have a lot of 
> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear 
> and quickly foul user's quota. It's interesting that on another host 
> with similar load conditions everything works ok. Anyway - am I the 
> only one experiencing these problems? There's no rumour on the devel 
> list, there's no rumour here - what's wrong? :) In this situation 
> 3.1.8 is quite unusable for me and I'm thinking about downgrade. The 
> only reason I have not done it already is that I'm not sure if this is 
> a simple task
> - my users won't stand another spamassassin blackout, after numerous 
> spam floods due to those hang-ups in past couple of days. ;-)

This has nothing to do with 3.1.8 specifically.  The same thing would happen
with 3.1.7.  Reverting to an earlier SA version will do nothing for you.

spamd isn't "hanging up", it's doing bayes expiries, as you can tell from
having the bayes_toks.expire* lock files left after you kill off the child
process(es) doing the expiry.  Since you're killing off the expiries before
they complete, this will (of course) keep happening.

If your system is too loaded to deal with bayes auto expiries, disable
bayes_auto_expire and then schedule them to be done via a cron job using
sa-learn --force-expire -u username.


Daryl



Re: Do you experience problems with 3.1.8?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Michał Jęczalik wrote:
>     Hello,
> 
> after upgrading from 3.1.7 I have numerous problems with my spamd. It 
> hangs up during high load and become permamently unresponsive. According 
> to advices I have found on devel list, I'm using --round-robin now and 
> it hangs less often. But now I have a lot of 
> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear 
> and quickly foul user's quota. It's interesting that on another host 
> with similar load conditions everything works ok. Anyway - am I the only 
> one experiencing these problems? There's no rumour on the devel list, 
> there's no rumour here - what's wrong? :) In this situation 3.1.8 is 
> quite unusable for me and I'm thinking about downgrade. The only reason 
> I have not done it already is that I'm not sure if this is a simple task 
> - my users won't stand another spamassassin blackout, after numerous 
> spam floods due to those hang-ups in past couple of days. ;-)

This has nothing to do with 3.1.8 specifically.  The same thing would 
happen with 3.1.7.  Reverting to an earlier SA version will do nothing 
for you.

spamd isn't "hanging up", it's doing bayes expiries, as you can tell 
from having the bayes_toks.expire* lock files left after you kill off 
the child process(es) doing the expiry.  Since you're killing off the 
expiries before they complete, this will (of course) keep happening.

If your system is too loaded to deal with bayes auto expiries, disable 
bayes_auto_expire and then schedule them to be done via a cron job using 
sa-learn --force-expire -u username.


Daryl

Re: Do you experience problems with 3.1.8?

Posted by maillist <ma...@emailacs.com>.
Michał Jęczalik wrote:
> On Mon, 12 Mar 2007, maillist wrote:
>
>> Michał Jęczalik wrote:
>>>     Hello,
>>>
>>> after upgrading from 3.1.7 I have numerous problems with my spamd. 
>>> It hangs up during high load and become permamently unresponsive. 
>>> According to advices I have found on devel list, I'm using 
>>> --round-robin now and it hangs less often. But now I have a lot of 
>>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't 
>>> disappear and quickly foul user's quota. It's interesting that on 
>>> another host with similar load conditions everything works ok. 
>>> Anyway - am I the only one experiencing these problems? There's no 
>>> rumour on the devel list, there's no rumour here - what's wrong? :) 
>>> In this situation 3.1.8 is quite unusable for me and I'm thinking 
>>> about downgrade. The only reason I have not done it already is that 
>>> I'm not sure if this is a simple task - my users won't stand another 
>>> spamassassin blackout, after numerous spam floods due to those 
>>> hang-ups in past couple of days. ;-)
>> How did you upgrade?
>
> perl Makefile.PL etc ;-)
>
>> What OS?
>
> Linux 2.4
>
>> What MDA?
>
> It is completly unrelated to MDA. I invoke spamd with inetd and spamc 
> with procmail, but the problem is in spamd itself. Probably one could 
> repeat it with feeding messages manually to spamc. As far as I read 
> the devel list, guys out there are aware of this problem, but they 
> seem to be satisfied with the temporary (?) solution of --round-robin 
> so far. But it doesn't fix the problem, it just seems to decrease 
> intensivity.
>
> Oh, I've just noticed it died again. Well, killall spamd... ;-)
>
>> When you say "hangs" what do you mean?
>
> This is what I mean:
>
>  5707 ?        Ss     0:02 /usr/bin/perl -T -w /usr/bin/spamd 
> --max-children=14 --round-robin
>  5805 ?        R     58:05 spamd child
>  5826 ?        S      3:10 spamd child
>  5851 ?        R     31:03 spamd child
>  5862 ?        R     26:19 spamd child
>  5873 ?        R     26:11 spamd child
>  5882 ?        R     26:09 spamd child
> 15341 ?        R     18:15 spamd child
> 17651 ?        R     16:09 spamd child
> 22972 ?        R     16:16 spamd child
>  9744 ?        R     10:47 spamd child
> 14581 ?        S      1:37 spamd child
> 18379 ?        R     10:18 spamd child
> 21493 ?        R      7:21 spamd child
> 24789 ?        R      6:43 spamd child
>
> And a nice bunch of spamc - some probably hung up waiting for output 
> from spamd, and some continously trying to connect and feed incoming 
> mails (and giving up after some retries, passing the message 
> spam-uncredited).
>
> A last sane response of every spamd's child is "processing message ...".
make uninstall
perl Makefile.PL etc ;-)

Sorry man, I'm stumped.  It just seems like it must be an issue with the 
upgrade.

-=Aubrey=-

Re: Do you experience problems with 3.1.8?

Posted by Michał Jęczalik <mi...@jeczalik.com>.
On Mon, 12 Mar 2007, maillist wrote:

> Michał Jęczalik wrote:
>>     Hello,
>> 
>> after upgrading from 3.1.7 I have numerous problems with my spamd. It hangs 
>> up during high load and become permamently unresponsive. According to 
>> advices I have found on devel list, I'm using --round-robin now and it 
>> hangs less often. But now I have a lot of 
>> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear and 
>> quickly foul user's quota. It's interesting that on another host with 
>> similar load conditions everything works ok. Anyway - am I the only one 
>> experiencing these problems? There's no rumour on the devel list, there's 
>> no rumour here - what's wrong? :) In this situation 3.1.8 is quite unusable 
>> for me and I'm thinking about downgrade. The only reason I have not done it 
>> already is that I'm not sure if this is a simple task - my users won't 
>> stand another spamassassin blackout, after numerous spam floods due to 
>> those hang-ups in past couple of days. ;-)
> How did you upgrade?

perl Makefile.PL etc ;-)

> What OS?

Linux 2.4

> What MDA?

It is completely unrelated to the MDA. I invoke spamd with inetd and spamc 
with procmail, but the problem is in spamd itself. One could probably 
reproduce it by feeding messages manually to spamc. As far as I can tell 
from the devel list, the guys there are aware of this problem, but so far 
they seem to be satisfied with the temporary (?) solution of --round-robin. 
But it doesn't fix the problem, it just seems to make it happen less often.

Oh, I've just noticed it died again. Well, killall spamd... ;-)

> When you say "hangs" what do you mean?

This is what I mean:

 5707 ?        Ss     0:02 /usr/bin/perl -T -w /usr/bin/spamd --max-children=14 --round-robin
 5805 ?        R     58:05 spamd child
 5826 ?        S      3:10 spamd child
 5851 ?        R     31:03 spamd child
 5862 ?        R     26:19 spamd child
 5873 ?        R     26:11 spamd child
 5882 ?        R     26:09 spamd child
15341 ?        R     18:15 spamd child
17651 ?        R     16:09 spamd child
22972 ?        R     16:16 spamd child
 9744 ?        R     10:47 spamd child
14581 ?        S      1:37 spamd child
18379 ?        R     10:18 spamd child
21493 ?        R      7:21 spamd child
24789 ?        R      6:43 spamd child

And a nice bunch of spamc processes - some probably hung waiting for output 
from spamd, and some continuously trying to connect and feed incoming mail 
(and giving up after some retries, passing the message on unscanned).

The last sane response from every spamd child is "processing message ...".
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04
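
The process listing in the message above can be summarized with a few lines
of Python. This is only a sketch: it assumes the five-column layout shown in
the listing (PID, TTY, STAT, TIME, COMMAND), and summarize_children is a
hypothetical helper name.

```python
LISTING = """\
 5707 ?        Ss     0:02 /usr/bin/perl -T -w /usr/bin/spamd --max-children=14 --round-robin
 5805 ?        R     58:05 spamd child
 5826 ?        S      3:10 spamd child
 5851 ?        R     31:03 spamd child
 5862 ?        R     26:19 spamd child
 5873 ?        R     26:11 spamd child
 5882 ?        R     26:09 spamd child
15341 ?        R     18:15 spamd child
17651 ?        R     16:09 spamd child
22972 ?        R     16:16 spamd child
 9744 ?        R     10:47 spamd child
14581 ?        S      1:37 spamd child
18379 ?        R     10:18 spamd child
21493 ?        R      7:21 spamd child
24789 ?        R      6:43 spamd child
"""


def summarize_children(ps_output):
    """Count "spamd child" processes per process state (first STAT letter)."""
    counts = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # Columns: PID TTY STAT TIME COMMAND...; skip the parent spamd line.
        if len(fields) >= 6 and fields[4:6] == ["spamd", "child"]:
            state = fields[2][0]
            counts[state] = counts.get(state, 0) + 1
    return counts


print(summarize_children(LISTING))  # {'R': 12, 'S': 2}
```

On this listing, 12 children are in state R (running) and 2 in state S
(sleeping), so all 14 slots allowed by --max-children=14 are occupied and
most of them are burning CPU time.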


Re: Do you experience problems with 3.1.8?

Posted by maillist <ma...@emailacs.com>.
Michał Jęczalik wrote:
>     Hello,
>
> after upgrading from 3.1.7 I have numerous problems with my spamd. It 
> hangs up during high load and become permamently unresponsive. 
> According to advices I have found on devel list, I'm using 
> --round-robin now and it hangs less often. But now I have a lot of 
> ~/.spamassassin/bayes_toks.expire[pid] lockfiles, that don't disappear 
> and quickly foul user's quota. It's interesting that on another host 
> with similar load conditions everything works ok. Anyway - am I the 
> only one experiencing these problems? There's no rumour on the devel 
> list, there's no rumour here - what's wrong? :) In this situation 
> 3.1.8 is quite unusable for me and I'm thinking about downgrade. The 
> only reason I have not done it already is that I'm not sure if this is 
> a simple task - my users won't stand another spamassassin blackout, 
> after numerous spam floods due to those hang-ups in past couple of 
> days. ;-)
How did you upgrade?
What OS?
What MDA?
When you say "hangs" what do you mean?