Posted to users@spamassassin.apache.org by ma...@seaan.net on 2007/03/16 18:28:44 UTC

Re: per user bayes db: auto_expiry problem --->> SOLVED

Hi Folks,

it's been a while since I asked here how to solve Bayes timeout and spamd
child timeout problems.
Well, at least for our environment I have found a solution that seems to
work.
Also, I have a theory about the reason for these Bayes timeout and spamd
child timeout problems, and I'd like to know whether this theory is
correct.

Symptoms:
 child processing timeout at spamd line 1086, <GEN786> line 108.
 child processing timeout at spamd line 1086, <GEN73> line 209.
 ...
 bayes: child processing timeout at spamd line 1086.


Reason (timeout values that were too short):
spamc timeout set to 220s
spamd timeout set to 240s
procmail timeout set to 300s

First I did what everybody suggested: I disabled bayes_auto_expire in
local.cf and did the job manually per user. I wrote a script that
extracted from the maillog the users that had a scantime of more than 220s
and ran sa-learn -u $user --force-expire --sync for each of them. The
problem stayed unsolved. Then I changed the timeouts to values more than
twice as high.
Result: For nearly 2 days I had no more timeout errors. Then I checked
the logs once more and saw a lot of users with scantimes well above
300s but lower than the new values. Those were users who had never before
come up in my logs with such high scantimes. Then I basically
ran --force-expire --sync the whole day.
I realized that the manual force-expire job was not practical for 2700
users and a 2.5GB Bayes DB in MySQL (MyISAM engine). I also realized that
doing the --force-expire job manually would probably mess up some or most
of the users' Bayes DBs.
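
For anyone who wants to try the same manual approach, here is a rough
sketch of what such a script could look like. It is not the script I
actually used. It assumes the spamd "result:" lines in the maillog contain
comma-separated fields such as scantime=250.3,size=1024,user=alice (check
your own log format first), and the names slow_users and expire_commands
are made up for this example. It only prints the sa-learn commands
instead of running them.

```python
import re
import shlex
import sys

# Assumption: spamd "result:" log lines carry "scantime=<secs>,...,user=<name>".
# Adjust this pattern if your maillog looks different.
SCANTIME_RE = re.compile(
    r"scantime=(?P<secs>\d+(?:\.\d+)?),.*?\buser=(?P<user>[^,\s]+)"
)


def slow_users(log_lines, threshold=220.0):
    """Return the sorted, de-duplicated users whose scantime exceeded threshold."""
    users = set()
    for line in log_lines:
        match = SCANTIME_RE.search(line)
        if match and float(match.group("secs")) > threshold:
            users.add(match.group("user"))
    return sorted(users)


def expire_commands(users):
    """Build one 'sa-learn -u USER --force-expire --sync' command per user."""
    return [["sa-learn", "-u", user, "--force-expire", "--sync"] for user in users]


if __name__ == "__main__":
    # Usage: python slow_users.py /path/to/maillog
    with open(sys.argv[1], errors="replace") as log:
        for cmd in expire_commands(slow_users(log)):
            # Print only; review the list before running anything.
            print(shlex.join(cmd))
```

The threshold of 220 matches the spamc timeout above; with 2700 users the
printed list gets long quickly, which is exactly why this approach did not
scale for us.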

I changed back to bayes_auto_expire 1 in local.cf and restarted spamd.
This is what happened next for a number of users:
bayes: expire_old_tokens: child processing timeout at spamd line 1086,
This was on Tuesday, March 14. Since then I have had no more problems
with spamd child timeouts.


I have not looked into the spamd code, and I think I shouldn't, as I am
no Perl coder. Nevertheless I have a theory about why the short timeout
values could have such a heavy impact:

If the timeouts are too short, spamd under some circumstances cannot
finish the Bayes expire job if bayes_auto_expire is enabled in local.cf. I
hope I correctly understand the expire job as a database cleanup job.
Thus, if it can't be finished, it turns from a cleanup job into a mess-up
job; the problem gets worse, or at least stays as bad as it is.
Now I hope that by changing the MySQL engine from MyISAM to InnoDB, which
is capable of doing DB transactions and is suggested by the SpamAssassin
people in the Bayes manpages, the expire job gets finished even if spamd
suffers a timeout.

Your comments?

Philipp