Posted to users@spamassassin.apache.org by "@lbutlr" <kr...@kreme.com> on 2019/08/30 16:58:30 UTC

Many sa-learn processes getting stuck

I have a lot of processes that look like this:

root     48359 100.0  1.4  55984  47680  -  R    17:53      989:39.50 /usr/local/bin/perl -T -w /usr/local/bin/sa-learn --spam -u vscan /usr/local/virtual/kreme@kreme.com/Maildir/.Junk/cur/15670…  /usr/local/virtual/kreme@kreme.com/Maildir/.Junk/cur/15670201…  
[ 15 lines ]
/usr/local/virtual/kreme@kreme.com/Maildir/.Junk/cur/15670

I have a script in dovecot that feeds mails to sa-learn --spam when they are moved to the junk folder, but it is a script that is used by a lot of people, so I doubt the problem is there.

I also have other processes that hit a similar script that marks messages as ham when they are moved to the archives mailbox.
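One way to keep such hook scripts from piling up is to bound each sa-learn call with a wall-clock timeout. The following is only a sketch, not the script used here: the function name, the `base_cmd` parameter, and the 900-second default are hypothetical, and it assumes sa-learn is on the PATH and accepts message paths after `--spam` or `--ham`.

```python
import subprocess


def learn_with_timeout(paths, kind="spam", timeout=900, base_cmd=("sa-learn",)):
    """Run sa-learn on the given message files, giving up after `timeout` seconds.

    Returns the exit status of the command, or None if it was killed
    because it ran past the timeout. `kind` is "spam" or "ham".
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = [*base_cmd, "--" + kind, *paths]
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.returncode
```

A caller would treat a None result as "not learned" and log it or retry later, rather than leaving the process running indefinitely.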

FreeBSD is up to date, SA is up to date, postfix and dovecot are up to date, perl is up to date (5.28 branch).

When I run the command manually with -D (I've recently reset everything, thus the Bayes DB being light on content), I get the following:

Aug 30 10:44:37.624 [19164] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x8fe2c84)
Aug 30 10:44:37.624 [19164] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x88969a8) implements 'learner_is_scan_available', priority 0
Aug 30 10:44:37.624 [19164] dbg: bayes: tie-ing to DB file R/O /var/spool/spamd/.spamassassin/bayes_toks
Aug 30 10:44:37.624 [19164] dbg: bayes: tie-ing to DB file R/O /var/spool/spamd/.spamassassin/bayes_seen
Aug 30 10:44:37.625 [19164] dbg: bayes: found bayes db version 3
Aug 30 10:44:37.625 [19164] dbg: bayes: DB journal sync: last sync: 0
Aug 30 10:44:37.625 [19164] dbg: bayes: not available for scanning, only 97 ham(s) in bayes DB < 200
Aug 30 10:44:37.625 [19164] dbg: bayes: untie-ing
Aug 30 10:44:37.625 [19164] dbg: config: score set 1 chosen.
Aug 30 10:44:37.626 [19164] dbg: dns: EDNS, UDP payload size 4096
Aug 30 10:44:37.626 [19164] dbg: dns: servers obtained from Net::DNS : [127.0.0.1]:53, [9.9.9.9]:53
Aug 30 10:44:37.626 [19164] dbg: dns: nameservers set to 127.0.0.1, 9.9.9.9
Aug 30 10:44:37.626 [19164] dbg: dns: using socket module: IO::Socket::IP version 0.39
Aug 30 10:44:37.626 [19164] dbg: dns: is Net::DNS::Resolver available? yes
Aug 30 10:44:37.626 [19164] dbg: dns: Net::DNS version: 1.2
Aug 30 10:44:37.627 [19164] dbg: sa-learn: spamtest initialized
Aug 30 10:44:37.627 [19164] dbg: learn: initializing learner
Aug 30 10:44:37.627 [19164] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x88969a8) implements 'learner_sync', priority 0
Aug 30 10:44:37.627 [19164] dbg: bayes: bayes journal sync starting
Aug 30 10:44:37.627 [19164] dbg: bayes: bayes journal sync completed
Aug 30 10:44:37.627 [19164] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x88969a8) implements 'learner_expire_old_training', priority 0
Aug 30 10:44:37.627 [19164] dbg: bayes: expiry starting
Aug 30 10:44:37.627 [19164] dbg: locker: mode is 438
Aug 30 10:44:37.627 [19164] dbg: locker: safe_lock: created /var/spool/spamd/.spamassassin/bayes.mutex
Aug 30 10:44:37.627 [19164] dbg: locker: safe_lock: trying to get lock on /var/spool/spamd/.spamassassin/bayes with 300 timeout

(it does this again, then)

Aug 30 10:54:37.675 [19164] dbg: locker: safe_lock: timed out after 300 seconds
bayes: cannot open bayes databases /var/spool/spamd/.spamassassin/bayes_* R/W: lock failed: 
Learned tokens from 0 message(s) (1 message(s) examined)
Aug 30 10:54:37.676 [19164] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x88969a8) implements 'learner_close', priority 0
ERROR: the Bayes learn function returned an error, please re-run with -D for more information at /usr/local/bin/sa-learn line 500.
Aug 30 10:54:37.678 [19164] dbg: netset: cache trusted_networks hits/attempts: 0/1, 0.0 %

The running process never gives up (as you can see, it's been chugging along for a long time).

How can I see what is preventing the lock on the site-wide Bayes database?
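A quick way to check whether something is holding the lock right now is to try taking it without blocking. This is a minimal sketch that assumes the locker in use takes an flock(2)-style lock on the bayes.mutex file named in the debug output; the function name is hypothetical. It only reports whether the lock is held, not by whom. Finding the holder would need a tool that lists which processes have the file open, such as fstat(1) on FreeBSD or lsof.

```python
import fcntl
import os


def is_flock_held(path):
    """Return True if some other open file holds an flock(2) lock on `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # LOCK_NB makes flock fail at once instead of waiting for the lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        # We got the lock, so nobody else had it; release it straight away.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

Calling `is_flock_held("/var/spool/spamd/.spamassassin/bayes.mutex")` as a user who can read that file would show whether the lock is taken at that moment.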





-- 
They say whisky'll kill you, but I don't think it will I'm ridin' with
you to the top of the hill


Re: Many sa-learn processes getting stuck

Posted by "@lbutlr" <kr...@kreme.com>.
On 30 Aug 2019, at 12:32, @lbutlr <kr...@kreme.com> wrote:
> That is probably my error then. I removed the -Q flag manually and didn’t check -u since there is a scan user on the system.

Found the problem. It wasn’t SpamAssassin at all; it was an old crontab script that had somehow been re-enabled. I removed it from the crontab (instead of commenting it out) and killed all the processes, and things seem to be running OK now 🤞🏼.
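For anyone chasing the same symptom, the stuck processes can be picked out of `ps` output by their accumulated CPU time. This is a rough sketch with hypothetical function names; it assumes the column layout of the `ps` line quoted earlier in this thread (TIME in the tenth column, as colon-separated minutes and seconds) and a one-hour threshold chosen only for illustration.

```python
def cpu_seconds(field):
    """Convert a colon-separated ps TIME field such as '989:39.50' to seconds."""
    total = 0.0
    for part in field.split(":"):
        total = total * 60 + float(part)
    return total


def runaway_sa_learn(ps_lines, min_cpu_seconds=3600):
    """Return (pid, cpu_seconds) for sa-learn processes over the CPU threshold."""
    hits = []
    for line in ps_lines:
        # Ten leading columns, then the full command line as the last field.
        cols = line.split(None, 10)
        if len(cols) < 11 or "sa-learn" not in cols[10]:
            continue
        try:
            secs = cpu_seconds(cols[9])
        except ValueError:
            continue  # TIME column was not numeric, so skip the line
        if secs >= min_cpu_seconds:
            hits.append((int(cols[1]), secs))
    return hits
```

Feeding it the lines of `ps` output gives a list of PIDs to inspect before killing anything, which also makes it easier to trace their parent (here, cron) first.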



Re: Many sa-learn processes getting stuck

Posted by "@lbutlr" <kr...@kreme.com>.
On 30 Aug 2019, at 11:49, RW <rw...@googlemail.com> wrote:
> On Fri, 30 Aug 2019 10:58:30 -0600
> @lbutlr wrote:
> 
>> I have a lot of processes that look like this:
>> 
>> root     48359 100.0  1.4  55984  47680  -  R    17:53      989:39.50
>> /usr/local/bin/perl -T -w /usr/local/bin/sa-learn --spam -u vscan
> ...
>> /var/spool/spamd/.spamassassin/bayes_toks Aug 30 10:44:37.624 [19164]
>> dbg: bayes: tie-ing to DB file R/O
> 
> This looks a bit strange. The -u argument to sa-learn is supposed to be
> for SQL virtual users, but spamd is using a single Berkeley database.

That is probably my error then. I removed the -Q flag manually and didn’t check -u since there is a scan user on the system.

> bayes_toks is in the default location for the spamd user. Do you have
> that location in bayes_path? Otherwise sa-learn is probably looking
> under ~root.

It looks to be looking in /var/spool/spamd/.spamassassin according to the output of -D I posted above.

> IIWY I'd run sa-learn as the user spamd using su.

All the sa-learn processes are running as root currently, so I don’t think it’s a permission issue. I will remove the -u flag though.

> Also disable bayes_auto_expire, if you haven't already.

Will do.

-- 
Nobody puts one over on Fred C. Dobbs.

Re: Many sa-learn processes getting stuck

Posted by RW <rw...@googlemail.com>.
On Fri, 30 Aug 2019 10:58:30 -0600
@lbutlr wrote:

> I have a lot of processes that look like this:
> 
> root     48359 100.0  1.4  55984  47680  -  R    17:53      989:39.50
> /usr/local/bin/perl -T -w /usr/local/bin/sa-learn --spam -u vscan
...
> /var/spool/spamd/.spamassassin/bayes_toks Aug 30 10:44:37.624 [19164]
> dbg: bayes: tie-ing to DB file R/O

This looks a bit strange. The -u argument to sa-learn is supposed to be
for SQL virtual users, but spamd is using a single Berkeley database.
bayes_toks is in the default location for the spamd user. Do you have
that location in bayes_path? Otherwise sa-learn is probably looking
under ~root.

IIWY I'd run sa-learn as the user spamd using su.

Also disable bayes_auto_expire, if you haven't already.