Posted to users@spamassassin.apache.org by Kris Deugau <kd...@vianet.ca> on 2007/06/04 19:46:09 UTC

sa-learn not remembering messages it's learned

I have a pair of servers with global Bayes DBs that no longer remember
which messages they've learned, for no reason I can see.  I can't think
of any changes I've made that might cause this.

A third machine with per-user Bayes seems to learn fine for at least one
user account (mine - I have a cron job that learns a newspam folder
every night).  I typically leave messages in a newspam folder until I've
accumulated about 100 before deleting them;  recently that's been taking
several weeks.  Learning via sa-learn is done manually on the
misbehaving systems, usually once a day or every other day.  (Aside from
cron-based learning, this applies to all three systems.)

Comparing debug output between the three machines shows no obvious
processing differences aside from the different system names and Bayes
pathnames.

All three machines are running SA 3.1.8, from the same package repository
(RPMForge), although the system that's working is currently running
CentOS 3 where the other two are currently on CentOS 4.  (They've all
been moved from one physical box to another and had services migrated to
new OS installs, versions, and distributions, all more than once.  Other
small ISP admins are no doubt familiar with that dance.  <g>)

I thought it might be something sa-update had pulled in, but removing
the update files on one misbehaving system didn't change anything.

bayes_learn_to_journal is set on the two misbehaving systems, but it's
been set since I originally upgraded from SA2.44 to 2.54 on both (IIRC).
Automatic expiry is disabled, and both systems have a daily cron-based
expiry run instead (again, set that way since SA-with-Bayes was
originally set up on these machines, and working since then).  Autolearn
is also enabled and tweaked (lower threshold set at -0.1);  it's the only
way I can really gather much ham on these systems.  :/
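
For reference, the Bayes settings described above would look roughly
like this.  This is a sketch reconstructed from the description, not a
copy of the real config: only the -0.1 threshold is an exact value, and
the other lines are assumed from "set", "disabled", and "enabled".

```
# Sketch only; assumed to live in local.cf or similar.
use_bayes                               1
bayes_learn_to_journal                  1     # learn into the journal, sync later
bayes_auto_expire                       0     # no opportunistic expiry
bayes_auto_learn                        1
bayes_auto_learn_threshold_nonspam      -0.1  # the tweaked lower threshold

# The daily cron-based expiry would then be something along the lines of
#   sa-learn --force-expire
# run as the user that owns the global Bayes DB.
```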

Odd not-remembering-what-we've-learned issue aside, learning (automatic
and manual) seems to happen - single messages run through SA again after
learning show different Bayes scores.

Any suggestions on what to look for in the debug output?

Anyone want to wade through pages and pages of debug output?  <g>

-kgd

Re: sa-learn not remembering messages it's learned

Posted by Kris Deugau <kd...@vianet.ca>.
Anyone?  This isn't urgent or critical, but it's bugging me because it 
changed from "working" to "not working" spontaneously.