Posted to users@spamassassin.apache.org by Brett Millett <bm...@localmatters.com> on 2008/08/01 23:28:47 UTC

autolearn=yes but sa-learn dump magic shows no new spam

Hi,

I've been googling quite a bit today to explain what I'm seeing on my
mail server, but I can't find a definitive answer. My mail logs show a
number of autolearn=spam entries, yet when I run "sa-learn --dump magic"
nspam does not increment. If I run sa-learn manually, nspam increments.
Is this normal, or should each autolearn=spam increment nspam by one?

Here is my local.cf file as it pertains to bayes and autolearn:

use_bayes 1
bayes_auto_learn 1
bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 5.0
use_auto_whitelist 1
bayes_use_hapaxes 1
bayes_min_ham_num 150
bayes_min_spam_num 150
score BAYES_00 -3
score BAYES_05 -1
score BAYES_95 6
score BAYES_99 9
score BAYES_20 -0.8
score BAYES_40 0
score BAYES_50 1.567
score BAYES_60 3.515
score BAYES_80 3.608

Thanks,

Brett

RE: autolearn=yes but sa-learn dump magic shows no new spam

Posted by Brett Millett <bm...@localmatters.com>.
Great guess! I was running as root before (sudo.) Here are the results when I run the command as the site-wide user.

sa-learn --dump magic

0.000          0          3          0  non-token data: bayes db version
0.000          0        329          0  non-token data: nspam
0.000          0      42903          0  non-token data: nham
0.000          0     158973          0  non-token data: ntokens
0.000          0 1204946608          0  non-token data: oldest atime
0.000          0 1205337646          0  non-token data: newest atime
0.000          0 1217631298          0  non-token data: last journal sync atime
0.000          0 1205335826          0  non-token data: last expiry atime
0.000          0     417421          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
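
To compare the counters before and after a message is logged with autolearn=spam, it may help to pull just nspam and nham out of the dump. This is only a rough sketch: bayes_counts is a made-up helper name, and it assumes the column layout shown above (count in the third field, label in the last field).

```shell
# Hypothetical helper: print "nspam=<n> nham=<n>" from the output of
# `sa-learn --dump magic` read on stdin.
# Assumption: the count is the third whitespace-separated field and the
# label (nspam / nham) is the last field, as in the dump above.
bayes_counts() {
    awk '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END { printf "nspam=%s nham=%s\n", nspam, nham }
    '
}
```

Usage would be something like "sa-learn --dump magic | bayes_counts", run as the same user each time so that both runs read the same database.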

Thanks!

Also, I'm just using the flat files (non-sql.)

I'm still confused, though. Running "ps aux | grep spam" reveals:

/usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir -x --virtual-config-dir=/etc/mail/spamassassin --username spamassassin -d --pidfile=/var/run/spamd.pid
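
Since the user and directory spamd runs with are what decide which bayes files get written, one way to avoid guessing is to read them off the command line. This is a minimal sketch only: spamd_opt is a made-up helper name, and it handles only long options written as "--opt value" or "--opt=value" (it would misread an option whose argument is optional, such as --helper-home-dir above).

```shell
# Hypothetical helper: print the value of a long option from a spamd
# command line.
#   $1 = option name without the leading dashes (e.g. username)
#   $2 = the full command line as one string
# Handles "--opt value" and "--opt=value"; prints nothing if not found.
spamd_opt() {
    printf '%s\n' "$2" | awk -v opt="--$1" '
        {
            for (i = 1; i <= NF; i++) {
                # Form: --opt value
                if ($i == opt && i < NF) { print $(i + 1); exit }
                # Form: --opt=value
                if (index($i, opt "=") == 1) {
                    print substr($i, length(opt) + 2); exit
                }
            }
        }'
}
```

For example, with the pidfile path from the line above, cmd=$(ps -o args= -p "$(cat /var/run/spamd.pid)") followed by spamd_opt username "$cmd" should give the user to run "sa-learn --dump magic" as (for instance via sudo -H -u), and spamd_opt virtual-config-dir "$cmd" should give the configured directory.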

Of course, my manual sa-learn runs were altering the bayes files under /etc/mail/spamassassin, whereas the bayes files spamd was writing to were under /home/spamassassin/.spamassassin. So I changed the home directory for the spamassassin user to /etc/spamassassin. That created a hidden directory, .spamassassin, under /etc/spamassassin, and "sa-learn --dump magic" then gave errors (because spamd is still writing to bayes files just one directory up). So I made symlinks in .spamassassin pointing to the same bayes files one directory up, and everything seems to be working.

My question is: is that the best way to do it? Did I miss something?

Thanks to all who have helped.

Brett

-----Original Message-----
From: Karsten Bräckelmann [mailto:guenther@rudersport.de] 
Sent: Friday, August 01, 2008 3:59 PM
To: users@spamassassin.apache.org
Subject: Re: autolearn=yes but sa-learn dump magic shows no new spam

On Fri, 2008-08-01 at 15:28 -0600, Brett Millett wrote:
> Hi,
> 
> I've been googling quite a bit today to find the answer to what I'm
> seeing that is happening on my mail server. However, I just can't seem
> to find a definitive answer. When looking at my mail logs I see a number
> of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
> not increment. If I run sa-learn manually, nspam increments. Is this
> normal or should each autolearn=spam indicate that nspam should
> increment by one.

Is the site-wide spamassassin user, or the user on whose behalf
spamassassin is called, not the user you are running sa-learn --dump
magic as?

Just a guess. :)

  guenther


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: autolearn=yes but sa-learn dump magic shows no new spam

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2008-08-01 at 15:28 -0600, Brett Millett wrote:
> Hi,
> 
> I've been googling quite a bit today to find the answer to what I'm
> seeing that is happening on my mail server. However, I just can't seem
> to find a definitive answer. When looking at my mail logs I see a number
> of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
> not increment. If I run sa-learn manually, nspam increments. Is this
> normal or should each autolearn=spam indicate that nspam should
> increment by one.

Is the site-wide spamassassin user, or the user on whose behalf
spamassassin is called, not the user you are running sa-learn --dump
magic as?

Just a guess. :)

  guenther


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


RE: autolearn=yes but sa-learn dump magic shows no new spam

Posted by Brett Millett <bm...@localmatters.com>.
As you can see from the response I just posted, I'm not using MySQL for
bayes (although maybe I should be; that seems very convenient).

>You do not need to include the X-Spam-* header fields as they are 
>stripped before learning.

Thanks. I'll pull those out.

-----Original Message-----
From: Duane Hill [mailto:d.hill@yournetplus.com] 
Sent: Friday, August 01, 2008 4:27 PM
To: users@spamassassin.apache.org
Subject: Re: autolearn=yes but sa-learn dump magic shows no new spam

On Fri, 1 Aug 2008, Brett Millett wrote:

> I've been googling quite a bit today to find the answer to what I'm
> seeing that is happening on my mail server. However, I just can't seem
> to find a definitive answer. When looking at my mail logs I see a number
> of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
> not increment. If I run sa-learn manually, nspam increments. Is this
> normal or should each autolearn=spam indicate that nspam should
> increment by one.

I can only speculate this would have something to do with the user 
autolearn is running against. I'm going to assume you are using MySQL as 
you did not state. Have you tried going into MySQL and:

   select count(*) from bayes_vars;

to see how many usernames are in the table?

Perhaps you can post the startup parameters spamd is using. Also how spamc 
is being called (if that is the case). That may shed more light on why you 
are getting the results you are.

> Here is my local.cf file as it pertains to bayes and autolearn:
>
> use_bayes 1
> bayes_auto_learn 1
> bayes_ignore_header X-Bogosity
> bayes_ignore_header X-Spam-Flag
> bayes_ignore_header X-Spam-Status

You do not need to include the X-Spam-* header fields as they are 
stripped before learning.

> bayes_auto_learn_threshold_nonspam -0.1
> bayes_auto_learn_threshold_spam 5.0
> use_auto_whitelist 1
> bayes_use_hapaxes 1
> bayes_min_ham_num 150
> bayes_min_spam_num 150
> score BAYES_00 -3
> score BAYES_05 -1
> score BAYES_95 6
> score BAYES_99 9
> score BAYES_20 -0.8
> score BAYES_40 0
> score BAYES_50 1.567
> score BAYES_60 3.515
> score BAYES_80 3.608

-d

Re: autolearn=yes but sa-learn dump magic shows no new spam

Posted by Duane Hill <d....@yournetplus.com>.
On Fri, 1 Aug 2008, Brett Millett wrote:

> I've been googling quite a bit today to find the answer to what I'm
> seeing that is happening on my mail server. However, I just can't seem
> to find a definitive answer. When looking at my mail logs I see a number
> of autolearn=spam, however when I run "sa-learn --dump magic" nspam does
> not increment. If I run sa-learn manually, nspam increments. Is this
> normal or should each autolearn=spam indicate that nspam should
> increment by one.

I can only speculate this would have something to do with the user 
autolearn is running against. I'm going to assume you are using MySQL as 
you did not state. Have you tried going into MySQL and:

   select count(*) from bayes_vars;

to see how many usernames are in the table?

Perhaps you can post the startup parameters spamd is using. Also how spamc 
is being called (if that is the case). That may shed more light on why you 
are getting the results you are.

> Here is my local.cf file as it pertains to bayes and autolearn:
>
> use_bayes 1
> bayes_auto_learn 1
> bayes_ignore_header X-Bogosity
> bayes_ignore_header X-Spam-Flag
> bayes_ignore_header X-Spam-Status

You do not need to include the X-Spam-* header fields as they are 
stripped before learning.

> bayes_auto_learn_threshold_nonspam -0.1
> bayes_auto_learn_threshold_spam 5.0
> use_auto_whitelist 1
> bayes_use_hapaxes 1
> bayes_min_ham_num 150
> bayes_min_spam_num 150
> score BAYES_00 -3
> score BAYES_05 -1
> score BAYES_95 6
> score BAYES_99 9
> score BAYES_20 -0.8
> score BAYES_40 0
> score BAYES_50 1.567
> score BAYES_60 3.515
> score BAYES_80 3.608

-d