Posted to users@spamassassin.apache.org by Raphael Clifford <ra...@clifford.net> on 2005/04/14 10:52:20 UTC

sa-learn doesn't learn

Hi,

I am trying to set up Bayes classifying for the first time using 
sa-learn.  It looks like it is working but doesn't actually seem to 
be... Here is the output


[raph]$ sa-learn --showdots --mbox --spam .thunderbird/gmnjx6hf.default/Mail/mail.plus.net/Junk
........................................ [...]

Learned from 870 message(s) (1025 message(s) examined).
[raph]$ sa-learn --showdots --mbox --ham .thunderbird/gmnjx6hf.default/Mail/mail.plus.net/Inbox
........................................ [...]

Learned from 2390 message(s) (2578 message(s) examined).


Now when I do spamassassin -D --lint I get

[...]
debug: bayes: 5790 tie-ing to DB file R/O /home/raph/.spamassassin/bayes_toks
debug: bayes: 5790 tie-ing to DB file R/O /home/raph/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: using "/home/raph/.spamassassin" for user state dir
debug: bayes: Not available for scanning, only 1 spam(s) in Bayes DB < 200
debug: bayes: 5790 untie-ing
debug: bayes: 5790 untie-ing db_toks
debug: bayes: 5790 untie-ing db_seen
debug: Score set 1 chosen.
debug: ---- MIME PARSER START ----
debug: main message type: text/plain
debug: parsing normal part
debug: added part, type: text/plain
debug: ---- MIME PARSER END ----
debug: bayes: 5790 tie-ing to DB file R/O /home/raph/.spamassassin/bayes_toks
debug: bayes: 5790 tie-ing to DB file R/O /home/raph/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: bayes: Not available for scanning, only 1 spam(s) in Bayes DB < 200
debug: bayes: 5790 untie-ing
debug: bayes: 5790 untie-ing db_toks
debug: bayes: 5790 untie-ing db_seen
[...]

This seems to imply that it didn't work, right?

When I run spamc on my mail it doesn't use Bayes as far as I can see in 
the headers.
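
To pull just the relevant line out of the long debug log, something like the following might help. It is only a rough sketch: bayes_unavailable is a made-up helper name, and it assumes the debug output is piped in on stdin (with 3.0.x the debug lines go to stderr, hence the 2>&1 in the usage comment).

```shell
# Hypothetical helper: read SpamAssassin debug output on stdin and print
# each distinct "Not available for scanning" Bayes line once.
bayes_unavailable() {
    grep 'bayes: Not available for scanning' | sort -u
}

# Possible usage on a real system:
#   spamassassin -D --lint 2>&1 | bayes_unavailable
```

If this prints nothing, the debug log contains no such complaint and the Bayes database has presumably passed the minimum-count check.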

Any help is very much appreciated. What am I doing wrong?

Raphael

P.S.  SpamAssassin 3.0.2 on a heavily updated Red Hat 9 system.  spamc is 
called from procmail with no arguments.



Re: sa-learn doesn't learn

Posted by Raphael Clifford <ra...@clifford.net>.
Raphael Clifford wrote:

> Just to reply to my own message.
>
> It seems to make a crucial difference which order to run the spam 
> and ham tests in!  I reran the spam test and it now says I have
>
Typo:

"spam test" above should be "sa-learn command for the spam folder"

> (from sa-learn --dump magic)
> [...]
> 0.000          0        881          0  non-token data: nspam
> 0.000          0       1524          0  non-token data: nham
> [...]



Raphael

Re: sa-learn doesn't learn

Posted by Raphael Clifford <ra...@clifford.net>.
Just to reply to my own message.

It seems to make a crucial difference which order to run the spam and 
ham tests in!  I reran the spam test and it now says I have

(from sa-learn --dump magic)
[...]
0.000          0        881          0  non-token data: nspam
0.000          0       1524          0  non-token data: nham
[...]

So the number of spam has increased to roughly what it should be but the 
number of ham has decreased by 1000!

Can anyone explain this?  It looks like a bug as surely the order of 
execution shouldn't matter?!
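
For comparing the counters after each sa-learn run, a small filter over the dump output can be handy. This is a minimal sketch, not anything shipped with SpamAssassin: bayes_counts is a made-up name, the message count is assumed to sit in the third column as in the lines above, and the minimum of 200 is taken from the debug message for spam and assumed to apply to ham as well.

```shell
# Hypothetical helper: read `sa-learn --dump magic` output on stdin, print
# the nspam/nham counters, and say whether both reach a minimum.
# The minimum defaults to 200 (an assumption based on the "< 200" debug line).
bayes_counts() {
    awk -v min="${1:-200}" '
        /non-token data: nspam/ { nspam = $3 }   # third column holds the count
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            if (nspam + 0 >= min + 0 && nham + 0 >= min + 0)
                print "bayes: enough messages"
            else
                print "bayes: not enough messages"
        }'
}

# Possible usage on a real system:
#   sa-learn --dump magic | bayes_counts
```

Running it once after the spam pass and once after the ham pass would show exactly which counter moves and by how much.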

Raphael

