
Posted to users@spamassassin.apache.org by Harry Putnam <re...@newsguy.com> on 2004/08/03 12:27:47 UTC

basic bayes .. new use

I wondered if someone can supply an example of a basic bayes setup?

I'm a fairly long-time user of SA but have never used bayes.  Lately I've
tried to introduce bogofilter, which uses bayes, and I'm not seeing the
results I expected.  Maybe I'm misusing it.  But now I'm thinking I'd be
better off using SA's native support for bayes.

However it seems a bit confusing with the sa-learn and some needed
training etc.  

I hoped for a guide to starting to use bayes at a primitive level.

Just train it and run it.  And how to erase false hits.

Re: basic bayes .. new use

Posted by Matt Kettler <mk...@evi-inc.com>.

At 06:27 AM 8/3/2004, Harry Putnam wrote:
>However it seems a bit confusing with the sa-learn and some needed
>training etc.
>
>I hoped for a guide to starting to use bayes at a primitive level.
>
>Just train it and run it.  And how to erase false hits.

In short:

To "run" it, you don't really need to do anything besides make sure
use_bayes isn't set to 0 in your config files.
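
A quick way to check for that setting is to grep your config files. This is only a sketch: the helper name bayes_disabled_in is made up for illustration, and the two paths are common defaults that may differ on your system (some distributions use /etc/spamassassin instead).

```shell
# Hypothetical helper: print each readable config file that contains
# "use_bayes 0", and return 0 if at least one such file was found.
bayes_disabled_in() {
    found=1
    for f in "$@"; do
        [ -r "$f" ] || continue          # skip missing or unreadable files
        if grep -qE '^[[:space:]]*use_bayes[[:space:]]+0([[:space:]]|$)' "$f"; then
            echo "$f"
            found=0
        fi
    done
    return "$found"
}

# Assumed locations of the site-wide and per-user config files.
if bayes_disabled_in /etc/mail/spamassassin/local.cf "$HOME/.spamassassin/user_prefs"; then
    echo "bayes is disabled in the file(s) listed above"
else
    echo "no 'use_bayes 0' line found in the files checked"
fi
```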

You'll also need to install the DB_File perl module and the Berkeley DB
library (aka libdb) if they aren't installed already. Most *nix
distributions have packages for them.
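
If you're not sure whether DB_File is already there, you can ask perl to load it. A minimal sketch (the function name have_perl_module is just for illustration; it only tests whether perl can load the module):

```shell
# Hypothetical helper: succeed if perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

if have_perl_module DB_File; then
    echo "DB_File is available"
else
    echo "DB_File is missing; install it from your distribution's packages or CPAN"
fi
```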

Once you have enough messages trained (by default, at least 200 spam and
200 ham), SA will automatically start using bayes.
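
To see how many messages have been learned so far, "sa-learn --dump magic" prints the database counters, including nspam and nham. The sketch below parses that output. It assumes the counter value sits in the third column and the counter name is the last word on the line, which is how the output looks on the versions I've seen. The function name bayes_ready is made up, and 200 is the default minimum for each type.

```shell
# Hypothetical helper: read `sa-learn --dump magic` output on stdin,
# print the learned counts, and succeed only if both reach the minimum.
bayes_ready() {
    awk -v min="${1:-200}" '
        $NF == "nspam" { nspam = $3 }    # assumed layout: value in column 3
        $NF == "nham"  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !(nspam + 0 >= min + 0 && nham + 0 >= min + 0)
        }'
}

# Typical use, on a machine with SA installed:
#   sa-learn --dump magic | bayes_ready && echo "enough mail learned"
```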

To train it on spam (maildir directories or individual RFC 822 message files):

         sa-learn --spam  {files}

To train it on nonspam:

         sa-learn --ham  {files}

Or, if you have mbox-format mail, where many messages are stored in one
file, just add the --mbox option:

         sa-learn --spam --mbox  {mboxfiles}
         sa-learn --ham  --mbox {mboxfiles}

If a message was learned as the wrong type, you can simply re-learn it as
the proper type. sa-learn will automatically realize it was previously
learned the wrong way and compensate.

If you want to "unlearn" a message without relearning it, use sa-learn 
--forget on it.
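
If you end up typing these a lot, the commands above can be wrapped in one small function. This is just a sketch: sa_train and the SA_LEARN variable are names I made up, and the paths in the comments are placeholders. It only passes --spam, --ham, --forget and (optionally) --mbox through to sa-learn.

```shell
# Command to run; override with SA_LEARN=echo for a dry run.
SA_LEARN=${SA_LEARN:-sa-learn}

# Hypothetical wrapper. Usage: sa_train spam|ham|forget [--mbox] PATH...
sa_train() {
    if [ "$#" -lt 2 ]; then
        echo "usage: sa_train spam|ham|forget [--mbox] PATH..." >&2
        return 2
    fi
    action=$1
    shift
    case $action in
        spam|ham|forget) ;;
        *)
            echo "sa_train: unknown action '$action'" >&2
            return 2
            ;;
    esac
    "$SA_LEARN" "--$action" "$@"
}

# Examples (placeholder paths):
#   sa_train spam ~/Maildir/.Junk/cur       # learn a directory of spam
#   sa_train ham --mbox ~/mail/archive      # learn an mbox file as ham
#   sa_train ham misfiled-message           # re-learn a mistake as ham
#   sa_train forget some-message            # unlearn without relearning
```

Setting SA_LEARN=echo first prints the arguments that would be passed instead of running sa-learn.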

If you need more detail than above, check out the wiki:

http://wiki.apache.org/spamassassin/BayesInSpamAssassin

and

http://wiki.apache.org/spamassassin/BayesFaq