Posted to users@spamassassin.apache.org by andrij <an...@gmail.com> on 2010/08/02 14:51:25 UTC

Bayes scoring

Hi all,

I ran the Bayes classifier on more than 4500 e-mails. All but about 100 of
them contained test=BAYES_*. Does anybody have any idea why these 100
e-mails were not scored by the Bayes classifier?

http://www.paulgraham.com/spam.html says that "When new mail
arrives, it is scanned into tokens, the most interesting fifteen tokens,
..., are used to calculate the probability that the mail is spam". How many
tokens does SA's Bayes classifier use to calculate the probability
that the mail is spam/ham?

Thanks a lot. 
-- 
View this message in context: http://old.nabble.com/Bayes-scoring-tp29324885p29324885.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: Bayes scoring

Posted by RW <rw...@googlemail.com>.
On Mon, 2 Aug 2010 05:51:25 -0700 (PDT)
andrij <an...@gmail.com> wrote:


> How many tokens are used by the SA's bayes classifier to
> calculate the probability that the mail is spam/ham?

It varies. It uses all the tokens above a minimum token strength, up to
a maximum of 150.
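
To make the selection rule concrete, here is a minimal sketch in Python. It is not SpamAssassin's code: the function name, the definition of "strength" as the distance of a token's spam probability from 0.5, and the 0.3 threshold are all assumptions for illustration. Only the cap of 150 comes from the answer above.

```python
def select_tokens(token_probs, min_strength=0.3, max_tokens=150):
    """Pick the tokens that would feed the final probability.

    token_probs  -- dict mapping token -> estimated spam probability (0..1)
    min_strength -- hypothetical threshold; NOT SpamAssassin's real value
    max_tokens   -- upper bound on tokens used (150, as stated in the thread)
    """
    # Assumption: "strength" is how far the probability is from neutral 0.5.
    strong = [
        (tok, p) for tok, p in token_probs.items()
        if abs(p - 0.5) >= min_strength
    ]
    # Strongest evidence first; ties are broken by token text so the
    # result is deterministic.
    strong.sort(key=lambda tp: (-abs(tp[1] - 0.5), tp[0]))
    return strong[:max_tokens]


if __name__ == "__main__":
    sample = {"viagra": 0.99, "hello": 0.5, "meeting": 0.05, "offer": 0.7}
    for tok, p in select_tokens(sample):
        print(tok, p)
```

With a rule like this the number of tokens used differs from message to message, which is the contrast with the fixed fifteen in the essay quoted in the question.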

Re: Bayes scoring

Posted by andrij <an...@gmail.com>.

Daniel Lemke wrote:
> 
> 
> andrij wrote:
>> 
>> I ran the Bayes classifier on more than 4500 e-mails. All but about 100
>> of them contained test=BAYES_*. Does anybody have any idea why these
>> 100 e-mails were not scored by the Bayes classifier?
>> 
> 
> Do you have any shortcircuit enabled?
> 

No. I am playing with the Bayes and RelayCountry plugins. I have enabled only
the Bayes, RelayCountry, and Check plugins and the Bayes rules.


Daniel Lemke wrote:
> 
> Could you post a raw example of one of the mails that Bayes did not score?
> 

I cannot; I would have to ask the owner of the e-mails first. I tried with
databases of spam and ham e-mails. Interestingly, it happened only with the
database of ham e-mails.
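
One way to narrow this down without sharing the mails is to list exactly which messages lack a BAYES_* hit. The sketch below is only an illustration under stated assumptions: each message is stored as its own file, has already been processed by SpamAssassin, and carries the rule hits in an X-Spam-Status header. The function names and the directory layout are hypothetical.

```python
import email
import re
from pathlib import Path

# Matches rule names such as BAYES_00 or BAYES_99 inside the header value.
BAYES_RE = re.compile(r"\bBAYES_\d+\b")


def has_bayes_hit(raw_message: bytes) -> bool:
    """Return True if the X-Spam-Status header lists a BAYES_* rule."""
    msg = email.message_from_bytes(raw_message)
    # A missing header is treated as "no Bayes hit"; str() also covers
    # the case where the parser returns a Header object.
    status = str(msg.get("X-Spam-Status", ""))
    return bool(BAYES_RE.search(status))


def messages_without_bayes(directory):
    """Names of files in `directory` whose message has no BAYES_* hit.

    Assumes one raw message per file (hypothetical layout).
    """
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and not has_bayes_hit(p.read_bytes())
    )
```

Calling messages_without_bayes on the ham directory would give the list of files to inspect by hand.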



Re: Bayes scoring

Posted by Daniel Lemke <le...@jam-software.com>.

andrij wrote:
> 
> I ran the Bayes classifier on more than 4500 e-mails. All but about 100
> of them contained test=BAYES_*. Does anybody have any idea why these
> 100 e-mails were not scored by the Bayes classifier?
> 

Do you have any shortcircuit enabled?
Could you post a raw example of one of the mails that Bayes did not score?

Daniel