Posted to users@spamassassin.apache.org by John W Mickevich <jm...@cmtonline.com> on 2006/12/15 23:05:26 UTC

How to tell why BAYES_00 is hit

Hi all!

 

I have a bayes question I am hoping someone may be able to answer for me.
Since implementing bayes it has been doing a very good job except for one
thing.  

 

One particular spam email is not getting tagged as spam.  My rules are
scoring the email high enough to be tagged as spam, but it is also hitting
the BAYES_00 rule, which is deducting 4.9 points, thus causing the email to
not be tagged as spam.

 

I am very new to bayes so some of my terms may be incorrect.  But it would
appear that bayes has "learned" something incorrectly.  

 

I am not sure if something got autolearned as ham, etc.  But, my question is
how do I go about finding out exactly what within bayes is causing this
email to be scored as BAYES_00?  And more importantly, how do I "undo" it?

 

If it helps, here is the X-Spam info from the header:

 

------------------

X-Spam-Status: No, hits=3.2 required=4.9 tests=BAYES_00,DATE_IN_PAST_96_XX,
    J_CHICKENPOX_13,J_CHICKENPOX_22,J_CHICKENPOX_33,J_CHICKENPOX_34,
    J_CHICKENPOX_42,J_CHICKENPOX_45,J_CHICKENPOX_91,MISSING_OUTLOOK_NAME,
    SARE_ADLTOBFU autolearn=no version=2.64

------------------

 

Hope this makes sense.  If not, I apologize.

 

Thanks!

 

John W Mickevich

Computer Management Technologies

Email: jm@cmtonline.com

 

 


Re: How to tell why BAYES_00 is hit

Posted by Karl Auer <ka...@biplane.com.au>.
> FWIW, if you had left BAYES_00 with its default score, as opposed to
> increasing it to -4.9, this mail would have been flagged as spam.

In my case (admittedly using SA 2.64, not 3.x), -4.9 appears to be the
default score for BAYES_00. That's what it was before I set it to zero,
and I'm pretty sure I never touched it earlier...

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer@biplane.com.au)                   +61-2-64957160 (h)
http://www.biplane.com.au/~kauer/                  +61-428-957160 (mob)


Re: How to tell why BAYES_00 is hit

Posted by Theo Van Dinter <fe...@apache.org>.
On Fri, Dec 15, 2006 at 05:05:26PM -0500, John W Mickevich wrote:
> I am very new to bayes so some of my terms may be incorrect.  But it would
> appear that bayes has "learned" something incorrectly.  

Not necessarily.  It means that the tokens found in the message which are also
found in the DB are considered to be hammy overall.  That's not necessarily
incorrect.

> I am not sure if something got autolearned as ham, etc.  But, my question is
> how do I go about finding out exactly what within bayes is causing this
> email to be scored as BAYES_00?  And more importantly, how do I "undo" it?

1) run the message through "spamassassin -D bayes" and you'll see the
token output w/ score information.

2) learn the spam message as spam.
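
As a rough sketch of those two steps, the helpers below wrap the commands named above. The function names bayes_debug and learn_as_spam are hypothetical, and the "-D bayes" form assumes a SpamAssassin 3.x command line; older releases may only accept a bare -D. Debug output is written to stderr, so it is merged into stdout before filtering.

```shell
#!/bin/sh
# Hypothetical helpers; only the spamassassin and sa-learn commands
# themselves come from the steps above.

# bayes_debug FILE: print only the lines that mention Bayes for one message.
bayes_debug() {
    # Debug lines go to stderr, so merge them into stdout before grep.
    # The X-Spam-Status header may also match if it lists a BAYES_* rule.
    spamassassin -D bayes < "$1" 2>&1 | grep -i 'bayes'
}

# learn_as_spam FILE: train the Bayes database with this message as spam.
learn_as_spam() {
    sa-learn --spam "$1"
}

# Example (not run here):
#   bayes_debug /file/containing/spam_message
#   learn_as_spam /file/containing/spam_message
```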

> X-Spam-Status: No, hits=3.2 required=4.9 tests=BAYES_00,DATE_IN_PAST_96_XX,

FWIW, if you had left BAYES_00 with its default score, as opposed to
increasing it to -4.9, this mail would have been flagged as spam.
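
The arithmetic behind that remark can be checked directly. This is only an illustrative sketch, not part of SpamAssassin: the helper names score_without_rule and verdict are hypothetical, and it assumes a message is tagged when its total is at or above the required value. The numbers come from the X-Spam-Status header quoted above.

```shell
#!/bin/sh
# Hypothetical helpers for recomputing a total with one rule removed.

# score_without_rule TOTAL RULE_SCORE
# Prints TOTAL minus RULE_SCORE, rounded to one decimal place.
score_without_rule() {
    awk -v total="$1" -v rule="$2" 'BEGIN { printf "%.1f\n", total - rule }'
}

# verdict SCORE REQUIRED
# Prints "spam" if SCORE >= REQUIRED (assumed tagging rule), else "ham".
verdict() {
    awk -v s="$1" -v r="$2" 'BEGIN { print ((s + 0 >= r + 0) ? "spam" : "ham") }'
}

# Values from the header: hits=3.2, required=4.9, BAYES_00 scored at -4.9.
adjusted=$(score_without_rule 3.2 -4.9)
echo "as delivered:     3.2 ($(verdict 3.2 4.9))"
echo "without BAYES_00: $adjusted ($(verdict "$adjusted" 4.9))"
```

With BAYES_00 taken out, the total is 3.2 + 4.9 = 8.1, so the message would be tagged for any BAYES_00 score less negative than about -3.2.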

-- 
Randomly Selected Tagline:
"I'd rather see my sister in a whorehouse than my brother using windows."
                                 - Sam Creasey

Re: How to tell why BAYES_00 is hit

Posted by aubreyl <au...@emailacs.com>.
Bret Miller wrote:
>  
>   
>> I have a bayes question I am hoping someone may be able to 
>> answer for me.  Since implementing bayes it has been doing a 
>> very good job except for one thing.  
>>
>> One particular spam email is not getting tagged as spam.  My 
>> rules are scoring the email high enough to be tagged as spam, 
>> but it is also hitting the BAYES_00 rule, which is deducting 
>> 4.9 point, thus causing the email to not be tagged as spam.
>>
>> I am very new to bayes so some of my terms may be incorrect.  
>> But it would appear that bayes has "learned" something incorrectly.  
>>
>> I am not sure if something got autolearned as ham, etc.  But, 
>> my question is how do I go about finding out exactly what 
>> within bayes is causing this email to be scored as BAYES_00?  
>> And more importantly, how do I "undo" it?
>>
>>     
>
> Bayes tokenizes the e-mail, so it's hard to point at exactly what might
> make it think it's spam. The best way to combat this is to sa-learn
> --spam the message when it comes in. That way, if it was autolearned as
> ham, it's reversed. If tokens appeared in several ham messages, then you
> might have to repeat this a few times before the scores get reversed
> enough that it hits BAYES_99 instead.
>
> Bret
I don't know how your box is set up, but if you are using the mbox 
format for keeping mail, then you may want to single out this one 
message and run:

    * spamassassin --test-mode /file/containing/spam_message


and get whatever the scoring is.  Then run:

    * sa-learn --spam [--mbox] /file/containing/spam_message


Then re-run:

    * spamassassin --test-mode /file/containing/spam_message


You should notice that the message now hits BAYES_99.  After you do
that, run:

    * spamc -c < /file/containing/spam_message


and make sure that you are getting the same score from spamc.  If the 
scores differ, you could have a configuration issue.  Not sure if this 
would help you, but I had the same problem about 2 weeks ago, and this 
helped me.

-=Aubrey=-

RE: How to tell why BAYES_00 is hit

Posted by Bret Miller <br...@wcg.org>.
 
> I have a bayes question I am hoping someone may be able to 
> answer for me.  Since implementing bayes it has been doing a 
> very good job except for one thing.  
> 
> One particular spam email is not getting tagged as spam.  My 
> rules are scoring the email high enough to be tagged as spam, 
> but it is also hitting the BAYES_00 rule, which is deducting 
> 4.9 point, thus causing the email to not be tagged as spam.
> 
> I am very new to bayes so some of my terms may be incorrect.  
> But it would appear that bayes has "learned" something incorrectly.  
> 
> I am not sure if something got autolearned as ham, etc.  But, 
> my question is how do I go about finding out exactly what 
> within bayes is causing this email to be scored as BAYES_00?  
> And more importantly, how do I "undo" it?
> 

Bayes tokenizes the e-mail, so it's hard to point at exactly what might
make it think it's spam. The best way to combat this is to sa-learn
--spam the message when it comes in. That way, if it was autolearned as
ham, it's reversed. If tokens appeared in several ham messages, then you
might have to repeat this a few times before the scores get reversed
enough that it hits BAYES_99 instead.
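
To illustrate why one round of learning may not be enough, here is a toy per-token estimate. It is a deliberately simplified ratio and not SpamAssassin's actual computation; the function name token_spam_prob and the message counts are made up for the example.

```shell
#!/bin/sh
# Toy model only: compares how often a token appears in spam versus ham.

# token_spam_prob SPAM_HITS HAM_HITS NSPAM NHAM
# SPAM_HITS/HAM_HITS: learned messages of each kind containing the token.
# NSPAM/NHAM: total learned messages of each kind.
token_spam_prob() {
    awk -v s="$1" -v h="$2" -v ns="$3" -v nh="$4" 'BEGIN {
        ps = s / ns                      # share of spam containing the token
        ph = h / nh                      # share of ham containing the token
        if (ps + ph == 0) { print "0.50"; exit }   # unseen token: neutral
        printf "%.2f\n", ps / (ps + ph)
    }'
}

# A token seen in 5 of 100 ham messages and no spam, then learned as spam
# in a growing number of messages.
for learned in 0 1 5 10; do
    printf "spam learns=%2d  p(spam)=%s\n" "$learned" \
        "$(token_spam_prob "$learned" 5 $((100 + learned)) 100)"
done
```

In this toy model a token that showed up in several ham messages stays below 0.5 until it has been learned as spam about as often, which matches the point about needing more than one round.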

Bret