Posted to users@spamassassin.apache.org by Andrew Davidson <am...@gmail.com> on 2015/10/02 19:15:46 UTC

Training Bayes with BAYES_999 Mail

I'm not an expert on the mechanics of Bayes so I'm wondering how
valuable it is to continue training with collected spam that is
properly tagged with BAYES_999.

Does that help to reinforce the logic or is it overly focusing the
database on emails it can already detect? Should I only be training it
with miscategorized emails and emails in the 20-80% confidence range?

Thanks for clarifying,

-- Andrew

Re: Training Bayes with BAYES_999 Mail

Posted by Reindl Harald <h....@thelounge.net>.

On 02.10.2015 at 19:15, Andrew Davidson wrote:
> I'm not an expert on the mechanics of Bayes so I'm wondering how valuable it is to continue training with collected spam that is properly tagged with BAYES_999.
>
> Does that help to reinforce the logic or is it overly focusing the database on emails it can already detect? Should I only be training it with miscategorized emails and emails in the 20-80% confidence range?

Yes, it helps, because such mail contains clearly spammy parts that will
be repeated, at least partially, in future messages. We have been doing
that here for many months and the results keep getting better: BAYES_00
now hits on 85% of all scanned messages, thanks to heavily training ham
as well as spam.

0      51534    SPAM
0      19007    HAM
0    2161267    TOKEN
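
To illustrate the reasoning, here is a toy model of per-token spam
probabilities. It is only a sketch under loud assumptions: the class
ToyBayes, its methods, and the smoothing constants are hypothetical and
this is not SpamAssassin's actual Bayes implementation. It only shows
why retraining on already-detected spam can still matter when that spam
carries tokens the database has not seen yet.

```python
from collections import Counter


class ToyBayes:
    """Toy token database; NOT SpamAssassin's real Bayes code."""

    def __init__(self):
        self.nspam = 0                 # number of spam messages learned
        self.nham = 0                  # number of ham messages learned
        self.spam_tokens = Counter()   # token -> spam messages containing it
        self.ham_tokens = Counter()    # token -> ham messages containing it

    def learn(self, tokens, is_spam):
        # Count each token at most once per message.
        if is_spam:
            self.nspam += 1
            self.spam_tokens.update(set(tokens))
        else:
            self.nham += 1
            self.ham_tokens.update(set(tokens))

    def token_prob(self, token, strength=1.0, prior=0.5):
        # Smoothed estimate: unseen tokens stay at the neutral prior,
        # rarely seen tokens are pulled toward it (assumed constants).
        s = self.spam_tokens[token]
        h = self.ham_tokens[token]
        n = s + h
        if n == 0:
            return prior
        spam_rate = s / self.nspam if self.nspam else 0.0
        ham_rate = h / self.nham if self.nham else 0.0
        raw = spam_rate / (spam_rate + ham_rate)
        return (strength * prior + n * raw) / (strength + n)


db = ToyBayes()
for _ in range(50):
    db.learn(["pills", "cheap"], is_spam=True)
    db.learn(["meeting", "agenda"], is_spam=False)

before_known = db.token_prob("pills")
before_new = db.token_prob("newdomain")

# One more spam that would already score very high, plus one new token.
db.learn(["pills", "cheap", "newdomain"], is_spam=True)

after_known = db.token_prob("pills")
after_new = db.token_prob("newdomain")
```

In this toy setup the already well-known token barely moves, while the
previously unseen token shifts from the neutral prior clearly toward
spam.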


Re: Training Bayes with BAYES_999 Mail

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 02.10.15 13:15, Andrew Davidson wrote:
>I'm not an expert on the mechanics of Bayes so I'm wondering how
>valuable it is to continue training with collected spam that is
>properly tagged with BAYES_999.
>
>Does that help to reinforce the logic or is it overly focusing the
>database on emails it can already detect? Should I only be training it
>with miscategorized emails and emails in the 20-80% confidence range?

IMHO, the more uncertain the BAYES score is, the more useful it is to
train on that message. Something already hitting BAYES_999 is not worth
training, IMHO.
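
A minimal sketch of that selection policy, assuming messages carry an
X-Spam-Status header listing the rules that hit. The helper names
(bayes_rules, worth_training) are hypothetical, and the grouping of
BAYES_* rules into "uncertain", "hammy" and "spammy" bands reflects my
reading of the stock rule names, so adjust it to your own setup.

```python
import re
from email import message_from_string

# Assumed to cover roughly the 20-80% band in the stock rule set.
UNCERTAIN_RULES = {"BAYES_40", "BAYES_50", "BAYES_60"}
HAMMY_RULES = {"BAYES_00", "BAYES_05", "BAYES_20"}
SPAMMY_RULES = {"BAYES_80", "BAYES_95", "BAYES_99", "BAYES_999"}

BAYES_RULE = re.compile(r"\bBAYES_\d+\b")


def bayes_rules(raw_message):
    """Return the set of BAYES_* rule names found in X-Spam-Status."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return set(BAYES_RULE.findall(status))


def worth_training(raw_message, known_spam):
    """Decide whether a hand-classified message is worth training on."""
    rules = bayes_rules(raw_message)
    if not rules:
        return True                      # Bayes never scored it
    if rules & UNCERTAIN_RULES:
        return True                      # Bayes was unsure
    if known_spam:
        return bool(rules & HAMMY_RULES)   # spam that Bayes rated as ham
    return bool(rules & SPAMMY_RULES)      # ham that Bayes rated as spam
```

Messages for which worth_training() returns True could then be passed to
sa-learn --spam or sa-learn --ham as appropriate.
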
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"The box said 'Requires Windows 95 or better', so I bought a Macintosh".