Posted to users@spamassassin.apache.org by Geoff Sweet <li...@whootis.com> on 2005/05/13 08:38:23 UTC
Help with Bayes auto-learn
I would like to enable the Bayes system with auto-learning. I thought
that I had my config set up correctly, but apparently I don't. My config
looks like this:
##########
# How we want to modify the email
rewrite_header subject [**SPAM**]
report_safe 0
#Bayes learning system
use_bayes 1
bayes_auto_learn 1
# Define the sensitivity level. Standard level is 5.
required_hits 6.8
# Enable SpamAssassin's RBL checking features :
skip_rbl_checks 0
rbl_timeout 3
num_check_received 3
score RCVD_IN_BL_SPAMCOP_NET 3
report_header 1
use_terse_report 1
##########
From my reading of the FAQ and the wiki, I thought that this would
enable Bayes and turn on its auto-learn for spam that scores higher than
the default threshold of 12. But in my logs I end up with this:
2005-05-12 23:30:33.240563500 2005-05-13 06:30:33 [88906] i: connection from localhost.whootis.com [127.0.0.1] at port 4737
2005-05-12 23:30:33.333094500 2005-05-13 06:30:33 [88906] i: processing message <7o...@k08.kdrv> for qmaild:10004.
2005-05-12 23:30:33.431814500 2005-05-13 06:30:33 [88906] i: identified spam (23.2/6.8) for qmaild:10004 in 0.2 seconds, 1311 bytes.
2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=<7o...@k08.kdrv>,bayes=0.999999999999999,autolearn=no
Does the "autolearn=no" mean that this message has not been submitted to
Bayes for auto-learning? And if so, can someone steer me in the right
direction for getting my config set up correctly?
Thanks very much,
Geoff Sweet
Re: Help with Bayes auto-learn
Posted by wolfgang <me...@gmx.net>.
In an older episode (Friday 13 May 2005 08:38), Geoff Sweet wrote:
> I would like to enable the Bayes system with auto-learning. I thought
> that I had my config setup correctly but apparently I don't. My config
> looks like this:
>
> ##########
> # How we want to modify the email
> rewrite_header subject [**SPAM**]
> report_safe 0
>
> #Bayes learning system
> use_bayes 1
> bayes_auto_learn 1
In an older episode (Friday 13 May 2005 10:17), George Breahna wrote:
> I really recommend you research your question before asking it.
good point, anyway:
man Mail::SpamAssassin::Conf
and
http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html
would tell you:
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number
of ham (non-spam) and spam have been learned. The default is 200 of each ham
and spam, but you can tune these up or down with these two settings.
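
The activation rule quoted above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not SpamAssassin's own code: the function name bayes_is_active and its parameters are hypothetical, and the defaults of 200 are the ones given in the documentation excerpt.

```python
def bayes_is_active(nham, nspam, min_ham=200, min_spam=200):
    """Return True once enough ham AND enough spam have been learned.

    nham / nspam are the numbers of ham and spam messages already in the
    Bayes database; min_ham / min_spam stand in for bayes_min_ham_num and
    bayes_min_spam_num (hypothetical parameter names for this sketch).
    """
    return nham >= min_ham and nspam >= min_spam


# Plenty of spam but too little ham: Bayes stays inactive.
print(bayes_is_active(nham=150, nspam=5000))
# Both counts at or above the minimum: Bayes becomes active.
print(bayes_is_active(nham=200, nspam=200))
```

Both counts must reach their minimum; a large surplus of one kind does not make up for a shortage of the other.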
For information on how to learn the required number of mails, see
man sa-learn
regards,
wolfgang
Re: Help with Bayes auto-learn
Posted by Matt Kettler <mk...@comcast.net>.
At 02:38 AM 5/13/2005, Geoff Sweet wrote:
>2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=<7o...@k08.kdrv>,bayes=0.999999999999999,autolearn=no
>
>Does the "autolearn=no" mean that this message has not been submitted to
>bayes for auto-learn? And if not, can someone steer me in the right
>direction for getting my config setup correctly?
First, I'm assuming you're using SA 3.0.0 or higher. If not, please specify
your version and I'll correct my message (some of the details differ).
That does mean the message was not autolearned. However, it does not mean
that no messages will be autolearned. In SA 3.0 if autolearning was
disabled, or failing, you would have seen "disabled" or "failed", not "no".
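
To see which of these states actually turn up in a log, the autolearn field can be pulled out of each result line. The following is a minimal sketch, assuming the field always appears as "autolearn=<word>" as in the log excerpt in this thread; the helper name autolearn_status is hypothetical, and the sample line is an abbreviated copy of the one quoted above (the rule list and mid field are left out).

```python
import re

# Matches the autolearn field as it appears in the quoted result line.
AUTOLEARN_RE = re.compile(r"autolearn=(\w+)")


def autolearn_status(log_line):
    """Return the autolearn state found in a log line, or None if absent."""
    match = AUTOLEARN_RE.search(log_line)
    return match.group(1) if match else None


line = "scantime=0.2,size=1311,bayes=0.999999999999999,autolearn=no"
print(autolearn_status(line))  # prints: no
```

Counting the returned values over a whole log shows quickly whether every message ends up as "no" or whether some are being learned.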
The requirements for autolearning are considerably more complex than just
"total score over xx".
The following things have to happen:
Note: ALL scores referenced below are the learning score. Learning score is
NOT the same as the final spam score. It is the score recalculated as if
bayes was disabled, *including* changing scoreset. Also all AWL, whitelist,
and blacklist rules don't count towards this score.
1) total learning score over bayes_auto_learn_threshold_spam (default 12)
2) learning score of header rules must be over 3.0
3) learning score of body rules must be over 3.0
4) existing bayes learning must not be strongly ham (ie: don't learn as
spam anything that would otherwise get bayes_00'ed)
5) From addresses (including Return-Path, etc) must not match a
bayes_ignore_from statement
6) To addresses (including Cc, etc) must not match a bayes_ignore_to
statement
7) The bayes DB must not be locked by some other SA process (another
learner, expiry, etc). Note: this test results in autolearn=failed.
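
The checklist above can be written down as a small decision function. This is a sketch of the rules as listed in this message, not SpamAssassin's actual implementation: the function and parameter names are hypothetical, "over" is read as strictly greater than, and the order of the checks is an assumption. All scores passed in are learning scores, not final spam scores.

```python
def spam_autolearn_outcome(
    learning_score,
    header_score,
    body_score,
    bayes_says_strong_ham=False,
    from_ignored=False,
    to_ignored=False,
    db_locked=False,
    threshold_spam=12.0,  # bayes_auto_learn_threshold_spam default
):
    """Return 'spam', 'no' or 'failed' for a candidate spam message."""
    # 1) total learning score must be over the spam threshold
    if learning_score <= threshold_spam:
        return "no"
    # 2) and 3) header rules and body rules must each be over 3.0
    if header_score <= 3.0 or body_score <= 3.0:
        return "no"
    # 4) do not learn as spam what Bayes already rates as strongly ham
    if bayes_says_strong_ham:
        return "no"
    # 5) and 6) sender or recipient matched an ignore statement
    if from_ignored or to_ignored:
        return "no"
    # 7) a locked database is reported as a failure, not as "no"
    if db_locked:
        return "failed"
    return "spam"


# A high total is not enough if the header rules contribute too little.
print(spam_autolearn_outcome(18.0, header_score=2.0, body_score=16.0))
print(spam_autolearn_outcome(18.0, header_score=8.0, body_score=10.0))
```

This also shows how a message with a final score of 23.2 can still come out as autolearn=no: the learning score leaves out BAYES_99 and uses a different scoreset, so the total or the header/body split may fall short.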
See also:
http://wiki.apache.org/spamassassin/AutolearningNotWorking
RE: Help with Bayes auto-learn
Posted by George Breahna <sa...@top-consulting.net>.
I could swear I have seen this question in at least 20 different messages,
not to mention on the website.
I really recommend you research your question before asking it.
autolearn=no means that it didn't 'learn' this message.
Other possible states are 'spam', 'ham' and ... 'DISABLED'.
If autolearn were disabled, you would see this last one.