
Posted to users@spamassassin.apache.org by Sn!per <sn...@home.net.my> on 2008/03/22 03:23:09 UTC

Am confused with mail header

Hi all,
Am using SA version 3.2.4. So far it is working great.

I am comparing the headers on two of my mails. The first mail is NOT a spam, and X-Spam-Status is like so:

No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE autolearn=no version=3.2.4


The second email is a spam, and its X-Spam-Status looks like this:
Yes, score=24.4 required=10.0 tests=DCC_CHECK,DIGEST_MULTIPLE, PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK, RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RDNS_NONE, URIBL_BLACK,URIBL_JP_SURBL,URIBL_RHS_DOB,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=spam version=3.2.4


Question: Why is it that on the header of the ham, autolearn=no ?

My local.cf looks like this: 

...
...
required_score 10.0
use_bayes 1
bayes_auto_learn 1

bayes_store_module      Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn           DBI:mysql:spamassassin:localhost
bayes_sql_username      dbusername
bayes_sql_password      secret

#AWL
auto_whitelist_factory  Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn            DBI:mysql:spamassassin:localhost
user_awl_sql_username   dbusername
user_awl_sql_password   secret

# Enable or disable network checks
skip_rbl_checks     0
use_razor2     1
use_dcc     1
use_pyzor     1


I can see that there are entries in these tables: awl, bayes_global_vars, bayes_seen, bayes_vars  but bayes_token and bayes_expire are empty.


Am really new with all these. Appreciate your comment. TIA.

--
Roger


---------------------------------------------------
Sign Up for free Email at http://ureg.home.net.my/
---------------------------------------------------

Re: Am confused with mail header

Posted by Matt Kettler <mk...@verizon.net>.

Sn!per wrote:
> Quoting Matt Kettler <mk...@verizon.net>:
>
>> Sn!per wrote:
>>> Hi all,
>>> Am using SA version 3.2.4 . So far working great.
>>>
>>> I am comparing the headers on two of my mails. The first mail is NOT a
>>> spam, and X-Spam-Status is like so:
>>>
>>> No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE autolearn=no
<snip>
>>> Question: Why is it that on the header of the ham, autolearn=no ?
>>
>> Because with a score of +2.7 it didn't meet the score criteria for
>> either spam or nonspam learning. SA only autolearns low scoring nonspam
>> and high scoring spam. Everything "in between" it figures might be
>> miscategorized, so it doesn't autolearn it.
>>
<snip>
>
> Thanks Matt and all.
>
> So I figure my configs are all okay then. Can I just leave things the way they are or you guys reckon I should be better off with some other additional steps ?
>
Your AWL looks to be normal, no action needed.

Re: Am confused with mail header

Posted by Sn!per <sn...@home.net.my>.

Quoting Matt Kettler <mk...@verizon.net>:

> Sn!per wrote:
> > Hi all,
> > Am using SA version 3.2.4 . So far working great.
> >
> > I am comparing the headers on two of my mails. The first mail is NOT a
> > spam, and X-Spam-Status is like so:
> >
> > No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE autolearn=no
> > version=3.2.4
> >
> >
> > The second email is a spam, and its X-Spam-Status looks like this:
> > Yes, score=24.4 required=10.0 tests=DCC_CHECK,DIGEST_MULTIPLE,
> > PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,
> > RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RDNS_NONE,
> > URIBL_BLACK,URIBL_JP_SURBL,URIBL_RHS_DOB,URIBL_SC_SURBL,URIBL_WS_SURBL
> > autolearn=spam version=3.2.4
> >
> >
> > Question: Why is it that on the header of the ham, autolearn=no ?
> 
> Because with a score of +2.7 it didn't meet the score criteria for 
> either spam or nonspam learning. SA only autolearns low scoring nonspam 
> and high scoring spam. Everything "in between" it figures might be 
> miscategorized, so it doesn't autolearn it.
> 
> By default, the "learning score" needs to be under 0.1 to autolearn as 
> nonspam. It needs to be above 12 to learn as spam (with at least 3 
> header-rule points and 3 body rule points).
> 
> See also:
> 
> http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html
> 
> Note: the "learning score" is the score the message would have gotten if:
>     1) bayes was disabled (includes scoreset change)
>     2) the awl doesn't count.
>     3) any rules with tflags learn or userconf (ie: white/blacklist 
> rules) don't count.
> 
> SA uses that score for deciding to autolearn or not to prevent 
> self-feedback on the learning systems, and to prevent a mistaken 
> whitelist_from from causing a lot of spam to autolearn as nonspam.
> 

Thanks Matt and all.

So I figure my configs are all okay then. Can I just leave things the way they are or you guys reckon I should be better off with some other additional steps ?

Many thanks.

--
Roger



Re: Am confused with mail header

Posted by Matt Kettler <mk...@verizon.net>.

Sn!per wrote:
> Hi all,
> Am using SA version 3.2.4 . So far working great.
>
> I am comparing the headers on two of my mails. The first mail is NOT a spam, and X-Spam-Status is like so:
>
> No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE autolearn=no version=3.2.4
>
>
> The second email is a spam, and its X-Spam-Status looks like this:
> Yes, score=24.4 required=10.0 tests=DCC_CHECK,DIGEST_MULTIPLE, PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK, RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RDNS_NONE, URIBL_BLACK,URIBL_JP_SURBL,URIBL_RHS_DOB,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=spam version=3.2.4
>
>
> Question: Why is it that on the header of the ham, autolearn=no ?

Because with a score of +2.7 it didn't meet the score criteria for 
either spam or nonspam learning. SA only autolearns low scoring nonspam 
and high scoring spam. Everything "in between" it figures might be 
miscategorized, so it doesn't autolearn it.

By default, the "learning score" needs to be under 0.1 to autolearn as 
nonspam. It needs to be above 12 to learn as spam (with at least 3 
header-rule points and 3 body rule points).

See also:

http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html

Note: the "learning score" is the score the message would have gotten if:
    1) Bayes were disabled (this includes the scoreset change),
    2) the AWL didn't count, and
    3) any rules with tflags learn or userconf (i.e. white/blacklist
rules) didn't count.

SA uses that score to decide whether to autolearn. This prevents
self-feedback in the learning systems, and it keeps a mistaken
whitelist_from from causing a lot of spam to be autolearned as nonspam.
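
As a rough illustration of the thresholds described above, here is a
minimal Python sketch of the decision. It is not SpamAssassin's code:
the function name autolearn_decision, its parameters, and the exact
boundary handling (strictly under 0.1, strictly above 12) are
assumptions that follow the wording of this post. The learning score
and the header/body points are taken as inputs rather than computed.

```python
# Default values as quoted in this post; the constant names are illustrative only.
NONSPAM_THRESHOLD = 0.1   # learn as nonspam when the learning score is under this
SPAM_THRESHOLD = 12.0     # learn as spam when the learning score is above this
MIN_HEADER_POINTS = 3.0   # spam learning also needs this many header-rule points
MIN_BODY_POINTS = 3.0     # ... and this many body-rule points


def autolearn_decision(learning_score, header_points=0.0, body_points=0.0):
    """Return "ham", "spam" or "no" for a given learning score.

    The learning score is assumed to already exclude Bayes, the AWL and
    rules flagged learn/userconf, as described in the post.
    """
    if learning_score < NONSPAM_THRESHOLD:
        return "ham"
    if (learning_score > SPAM_THRESHOLD
            and header_points >= MIN_HEADER_POINTS
            and body_points >= MIN_BODY_POINTS):
        return "spam"
    # Everything in between might be miscategorized, so it is not learned.
    return "no"
```

Treating the reported 2.7 as the learning score (an approximation,
since the scoreset may differ), autolearn_decision(2.7) lands between
the two thresholds and returns "no", which matches the autolearn=no in
the header.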

Re: Am confused with mail header

Posted by Karsten Bräckelmann <gu...@rudersport.de>.

On Sat, 2008-03-22 at 10:23 +0800, Sn!per wrote:
> I am comparing the headers on two of my mails. The first mail is NOT a
> spam, and X-Spam-Status is like so:
> 
> No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE autolearn=no version=3.2.4
            ^^^

> The second email is a spam, and its X-Spam-Status looks like this:
> Yes, score=24.4 required=10.0 tests=DCC_CHECK,DIGEST_MULTIPLE,
> PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK, RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RDNS_NONE, URIBL_BLACK,URIBL_JP_SURBL,URIBL_RHS_DOB,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=spam version=3.2.4
> 
> 
> Question: Why is it that on the header of the ham, autolearn=no ?

Because the score exceeds bayes_auto_learn_threshold_nonspam, which is
0.1 by default. See the docs for M::SA::Plugin::AutoLearnThreshold [1]
and the Learning Options section of M::SA::Conf [2], which refers to it.

Please note that this is a safety measure to prevent accidentally
learning spam that slipped through as ham. Do not change it unless you
really know what you are doing.

Instead, just learn the mail in question manually using sa-learn. There
is no BAYES_xx rule in either of the headers you showed, which suggests
that Bayes has not yet seen enough mail. You can speed this up by
manually learning BOTH ham and spam, at least 200 of each -- after that
Bayes will kick in and classify mail. It probably would have subtracted
a point or two from the ham mail's score.
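
To check a header for BAYES_xx hits as described above, a small Python
sketch can split the X-Spam-Status value into its fields. The helper
names parse_spam_status and has_bayes_hit are hypothetical, and the
parsing assumes the simple layout shown in this thread: a verdict, then
space-separated key=value pairs, with the tests list possibly wrapped
after a comma.

```python
import re

# The two header values from this thread.
HAM_STATUS = ("No, score=2.7 required=10.0 tests=RCVD_NUMERIC_HELO,RDNS_NONE "
              "autolearn=no version=3.2.4")
SPAM_STATUS = ("Yes, score=24.4 required=10.0 tests=DCC_CHECK,DIGEST_MULTIPLE, "
               "PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,"
               "RAZOR2_CHECK, RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,"
               "RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RDNS_NONE, URIBL_BLACK,"
               "URIBL_JP_SURBL,URIBL_RHS_DOB,URIBL_SC_SURBL,URIBL_WS_SURBL "
               "autolearn=spam version=3.2.4")


def parse_spam_status(value):
    """Return (verdict, fields, tests) for an X-Spam-Status value."""
    verdict, _, rest = value.partition(",")
    # Rejoin a tests list that was wrapped after a comma.
    rest = re.sub(r",\s+", ",", rest.strip())
    fields = dict(tok.split("=", 1) for tok in rest.split() if "=" in tok)
    tests = [t for t in fields.get("tests", "").split(",") if t and t != "none"]
    return verdict.strip(), fields, tests


def has_bayes_hit(value):
    """True if any BAYES_xx rule appears in the tests list."""
    _, _, tests = parse_spam_status(value)
    return any(name.startswith("BAYES_") for name in tests)


for status in (HAM_STATUS, SPAM_STATUS):
    verdict, fields, tests = parse_spam_status(status)
    print(verdict, fields["score"], fields["autolearn"], has_bayes_hit(status))
```

Neither header in this thread contains a BAYES_xx test, so
has_bayes_hit is False for both.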

Also, for future reference, please note that Bayes is not self feeding,
in that the threshold will be checked against the score without any
BAYES_xx rule applied.

> I can see that there are entries in these tables: awl,
> bayes_global_vars, bayes_seen, bayes_vars  but bayes_token and
> bayes_expire are empty.

Hmm, I have never used the SQL backends. But an empty bayes_token table
might hint that no tokens have been learned so far. Someone else needs
to pick up on this one. :)

  guenther

[1] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html
[2] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#learning_options

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: Am confused with mail header

Posted by Theo Van Dinter <fe...@apache.org>.

On Sat, Mar 22, 2008 at 10:23:09AM +0800, Sn!per wrote:
> Question: Why is it that on the header of the ham, autolearn=no ?

http://wiki.apache.org/spamassassin/AutolearningNotWorking

-- 
Randomly Selected Tagline:
There was a young lady named Bright
 Who could travel much faster than light.
         She took off one day,
         In a relative way,
 And returned on the previous night.