Posted to users@spamassassin.apache.org by Roman Gelfand <rg...@gmail.com> on 2015/06/23 14:34:52 UTC
bayes filtering
Periodically, I am running the following command on my spam box...
sa-learn --no-sync --spam /mbx/adomain.com/auser/Maildir/.Junk/{cur,new}
It seems to work. However, I continue to receive messages like this one that are not marked as spam. Why?
Here are the SpamAssassin headers.
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mail.adomain.com
X-Spam-Level: ***
X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED,
DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS,URIBL_BLOCKED autolearn=no
version=3.3.2
Thanks in advance
Re: bayes filtering
Posted by Reindl Harald <h....@thelounge.net>.
On 25.10.2015 at 19:06, Roman Gelfand wrote:
> In your post some time ago, you mentioned that my configuration may
> not be sufficient to block emails using Bayes filtering. Below is my
> configuration. I have since fixed the DNS issue, but I am not sure how to
> deal with the score staying at about 3.5. I am getting 4 emails/day with
> a score of 3.5.
>
> X-Spam-Status: No, score=3.6 required=5.0 tests=AWL,BAYES_99,BAYES_999,
> DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no
> version=3.3.2
* introduce body filters
* raise your Bayes scoring if you trust your training
Caution: the scores below are for a long-trained Bayes database with around 70000 sample
messages, a ton of low-scored whitelists and a milter reject threshold of 8.0.
# adjust bayes scoring
score BAYES_00 -3.5
score BAYES_05 -2.0
score BAYES_20 -1.0
score BAYES_40 -0.5
score BAYES_50 1.5
score BAYES_60 3.5
score BAYES_80 5.5
score BAYES_95 6.5
score BAYES_99 7.5
score BAYES_999 0.4
body CUST_BODY_17 /.*(1st page ranking of google|a company which can understand you & your business).*/i
score CUST_BODY_17 1.5
describe CUST_BODY_17 Contains Low
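A phrase rule like CUST_BODY_17 can be sanity-checked before it goes into local.cf. The following is only a rough sketch in Python: it assumes Python's `re` module treats this particular pattern the same way SpamAssassin's Perl regex engine does, and the sample texts are made up for illustration. SpamAssassin also renders and normalizes the body before matching, which this sketch does not model.

```python
import re

# Same alternation as the CUST_BODY_17 rule; the leading and trailing ".*"
# are dropped because an unanchored search does not need them.
CUST_BODY_17 = re.compile(
    r"1st page ranking of google"
    r"|a company which can understand you & your business",
    re.IGNORECASE,
)

def rule_hits(body_text: str) -> bool:
    """Return True if the phrase rule would match this (already rendered) text."""
    return CUST_BODY_17.search(body_text) is not None

# Hypothetical sample texts, not taken from real mail.
spam_sample = "We can get you a 1st Page Ranking of Google within weeks."
ham_sample = "The meeting about the mail server migration is on Tuesday."

print(rule_hits(spam_sample), rule_hits(ham_sample))
```

Running the real rule against saved messages with `spamassassin --lint` and a test scan remains the authoritative check.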
Re: bayes filtering
Posted by Roman Gelfand <rg...@gmail.com>.
In your post some time ago, you mentioned that my configuration may not
be sufficient to block emails using Bayes filtering. Below is my
configuration. I have since fixed the DNS issue, but I am not sure how to deal
with the score staying at about 3.5. I am getting 4 emails/day with a score
of 3.5.
X-Spam-Status: No, score=3.6 required=5.0 tests=AWL,BAYES_99,BAYES_999,
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no
version=3.3.2
required_score 5.0
# rewrite_header Subject [SPAM]
rewrite_header Subject
# trusted_networks 192.168.7.0/24 192.168.3.0/24
report_safe 0
use_bayes 1
bayes_auto_learn 1
skip_rbl_checks 0
use_razor2 1
use_pyzor 1
ok_languages en
user_scores_dsn DBI:mysql:spamassassin:localhost
user_scores_sql_username spamd
user_scores_sql_password XXXXXXXXXXXX=9
auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn DBI:mysql:spamassassin:localhost
user_awl_sql_table awl
user_awl_sql_username spamd
user_awl_sql_password onepluseight=9
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:spamassassin:localhost
bayes_sql_username spamd
bayes_sql_password XXXXXXXXXXXXx=9
On Tue, Jun 23, 2015 at 2:52 PM, Bill Cole <
sausers-20150205@billmail.scconsult.com> wrote:
> On 23 Jun 2015, at 8:34, Roman Gelfand wrote:
>
> Periodically, I am running the following command on my spam box...
>> sa-learn --no-sync --spam /mbx/adomain.com/auser/Maildir/.Junk/{cur,new}
>>
>> It seems to work. However, I continue to get this message type. Why?
>> Here is SA message.
>>
>> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
>> mail.adomain.com
>> X-Spam-Level: ***
>> X-Spam-Status: No, score=3.6 required=5.0
>> tests=BAYES_99,BAYES_999,DKIM_SIGNED,
>> DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS,URIBL_BLOCKED
>> autolearn=no
>> version=3.3.2
>>
>
> Your configuration appears to use the default scores for the rules that
> are being hit there and for the "required" threshold. A 100% certain Bayes
> judgment (technically anything >99.9%) only adds up to a score of 3.7 with
> the default scores, and the default threshold is 5.0, so you need
> *something more* than a Bayes certainty to get SA to call anything spam,
> using the default configuration. Without seeing the actual mail, what
> "more" might be is a generic theoretical discussion.
>
> However, in this case there's an obvious first thing to fix: stop using a
> shared DNS resolver.
>
> The URIBL_BLOCKED "rule" is a message from the operators of the uribl.com
> service that the DNS resolver used for a query is explicitly refused
> service. The most common reason for this is excess query volume from a
> resolver. The only likely reasons for you to hit this are:
>
> 1. You are scanning so much mail with SA that you must be a large
> commercial operation capable of helping to support the uribl.com service
> as "free for most," so they require you to do so. This seems unlikely for
> someone newly setting up SA...
>
> 2. You are using a DNS resolver that is shared by a large number of other
> people and in aggregate you are all pounding the uribl.com nameservers as
> if you are a commercial service provider or large business.
>
> The solution for (2) is a step that should be part of running *ANY* MTA
> that accepts mail from the world at large: bring up a caching recursive
> (NOT forwarding) resolver DNS daemon on the same host (or in multi-host
> environments: same physical LAN) as the MTA and use it as the resolver for
> the MTA. In addition to being able to use services like uribl.com and
> Spamhaus that block large resolvers who don't support them, having your own
> resolver makes DNS resolution substantially faster on average for your MTA.
> With a modern MTA doing basic spam control, DNS resolution time is a
> substantial contributor to session lifetime, which is a major determinant
> of overall capacity. Another positive advantage is that many shared
> resolvers (especially those run by ISPs) do non-standard things in response
> to some queries designed to either assist and protect web-surfing users or
> line their own pockets, depending on the particular resolver and one's PoV.
> None of those tricks are helpful for an MTA, and some can be positively
> harmful, so you shouldn't do resolution for an MTA through such a server. A
> caching-only recursive nameserver isn't a substantial load and isn't
> difficult to configure, and many OS distributions include such a
> configuration in the base OS (e.g. FreeBSD) or as the default config in
> packages of ISC BIND and/or other DNS daemons.
>
>
Re: bayes filtering
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 23 Jun 2015, at 8:34, Roman Gelfand wrote:
> Periodically, I am running the following command on my spam box...
> sa-learn --no-sync --spam
> /mbx/adomain.com/auser/Maildir/.Junk/{cur,new}
>
> It seems to work. However, I continue to get this message type. Why?
> Here is SA message.
>
> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
> mail.adomain.com
> X-Spam-Level: ***
> X-Spam-Status: No, score=3.6 required=5.0
> tests=BAYES_99,BAYES_999,DKIM_SIGNED,
> DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS,URIBL_BLOCKED
> autolearn=no
> version=3.3.2
Your configuration appears to use the default scores for the rules that
are being hit there and for the "required" threshold. A 100% certain
Bayes judgment (technically anything >99.9%) only adds up to a score of
3.7 with the default scores, and the default threshold is 5.0, so you
need *something more* than a Bayes certainty to get SA to call anything
spam, using the default configuration. Without seeing the actual mail,
what "more" might be is a generic theoretical discussion.
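The arithmetic can be reproduced from the quoted header. This is a minimal sketch, not part of SpamAssassin: the per-rule values in `ASSUMED_SCORES` are what I believe the stock scores to be for these rules, so treat them as assumptions and check them against the rule files actually installed on your system.

```python
import re

# X-Spam-Status value from the original post, joined onto one line.
HEADER = (
    "No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED,"
    "DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS,URIBL_BLOCKED "
    "autolearn=no version=3.3.2"
)

# Assumed stock scores (verify locally; these are not read from any config).
ASSUMED_SCORES = {
    "BAYES_99": 3.5,
    "BAYES_999": 0.2,
    "DKIM_SIGNED": 0.1,
    "DKIM_VALID": -0.1,
    "DKIM_VALID_AU": -0.1,
    "HTML_MESSAGE": 0.001,
    "SPF_PASS": -0.001,
    "URIBL_BLOCKED": 0.001,
}

def parse_status(header: str):
    """Extract the required threshold and the list of rule names that hit."""
    required = float(re.search(r"required=([\d.]+)", header).group(1))
    tests = re.search(r"tests=(\S+)", header).group(1).split(",")
    return required, tests

required, tests = parse_status(HEADER)
total = sum(ASSUMED_SCORES[name] for name in tests)
bayes_only = ASSUMED_SCORES["BAYES_99"] + ASSUMED_SCORES["BAYES_999"]

print(f"Bayes alone: {bayes_only:.1f}, all rules: {total:.1f}, required: {required}")
```

Under these assumed scores the Bayes rules contribute 3.7, the DKIM rules pull the total down slightly, and the result stays below the 5.0 threshold.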
However, in this case there's an obvious first thing to fix: stop using
a shared DNS resolver.
The URIBL_BLOCKED "rule" is a message from the operators of the
uribl.com service that the DNS resolver used for a query is explicitly
refused service. The most common reason for this is excess query volume
from a resolver. The only likely reasons for you to hit this are:
1. You are scanning so much mail with SA that you must be a large
commercial operation capable of helping to support the uribl.com service
as "free for most," so they require you to do so. This seems unlikely
for someone newly setting up SA...
2. You are using a DNS resolver that is shared by a large number of
other people and in aggregate you are all pounding the uribl.com
nameservers as if you are a commercial service provider or large
business.
The solution for (2) is a step that should be part of running *ANY* MTA
that accepts mail from the world at large: bring up a caching recursive
(NOT forwarding) resolver DNS daemon on the same host (or in multi-host
environments: same physical LAN) as the MTA and use it as the resolver
for the MTA. In addition to being able to use services like uribl.com
and Spamhaus that block large resolvers who don't support them, having
your own resolver makes DNS resolution substantially faster on average
for your MTA. With a modern MTA doing basic spam control, DNS resolution
time is a substantial contributor to session lifetime, which is a major
determinant of overall capacity. Another positive advantage is that many
shared resolvers (especially those run by ISPs) do non-standard things
in response to some queries designed to either assist and protect
web-surfing users or line their own pockets, depending on the particular
resolver and one's PoV. None of those tricks are helpful for an MTA, and
some can be positively harmful, so you shouldn't do resolution for an
MTA through such a server. A caching-only recursive nameserver isn't a
substantial load and isn't difficult to configure, and many OS
distributions include such a configuration in the base OS (e.g. FreeBSD)
or as the default config in packages of ISC BIND and/or other DNS
daemons.
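Whether the resolver in use is being refused can be checked outside of SpamAssassin. The sketch below rests on two assumptions: that a refused query to multi.uribl.com is answered with an address in 127.0.0.x whose low bit is set (127.0.0.1), which is what the URIBL_BLOCKED rule keys on, and that `test.uribl.com` is a usable test name. The lookup goes through the system resolver via Python's `socket` module, which may differ from the resolver SpamAssassin is configured to use.

```python
import ipaddress
import socket

URIBL_ZONE = "multi.uribl.com"

def is_blocked_answer(address: str) -> bool:
    """Interpret an A record from the URIBL zone: bit 1 of the last octet
    is assumed to mean 'resolver refused', not a real listing."""
    octets = ipaddress.IPv4Address(address).packed
    return octets[:3] == bytes([127, 0, 0]) and bool(octets[3] & 1)

def check_resolver(test_name: str = "test.uribl.com"):
    """Query the zone through the system resolver (needs network access).

    Returns True if the answer looks like a refusal, False if it looks like
    a normal answer, and None if the name did not resolve at all.
    The default test_name is an assumption, not a documented guarantee.
    """
    try:
        answer = socket.gethostbyname(f"{test_name}.{URIBL_ZONE}")
    except socket.gaierror:
        return None
    return is_blocked_answer(answer)
```

Calling `check_resolver()` on the mail host before and after switching to a local recursive resolver should show the answer change from a refusal to a normal response, if the assumptions above hold.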
Re: bayes filtering
Posted by Reindl Harald <h....@thelounge.net>.
On 23.06.2015 at 14:34, Roman Gelfand wrote:
> Periodically, I am running the following command on my spam box...
> sa-learn --no-sync --spam /mbx/adomain.com/auser/Maildir/.Junk/{cur,new}
>
> It seems to work. However, I continue to get this message type. Why?
> Here is SA message.
>
> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mail.adomain.com
> X-Spam-Level: ***
> X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED,
> DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS,URIBL_BLOCKED autolearn=no
> version=3.3.2
Because the score for BAYES_99 + BAYES_999 alone is not high enough. But
your *real problem* is URIBL_BLOCKED, which happens because you are
using a DNS forwarder instead of a local caching resolver. That topic has
been discussed thousands of times, so solve that first and then *carefully*
consider adjusting scores, because the right values depend heavily on how
well your Bayes database is trained in both directions (ham and spam).
/etc/mail/spamassassin/local.cf
score BAYES_00 -3.5
score BAYES_05 -2.0
score BAYES_20 -1.0
score BAYES_40 -0.5
score BAYES_50 1.8
score BAYES_60 3.5
score BAYES_80 5.0
score BAYES_95 6.5
score BAYES_99 7.5
score BAYES_999 0.4
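To see what these overrides do to the message from the original post, the `score` lines can be read and the two Bayes rules that hit can be summed. This is an illustrative sketch only: the tiny parser handles just the single-value `score NAME VALUE` form used above, and the stock figure of 3.7 is the one quoted earlier in this thread rather than something read from an installation.

```python
LOCAL_CF = """\
score BAYES_00 -3.5
score BAYES_05 -2.0
score BAYES_20 -1.0
score BAYES_40 -0.5
score BAYES_50 1.8
score BAYES_60 3.5
score BAYES_80 5.0
score BAYES_95 6.5
score BAYES_99 7.5
score BAYES_999 0.4
"""

REQUIRED = 5.0          # required_score from the posted configuration
STOCK_BAYES_TOTAL = 3.7  # BAYES_99 + BAYES_999 with default scores, per the thread

def parse_score_lines(text: str) -> dict:
    """Collect 'score NAME VALUE' lines into a dict (single-value form only)."""
    overrides = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "score":
            overrides[parts[1]] = float(parts[2])
    return overrides

overrides = parse_score_lines(LOCAL_CF)
tuned_bayes_total = overrides["BAYES_99"] + overrides["BAYES_999"]

print(f"stock: {STOCK_BAYES_TOTAL}, tuned: {tuned_bayes_total:.1f}, required: {REQUIRED}")
```

With these values the two Bayes rules alone would clear the 5.0 threshold, which is why such scores only make sense for a Bayes database you trust.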