Posted to users@spamassassin.apache.org by Roman Gelfand <rg...@gmail.com> on 2015/07/22 02:55:05 UTC
DKIM, SPF and Bayesian Learning
It seems that if DKIM or SPF is verified, the bayesian learning doesn't
matter.
X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED,
DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no version=3.3.2
Re: DKIM, SPF and Bayesian Learning
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 21 Jul 2015, at 20:55, Roman Gelfand wrote:
> It seems that if DKIM or SPF is verified, the bayesian learning
> doesn't
> matter.
Not so. Perhaps you need to refresh your understanding of what
SpamAssassin is. It is not a collection of binary switches, but rather a
scoring system consisting of rules which have various scores.
How much each rule matters is a local decision, subject to default
values.
> X-Spam-Status: No, score=3.6 required=5.0
> tests=BAYES_99,BAYES_999,DKIM_SIGNED,
> DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no
> version=3.3.2
3.3.2 is rather obsolete, but I still have the default rules lying
about...
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score BAYES_99 0 0 3.8 3.5
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score BAYES_999 0 0 0.2 0.2
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_SIGNED 0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_VALID -0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_VALID_AU -0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score HTML_MESSAGE 0.001
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score SPF_PASS -0.001
The arithmetic, assuming you allow network tests (so the fourth score
column applies, the one used with both Bayes and network tests): the 2
Bayes rules (de facto Bayes certitude of spamminess) only add up to
3.5 + 0.2 = 3.7. All of the DKIM and SPF crap nets out to -0.101, which
vastly overstates its value in making spam/ham decisions; as independent
rules, that value is in fact indistinguishable from zero. However, that
remains a small mitigation relative to the Bayes rules, which are much
more reliable but still subject to error by their nature as
statistically-derived values. Add HTML_MESSAGE at 0.001 and the total is
3.6, which is consistent with your shown score and a reasonable
understanding of spam.
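For anyone who wants to check the sums, here is a minimal sketch in Python (not anything SpamAssassin itself runs). The score values are copied from the 50_scores.cf lines quoted above; the names SCORES, rule_score and total are made up for this illustration. It assumes the usual reading of a four-value score line as one value per score set, with the last column used when both Bayes and network tests are enabled.

```python
# Default 3.3.2 scores, copied from the 50_scores.cf lines quoted in the thread.
# Assumption: a four-value score line holds one value per score set:
#   index 0: no Bayes, no network tests
#   index 1: no Bayes, network tests
#   index 2: Bayes, no network tests
#   index 3: Bayes and network tests
# A single-value score line applies to every score set.
SCORES = {
    "BAYES_99": (0, 0, 3.8, 3.5),
    "BAYES_999": (0, 0, 0.2, 0.2),
    "DKIM_SIGNED": (0.1,),
    "DKIM_VALID": (-0.1,),
    "DKIM_VALID_AU": (-0.1,),
    "HTML_MESSAGE": (0.001,),
    "SPF_PASS": (-0.001,),
}


def rule_score(name, score_set=3):
    """Return the score of one rule in the given score set."""
    values = SCORES[name]
    return values[score_set] if len(values) == 4 else values[0]


def total(hits, score_set=3):
    """Sum the scores of all rules that hit."""
    return sum(rule_score(name, score_set) for name in hits)


# Rules listed in the X-Spam-Status header from the original post.
HITS = ["BAYES_99", "BAYES_999", "DKIM_SIGNED", "DKIM_VALID",
        "DKIM_VALID_AU", "HTML_MESSAGE", "SPF_PASS"]

bayes_part = total(["BAYES_99", "BAYES_999"])
auth_part = total(["DKIM_SIGNED", "DKIM_VALID", "DKIM_VALID_AU", "SPF_PASS"])
message_total = total(HITS)

print("Bayes rules:   %.3f" % bayes_part)
print("DKIM and SPF:  %.3f" % auth_part)
print("Message total: %.3f" % message_total)
```

Passing score_set=2 to total() shows what the same hits would add up to with Bayes but without network tests, where BAYES_99 takes the third column instead.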
On the other hand, if you really trust your Bayes DB and have a
particular widespread flavor of spam hitting you, that precise set of
rules (including HTML_MESSAGE) makes an excellent 'meta' rule worth a
solid half point, and if you don't have a lot of non-spam marketing mail
that you get voluntarily, you can probably lower your threshold to 4.5
or maybe even 4. Try this first on a personal mail server, NOT on one
handling mail for a broad audience including people who can fire you
(until after you've analyzed the mail stream very carefully).
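To see what those two knobs would do to the message in question, here is a rough Python sketch. It is only an illustration of the arithmetic, not SpamAssassin configuration: the names META_REQUIRES, adjusted_score and is_spam are hypothetical, the 0.5 is the "solid half point" suggested above, and it assumes a message is tagged as spam when its score is greater than or equal to the required value.

```python
# Score and rule hits from the X-Spam-Status header in the original post.
BASE_SCORE = 3.6
HITS = {"BAYES_99", "BAYES_999", "DKIM_SIGNED", "DKIM_VALID",
        "DKIM_VALID_AU", "HTML_MESSAGE", "SPF_PASS"}

# Hypothetical meta rule: fires only when every one of these rules hit.
META_REQUIRES = {"BAYES_99", "BAYES_999", "DKIM_SIGNED", "DKIM_VALID",
                 "DKIM_VALID_AU", "HTML_MESSAGE", "SPF_PASS"}
META_SCORE = 0.5  # the "solid half point" from the message above


def adjusted_score(base, hits, use_meta):
    """Add the meta rule's score when it is enabled and all its rules hit."""
    if use_meta and META_REQUIRES <= hits:
        return round(base + META_SCORE, 3)
    return base


def is_spam(score, required):
    """Assumption: spam when the score reaches the required threshold."""
    return score >= required


for use_meta in (False, True):
    score = adjusted_score(BASE_SCORE, HITS, use_meta)
    for required in (5.0, 4.5, 4.0):
        verdict = "spam" if is_spam(score, required) else "ham"
        print("meta=%-5s score=%.1f required=%.1f -> %s"
              % (use_meta, score, required, verdict))
```

Under these assumptions, this particular message only crosses the line when the meta rule and a threshold of 4 are combined; either change on its own leaves it tagged as ham.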
Re: DKIM, SPF and Bayesian Learning
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 7/21/2015 8:55 PM, Roman Gelfand wrote:
> It seems that if DKIM or SPF is verified, the bayesian learning
> doesn't matter.
>
> X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED,
> DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no version=3.3.2
If you mean autolearn, it requires a mixture of body and header rules.
Almost all of the rules hit appear to be header rules.
"Normally, SpamAssassin will require 3 points from the header and 3
points from the body to be auto-learned as spam."
See perldoc for Mail::SpamAssassin::Plugin::AutoLearnThreshold and
Mail::SpamAssassin::Conf
Regards,
KAM