Posted to users@spamassassin.apache.org by Brian Eliassen <br...@eliassen.org> on 2014/04/27 04:41:39 UTC
BAYES_00 Query
Hello Keepers of SpamAssassin Knowledge,
I've been lurking on this list for years and never had a question pop
up until today. About a week ago I said, "enough is enough" regarding
the amount of spam I've been receiving so I've been doing some
upgrades. As such, I recently upgraded to SA 3.4 and did the
recommended "sa-learn --clear" to clean out the database. I had a
huge pile of recent spam and ham so I repopulated the database with
those. Afterwards, here is what my "sa-learn --dump magic" looked like:
0.000 0 3 0 non-token data: bayes db version
0.000 0 35575 0 non-token data: nspam
0.000 0 1870 0 non-token data: nham
0.000 0 180984 0 non-token data: ntokens
0.000 0 1314919780 0 non-token data: oldest atime
0.000 0 1398209850 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1398228671 0 non-token data: last expiry atime
0.000 0 691200 0 non-token data: last expire atime delta
0.000 0 2166321 0 non-token data: last expire reduction count
Yes, I had that much spam stored up. That sa-learn took several
hours. But on to my question; I have been extra careful to note what
has been slipping by the filter and here is what I've seen over the
past two days:
3.299 (***) BAYES_00,FORGED_RELAY_MUA_TO_MX
3.92 (***) BAYES_00,FREEMAIL_FROM,RDNS_NONE,TBIRD_SUSP_MIME_BDRY,T_HTML_ATTACH,T_OBFU_HTML_ATTACH
-1 () BAYES_00
0.279 () BAD_CREDIT,BAYES_00
-0.988 () BAYES_00,HTML_EXTRA_CLOSE,HTML_MESSAGE,T_REMOTE_IMAGE
3.299 (***) BAYES_00,FORGED_RELAY_MUA_TO_MX
-0.988 () BAYES_00,HTML_EXTRA_CLOSE,HTML_MESSAGE,T_REMOTE_IMAGE
-0.979 () BAYES_00,FREEMAIL_FROM,T_HTML_ATTACH,T_OBFU_HTML_ATTACH
0.436 () BAYES_00,DIET_1,HELO_MISC_IP,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE
0.436 () BAYES_00,DIET_1,HELO_MISC_IP,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE
The thing that is common is BAYES_00 on all of these. It's the
standard -1 score. Did I do something horrible with my installation
to allow this sort of crud to slip through? Isn't that when Bayes
thinks that the mail isn't spam? Look at some of the other rules that
are hitting. I cannot figure out why BAYES_00 would hit on these.
Thanks in advance.
Oh, this is a sendmail -> mimedefang -> spamassassin/clamav/razor
installation. Any recommendations on additional plugins to consider
and/or SARE-like channels to subscribe to would be greatly appreciated.
Brian
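
[Editorial sketch, not part of the original post.] The nspam and nham counters in the "sa-learn --dump magic" output above can be compared directly. The Python snippet below is a minimal sketch that assumes the value of each "non-token data" line sits in the third whitespace-separated column, as it appears to in the output quoted in this post; parse_magic is a hypothetical helper name, not part of SpamAssassin. It only makes the spam-to-ham imbalance visible; it does not show that the imbalance is what causes the BAYES_00 hits.

```python
import re

# Lines copied from the "sa-learn --dump magic" output quoted in the post.
DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 35575 0 non-token data: nspam
0.000 0 1870 0 non-token data: nham
0.000 0 180984 0 non-token data: ntokens
"""

def parse_magic(text):
    """Return {label: int value} for each 'non-token data' line.

    Assumption: the value is the third whitespace-separated column,
    which matches the layout of the output shown in the post.
    """
    values = {}
    pattern = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s*(.+)")
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            values[match.group(2).strip()] = int(match.group(1))
    return values

stats = parse_magic(DUMP)
ratio = stats["nspam"] / stats["nham"]
print(f"spam: {stats['nspam']}, ham: {stats['nham']}, ratio: {ratio:.1f}:1")
```

With the numbers from the post this reports roughly 19 spam messages learned for every ham message.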
Re: BAYES_00 Query
Posted by John Hardin <jh...@impsec.org>.
On Sun, 27 Apr 2014, Axb wrote:
> On 04/27/2014 06:02 PM, John Hardin wrote:
>> Then wipe and retrain again.
>
> I'd definitely go for that
>
> oldest spam in bayes is from Thu, 01 Sep 2011 23:29:40 GMT
> 0.000 0 1314919780 0 non-token data: oldest atime
>
> The DB just hasn't enough spam to make a difference.
Ah, I didn't notice that bit. Let me amend my advice some:
Disable automatic bayes expiry too. Don't run a manual bayes expiration
until and unless you decide to enable autolearn, and don't run an
expiration until autolearn has collected sufficient *recent* messages.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
I would buy a Mac today if I was not working at Microsoft.
-- James Allchin, Microsoft VP of Platforms
-----------------------------------------------------------------------
696 days since the first successful private support mission to ISS (SpaceX)
Re: BAYES_00 Query
Posted by Axb <ax...@gmail.com>.
On 04/27/2014 06:02 PM, John Hardin wrote:
> Then wipe and retrain again.
I'd definitely go for that
oldest spam in bayes is from Thu, 01 Sep 2011 23:29:40 GMT
0.000 0 1314919780 0 non-token data: oldest atime
The DB just hasn't enough spam to make a difference.
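
[Editorial sketch, not part of the original message.] The date above comes from reading the "oldest atime" value as a Unix timestamp in seconds. A minimal Python sketch of that conversion follows, using the two atime values from the original post; atime_to_utc is a hypothetical helper name introduced here for illustration.

```python
from datetime import datetime, timezone

def atime_to_utc(atime):
    """Interpret a Bayes atime value as Unix epoch seconds in UTC."""
    return datetime.fromtimestamp(atime, tz=timezone.utc)

# Values taken from the "sa-learn --dump magic" output in the original post.
oldest = atime_to_utc(1314919780)
newest = atime_to_utc(1398209850)

print(oldest.isoformat())       # 2011-09-01T23:29:40+00:00
print(newest.isoformat())       # 2014-04-22T23:37:30+00:00
print((newest - oldest).days)   # 964
```

The first value matches the date quoted above, and the gap between oldest and newest token access times is a little over two and a half years.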
Re: BAYES_00 Query
Posted by John Hardin <jh...@impsec.org>.
On Sat, 26 Apr 2014, Brian Eliassen wrote:
> Yes, I had that much spam stored up.
Good.
> I cannot figure out why BAYES_00 would hit on these.
First, do you have autolearn enabled? If so, I would turn it off until the
basic initial Bayes training is proven.
Second, if spams are hitting BAYES_00 that means they "look hammy" based
on how Bayes has been trained. Take a look, manually, at *every* message
in your ham corpus and verify that it indeed is purely ham.
You can also add some of the recent misclassified spams to your spam
training corpus.
Then wipe and retrain again.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
...to announce there must be no criticism of the President or to
stand by the President right or wrong is not only unpatriotic and
servile, but is morally treasonous to the American public.
-- Theodore Roosevelt, 1918
-----------------------------------------------------------------------
696 days since the first successful private support mission to ISS (SpaceX)
Re: BAYES_00 Query
Posted by Benny Pedersen <me...@junc.eu>.
Check your Bayes settings. If you are not using the SQL Bayes backend, did you train as the same user that mimedefang runs as? Is your setup global Bayes or per-user?
--
Sent from my Android phone with K-9 Mail. Sorry if I am a bit brief.