Posted to users@spamassassin.apache.org by Arthur Dent <mi...@blueyonder.co.uk> on 2008/08/14 16:29:04 UTC

Something not quite right (Mainly with Bayes)

Hello all,

I'm running SpamAssassin version 3.2.5 on Perl version 5.10.

It has always given me good results in the past and still does, though
it now seems that is almost entirely due to the RBLs, JM Sought, Pyzor
and Razor. The stock rules, the SARE rulesets and, in particular, Bayes
seem to be letting me down.

For example:

This message slipped through:
http://troodos.mine.nu/spamsamples/Job1.txt

Fair enough... a few hours later the RBLs had it, and so did JM Sought.
What worried me was that the very next day (after my cron job would have
learned that message as spam) I got this message:
http://troodos.mine.nu/spamsamples/Job2.txt

Which still showed BAYES_50. To my untrained eye the body, at least, of
that message is near enough identical to the first one. Surely that
should warrant a much higher Bayes hit?

My sa-learn --dump magic shows:
0.000          0          3          0  non-token data: bayes db version
0.000          0       6347          0  non-token data: nspam
0.000          0      14755          0  non-token data: nham
0.000          0     145389          0  non-token data: ntokens
0.000          0 1215403368          0  non-token data: oldest atime
0.000          0 1218686581          0  non-token data: newest atime
0.000          0 1218684032          0  non-token data: last journal sync atime
0.000          0 1218168235          0  non-token data: last expiry atime
0.000          0    2764800          0  non-token data: last expire atime delta
0.000          0      23036          0  non-token data: last expire reduction count
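
In case it helps anyone reading those numbers, here is a minimal Python
sketch that turns the dump into something more readable. It assumes the
value of each "non-token data" line sits in the third numeric column (as
it appears to above) and that the atime fields are Unix epoch seconds.
The names parse_dump_magic and to_utc are my own, not part of
SpamAssassin.

```python
from datetime import datetime, timezone

# Sample input copied from the dump above (wrapped lines rejoined).
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       6347          0  non-token data: nspam
0.000          0      14755          0  non-token data: nham
0.000          0     145389          0  non-token data: ntokens
0.000          0 1215403368          0  non-token data: oldest atime
0.000          0 1218686581          0  non-token data: newest atime
0.000          0 1218684032          0  non-token data: last journal sync atime
0.000          0 1218168235          0  non-token data: last expiry atime
0.000          0    2764800          0  non-token data: last expire atime delta
0.000          0      23036          0  non-token data: last expire reduction count
"""

PREFIX = "non-token data: "


def parse_dump_magic(text):
    """Return {label: int} for every 'non-token data' line in the dump.

    Assumption: the value is the third whitespace-separated field.
    """
    values = {}
    for line in text.splitlines():
        fields = line.split(None, 4)  # four numeric columns, then the label
        if len(fields) < 5 or not fields[4].startswith(PREFIX):
            continue
        values[fields[4][len(PREFIX):].strip()] = int(fields[2])
    return values


def to_utc(epoch_seconds):
    """Interpret an atime value as Unix epoch seconds (assumption)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


magic = parse_dump_magic(SAMPLE)

# Share of learned messages that were spam.
spam_share = magic["nspam"] / (magic["nspam"] + magic["nham"])
print(f"spam share of learned messages: {spam_share:.1%}")

# Human-readable versions of the timestamp fields.
for label in ("oldest atime", "newest atime",
              "last journal sync atime", "last expiry atime"):
    print(f"{label}: {to_utc(magic[label]).isoformat()}")

# The expire delta is a duration in seconds, not a timestamp.
print(f"last expire atime delta: {magic['last expire atime delta'] / 86400:.0f} days")
```

This only reformats what sa-learn already reports. It does not say
anything about why a given message lands on BAYES_50.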

Also I got this message:
http://troodos.mine.nu/spamsamples/Watch1.txt
(Which now scores around 19 with JM Sought etc.)

Surely this should have hit some stock rules or SARE rules? (I've had
some blatant pharma stuff slip through too!)

What's wrong with my set-up?

Thanks in advance...

Mark

p.s.

Pastebin wouldn't allow me to post those samples (they hit their spam
filters!). Where else can one post spam samples now? (My webspace won't
cope with too much traffic.)