Posted to users@spamassassin.apache.org by Matt Wills <sa...@virtualvermont.com> on 2005/07/02 13:33:34 UTC

Three questions

I have only recently begun using SA 2.63, because that's what my host made 
available through automatic install.

With the help of this group and such as SARE and other available rulesets, 
I am catching a load of spam. No Bayes, just the packaged rulesets and the 
local.cf. I am enjoying learning regex and how to write rules. If I get 
brave, I may try to implement RDJ.

Questions:

1. I have been reluctant to implement Bayes. With close to 200 users,
there's no shortage of spam to teach with, but there's only my own ham,
which consists mostly of some list mail such as this, some clients and a 
few friends (we talk on the phone far more often than we email). I'm 
certainly not going to go digging into my clients' mailboxes for it. It 
thus appears to me that for Bayes to learn my ham wouldn't really be a good 
lesson. Am I correct? My big client is a publisher, and the words in their 
ham are considerably different than in mine.

2. Is there any compelling reason to install SA 3.x myself? Given how I am 
using it and am likely to continue using it, 2.63 seems to be accomplishing 
quite a bit. It don't seem to be broke, should I fix it?

3. Regarding regex, I see a number of rules that contain a construction 
such as this (from Mike Kettler's Anti Drug ruleset):

body __DRUGS_ERECTILE4	/\bC(?:alis|ilias|ilais)\b/i

What does the "?:" do? I see similar rules without it, which appear to do
what is intended, and I am not finding much to explain it in any regex
search I've done.

Matt



Re: Three questions

Posted by Loren Wilton <lw...@earthlink.net>.
Bowie has answered your questions well.  I'd just like to add a little to an
answer to your first question.

You are probably correct that your ham isn't close to that of your
other users, so you might be best off not enabling Bayes if you can't figure
out a way to let them submit the occasional ham.

One way you might be able to do this would be a site-wide Bayes and
auto-training, but I'd be inclined to watch it like a hawk for the first few
weeks, and keep a fair eye on statistics for a few weeks after that to see
if all is well.

If you try this, you would probably want to initially train with the ham
that you have, which is yours (less this group), and set the auto-learn ham
threshold down to -0.1 or even -0.2 if you have any negative-scoring rules at
all other than Bayes itself.


One thing for sure, you DO NOT want to submit the mail in *this* group as
Bayes fodder!  There is too much stuff here that is really spammy-looking
(not to mention the occasional copied or attached spam), and it could
possibly really mess up your Bayes training.

In fact, if you have Procmail or some such for your frontend, you should try
to arrange to make this newsgroup bypass SA entirely.


On 2.63 and "if it ain't broke", unfortunately it potentially is, so you may
want to install 2.64.  I don't know that I've ever heard of the DoS
possibility actually being exploited, but it is there for someone to
attempt.

        Loren


Re: Three questions

Posted by Matt Kettler <mk...@comcast.net>.
At 07:33 AM 7/2/2005, Matt Wills wrote:
>2. Is there any compelling reason to install SA 3.x myself? Given how I am 
>using it and am likely to continue using it, 2.63 seems to be 
>accomplishing quite a bit. It don't seem to be broke, should I fix it?

2.63 is DEFINITELY, without question, broke. It's vulnerable to a DoS
attack if someone mails you a particular malformed MIME email.

http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2004-0796

Upgrade to 2.64 or 3.0.4. Neither has any known DoS vulnerabilities. 3.0.0 
is also free of DoSes but it's rough, and 3.0.1-3.0.3 are vulnerable to a 
different DoS.

>3. Regarding regex, I see a number of rules that contain a construction 
>such as this (from Mike Kettler's Anti Drug ruleset):
>
>body __DRUGS_ERECTILE4  /\bC(?:alis|ilias|ilais)\b/i
>
>What does the "?:" do? I see similar rules without it, which appear to do
>what is intended, I am not finding much to explain it in any regex search
>I've done.

It makes the group non-capturing, which turns off backreferences for it.
Normally the string matched by a ()'s section of a regex can later be
backreferenced as \1. This is often used to look for duplicate strings, or
other matched pairs.

However, setting it up if you aren't going to use it later in the regex 
wastes time and memory.
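
To see the difference in practice, here is a small sketch. It uses Python's
re module rather than SpamAssassin itself (SA rules are Perl regexes, but the
(?:...) syntax is the same in both). The sample text and variable names are
made up for illustration; only the pattern comes from the rule quoted above.

```python
import re

# Capturing group: what the alternation matched is saved as group 1.
capturing = re.compile(r"\bC(alis|ilias|ilais)\b", re.IGNORECASE)

# Non-capturing group, as in the quoted rule: same match, nothing saved.
non_capturing = re.compile(r"\bC(?:alis|ilias|ilais)\b", re.IGNORECASE)

text = "cheap Cilias here"  # hypothetical sample body text

m1 = capturing.search(text)
m2 = non_capturing.search(text)

# Both forms match exactly the same text.
same_match = m1.group(0) == m2.group(0)

# Only the capturing form stores a group for later use.
capturing_groups = m1.groups()
non_capturing_groups = m2.groups()

# A backreference needs a capturing group: \1 must repeat whatever
# group 1 matched, which is how duplicate words can be detected.
doubled = re.compile(r"\b(\w+) \1\b")
has_repeat = doubled.search("buy buy now") is not None
no_repeat = doubled.search("buy it now") is None
```

So for a rule like the one above, which never refers back to the group, the
two forms hit the same mail; the (?:...) version just skips the bookkeeping.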