Posted to users@spamassassin.apache.org by Matt <ma...@fileholder.net> on 2005/03/23 18:01:45 UTC

Effectiveness

When I first updated to SpamAssassin 3.0.2 in December it worked great and 
stopped 95% of my junk.  Now it's down to about 65% and seems to be getting 
worse.  I guess it's just a matter of the spammers having a copy of their own 
and tweaking their spam for a low score.  Has anyone else noticed this?  I 
have Razor, DCC and, I think, all possible options enabled and working.

It sure would be nice if the rules were able to dynamically update like a 
virus scanner but from reading the FAQ this just is not possible.  I would 
think it would just be a matter of updating the *.cf files in 
"/usr/share/spamassassin/".

Another thing is I have several domains.  One, from our dialup ISP, is 10 
years old.  It has several email addresses that are dead and receive nothing 
but junk and lots of it, about 20 pieces or more an hour.  Is there any way 
I can use these to improve the effectiveness of SpamAssassin?

Thanks.

Matt

Re[2]: Effectiveness

Posted by Robert Menschel <Ro...@Menschel.net>.

Hello Matt,

Wednesday, March 23, 2005, 12:47:12 PM, you wrote:

>> extra rules from www.rulesemporium.com/rules, auto updated with 
>> rules_du_jour.
>> make sure the surbl URI-RBL's are active.

M> They are.  Which rule sets should I choose from those below?  This domain is
M> for a small ISP so has a diversity of users.

M> # Here are some of the rulesets included in the 1.12 release:
M> # "MRWIGGLY BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS BOGUSVIRUS SARE_ADULT
M> SARE_FRAUD SARE_BML SARE_RATWARE SARE_SPOOF SARE_BAYES_POISON_NXM SARE_OEM
M> SARE_RANDOM SARE_HEADER_ABUSE SARE_CODING_HTML";

I'm not sure what you mean by the 1.12 release -- We don't release
rules files as a family like that.

Looking at the current list at http://www.rulesemporium.com/rules.htm
    * Yes: redir
    * No:  bigevil
    * Yes: evilnumbers
    * Yes: bayes_poison_nxm
    * Some: html
    * Some: header
    * Yes: specific spammers
    * No:  ratware
    * Yes: adult
    * Yes: biz_market_learn
    * Yes: fraud
    * Yes: random
    * Yes: spoof
    * No:  sc_top200 (assuming SURBL and other net tests)
    * Yes: oem
    * Some: genlsubj
    * No:  highrisk
    * Yes: unsub
    * Some: uri

The ones I've marked "Some" have multiple files inside.  I'd suggest
files 0 and 1, and possibly the English file as well.

Looking at http://www.rulesemporium.com/other-rules.htm also use:
    * 99_FVGT_Tripwire.cf  (ver 1.18)
      (Fred -- is this your most current version?)
    * backhair.cf
    * chickenpox.cf
    * weeds_2.cf

Read the documentation in each of these files and decide for yourself.
Each system is different, as are each system's users.  This set should
be OK for a generic ISP in the USA, but we make no guarantees.

Bob Menschel

RE: Effectiveness

Posted by Greg Allen <ga...@netrox.net>.

I like the new URL black lists in SA so much I gave them 4 points each
(but do that at your own risk). Bayes is even doing better now as a result
of the better auto-learning with the URL black lists. I also now give
BAYES_99 5 points (all other Bayes scores are default or lower) and have had
no false positives on BAYES_99 yet.

I don't know my exact hit rate with the new SA, but I would say it has to be
in the very HIGH 90s. I am still impressed every day I see it in action.

With URL black lists this new SA is the best thing going.
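
For anyone who wants to try this, the scoring described above might look roughly like the local.cf fragment below. This is only a sketch: it assumes "URL black lists" means the SURBL rules that ship with SpamAssassin 3.0 (the URIBL_*_SURBL names), and that the stock rule names are unchanged on your install. Check the rule names against your own 25_uribl.cf before copying anything.

```
# local.cf (sketch; rule names assumed to match the stock 3.0 SURBL rules)
score URIBL_AB_SURBL 4.0
score URIBL_OB_SURBL 4.0
score URIBL_PH_SURBL 4.0
score URIBL_SC_SURBL 4.0
score URIBL_WS_SURBL 4.0

# Top Bayes bucket raised to 5 points; other BAYES_* scores left at defaults
score BAYES_99 5.0
```

With a required score of 5.0, BAYES_99 alone would then be enough to tag a message, so this is only sensible with a well-trained Bayes database.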

-----Original Message-----
From: Jeff Chan [mailto:jeffc@surbl.org]
Sent: Wednesday, March 23, 2005 10:35 PM
To: spamassassin-users@apache.org
Subject: Re: Effectiveness

On Wednesday, March 23, 2005, 12:47:12 PM, Matt Matt wrote:
>> extra rules from www.rulesemporium.com/rules, auto updated with
>> rules_du_jour.
>>
>> make sure the surbl URI-RBL's are active.

> They are.  Which rule sets should I choose from those below?  This domain is
> for a small ISP so has a diversity of users.

> Thanks.

> Matt

> # Here are some of the rulesets included in the 1.12 release:
> # "MRWIGGLY BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS BOGUSVIRUS SARE_ADULT
> SARE_FRAUD SARE_BML SARE_RATWARE SARE_SPOOF SARE_BAYES_POISON_NXM SARE_OEM
> SARE_RANDOM SARE_HEADER_ABUSE SARE_CODING_HTML";

If SURBLs are active you should be detecting at least 90% of
spams (more like 99+%).  The rules above are SARE rules, not
SURBL ones, BTW.

In order to use SURBLs you need to have Network tests enabled:

  http://www.surbl.org/faq.html#nettest

Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/

Re: Effectiveness

Posted by Jeff Chan <je...@surbl.org>.

On Wednesday, March 23, 2005, 12:47:12 PM, Matt Matt wrote:
>> extra rules from www.rulesemporium.com/rules, auto updated with 
>> rules_du_jour.
>>
>> make sure the surbl URI-RBL's are active.

> They are.  Which rule sets should I choose from those below?  This domain is 
> for a small ISP so has a diversity of users.

> Thanks.

> Matt

> # Here are some of the rulesets included in the 1.12 release:
> # "MRWIGGLY BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS BOGUSVIRUS SARE_ADULT 
> SARE_FRAUD SARE_BML SARE_RATWARE SARE_SPOOF SARE_BAYES_POISON_NXM SARE_OEM 
> SARE_RANDOM SARE_HEADER_ABUSE SARE_CODING_HTML";

If SURBLs are active you should be detecting at least 90% of
spams (more like 99+%).  The rules above are SARE rules, not
SURBL ones, BTW.

In order to use SURBLs you need to have Network tests enabled:

  http://www.surbl.org/faq.html#nettest
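
As a rough sketch of what "network tests enabled" means in configuration terms, assuming a stock SpamAssassin 3.0 layout (file locations vary by distribution, so treat the paths in the comments as assumptions):

```
# init.pre (sketch): the URI blacklist plugin must be loaded
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

# local.cf (sketch): do not skip DNS blacklist checks (0 is the default)
skip_rbl_checks 0
```

Separately, spamd or spamassassin must not be started with -L (--local), since that option turns off all network tests regardless of the settings above.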

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/

Re: Effectiveness

Posted by Matt <ma...@fileholder.net>.

> extra rules from www.rulesemporium.com/rules, auto updated with 
> rules_du_jour.
>
> make sure the surbl URI-RBL's are active.

They are.  Which rule sets should I choose from those below?  This domain is 
for a small ISP so has a diversity of users.

Thanks.

Matt

# Here are some of the rulesets included in the 1.12 release:
# "MRWIGGLY BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS BOGUSVIRUS SARE_ADULT 
SARE_FRAUD SARE_BML SARE_RATWARE SARE_SPOOF SARE_BAYES_POISON_NXM SARE_OEM 
SARE_RANDOM SARE_HEADER_ABUSE SARE_CODING_HTML";

Re: Effectiveness

Posted by Martin Hepworth <ma...@solid-state-logic.com>.

Matt
extra rules from www.rulesemporium.com/rules, auto updated with 
rules_du_jour.

make sure the surbl URI-RBL's are active.


--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300


Matt wrote:
> When I first updated to SpamAssassin 3.0.2 in December it worked great 
> and stopped 95% of my junk.  Now it's down to about 65% and seems to be 
> getting worse.  I guess it's just a matter of the spammers having a copy 
> of their own and tweaking their spam for a low score.  Has anyone else 
> noticed this?  I have Razor, DCC and, I think, all possible options enabled 
> and working.
> 
> It sure would be nice if the rules were able to dynamically update like 
> a virus scanner but from reading the FAQ this just is not possible.  I 
> would think it would just be a matter of updating the *.cf files in 
> "/usr/share/spamassassin/".
> 
> Another thing is I have several domains.  One, from our dialup ISP, is 10 
> years old.  It has several email addresses that are dead and receive 
> nothing but junk and lots of it, about 20 pieces or more an hour.  Is 
> there any way I can use these to improve the effectiveness of SpamAssassin?
> 
> Thanks.
> 
> Matt
> 
> 

Re: Effectiveness

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.

Eric A. Hall wrote:
> On 3/28/2005 2:07 PM, Daryl C. W. O'Shea wrote:
> 
> 
>>Better yet, is to not even bother running mail for that account through 
>>SpamAssassin in the first place and instead just pipe it to sa-learn. 
>>No point in filtering mail that you are positive is 100% spam.
> 
> 
> except that he wants to blacklist for all of the other recipients too, so
> running it through SA with blacklist_to is needed for that, even with
> really high bayes marks
> 

Oops... wrong thread.  I was thinking he just wanted to learn spamtrap 
mails.  :0

Re: Effectiveness

Posted by "Eric A. Hall" <eh...@ehsco.com>.

On 3/28/2005 2:07 PM, Daryl C. W. O'Shea wrote:

> Better yet, is to not even bother running mail for that account through 
> SpamAssassin in the first place and instead just pipe it to sa-learn. 
> No point in filtering mail that you are positive is 100% spam.

except that he wants to blacklist for all of the other recipients too, so
running it through SA with blacklist_to is needed for that, even with
really high bayes marks

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.

Eric A. Hall wrote:
> On 3/28/2005 9:30 AM, Matt wrote:
> 
>>That worked but you're right, it has no effect on the autolearn=spam.  Any idea 
>>how I get it to autolearn all email to a given address as spam?
> 
> 
> can you pipe incoming mail for that account to sa-learn?
> 

Even if you were to alter the tflags for the rule that fires on a 
blacklist_to address, the mail still wouldn't always be learnt due to 
the requirement for points from both the header and body.
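
To make the header/body requirement a bit more concrete, here is a sketch of the relevant auto-learn settings, assuming the stock SpamAssassin 3.0 option names and their documented defaults. As far as I know, on top of the total threshold the auto-learner also wants at least 3 points from header rules and 3 points from body rules before it learns a message as spam, and it ignores rules flagged noautolearn when adding up the score. That is why raising one score by itself does not guarantee autolearn=spam.

```
# local.cf (sketch; values shown are believed to be the defaults)
bayes_auto_learn 1

# Learn as spam only at or above this auto-learn score
bayes_auto_learn_threshold_spam 12.0

# Learn as ham only at or below this auto-learn score
bayes_auto_learn_threshold_nonspam 0.1
```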

One way to do what you want is what Eric suggested and pipe the mail to 
sa-learn.  Of course you don't need to pipe all the mail, just the mail 
that hasn't yet been learned (but 're-learning' won't hurt, it'll just 
take time).

Better yet, is to not even bother running mail for that account through 
SpamAssassin in the first place and instead just pipe it to sa-learn. 
No point in filtering mail that you are positive is 100% spam.

Daryl

Re: Effectiveness

Posted by "Eric A. Hall" <eh...@ehsco.com>.

On 3/28/2005 9:30 AM, Matt wrote:
> That worked but you're right, it has no effect on the autolearn=spam.  Any idea 
> how I get it to autolearn all email to a given address as spam?

can you pipe incoming mail for that account to sa-learn?

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

Posted by Matt <ma...@fileholder.net>.

That worked but you're right, it has no effect on the autolearn=spam.  Any idea 
how I get it to autolearn all email to a given address as spam?

Matt

> score USER_IN_BLACKLIST_TO 100.0
>
> or whatever score you want
>
> Dunno if the bayes auto-learner works with blacklist_to rules; it doesn't
> work with some whitelist rules.

Re: Effectiveness

Posted by "Eric A. Hall" <eh...@ehsco.com>.

On 3/26/2005 4:47 PM, Matt wrote:
> "blacklist_to" appears to add 10 points to spam score.  I would like to
> change it so it adds 20 points.  How would I do that?  Reason being
> that way "blacklist_to" messages will always be scored high enough to
> trigger them to be bayes auto_learn spam.

Add this to one of your *.cf files

	score USER_IN_BLACKLIST_TO 100.0

or whatever score you want

Dunno if the bayes auto-learner works with blacklist_to rules; it doesn't
work with some whitelist rules.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

Posted by Matt <ma...@fileholder.net>.

"blacklist_to" appears to add 10 points to spam score.  I would like to 
change it so it adds 20 points.  How would I do that?  Reason being that way 
"blacklist_to" messages will always be scored high enough to trigger them to 
be bayes auto_learn spam.

Matt

> Add them to your cf with a "blacklist_to email@domain.dom" entry and
> they'll make good spamtraps for other recipients of those same messages
> (but will have no effect on recipients of other copies that were sent
> under separate cover). You could also write the message-id and/or envelope
> sender (among other things) and deal with secondary copies that way. One

Re: Effectiveness

Posted by "Eric A. Hall" <eh...@ehsco.com>.

On 3/23/2005 12:01 PM, Matt wrote:

> Another thing is I have several domains.  One is from our dialup ISP 10
>  years old.  It has several email addresses that are dead and receive
> nothing but junk and lots of it.  About 20 pieces or more an hour.  Is
> there anyway I can use these to improve the effectiveness of
> Spamassassin?

Add them to your cf with a "blacklist_to email@domain.dom" entry and
they'll make good spamtraps for other recipients of those same messages
(but will have no effect on recipients of other copies that were sent
under separate cover). You could also write the message-id and/or envelope
sender (among other things) and deal with secondary copies that way. One
thing I'm noticing more of lately is that some spam will come from three
or four sources all at once, which is presumably happening because
somebody has submitted the spam and mailing list to multiple trojaned PCs,
so my spamtraps are having a little bit less success lately, but they
still work very well.
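
A minimal sketch of such entries, using made-up addresses at example.com in place of the real dead accounts (both addresses and the domain are hypothetical):

```
# local.cf (sketch; addresses are placeholders for the dead accounts)
blacklist_to oldsales@example.com
blacklist_to deaduser@example.com

# File-glob style wildcards are accepted, e.g. every address at a dead domain
blacklist_to *@dead-domain.example.com
```

Any message addressed to one of these then hits USER_IN_BLACKLIST_TO, which is what makes the trap effective for the other recipients of the same copy.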

You can also use the messages to feed a ~global bayes training process if
you're willing to accept the possible side-effects of one-dimensional training.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

Posted by Matt <ma...@fileholder.net>.

It was cleared 6 days ago.  It has 958 messages in it now.  So it's about 160 
messages a day and not any good ones.  Not quite as many as I originally 
thought but still a lot.  The previous owner had the email account 
completely disabled for a couple of years due to the spam.  I re-enabled it just 
to test how well my spam scanning is working.  Seems strange that even after 
being completely dead for several years the crap keeps coming.  It has likely 
not been used as a real email address in over 4 years.

Matt



> 500 spams/day?!
>
> Wow.
>
> I've had this address for something like 7 years. I have never munged and
> it is plastered across the internet (google me...). I have posted to this
> list and nanae. I get about 30 spams a day. (SA kills about 85% of those).
>
> Any idea why your old addresses get that much spam?