Posted to users@spamassassin.apache.org by Dennis Hardy <dh...@sogetthis.com> on 2009/01/23 16:56:44 UTC

please help, getting hammered with snowshoe spam

Hi, I'm getting hammered by snowshoe spam :-(  I've added rules to try to
catch common formats of included URLs in the spam, but I'm wary of scoring
these rules too high because of the potential for false positives.  It's
hard to come up with other rules as the spam e-mail content is so generic. 
By default these spams score incredibly low (bayes, etc.)  In many cases,
the low bayes values are scoring negative, which completely offsets the few
positive scoring rules that I have added.

Are there other RBLs or domain checks or something that could be used to
possibly get more indication that a spam is a snowshoe spam from a "bogus"
domain?  I've also added a meta rule that combines URIBL_BLACK, DCC_CHECK,
and my rules...but spam still gets by many times because it scores so
low/negative otherwise.  Maybe I just need to score everything higher...?

Any thoughts/advice are appreciated :-)


-- 
View this message in context: http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21627042.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
> I've been using this rule to knock some of these down:
>   [...]
> Highly unusual to have a url like that in ham...
> I'm running a meta to bump up the score...

Yes, I've actually been doing the very same thing (URI detection and metas,
and then string matching in the tail part of the e-mail) !  However it has
been getting tedious maintaining the string list manually, because the "xxxx
Marketing" and "xxxx Media" etc. targets and addresses have been changing
far more frequently now.  They'll use them for a few days, then disappear
completely, and new ones will appear.  This type of spam is such an
incredible pain...  Is there some more general way that this sort of thing could be
handled?


Re: please help, getting hammered with snowshoe spam

Posted by Daniel J McDonald <da...@austinenergy.com>.
On Fri, 2009-01-23 at 07:56 -0800, Dennis Hardy wrote:
> Hi, I'm getting hammered by snowshoe spam :-(  I've added rules to try to
> catch common formats of included URLs in the spam, but I'm wary of scoring
> these rules too high because of the potential for false positives.  It's
> hard to come up with other rules as the spam e-mail content is so generic. 
> By default these spams score incredibly low (bayes, etc.)  In many cases,
> the low bayes values are scoring negative, which completely offsets the few
> positive scoring rules that I have added.

I've been using this rule to knock some of these down:
uri AE_ASM			/\/[[:alpha:]]{28,40}$/
describe AE_ASM			long gibberish path used by ASM Marketing
score AE_ASM			1

Highly unusual to have a url like that in ham...
I'm running a meta to bump up the score...
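
As a rough illustration (not part of the rule set above), the pattern can be
checked offline in Python before it is deployed. This is only a sketch:
Python's re module has no POSIX classes, so [[:alpha:]] is approximated here
as [A-Za-z], and the helper name and sample URLs are invented for the example.

```python
import re

# Approximation of the AE_ASM pattern. Python's re lacks [[:alpha:]],
# so ASCII letters stand in for it (an assumption of this sketch).
AE_ASM = re.compile(r"/[A-Za-z]{28,40}$")

def hits_ae_asm(url: str) -> bool:
    """Return True if the URL ends in a slash followed by 28 to 40 letters."""
    return AE_ASM.search(url) is not None

# Hypothetical sample URLs, made up for this example.
gibberish = "http://example.net/" + "a" * 30
ordinary = "http://example.com/products/index.html"

if __name__ == "__main__":
    print(hits_ae_asm(gibberish))
    print(hits_ae_asm(ordinary))
```

Running a candidate pattern against a saved set of ham URLs in this way is a
cheap check for false positives before raising the score.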

-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com


Re: please help, getting hammered with snowshoe spam

Posted by Kai Schaetzl <ma...@conactive.com>.
Dennis Hardy wrote on Fri, 23 Jan 2009 08:36:59 -0800 (PST):

> see http://www.spamhaus.org/faq/answers.lasso?section=Glossary#233

Ah. I know a lot of spam terms, but this is certainly new to me ;-)

> 
> > If the former, put some example up on a pastebin (not here!).
> 
> Yes already done:  http://pastebin.com/m4400a74d

As it doesn't contain any headers, I can't tell whether I would have rejected
it at the MTA anyway. I get:

X-Spam-Report: 
    *  5.0 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
    *      [score: 1.0000]
    *  3.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
    *      [URIs: twolumpsofcoal.net]
    *  0.1 DIET_1 BODY: Lose Weight Spam
    
It may not have been in URIBL_BLACK at the time you got it. But there are 
two other good rules that hit on it. As you are getting BAYES_05 there's 
something wrong with your Bayes I'd say.

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com




RE: please help, getting hammered with snowshoe spam

Posted by Faris Raouf <as...@raouf.net>.
> Do people generally have good non-FP experience with BRBL?  I am thinking of
> bumping up the score, but I get so much spam per day it is hard to check for
> FPs with it enabled.  It seems like a great resource, will it be pushed out
> with "sa-update" soon?  I believe it is enabled in svn, from what I've read.
> 

On one of the systems we run we set it to 0.1 initially to see how it went.
After three months of monitoring we upped it to 3.0 and have never had any
problems. However, you have to take this in the context of the other settings
and mail throughput for this particular system: a tagging score of 4 and a
drop score of 12 (yes, this is a bit high), on roughly 4000 emails per day
(after zen.spamhaus.org DNSBL blocking).

Faris.


Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
Yes, it has been a problem as there are so many domains used.  However, I
took everyone's earlier suggestions, including training Bayes against FN
snowshoe spam and adding the Barracuda RBL (BRBL), and this appears to
almost completely take care of the problem!!  So far I have been able to
remove all of my custom rules except for BRBL of course, and only a few of
these snowshoe spams get through now.  Nice!

Do people generally have good non-FP experience with BRBL?  I am thinking of
bumping up the score, but I get so much spam per day it is hard to check for
FPs with it enabled.  It seems like a great resource, will it be pushed out
with "sa-update" soon?  I believe it is enabled in svn, from what I've read.

Also I am using policyd-weight to do front-end greylisting if the DNSBL
checks trigger as this reduces load on the server.  Can anyone suggest how
to enable the BRBL in policyd-weight?  I'm not sure what values to use.

Again thank you for your help with this problem!  It is great to see SA
working so well now against it :-)


Re: please help, getting hammered with snowshoe spam

Posted by Kai Schaetzl <ma...@conactive.com>.
Karsten Bräckelmann wrote on Fri, 30 Jan 2009 20:25:52 +0100:

> Dennis clearly stated a *week* ago that the "domains change too
> quickly" (actual quote). Getting them listed will not help him. Oh, and
> don't you think he would have created a trivial uri rule already, if
> that would get them caught?

Obviously they are caught for others ;-) Either by Bayes, rules, network 
checks or other measures. It's never a "one hits them all" solution, so 
adding a spam domain to uribl is always good.

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com




Re: please help, getting hammered with snowshoe spam

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-01-30 at 18:28 +0100, Benny Pedersen wrote:
> On Fri, January 23, 2009 17:36, Dennis Hardy wrote:
> 
> > Yes already done:  http://pastebin.com/m4400a74d
> 
> why not get it listed on http://uribl.com/ ?

Benny, this is going to help how?

Dennis clearly stated a *week* ago that the "domains change too
quickly" (actual quote). Getting them listed will not help him. Oh, and
don't you think he would have created a trivial uri rule already, if
that would get them caught?


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: please help, getting hammered with snowshoe spam

Posted by Rob McEwen <ro...@invaluement.com>.
Benny Pedersen wrote:
> On Fri, January 23, 2009 17:36, Dennis Hardy wrote:
>   
>> Yes already done:  http://pastebin.com/m4400a74d
>>     
> why not get it listed on http://uribl.com/ ?
>   

Both uribl and ivmURI listed this domain back on January 23rd. But it is
unclear exactly *when* this spam sample was sent because the person who
started this thread didn't include full headers. So it is unclear whether the
message hit his server before or after these two URI blacklists listed
that domain. (I'm guessing after?)

-- 
Rob McEwen
http://dnsbl.invaluement.com/
rob@invaluement.com
+1 (478) 475-9032



Re: please help, getting hammered with snowshoe spam

Posted by Benny Pedersen <me...@junc.org>.
On Fri, January 23, 2009 17:36, Dennis Hardy wrote:

> Yes already done:  http://pastebin.com/m4400a74d

why not get it listed on http://uribl.com/ ?

-- 
http://localhost/ 100% uptime and 100% mirrored :)


Re: please help, getting hammered with snowshoe spam

Posted by mouss <mo...@ml.netoyen.net>.
Dennis Hardy wrote:
>> Is this spam for snowshoes or some "spam term"?
> 
> "Like a snowshoe spreads the load of a traveler across a wide area of snow,
> some spammers use many frequently-changing IP addresses and domains to
> spread out the spam load in order to dilute recipient reputation metrics and
> evade filters."
> 
> see http://www.spamhaus.org/faq/answers.lasso?section=Glossary#233
> 
>> If the former, put some example up on a pastebin (not here!).
> 
> Yes already done:  http://pastebin.com/m4400a74d

You need to show full headers. There are generally patterns in the
envelope sender and in a few headers.

Also, consider using BRBL:

header   RCVD_IN_BRBL          eval:check_rbl('brbl-lastexternal', 'bb.barracudacentral.org.')
describe RCVD_IN_BRBL          Received via a relay in Barracuda BRBL
tflags   RCVD_IN_BRBL          net
score    RCVD_IN_BRBL          3.0

Adjust the score to suit your setup, of course.
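
For readers unfamiliar with how a DNSBL check such as the one above works on
the wire, here is a minimal sketch of the lookup name a client builds: the
IPv4 octets are reversed and prepended to the list's zone. The function name
and the sample address (from the 192.0.2.0/24 documentation range) are
assumptions made for this example; no network query is performed.

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a DNSBL client looks up for an IPv4 address."""
    # Validate the address, then reverse its octets (192.0.2.1 -> 1.2.0.192).
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")

if __name__ == "__main__":
    # Hypothetical relay address, chosen only for illustration.
    print(dnsbl_query_name("192.0.2.1", "bb.barracudacentral.org."))
```

An A record in the answer (conventionally inside 127.0.0.0/8) means the
address is listed; NXDOMAIN means it is not.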







Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
> Is this spam for snowshoes or some "spam term"?

"Like a snowshoe spreads the load of a traveler across a wide area of snow,
some spammers use many frequently-changing IP addresses and domains to
spread out the spam load in order to dilute recipient reputation metrics and
evade filters."

see http://www.spamhaus.org/faq/answers.lasso?section=Glossary#233

> If the former, put some example up on a pastebin (not here!).

Yes already done:  http://pastebin.com/m4400a74d


Re: please help, getting hammered with snowshoe spam

Posted by Kai Schaetzl <ma...@conactive.com>.
Dennis Hardy wrote on Fri, 23 Jan 2009 07:56:44 -0800 (PST):

> Hi, I'm getting hammered by snowshoe spam

Is this spam for snowshoes or some "spam term"? If the former, put some 
example up on a pastebin (not here!).

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com




Re: please help, getting hammered with snowshoe spam

Posted by Derek Harding <de...@innovyx.com>.
Dennis Hardy wrote:
> Hi, I'm getting hammered by snowshoe spam :-(
>
> Any thoughts/advice are appreciated :-)
>   

When this started happening to us the only solution I found was manual 
CIDR blocks.

Yes, I know, very last millennium, but I didn't find anything else to work 
with. Some particular snowshoers had patterns I could use but it seemed 
the addresses under attack were rapidly passed out among a large number 
of different outfits each with different styles. Bayes did not help sadly.

Derek


Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
Everyone has given very helpful feedback!  At present it definitely sounds
like I should tweak my rules and train my bayes.  I will try taking steps
here and see how it goes.

Thank you all so very much!


Re: please help, getting hammered with snowshoe spam

Posted by John Hardin <jh...@impsec.org>.
On Fri, 23 Jan 2009, Dennis Hardy wrote:

> Here is what I have been using (from previous help from this mail list!):
>
>    uri SSS_URI30 /\bhttp:\/\/[^\.\/]+\.(?i:com|net|info|biz)\/\w{30}\b/
>    score SSS_URI30 1.5
>
> this uri rule does work very well.  but they change the length 
> sometimes, so I have a few rules that handle different lengths.  Maybe I 
> should use 29,31 instead of just 30 for example?
>
> Am I being too conservative?  Should I consider bumping the score of 
> this up more?  And my meta up more perhaps?

Again, I'd have to see more examples to comment meaningfully. I would be 
especially interested in whether or not the part after the domain name is 
indeed free from punctuation.

A long string of unpunctuated letters is less likely to FP than a long 
string of letters, numbers and underscores.

You might want to anchor your rule with a $ as it may FP if there is stuff 
in the URI following the string of gibberish. Try it against this very 
legitimate looking (if overly verbose) URI:

   http://fnord.com/retrieve_document_as_pdf3_file.php?123456

And the rule I suggested makes an attempt to detect gibberish by looking 
for a "q" that is not followed by a "u", which is rare in English words.
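
The anchoring point above can be reproduced with a small Python sketch. It
assumes the quoted SSS_URI30 pattern translates directly to Python's re
syntax (the Perl-style "\/" escapes are dropped); the helper name and the
"spammy" URI are invented for illustration.

```python
import re

# Python spelling of the SSS_URI30 pattern quoted above; "\/" becomes "/"
# because the slash is not a delimiter in Python.
UNANCHORED = re.compile(r"\bhttp://[^./]+\.(?i:com|net|info|biz)/\w{30}\b")

# Same pattern with the trailing \b replaced by $ (end of the URI).
ANCHORED = re.compile(r"\bhttp://[^./]+\.(?i:com|net|info|biz)/\w{30}$")

legit = "http://fnord.com/retrieve_document_as_pdf3_file.php?123456"
spammy = "http://example.biz/" + "a" * 30  # hypothetical gibberish-style URI

def hits(pattern, uri: str) -> bool:
    """Return True if the compiled pattern matches anywhere in the URI."""
    return pattern.search(uri) is not None

if __name__ == "__main__":
    print(hits(UNANCHORED, legit), hits(ANCHORED, legit))
    print(hits(UNANCHORED, spammy), hits(ANCHORED, spammy))
```

The path "retrieve_document_as_pdf3_file" happens to be exactly 30 word
characters long, which is why the unanchored form fires on it.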

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Vista: because the audio experience is *far* more important than
   network throughput.
-----------------------------------------------------------------------
  4 days until Wolfgang Amadeus Mozart's 253rd Birthday

Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
> Can you repost that with full headers?

Yes, I have to wait for more to come through though as I have gotten into
the habit of just deleting the FNs.

> No DNSBL hits on the URI domain?

No, the domains change too quickly, so I almost never get DNSBL hits for
these.  I have DNSBL greylisting front-ending SA as well, and I get no hits
there either.  It is really annoying.  Usually someone will submit the domain
and URIBL_BLACK will start hitting after a few messages, though.  I've added a meta for the URL
check (below) and URIBL_BLACK and DCC_CHECK, maybe all I really need to do
is bump up the meta score for this combination?

> We'd need more than one sample URI to do a good job. Have you been
> collecting a corpus?

Not of a FN set.  I should collect this.

> I notice that this URI has a format that may be a good spam sign: the 
> domain name, followed by a long string of unpunctuated text gibberish.

Here is what I have been using (from previous help from this mail list!):

    uri SSS_URI30 /\bhttp:\/\/[^\.\/]+\.(?i:com|net|info|biz)\/\w{30}\b/
    score SSS_URI30 1.5

this uri rule does work very well.  but they change the length sometimes, so
I have a few rules that handle different lengths.   Maybe I should use 29,31
instead of just 30 for example?
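
A quick way to see what a {29,31} range would accept, sketched in Python on
the assumption that the pattern carries over unchanged apart from dropping
the "\/" escapes. The helper names and the example.info host are made up for
this illustration.

```python
import re

def length_range_rule(low: int, high: int):
    """Build an SSS_URI30-style pattern accepting low..high word characters."""
    return re.compile(
        r"\bhttp://[^./]+\.(?i:com|net|info|biz)/\w{%d,%d}\b" % (low, high)
    )

rule = length_range_rule(29, 31)

def matches(length: int) -> bool:
    """Test the rule against a hypothetical URI whose path has `length` letters."""
    uri = "http://example.info/" + "a" * length
    return rule.search(uri) is not None

if __name__ == "__main__":
    for n in range(28, 33):
        print(n, matches(n))
```

One range rule of this kind could stand in for several fixed-length rules,
at the cost of a somewhat wider net for false positives.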

Am I being too conservative?  Should I consider bumping the score of this up
more?  And my meta up more perhaps?


Re: please help, getting hammered with snowshoe spam

Posted by John Hardin <jh...@impsec.org>.
On Fri, 23 Jan 2009, Dennis Hardy wrote:

>
>> why are those scores low? What gives them negative score?
>> those rules have quite high score...
>
> Here is an example (without my rules):  http://pastebin.com/m4400a74d

Can you repost that with full headers?

> The ones that get through are relatively short and simple, and many are 
> very "clean".

No DNSBL hits on the URI domain?

> I've been thinking about maybe writing an SA plugin that counts the 
> three repeated URL patterns that are always present in all of these 
> spams, but I don't know where to start in trying to do that.

We'd need more than one sample URI to do a good job. Have you been 
collecting a corpus?

I notice that this URI has a format that may be a good spam sign: the 
domain name, followed by a long string of unpunctuated text gibberish.

Just off the top of my head and untested, how does this do against your 
corpus?

   uri GIBBERISH ;://[^/]{4,50}/(?=[a-z]{25,80}$)[a-z]{0,80}q[^u][a-z]{0,80}$;i
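
For anyone who wants to try the idea against a corpus outside SpamAssassin,
here is a rough Python equivalent of the rule above. It is only a sketch: the
";...;i" delimiters become a compiled pattern with re.IGNORECASE, and the
helper name and sample URIs are invented for the example.

```python
import re

# The lookahead requires the whole path to be 25-80 letters; the rest of the
# pattern then demands a "q" followed by some letter other than "u".
GIBBERISH = re.compile(
    r"://[^/]{4,50}/(?=[a-z]{25,80}$)[a-z]{0,80}q[^u][a-z]{0,80}$",
    re.IGNORECASE,
)

def looks_like_gibberish(uri: str) -> bool:
    """Return True if the URI path is a long letter run with a non-'qu' q."""
    return GIBBERISH.search(uri) is not None

# Hypothetical sample URIs.
with_bare_q = "http://example.com/abcdefghijklmnopqrstuvwxyzabcd"     # has "qr"
english_like = "http://example.com/quickbrownfoxjumpsoverthelazydogs"  # only "qu"

if __name__ == "__main__":
    print(looks_like_gibberish(with_bare_q))
    print(looks_like_gibberish(english_like))
```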

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Gun Control is nothing more than an attempt to return to feudalism,
   where the peasants are helpless and must humbly petition their lord
   and master to protect them from bandits and thieves (when they can
   get around to it), and where the lords and masters can abuse the
   peasants whenever they like without fear of effective resistance.
-----------------------------------------------------------------------
  4 days until Wolfgang Amadeus Mozart's 253rd Birthday

Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
> your BAYES is misfiring. The difference between BAYES_05 and BAYES_99 is 4.6,
> so you could have a score of 5.7 if you had a well-trained BAYES.

Yes, that would be great.  I will look at trying this.  I do get tens of
thousands of e-mails a day through this system though so it is hard to do
manual processes.  I need to play conservative and can't afford FPs at
all...


Re: please help, getting hammered with snowshoe spam

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
> > why are those scores low? What gives them negative score?
> > those rules have quite high score...

On 23.01.09 08:26, Dennis Hardy wrote:
> Here is an example (without my rules):  http://pastebin.com/m4400a74d

X-Spam-Status: No, score=1.1 required=5.0 tests=BAYES_05,DCC_CHECK,DIET_1,
        SPF_HELO_PASS,SPF_PASS autolearn=no version=3.2.5

your BAYES is misfiring. The difference between BAYES_05 and BAYES_99 is 4.6,
so you could have a score of 5.7 if you had a well-trained BAYES.
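
The arithmetic, spelled out as a tiny Python sketch. The 1.1 and the 5.0
threshold come from the X-Spam-Status line above and the 4.6 difference is
the figure given in this message; the function name is made up.

```python
def projected_score(observed: float, bayes_delta: float) -> float:
    """Score the message would get if BAYES_99 fired instead of BAYES_05."""
    return round(observed + bayes_delta, 1)

REQUIRED = 5.0   # "required=5.0" in the status line
OBSERVED = 1.1   # "score=1.1" in the status line
DELTA = 4.6      # BAYES_99 minus BAYES_05, as stated above

if __name__ == "__main__":
    score = projected_score(OBSERVED, DELTA)
    print(score, score >= REQUIRED)
```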

> The ones that get through are relatively short and simple, and many are very
> "clean".  This example is just one that focuses on weight loss, some are
> regarding tea or satellite companies or coffee makers or the like.  I worry
> about increasing FPs of real e-mails by training of "clean" spams as spam,
> when they are short and sweet and many times look like they could be
> legitimate e-mails.

just train on them, and remember to train on clean mails (especially those
which will start getting higher BAYES score).

> Also would training bayes on this sort of e-mail help if many things are
> different between each e-mail, and if the e-mail is so short and relatively
> "clean"?  Addresses change, company names change, sender domains are always
> different, etc

If you trained with enough mail, it would help. However, the result says
similar mails were trained as ham, which is what you should investigate and
fix.

On some mailboxes I keep trained ham/spam in special folders so I can
re-train or forget whenever anything turns out to be incorrect.

> I've been thinking about maybe writing an SA plugin that counts the three
> repeated URL patterns that are always present in all of these spams, but I
> don't know where to start in trying to do that.  I was hoping I could just
> handle this with SA rules or something (like using another RBL or
> something).

More mails could give an idea of what should be hit. Maybe a rule would be
enough, with no need to create a plugin. But I'm sure BAYES training should be
enough for this mail...

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Support bacteria - they're the only culture some people have. 

Re: please help, getting hammered with snowshoe spam

Posted by Dennis Hardy <dh...@sogetthis.com>.
> why are those scores low? What gives them negative score?
> those rules have quite high score...

Here is an example (without my rules):  http://pastebin.com/m4400a74d

The ones that get through are relatively short and simple, and many are very
"clean".  This example is just one that focuses on weight loss, some are
regarding tea or satellite companies or coffee makers or the like.  I worry
about increasing FPs of real e-mails by training of "clean" spams as spam,
when they are short and sweet and many times look like they could be
legitimate e-mails.

Also would training bayes on this sort of e-mail help if many things are
different between each e-mail, and if the e-mail is so short and relatively
"clean"?  Addresses change, company names change, sender domains are always
different, etc

I've been thinking about maybe writing an SA plugin that counts the three
repeated URL patterns that are always present in all of these spams, but I
don't know where to start in trying to do that.  I was hoping I could just
handle this with SA rules or something (like using another RBL or
something).

Thank you!

Re: please help, getting hammered with snowshoe spam

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 23.01.09 07:56, Dennis Hardy wrote:
> Hi, I'm getting hammered by snowshoe spam :-(  I've added rules to try to
> catch common formats of included URLs in the spam, but I'm wary of scoring
> these rules too high because of the potential for false positives.  It's
> hard to come up with other rules as the spam e-mail content is so generic. 
> By default these spams score incredibly low (bayes, etc.)  In many cases,
> the low bayes values are scoring negative, which completely offsets the few
> positive scoring rules that I have added.

train bayes properly, it's the first thing you should do for such mail.

> Are there other RBLs or domain checks or something that could be used to
> possibly get more indication that a spam is a snowshoe spam from a "bogus"
> domain?  I've also added a meta rule that combines URIBL_BLACK, DCC_CHECK,
> and my rules...but spam still gets by many times because it scores so
> low/negative otherwise.  Maybe I just need to score everything higher...?

why are those scores low? What gives them negative score?
those rules have quite high score...

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Quantum mechanics: The dreams stuff is made of.