Posted to users@spamassassin.apache.org by Geoff Soper <ge...@alphaworks.co.uk> on 2004/12/13 23:51:12 UTC

Watches and pain relief

My current spam problem consists almost entirely of spam advertising
watches and various pain relief products. I have a reasonable variety of
rulesets installed including the following (it's not definitive as I don't
have access to the server right now):
SARE General Subject
SARE HEADER
SARE html0
SARE OEM
SARE Random
SARE Ratware Detection
SARE Specific
SARE Spoof
EvilNumber

Can anyone suggest what might catch these watch and pain relief spams
before I start writing some of my own?

Thanks,
Geoff

Re: Watches and pain relief

Posted by Loren Wilton <lw...@earthlink.net>.
Here are my current watch-catchers.  These will end up in a ruleset at some
point, doubtless with modifications.  You will probably have to fiddle the
scores on these a little.

body  LW_ROLEX  /\broll?ex\b/i
score  LW_ROLEX  1
describe LW_ROLEX  Mentions Rolex

body  __LW_OBREPLICA /\brepIicas?\b/i
body  __LW_REPLICA /\breplicas?\b/i
body  __LW_WATCHES /\bwatch(?:es)?\b/i

meta  LW_ROLEXWATCH LW_ROLEX && __LW_WATCHES
score  LW_ROLEXWATCH 1
describe LW_ROLEXWATCH Mentions rolex watches

meta  LW_FAKEROLEX LW_ROLEX && __LW_REPLICA
score  LW_FAKEROLEX 5
describe LW_FAKEROLEX Talks about rolex and replicas

# Want a cheap Rolex Watch?
body  LW_WANTAROLEX /Want a (?:\w+ ){0,3}Rolex(?: Watch)?\?/i
score  LW_WANTAROLEX 5
describe LW_WANTAROLEX Asks if you want a rolex watch

meta  LW_ROLEXOBFU __LW_OBREPLICA && LW_ROLEX
score  LW_ROLEXOBFU 5
describe LW_ROLEXOBFU Obfuscating replica rolexes!
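
As a rough illustration of how these rules stack, here is a small Python
sketch. It is not SpamAssassin: the score_message helper is a hypothetical
stand-in, Perl's /i is approximated with re.IGNORECASE, and the whole
rendered message is assumed to be a single string. It simply reports which
scored rules would fire and what they add up to.

```python
import re

# Sub-patterns copied from the rules above; re.I stands in for Perl's /i.
ROLEX = re.compile(r"\broll?ex\b", re.I)
OBREPLICA = re.compile(r"\brepIicas?\b", re.I)
REPLICA = re.compile(r"\breplicas?\b", re.I)
WATCHES = re.compile(r"\bwatch(?:es)?\b", re.I)
WANTAROLEX = re.compile(r"Want a (?:\w+ ){0,3}Rolex(?: Watch)?\?", re.I)


def score_message(text):
    """Return {rule name: score} for each scored rule that fires.

    Hypothetical helper: it mirrors the meta expressions above but is
    not how SpamAssassin evaluates rules internally.
    """
    rolex = bool(ROLEX.search(text))
    hits = {}
    if rolex:
        hits["LW_ROLEX"] = 1
    if rolex and WATCHES.search(text):
        hits["LW_ROLEXWATCH"] = 1
    if rolex and REPLICA.search(text):
        hits["LW_FAKEROLEX"] = 5
    if WANTAROLEX.search(text):
        hits["LW_WANTAROLEX"] = 5
    if rolex and OBREPLICA.search(text):
        hits["LW_ROLEXOBFU"] = 5
    return hits


plain = score_message("Want a cheap Rolex Watch? Best replicas here.")
obfu = score_message("Quality repIicas of Rollex")
print(plain, sum(plain.values()))
print(obfu, sum(obfu.values()))
```

Because every pattern carries /i, __LW_OBREPLICA matches "repiicas" as well
as "repIicas", while a correctly spelled "replicas" is left to __LW_REPLICA.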


As for pain meds, you shouldn't be having problems on 3.0 since Antidrug is
part of it.  On 2.64, you need to be running Antidrug as well as the SARE
rulesets.

        Loren


Re: Watches and pain relief

Posted by jdow <jd...@earthlink.net>.
Someone passed some ROLEX rules through the list a week or two ago. They
are in my user_prefs and working just fine. I'd recommend them to the
SARE people.

{^_^}



Re: Watches and pain relief

Posted by Matthew Newton <mc...@leicester.ac.uk>.
Hi

On Mon, Dec 13, 2004 at 04:43:28PM -0800, jdow wrote:
> > I've seen another variant about by Matthew Newton that makes a bunch of
> > rules for both subject and body separately. I generally don't do this as
> > the body rules will match the subject line, so there's really no need,
> > other than as a score amplifier. I usually only make subject rules when a
> > body rule isn't appropriate. He's also done separate regular and gappy-text
> > rules, but doesn't pick up on character-sub obfuscations.. It is a decent
> > set however..
> >
> > One good rule I've seen that Matthew Newton wrote is this one:
> >
> > rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|[Ww])atch\?\s*$/m
> > describe  UOLCC_WATCH_BODY   Body asks if you want a watch
> > score     UOLCC_WATCH_BODY   1.5
> >
> > Very targeted, but effective with low risk of FPs.
> 
> Here is the full set of his stuff I am running. So far it has hit no ham.

I've recently updated some of these to try and match a few that were
slipping through. The UOLCC_WATCH_BODY has now been modified to accept
"rolex" in the place of "cheap", as one like that arrived the other day.
The UOLCC_HTM_HTML_URL one is slightly less picky about which characters
can appear in the "proverb" line and the "name" line, just looking for
more than 8 "words" and less than 15 "words". I figured out that it's
more the repeated URLs that will be unique to the spam, rather than the
formatting of the two text lines. Oh, and the URL can now contain 0-9
and -, too.

Didn't realise that the body test checks the subject, too, but I don't
suppose it can hurt with both tests.

Current set below.

Matthew


---------------------------------------------------------------------

header    UOLCC_ROLEX_SUB1   Subject =~ /\brolex\b/i
describe  UOLCC_ROLEX_SUB1   Subject contains the word 'rolex'
score     UOLCC_ROLEX_SUB1   0.5

header    UOLCC_ROLEX_SUB2   Subject =~ /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_SUB2   Subject contains a gappy version of 'rolex'
score     UOLCC_ROLEX_SUB2   1.5

body      UOLCC_ROLEX_BODY1  /\brolex\b/i
describe  UOLCC_ROLEX_BODY1  Body contains the word 'rolex'
score     UOLCC_ROLEX_BODY1  0.5

body      UOLCC_ROLEX_BODY2  /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_BODY2  Body contains a gappy version of 'rolex'
score     UOLCC_ROLEX_BODY2  1.5

rawbody   UOLCC_WATCH_BODY  /^(Do\syou\s)?[Ww]ant\s(a\s)?(rolex\s|cheap\s)?[Ww](ristw)?atch\?\s*$/m
describe  UOLCC_WATCH_BODY  Body asks if you want a watch
score     UOLCC_WATCH_BODY  2

full      UOLCC_HTM_HTML_URL /\n(http:\/\/[a-z0-9-]+\.[a-z]{3,4}\/[0-9a-f]{5,35}\/[[:alnum:]]{5,20}=?\.htm)\s*\n\s*\n\s*([^\s]+)(\s+[^\s]+){6,}\n\s*\n[^\s,.]+(\s[^\s,.]+){0,15}\n\s*\n\1l/s
describe  UOLCC_HTM_HTML_URL Matches pattern of spam mail (.htm .html)
score     UOLCC_HTM_HTML_URL 3.5

full      UOLCC_BBONE        /\n[bB1 ]{8,20}\n[bB1 ]{8,20}\n/s
describe  UOLCC_BBONE        Contains two code lines with b, B and 1
score     UOLCC_BBONE        2

body      UOLCC_CAPWORD_TEST /([A-Z][a-z]{3,}\s{1,2}){15,}/s
describe  UOLCC_CAPWORD_TEST String of words that all begin with caps letter
score     UOLCC_CAPWORD_TEST 1.2

---------------------------------------------------------------------
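
To make the behaviour of two of these patterns concrete, here is a rough
Python sketch. It only exercises the regexes: Perl's /m and /i are mapped
to re.MULTILINE and re.IGNORECASE, the hits helper is hypothetical, and it
does not reproduce how SpamAssassin feeds text to rawbody or body rules.

```python
import re

# Updated UOLCC_WATCH_BODY pattern. It has /m but no /i, so only the
# explicit [Ww] classes tolerate a capital letter.
WATCH_BODY = re.compile(
    r"^(Do\syou\s)?[Ww]ant\s(a\s)?(rolex\s|cheap\s)?[Ww](ristw)?atch\?\s*$",
    re.MULTILINE,
)

# Gappy pattern shared by UOLCC_ROLEX_SUB2 and UOLCC_ROLEX_BODY2: one or
# two arbitrary characters are required between each letter.
GAPPY_ROLEX = re.compile(r"\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b", re.I)


def hits(pattern, text):
    """Hypothetical helper: True if the pattern matches anywhere in text."""
    return bool(pattern.search(text))


for sample in [
    "Want a cheap watch?",
    "Do you want a rolex wristwatch?",
    "Hello\nWant a watch?\nBye",
    "Want a cheap watch? Click here",
    "Want a Rolex watch?",
]:
    print(repr(sample), hits(WATCH_BODY, sample))

for sample in ["r.o.l.e.x", "r o l e x", "rolex", "r0lex"]:
    print(repr(sample), hits(GAPPY_ROLEX, sample))
```

In this sketch the question has to fill a whole line, so trailing text on
the same line prevents a match, and a capitalised "Rolex" is not accepted
because the pattern carries no /i. The gappy pattern requires a gap, so
plain "rolex" is left to the SUB1/BODY1 rules.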

-- 
Matthew Newton <mc...@le.ac.uk>

UNIX Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom

Re: Watches and pain relief

Posted by Matt Kettler <mk...@comcast.net>.
At 04:43 PM 12/13/2004 -0800, jdow wrote:
>Here is the full set of his stuff I am running. So far it has hit no ham.

Yep, that's the set.. it's pretty decent stuff. The only limitation I see 
is it won't catch r0lex, rol3x, or roIex, just rolex. Hence I quickly 
hacked one up that uses the same character-substitutions and gap-clauses I 
use in antidrug. Thus far I've not seen obfu attempts on the scale of drug 
spam, but I've seen a lot of simple obfuscations like the ones above..

You could simplify my rule a bit if it bothers you by shortening the gap 
clause:
body ROLEX_BODY /(?:\b|\s)[_\W]{0,3}r[_\W]{0,3}[o0\xF2-\xF6][_\W]{0,3}[l!|1][_\W]{0,3}[e3\xE8-\xEB][_\W]{0,3}x[_\W]{0,3}(?:\b|\s)/i

becomes

body ROLEX_BODY /(?:\b|\s).?r.?[o0\xF2-\xF6].?[l!|1].?[e3\xE8-\xEB].?x.?(?:\b|\s)/i

But I'm really more of a fan of [_\W]{0,3} over .? when it comes to 
gapping, as it has fewer potential FPs. I've also definitely found that the 
leading/trailing gap and the (?:\b|\s) at the beginning and end help a 
lot, particularly for strings like __R_O_L_E_X__, which won't hit a rule 
that uses the standard .?'s in the middle with \b's on each end. (_ is a 
word character, so there is no \b between the X and the _ that follows it.)
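
The difference between the two gap clauses can be checked with a rough
Python sketch. The names LONG_GAP, SHORT_GAP and compare are hypothetical,
Perl's /i is approximated with re.IGNORECASE, and "prolex" is a contrived
token used only to show .? swallowing a letter.

```python
import re

# The two ROLEX_BODY variants from above, transcribed unchanged.
LONG_GAP = re.compile(
    r"(?:\b|\s)[_\W]{0,3}r[_\W]{0,3}[o0\xF2-\xF6][_\W]{0,3}[l!|1]"
    r"[_\W]{0,3}[e3\xE8-\xEB][_\W]{0,3}x[_\W]{0,3}(?:\b|\s)",
    re.I,
)
SHORT_GAP = re.compile(
    r"(?:\b|\s).?r.?[o0\xF2-\xF6].?[l!|1].?[e3\xE8-\xEB].?x.?(?:\b|\s)",
    re.I,
)


def compare(text):
    """Hypothetical helper: (long-gap hit, short-gap hit) for one string."""
    return bool(LONG_GAP.search(text)), bool(SHORT_GAP.search(text))


for sample in ["rolex", "r0lex", "rol3x", "r.o.l.e.x",
               "__R_O_L_E_X__", "prolex", "roIex"]:
    print(repr(sample), compare(sample))
```

In this approximation only the [_\W]{0,3} form hits __R_O_L_E_X__, and only
the .? form hits the contrived "prolex", which is the kind of false positive
the tighter gap clause avoids. Neither form matches "roIex" as written,
since the class [l!|1] contains no I; adding one would be a separate change
to the rule.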






Re: Watches and pain relief

Posted by jdow <jd...@earthlink.net>.
From: "Matt Kettler" <mk...@evi-inc.com>

> At 05:51 PM 12/13/2004, Geoff Soper wrote:
> >My current spam problem consists almost entirely of spam advertising
> >watches and various pain relief products. I have a reasonable variety of
> >rulesets installed including the following (it's not definitive as I don't
> >have access to the server right now):
> >SARE General Subject
> >SARE HEADER
> >SARE html0
> >SARE OEM
> >SARE Random
> >SARE Ratware Detection
> >SARE Specific
> >SARE Spoof
> >EvilNumber
> >
> >Can anyone suggest what might catch these watch and pain relief spams
> >before I start writing some of my own?
>
> What version of SA do you use? If 2.6x, look at antidrug for the pain
> relief side.
>
>

...

> I've seen another variant about by Matthew Newton that makes a bunch of
> rules for both subject and body separately. I generally don't do this as
> the body rules will match the subject line, so there's really no need,
> other than as a score amplifier. I usually only make subject rules when a
> body rule isn't appropriate. He's also done separate regular and gappy-text
> rules, but doesn't pick up on character-sub obfuscations.. It is a decent
> set however..
>
> One good rule I've seen that Matthew Newton wrote is this one:
>
> rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|[Ww])atch\?\s*$/m
> describe  UOLCC_WATCH_BODY   Body asks if you want a watch
> score     UOLCC_WATCH_BODY   1.5
>
> Very targeted, but effective with low risk of FPs.

Here is the full set of his stuff I am running. So far it has hit no ham.
--8<--
header    UOLCC_ROLEX_SUB1   Subject =~ /\brolex\b/i
describe  UOLCC_ROLEX_SUB1   Subject contains the word 'rolex'
score     UOLCC_ROLEX_SUB1   0.5

header    UOLCC_ROLEX_SUB2   Subject =~ /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_SUB2   Subject contains a gappy version of 'rolex'
score     UOLCC_ROLEX_SUB2   1.5

body      UOLCC_ROLEX_BODY1  /\brolex\b/i
describe  UOLCC_ROLEX_BODY1  Body contains the word 'rolex'
score     UOLCC_ROLEX_BODY1  0.5

body      UOLCC_ROLEX_BODY2  /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_BODY2  Body contains a gappy version of 'rolex'
score     UOLCC_ROLEX_BODY2  1.5

rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|W)atch\?\s*$/m
describe  UOLCC_WATCH_BODY   Body asks if you want a watch
score     UOLCC_WATCH_BODY   2

#Checking messages with two lines of just b, B, space and 1 in them.
#Seems to be some sort of code used in spam, maybe:

full      UOLCC_BBONE        /\n[bB1 ]{8,20}\n[bB1 ]{8,20}\n/s
describe  UOLCC_BBONE        Contains two code lines with b, B and 1
score     UOLCC_BBONE        2

#Checking one particular type of spam that has a URL (that follows a
#certain pattern, ends .htm), blank line, line of proverb or something,
#blank, line of name, blank, exact same URL with "l" on the end (i.e.
#ends .html). I guess the rules should be small, but this one has picked
#up loads of spam for me:

full      UOLCC_HTM_HTML_URL /\n(http:\/\/[a-z]+\.[a-z]{3,4}\/[0-9a-f]{5,35}\/[[:alnum:]]{5,20}=?\.htm)\s\n\s*\n[[:alnum:]\?\.',\s:,-]+\n\s*\n[^\s,.]+(\s[^\s,.]+){0,15}\n\s*\n\1l/s
describe  UOLCC_HTM_HTML_URL Matches pattern of spam mail (.htm .html)
score     UOLCC_HTM_HTML_URL 3.5
--8<--

{^_^}



Re: Watches and pain relief

Posted by Matt Kettler <mk...@evi-inc.com>.
At 05:51 PM 12/13/2004, Geoff Soper wrote:
>My current spam problem consists almost entirely of spam advertising
>watches and various pain relief products. I have a reasonable variety of
>rulesets installed including the following (it's not definitive as I don't
>have access to the server right now):
>SARE General Subject
>SARE HEADER
>SARE html0
>SARE OEM
>SARE Random
>SARE Ratware Detection
>SARE Specific
>SARE Spoof
>EvilNumber
>
>Can anyone suggest what might catch these watch and pain relief spams
>before I start writing some of my own?

What version of SA do you use? If 2.6x, look at antidrug for the pain 
relief side.


You could do an antidrug-esque version of a rolex rule...

Something like this:

body    __ROLEX_PLAIN   /rolex/i
body    ROLEX_BODY      /(?:\b|\s)[_\W]{0,3}r[_\W]{0,3}[o0\xF2-\xF6][_\W]{0,3}[l!|1][_\W]{0,3}[e3\xE8-\xEB][_\W]{0,3}x[_\W]{0,3}(?:\b|\s)/i
describe ROLEX_BODY     refers to a rolex in body or subject
score ROLEX_BODY                0.5

meta ROLEX_OBFU (ROLEX_BODY && !__ROLEX_PLAIN)
describe ROLEX_OBFU     obfuscated rolex reference
score ROLEX_OBFU        2.0

(note: OBFU fires in cascade with BODY, so an obfuscated rolex reference 
will hit for 2.5)
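
The cascade can be sketched in Python as follows. This is an approximation
rather than SpamAssassin itself: total_score is a hypothetical helper, /i
is mapped to re.IGNORECASE, and the message is assumed to be one string.

```python
import re

# Patterns transcribed from __ROLEX_PLAIN and ROLEX_BODY above.
ROLEX_PLAIN = re.compile(r"rolex", re.I)
ROLEX_BODY = re.compile(
    r"(?:\b|\s)[_\W]{0,3}r[_\W]{0,3}[o0\xF2-\xF6][_\W]{0,3}[l!|1]"
    r"[_\W]{0,3}[e3\xE8-\xEB][_\W]{0,3}x[_\W]{0,3}(?:\b|\s)",
    re.I,
)
SCORES = {"ROLEX_BODY": 0.5, "ROLEX_OBFU": 2.0}


def total_score(text):
    """Hypothetical helper: sum the scores of the rules that fire."""
    body = bool(ROLEX_BODY.search(text))
    plain = bool(ROLEX_PLAIN.search(text))
    total = 0.0
    if body:
        total += SCORES["ROLEX_BODY"]
    # meta ROLEX_OBFU (ROLEX_BODY && !__ROLEX_PLAIN)
    if body and not plain:
        total += SCORES["ROLEX_OBFU"]
    return total


for sample in ["Buy a r0lex today", "Buy a rolex today",
               "rolex or r0lex", "Buy a watch today"]:
    print(repr(sample), total_score(sample))
```

One consequence visible in the sketch: the meta is evaluated over the whole
message, so a message containing both "rolex" and "r0lex" only picks up the
0.5 from ROLEX_BODY, because __ROLEX_PLAIN also fires.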

I've seen another variant, by Matthew Newton, that makes a bunch of 
rules for both subject and body separately. I generally don't do this, as 
the body rules will match the subject line, so there's really no need 
other than as a score amplifier. I usually only make subject rules when a 
body rule isn't appropriate. He's also done separate regular and gappy-text 
rules, but they don't pick up on character-sub obfuscations. It is a decent 
set, however.

One good rule I've seen that Matthew Newton wrote is this one:

rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|[Ww])atch\?\s*$/m
describe  UOLCC_WATCH_BODY   Body asks if you want a watch
score     UOLCC_WATCH_BODY   1.5

Very targeted, but effective with low risk of FPs.