Posted to users@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/02/10 05:22:53 UTC

Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"

Hello Justin,

Saturday, February 7, 2004, 2:14:44 PM, you responded to Jens:

>>Could anybody please run this rule against his SPAM/HAM corpus?
>>rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
>>describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
>>score LOCAL_URL_SYNTAX_1 1.0
>>to catch all those mails that contain URLs like
>><A HREF="http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi">Clwck

JM> Actually -- has anyone got *any* legit mail containing "aff_id",
JM> "AFF_ID", "affiliateid", "aff_sub_id" etc.?  I would bet not.

JM> This may make a good rule:

JM> 	uri LOCAL_URI_AFFILIATE		/aff\w+id=/i

I tested these two rules:
rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
score LOCAL_URL_SYNTAX_1 1.0
uri LOCAL_URI_AFFILIATE         /aff\w+id=/i
describe LOCAL_URI_AFFILIATE spam from an affiliate
score LOCAL_URI_AFFILIATE 1

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  91185    73148    18037    0.802   0.00    0.00  (all messages)
100.000  80.2193  19.7807    0.802   0.00    0.00  (all messages as %)
  2.071   2.5811   0.0000    1.000   1.00    1.00  LOCAL_URI_AFFILIATE
  0.000   0.0000   0.0000    0.500   0.00    1.00  LOCAL_URL_SYNTAX_1

No matches at all for Jens' rule, great results for Justin's.
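
One way to see why the first rule never fires is to run its pattern against the sample URL Jens quoted. The sketch below uses Python's re module as a stand-in for SpamAssassin's Perl regex engine, which should behave the same for syntax this simple. SAMPLE_URL is the URL from Jens' post; LOOSENED is a hypothetical variant, not a rule anyone posted, that only adds "+" to the two host-label classes.

```python
import re

# Sample URL quoted in Jens' original post.
SAMPLE_URL = "http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi"

# Pattern of LOCAL_URL_SYNTAX_1 as tested above (rule delimiters removed).
# Note that [a-z] without a quantifier matches exactly one letter.
ORIGINAL = re.compile(
    r"www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+"
)

# Hypothetical variant: each host label may be one or more letters.
LOOSENED = re.compile(
    r"www\.[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+"
)

original_hit = ORIGINAL.search(SAMPLE_URL) is not None
loosened_hit = LOOSENED.search(SAMPLE_URL) is not None

print("original matches sample:", original_hit)
print("loosened matches sample:", loosened_hit)
```

The single-letter host labels are enough to explain a 0.000 row: "xbaq" and "whatuthinkwillhappen" are both longer than one character, so the original pattern cannot match this URL.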

Bob Menschel




Re[4]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Loren,

Section 3 -- Frequencies Log
(First numeric frequencies, followed by percentage frequencies)

OVERALL     SPAM      HAM     S/O    RANK   SCORE  NAME
  91185    73148    18037    0.802   0.00    0.00  (all messages)
    468      468        0    1.000   0.97   4.40  URL_EQUALS
    417      417        0    1.000   0.97   3.00  AFF_ID

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  91185    73148    18037    0.802   0.00    0.00  (all messages)
100.000  80.2193  19.7807    0.802   0.00    0.00  (all messages as %)
  0.513   0.6398   0.0000    1.000   0.97    4.40  URL_EQUALS
  0.457   0.5701   0.0000    1.000   0.97    3.00  AFF_ID

Both rules hit a decent amount of spam and no ham. I like them.
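
For anyone who wants to check the percentage rows against the raw counts, here is a minimal sketch. The corpus totals and hit counts are copied from the tables above. The S/O formula, spam% / (spam% + ham%), is an assumption: it reproduces the per-rule rows in this thread, but it is not taken from the tool's source. Returning 0.5 for a rule with no hits is likewise just chosen to match the LOCAL_URL_SYNTAX_1 row. The function name frequencies is made up for this example.

```python
# Corpus totals copied from the "(all messages)" row above.
TOTAL_SPAM = 73148
TOTAL_HAM = 18037


def frequencies(spam_hits, ham_hits):
    """Return (overall%, spam%, ham%, S/O) for one rule.

    Assumption: S/O = spam% / (spam% + ham%), and 0.5 when the rule
    hit nothing at all. This mirrors the per-rule rows in the thread.
    """
    overall_pct = 100.0 * (spam_hits + ham_hits) / (TOTAL_SPAM + TOTAL_HAM)
    spam_pct = 100.0 * spam_hits / TOTAL_SPAM
    ham_pct = 100.0 * ham_hits / TOTAL_HAM
    if spam_pct + ham_pct == 0:
        so = 0.5
    else:
        so = spam_pct / (spam_pct + ham_pct)
    return overall_pct, spam_pct, ham_pct, so


# Hit counts copied from the numeric table above.
for name, spam_hits, ham_hits in [("URL_EQUALS", 468, 0), ("AFF_ID", 417, 0)]:
    overall_pct, spam_pct, ham_pct, so = frequencies(spam_hits, ham_hits)
    print(f"{overall_pct:7.3f} {spam_pct:8.4f} {ham_pct:8.4f} {so:8.3f}  {name}")
```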

FYI, Justin's affiliate rule posted recently hits better (I've renamed it
for my system):
uri       JM_uwd_AFFILIATE       /aff\w+id=/i
describe  JM_uwd_AFFILIATE       spam from an affiliate
score     JM_uwd_AFFILIATE       4.000  # 1888s/0h of 91185 corpus (73148s/18037h) 02/09/04

Bob Menschel


Monday, February 9, 2004, 10:54:58 PM, you wrote:

LW> Out of curiosity, if you have the spare machine time, could you test the
LW> simple rule

LW> uri  AFF_ID     /\/\?AFF_ID\=/
LW> describe AFF_ID     URL contains AFF_ID=
LW> score AFF_ID     3

LW> This has been working well for me for a couple days, but I really don't get
LW> a huge quantity of messages.  I'm interested if it hits any ham at all.

LW> Also, up to today this one has been catching lots of spam for me, although
LW> interestingly today I haven't had a single spam matching this pattern.
LW> Normally it catches 30-40 a day.

LW> uri URL_EQUALS   /www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+/i
LW> describe URL_EQUALS  URL has equal sign in hostname
LW> score URL_EQUALS  4.4

LW> Thanks,

LW>         Loren




Re: Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"

Posted by Loren Wilton <lw...@earthlink.net>.
Out of curiosity, if you have the spare machine time, could you test the
simple rule

uri  AFF_ID     /\/\?AFF_ID\=/
describe AFF_ID     URL contains AFF_ID=
score AFF_ID     3

This has been working well for me for a couple days, but I really don't get
a huge quantity of messages.  I'm interested if it hits any ham at all.

Also, up to today this one has been catching lots of spam for me, although
interestingly today I haven't had a single spam matching this pattern.
Normally it catches 30-40 a day.

uri URL_EQUALS   /www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+/i
describe URL_EQUALS  URL has equal sign in hostname
score URL_EQUALS  4.4

Thanks,

        Loren
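
To make the difference between these two rules and Justin's /aff\w+id=/i rule concrete, here is a small sketch that runs all three patterns over a few URLs. It uses Python's re module in place of SpamAssassin's Perl regexes, which is an approximation that should hold for patterns this simple. Only the first URL comes from this thread; the other three are hypothetical, made up to exercise each pattern, and the helper name hits is likewise invented.

```python
import re

# Patterns copied from the rules in this thread; /i maps to re.IGNORECASE.
RULES = {
    "AFF_ID": re.compile(r"\/\?AFF_ID\="),
    "URL_EQUALS": re.compile(r"www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+", re.IGNORECASE),
    "LOCAL_URI_AFFILIATE": re.compile(r"aff\w+id=", re.IGNORECASE),
}

# First URL: the sample quoted in the thread. The rest are hypothetical.
URLS = [
    "http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi",
    "http://shop.example.com/c/index.php?aff_sub_id=42",
    "http://www.a.example.com=b.example.net/",
    "http://shop.example.com/?affid=42",
]


def hits(url):
    """Return the sorted names of the rules whose pattern matches the URL."""
    return sorted(name for name, rx in RULES.items() if rx.search(url))


for url in URLS:
    print(hits(url), url)
```

Two things stand out in this sketch: AFF_ID is case-sensitive and needs "/?" directly before the parameter, so it misses the index.php form, while /aff\w+id=/i needs at least one character between "aff" and "id", so a bare "affid=" slips past it.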


Re: Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"

Posted by Jens Benecke <je...@spamfreemail.de>.
Robert Menschel wrote:

> I tested these two rules:
> rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
> describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
> score LOCAL_URL_SYNTAX_1 1.0
> uri LOCAL_URI_AFFILIATE         /aff\w+id=/i
> describe LOCAL_URI_AFFILIATE spam from an affiliate
> score LOCAL_URI_AFFILIATE 1
> 
> OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
>   91185    73148    18037    0.802   0.00    0.00  (all messages)
> 100.000  80.2193  19.7807    0.802   0.00    0.00  (all messages as %)
>   2.071   2.5811   0.0000    1.000   1.00    1.00  LOCAL_URI_AFFILIATE
>   0.000   0.0000   0.0000    0.500   0.00    1.00  LOCAL_URL_SYNTAX_1
> 
> No matches at all for Jens' rule, great results for Justin's.

That's because my rule was buggy. And the one above contains a typing
mistake, I think. :) 

This should catch them:

rawbody LOCAL_URL_SYNTAX_1 /(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,4}\/(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?/
describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
score LOCAL_URL_SYNTAX_1 1.0

or use "uri" instead of "rawbody" (I honestly don't know exactly what "uri"
assumes so I just search the raw message body).
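
A quick sanity check of the corrected pattern, sketched in Python's re module rather than in SpamAssassin itself. The pattern is written with the character class closed as [a-z0-9]{1,4}. SAMPLE_URL is the URL from the original post; INDEX_URL is a hypothetical variant that exercises the optional index.php part. As far as I know, "uri" rules are matched against the URIs SpamAssassin extracts from the message, while "rawbody" rules see the body text with its markup intact; the sketch sidesteps that difference by testing bare URLs.

```python
import re

# Corrected LOCAL_URL_SYNTAX_1 pattern from this message (rule delimiters
# removed), split over two string literals only for readability.
CORRECTED = re.compile(
    r"(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,4}\/"
    r"(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?"
)

# Sample URL from the original post, plus a hypothetical index.php variant.
SAMPLE_URL = "http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi"
INDEX_URL = "http://www.xbaq.example.com/ab/index.php?AFF_ID=c1224"

sample_match = CORRECTED.search(SAMPLE_URL)
index_match = CORRECTED.search(INDEX_URL)

# Show which part of each URL the pattern actually covers.
print(sample_match.group(0) if sample_match else None)
print(index_match.group(0) if index_match else None)
```

Because the pattern is not anchored, the match starts at the second host label rather than at "www.", which is harmless for a body rule.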

At least my SPAM folder likes them: 
(total mails, mails containing AFF_ID, mails containing my rule)

# grep -c "^From " .Mailbox.S{PAM,URESPAM}
.Mailbox.SPAM:3368
.Mailbox.SURESPAM:8014

# grep -c AFF_ID .Mailbox.S{PAM,URESPAM}
.Mailbox.SPAM:1473
.Mailbox.SURESPAM:1058

# egrep -c '(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,2}\/(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?' .Mailbox.{SPAM,SURESPAM}
.Mailbox.SPAM:1469
.Mailbox.SURESPAM:1027
.Mailbox.SPAM:1469
.Mailbox.SURESPAM:1027
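
One caveat on these numbers: grep -c counts matching lines, not messages, so a mail with two such URLs on separate lines is counted twice. The sketch below shows the difference on a tiny hypothetical mailbox; the MBOX contents and both function names are invented for illustration, and it assumes the usual mbox convention that every message starts with a "From " line.

```python
import re


def count_lines(mbox_text, pattern):
    """Count matching lines, which is what `grep -c` reports."""
    return sum(1 for line in mbox_text.splitlines() if pattern.search(line))


def count_messages(mbox_text, pattern):
    """Count messages that contain at least one match.

    Messages are split on lines starting with "From " (mbox separator).
    """
    messages = re.split(r"(?m)^From ", mbox_text)[1:]
    return sum(1 for message in messages if pattern.search(message))


# Hypothetical three-message mailbox; the first message has two matching lines.
MBOX = (
    "From a@example.invalid Sat Feb  7 00:00:00 2004\n"
    "Subject: one\n"
    "\n"
    "http://x.example.com/c/?AFF_ID=c1\n"
    "http://y.example.com/d/?AFF_ID=c2\n"
    "\n"
    "From b@example.invalid Sat Feb  7 00:00:01 2004\n"
    "Subject: two\n"
    "\n"
    "no links here\n"
    "\n"
    "From c@example.invalid Sat Feb  7 00:00:02 2004\n"
    "Subject: three\n"
    "\n"
    "http://z.example.com/e/?AFF_ID=c3\n"
)

AFF = re.compile(r"AFF_ID")
line_hits = count_lines(MBOX, AFF)
message_hits = count_messages(MBOX, AFF)
print("matching lines:", line_hits)
print("matching messages:", message_hits)
```

If lines and messages happen to coincide in the real folders, the counts above mean the rule covers 1469 of 1473 AFF_ID hits (about 99.7%) in SPAM and 1027 of 1058 (about 97.1%) in SURESPAM.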


-- 
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9 EUR - SSH from 19 EUR - cheap traffic