Posted to users@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/02/10 05:22:53 UTC
Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"
Hello Justin,
Saturday, February 7, 2004, 2:14:44 PM, you responded to Jens:
>>Could anybody please run this rule against his SPAM/HAM corpus?
>>rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
>>describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
>>score LOCAL_URL_SYNTAX_1 1.0
>>to catch all those mails that contain URLs like
>><A HREF="http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi">Clwck
JM> Actually -- has anyone got *any* legit mail containing "aff_id",
JM> "AFF_ID", "affiliateid", "aff_sub_id" etc.? I would bet not.
JM> This may make a good rule:
JM> uri LOCAL_URI_AFFILIATE /aff\w+id=/i
I tested these two rules:
rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
score LOCAL_URL_SYNTAX_1 1.0
uri LOCAL_URI_AFFILIATE /aff\w+id=/i
describe LOCAL_URI_AFFILIATE spam from an affiliate
score LOCAL_URI_AFFILIATE 1
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
   91185    73148    18037  0.802  0.00   0.00  (all messages)
 100.000  80.2193  19.7807  0.802  0.00   0.00  (all messages as %)
   2.071   2.5811   0.0000  1.000  1.00   1.00  LOCAL_URI_AFFILIATE
   0.000   0.0000   0.0000  0.500  0.00   1.00  LOCAL_URL_SYNTAX_1
No matches at all for Jens' rule, great results for Justin's.
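A quick way to see why LOCAL_URL_SYNTAX_1 never fires is to run both patterns against the sample URL quoted earlier in the thread. The sketch below uses Python's re module as a stand-in for SpamAssassin's Perl regexes (an assumption; the two dialects should agree for patterns this simple), and the variable names are only illustrative.

```python
import re

# Sample URL quoted earlier in the thread.
SAMPLE_URL = "http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi"

# The rule as tested: each host label is a single [a-z] with no "+" quantifier.
SYNTAX_RULE = re.compile(
    r"www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+"
)

# The affiliate rule: "aff", one or more word characters, then "id=",
# matched case-insensitively (the /i flag in the SpamAssassin rule).
AFFILIATE_RULE = re.compile(r"aff\w+id=", re.IGNORECASE)

syntax_hit = SYNTAX_RULE.search(SAMPLE_URL) is not None
affiliate_hit = AFFILIATE_RULE.search(SAMPLE_URL) is not None

print("LOCAL_URL_SYNTAX_1 matches: ", syntax_hit)
print("LOCAL_URI_AFFILIATE matches:", affiliate_hit)
```

Since [a-z] without a quantifier matches exactly one letter, host labels such as "xbaq" can never satisfy the first pattern, which would be consistent with the zero hits in the table. In the second pattern, \w also matches the underscore, so "AFF_ID=" qualifies.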
Bob Menschel
Re[4]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"
Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Loren,
Section 3 -- Frequencies Log
(First numeric frequencies, followed by percentage frequencies)
 OVERALL     SPAM      HAM    S/O  RANK  SCORE  NAME
   91185    73148    18037  0.802  0.00   0.00  (all messages)
     468      468        0  1.000  0.97   4.40  URL_EQUALS
     417      417        0  1.000  0.97   3.00  AFF_ID
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
   91185    73148    18037  0.802  0.00   0.00  (all messages)
 100.000  80.2193  19.7807  0.802  0.00   0.00  (all messages as %)
   0.513   0.6398   0.0000  1.000  0.97   4.40  URL_EQUALS
   0.457   0.5701   0.0000  1.000  0.97   3.00  AFF_ID
Rules hit a decent amount of spam, and no ham. I like them.
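The percentage figures follow directly from the raw counts and the corpus sizes in the "(all messages)" row. The sketch below redoes that arithmetic; the function name is hypothetical, and S/O and RANK are left out because the thread does not spell out how they are computed.

```python
# Corpus sizes from the "(all messages)" row.
TOTAL_SPAM = 73148
TOTAL_HAM = 18037


def hit_percentages(spam_hits, ham_hits):
    """Return (OVERALL%, SPAM%, HAM%) for a rule's raw hit counts."""
    total = TOTAL_SPAM + TOTAL_HAM
    overall = 100.0 * (spam_hits + ham_hits) / total
    spam = 100.0 * spam_hits / TOTAL_SPAM
    ham = 100.0 * ham_hits / TOTAL_HAM
    return overall, spam, ham


# Raw counts taken from the numeric frequencies table.
for name, spam_hits, ham_hits in [("URL_EQUALS", 468, 0), ("AFF_ID", 417, 0)]:
    overall, spam, ham = hit_percentages(spam_hits, ham_hits)
    print(f"{overall:8.3f}  {spam:7.4f}  {ham:7.4f}  {name}")
```

A rule that hits only spam still shows a lower OVERALL% than SPAM%, because the overall figure is diluted by the ham messages in the corpus.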
FYI, Justin's affiliate rule posted recently hits better (I've renamed it
for my system):
uri JM_uwd_AFFILIATE /aff\w+id=/i
describe JM_uwd_AFFILIATE spam from an affiliate
score JM_uwd_AFFILIATE 4.000 # 1888s/0h of 91185 corpus (73148s/18037h) 02/09/04
Bob Menschel
Monday, February 9, 2004, 10:54:58 PM, you wrote:
LW> Out of curiosity, if you have the spare machine time, could you test the
LW> simple rule
LW> uri AFF_ID /\/\?AFF_ID\=/
LW> describe AFF_ID URL contains AFF_ID=
LW> score AFF_ID 3
LW> This has been working well for me for a couple days, but I really don't get
LW> a huge quantity of messages. I'm interested if it hits any ham at all.
LW> Also, up to today this one has been catching lots of spam for me, although
LW> interestingly today I haven't had a single spam matching this pattern.
LW> Normally it catches 30-40 a day.
LW> uri URL_EQUALS /www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+/i
LW> describe URL_EQUALS URL has equal sign in hostname
LW> score URL_EQUALS 4.4
LW> Thanks,
LW> Loren
Re: Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"
Posted by Loren Wilton <lw...@earthlink.net>.
Out of curiosity, if you have the spare machine time, could you test the
simple rule
uri AFF_ID /\/\?AFF_ID\=/
describe AFF_ID URL contains AFF_ID=
score AFF_ID 3
This has been working well for me for a couple days, but I really don't get
a huge quantity of messages. I'm interested if it hits any ham at all.
Also, up to today this one has been catching lots of spam for me, although
interestingly today I haven't had a single spam matching this pattern.
Normally it catches 30-40 a day.
uri URL_EQUALS /www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+/i
describe URL_EQUALS URL has equal sign in hostname
score URL_EQUALS 4.4
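To illustrate what these two patterns do and do not catch, here is a small sketch using Python's re module in place of SpamAssassin's Perl regexes (an assumption). Only the first URL comes from this thread; the other three are made-up variants, and the helper name is hypothetical.

```python
import re

# The two rules above, transcribed as Python patterns.
AFF_ID = re.compile(r"\/\?AFF_ID\=")  # case-sensitive, needs "/?" right before AFF_ID
URL_EQUALS = re.compile(r"www\.[0-9a-z\.\_]+\=[0-9a-z\.\_]+", re.IGNORECASE)

urls = [
    # From the thread:
    "http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi",
    # Hypothetical variants, for illustration only:
    "http://www.example.com/c/index.php?AFF_ID=c1224",
    "http://www.example.com/c/?aff_id=c1224",
    "http://www.a.com=xbaq.example.com/c/",
]


def rules_hit(url):
    """Return the names of the rules whose pattern is found in the URL."""
    hits = []
    if AFF_ID.search(url):
        hits.append("AFF_ID")
    if URL_EQUALS.search(url):
        hits.append("URL_EQUALS")
    return hits


for url in urls:
    print(rules_hit(url), url)
```

In this sketch the AFF_ID pattern misses both the "index.php?AFF_ID=" form and the lowercase form, which is one possible reason a broader, case-insensitive pattern would hit more spam.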
Thanks,
Loren
Re: Re[2]: Rule for V-word spam with "?AFF_ID=[a-z]+&$RANDOM=$RANDOM"
Posted by Jens Benecke <je...@spamfreemail.de>.
Robert Menschel wrote:
> I tested these two rules:
> rawbody LOCAL_URL_SYNTAX_1 /www\.[a-z]\.[a-z]\.com\/[a-z0-9]{1,4}\/\?AFF_ID=[a-z0-9]+\&[a-z]+[a-z]+/
> describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
> score LOCAL_URL_SYNTAX_1 1.0
> uri LOCAL_URI_AFFILIATE /aff\w+id=/i
> describe LOCAL_URI_AFFILIATE spam from an affiliate
> score LOCAL_URI_AFFILIATE 1
>
> OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
>    91185    73148    18037  0.802  0.00   0.00  (all messages)
>  100.000  80.2193  19.7807  0.802  0.00   0.00  (all messages as %)
>    2.071   2.5811   0.0000  1.000  1.00   1.00  LOCAL_URI_AFFILIATE
>    0.000   0.0000   0.0000  0.500  0.00   1.00  LOCAL_URL_SYNTAX_1
>
> No matches at all for Jens' rule, great results for Justin's.
That's because my rule was buggy. And the one above contains a typing
mistake, I think. :)
This should catch them:
rawbody LOCAL_URL_SYNTAX_1 /(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,4}\/(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?/
describe LOCAL_URL_SYNTAX_1 Spammer-like URL syntax - TEST RULE 04-02-07
score LOCAL_URL_SYNTAX_1 1.0
or use "uri" instead of "rawbody" (I honestly don't know exactly what "uri"
assumes so I just search the raw message body).
At least my SPAM folder likes them:
(total mails, mails containing AFF_ID, mails containing my rule)
# grep -c "^From " .Mailbox.S{PAM,URESPAM}
.Mailbox.SPAM:3368
.Mailbox.SURESPAM:8014
# grep -c AFF_ID .Mailbox.S{PAM,URESPAM}
.Mailbox.SPAM:1473
.Mailbox.SURESPAM:1058
# egrep -c '(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,2}\/(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?' .Mailbox.{SPAM,SURESPAM}
.Mailbox.SPAM:1469
.Mailbox.SURESPAM:1027
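The same comparison can be sketched in Python: count the lines that mention AFF_ID at all, then the lines that match the full pattern, much as `grep -c` does per file. The pattern is written on one line with the character class closed before {1,4}. The sample lines are hypothetical except for the first URL, which is the one quoted in this thread, and the function name is only illustrative.

```python
import re

# The corrected pattern from the rule above.
FIXED_RULE = re.compile(
    r"(www\.[a-z]\.com=)?[a-z]+\.[a-z]+\.com\/[a-z0-9]{1,4}\/"
    r"(index\.php)?\?AFF_ID=[a-z0-9]+(\&[a-z0-9]+=[a-z0-9]+)?"
)

# Plain substring test, the counterpart of `grep -c AFF_ID`.
PLAIN_AFF_ID = re.compile(r"AFF_ID")


def count_matching_lines(lines, pattern):
    """Count lines containing at least one match, like `grep -c`."""
    return sum(1 for line in lines if pattern.search(line))


# Hypothetical mailbox lines; only the first URL is taken from the thread.
sample_lines = [
    '<A HREF="http://www.xbaq.whatuthinkwillhappen.com/c/?AFF_ID=c1224&qgdwcmaewo=uwdi">',
    "http://abc.example.com/ab/index.php?AFF_ID=x9",
    "AFF_ID mentioned in plain text only",
    "http://www.example.org/c/?AFF_ID=c1224",
]

print("lines containing AFF_ID:", count_matching_lines(sample_lines, PLAIN_AFF_ID))
print("lines matching the rule:", count_matching_lines(sample_lines, FIXED_RULE))
```

The gap between the two counts corresponds to lines that mention AFF_ID without fitting the full URL shape, such as a non-.com host in this sample.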
--
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9€ - SSH from 19€ - low-cost traffic