Posted to users@spamassassin.apache.org by Jeff Chan <je...@surbl.org> on 2005/11/13 10:52:37 UTC

geocities rule?

Does anyone have a geocities rule that catches most of the spams
and has few FPs?

Cheers,

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/


Re: geocities rule?

Posted by Ilan Aisic <ia...@gmail.com>.
I recently gave up on geocities altogether and wrote a simple rule to
give any mail coming from it a high score.
AFAIK, my users never get any legit mail from geocities anyway. I'm sure
it would generate FPs for other people.

On 11/13/05, Jeff Chan <je...@surbl.org> wrote:
>
> Does anyone have a geocities rule that catches most of the spams
> and has few FPs?
>
> Cheers,
>
> Jeff C.
> --
> Jeff Chan
> mailto:jeffc@surbl.org
> http://www.surbl.org/
>
>


--
Ilan Aisic
Registered Linux User 8124 http://counter.li.org

Re: geocities rule?

Posted by Matt Kettler <mk...@evi-inc.com>.
mouss wrote:
> Matt Kettler a écrit :
> 
>> Side note: nearly all my recent geocities-link spams have not used a
>> /? script.
>> I am still seeing some, but most of my recent ones are in the /xyz123/
>> format:
>>  
>>
> Most geo-spam has the "/?blah=blah" pattern. (some may not have the "=",
> I don't remember).  I call this "first" kind.
> The "second" category is the rest. so there seems to be at least two
> ratware engines.

That's interesting, my current experience is the opposite, as demonstrated below
where out of 11 recent geocities link-spams, only 1 is of the "first" form.
That's a 10:1 difference..

Now of course, in total I have received a lot more of the "first" kind, but the
first kind has been around a lot longer, I've got examples of it going back to
9/30/2005.

The second form didn't gain popularity until 11/4/2005, but currently it's the
overwhelming majority.


> 
>> Here's my most recent ones, only one of which had a /? in it:
>>
>> http://uk.geocities.com/aaqwn54/
>> http://uk.geocities.com/aaqwno13/
>> http://uk.geocities.com/Pyotr76560Ernaline5835/?64378274347
>>
> <> This one isn't a counterexample.  </>

I know, it's the one I refer to above as "only one of which had a /? in it"

> 
>> http://uk.geocities.com/zzqwno29/
>> http://uk.geocities.com/zzqwno31/
>> http://uk.geocities.com/zqwn26/
>> http://uk.geocities.com/zqwn35/
>> http://de.geocities.com/Ilka23447Harli65881/
>> http://uk.geocities.com/zqwnasqw54/
>> http://uk.geocities.com/zqnasqw12/
>> http://uk.geocities.com/zqnasqw96/
>>  
>>
> so we might score geocities urls with 4 or more consonants or /\d\w+\d/
> 
> The following is harder:
> 
> http://uk.geocities.com/currency2021/currency_trading
> 
> because some people do use a login of the form \w+\d+ (for instance, if
> I wanted to get a "mouss" account there, that would not be available,
> and I would end up taking "mouss66"). Fortunately, this geospam is rare
> (and "currency"+"trading" may be a sign).

Yes, that's the problem I've been having.

I score them, but I score them pretty low due to the great potential for FPs
here. 1.0 if on the uk or de server, and I just added a 0.1 rule for geocities
in general.

I'll see what the rates are like between these two rules:

#another pattern. General match of geocities sites using /? scripts
uri L_L_GEOGEN1 /\.geocities\.com\/.*\/\?/i
score L_L_GEOGEN1       0.5

#high FP chances
uri L_L_GEOGEN2 /\.geocities\.com\/[a-z]{5,9}[0-9]{2,3}\//
score L_L_GEOGEN2       0.1


> 
> On the bug side, I've seen this one (indented for readability):
> 
>     Jwh  http://uk.geocities.com/    mteetotalfunnyba
>     uk.geocities.com/mteetotalfunnyba/?kaunebsojepv
> 
> (the two lines come as they are shown, without the indentation).
> probably a ratware bug...

Interesting.

Re: geocities rule?

Posted by mouss <us...@free.fr>.
Matt Kettler a écrit :

>Side note: nearly all my recent geocities-link spams have not used a /? script.
>I am still seeing some, but most of my recent ones are in the /xyz123/ format:
>  
>
Most geo-spam has the "/?blah=blah" pattern (some may not have the "=", 
I don't remember). I call this the "first" kind.
The "second" category is the rest, so there seem to be at least two 
ratware engines.

>Here's my most recent ones, only one of which had a /? in it:
>
>http://uk.geocities.com/aaqwn54/
>http://uk.geocities.com/aaqwno13/
>http://uk.geocities.com/Pyotr76560Ernaline5835/?64378274347
>  
>
<> This one isn't a counterexample.  </>

>http://uk.geocities.com/zzqwno29/
>http://uk.geocities.com/zzqwno31/
>http://uk.geocities.com/zqwn26/
>http://uk.geocities.com/zqwn35/
>http://de.geocities.com/Ilka23447Harli65881/
>http://uk.geocities.com/zqwnasqw54/
>http://uk.geocities.com/zqnasqw12/
>http://uk.geocities.com/zqnasqw96/
>  
>
so we might score geocities urls with 4 or more consonants or /\d\w+\d/
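
As a rough sketch, the two ideas above could be written as SpamAssassin
uri rules along the following lines. The rule names and scores are
hypothetical, "y" is left out of the consonant class as a guess, and both
patterns are confined to the first path segment (the login). They have
not been run against a corpus.

```
# Hypothetical rule names and scores, not tested against a real corpus.
# Login (first path segment) contains a run of 4 or more consonants.
uri      L_GEO_CONSONANTS  /\.geocities\.com\/[^\/?]*[bcdfghjklmnpqrstvwxz]{4}/i
describe L_GEO_CONSONANTS  Geocities login with a run of 4 or more consonants
score    L_GEO_CONSONANTS  0.5

# Login contains a digit, then word characters, then another digit.
uri      L_GEO_DIGITS      /\.geocities\.com\/[^\/?]*\d\w+\d/i
describe L_GEO_DIGITS      Geocities login with digit, word chars, digit
score    L_GEO_DIGITS      0.5
```

Checked by hand against the list above, the consonant rule matches logins
such as zzqwno29 and zqwn26 but not aaqwn54 (only three consonants in a
row), and the digit rule matches Pyotr76560Ernaline5835 but not zqwn26
(only two digits). Neither covers the whole list on its own.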

The following is harder:

http://uk.geocities.com/currency2021/currency_trading

because some people do use a login of the form \w+\d+ (for instance, if 
I wanted to get a "mouss" account there, that would not be available, 
and I would end up taking "mouss66"). Fortunately, this geospam is rare 
(and "currency"+"trading" may be a sign).

On the bug side, I've seen this one (indented for readability):

	Jwh  http://uk.geocities.com/	mteetotalfunnyba
	uk.geocities.com/mteetotalfunnyba/?kaunebsojepv

(the two lines come as they are shown, without the indentation). 
probably a ratware bug...

<>
fight those who break email
fight TrendMicro and their abusive and arbitrary rbl-plus
send your opinion to <us...@free.fr>
</>


Re: geocities rule?

Posted by Matt Kettler <mk...@evi-inc.com>.
Matt Kettler wrote:
> mouss wrote:
> 
>>Matt Kettler a écrit :
>>
>>
>>>mouss wrote:
>>> 
>>>
>>>
>>>>They have been using other geocities sites for a long time. I just score 4 for
>>>>any "*.geocities.com/*/?" URLs. So far I have no FNs and no FPs.
>>>>  
>>>
>>>
>>>Unfortunately, I've had plenty of FPs with the basic *.geocities.com..
>>
>>You may have missed the "/?" part. I don't score any geocities link
>>unless it has a "/?" pattern. for me, this is very rare (the only legit
>>example I have in mind now is the courier-imap site, which really
>>doesn't need that) in general. so, in short: I have never seen a legit
>>geocities url that uses a "/?" script. This is not an easy construct for
>>beginners (most people don't even know this is feasible), and advanced
>>people don't need it.
> 
> 
> Ahh, I did miss that..
> 
> So the regex looks something like this:
> 
> /\.geocities\.com\/.*\/\?/i
>

Side note: nearly all my recent geocities-link spams have not used a /? script.
I am still seeing some, but most of my recent ones are in the /xyz123/ format:

Here's my most recent ones, only one of which had a /? in it:

http://uk.geocities.com/aaqwn54/
http://uk.geocities.com/aaqwno13/
http://uk.geocities.com/Pyotr76560Ernaline5835/?64378274347
http://uk.geocities.com/zzqwno29/
http://uk.geocities.com/zzqwno31/
http://uk.geocities.com/zqwn26/
http://uk.geocities.com/zqwn35/
http://de.geocities.com/Ilka23447Harli65881/
http://uk.geocities.com/zqwnasqw54/
http://uk.geocities.com/zqnasqw12/
http://uk.geocities.com/zqnasqw96/

Re: geocities rule?

Posted by Matt Kettler <mk...@evi-inc.com>.
mouss wrote:
> Matt Kettler a écrit :
> 
>> mouss wrote:
>>  
>>
>>> They have been using other geocities sites for a long time. I just score 4 for
>>> any "*.geocities.com/*/?" URLs. So far I have no FNs and no FPs.
>>>   
>>
>>
>> Unfortunately, I've had plenty of FPs with the basic *.geocities.com..
> 
> You may have missed the "/?" part. I don't score any geocities link
> unless it has a "/?" pattern. for me, this is very rare (the only legit
> example I have in mind now is the courier-imap site, which really
> doesn't need that) in general. so, in short: I have never seen a legit
> geocities url that uses a "/?" script. This is not an easy construct for
> beginners (most people don't even know this is feasible), and advanced
> people don't need it.

Ahh, I did miss that..

So the regex looks something like this:

/\.geocities\.com\/.*\/\?/i

?


Re: geocities rule?

Posted by mouss <us...@free.fr>.
Matt Kettler a écrit :

>mouss wrote:
>  
>
>>They have been using other geocities sites for a long time. I just score 4 for
>>any "*.geocities.com/*/?" URLs. So far I have no FNs and no FPs.
>>    
>>
>
>Unfortunately, I've had plenty of FPs with the basic *.geocities.com.. 
>
You may have missed the "/?" part. I don't score any geocities link 
unless it has a "/?" pattern. In legitimate links this pattern is very 
rare in general (the only legit example I can think of right now is the 
courier-imap site, which really doesn't need it). In short: I have never 
seen a legit geocities url that uses a "/?" script. This is not an easy 
construct for beginners (most people don't even know it is feasible), and 
advanced people don't need it.

<spam fight>
All those who break the email system are terrorists we must fight.
Fight TrendMicro. Fight rbl-plus.
</spam>

Re: geocities rule?

Posted by Jeff Chan <je...@surbl.org>.
On Tuesday, November 15, 2005, 3:47:22 PM, Loren Wilton wrote:
>> Unfortunately, I've had plenty of FPs with the basic *.geocities.com.. A lot of
>> "enthusiast" websites of various sorts are hosted there and my users like to
>> forward around links to them.

> I wonder what the effect of listing /\b\w\w\.geocities\.com\b/ would be?
> That would only catch the non-US hosts.  Arguably still using way too large
> a hammer, but for some people it might work.

>         Loren

Funny, a few days after you said that I saw www.geocities.com in
spams.

Hello spammers....   F  U  !    ;-)

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/


Re: geocities rule?

Posted by Loren Wilton <lw...@earthlink.net>.
> Unfortunately, I've had plenty of FPs with the basic *.geocities.com.. A lot of
> "enthusiast" websites of various sorts are hosted there and my users like to
> forward around links to them.

I wonder what the effect of listing /\b\w\w\.geocities\.com\b/ would be?
That would only catch the non-US hosts.  Arguably still using way too large
a hammer, but for some people it might work.

        Loren


Re: geocities rule?

Posted by Matt Kettler <mk...@evi-inc.com>.
mouss wrote:
>
> They have been using other geocities sites for a long time. I just score 4 for
> any "*.geocities.com/*/?" URLs. So far I have no FNs and no FPs.

Unfortunately, I've had plenty of FPs with the basic *.geocities.com.. A lot of
"enthusiast" websites of various sorts are hosted there and my users like to
forward around links to them.

Of course, each network is different and YMMV. In particular you might have
different results if you are not located in a country with a geocities portal.

> <ad>
> Don't use TrendMicro products. they list innocent people.
> </ad>

It strikes me as slightly hypocritical to be blaming trend for listing "innocent
people", while at the same time advocating strong scores for mail containing
*.geocities.com.

Here on my network I could list uris in *.fr with less impact than
*.geocities.com, but I don't think I'd do either.






Re: geocities rule?

Posted by mouss <us...@free.fr>.
Evan Platt a écrit :

>
> I run my own site, so my site, my rules... But looking through my 
> archived mail for uk.geoc1ties.com, I see NO legit mail.
>
> I'm at the point of adding a body_checks rule in postfix to /dev/null 
> anything with the above in the body. I see now they've started to use 
> de.geocities...
>
>
They have been using other geocities sites for a long time. I just score 4 for 
any "*.geocities.com/*/?" URLs. So far I have no FNs and no FPs.

<ad>
Don't use TrendMicro products. they list innocent people.
</ad>

Re: geocities rule?

Posted by Evan Platt <ev...@espphotography.com>.
At 12:43 PM 11/15/2005, you wrote:
>This is a collection of various geocities rules that I've been using. You might
>want to run them by a corpus to see what their FPs are like for you, but this is
>a good starting point. These are based on rules posted by others on the list,
>with a few limited hacks and customizations of my own added.


I run my own site, so my site, my rules... But looking through my 
archived mail for uk.geoc1ties.com, I see NO legit mail.

I'm at the point of adding a body_checks rule in postfix to /dev/null 
anything with the above in the body. I see now they've started to use 
de.geocities...


Re: geocities rule?

Posted by Matt Kettler <mk...@evi-inc.com>.
Jeff Chan wrote:
> Does anyone have a geocities rule that catches most of the spams
> and has few FPs?


This is a collection of various geocities rules that I've been using. You might
want to run them by a corpus to see what their FPs are like for you, but this is
a good starting point. These are based on rules posted by others on the list,
with a few limited hacks and customizations of my own added.


uri      L_L_GEOEXP /(?:uk|de)\.geocities\.com\/\w{2,20}\/\?\w{1,20}[=&]\w{2}/
describe  L_L_GEOEXP Possible Geocities exploitation
score     L_L_GEOEXP 1.0

#stacks with geoexp, but is more specific and less FP prone.
uri      L_L_GEOEXP2 /(?:uk|de)\.geocities\.com\/[a-z]{2,20}\d{1,5}\/\?\w{1,20}[=&]\w{2}/
describe  L_L_GEOEXP2 Possible Geocities exploitation
score     L_L_GEOEXP2 1.5


#different pattern, somewhat FP prone due to broader hit range.
uri      L_L_GEOEXP3 /(?:uk|de)\.geocities\.com\/[a-z]{5,7}[0-9]{2,3}\//
describe  L_L_GEOEXP3 Possible Geocities exploitation
score     L_L_GEOEXP3 1.0

uri      UOLCC_UKGEO /(?:uk|de)\.geocities\.com\/[A-Z]?[a-z]{2,20}_[A-Z]?[a-z]{2,20}(?:_[A-Z]?[a-z]{2,20})?\d{0,4}\/\?[\w=\.]{3}/
describe  UOLCC_UKGEO UK Geocities exploitation
score     UOLCC_UKGEO 2.0

Re: geocities rule?

Posted by mouss <us...@free.fr>.
Simon Byrnand a écrit :

> Hi,
>
> I don't suppose you could post your rules for this for those of us not 
> well versed in perl regex's ? :-) Those spams are driving us crazy and 
> they're about the only ones that are slipping through 3.0.4. I'm also 
> running 3.1.0 on test and it's no better (in fact worse) at detecting 
> these...


It's a silly one:        m{http://\S+\.geocities\.com/.+/\?}i
- you can replace '+' with a max length (\S{1,20} instead of \S+ for 
instance), but since I couldn't settle on a "good" number, I left it unbounded
- I first started with uk.geocities.com, but then spammers used other 
countries.
- I also started out with a foo_bar pattern 
(http://uk.geocities.com/james_bond/?abcdefg=...), but since spammers 
can use any login they want...
- I use rawbody. you can use uri instead. I just don't care.

- geocities spam I've looked at seemed to be in 2 categories:
1) Most of it uses a URI of the form:
        http://$site/foo_bar/?id=blahblah (generally very long "blahblah").
The body usually consists of:
   * a first name
   * some ad text
   * the uri
   * some random text
   * a "polite" goodbye.
Variations exist, but they mostly use a combination of the above.
Most of these come from a client with no rdns and use a forged helo.

2) some use the form
        http://$site/foo/blahblah
These are somewhat "smarter". However, I didn't need to catch these 
directly thanks to Bayes and SARE (the first category is so overwhelming 
that it helps Bayes to detect the latter). So in a way: spammers beat 
spammers...
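
For those not used to SpamAssassin rule syntax, the expression at the top
of this message might be wrapped into a local rule roughly as follows.
This is a sketch only: the rule names are hypothetical, the score of 4 is
the one mentioned elsewhere in this thread, and the login part is
narrowed to "anything but a slash" as an assumption.

```
# Hypothetical rule names; enable only one of the two variants.

# rawbody variant: matched against the body text of textual parts,
# with HTML tags and line breaks still present.
rawbody  L_GEO_QMARK_RAW  m{http://\S+\.geocities\.com/[^/\s]+/\?}i
describe L_GEO_QMARK_RAW  Geocities URL of the form /login/?...
score    L_GEO_QMARK_RAW  4.0

# uri variant: matched against the URLs SpamAssassin extracts
# from the message.
# uri      L_GEO_QMARK_URI  m{\.geocities\.com/[^/]+/\?}i
# describe L_GEO_QMARK_URI  Geocities URL of the form /login/?...
# score    L_GEO_QMARK_URI  4.0
```

Such rules would normally go into a site-local file such as local.cf, and
it is worth checking them against your own ham before relying on a score
this high.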


Re: geocities rule?

Posted by Simon Byrnand <si...@igrin.co.nz>.
At 01:18 14/11/2005, mouss wrote:
>Jeff Chan a écrit :
>
>>Does anyone have a geocities rule that catches most of the spams
>>and has few FPs?
>>
>after looking at many of these, I ended up just giving 4 points to any
>http://*.geocities.com/*/? (written as perl expression of course).
>together with Bayes and other tests, this seems to block all that spam
>(I didn't need to block all the geocities.com URLs).
>FPs are theoretically possible, though I didn't see legitimate geocities
>pages with the "/?" pattern.
>
>- A large number of these come from IPs with no rDNS
>- many come from Asian networks.
>
>so it may be good to mix multiple checks together to reduce FPs.

Hi,

I don't suppose you could post your rules for 
this for those of us not well versed in perl 
regex's ? :-) Those spams are driving us crazy 
and they're about the only ones that are slipping 
through 3.0.4. I'm also running 3.1.0 on test and 
it's no better (in fact worse) at detecting these...

Regards,
Simon


Re: geocities rule?

Posted by jdow <jd...@earthlink.net>.
From: "mouss" <us...@free.fr>

> Jeff Chan a écrit :
>
>>Does anyone have a geocities rule that catches most of the spams
>>and has few FPs?
>>
> after looking at many of these, I ended up just giving 4 points to any 
> http://*.geocities.com/*/? (written as perl expression of course).
> together with Bayes and other tests, this seems to block all that spam (I 
> didn't need to block all the geocities.com URLs).
> FPs are theoretically possible, though I didn't see legitimate geocities 
> pages with the "/?" pattern.
>
> - A large number of these come from IPs with no rDNS
> - many come from Asian networks.
>
> so it may be good to mix multiple checks together to reduce FPs.

I got more aggressive in collaboration with Fred. It amounts to this: any
mention of geocities.co* in a URL toasts the message. I give it a
killer score, too.
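
A blanket rule of that kind might look something like the sketch below.
The rule name and the score are made up for illustration and are not the
actual rule described above; as others in this thread report, expect
false positives wherever users exchange legitimate geocities links.

```
# Hypothetical rule name and score, shown for illustration only.
# Hits any extracted URL that mentions geocities.co* (.com, .co.jp, ...).
uri      L_GEO_ANY  /\bgeocities\.co/i
describe L_GEO_ANY  URL mentions geocities.co*
score    L_GEO_ANY  10.0
```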

{^_^} 


Re: geocities rule?

Posted by mouss <us...@free.fr>.
Jeff Chan a écrit :

>Does anyone have a geocities rule that catches most of the spams
>and has few FPs?
>  
>
after looking at many of these, I ended up just giving 4 points to any 
http://*.geocities.com/*/? (written as perl expression of course).
together with Bayes and other tests, this seems to block all that spam 
(I didn't need to block all the geocities.com URLs).
FPs are theoretically possible, though I didn't see legitimate geocities 
pages with the "/?" pattern.

- A large number of these come from IPs with no rDNS
- many come from Asian networks.

so it may be good to mix multiple checks together to reduce FPs.
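
One way to mix checks, sketched under some assumptions, is a meta rule
that only fires when the "/?" URL pattern and a missing rDNS occur
together. The rule names and score here are hypothetical, and RDNS_NONE
is assumed to be defined in the installed rule set; older releases may
not ship it, in which case a local equivalent would be needed.

```
# Hypothetical names and score. RDNS_NONE is assumed to exist in the
# installed rules; check your version before relying on it.

# Sub-rule (double underscore prefix, so it carries no score of its own).
uri      __L_GEO_QMARK       /\.geocities\.com\/[^\/]+\/\?/i

# Fires only when both conditions hold for the same message.
meta     L_GEO_QMARK_NORDNS  (__L_GEO_QMARK && RDNS_NONE)
describe L_GEO_QMARK_NORDNS  Geocities /? URL from a relay with no rDNS
score    L_GEO_QMARK_NORDNS  4.0
```

Because both conditions must hold, a legitimate geocities link sent
through a properly configured relay would not trigger the meta rule.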