Posted to users@spamassassin.apache.org by "Daryl C. W. O'Shea" <sp...@dostech.ca> on 2005/12/16 02:17:34 UTC

WebRedirect SpamAssassin Plugin for use with 'Geocities Spam'

I was planning on giving Yahoo! more time to correct their "Geocities 
Spam" problem before I released my plugin to deal with it, but I've been 
noticing a decline in the scores these mails are getting.

I also just found out that I have copies of this sort of spam going back 
to at least December 28, 2004, and have been getting them in volume since 
May 2005.  I had thought it went back only to September, not an entire 
year, with the volume increasing over the last six months (10%+ of my 
spam is now "geocities spam").  In my opinion they've had sufficient 
time to act.

Further, while adding some documentation to the plugin, I tested some of 
the spam I used to write the plugin back in September and found that 
some of the "member sites" are still active.

Conveniently, there are only a few versions of the pages linked to, so 
writing rules against them is pretty effective -- which is what this 
plugin is for.

A few words of caution if you do decide to use this plugin:

   - While I believe there are no issues with the code, I'm not too
     familiar with LWP::UserAgent, so it's entirely possible that I
     have missed something.  In the event your machine gets rooted,
     you've been warned.

   - Querying the links found in an email inherently raises a number of
     privacy and technical issues you should be aware of.  The plugin
     attempts to avoid them by stripping visible query strings and login
     credentials, but I encourage you to read the WARNING section of the
     plugin's perldoc before using it.  Be sure to NEVER use this plugin
     to query links hosted on a server the sender may control.

   - High-volume sites would be wise to run this behind a caching HTTP
     proxy such as Squid to reduce the 0.3 to 1 second that it may take
     to query each link.  While the web query is blocking, it takes place
     just after the DNS requests are kicked off, so it gives the DNS
     queries more time to complete, which may result in DNSBL hits that
     would otherwise have been missed due to timeouts.

   - The scores assigned to the rules are guesses on my part based on
     what they match.  I have no legitimate email to compare hits
     against.  I recommend monitoring the hits for some period of time
     and reassigning scores if necessary or if they are not to your liking.
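
To illustrate the kind of stripping mentioned in the privacy caution above, 
here is a minimal sketch in Python.  It is not the plugin's code (the plugin 
is written in Perl and uses LWP::UserAgent); the function name sanitize_url 
and the exact policy (http/https only, drop the userinfo, query string and 
fragment) are my own assumptions for illustration.  Note that a tracking 
token can just as easily sit in the path, so this only removes the visible 
parts and does not make querying a link safe.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url):
    """Return url without login credentials, query string, or fragment.

    Hypothetical helper for illustration only.  Returns None for
    anything that is not a plain http/https URL with a host name.
    """
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return None

    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None

    # parts.hostname excludes any "user:password@" prefix and is lowercased.
    host = parts.hostname
    if ":" in host:  # IPv6 literal: restore the brackets
        host = "[" + host + "]"

    netloc = host if port is None else "%s:%d" % (host, port)

    # Empty strings for the query and fragment drop them from the result.
    return urlunsplit((parts.scheme, netloc, parts.path or "/", "", ""))


if __name__ == "__main__":
    print(sanitize_url("http://user:secret@www.example.com/page.html?id=123"))
```

Only the sanitized URL would then be handed to the HTTP client, and anything 
for which the helper returns None would simply be skipped.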


The plugin is available at:
http://wiki.apache.org/spamassassin/WebRedirectPlugin

Send me an email if you find the plugin useful or spot a flaw that 
should be corrected.


Best Regards,

Daryl C. W. O'Shea