Posted to users@spamassassin.apache.org by Steve Cloutier <cl...@piesky.com> on 2008/03/08 07:07:16 UTC

Yet another spam blocker?

Hi !

Call me -- whatever :-) I took a look at SpamAssassin a while back, and (at
least at the time), it seemed to scan the mailbox file after the message(s)
were received. The program (again, at the time) was written in Perl.

This whole process seemed somewhat inefficient, and also allowed the spammer
to believe their messages were getting through.

I looked into blocking spam at the SMTP protocol level, and using the
sendmail milter API, wrote a blocker. What started out as a
hacked-together spam blocker (not filter) has become something which seems
to work very well. This one is written in C, and is quite efficient from
both a memory and CPU usage standpoint. It is a multi-threaded daemon, and
currently is running on a number of FreeBSD machines, from 4.3 to the
latest version.

The spam blocking process I used is pretty simple:

1) The forward and reverse name lookups for the connecting SMTP server must
   exist and agree. If this is not true, the message is rejected with a
   550-5.7.0 reject (with explanation). The spammer knows the message failed.

2) The IP address of the server must not appear in any of the DNSBLs that
   I'm using. If the address is on any of the lists, the message is rejected
   at the protocol level, as above.

3) There is an option for blocking servers that are coming from a
   "cust-199-190-54-2" (or various variants) type of reverse lookup. This is
   very effective, but does yield a few false positives.

4) The server's domain is checked against a couple of URIBLs.

5) The sender domain is similarly checked.

6) Optionally, SPF records can be checked.

7) The subject is optionally checked for certain words and phrases.

8) The message body is checked (as it arrives) with an efficient scanner for
   URLs. These are checked against the URIBLs. Also, optionally, certain
   words and phrases can be used to trigger a block. Currently, I don't do
   this (the word/phrase blocks) as the other checks seem to be more than
   sufficient.
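
A few of the checks in the list above can be sketched without any network
access. The Python below is an illustration only, not the blocker's actual C
code: the function names, the example zone dnsbl.example.org and the regular
expressions are all assumptions made for this sketch. It shows how an IPv4
address is conventionally turned into a DNSBL query name (check 2), one
possible pattern for "cust-199-190-54-2" style reverse names (check 3), and a
naive extraction of URL hostnames from body text (check 8).

```python
import ipaddress
import re


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone.

    DNSBLs are conventionally queried with the octets reversed and the
    list's zone appended, e.g. 192.0.2.99 -> 99.2.0.192.<zone>.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


# Hypothetical pattern: four numeric groups joined by dashes or dots
# somewhere in the reverse name, as in "cust-199-190-54-2".
GENERIC_RDNS = re.compile(r"(?:\d{1,3}[-.]){3}\d{1,3}")


def looks_generic(hostname: str) -> bool:
    """Return True if the reverse name appears to embed an IP address."""
    return GENERIC_RDNS.search(hostname) is not None


# Hypothetical, deliberately simple URL matcher for http/https links.
URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)


def url_hosts(body: str) -> set:
    """Collect the lower-cased hostnames of URLs found in a chunk of text."""
    return {m.group(1).lower().rstrip(".") for m in URL_HOST.finditer(body)}


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
    print(looks_generic("cust-199-190-54-2.example.net"))
    print(url_hosts("Buy now at http://Spam.Example.com/offer today"))
```

A real URIBL lookup would normally reduce each hostname to its registered
domain before querying, which this sketch does not attempt. A pattern as
broad as the one above would also be expected to produce the occasional
false positive mentioned in item 3.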

Again, at each step, if anything triggers a block, the appropriate error is
immediately returned to the connecting SMTP server, and the spammer knows the
message did not get through. In the event of a false positive, there is no
ambiguity as to whether the message arrived or not. The sender knows
immediately.
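
The "reject at the first failing step" behaviour described above can be
sketched as a short pipeline. This is a minimal Python illustration under
stated assumptions, not the blocker's code: the session dictionary, its keys
(client_ip, ptr_name, forward_ips, dnsbl_hits) and the two check functions
are hypothetical, and the DNS results are assumed to have been looked up
already. In a libmilter-based filter the equivalent step would presumably be
to set the reply text and return a reject status from the callback.

```python
from typing import Callable, List, Optional, Tuple

# A check inspects the session and returns a rejection reason, or None to pass.
Check = Callable[[dict], Optional[str]]


def fcrdns_check(session: dict) -> Optional[str]:
    """Check 1: reverse name must exist and resolve back to the client IP."""
    ptr = session.get("ptr_name")
    if not ptr or session["client_ip"] not in session.get("forward_ips", ()):
        return "reverse and forward DNS do not agree"
    return None


def dnsbl_check(session: dict) -> Optional[str]:
    """Check 2: client IP must not be on any configured DNSBL."""
    hits = session.get("dnsbl_hits") or []
    if hits:
        return "client listed in " + hits[0]
    return None


def first_rejection(session: dict, checks: List[Check]) -> Tuple[int, str]:
    """Run checks in order and stop at the first one that objects."""
    for check in checks:
        reason = check(session)
        if reason is not None:
            # 550 with enhanced status 5.7.0, as in the post.
            return 550, "5.7.0 " + reason
    return 250, "2.0.0 OK"


if __name__ == "__main__":
    session = {
        "client_ip": "192.0.2.10",
        "ptr_name": None,  # no reverse name found
        "forward_ips": [],
        "dnsbl_hits": [],
    }
    print(first_rejection(session, [fcrdns_check, dnsbl_check]))
```

Ordering the cheap connection-level checks first means most rejected
sessions never reach the body scanner.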

All transactions, blocks, reasons, IP addresses, etc. are logged.

The blocker also maintains a number of lists such as local "whitelists",
blacklists, allowed hosts, local hosts, allowed domains, and overrides for
those who do not wish to have their mailboxes free of spam, etc. Any address
to which a message is sent is automatically put on a whitelist.

Anyway, it seems to work nicely. On one particular site, I've blocked over
17 million messages since the end of November '07.

The idea was to have something which worked at the protocol level to block
unwanted messages, and which was very efficient.

The protocol level blocking is nice because the spammers remove you from
their lists, and eventually, the amount of spam sent actually decreases.
I've noticed this since running the blocker, which has been since about
August, 2007. The amount of (spam) messages sent to us has gradually
decreased.

If anyone wants to test this, you're welcome to do so. Contact me with what
you're running for a platform, and I'll see if I can generate an executable
for you. Installation takes about 10 seconds - add a couple of lines to
sendmail.cf and to whatever you use to start sendmail (you must start the
spam blocking daemon before sendmail starts), and you're done!

The program is highly reliable, and so far, I've never had a crash or hang!
I've only used it on FreeBSD with sendmail, however I don't see any big
issues with Linux or other Unix variants using sendmail. Sendmail version
8.14.2 or later is strongly recommended, as there are some milter API bugs
in earlier versions.

Oh well, for what it's worth!

Regards,

Steve

cloutier@piesky.com

--
View this message in context: http://www.nabble.com/Yet-another-spam-blocker--tp15911630p15911630.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: Yet another spam blocker?

Posted by mouss <mo...@netoyen.net>.

Fred T wrote:
> Hello Steve,
>
> Saturday, March 8, 2008, 11:56:46 PM, you wrote:
>
>   
>> Now, I'm no expert on spam-bots, but it strikes me that the 'bots might want
>> to remove failed addresses
>> from their lists to make them more efficient.  A 550 error returned at the
>> protocol level will immediately
>> notify the 'bot that the addressee is bad.  Whether the 'bot then removes
>> the addressee from the list
>> is a matter of implmentation, but if the reduction in spam directed at the
>> Town that we have seen is any 
>> indication, the 'bots might just function in this manner (or at least some
>> of them).
>>     
>
> This is interesting and I wonder why different sites would see
> different behavior.    We see a bot attempt to deliver a message and
> get rejected and then almost immediately we see the same message from
> another bot get rejected.  So from our perspective we see the bots
> working together to attempt to circumvent ip based blacklists.
> And we block invalid recip's and they keep sending no matter what!
>   

I also see the same zombies retrying many times with a different sender.
I guess they have some blind retry strategy that consists of retrying
with a different sender and/or from a different IP. I am not seeing any
evidence of list washing.

I wanted to see if these were real retries, that is, they occur because
the transaction is rejected, or if the bots resend whether the
transaction is rejected or not, so I configured some of the "highly
targeted" addresses to accept mail. I found that some spam is sent
multiple times (so that's an automatic retry, even if the message was
accepted) and other spam is only received once.

Given the size of a spam, it is tempting to accept and discard instead
of rejecting. Unfortunately, this is risky (except for "obviously"
invalid addresses).

> We've been using SpamAssassin for 4 years and blocking during the
> SMTP session (or during protocol stage as you state it) and we've
> never seen a decrease in spam except for the downtime between new
> versions of the malware that drives them!
>
> I have a MRTG graph of # of spam blocked in transit and it's been
> consistently 52-56k a day for years!!  I always notice a huge
> decrease over the weekend and it picks up big-time during the week.
> From 40k on the weekend to an average peak of 54k weekdays.
>
>
>

Re: Re[2]: Yet another spam blocker?

Posted by Loren Wilton <lw...@earthlink.net>.

> I have a MRTG graph of # of spam blocked in transit and it's been
> consistently 52-56k a day for years!!  I always notice a huge
> decrease over the weekend and it picks up big-time during the week.
> From 40k on the weekend to an average peak of 54k weekdays.

I wonder if this means that the majority of zombies are actually business 
PCs rather than home PCs, and they get turned off over the weekend.

        Loren

Re[2]: Yet another spam blocker?

Posted by Fred T <sp...@freddyt.com>.

Hello Steve,

Saturday, March 8, 2008, 11:56:46 PM, you wrote:

> Now, I'm no expert on spam-bots, but it strikes me that the 'bots might want
> to remove failed addresses
> from their lists to make them more efficient.  A 550 error returned at the
> protocol level will immediately
> notify the 'bot that the addressee is bad.  Whether the 'bot then removes
> the addressee from the list
> is a matter of implmentation, but if the reduction in spam directed at the
> Town that we have seen is any 
> indication, the 'bots might just function in this manner (or at least some
> of them).

This is interesting and I wonder why different sites would see
different behavior.    We see a bot attempt to deliver a message and
get rejected and then almost immediately we see the same message from
another bot get rejected.  So from our perspective we see the bots
working together to attempt to circumvent ip based blacklists.
And we block invalid recip's and they keep sending no matter what!

We've been using SpamAssassin for 4 years and blocking during the
SMTP session (or during protocol stage as you state it) and we've
never seen a decrease in spam except for the downtime between new
versions of the malware that drives them!

I have a MRTG graph of # of spam blocked in transit and it's been
consistently 52-56k a day for years!!  I always notice a huge
decrease over the weekend and it picks up big-time during the week.
From 40k on the weekend to an average peak of 54k weekdays.

-- 
Best regards,
 Fred                            mailto:spamassassin@freddyt.com

Re: Yet another spam blocker?

Posted by Steve Cloutier <cl...@piesky.com>.

Jari Fredriksson wrote:
> 
>> I just wanted to come up with
>> something that blocked spam
>> at the protocol level (so the spammer gets an error!!!),
> 
> That's all great.. but the reality may be that the spammer still get no
> error.
> 
> ...
> 

Hmmmm... Well, yes and no :-)  Now, it is probably correct to say the actual
spammer him/her self does not get the errors, but what is VERY interesting
is the DRASTIC reduction in the actual amount of spam received (and blocked)
over time.

For instance, I went back and looked at some early statistics from when we
first started running our spam blocker (about 7 months ago).  During the
first week, we blocked around 25,000 spam email messages directed at one
domain (the Town's domain).  For one recipient in particular (the Town
clerk), the blocker stopped 4814 messages in one week.

I went back and ran the blocking statistics for this past week.  Down from
the original 25,000 in a week, to under 14,000. The Town Clerk is down from
4184 blocked messages in one week to 588 (a reduction to almost 1/10th the
original value).  None of the messages received in the clerk's email box
since 29-Feb (which is as far back as I looked) were spam, which is kind of
interesting.  I figured at least one or two would get through in that amount
of time.
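
The ratios behind these figures can be checked with a few lines of Python.
This is only a worked calculation on the numbers quoted above. Note that the
clerk's first-week figure appears as both 4814 and 4184 in this message, so
the sketch computes both rather than guessing which was intended, and it
treats "under 14,000" as 14,000, an upper bound.

```python
def remaining_fraction(before: int, after: int) -> float:
    """Fraction of the original weekly volume that is still arriving."""
    return after / before


# Whole domain: about 25,000 per week at the start, under 14,000 now.
domain = remaining_fraction(25_000, 14_000)

# Town clerk: 588 per week now, against either quoted starting figure.
clerk_if_4814 = remaining_fraction(4814, 588)
clerk_if_4184 = remaining_fraction(4184, 588)

if __name__ == "__main__":
    print(f"domain: at most {domain:.0%} of the original weekly volume")
    print(f"clerk (from 4814): {clerk_if_4814:.1%}")
    print(f"clerk (from 4184): {clerk_if_4184:.1%}")
```

Either way the clerk's mailbox sees roughly 12 to 14 percent of its original
volume, a little more than the "almost 1/10th" stated, while the domain as a
whole is at no more than 56 percent.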

I have consistently noticed this with my other sites as well.  The amount of
spam directed at the domains drops quite a bit over time.

Now, I'm no expert on spam-bots, but it strikes me that the 'bots might want
to remove failed addresses from their lists to make them more efficient.  A
550 error returned at the protocol level will immediately notify the 'bot
that the addressee is bad.  Whether the 'bot then removes the addressee from
the list is a matter of implementation, but if the reduction in spam
directed at the Town that we have seen is any indication, the 'bots might
just function in this manner (or at least some of them).

Don'tchya just hate spam?  I have a friend who works in MIS at a large
hospital.  They have one of the more expensive spam solutions (which
supposedly cost them some $50,000 - and then there's an ongoing fee),
***and*** they have a full-time sysadmin on staff to administrate the
machine and the associated software.  So, what are they spending on spam
*per year* - something like $100,000 or probably MORE????!!!

Gives us something to do, I guess, but I can think of better uses of time
:-) :-) :-)

Regards,

Steve

Re: Yet another spam blocker?

Posted by Jari Fredriksson <ja...@iki.fi>.

> I just wanted to come up with
> something that blocked spam
> at the protocol level (so the spammer gets an error!!!),

That's all great.. but the reality may be that the spammer still gets no error.

Spam is nowadays delivered through innocent third-party bystanders, and the actual spammer is hardly following what happens on those bots.

Re: Yet another spam blocker?

Posted by Henrik K <he...@hege.li>.

On Sat, Mar 08, 2008 at 01:52:57PM -0800, Steve Cloutier wrote:
> 
> Well, don't know about non-standard :-), but this blocker definitely
> interacts with all
> phases of delivery :-)  I'm not necessarily suggesting anyone abandon
> SpamAssassin - don't
> get me wrong about that - I just wanted to come up with something that
> blocked spam
> at the protocol level (so the spammer gets an error!!!), and was highly
> efficient and simple to get 
> working on a system.

So are you saying that you don't use SpamAssassin at all? What are you doing
with the thousands of spams that come from legitimate sources like hotmail?
The checks you do are the basic stuff that most people do, but still a few
percent of spam gets through and needs checks like DCC, Razor etc to catch
more complicated ones. And of course some rules of your own that can't be
made as a simple "phrase check".

The fact is you also need SpamAssassin (or whatever floats your boat,
dspam?) somewhere in the chain. It doesn't require much processing power
after you already initially blocked 90%+ of your traffic, so the "quest for
efficiency" is bogus. And yes you can block at SMTP level with it too.

Re: Yet another spam blocker?

Posted by Steve Cloutier <cl...@piesky.com>.

Matt Kettler-3 wrote:
> 
> 
> ..
> Yes, we know that. Our tool is external too.
> 
> The only big difference I see is your tool appears to be a quasi 
> nonstandard milter for sendmail, that interacts with all phases of
> delivery.
> 
> Personally, I use a combination of milter-greylist for filtering before 
> the DATA phase, and MailScanner/SpamAssassin for post delivery scanning. 
> Works really nicely and does about 95% of what you mention. It won't 
> look up text domain names in RBLs or URIBLs though, only IP based RBLs 
> are currently supported.
> ...
> 

Well, don't know about non-standard :-), but this blocker definitely
interacts with all phases of delivery :-)  I'm not necessarily suggesting
anyone abandon SpamAssassin - don't get me wrong about that - I just wanted
to come up with something that blocked spam at the protocol level (so the
spammer gets an error!!!), and was highly efficient and simple to get
working on a system.

I'm not sure what we're going to do with our "product" beyond our own
systems, and for use on some of the big sites my group manages.  I am the
MIS director for a couple of municipalities, along with some other sites,
and spam was becoming a real issue.  We evaluated a number of available
products, both commercial and not, including Ironmail, Barracuda Networks,
SpamAssassin and some others I can't think of right now.

For us, because we have a couple of software engineers on staff, writing a
blocker turned out to be practical, and in this case, the best alternative.

Currently, a configuration user interface (graphical, web based) is in the
works, to allow individual users to have whitelists, blacklists, etc. and
determine the filtering level for their own particular needs, if the default
system-wide settings are not to their liking.  You know, feature creep :-)
"oh, but we could add this one thing"....

Regards,

Steve

Re: Yet another spam blocker?

Posted by Matt Kettler <mk...@verizon.net>.

Steve Cloutier wrote:
>
> Hi !
>
> I did a fair amount of sendmail tweaking, and it does indeed do quite a bit
> (like checking for the existance of domains, etc.), but *not* the sort of
> filtering I've been able to do with the external code.
>   
Um, Yeah.. We know that. Many of us use SpamAssassin as an SMTP session
interactive scanner in our MTA (sendmail, postfix, qmail or exim).
Others run it at the MDA (e.g. procmail) or MUA (e.g. Thunderbird) layers,
completely independent of their MTA.

The "external code" description would clearly appear to apply to SA.
> The filtering/blocking at the protocol level is made somewhat more difficult
> by the order of operations in the SMTP protocol itself (which is a good
> protocol, but it wasn't made for this level of filtering).
>
> For instance, the content does not come across until all of the recipients
> (in a list) have been processed (and approved or rejected).  So, if one
> wants to reject a message on the basis of some bad URL or content in the
> header or body, the message has to be rejected for everyone or accepted for
> everyone.  Sure, I can remove the recipients who don't want the message, but
> the other end doesn't get an error, and it's nice to send back an error at
> the protcol level :-)  Or I could deliver the message to those who don't
> want that level of filtering and still reject it at the protocol level....
> you get the idea  :-)  :-)  :-)  It's is the way it is, so no use kvetching
> about it (the protocol) !!!! :-)
>   
Yes, We know that too. That's why SA isn't tied to the SMTP protocol..
it's a generic filter. Data in, data out. No ties to any particular MTA,
or any particular part of the mail processing chain. You can stuff it in
anywhere you want.
> Anyway, you can really do a LOT externally - things that can't be done with
> sendmail alone.  Of course, there's also the possiblity of integration with
> other email packages which may have some sort of protocol level interface.
>   
Yes, we know that. Our tool is external too.

The only big difference I see is your tool appears to be a quasi 
nonstandard milter for sendmail, that interacts with all phases of delivery.

Personally, I use a combination of milter-greylist for filtering before 
the DATA phase, and MailScanner/SpamAssassin for post delivery scanning. 
Works really nicely and does about 95% of what you mention. It won't 
look up text domain names in RBLs or URIBLs though, only IP based RBLs 
are currently supported.

Note: milter-greylist, despite its name, isn't just a greylist; it's a
complex ACL based milter that can invoke RBLs, SPF, etc. Any trigger
mechanism can be used to whitelist, greylist (with per-ACL durations)
or blacklist email. It can have ACLs both before and after the DATA
phase, although greylisting isn't supported in the post-DATA phase ACLs, as
it would be really unwise to do so from a bandwidth perspective.

Re: Yet another spam blocker?

Posted by Steve Cloutier <cl...@piesky.com>.

Chris Hoogendyk wrote:
> 
> 
> Henrik K wrote:
>> On Fri, Mar 07, 2008 at 10:07:16PM -0800, Steve Cloutier wrote:
>>   
>>> Hi !
>>>
>>> Call me -- whatever :-)  I took a look at SpamAssassin a while back, and
>>> (at
>>> least at the time), it seemed to scan the mailbox file after the
>>> message(s)
>>> were received.  The program (again, at the time) was written in Perl.
>>>
>>> This whole process seemed somewhat inefficient, and also allowed the
>>> spammer
>>> to believe their messages were getting through.
>>>     
>>
>> SpamAssassin is only a filter. There are many ways to run it at SMTP
>> level.
>>
>> Also there are plenty of software that does the features you listed. And
>> a
>> proper MTA can do most of the features you mentioned even by itself. Not
>> to
>> start a flame war, but it seems it's always the Sendmail people who need
>> to
>> come up with fancy custom milters etc. ;)
>>   
> 
> Well, actually, we use sendmail, and, as I read the original post, I was 
> thinking myself, umm, a lot of these are things you can do with sendmail 
> without any additional code. So, maybe it's just people who see they can 
> do a milter but don't take the time to learn all the depth of what they 
> can already do.
> 
> I'm not the local expert on sendmail, but I did the original install and 
> I do the maintenance. My boss has dug in and done some of the tweaks. 
> Several years ago he attended a usenix seminar by Eric Allman and added 
> quite a lot to what he knew about sendmail. The latest O'Reilly book on 
> sendmail provides lots of depth to plumb.
> 
> 
>>> If anyone wants to test this, you're welcome to do so.  Contacat me with
>>> what you're running for
>>> a platform, and I'll see if I can generate an executable for you. 
>>>     
>>
>> I'm sure everyone is dying to get "some executable" running in their
>> systems.
>> How about sources? :)
>>   
> 
> 

Hi !

I did a fair amount of sendmail tweaking, and it does indeed do quite a bit
(like checking for the existence of domains, etc.), but *not* the sort of
filtering I've been able to do with the external code.

The filtering/blocking at the protocol level is made somewhat more difficult
by the order of operations in the SMTP protocol itself (which is a good
protocol, but it wasn't made for this level of filtering).

For instance, the content does not come across until all of the recipients
(in a list) have been processed (and approved or rejected).  So, if one
wants to reject a message on the basis of some bad URL or content in the
header or body, the message has to be rejected for everyone or accepted for
everyone.  Sure, I can remove the recipients who don't want the message, but
the other end doesn't get an error, and it's nice to send back an error at
the protocol level :-)  Or I could deliver the message to those who don't
want that level of filtering and still reject it at the protocol level....
you get the idea  :-)  :-)  :-)  It is the way it is, so no use kvetching
about it (the protocol) !!!! :-)
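
The all-or-nothing constraint described above can be made concrete with a
small sketch. The Python below is a hypothetical model, not the blocker's
code: the function name, the filtering_on mapping (recipient address to
whether that recipient wants content filtering) and the reject_when_mixed
switch are assumptions introduced for illustration. It returns the single
reply the sending server sees for the whole message, plus the recipients who
would still get a copy.

```python
from typing import Dict, List, Tuple


def data_phase_decision(
    is_spam: bool,
    filtering_on: Dict[str, bool],
    reject_when_mixed: bool = True,
) -> Tuple[int, List[str]]:
    """Return (SMTP reply code for the whole message, recipients to deliver to).

    By the time the content arrives, every recipient has already been
    accepted, so only one reply can be given for all of them.
    """
    everyone = list(filtering_on)
    if not is_spam:
        return 250, everyone

    # Recipients who opted out of content filtering still want the message.
    unfiltered = [rcpt for rcpt, on in filtering_on.items() if not on]
    if not unfiltered:
        return 550, []

    if reject_when_mixed:
        # Second option in the post: the sender sees an error, while the
        # opted-out recipients still receive the message.
        return 550, unfiltered
    # First option in the post: accept, quietly dropping filtered recipients,
    # so the sender gets no error.
    return 250, unfiltered


if __name__ == "__main__":
    prefs = {"clerk@example.org": True, "archive@example.org": False}
    print(data_phase_decision(True, prefs))
    print(data_phase_decision(True, prefs, reject_when_mixed=False))
```

Neither branch of the mixed case is fully satisfactory, which is the point
being made: the sender either gets an error for a message that was partly
delivered, or no error for one that was partly dropped.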

Anyway, you can really do a LOT externally - things that can't be done with
sendmail alone.  Of course, there's also the possibility of integration with
other email packages which may have some sort of protocol level interface.

Regards,

Steve

Re: Yet another spam blocker?

Posted by Chris Hoogendyk <ho...@bio.umass.edu>.

Henrik K wrote:
> On Fri, Mar 07, 2008 at 10:07:16PM -0800, Steve Cloutier wrote:
>   
>> Hi !
>>
>> Call me -- whatever :-)  I took a look at SpamAssassin a while back, and (at
>> least at the time), it seemed to scan the mailbox file after the message(s)
>> were received.  The program (again, at the time) was written in Perl.
>>
>> This whole process seemed somewhat inefficient, and also allowed the spammer
>> to believe their messages were getting through.
>>     
>
> SpamAssassin is only a filter. There are many ways to run it at SMTP level.
>
> Also there are plenty of software that does the features you listed. And a
> proper MTA can do most of the features you mentioned even by itself. Not to
> start a flame war, but it seems it's always the Sendmail people who need to
> come up with fancy custom milters etc. ;)
>   

Well, actually, we use sendmail, and, as I read the original post, I was 
thinking myself, umm, a lot of these are things you can do with sendmail 
without any additional code. So, maybe it's just people who see they can 
do a milter but don't take the time to learn all the depth of what they 
can already do.

I'm not the local expert on sendmail, but I did the original install and 
I do the maintenance. My boss has dug in and done some of the tweaks. 
Several years ago he attended a usenix seminar by Eric Allman and added 
quite a lot to what he knew about sendmail. The latest O'Reilly book on 
sendmail provides lots of depth to plumb.

>> If anyone wants to test this, you're welcome to do so.  Contacat me with
>> what you're running for
>> a platform, and I'll see if I can generate an executable for you. 
>>     
>
> I'm sure everyone is dying to get "some executable" running in their systems.
> How about sources? :)
>   

---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst 

<ho...@bio.umass.edu>

--------------- 

Erdös 4

Re: Yet another spam blocker?

Posted by Henrik K <he...@hege.li>.

On Fri, Mar 07, 2008 at 10:07:16PM -0800, Steve Cloutier wrote:
> 
> Hi !
> 
> Call me -- whatever :-)  I took a look at SpamAssassin a while back, and (at
> least at the time), it seemed to scan the mailbox file after the message(s)
> were received.  The program (again, at the time) was written in Perl.
> 
> This whole process seemed somewhat inefficient, and also allowed the spammer
> to believe their messages were getting through.

SpamAssassin is only a filter. There are many ways to run it at SMTP level.

Also there is plenty of software that does the features you listed. And a
proper MTA can do most of the features you mentioned even by itself. Not to
start a flame war, but it seems it's always the Sendmail people who need to
come up with fancy custom milters etc. ;)

> If anyone wants to test this, you're welcome to do so.  Contacat me with
> what you're running for
> a platform, and I'll see if I can generate an executable for you. 

I'm sure everyone is dying to get "some executable" running in their systems.
How about sources? :)