Posted to users@spamassassin.apache.org by William Stearns <ws...@pobox.com> on 2005/07/10 07:41:27 UTC

md5sum/sha1sum signatures available, was RE: Gif-Only spams

Good evening, all,

On Thu, 9 Jun 2005, Chris Santerre wrote:

>> From: Sven Riedel [mailto:sr@baghus.net]
>> Sent: Thursday, June 09, 2005 10:19 AM
>>
>> has anyone developed a good strategy against spams
>> that contain a random text and the actual spam in
>> an image within a multipart/alternative mail?
>>
>> Short of entirely blocking mails containing images, that
>> is.
>
> Check out the interesting idea at www.rulesemporium.com/forums/
>
> entitled: Image attachment MD5 footprint RBL
>
> Pretty cool.

 	The forums appear to be down at the moment, so I couldn't read the 
thread involved.
 	I'm guessing the idea is to have a set of md5sums of known spam 
attachments (images and others), so when a new message comes in, the spam 
filter md5sums/sha1sums each mime part and does a dns lookup on

6f2b009a213b916d391407a7f86c0300.attach.uribl.com

 	, which returns a 127.0.0.2 if that's a known spam attachment?
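
A minimal sketch of the lookup guessed at above, in Python. The zone name and the 127.0.0.2 answer come from the message; whether such a zone exists is unknown, and the function names are made up for illustration. The resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import hashlib
import socket

# Zone and "listed" answer as guessed in the message; not known to exist.
ZONE = "attach.uribl.com"
LISTED = "127.0.0.2"


def query_name(payload: bytes, zone: str = ZONE, algo: str = "md5") -> str:
    """Return the DNS name to query for one decoded attachment."""
    digest = hashlib.new(algo, payload).hexdigest()
    return f"{digest}.{zone}"


def is_listed(name: str, resolve=socket.gethostbyname) -> bool:
    """True if the name resolves to the 'listed' address.

    A failed lookup (for example NXDOMAIN) is treated as 'not listed'.
    """
    try:
        return resolve(name) == LISTED
    except socket.gaierror:
        return False
```

Both md5 (32 hex characters) and sha1 (40 hex characters) digests fit inside the 63-byte limit for a single DNS label, so either can be used as the leftmost label unchanged.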

 	razor, pyzor, and dcc do this with custom client apps and 
protocols (just try getting the razor protocol from Vipul or Jordan ;-). 
I kind of like the idea of doing it with dns and simple md5 or sha1 
checksums.  Enough so that I extracted around 21,000 unique attachments 
from the 3.5G of the last 3 years of hand-checked spam.  I hand-checked 
9,791 of those attachments (*) and placed their md5sums and sha1sums up at 
http://www.stearns.org/spamattach/ 
(http://www.stearns.org/spamattach/combined.md5sums and 
http://www.stearns.org/spamattach/combined.sha1sums hold all of the sums)

 	Is someone willing to do the SA plugin to md5/sha1 sum each
non-text mime part (or even just the images for efficiency)?  If so, I'd 
be glad to create a zone to test from.
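
What the requested plugin would have to compute can be sketched outside SpamAssassin with Python's standard email package. This is only an illustration of the hashing step, not a SpamAssassin plugin; the function name and the images_only switch are hypothetical.

```python
import hashlib
from email import message_from_bytes


def attachment_sums(raw: bytes, images_only: bool = False):
    """Return (content_type, md5_hex, sha1_hex) for each non-text MIME part."""
    msg = message_from_bytes(raw)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        maintype = part.get_content_maintype()
        if maintype == "text":
            continue  # skip text/plain, text/html and so on
        if images_only and maintype != "image":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding,
        # so the sums are taken over the attachment bytes themselves.
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        results.append((
            part.get_content_type(),
            hashlib.md5(payload).hexdigest(),
            hashlib.sha1(payload).hexdigest(),
        ))
    return results
```

Hashing the decoded bytes rather than the base64 text means that re-wrapping the encoded lines does not change the sum.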
 	For all those that aren't sure it's worth redoing the razor, 
pyzor, and dcc work in a dns-based rbl, I guess I'd answer I'm not sure 
either.  :-)  On the other hand, I've already done a hand-checked set of 
sums, the plugin shouldn't be all that hard, and we can throw it at a 
corpus to see how well it works.  It might just help enough to be worth 
it....
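
One rough way to "throw it at a corpus" is to count how many attachment sums from a corpus appear in the published list. The sketch below assumes the sums files use the usual md5sum(1) layout, with the hex digest as the first whitespace-separated field on each line; that layout is an assumption, and the function names are hypothetical.

```python
def load_sums(lines):
    """Collect the first field of each non-empty line as a known digest."""
    known = set()
    for line in lines:
        fields = line.split()
        if fields:
            known.add(fields[0].lower())
    return known


def hit_rate(corpus_digests, known):
    """Fraction of corpus digests that appear in the known set."""
    digests = list(corpus_digests)
    if not digests:
        return 0.0
    hits = sum(1 for d in digests if d.lower() in known)
    return hits / len(digests)
```

Running hit_rate over a spam corpus gives a feel for coverage; running it over a ham corpus gives a feel for false positives.
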
 	Cheers,
 	- Bill

* I had to stop when my eyes glazed over.  :-)

---------------------------------------------------------------------------
         "The sign on the window next to the entrance of OptInRealBig's
offices in Westminster leaves no room for misunderstanding.  Or irony.
NO SOLICITING."
http://www.westword.com/issues/2004-01-29/feature.html/3/index.html
--------------------------------------------------------------------------
William Stearns (wstearns@pobox.com).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
--------------------------------------------------------------------------

Re: md5sum/sha1sum signatures available, was RE: Gif-Only spams

Posted by Dirk Bonengel <di...@bonengel.de>.
Just to add my 2 Euro-Cent:
Something like this might actually exist already (in as far as gif-only
spams are of interest).
Bert Ungerer, an editor with the German IT magazine 'iX', developed a
procmail-based anti-spam system he called 'NiXSpam'.
One part of it is a list of MD5 hashes of parts of the body of mails
that fail various whitelisting criteria. A mail that comes in later and
whose body returns the same hash value is considered spam. The
procmail recipe in question can be found here:
ftp://ftp.ix.de/pub/ix/ix_listings/2004/05/checksums
These hashes have been available via DNS for some time now. I've put
together a plugin for SpamAssassin to use these hashes (to be found at
http://wiki.apache.org/spamassassin/iXhash) and set up my own RBL (with
our own input) to boot. It works quite well here, but then it's
(probably) not enough spam to come to real conclusions.
Anyone willing to experiment is welcome, especially Sven, the original
poster (who lives in Munich just like I do; the world's a small
place...). More info is to be found at the wiki.

Dirk



Re: md5sum/sha1sum signatures available, was RE: Gif-Only spams

Posted by Rob Skedgell <ro...@nephelococcygia.demon.co.uk>.
On Sunday 10 Jul 2005 06:41, William Stearns wrote:
> Good evening, all,
>
> On Thu, 9 Jun 2005, Chris Santerre wrote:
> >> From: Sven Riedel [mailto:sr@baghus.net]
> >> Sent: Thursday, June 09, 2005 10:19 AM
> >>
> >> has anyone developed a good strategy against spams
> >> that contain a random text and the actual spam in
> >> an image within a multipart/alternative mail?
> >>
> >> Short of entirely blocking mails containing images, that
> >> is.
> >
> > Check out the interesting idea at www.rulesemporium.com/forums/
> >
> > entitled: Image attachment MD5 footprint RBL
> >
> > Pretty cool.
>
>  	The forums appear to be down at the moment, so I couldn't read the
> thread involved.
>  	I'm guessing the idea is to have a set of md5sums of known spam
> attachments (images and others), so when a new message comes in, the
> spam filter md5sums/sha1sums each mime part and does a dns lookup on
>
> 6f2b009a213b916d391407a7f86c0300.attach.uribl.com
>
>  	, which returns a 127.0.0.2 if that's a known spam attachment?
>
>  	razor, pyzor, and dcc do this with custom client apps and
> protocols (just try getting the razor protocol from Vipul or Jordan
> ;-). I kind of like the idea of doing it with dns and simple md5 or
> sha1 checksums.  Enough so that I extracted around 21,000 unique
> attachments from the 3.5G of the last 3 years of hand-checked spam. 
> I hand-checked 9,791 of those attachments (*) and placed their
> md5sums and sha1sums up at http://www.stearns.org/spamattach/
> (http://www.stearns.org/spamattach/combined.md5sums and
> http://www.stearns.org/spamattach/combined.sha1sums hold all of the
> sums)
>
>  	Is someone willing to do the SA plugin to md5/sha1 sum each
> non-text mime part (or even just the images for efficiency)?  If so,
> I'd be glad to create a zone to test from.
>  	For all those that aren't sure it's worth redoing the razor,
> pyzor, and dcc work in a dns-based rbl, I guess I'd answer I'm not
> sure either.  :-)  On the other hand, I've already done a
> hand-checked set of sums, the plugin shouldn't be all that hard, and
> we can throw it at a corpus to see how well it works.  It might just
> help enough to be worth it....
>  	Cheers,
>  	- Bill
>
> * I had to stop when my eyes glazed over.  :-)

I would certainly be interested in this as I've been replacing all but 
the first and last lines of base64 from spam attachments in NANAS 
postings with something like this (from diploma mill spam) posted as
<ne...@nephelococcygia.demon.co.uk>:

*** Attachment "subliminal.GIF" elided:
*** file(1) : GIF image data, version 89a, 642 x 485
*** size:     11443 bytes
*** md5sum:   ac5d8f032c58938a821771ef96eb970d
*** sha1sum:  ac029c1f85000f028233d4ad60a0e860e973a806
*** clamscan: OK
***
*** Phone number in image: +1-206-984-0021
***

(Not found amongst the 1299 entries in 
<http://www.stearns.org/spamattach/diploma.md5sums>).
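
A notice in roughly the layout shown above can be produced from the decoded attachment bytes with a few lines of Python. This is only a sketch: the function name is hypothetical, and the file(1) and clamscan lines are left out because they depend on external tools.

```python
import hashlib


def elision_notice(filename: str, payload: bytes) -> str:
    """Build a short '*** ...' summary to post in place of an attachment."""
    lines = [
        f'*** Attachment "{filename}" elided:',
        f"*** size:     {len(payload)} bytes",
        f"*** md5sum:   {hashlib.md5(payload).hexdigest()}",
        f"*** sha1sum:  {hashlib.sha1(payload).hexdigest()}",
    ]
    return "\n".join(lines)
```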

The only problem I can see with this is that, once it caught on, the
spammers would be able to frustrate it quite easily. Obviously, I won't
suggest how in a public forum...

-- 
Rob Skedgell <ro...@nephelococcygia.demon.co.uk>