Posted to users@spamassassin.apache.org by ha...@t-online.de on 2006/07/31 18:26:55 UTC

Re: Image spams getting thru

>> 
>> > On Mon, Jul 31, 2006 at 01:57:52PM +0530, Ramprasad wrote:
>> >> So if the spammer keeps generating different images for every spam mail
>> >> then DCC RAZOR etc would be useless right ?
>> >
>> >   An image is just content - much like text or HTML.  How useful
>> > DCC/RAZOR/etc. would be depends highly on how they are used and
>> > on how sophisticated the spammer is.  What I suggested is not the
>> > end-it-all solution for spam detection but another tool to add to
>> > the spamassassin toolbox.
>> >
>> >   Also, generating new images potentially is computationally expensive
>> > enough that most spammers wouldn't try it.
>> >
>> >   Over 50% of my false negatives this week would have been properly
>> > identified by IDing the image.  YMMV.
>> >
>> >   Tim
>> >
>> 
>> A few months ago I played around with a plugin that computed MD5 hashes
>> from images contained in a mail and compared that sum to a RBL-like
>> DNS-based database maintained by Will Stearns.
>> Results were somewhat disappointing. If Will still feeds the zone I can
>> post the code somewhere
>> 
>> Another idea was to check the images for correctness. Some spammers seem
>> to use slightly modified copies of a master image. These copies are
>> displayed correctly by the usual MUAs but they do contain errors that show
>> up when using Image::Info or something.
>> 
>> Dirk
>> 

Hi,

This should be possible to detect, but at least the GIF format can be modified easily without
introducing errors: just play with unused colormap entries.
An algorithm that actually renders the image (e.g. converts it to pbm) before computing the MD5
would recognize such images as the same, while a plain MD5 will consider them different.
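
A rough sketch of that difference in Python, assuming a toy palette-indexed image (a list of RGB palette entries plus a list of pixel indices) instead of a real GIF parser. The function names are made up for illustration:

```python
import hashlib

def raw_md5(palette, indices):
    """MD5 over the stored form: every palette entry, then the pixel indices."""
    h = hashlib.md5()
    for rgb in palette:
        h.update(bytes(rgb))
    h.update(bytes(indices))
    return h.hexdigest()

def rendered_md5(palette, indices):
    """MD5 over the rendered pixels: each index is resolved to its RGB colour first."""
    h = hashlib.md5()
    for i in indices:
        h.update(bytes(palette[i]))
    return h.hexdigest()

# Two copies of the same picture; only the unused palette entry 2 differs.
pixels = [0, 1, 1, 0]
copy_a = [(255, 255, 255), (0, 0, 0), (10, 20, 30)]
copy_b = [(255, 255, 255), (0, 0, 0), (99, 98, 97)]

print(raw_md5(copy_a, pixels) == raw_md5(copy_b, pixels))            # False
print(rendered_md5(copy_a, pixels) == rendered_md5(copy_b, pixels))  # True
```

Hashing the rendered pixels only ignores changes that do not reach the screen; any change to a visible pixel still produces a different sum.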

Wolfgang Hamann

Musicman

Re: Image spams getting thru

Posted by Philip Prindeville <ph...@redfish-solutions.com>.

Logan Shaw wrote:

>[snip]
>And there's also an easy way around it.  Simply add noise to
>the image.  There are a number of techniques, but an obvious
>one to use with GIF is to assign two palette entries to
>two nearly (but not quite) identical colors.  For example,
>put 0xffffff and 0xfffeff in your palette.  Then, for every
>white pixel in the original image, choose at random whether it
>gets represented by a 0xffffff or 0xfffeff pixel.  There will
>be virtually no discernible difference to the eye, but the
>files will be completely different, especially since GIF uses
>LZW compression on the pixel data.
>
>There are similar methods for other formats:  with JPEG, you
>can just change the quality settings, causing the JPEG encoder
>itself to add noise to your image.  (And perfectly legit noise,
>too, since the quality parameters vary on legit images.)
>
>And of course you can just add noise to the least significant
>bit in any generic format as well.
>
>   - Logan
>  
>

If I could revisit this issue and be less sinister in doing so, I'm
trying to look at ways to generate a fingerprint from GIF stock
spams that could be used to filter them.
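
One possible direction, sketched in Python under the same toy assumption of a palette plus pixel indices (not a real GIF parser): render the pixels and drop the low bits of every colour channel before hashing, so the 0xffffff / 0xfffeff trick quoted above no longer changes the result. The `fingerprint` name and the `drop_bits` parameter are hypothetical.

```python
import hashlib

def fingerprint(palette, indices, drop_bits=2):
    """Hash rendered pixels after zeroing the low `drop_bits` bits of each channel."""
    mask = 0xFF & ~((1 << drop_bits) - 1)
    h = hashlib.md5()
    for i in indices:
        h.update(bytes(channel & mask for channel in palette[i]))
    return h.hexdigest()

palette = [(0xFF, 0xFF, 0xFF), (0xFF, 0xFE, 0xFF), (0x00, 0x00, 0x00)]
original = [0, 0, 2, 0, 0, 2, 0, 0]
# Deterministic stand-in for the random choice: every other white pixel
# is switched to the near-white palette entry 1.
noisy = [1 if (p == 0 and n % 2) else p for n, p in enumerate(original)]

print(fingerprint(palette, original) == fingerprint(palette, noisy))        # True
print(fingerprint(palette, original, 0) == fingerprint(palette, noisy, 0))  # False
```

This is only a starting point: two colours that straddle a quantization boundary would still hash differently, and heavier noise would need a real perceptual measure.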

I'll need to reduce a large number of spams (no, I don't need any
extra, so don't bother forwarding them ;-)... and then do a stochastic
analysis of those parameters.

In the meantime, a couple of questions and observations...

First, CPAN seems to come up short on modules to parse and
decompose (and render!) GIF or PNG file formats. Most
disappointing. I finally decided on the now stagnant and
unsupported Image::Info module (sigh), but it doesn't
decompress that data once it deconstructs the GIF data stream
into its component parts.

I tried to use Compress::LZW to decompress the stream, but
that only seems to work with 12 or 16 bit code sizes,
whereas GIF images routinely use minimum code sizes of 4, 6, or 8 bits.
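
For reference, the GIF flavour of LZW is small enough to write by hand. Below is a minimal sketch in Python (not Perl, and not taken from any CPAN module). It assumes the image data sub-blocks have already been concatenated into one byte string and that the minimum code size byte has been read from the stream; `gif_lzw_decode` is a made-up name.

```python
def gif_lzw_decode(data: bytes, min_code_size: int) -> list:
    """Decode GIF-style LZW data (sub-blocks already joined) into colour indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One single-index entry per colour; the next two codes are reserved.
        return {i: [i] for i in range(clear_code)}

    table = fresh_table()
    next_code = end_code + 1
    code_size = min_code_size + 1
    prev = None
    out = []
    bit_buffer = 0
    bit_count = 0

    for byte in data:
        # GIF packs codes least-significant bit first.
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= code_size:
            code = bit_buffer & ((1 << code_size) - 1)
            bit_buffer >>= code_size
            bit_count -= code_size

            if code == clear_code:
                table = fresh_table()
                next_code = end_code + 1
                code_size = min_code_size + 1
                prev = None
                continue
            if code == end_code:
                return out

            if code in table:
                entry = table[code]
            elif prev is not None and code == next_code:
                # Code not in the table yet: previous string plus its own first index.
                entry = prev + [prev[0]]
            else:
                raise ValueError("corrupt LZW stream")
            out.extend(entry)

            if prev is not None and next_code < 4096:
                table[next_code] = prev + [entry[0]]
                next_code += 1
                # Codes grow by one bit when the table fills the current size (max 12).
                if next_code == (1 << code_size) and code_size < 12:
                    code_size += 1
            prev = entry
    return out

# Hand-packed stream for min code size 2: clear(4), 1, 6, end(5).
print(gif_lzw_decode(bytes([0x8C, 0x0B]), 2))   # [1, 1, 1]
```

The variable code width (starting at minimum code size plus one bit and growing up to 12) is the part a fixed 12 or 16 bit LZW implementation cannot handle.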

Does anyone have a handle on what Perl modules to use for
dissecting GIF objects?

Thanks,

-Philip

Re: Image spams getting thru

Posted by jdow <jd...@earthlink.net>.

From: "Logan Shaw" <ls...@emitinc.com>

> On Mon, 31 Jul 2006, jdow wrote:
>> Break the image into pieces. If too many pieces match on MD5 sum then
>> you score it higher than if lots of the image is different. But that
>> can get tedious to say the least.
> 
> And there's also an easy way around it.  Simply add noise to
> the image.  There are a number of techniques, but an obvious
> one to use with GIF is to assign two palette entries to
> two nearly (but not quite) identical colors.  For example,
> put 0xffffff and 0xfffeff in your palette.  Then, for every
> white pixel in the original image, choose at random whether it
> gets represented by a 0xffffff or 0xfffeff pixel.  There will
> be virtually no discernible difference to the eye, but the
> files will be completely different, especially since GIF uses
> LZW compression on the pixel data.
> 
> There are similar methods for other formats:  with JPEG, you
> can just change the quality settings, causing the JPEG encoder
> itself to add noise to your image.  (And perfectly legit noise,
> too, since the quality parameters vary on legit images.)
> 
> And of course you can just add noise to the least significant
> bit in any generic format as well.

Yup, steganography with random data. Of course, you could feed them
to the FBI and say you suspect this is steganographic terrorist
planning or something. I betcha they or the CIA can find the source
of the spam if they buy into that idea....

{^_-}

Re: Image spams getting thru

Posted by Ricardo Oliveira <ri...@gmail.com>.

Just another note: Derek's rule is catching almost all of these messages
here, although it seems we have a new wave of different ones. I'll try to
see what's the pattern and send it this way.

Regards,
Ricardo Oliveira

Re: Image spams getting thru

Posted by Dave Augustus <da...@ingraftedsoftware.com>.

I installed Derek's test rule last night and it has caught every one of
the stock promotion emails and nothing else. I set it to 1.5 for testing.

I have received about 5 of these in the last 12 hours on 2 different
accounts out of a total of about 100 emails. 

Also, I did receive some emails that were both HTML and text WITH
images, and they came through perfectly without hitting the rule.

I will be keeping a close eye on this one as these have seemed to elude
every other method. If I see more success, I will be increasing the
score.

Thanks Derek!


-- 
Here to serve,
Dave Augustus
Ingrafted Software Inc.
c(817) 371-0585
o(817) 741-1288
PO Box 1040
Newark TX 76071

Re: Image spams getting thru

Posted by "Chr. v. Stuckrad" <st...@mi.fu-berlin.de>.

On Tue, 01 Aug 2006, Theo Van Dinter wrote:

> On Tue, Aug 01, 2006 at 09:24:55AM -0700, John D. Hardin wrote:
> ...
> Well, until greylisting becomes enough of a problem that the spammers change
> their software to queue and retry, thereby eliminating the benefit completely.
Or even simply send spam unconditionally twice or thrice
just to be sure to get through the greylist.

It just needs knowledge of how soon you have to give the same
combination of envelope addresses to the same zombie again.

And THIS would explain why I get lots of spams more than once,
but in 'chunks' of 3 to 6 times the same thing in a few minutes
and then pausing for a long while.

So just by re-arranging the (spam-)address-lists and sending
at least twice the amount of spam, greylisting may be circumvented.

Just an idea, because we currently/suddenly get over 20% more spams
for the last few days.

Stucki

-- 
Christoph von Stuckrad      * * |nickname |<st...@mi.fu-berlin.de>   \
Freie Universitaet Berlin   |/_*|'stucki' |Tel(days):+49 30 838-5 57 78|
Mathematik & Informatik EDV |\ *|if online|Tel(else):+49 30 77 39 66 00|
Arnimallee 6 / 14195 Berlin * * |on IRCnet|Fax(alle):+49 30 838-75 454/

Re: Image spams getting thru

Posted by Derek Harding <de...@innovyx.com>.

On Tue, 2006-08-01 at 17:49 -0400, Theo Van Dinter wrote:
>
> Except now you've also delayed your valid mail by 30 minutes or an hour
> which sucks (and is sometimes completely unacceptable).

True, though it would be more accurate to say that you've delayed some of
your valid mail by 30 minutes to an hour.

How much this sucks and how unacceptable it is is going to vary
enormously.

Having run greylisting for a couple of years now I have to say that for
me, for the most part, it's not even noticeable since the majority of my
email turns up immediately.

Derek

Re: Image spams getting thru

Posted by John Rudd <jr...@ucsc.edu>.

On Aug 2, 2006, at 5:21 AM, Jim Maul wrote:

> John D. Hardin wrote:
>> On Tue, 1 Aug 2006, Theo Van Dinter wrote:
>>> Except now you've also delayed your valid mail by 30 minutes or an
>>> hour which sucks (and is sometimes completely unacceptable).
>> Repeat after me: "Email is a non-guaranteed, Best Attempt delivery
>> mechanism. There may be delays."
>
> Just because that's what it was designed to be, doesn't mean that it is. 
> Email is whatever people use it for.  It's an instant messenger 
> utility, it's a file transfer mechanism, or even a replacement for the 
> telephone or snail mail.  Many people have gotten used to the fact 
> that email these days is usually freakin quick and to suddenly have 
> that changed is unacceptable.
>

Yes, but no matter how much lipstick and lace you put on a pig, it's 
still a pig.  It never suddenly becomes a human woman.  And if you take 
it to a restaurant, you can talk about how dressed up it is, but people 
are still going to see a pig slopping at the table.  And they're still 
going to give you funny looks for DATING A PIG.

People who think Email is an IM, a file sharing tool, or a replacement 
for a fast, secure, guaranteed courier service ... are dating pigs.  
Treat them like it.

Re: Image spams getting thru

Posted by Jim Maul <jm...@elih.org>.

John D. Hardin wrote:
> On Tue, 1 Aug 2006, Theo Van Dinter wrote:
> 
>> Except now you've also delayed your valid mail by 30 minutes or an
>> hour which sucks (and is sometimes completely unacceptable).
> 
> Repeat after me: "Email is a non-guaranteed, Best Attempt delivery
> mechanism. There may be delays."
> 

Just because that's what it was designed to be, doesn't mean that it is. 
Email is whatever people use it for.  It's an instant messenger utility, 
it's a file transfer mechanism, or even a replacement for the telephone 
or snail mail.  Many people have gotten used to the fact that email 
these days is usually freakin quick and to suddenly have that changed is 
unacceptable.

Imagine if car companies suddenly started making all vehicles with 4 
cylinder engines to help solve the current gasoline crisis.  It *would* 
help the problem and many people would embrace it, but for many others, 
it's simply unacceptable.

-Jim

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Tue, 1 Aug 2006, Theo Van Dinter wrote:

> Except now you've also delayed your valid mail by 30 minutes or an
> hour which sucks (and is sometimes completely unacceptable).

Repeat after me: "Email is a non-guaranteed, Best Attempt delivery
mechanism. There may be delays."

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 [Small arms] are fundamentally dangerous and their removal from the
 equation either by control, neutralisation or removal is essential.
 The first step is to gain information on their numbers and
 whereabouts.         -- the UN, who "doesn't want to confiscate guns"
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by Theo Van Dinter <fe...@apache.org>.

On Tue, Aug 01, 2006 at 04:33:39PM -0500, Logan Shaw wrote:
> However, don't assume that it kills the benefit of greylisting
> completely:  if you can delay processing that questionable
> message for 30 minutes or an hour, that greatly increases the
> chances it will end up on a realtime blacklist of some type.

Except now you've also delayed your valid mail by 30 minutes or an hour
which sucks (and is sometimes completely unacceptable).

> Then the spammer goes for a second pass through the list to
> try to defeat greylisting.  The servers that had greylisted
> the messages will receive it again but will check the
> distributed database.  The distributed database will have a
> zillion reports of suspicious activity from that IP address.
> That won't absolutely indicate that the message is spam,
> but it might be worth adding a score of 1 or 2 points.

Or it's the first time the service sees <insert legit newsletter sender>
sending out their newsletters. ;)

-- 
Randomly Generated Tagline:
"It's stupid to slap a table ... "          - Prof. Long

Re: Image spams getting thru

Posted by Logan Shaw <ls...@emitinc.com>.

On Tue, 1 Aug 2006, John D. Hardin wrote:
> On Tue, 1 Aug 2006, John Rudd wrote:

>> They don't really even have to "queue".  They just have to retry.

>> It's a lightweight solution to getting around greylisting.

> Crap. That's good.

Yeah, it would be a very simple way of getting around
greylisting.

However, don't assume that it kills the benefit of greylisting
completely:  if you can delay processing that questionable
message for 30 minutes or an hour, that greatly increases the
chances it will end up on a realtime blacklist of some type.
Basically, even though this reduces the effectiveness of
greylisting, greylisting will take away the element of surprise,
which could be valuable.

Now, thinking of realtime blacklists in combination with
greylisting has got me thinking of a strange concept.  Might be
new or might not be, but I'll mention it just in case.  When a
spammer sends out spam, each computer they're using to send it
(whether zombie, open relay, or whatever) will be sending out
zillions of messages.  And greylisting at an individual site
tracks sources of messages, but only tracks based on traffic
at an individual site.

So here's the idea:  what if a greylist server filed a report
in a distributed database every time it saw a message from
an unknown sender (and tempfailed it)?  So, for example,
a spammer's zombie at 1.2.3.4 sends to acme.com.  acme.com
greylists it since it doesn't know 1.2.3.4 and files a report
with the realtime distributed database.  Then foo.com also
receives a message from 1.2.3.4.  It's also an unknown source
for foo.com, so it files a report with the same database.
More and more sites keep getting connections from 1.2.3.4,
and all the ones that don't recognize 1.2.3.4 as having a
history with them all file reports of suspicious activity.

Then the spammer goes for a second pass through the list to
try to defeat greylisting.  The servers that had greylisted
the messages will receive it again but will check the
distributed database.  The distributed database will have a
zillion reports of suspicious activity from that IP address.
That won't absolutely indicate that the message is spam,
but it might be worth adding a score of 1 or 2 points.

Like DCC, this would sometimes penalize legitimate bulk mail
(whenever a new server appears on the internet and starts
sending en masse immediately, it would be penalized).  But if
it's part of a larger strategy, could it be useful?  It seems
like it would do a fairly good job of automatically detecting
bulk senders.  For what it's worth, the distributed database
could also keep track of IP addresses that the individual sites'
greylists *did* recognize, so that something would only be
considered spam if (say) 95% of the sites reporting on that
address didn't recognize it.
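
To make the idea concrete, here is a toy in-memory sketch in Python. It is not an existing service or protocol: the class name, the `min_reports` threshold and the 1.5 point penalty are assumptions chosen to match the "1 or 2 points" and "95%" figures above.

```python
class GreylistReports:
    """Toy stand-in for the shared database of greylist sightings."""

    def __init__(self, min_reports=20, unknown_ratio=0.95, penalty=1.5):
        self.min_reports = min_reports      # ignore IPs with too few reports (assumed)
        self.unknown_ratio = unknown_ratio  # share of sites that must not know the IP
        self.penalty = penalty              # points to add to the spam score
        self.unknown = {}                   # ip -> sites that did not recognise it
        self.known = {}                     # ip -> sites that did recognise it

    def report(self, ip, site, recognised):
        """Record whether `site` already had a history with `ip`."""
        if recognised:
            target, other = self.known, self.unknown
        else:
            target, other = self.unknown, self.known
        target.setdefault(ip, set()).add(site)
        other.setdefault(ip, set()).discard(site)

    def score(self, ip):
        """Return the penalty if nearly every reporting site saw `ip` as a stranger."""
        unknown = len(self.unknown.get(ip, ()))
        known = len(self.known.get(ip, ()))
        total = unknown + known
        if total < self.min_reports:
            return 0.0
        return self.penalty if unknown / total >= self.unknown_ratio else 0.0

db = GreylistReports()
for n in range(30):
    db.report("1.2.3.4", "site%d.example" % n, recognised=False)
print(db.score("1.2.3.4"))   # 1.5
```

Counting distinct reporting sites rather than raw reports keeps one busy site from dominating the result.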

   - Logan

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Tue, 1 Aug 2006, John Rudd wrote:

> They don't really even have to "queue".  They just have to retry.

...

> It's a lightweight solution to getting around greylisting.

Crap. That's good.

I suppose one way around it might be to hardfail if the far end is
retrying too quickly or too many times during the greylist TMPFAIL 
period.

There doesn't appear to be an option to do that presently.

This would require some careful configuration, though; the danger of
rejecting legitimate mail due to an overly-aggressive retry schedule
appears great.
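
A minimal sketch of what such an option could do, in Python. This is not a milter-greylist feature or configuration; the class, the `max_early_retries` limit and the numeric return values are assumptions for illustration, and it only covers the "too many times" half of the idea.

```python
class RetryAwareGreylist:
    """Greylist that permanently rejects senders who hammer during the delay window."""

    def __init__(self, delay=1800, max_early_retries=5):
        self.delay = delay                          # seconds a new triplet must wait
        self.max_early_retries = max_early_retries  # tolerated retries inside the window
        self.first_seen = {}                        # triplet -> time of first attempt
        self.early_retries = {}                     # triplet -> retries seen too early
        self.blocked = set()                        # triplets that were hardfailed

    def check(self, triplet, now):
        """Return an SMTP-style reply class: 250 accept, 450 tempfail, 550 hardfail."""
        if triplet in self.blocked:
            return 550
        first = self.first_seen.get(triplet)
        if first is None:
            self.first_seen[triplet] = now
            self.early_retries[triplet] = 0
            return 450
        if now - first >= self.delay:
            return 250
        self.early_retries[triplet] += 1
        if self.early_retries[triplet] > self.max_early_retries:
            self.blocked.add(triplet)
            return 550
        return 450

greylist = RetryAwareGreylist(delay=1800, max_early_retries=2)
pushy = ("192.0.2.7", "a@example.org", "b@example.net")
print([greylist.check(pushy, t) for t in (0, 10, 20, 30)])   # [450, 450, 450, 550]
```

The limit would have to be generous, since some legitimate MTAs retry within a minute or two of the first tempfail.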

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  There is no doubt in my mind that millions of lives could have been
  saved if the people were not "brainwashed" about gun ownership and
  had been well armed. ... Gun haters always want to forget the Warsaw
  Ghetto uprising, which is a perfect example of how a ragtag,
  half-starved group of Jews took 10 handguns and made asses out of
  the Nazis.
                                     -- Theodore Haas, Dachau Survivor
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by John Rudd <jr...@ucsc.edu>.

On Aug 1, 2006, at 9:53 AM, Theo Van Dinter wrote:

> On Tue, Aug 01, 2006 at 09:24:55AM -0700, John D. Hardin wrote:
>>> How many spams would really come back. max 20%
>> There is a much lighter-weight and more global way to achieve that:
>> standard greylisting.
>
> Well, until greylisting becomes enough of a problem that the spammers 
> change
> their software to queue and retry, thereby eliminating the benefit 
> completely.

They don't really even have to "queue".  They just have to retry.

I started doing a tempfail on any host that doesn't have reverse dns 
(tempfail instead of hardfail in case it's a transient DNS issue).  The 
other day I got 2500 attempts from one such host.  I'm willing to bet 
they were doing something like this:

1) run through my list of recipients
    a) if I get to deliver, take that recipient off the list
    b) if I get a permanent failure, take that recipient off of the list
    c) otherwise, keep them on the list but move on to the next recipient

2) when I get to the end of the list, go through the list again with my 
smaller list of recipients that got tempfailed the first time
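
The steps above can be sketched in a few lines of Python. This is a guess at the sender's logic, not recovered spamware; `retry_campaign` and the "ok" / "perm" / "temp" result strings are hypothetical.

```python
def retry_campaign(recipients, try_deliver, max_passes=10):
    """Walk the recipient list repeatedly, keeping only tempfailed addresses."""
    pending = list(recipients)
    attempts = 0
    for _ in range(max_passes):
        if not pending:
            break
        still_pending = []
        for rcpt in pending:
            attempts += 1
            result = try_deliver(rcpt)      # "ok", "perm" or "temp"
            if result == "temp":
                still_pending.append(rcpt)  # step c: keep and move on
            # "ok" and "perm" both drop the recipient (steps a and b)
        pending = still_pending             # step 2: next pass over the shorter list
    return attempts, pending

def always_tempfail(rcpt):
    """A server that never lets the message through, like the no-rDNS check."""
    return "temp"

print(retry_campaign(["a", "b", "c"], always_tempfail, max_passes=4))
# (12, ['a', 'b', 'c'])
```

Only the recipient list is kept between passes, which is why this costs the sender so little compared to a real message queue.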

No queue of messages.  Just retry everyone who tempfailed, over and 
over again until you get past the greylist.  Only, I'm not greylisting, 
so I just got hit over and over again.

It's a lightweight solution to getting around greylisting.  It might 
need some refinement, though I won't say here what that refinement 
is.

(though, I suppose you could say that's a queue of recipients, but I 
tend to think of "queue" in the email sense as a queue of messages ... 
which I don't expect to ever be a successful spam strategy for zombied 
PC's -- it will use too many resources, and thus be too likely to 
attract the attention of the user/owner)

Luckily, I do this check early enough in the SMTP session that it 
didn't really tie up much of the actual system resources.  (I do it in 
MIMEDefang, in filter_sender, so right after the "mail from:" stage of 
the SMTP session)
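
The decision itself looks roughly like this, sketched in Python instead of the Perl that MIMEDefang filters are written in. `sender_check` is a made-up name and the return value is not MIMEDefang's API; the resolver is passed in so the logic can be tried without touching DNS.

```python
import socket

def sender_check(client_ip, resolve=socket.gethostbyaddr):
    """Tempfail (450) when the connecting IP has no reverse DNS, else continue (250)."""
    try:
        hostname = resolve(client_ip)[0]
    except (socket.herror, socket.gaierror):
        # Tempfail rather than hardfail in case the lookup failure is transient.
        return (450, "No reverse DNS for %s, try again later" % client_ip)
    return (250, hostname)

def no_ptr(ip):
    """Fake resolver that behaves like a host without a PTR record."""
    raise socket.herror(1, "Unknown host")

print(sender_check("192.0.2.1", resolve=no_ptr)[0])   # 450
```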

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Tue, 1 Aug 2006, Theo Van Dinter wrote:

> On Tue, Aug 01, 2006 at 09:24:55AM -0700, John D. Hardin wrote:
> > > How many spams would really come back. max 20% 
> > There is a much lighter-weight and more global way to achieve that:
> > standard greylisting. 
> 
> Well, until greylisting becomes enough of a problem that the
> spammers change their software to queue and retry, thereby
> eliminating the benefit completely.

Granted.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 It may be possible to start a programme of weapon registration as a
 first step towards the physical collection phase. ... Assurances
 must be provided, and met, that the process of registration will
 not lead to immediate weapons seizures by security forces.
                      -- the UN, who "doesn't want to confiscate guns"
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by Theo Van Dinter <fe...@apache.org>.

On Tue, Aug 01, 2006 at 09:24:55AM -0700, John D. Hardin wrote:
> > How many spams would really come back. max 20% 
> There is a much lighter-weight and more global way to achieve that:
> standard greylisting. 

Well, until greylisting becomes enough of a problem that the spammers change
their software to queue and retry, thereby eliminating the benefit completely.

-- 
Randomly Generated Tagline:
"First learn computer science and all the theory. Next develop a
 programming style.  Then forget all that and just hack."
                           - George Carrette

Re: Image spams getting thru

Posted by Logan Shaw <ls...@emitinc.com>.

On Tue, 1 Aug 2006, John D. Hardin wrote:
> On Tue, 1 Aug 2006, Ramprasad wrote:

>>   How about sending "450 Please Try later" to every mail with an
>> inline image and then somehow verify if it really comes back.

> If some spammer MTAs are going to only try delivery once, why expend
> heavy resources on your end (a full SA scan) to decide whether to
> TMPFAIL the message just to see if they do? Just install
> milter-greylist and lose *all* of the lazy-spammer traffic regardless
> of whether or not it is multi-image-only format.

The two approaches have different costs but also different
benefits.  Content scan before tempfail has the benefit that
it reduces the set of messages for which there is a delay.
Pure greylist has the benefit that it saves the work of doing
content scans.

Basically, doing a content scan before tempfail gives you
some extra benefits but has some extra costs.  Whether it's
an appropriate solution depends on whether those benefits
(reduced chances of a legit message being delayed) are worth
the cost in CPU time and network bandwidth.  And that depends
on your situation.

If you are a small organization with an underutilized server
(say, a modern machine that handles only 5000 messages a day),
you might be willing to use double or triple the CPU time and
double or triple the bandwidth to improve your spam detection
accuracy from (say) 97% to 99%.  If you are a large ISP with
servers that keep up with their load but don't have many
resources to spare, it might be important to you to reduce
the load on your servers.

   - Logan

Re: Image spams getting thru

Posted by Bill Landry <bi...@pointshare.com>.

----- Original Message ----- 
From: "Jim Maul" <jm...@elih.org>

> John D. Hardin wrote:
>> On Tue, 1 Aug 2006, Ramprasad wrote:
>>
>>>   How about sending "450 Please Try later" to every mail with an
>>> inline image and then somehow verify if it really comes back.
>>> (Obviously not my original idea :-) )
>>
>> The problem there, again, is that you've already used the bandwidth
>> and system resources needed to receive and scan the message. Why
>> explicitly say "please re-send the message later, I'd like to use my
>> bandwidth and CPU resources to process it again"? Would the benefit
>> outweigh the cost?
>>
>> Then add in the infrastructure and long-term resources needed to
>> determine whether you've seen the message before and make a decision
>> based on that data.
>>
>>> How many spams would really come back. max 20%
>>
>> There is a much lighter-weight and more global way to achieve that:
>> standard greylisting.
>
> I'm curious how many organizations that aren't ISPs are using some sort of 
> greylisting.  Do your "users" complain when the email they sent to a 
> fellow employee 17 seconds ago didn't arrive yet?  We hear all sorts of 
> shit when things like that happen.  Try explaining greylisting and spam to 
> some ICU nurse who really doesn't care.  All she knows is that we didn't 
> have this "problem" when we paid to outsource our email.  For us, and I'm 
> sure many others as well, greylisting is just not realistic.

Hmmm, strange, all of our customers are healthcare (hospitals, clinics, 
payers, specialists, etc.), and they love our service.  We have been using 
greylisting for about 1.5 years now, and it has dramatically decreased our 
spam filtering and virus scanning load.

Bill

Re: Image spams getting thru

Posted by Jim Maul <jm...@elih.org>.

Ken A wrote:
> 
> 
> Jim Maul wrote:
>> John D. Hardin wrote:
>>> On Tue, 1 Aug 2006, Ramprasad wrote:
>>>
>>>>   How about sending "450 Please Try later" to every mail with an
>>>> inline image and then somehow verify if it really comes back.
>>>> (Obviously not my original idea :-) )
>>>
>>> The problem there, again, is that you've already used the bandwidth
>>> and system resources needed to receive and scan the message. Why
>>> explicitly say "please re-send the message later, I'd like to use my
>>> bandwidth and CPU resources to process it again"? Would the benefit
>>> outweigh the cost?
>>>
>>> Then add in the infrastructure and long-term resources needed to
>>> determine whether you've seen the message before and make a decision
>>> based on that data.
>>>
>>>> How many spams would really come back. max 20% 
>>>
>>> There is a much lighter-weight and more global way to achieve that:
>>> standard greylisting.
>>
>> I'm curious how many organizations that aren't ISPs are using some sort 
>> of greylisting.  Do your "users" complain when the email they sent to 
>> a fellow employee 17 seconds ago didn't arrive yet?  We hear all sorts 
>> of shit when things like that happen.  Try explaining greylisting and 
>> spam to some ICU nurse who really doesn't care.  All she knows is that 
>> we didn't have this "problem" when we paid to outsource our email.  For 
>> us, and I'm sure many others as well, greylisting is just not realistic.
> 
> 
> Well, you don't have to use it on internal mail. That's just a 
> configuration issue.
> Ken
> Pacific.Net
> 
>

True, and we would if we chose to use it at all.  My example was a 
little too generic I suppose.  We regularly have employees that use 
email as an instant messenger type of service with insurance companies, 
patients, doctors' offices, etc.  For them, and ultimately us, the delay 
is simply not an option.

-Jim

Re: Image spams getting thru

Posted by Ken A <ka...@pacific.net>.


Jim Maul wrote:
> John D. Hardin wrote:
>> On Tue, 1 Aug 2006, Ramprasad wrote:
>>
>>>   How about sending "450 Please Try later" to every mail with an
>>> inline image and then somehow verify if it really comes back.
>>> (Obviously not my original idea :-) )
>>
>> The problem there, again, is that you've already used the bandwidth
>> and system resources needed to receive and scan the message. Why
>> explicitly say "please re-send the message later, I'd like to use my
>> bandwidth and CPU resources to process it again"? Would the benefit
>> outweigh the cost?
>>
>> Then add in the infrastructure and long-term resources needed to
>> determine whether you've seen the message before and make a decision
>> based on that data.
>>
>>> How many spams would really come back. max 20% 
>>
>> There is a much lighter-weight and more global way to achieve that:
>> standard greylisting.
> 
> I'm curious how many organizations that aren't ISPs are using some sort of 
> greylisting.  Do your "users" complain when the email they sent to a 
> fellow employee 17 seconds ago didn't arrive yet?  We hear all sorts of 
> shit when things like that happen.  Try explaining greylisting and spam 
> to some ICU nurse who really doesn't care.  All she knows is that we 
> didn't have this "problem" when we paid to outsource our email.  For us, 
> and I'm sure many others as well, greylisting is just not realistic.


Well, you don't have to use it on internal mail. That's just a 
configuration issue.
Ken
Pacific.Net


> 
> -Jim
>

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Tue, 1 Aug 2006, Jim Maul wrote:

> > There is a much lighter-weight and more global way to achieve that:
> > standard greylisting. 
> 
> I'm curious how many organizations that aren't ISPs are using some sort of 
> greylisting.  Do your "users" complain when the email they sent to a 
> fellow employee 17 seconds ago didn't arrive yet?

I wouldn't greylist local mail.

And if people complain about *external* mail, I give them my
email-is-only-best-effort speech.

> We hear all sorts of shit when things like that happen.  Try
> explaining greylisting and spam to some ICU nurse who really
> doesn't care.  All she knows is that we didn't have this "problem"
> when we paid to outsource our email.  For us, and I'm sure many
> others as well, greylisting is just not realistic.

That's where having an intelligent administrator comes in. If you
regularly exchange mail with known sites and can't afford delays in
communicating with them, then tell your systems that you trust them -
put them in the greylist exclusion list, add them to the list of sites
that can send you executable attachments, and so forth. Reserve your
maximum paranoia for the Great Unwashed Internet.
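
In code terms the exclusion check is tiny. A sketch in Python, with made-up documentation-range networks standing in for the partner sites (real greylisting packages have their own whitelist syntax, which this does not reproduce):

```python
import ipaddress

# Hypothetical partner networks that should never be delayed.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def skip_greylisting(client_ip, trusted=TRUSTED_NETWORKS):
    """True if the connecting address belongs to a site we have chosen to trust."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in trusted)

print(skip_greylisting("192.0.2.25"))    # True
print(skip_greylisting("203.0.113.9"))   # False
```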

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 It may be possible to start a programme of weapon registration as a
 first step towards the physical collection phase. ... Assurances
 must be provided, and met, that the process of registration will
 not lead to immediate weapons seizures by security forces.
                      -- the UN, who "doesn't want to confiscate guns"
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by Jim Maul <jm...@elih.org>.

John D. Hardin wrote:
> On Tue, 1 Aug 2006, Ramprasad wrote:
> 
>>   How about sending "450 Please Try later" to every mail with an
>> inline image and then somehow verify if it really comes back.
>> (Obviously not my original idea :-) )
> 
> The problem there, again, is that you've already used the bandwidth
> and system resources needed to receive and scan the message. Why
> explicitly say "please re-send the message later, I'd like to use my
> bandwidth and CPU resources to process it again"? Would the benefit
> outweigh the cost?
> 
> Then add in the infrastructure and long-term resources needed to
> determine whether you've seen the message before and make a decision
> based on that data.
> 
>> How many spams would really come back. max 20% 
> 
> There is a much lighter-weight and more global way to achieve that:
> standard greylisting. 
> 

I'm curious how many organizations that aren't ISPs are using some sort of 
greylisting.  Do your "users" complain when the email they sent to a 
fellow employee 17 seconds ago didn't arrive yet?  We hear all sorts of 
shit when things like that happen.  Try explaining greylisting and spam 
to some ICU nurse who really doesn't care.  All she knows is that we 
didn't have this "problem" when we paid to outsource our email.  For us, 
and I'm sure many others as well, greylisting is just not realistic.

-Jim

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Tue, 1 Aug 2006, Ramprasad wrote:

>   How about sending "450 Please Try later" to every mail with an
> inline image and then somehow verify if it really comes back.
> (Obviously not my original idea :-) )

The problem there, again, is that you've already used the bandwidth
and system resources needed to receive and scan the message. Why
explicitly say "please re-send the message later, I'd like to use my
bandwidth and CPU resources to process it again"? Would the benefit
outweigh the cost?

Then add in the infrastructure and long-term resources needed to
determine whether you've seen the message before and make a decision
based on that data.

> How many spams would really come back? Max 20%

There is a much lighter-weight and more global way to achieve that:
standard greylisting. 

If some spammer MTAs are going to only try delivery once, why expend
heavy resources on your end (a full SA scan) to decide whether to
TMPFAIL the message just to see if they do? Just install
milter-greylist and lose *all* of the lazy-spammer traffic regardless
of whether or not it is multi-image-only format.
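
For readers unfamiliar with the mechanism, here is a minimal sketch of
the greylisting decision itself. It is not how milter-greylist is
implemented; the class name, the in-memory dictionary, the 300-second
delay and the reply strings are all assumptions made for illustration.
The point is only that the decision needs the (client IP, sender,
recipient) triplet, all of which are known before the message body is
received, so no content scan is needed.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient).

    Hypothetical illustration only; a real deployment would persist
    state, expire old entries and support whitelists.
    """

    def __init__(self, min_delay=300, now=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.now = now              # injectable clock, handy for testing
        self.first_seen = {}        # triplet -> time of first delivery attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return the SMTP reply to give at RCPT TO time."""
        key = (client_ip, mail_from.lower(), rcpt_to.lower())
        t = self.now()
        # Record the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, t)
        if t - first < self.min_delay:
            return "450 4.7.1 Greylisted, please try again later"
        return "250 OK"
```

A sender that never retries stays at the 450 reply forever; a normal
MTA retries after its queue interval and is accepted once the delay has
passed.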

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 It may be possible to start a programme of weapon registration as a
 first step towards the physical collection phase. ... Assurances
 must be provided, and met, that the process of registration will
 not lead to immediate weapons seizures by security forces.
                      -- the UN, who "doesn't want to confiscate guns"
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by Ramprasad <ra...@netcore.co.in>.

  How about sending "450 Please Try later" to every mail with an inline
image and then somehow verify if it really comes back. (Obviously not my
original idea  :-) )

How many spams would really come back? Max 20% .. those which are routed
thru zombies

Thanks
Ram

Re: Image spams getting thru

Posted by Derek Harding <de...@innovyx.com>.

Rob Mangiafico wrote:
> Anyone else find this to be a good rule to catch these image stock spams 
> without too much collateral damage?
>
>   
After writing this I did some checks on the SA public corpus. The rule 
didn't hit on any of the hard ham. It didn't hit much of the spam either 
since very little of that is image spam.

Regarding SARE it has SARE_GIF_ATTACH which matches on any email that 
has an attached image. My rule only matches on email that has an 
attached image that is referenced in the HTML.
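
The distinction between "has an attached image" and "has an attached
image that the HTML references" can be sketched outside SpamAssassin
with Python's standard email package. This is an illustration only, not
part of either rule set; the function name is made up, and the stricter
check that the cid: reference actually matches an image part's
Content-ID goes beyond what the rawbody rule does.

```python
import re
from email import message_from_string

# Same pattern as the rawbody rule, with a capture group for the cid value.
CID_SRC = re.compile(r'src\s*=\s*["\']cid:([^"\']+)', re.I)


def referenced_inline_images(raw_message):
    """Return the Content-IDs of image parts that the HTML part refers to."""
    msg = message_from_string(raw_message)
    referenced = set()  # cid values used in src="cid:..." attributes
    attached = set()    # Content-ID values of image/* parts
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            html = part.get_payload(decode=True).decode("latin-1")
            referenced.update(CID_SRC.findall(html))
        elif part.get_content_maintype() == "image":
            cid = part.get("Content-ID", "").strip().strip("<>")
            if cid:
                attached.add(cid)
    return referenced & attached
```

A message with a plain image attachment and no cid: reference yields an
empty set here, which is the case SARE_GIF_ATTACH would still hit.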

I'm finding it to be very successful and am interested in what others find.

Derek

Re: Image spams getting thru

Posted by Bill Randle <bi...@neocat.org>.

On Tue, 2006-08-01 at 18:02 -0700, jdow wrote:
> From: "Rob Mangiafico" <rm...@lexiconn.com>
> 
> > On Mon, 31 Jul 2006, Derek Harding wrote:
> >> rawbody INLINE_IMAGE    /src\s*=\s*["']cid:/i
> >> describe INLINE_IMAGE   Inline Images
> >> score INLINE_IMAGE 1.5
> >> 
> >> I haven't tested this against the SA corpus so YMMV.
> > 
> > Anyone else find this to be a good rule to catch these image stock spams 
> > without too much collateral damage?
> 
> Unless it is hidden in SARE rules some place I have not tried it. That
> would likely detect ANY embedded image, which would be bad.

One thing I've noticed with most of the image spam is that there's
a TAB character after the "Date:" string of the Date header, e.g.:
	Date:<tab>Wed, 2 Aug 2006 01:42:58 -0900

I haven't seen this in other emails (usually, it's a space character).
It may not be safe to use by itself, but in combination with other
rules may be helpful.
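
A quick way to check a saved message for this trait, sketched in Python
rather than as a SpamAssassin rule (the function name is hypothetical,
and this only looks at the raw header block, before any unfolding or
normalisation):

```python
import re

# "Date:" at the start of a header line, immediately followed by a TAB.
DATE_TAB = re.compile(r"^Date:\t", re.M | re.I)


def date_header_has_tab(raw_message):
    """True if the Date header's colon is followed by a TAB, not a space."""
    # Only inspect the header block, i.e. everything before the first blank line.
    headers = re.split(r"\r?\n\r?\n", raw_message, maxsplit=1)[0]
    return DATE_TAB.search(headers) is not None
```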

	-Bill

Re: Image spams getting thru

Posted by jdow <jd...@earthlink.net>.

From: "Rob Mangiafico" <rm...@lexiconn.com>

> On Mon, 31 Jul 2006, Derek Harding wrote:
>> rawbody INLINE_IMAGE    /src\s*=\s*["']cid:/i
>> describe INLINE_IMAGE   Inline Images
>> score INLINE_IMAGE 1.5
>> 
>> I haven't tested this against the SA corpus so YMMV.
> 
> Anyone else find this to be a good rule to catch these image stock spams 
> without too much collateral damage?

Unless it is hidden in SARE rules some place I have not tried it. That
would likely detect ANY embedded image, which would be bad.

{^_^}

Re: Image spams getting thru

Posted by Rob Mangiafico <rm...@lexiconn.com>.

On Mon, 31 Jul 2006, Derek Harding wrote:
> rawbody INLINE_IMAGE    /src\s*=\s*["']cid:/i
> describe INLINE_IMAGE   Inline Images
> score INLINE_IMAGE 1.5
> 
> I haven't tested this against the SA corpus so YMMV.

Anyone else find this to be a good rule to catch these image stock spams 
without too much collateral damage?

Rob

Re: Image spams getting thru

Posted by Derek Harding <de...@innovyx.com>.

On Mon, 2006-07-31 at 19:03 -0500, Tim wrote:
>   Thanks for the tip.  That sounds pretty effective, actually.  Care to
> share your rule?

Sure thing:

rawbody INLINE_IMAGE    /src\s*=\s*["']cid:/i
describe INLINE_IMAGE   Inline Images
score INLINE_IMAGE 1.5

I haven't tested this against the SA corpus so YMMV.

Derek

Re: Image spams getting thru

Posted by Tim <ti...@sleepy.wojomedia.com>.

On Mon, Jul 31, 2006 at 04:57:49PM -0700, Derek Harding wrote:
> At my (small) site we receive very few legitimate emails that have
> attached images that are referenced in the HTML of the message. It's
> basically only a few droolers who decided to use an image as their sig.
> Thus testing for /src\s*=\s*["']cid:/i in the rawbody of the message is
> working very nicely against image spams.
>
> False positives on those people with image sigs are prevented by AWL,
> Bayes and not scoring the test too highly.
> 
> Is that positive enough?

  Thanks for the tip.  That sounds pretty effective, actually.  Care to
share your rule?

  You just gave me an idea though.  I think I am going to set up a
maybe-spam folder and put your rules in it.  That'll keep most of
these away from my INBOX and also keep me from deleting my spam folder
without looking.

  Thanks,

  Tim

Re: Image spams getting thru

Posted by Derek Harding <de...@innovyx.com>.

On Mon, 2006-07-31 at 18:34 -0500, Tim wrote:
> 
>   But I find it amusing that people here are more interested in telling
> spammers how they can defeat an algorithm instead of the other
> way around.  99% of the techniques in SpamAssassin have an easy
> workaround - does that stop anybody from using them?

At my (small) site we receive very few legitimate emails that have
attached images that are referenced in the HTML of the message. It's
basically only a few droolers who decided to use an image as their sig.
Thus testing for /src\s*=\s*["']cid:/i in the rawbody of the message is
working very nicely against image spams.

False positives on those people with image sigs are prevented by AWL,
Bayes and not scoring the test too highly.

Is that positive enough?

Derek

Re: Image spams getting thru

Posted by Tim <ti...@sleepy.wojomedia.com>.

On Mon, Jul 31, 2006 at 03:45:05PM -0500, Logan Shaw wrote:
> On Mon, 31 Jul 2006, jdow wrote:
> >Break the image into pieces. If too many pieces match on MD5 sum then
> >you score it higher than if lots of the image is different. But that
> >can get tedious to say the least.
> 
> And there's also an easy way around it.  Simply add noise to
> the image.

  Then start using techniques that compare similarities of images.

  There is probably not ever going to be a technique that can
completely defeat spam without drastically changing the way e-mail and
Internet works.  All we can do is to try to stay ahead of the spammers.
If that wasn't the case, there would never be new rules coming out
for SpamAssassin.

  But I find it amusing that people here are more interested in telling
spammers how they can defeat an algorithm instead of the other
way around.  99% of the techniques in SpamAssassin have an easy
workaround - does that stop anybody from using them?

  Tim

Re: Image spams getting thru

Posted by Logan Shaw <ls...@emitinc.com>.

On Mon, 31 Jul 2006, jdow wrote:
> Break the image into pieces. If too many pieces match on MD5 sum then
> you score it higher than if lots of the image is different. But that
> can get tedious to say the least.

And there's also an easy way around it.  Simply add noise to
the image.  There are a number of techniques, but an obvious
one to use with GIF is to assign two palette entries to
two nearly (but not quite) identical colors.  For example,
put 0xffffff and 0xfffeff in your palette.  Then, for every
white pixel in the original image, choose at random whether it
gets represented by a 0xffffff or 0xfffeff pixel.  There will
be virtually no discernible difference to the eye, but the
files will be completely different, especially since GIF uses
LZW compression on the pixel data.

There are similar methods for other formats:  with JPEG, you
can just change the quality settings, causing the JPEG encoder
itself to add noise to your image.  (And perfectly legit noise,
too, since the quality parameters vary on legit images.)

And of course you can just add noise to the least significant
bit in any generic format as well.
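
The two-palette-entry trick is easy to reproduce on toy data. The
sketch below treats an image as a flat list of 24-bit RGB integers (an
assumption; no real GIF encoding or LZW is involved) and shows that an
exact MD5 changes while a digest taken after dropping the low bits of
each channel does not. The coarse digest is a swapped-in idea for
illustration, not something proposed in this thread, and it fails
whenever the two colors straddle a quantisation boundary.

```python
import hashlib
import random


def add_palette_noise(pixels, rng):
    """Replace each pure-white pixel by 0xffffff or 0xfffeff at random."""
    return [rng.choice((0xFFFFFF, 0xFFFEFF)) if p == 0xFFFFFF else p
            for p in pixels]


def exact_digest(pixels):
    """MD5 over the exact RGB bytes of every pixel."""
    data = b"".join(p.to_bytes(3, "big") for p in pixels)
    return hashlib.md5(data).hexdigest()


def coarse_digest(pixels, drop_bits=2):
    """MD5 after zeroing the low `drop_bits` bits of each 8-bit channel."""
    mask = (0xFF >> drop_bits) << drop_bits
    mask24 = (mask << 16) | (mask << 8) | mask
    data = b"".join((p & mask24).to_bytes(3, "big") for p in pixels)
    return hashlib.md5(data).hexdigest()
```

For example, with `image = [0xFFFFFF] * 64 + [0x000000] * 64` and
`noisy = add_palette_noise(image, random.Random(1))`, the exact digests
of `image` and `noisy` differ while their coarse digests agree.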

   - Logan

Re: Image spams getting thru

Posted by jdow <jd...@earthlink.net>.

From: <ha...@t-online.de>

>>>
>>> > On Mon, Jul 31, 2006 at 01:57:52PM +0530, Ramprasad wrote:
>>> >> So if the spammer keeps generating different images for every spam mail
>>> >> then DCC RAZOR etc would be useless right ?
>>> >
>>> >   An image is just content - much like text or HTML.  How useful
>>> > DCC/RAZOR/etc. would be depends highly on how they are used and
>>> > on how sophisticated the spammer is.  What I suggested is not the
>>> > end-it-all solution for spam detection but another tool to add to
>>> > the spamassassin toolbox.
>>> >
>>> >   Also, generating new images potentially is computationally expensive
>>> > enough that most spammers wouldn't try it.
>>> >
>>> >   Over 50% of my false negatives this week would have been properly
>>> > identified by IDing the image.  YMMV.
>>> >
>>> >   Tim
>>> >
>>>
>>> A few months ago I played around with a plugin that computed MD5 hashes
>>> from images contained in a mail and compared that sum to a RBL-like
>>> DNS-based database maintained by Will Stearns.
>>> Results were somewhat disappointing. If Will still feeds the zone I can
>>> post the code somewhere
>>>
>>> Another idea was to check the images for correctness. Some spammers seem
>>> to use slightly modified copies of a master image. These copies are
>>> displayed correctly by the usual MUAs but they do contain errors that show
>>> up when using Image::Info or something.
>>>
>>> Dirk
>>>
>
> Hi,
>
> this should be possible to detect, but at least gif format can be modified easily
> without introducing errors: just play with unused colormap entries.
> An algorithm that actually renders the image (eg converts it to pbm) before the md5
> would recognize images as the same while plain md5 will consider them different

Break the image into pieces. If too many pieces match on MD5 sum then
you score it higher than if lots of the image is different. But that
can get tedious to say the least.
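
A rough sketch of the piece-wise idea, assuming the image has already
been decoded into grayscale scanlines (a list of equal-length bytes
objects). The 8x8 tile size and both function names are arbitrary
choices for illustration; how the matching fraction would map to a
score is left open.

```python
import hashlib


def tile_digests(rows, tile=8):
    """MD5 of each tile x tile block, scanning left to right, top to bottom."""
    digests = []
    for y in range(0, len(rows), tile):
        for x in range(0, len(rows[0]), tile):
            block = b"".join(r[x:x + tile] for r in rows[y:y + tile])
            digests.append(hashlib.md5(block).hexdigest())
    return digests


def matching_fraction(rows_a, rows_b, tile=8):
    """Fraction of tiles whose MD5 is identical in both images."""
    a = tile_digests(rows_a, tile)
    b = tile_digests(rows_b, tile)
    if not a or len(a) != len(b):
        return 0.0  # empty or differently sized images: treat as no match
    same = sum(1 for da, db in zip(a, b) if da == db)
    return same / len(a)
```

Changing a single pixel only invalidates the tile that contains it, so
a mostly unchanged image still matches on most tiles. Noise spread over
the whole image defeats this just as it defeats a whole-file hash.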

{^_^}