Posted to users@spamassassin.apache.org by Ramprasad <ra...@netcore.co.in> on 2006/07/28 13:14:32 UTC

Image spams getting thru

I am suddenly facing a lot of image spams from a pretty efficient
spammer. The IPs he is using are not listed anywhere.

All the spams advertise the stock HLUN.PK. Am I alone in facing this problem?
Also, I found that the From header in all the mails contains a typical
repeated string.

Like these:

From: Rory [mailto:cardiac.cardiac@allergist.com]
From: Barbra [mailto:adjudge.adjudge@tokyo.com]
From: Ada [mailto:asbestos.asbestos@umpire.com]
From: Hattie [mailto:coyote.coyote@playful.com]
From: Stacy [mailto:allot.allot@insurer.com]
From: Lynne [mailto:conn.conn@munich.com]
From: Juliet [mailto:descant.descant@technologist.com]
From: Genevieve [mailto:blanche.blanche@rescueteam.com]
From: Aisha [mailto:acrobat.acrobat@australiamail.com]
From: Monique [mailto:affront.affront@clerk.com]
From: Kirsten [mailto:delete.delete@pediatrician.com]
From: Pablo [mailto:believe.believe@gardener.com]
From: Sadie [mailto:crockery.crockery@insurer.com]


Can I write a ruleset to hit these From headers?


Thanks
Ram

Re: Image spams getting thru

Posted by Raymond Dijkxhoorn <ra...@prolocation.net>.

Hi!

> All spams advertising stocks of HLUN.PK Am I alone facing this problem.
> Also I found that the From header  in all mails is a typical repeated
> string

No, this is seen all over.

Does anyone have a nice rule?

Bye,
Raymond.

Re: Image spams getting thru

Posted by MennovB <mv...@xs4all.nl>.


jdow wrote:
> 
> One that made it through here had no URLs in the body, a LOT of HTML
> formatting, and hit HTML_IMAGE_RATIO_06, a very low scoring rule.
> The HTML formatting is excessive use of this long string for
> individually formatting small chunks of text which are then covered
> by the enclosed Base64 image:
> <p class=3DMsoNormal><font size=3D2 face=3DArial>
> <span lang=3DEN-US = style=3D'font-size:10.0pt;font-family:Arial'>
> 
> That can probably lead to some tests.
> 
> I also noticed here that HTML_IMAGE_RATIO_06 hit 0.3 percent spam
> and 0.0 percent ham, here. So I bumped its score up a little. I expect
> that to be safe here. YMMV.
> 
> That is the only spam that has broken through in a VERY long time.
> 
Yes, if we're talking about the same spam, the one with that string started
only recently here.
They score between 7 and 15 points thanks to the network tests, but as of an
hour ago they are being discarded because, luckily, they contain several
unique strings.

Regards
Menno van Bennekom
-- 
View this message in context: http://www.nabble.com/Image-spams-getting-thru-tf2014839.html#a5589996
Sent from the SpamAssassin - Users forum at Nabble.com.

Re: Image spams getting thru

Posted by jdow <jd...@earthlink.net>.

From: "MennovB" <mv...@xs4all.nl>
> 
> These image spams have recognizable strings, but normally not in the header.
> Just collect a few of them and compare (e.g. cat|sort the lines, you will
> always find similarities (sometimes only in the Mime-part but even that can
> work nicely and safe enough).
> You could then make a Spamassassin rule for it (check them on your HAM
> first).
> The strings I'm sure enough about are not configured in SA but in Postfix
> with body_checks, if needed first I put them on HOLD to check the result a
> few days in the hold-queue then I put them on DISCARD so it is thrown away
> unnoticed. One of these newer checks 'HOLDED' 170 spams this weekend without
> FP's, not a big absolute number but there's not a lot of spam coming in
> anyway because of ip-blocks, RBL's etc in postfix.
> Only trouble is after some time they change the spam, but then already
> hundreds of spams are stopped.
> And finding a new string/regexp can be an entertaining puzzle. But some spam
> is just used over and over again so some rules still get hit after 2 years,
> very kind of the spammers..
> I check the spam (archived by SA/Amavisd) every morning and if I see more
> spam than normal and a lot of spam of the same size I know there's work to
> do ;-)

One that made it through here had no URLs in the body, a LOT of HTML
formatting, and hit HTML_IMAGE_RATIO_06, a very low scoring rule.
The HTML formatting is excessive use of this long string for
individually formatting small chunks of text which are then covered
by the enclosed Base64 image:
<p class=3DMsoNormal><font size=3D2 face=3DArial>
<span lang=3DEN-US = style=3D'font-size:10.0pt;font-family:Arial'>

That can probably lead to some tests.
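
A rough Python sketch of the kind of test this could lead to: count how often the formatting string repeats in the raw (still quoted-printable) body, and combine that with the absence of URLs. The function names and the threshold of 10 repeats are assumptions made up for illustration; they are not SpamAssassin rules and have not been checked against ham.

```python
# The quoted-printable formatting string quoted above, as it appears in the raw body.
MARKER = "<p class=3DMsoNormal><font size=3D2 face=3DArial>"


def marker_count(raw_body, marker=MARKER):
    """Count non-overlapping occurrences of the formatting string."""
    return raw_body.count(marker)


def looks_like_padded_image_spam(raw_body, min_repeats=10):
    """Heuristic: no URLs in the body, and the formatting string repeated a lot.

    min_repeats is a hypothetical threshold; tune it against your own ham.
    """
    lowered = raw_body.lower()
    has_url = "http://" in lowered or "https://" in lowered
    return not has_url and marker_count(raw_body) >= min_repeats
```

Anything like this would need to be run over a ham corpus before it gets a meaningful score.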

I also noticed that HTML_IMAGE_RATIO_06 hit 0.3 percent of spam
and 0.0 percent of ham here, so I bumped its score up a little. I expect
that to be safe here. YMMV.

That is the only spam that has broken through in a VERY long time.

{^_^}

Re: Image spams getting thru

Posted by MennovB <mv...@xs4all.nl>.

These image spams have recognizable strings, but normally not in the headers.
Just collect a few of them and compare (e.g. cat them together and sort the
lines); you will always find similarities, sometimes only in the MIME part,
but even that can work nicely and safely enough.
You could then make a SpamAssassin rule for it (check it against your ham
first).
The strings I'm sure enough about are not configured in SA but in Postfix
with body_checks. If needed, I first set them to HOLD and check the results
in the hold queue for a few days; then I switch them to DISCARD so the mail
is thrown away silently. One of these newer checks held 170 spams this
weekend without FPs. That is not a big absolute number, but not a lot of
spam comes in anyway because of IP blocks, RBLs, etc. in Postfix.
The only trouble is that after some time they change the spam, but by then
hundreds of spams have already been stopped.
And finding a new string/regexp can be an entertaining puzzle. But some spam
is just used over and over again, so some rules still get hit after 2 years;
very kind of the spammers.
I check the spam (archived by SA/Amavisd) every morning, and if I see more
spam than normal and a lot of spam of the same size, I know there's work to
do ;-)
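
The "collect a few and compare" step can be sketched in Python roughly as follows. This is only an illustration of the idea, not the exact procedure used here: the function name, the 80% fraction, and the minimum line length are all assumptions.

```python
from collections import Counter


def shared_lines(messages, min_fraction=0.8, min_length=12):
    """Return lines that occur in at least min_fraction of the messages.

    messages is a list of raw message texts. Short lines are skipped
    because they are unlikely to be distinctive enough for a body check.
    """
    counts = Counter()
    for text in messages:
        # Use a set so a line repeated inside one message counts only once.
        lines = {line.strip() for line in text.splitlines()}
        counts.update(line for line in lines if len(line) >= min_length)
    threshold = min_fraction * len(messages)
    return sorted(line for line, n in counts.items() if n >= threshold)
```

Any line this turns up is only a candidate; it still has to be checked against ham (or parked on HOLD for a while) before it is trusted.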

Regards
Menno van Bennekom 

Re: Image spams getting thru

Posted by jdow <jd...@earthlink.net>.

From: "Benny Pedersen" <me...@junc.org>

> On Fri, July 28, 2006 13:14, Ramprasad wrote:
>> From: Rory [mailto:cardiac.cardiac@allergist.com]
>> From: Barbra [mailto:adjudge.adjudge@tokyo.com]
>>
>> Can I write a ruleset to hit these froms
> 
> yes
> 
> attached a rule for this

I think he meant the "cardiac.cardiac" and "adjudge.adjudge" part
of the From line. Your rule simply prevents more than 1 Subject:
header line and more than 1 From: header line.

{^_^}

Re: Image spams getting thru

Posted by Tim <ti...@sleepy.wojomedia.com>.

Do DCC, Razor, Pyzor, or any other signature-based checks work with
the image spams?  It's not apparent from reading the man pages.  It
seems to me that one could compare the signatures of attachments instead
of the whole e-mail to provide additional detection.
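
To make the idea concrete, here is a minimal Python sketch that hashes each image part of a message separately, so that two mails with different text but the same attached image produce the same digest. It says nothing about how DCC, Razor, or Pyzor actually compute their checksums; the function names are hypothetical, and an exact hash like SHA-256 would be defeated by a spammer who alters even one byte of each image.

```python
import hashlib
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def attachment_digests(raw_message):
    """Return SHA-256 hex digests of the decoded image parts of a raw message."""
    msg = message_from_bytes(raw_message)
    digests = []
    for part in msg.walk():
        if part.get_content_maintype() != "image":
            continue
        # decode=True undoes the base64 transfer encoding of the part.
        payload = part.get_payload(decode=True)
        if payload:
            digests.append(hashlib.sha256(payload).hexdigest())
    return digests


def build_sample(text, image_bytes):
    """Hypothetical helper: build a small multipart mail for experimenting."""
    outer = MIMEMultipart()
    outer["From"] = "sender@example.com"
    outer.attach(MIMEText(text))
    outer.attach(MIMEImage(image_bytes, _subtype="gif"))
    return outer.as_bytes()
```

Calling attachment_digests() on two mails built with different text but identical image bytes gives equal lists, which is the property a per-attachment signature would rely on.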

Thanks,

Tim

Re: Image spams getting thru

Posted by Logan Shaw <ls...@emitinc.com>.

On Sat, 29 Jul 2006, John D. Hardin wrote:
> On Sat, 29 Jul 2006, Loren Wilton wrote:
>>>>> From: Rory [mailto:cardiac.cardiac@allergist.com]
>>>>> From: Barbra [mailto:adjudge.adjudge@tokyo.com]
>>
>> Something like
>>
>> header FROMFROM    =~ /[A-Z]\w+ \[mailto\: \w+\.\w+\@/
>>
>> There is a way to be more specific, but it costs considerably
>> more.
>
> Namely:
>
>   header   FROM_REPEAT  From =~ /\b(\w{1,20})\.\1\@/
>
> Incorrect results returned quickly are useless.
>
> Adding a test for a single-word unquoted display name would reduce the
> cost as the RE engine wouldn't get to the expensive backreference

Ironically, and somewhat amusingly, the spammer has probably
made the backreference less expensive by marking the boundary
between the repeated strings with a period.  If the (relevant
part of the) addresses the spammer generated had looked
like this:

 	cardiaccardiac
 	adjudgeadjudge

that would have been more annoying to match than what they
actually used:

 	cardiac.cardiac
 	adjudge.adjudge

The reason is that you know exactly where the repeated string
(if it exists) must start, so all the regex engine has to do is
collect everything up to the period, then do a single check
to see if there is a match (a check that will usually fail on
the very first character when the engine is replaying its "tape"
of the backreferenced string).  And that's O(N) (where N is the
number of characters) in the worst case, so not bad at all.
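
Outside a regex engine, the same "collect up to the period, then compare once" logic looks roughly like this in Python. The function name is made up for illustration, and it works on the local part of the address only.

```python
def is_dotted_repeat(local_part):
    """True if local_part has the form word.word with both words identical."""
    # partition() splits at the first period, which marks where a repeat must start.
    head, sep, tail = local_part.partition(".")
    # One linear comparison decides it; it usually fails on the first character.
    return bool(sep) and bool(head) and head == tail
```

For example, is_dotted_repeat("cardiac.cardiac") is true, while an ordinary first.last address fails the comparison immediately.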

Actually, even without the period marking the spot where the
repeat will start it is easy in theory to efficiently match
these strings:  the repeat must start exactly in the middle
(since if a string is repeated, the repeat will be the same
length as the first occurrence).  If you have a string which is
2*N characters long and you want to see if the last N characters
are a repeat of the first N characters, you start by comparing
character 0 with character N, then compare character 1 with
character N+1, etc.  But, whether the regex machine would
ever use that technique is doubtful.  So, even though it's
possible in theory to match it efficiently without the "." as
a marker, the spammer has chosen a format that's relatively
easy to recognize.
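
The split-in-the-middle comparison described above can be written directly. This is a sketch in Python with a hypothetical function name; it is not something a regex engine would do for you.

```python
def is_doubled(s):
    """True if s is some non-empty string written twice, e.g. 'cardiaccardiac'."""
    n, remainder = divmod(len(s), 2)
    if remainder or n == 0:
        # An odd-length or empty string cannot be an exact repeat.
        return False
    # Compare character i with character n + i, for i = 0 .. n - 1.
    return all(s[i] == s[n + i] for i in range(n))
```

This does at most N character comparisons for a string of length 2*N.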

   - Logan

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sat, 29 Jul 2006, John D. Hardin wrote:

> > header FROMFROM    =~ /[A-Z]\w+ \[mailto\: \w+\.\w+\@/
> 
> It won't work. [A-Z] without the case-insensitive flag won't match
> the samples provided.

Whoops! Comment retracted!

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The problem is when people look at Yahoo, slashdot, or groklaw and
  jump from obvious and correct observations like "Oh my God, this
  place is teeming with utter morons" to incorrect conclusions like
  "there's nothing of value here".        -- Al Petrofsky, in Y! SCOX
-----------------------------------------------------------------------

Re: Image spams getting thru

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sat, 29 Jul 2006, Loren Wilton wrote:

> >> > From: Rory [mailto:cardiac.cardiac@allergist.com]
> >> > From: Barbra [mailto:adjudge.adjudge@tokyo.com]
> 
> Something like
> 
> header FROMFROM    =~ /[A-Z]\w+ \[mailto\: \w+\.\w+\@/
>
> There is a way to be more specific, but it costs considerably
> more.

Namely:

   header   FROM_REPEAT  From =~ /\b(\w{1,20})\.\1\@/

Incorrect results returned quickly are useless.

Adding a test for a single-word unquoted display name would reduce the
cost as the RE engine wouldn't get to the expensive backreference
unless there was a single-word unquoted display name:

   header   FROM_REPEAT  From =~ /^\w{1,20}\s<(\w{1,20})\.\1\@/
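
As a quick sanity check, the two patterns can be tried out in Python, whose re module handles \b, \w, and backreferences the same way for this purpose. This sketch assumes the raw header value has the usual "Name <address>" form; the samples in the thread are shown in a mail client's "[mailto:...]" display form, so the angle-bracket versions below are an assumption. "\@" is written as plain "@".

```python
import re

# Python transcriptions of the two FROM_REPEAT patterns.
ANYWHERE = re.compile(r"\b(\w{1,20})\.\1@")
ANCHORED = re.compile(r"^\w{1,20}\s<(\w{1,20})\.\1@")


def hits(pattern, from_value):
    """True if the pattern matches somewhere in the From header value."""
    return pattern.search(from_value) is not None


# Assumed raw form of two of the samples from the thread.
SAMPLES = [
    "Rory <cardiac.cardiac@allergist.com>",
    "Barbra <adjudge.adjudge@tokyo.com>",
]
```

Both patterns hit both samples. The anchored one gives up at the first character of a quoted or multi-word display name, so the backreference is never tried for those senders.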

> I'd try this first.

It won't work. [A-Z] without the case-insensitive flag won't match the
samples provided. You should also have a beginning-of-line anchor to
ensure it only hits on single-word display names. And the samples
don't have a space after the colon.
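
The effect of that space can be checked with a small Python sketch. It tests the pattern against one sample exactly as it was posted (in "[mailto:...]" form); the unneeded backslashes before ":" and "@" are dropped, which does not change what the pattern matches. As far as I can tell, the [A-Z] class itself does match the capitalized display names in the samples, so the space is what stops the match.

```python
import re

# The pattern as posted, with its space after "mailto:".
AS_POSTED = re.compile(r"[A-Z]\w+ \[mailto: \w+\.\w+@")
# The same pattern with that space removed.
NO_SPACE = re.compile(r"[A-Z]\w+ \[mailto:\w+\.\w+@")

SAMPLE = "Rory [mailto:cardiac.cardiac@allergist.com]"


def matches(pattern, value):
    """True if the pattern matches somewhere in value."""
    return pattern.search(value) is not None
```

With these definitions, matches(AS_POSTED, SAMPLE) is false and matches(NO_SPACE, SAMPLE) is true.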

Also (and primarily), the "[mailto:...]" cruft is likely a
Winders-MUA-specific display-only mangle coded by somebody who is only
familiar with HTML and who should have stuck to browser programming.
If that's actually IN the raw From: message header then it makes an
excellent spam sign by itself, as it is a URI format, NOT a valid email
address format per RFC 2822.

   describe FROM_URI   Browser Hammer syndrome
   header   FROM_URI   From =~ /\[mailto:/i
   score    FROM_URI   5000

(...is my hatred of that too obvious?)

The loose version would be:

   header   FROM_REPEAT   From =~ /^\w{1,20}\s<\w{1,20}\.\w{1,20}\@/

...but don't score it too high (above, say, 0.5) because it would hit
on possibly legitimate senders like:

  From: BillG <bi...@microsoft.com>

  From: ChairMaster <St...@microsoft.com>
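
The difference between the loose and the backreference version can be seen with a short Python sketch. The addresses below are invented examples at example.com (the real ones above are truncated in the archive), and "\@" is written as plain "@".

```python
import re

# Loose: any word.word local part after a single-word display name.
LOOSE = re.compile(r"^\w{1,20}\s<\w{1,20}\.\w{1,20}@")
# Strict: the second word must repeat the first (backreference).
STRICT = re.compile(r"^\w{1,20}\s<(\w{1,20})\.\1@")


def classify(from_value):
    """Report which of the two patterns hit a From header value."""
    return {
        "loose": LOOSE.search(from_value) is not None,
        "strict": STRICT.search(from_value) is not None,
    }
```

A sender such as "Alice <alice.smith@example.com>" trips only the loose pattern, which is why it should carry a low score.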

(whew. My blood sugar is low this morning, I'm cranky...)


Re: Image spams getting thru

Posted by Loren Wilton <lw...@earthlink.net>.

>> > From: Rory [mailto:cardiac.cardiac@allergist.com]
>> > From: Barbra [mailto:adjudge.adjudge@tokyo.com]

Something like

header FROMFROM    From =~ /[A-Z]\w+ \[mailto\: \w+\.\w+\@/

There is a way to be more specific, but it costs considerably more.  I'd try 
this first.

        Loren

Re: Image spams getting thru

Posted by Ramprasad <ra...@netcore.co.in>.

Oops, those were single From headers, each from a different mail.

On Fri, 2006-07-28 at 16:50 +0200, Benny Pedersen wrote:
> On Fri, July 28, 2006 13:14, Ramprasad wrote:
> > From: Rory [mailto:cardiac.cardiac@allergist.com]
> > From: Barbra [mailto:adjudge.adjudge@tokyo.com]
> >
> > Can I write a ruleset to hit these froms
> 
> yes
> 
> attached a rule for this
> 
> -- 
> Benny

Re: Image spams getting thru

Posted by Benny Pedersen <me...@junc.org>.

On Fri, July 28, 2006 13:14, Ramprasad wrote:
> From: Rory [mailto:cardiac.cardiac@allergist.com]
> From: Barbra [mailto:adjudge.adjudge@tokyo.com]
>
> Can I write a ruleset to hit these froms

yes

attached a rule for this

-- 
Benny