Posted to users@spamassassin.apache.org by Matthew Newton <mc...@leicester.ac.uk> on 2005/04/22 11:49:38 UTC

Anyone else seen spam like this?

Hi,

Have had several spams over the last few days with the exact paragraph
below. Anyone else seen similar messages? Any rules available?

Can't yet think of how to write rules for this, as it's so non-spam it
obviously is (if that makes any sense). I'll have a think about it.

There is an HTML part, too, that is not included below. It doesn't seem
to say too much more than this paragraph though.

Also, looking at the subject line, looks like the spammers are using the
technique of muddling all the middle letters of words but leaving the
first and last letters as normal. ISTR that some research recently
showed that people could understand words muddled up like this very
easily. Maybe it needs a new SA rule or plugin (don't know how this
would be done; would a plugin be needed?).

Thanks

Matthew


----- Forwarded message from Lolita Mcintosh <iv...@dartmail.net> -----

Subject: Re: Fuond a betetr suotloin
X-Spam-Score: (/) 0.4
X-Spam-Report: This e-mail has been scored by SpamAssassin 3.0.2
	Pts Rule name              Description
	---- ---------------------- ---------------------------------------
	0.0 HTML_TEXT_AFTER_HTML   BODY: HTML contains text after HTML close tag
	0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
	0.0 HTML_MESSAGE           BODY: HTML included in message
	0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
	[score: 0.5002]
	0.1 HTML_TAG_EXIST_TBODY   BODY: HTML has "tbody" tag
	0.1 HTML_FONT_BIG          BODY: HTML tag for a big font size
	0.1 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
	[24.220.150.223 listed in combined.njabl.org]

Hello, 

When parents talk to babies, they often speak slowly and melodically, using a form of speech that experts refer to as "parentese." This is exactly the kind of speech that is best suited to helping babies learn to talk. To engage your baby's attention, it is helpful to be lively and to vary the tone and pitch of your voice. It is also helpful to speak slowly and distinctly, and to repeat words and phrases. However, don't underestimate your baby's grasp of what you are saying. Well before they can respond with words, babies and toddlers can understand a lot of what is said.

Have a good day.

----- End forwarded message -----

-- 
Matthew Newton <mc...@le.ac.uk>

UNIX and e-mail Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom

Re: Anyone else seen spam like this?

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Matthew,

Friday, April 22, 2005, 2:49:38 AM, you wrote:

MN> Hi,

MN> Have had several spams over the last few days with the exact
MN> paragraph below. Anyone else seen similar messages? Any rules
MN> available?

As suggested, if that paragraph is being repeated, Bayes is your
friend. Agreed it may not do the whole job, but it's a big help.

MN> Can't yet think of how to write rules for this, as it's so
MN> non-spam it obviously is (if that makes any sense). I'll have a
MN> think about it.

It might be worth writing a rule that looks for a long phrase from the
paragraph, one that won't normally occur in ham, such as:

  body  RULE_NAME  /helpful to be lively and to vary the tone and/
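
For illustration, a fuller version of that rule might look like the sketch
below. The rule name LOCAL_PARENTESE_PHRASE, the description text and the
1.5 score are placeholders that do not come from this thread; only the
pattern is the one suggested above. It would typically live in a site-wide
local.cf, and the score should be checked against your own ham first.

```
# Hypothetical rule name and score; only the regex comes from the thread.
body      LOCAL_PARENTESE_PHRASE  /helpful to be lively and to vary the tone and/
describe  LOCAL_PARENTESE_PHRASE  Phrase from the repeated "parentese" paragraph
score     LOCAL_PARENTESE_PHRASE  1.5
```

Running "spamassassin --lint" after editing should report any syntax
mistakes before the rule goes live.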

MN> There is an HTML part, too, that is not included below. It doesn't
MN> seem to say too much more than this paragraph though.

However, there are frequently cues within the HTML tags themselves
that could identify this as spam.  Are you using any of SARE's HTML
rule set files?  See http://www.rulesemporium.com/rules.htm#html

MN> Also, looking at the subject line, looks like the spammers are using the
MN> technique of muddling all the middle letters of words but leaving the
MN> first and last letters as normal. ISTR that some research recently
MN> showed that people could understand words muddled up like this very
MN> easily. Maybe it needs a new SA rule or plugin (don't know how this
MN> would be done; would a plugin be needed?).

Not a new trick -- we noticed it and discussed it a year or more ago.
We tried a few general methods of identifying this, and they didn't work
well.

Based on another year's experience in spam fighting, I have an idea or
two that I'll play with to see if I can tackle this via some simple
rules. No guarantees. If anyone has additional examples of such
misspleled words, please send them to me offlist.
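
As a stopgap while no general method exists, the specific muddled words
quoted in this thread can be matched directly. This is only a sketch: the
rule name and score are made up for the example, the word list is just the
five muddled words from the two Subject lines shown in this thread, and it
will miss any other scrambling.

```
# Hypothetical name and score; the words come from the two Subject lines
# quoted in this thread ("Fuond a betetr suotloin", "Odrer and svae").
header    LOCAL_MUDDLED_SUBJ  Subject =~ /\b(?:fuond|betetr|suotloin|odrer|svae)\b/i
describe  LOCAL_MUDDLED_SUBJ  Subject contains a muddled word seen in recent spam
score     LOCAL_MUDDLED_SUBJ  1.0
```

A fixed list like this has to be extended by hand as new examples turn up,
so it is a narrow patch rather than an answer to the general problem.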

Bob Menschel




Re: Anyone else seen spam like this?

Posted by Matthew Newton <mc...@leicester.ac.uk>.
On Fri, Apr 22, 2005 at 07:37:35AM -0400, Kevin Peuhkurinen wrote:
> I haven't seen any of these myself, but it sounds like a job for 
> Bayes.   Feed them all into sa-learn and soon enough they should be 
> hitting higher Bayes scores.  

Thanks, I can do that.

However, the Bayes scores are at most about 2.5, and these messages have
been coming in with scores of around 2 or less. Even a top Bayes hit would
only push them up to just under 5.

I did increase the Bayes scores recently, but had to drop them again as
they were causing too many false positives (or near FPs). I don't think the
database is corrupt, just that we have a very wide range of different
types of e-mail coming in here.

Thanks

Matthew


Re: Anyone else seen spam like this?

Posted by Kevin Peuhkurinen <ke...@meridiancu.ca>.
I haven't seen any of these myself, but it sounds like a job for 
Bayes.   Feed them all into sa-learn and soon enough they should be 
hitting higher Bayes scores.  

Matthew Newton wrote:

>Hi,
>
>Have had several spams over the last few days with the exact paragraph
>below. Anyone else seen similar messages? Any rules available?
>
>Can't yet think of how to write rules for this, as it's so non-spam it
>obviously is (if that makes any sense). I'll have a think about it.
>
>There is an HTML part, too, that is not included below. It doesn't seem
>to say too much more than this paragraph though.
>
>Also, looking at the subject line, looks like the spammers are using the
>technique of muddling all the middle letters of words but leaving the
>first and last letters as normal. ISTR that some research recently
>showed that people could understand words muddled up like this very
>easily. Maybe it needs a new SA rule or plugin (don't know how this
>would be done; would a plugin be needed?).
>
>Thanks
>
>Matthew

Re: Anyone else seen spam like this?

Posted by JamesDR <ro...@bellsouth.net>.
Matthew Newton wrote:
> Hi,
> 
> Have had several spams over the last few days with the exact paragraph
> below. Anyone else seen similar messages? Any rules available?
> 
> Can't yet think of how to write rules for this, as it's so non-spam it
> obviously is (if that makes any sense). I'll have a think about it.
> 
> There is an HTML part, too, that is not included below. It doesn't seem
> to say too much more than this paragraph though.
> 
> Also, looking at the subject line, looks like the spammers are using the
> technique of muddling all the middle letters of words but leaving the
> first and last letters as normal. ISTR that some research recently
> showed that people could understand words muddled up like this very
> easily. Maybe it needs a new SA rule or plugin (don't know how this
> would be done; would a plugin be needed?).
> 
> Thanks
> 
> Matthew
> 
> 
> ----- Forwarded message from Lolita Mcintosh <iv...@dartmail.net> -----
> 
> Subject: Re: Fuond a betetr suotloin
> X-Spam-Score: (/) 0.4
> X-Spam-Report: This e-mail has been scored by SpamAssassin 3.0.2
> 	Pts Rule name              Description
> 	---- ---------------------- ---------------------------------------
> 	0.0 HTML_TEXT_AFTER_HTML   BODY: HTML contains text after HTML close tag
> 	0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
> 	0.0 HTML_MESSAGE           BODY: HTML included in message
> 	0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
> 	[score: 0.5002]
> 	0.1 HTML_TAG_EXIST_TBODY   BODY: HTML has "tbody" tag
> 	0.1 HTML_FONT_BIG          BODY: HTML tag for a big font size
> 	0.1 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
> 	[24.220.150.223 listed in combined.njabl.org]

<snip>

0.0 BAYES_50 <-- indicates that Bayes thinks this is neither ham nor
spam. Training here will help.

The other few rules that did hit could be bumped up slightly; again, you
would need to test this against your ham to make sure you don't get any FPs.
Another thing to consider is how often you or your clients use this
phrase:
"Fuond a betetr suotloin"
or many of the other phrases in that mail. If the answer is nearly never,
then you could write a rule that looks for that subject and some key phrases
in that mail (I'm not any good at regex, but if it is that bad, you could
learn it :-D ), and that rule could key other rules. A few points here and
there do help.
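
One way to express that idea is a meta rule that fires only when both the
subject and a body phrase match. In the sketch below, all rule names and the
score are invented for illustration; the subject and the body phrase are
taken from the forwarded message. Names starting with a double underscore
are sub-rules that carry no score of their own.

```
# Sub-rules (unscored): the muddled subject and a phrase from the forwarded body.
header    __LOCAL_SUBJ_FUOND     Subject =~ /Fuond a betetr suotloin/i
body      __LOCAL_BODY_TODDLERS  /babies and toddlers can understand a lot of what is said/i

# Hypothetical combined rule; the score is a placeholder to tune against your ham.
meta      LOCAL_PARENTESE_COMBO  (__LOCAL_SUBJ_FUOND && __LOCAL_BODY_TODDLERS)
describe  LOCAL_PARENTESE_COMBO  Muddled subject plus the repeated "parentese" paragraph
score     LOCAL_PARENTESE_COMBO  2.0
```

Requiring both conditions should keep the false-positive risk lower than
scoring either pattern on its own, at the cost of missing variants that
change the subject.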

HTH

-- 
Thanks,
JamesDR

Re: Anyone else seen spam like this?

Posted by Chris <cp...@earthlink.net>.
On Friday 22 April 2005 04:49 am, Matthew Newton wrote:
> Hi,
>
> Have had several spams over the last few days with the exact paragraph
> below. Anyone else seen similar messages? Any rules available?
>
It's tagged quite easily here. I didn't include the entire paragraph, but
it's the same.

Subject: *****SPAM(21.3)***** Re: Odrer and svae
 X-Spam-Prev-Subject: Re: Odrer and svae
 X-Spam-DCC: CollegeOfNewCaledonia cpollock.localdomain 1189; Body=1 Fuz1=many 
        Fuz2=many
 X-Spam-Flag: YES
 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on 
        cpollock.localdomain
 X-Spam-Level: *********************
 X-Spam-Status: Yes, score=21.3 required=5.0 tests=BAYES_99,DCC_CHECK,
        DIGEST_MULTIPLE,DNS_FROM_RFC_POST,HELO_DYNAMIC_IPADDR,HTML_FONT_BIG,
        HTML_MESSAGE,HTML_TAG_EXIST_TBODY,HTML_TEXT_AFTER_BODY,
        HTML_TEXT_AFTER_HTML,MSGID_FROM_MTA_ID,PYZOR_CHECK,
        RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_XBL,URIBL_SBL 
        autolearn=disabled version=3.0.2
 X-Spam-Pyzor: Reported 2 times.
 X-Spam-Report: 
        *  4.4 HELO_DYNAMIC_IPADDR Relay HELO'd using suspicious hostname (IP addr 1)
        *  1.7 MSGID_FROM_MTA_ID Message-Id for external message added locally
        *  0.0 HTML_TEXT_AFTER_HTML BODY: HTML contains text after HTML close tag
        *  0.1 HTML_TEXT_AFTER_BODY BODY: HTML contains text after BODY close tag
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  0.1 HTML_TAG_EXIST_TBODY BODY: HTML has "tbody" tag
        *  0.1 HTML_FONT_BIG BODY: HTML tag for a big font size
        *  0.1 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence level above 50%
        *      [cf: 100]
        *  1.9 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
        *      [score: 1.0000]
        *  1.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
        *  3.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
        *  2.2 DCC_CHECK Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
        *  3.1 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
        *      [67.188.170.83 listed in sbl-xbl.spamhaus.org]
        *  1.6 DNS_FROM_RFC_POST RBL: Envelope sender in postmaster.rfc-ignorant.org
        *  1.0 URIBL_SBL Contains an URL listed in the SBL blocklist
        *      [URIs: d3w.net]
        *  0.1 DIGEST_MULTIPLE Message hits more than one network digest check
 X-UID: 8572
 X-Length: 7646
 
Hello, 

When parents talk to babies, they often speak slowly and melodically, using a form
-- 
Chris
Registered Linux User 283774 http://counter.li.org
21:54:28 up 13 days, 1 min, 1 user, load average: 0.48, 0.65, 0.55
Mandriva Linux 10.1 Official, kernel 2.6.8.1-12mdk
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Beauty?  What's that?
             -- Larry Wall in <19...@wall.org>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~