Posted to users@spamassassin.apache.org by Magnus Holmgren <ho...@lysator.liu.se> on 2006/04/13 15:58:01 UTC

TEXTAREA style="visibility: hidden"

I see a fair amount of spam using <TEXTAREA style="visibility: hidden"> to 
hide Bayes poison. Wouldn't a rule against that, or against CSS-hidden text in
general, be worthwhile? I couldn't find any in the default 3.1.1 ruleset, nor
at SARE.

-- 
Magnus Holmgren

Re: TEXTAREA style="visibility: hidden"

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Apr 13, 2006 at 03:58:01PM +0200, Magnus Holmgren wrote:
> I see a fair amount of spam using <TEXTAREA style="visibility: hidden"> to 
> hide Bayes poison. Wouldn't a rule against that, or against CSS-hidden text in
> general, be worthwhile? I couldn't find any in the default 3.1.1 ruleset, nor
> at SARE.

Not specific to textarea, just looking for an HTML tag with that style setting:

OVERALL%   SPAM%     HAM%      S/O   RANK   SCORE  NAME
  0.878   0.9903   0.3319    0.749   0.00    1.00  TVD_VIS_HIDDEN

Specifically just looking for textarea:

OVERALL%   SPAM%     HAM%      S/O   RANK   SCORE  NAME
  0.821   0.9903   0.0000    1.000   1.00    1.00  TVD_VIS_HIDDEN

I added the second one to my sandbox.  We'll see how the nightly
mass-checks deal with it. :)

Thanks! :)

-- 
Randomly Generated Tagline:
"Do not meddle in the affairs of wizards,
 for they are subtle and quick to anger."    - Lord of the Rings

Re: How was this missed?

Posted by Magnus Holmgren <ho...@lysator.liu.se>.
Please start a new thread instead of replying to an unrelated message.

Thursday 13 April 2006 18:39 qqqq wrote:
> Any idea how this one got through?
>
> body     BRIAN_PHONE_NUMBERS /2.0.6.9.8.4.2.3.2.7|2.0.6.3.3.3.0.0.5.1|2.0.6.9.8.4.0.1.0.6|3.3.8.3.5.7.9|2.0.6.3.3.8.6.0.6.1|2.0.6.2.0.2.2.0.3.3/
> describe BRIAN_PHONE_NUMBERS      Phone number or address pulled from spam
> score    BRIAN_PHONE_NUMBERS      5.5
>

A period (.) matches exactly one arbitrary character (except newline). Try 
putting a question mark (?) after each period.
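
To make the difference concrete, here is a minimal sketch in Python rather
than in SpamAssassin itself. It assumes that Python's re module treats "."
and "?" the same way as the Perl regexes used in SA rules (which holds for
patterns this simple), and it only checks the first alternative of the rule
against the phone number line from the quoted message. The names STRICT,
LOOSE and hits are made up for the illustration.

```python
import re

# First alternative of the original rule: every "." must consume one character.
STRICT = re.compile(r"2.0.6.9.8.4.2.3.2.7")
# Same alternative with "?" after each period: the separator becomes optional.
LOOSE = re.compile(r"2.?0.?6.?9.?8.?4.?2.?3.?2.?7")

# Phone number line as it appears in the spam body.
line = "Cal_l us now_!-> 2*0*6*984-2327"


def hits(pattern, text):
    """Return the matched substring, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(hits(STRICT, line))  # None: "984" has nothing between its digits
print(hits(LOOSE, line))   # 2*0*6*984-2327
```

Note that ".?" still accepts any single character between the digits, so the
loosened pattern also matches the plain, unobfuscated number.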

> ----- Message -----
>
> Good day,
>
>
> A Gen_uine Coll`ege  Deg.ree in 2 weeks Cal_l us now_!-> 2*0*6*984-2327
>
> Within 2 weeks!     No Study Required!   1_0_0_% Veri.fiable!
>
> Right now the following deg.rees are being offered:
>
> B/A,   .B/S/C,    .M/A,    .M/S/C,    .M/B/A,   .P/H/D,
>
>
> C.al_l us now_ for more information,  2*0*6*984-2327
-- 
Magnus Holmgren


Re: How was this missed?

Posted by qqqq <qq...@usermail.com>.
> Sure, the pattern doesn't match.  "." means there has to be some (any)
> character between the numbers.  "984" has no characters between the
> numbers.

DOH!!!

Thanks, you're right.

QQQQ

Re: How was this missed?

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Apr 13, 2006 at 10:39:29AM -0600, qqqq wrote:
> Any idea how this one got through?
> 
> body     BRIAN_PHONE_NUMBERS /2.0.6.9.8.4.2.3.2.7|2.0.6.3.3.3.0.0.5.1|2.0.6.9.8.4.0.1.0.6|3.3.8.3.5.7.9|2.0.6.3.3.8.6.0.6.1|2.0.6.2.0.2.2.0.3.3/
> 
> A Gen_uine Coll`ege  Deg.ree in 2 weeks Cal_l us now_!-> 2*0*6*984-2327

Sure, the pattern doesn't match.  "." means there has to be some (any)
character between the numbers.  "984" has no characters between the
numbers.

-- 
Randomly Generated Tagline:
1-900-Tech Support...hold...all operators are busy.

How was this missed?

Posted by qqqq <qq...@usermail.com>.
Guys,

Any idea how this one got through?

body     BRIAN_PHONE_NUMBERS /2.0.6.9.8.4.2.3.2.7|2.0.6.3.3.3.0.0.5.1|2.0.6.9.8.4.0.1.0.6|3.3.8.3.5.7.9|2.0.6.3.3.8.6.0.6.1|2.0.6.2.0.2.2.0.3.3/
describe BRIAN_PHONE_NUMBERS      Phone number or address pulled from spam
score    BRIAN_PHONE_NUMBERS      5.5

----- Message -----

Good day,


A Gen_uine Coll`ege  Deg.ree in 2 weeks Cal_l us now_!-> 2*0*6*984-2327

Within 2 weeks!     No Study Required!   1_0_0_% Veri.fiable!

Right now the following deg.rees are being offered:

B/A,   .B/S/C,    .M/A,    .M/S/C,    .M/B/A,   .P/H/D,


C.al_l us now_ for more information,  2*0*6*984-2327



TTYL,
Vilma Milton


Re: TEXTAREA style="visibility: hidden"

Posted by Matthias Keller <li...@matthias-keller.ch>.
Matt Kettler wrote:
> Matthias Keller wrote:
>   
>> Matt Kettler wrote:
>>     
>>> Magnus Holmgren wrote:
>>>> I see a fair amount of spam using <TEXTAREA style="visibility:
>>>> hidden"> to hide Bayes poison. Wouldn't a rule against that, or against
>>>> CSS-hidden text in general, be worthwhile? I couldn't find any in the
>>>> default 3.1.1 ruleset, nor at SARE.
>>>
>>> It certainly seems worth testing.
>>>
>>> Here's a rule I wrote (caution: word-wraps.. this should be 3 lines
>>> long):
>>>
>>> rawbody L_STYLE_HIDDEN /<TEXTAREA
>>> [^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>/i
>>> describe L_STYLE_HIDDEN  has text with hidden visibility style
>>> score L_STYLE_HIDDEN 0.1
>>>
>>> I added some allowance for other declarations in the textarea tag, and the
>>> insertion of whitespace at various spots...
>>>
>>> It may need further tweaking/tuning, but it's a first-stab.
>>
>> Hi Matt
>>
>> I've been using this rule for quite some time now:
>>
>> rawbody         MKE_HIDDEN1                    
>> /<[^>]*\bstyle=[^>]*(?:visibility:\s*hidden|display:\s*none)/i
>> describe        MKE_HIDDEN1                     Contains CSS-hidden text
>> score           MKE_HIDDEN1                     3.5
>>
>>     
>
> That seems to be a nicer rule. My only concern would be that <[^>]* could be
> rather slow. I'd change the * to a range-limit, to prevent SA from digging
> through the entire body of a message that happens to be text/plain and starts
> off with a < and has no > anywhere in it.
>   
Good idea, thanks for pointing that out.
Maybe a meta rule combined with IS_HTML (or whatever that rule is called)
would be a good idea too.

Let me know your mass-check results then.

Matt

Re: TEXTAREA style="visibility: hidden"

Posted by Matt Kettler <mk...@evi-inc.com>.
Matthias Keller wrote:
> Matt Kettler wrote:
>> Magnus Holmgren wrote:
>>  
>>> I see a fair amount of spam using <TEXTAREA style="visibility:
>>> hidden"> to hide Bayes poison. Wouldn't a rule against that, or against
>>> CSS-hidden text in general, be worthwhile? I couldn't find any in the
>>> default 3.1.1 ruleset, nor at SARE.
>>>     
>>
>> It certainly seems worth testing.
>>
>> Here's a rule I wrote (caution: word-wraps.. this should be 3 lines
>> long):
>>
>> rawbody L_STYLE_HIDDEN /<TEXTAREA
>> [^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>/i
>> describe L_STYLE_HIDDEN  has text with hidden visibility style
>> score L_STYLE_HIDDEN 0.1
>>
>> I added some allowance for other declarations in the textarea tag, and the
>> insertion of whitespace at various spots...
>>
>> It may need further tweaking/tuning, but it's a first-stab.
>>   
> Hi Matt
> 
> I've been using this rule for quite some time now:
> 
> rawbody         MKE_HIDDEN1                    
> /<[^>]*\bstyle=[^>]*(?:visibility:\s*hidden|display:\s*none)/i
> describe        MKE_HIDDEN1                     Contains CSS-hidden text
> score           MKE_HIDDEN1                     3.5
> 

That seems to be a nicer rule. My only concern would be that <[^>]* could be
rather slow. I'd change the * to a range-limit, to prevent SA from digging
through the entire body of a message that happens to be text/plain and starts
off with a < and has no > anywhere in it.
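
A rough sketch of the range-limit idea, in Python as a stand-in for the Perl
regexes SA uses. The bound of 200 characters (MAX_TAG) is an arbitrary number
picked for illustration, not a value anyone in this thread proposed, and both
sample inputs are made up.

```python
import re

# Hypothetical bound on how far the pattern may look past "<".
MAX_TAG = 200

# The rule as posted, with unbounded [^>]* runs.
UNBOUNDED = re.compile(
    r"<[^>]*\bstyle=[^>]*(?:visibility:\s*hidden|display:\s*none)", re.I)

# Same rule with each "*" replaced by a {0,MAX_TAG} range.
BOUNDED = re.compile(
    r"<[^>]{0,%d}\bstyle=[^>]{0,%d}(?:visibility:\s*hidden|display:\s*none)"
    % (MAX_TAG, MAX_TAG), re.I)

# An ordinary short tag: both versions should hit.
tag = '<div id="x" style="color: red; display: none">'

# A text/plain-like body that opens with "<", never closes it, and only
# mentions the style thousands of characters later.
long_plain = "<" + "word " * 1000 + 'style="display:none"'

for name, rx in (("unbounded", UNBOUNDED), ("bounded", BOUNDED)):
    print(name, bool(rx.search(tag)), bool(rx.search(long_plain)))
```

The bounded version stops looking after MAX_TAG characters, so it no longer
walks through (or matches in) the long unclosed run, while short real tags
still hit.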

Re: TEXTAREA style="visibility: hidden"

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Apr 13, 2006 at 09:45:13AM -0700, Kelson wrote:
> Nope.  No legit uses in email that I can think of.

Just because you can't think of a use doesn't mean people don't use them.
I see a lot of:

<div ... style="...; visibility: hidden; ...
<input ... style="display: none" ...
<div ... style="display: none" ...

and a bunch of CSS which includes those two style attributes as well.

Seen in personal mails from places such as Yahoo! and American Express,
and newsletters from such places as the Boston Globe, CNN, the Denver
Post, Der Spiegel, Microsoft, the Washington Post, etc.
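
To illustrate the false-positive risk, the sketch below runs the generic
pattern and the textarea-only pattern quoted elsewhere in this thread against
a few sample tags. The tags are invented stand-ins (the real ones are elided
above), the textarea pattern assumes the wrapped rule is rejoined with a
single space after TEXTAREA, and Python's re module stands in for Perl.

```python
import re

# Generic rule from this thread: any tag with a hiding style.
GENERIC = re.compile(
    r"<[^>]*\bstyle=[^>]*(?:visibility:\s*hidden|display:\s*none)", re.I)

# Textarea-only rule from this thread (assumed to join with one space).
TEXTAREA_ONLY = re.compile(
    r'<TEXTAREA [^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>',
    re.I)

# Made-up examples of the kinds of tags described above.
samples = {
    "newsletter div": '<div class="c" style="height: 0; visibility: hidden; margin: 0">',
    "hidden input": '<input type="text" style="display: none" name="t">',
    "spam textarea": '<TEXTAREA style="visibility: hidden">',
}

results = {
    label: (bool(GENERIC.search(tag)), bool(TEXTAREA_ONLY.search(tag)))
    for label, tag in samples.items()
}

for label, (generic_hit, textarea_hit) in results.items():
    print(f"{label:15} generic={generic_hit} textarea_only={textarea_hit}")
```

On these samples the generic pattern fires on all three tags, while the
textarea-only pattern fires only on the last one.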

-- 
Randomly Generated Tagline:
When you are at Rome live in the Roman style; when you are elsewhere live
 as they live elsewhere.
 		-- St. Ambrose

Re: TEXTAREA style="visibility: hidden"

Posted by Kelson <ke...@speed.net>.
Matthias Keller wrote:
> In my opinion you shouldn't limit it to textareas as I've seen them on 
> DIVs and others too...
> So to me, any visibility:hidden or display:none is suspect as I don't see
> any legitimate use in emails

Hmm... The main uses I can think of for display:none and 
visibility:hidden are:

(1) Serving the same content to different media (for instance, set a
     page so that the navigation area doesn't appear when you print it)
(2) Replacing content (as in CSS techniques to replace text with
     graphical headlines)
(3) Scripting that will show and hide sections in response to time or
     user interaction.
(4) Creating machine-readable content that the user will not see.
     (keyword stuffing, bayes poison, black-hat SEO, honeypot seeding,
     etc.)

#1 isn't a good fit with email, since the main things you'd want to 
leave out of a print version are more likely to be in the mail client UI 
than part of the message body.  Though it might be useful for providing 
a handheld-friendly view.  Even so, it wouldn't work with inline styles, 
only with an attached or embedded stylesheet.

#2 is pretty much useless in email.  If you want a text alternative, 
you're better off providing a text/plain version of the message.

#3 shouldn't even be a consideration, since HTML-capable email clients 
should have scripting disabled for safety reasons.

#4 is mostly deceptive.  If you need to provide metadata in an HTML doc, 
well, that's what META tags are for.  If you need to provide metadata in 
an email message, you've got headers, you can add an XML attachment, etc.

Nope.  No legit uses in email that I can think of.

-- 
Kelson Vibber
SpeedGate Communications <www.speed.net>

Re: TEXTAREA style="visibility: hidden"

Posted by Matthias Keller <li...@matthias-keller.ch>.
Matt Kettler wrote:
> Magnus Holmgren wrote:
>   
>> I see a fair amount of spam using <TEXTAREA style="visibility: hidden"> to 
>> hide Bayes poison. Wouldn't a rule against that, or against CSS-hidden text in
>> general, be worthwhile? I couldn't find any in the default 3.1.1 ruleset, nor
>> at SARE.
>>     
>
> It certainly seems worth testing.
>
> Here's a rule I wrote (caution: word-wraps.. this should be 3 lines long):
>
> rawbody L_STYLE_HIDDEN /<TEXTAREA
> [^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>/i
> describe L_STYLE_HIDDEN  has text with hidden visibility style
> score L_STYLE_HIDDEN 0.1
>
> I added some allowance for other declarations in the textarea tag, and the
> insertion of whitespace at various spots...
>
> It may need further tweaking/tuning, but it's a first-stab.
>   
Hi Matt

I've been using this rule for quite some time now:

rawbody         MKE_HIDDEN1                     /<[^>]*\bstyle=[^>]*(?:visibility:\s*hidden|display:\s*none)/i
describe        MKE_HIDDEN1                     Contains CSS-hidden text
score           MKE_HIDDEN1                     3.5

In my opinion you shouldn't limit it to textareas, as I've seen the same
trick on DIVs and other elements too.
To me, any visibility:hidden or display:none is suspect, as I don't see
any legitimate use for them in email.

This rule matches around 4% of my spam, and I haven't seen
any ham matches yet.
Feel free to mass-check it and/or include it in your rules. But
if you do, please let me know so that I can remove my local copy.

Matt

Re: TEXTAREA style="visibility: hidden"

Posted by Matt Kettler <mk...@evi-inc.com>.
Magnus Holmgren wrote:
> I see a fair amount of spam using <TEXTAREA style="visibility: hidden"> to 
> hide Bayes poison. Wouldn't a rule against that, or against CSS-hidden text in
> general, be worthwhile? I couldn't find any in the default 3.1.1 ruleset, nor
> at SARE.

It certainly seems worth testing.

Here's a rule I wrote (caution: word-wraps.. this should be 3 lines long):

rawbody L_STYLE_HIDDEN /<TEXTAREA
[^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>/i
describe L_STYLE_HIDDEN  has text with hidden visibility style
score L_STYLE_HIDDEN 0.1

I added some allowance for other declarations in the textarea tag, and the
insertion of whitespace at various spots...

It may need further tweaking/tuning, but it's a first-stab.
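
A quick way to see what the rule above does and does not allow for is to try
its pattern on a few variants. This is only a sketch: it uses Python's re
module in place of Perl, assumes the wrapped rule rejoins with a single space
after TEXTAREA, and the test tags in CASES are made up rather than taken from
real spam.

```python
import re

# The rule's pattern, rejoined on one line (single space after TEXTAREA assumed).
L_STYLE_HIDDEN = re.compile(
    r'<TEXTAREA [^>]{0,50}style\s?=\s?"\s?visibility:\s?hidden\s?"[^>]{0,50}>',
    re.IGNORECASE,
)

# Made-up test tags.
CASES = [
    '<TEXTAREA style="visibility: hidden">',                       # the spam form
    '<textarea name="a" style = " visibility:hidden " rows="2">',  # extra attrs, spaces
    "<textarea style='visibility: hidden'>",                       # single quotes
    '<textarea style="visibility:  hidden">',                      # two spaces
    '<textarea style="color: red; visibility: hidden">',           # another declaration first
]


def matches(tag):
    """True if the rule's pattern is found in the tag."""
    return L_STYLE_HIDDEN.search(tag) is not None


for tag in CASES:
    print("hit " if matches(tag) else "miss", tag)
```

The first two variants hit. The last three miss, because the pattern requires
double quotes, allows at most one whitespace character at each spot, and
expects visibility to be the only declaration in the style attribute.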