Posted to users@spamassassin.apache.org by Giampaolo Tomassoni <g....@libero.it> on 2010/04/21 22:16:11 UTC

Interesting use of html comments

Hi everybody,

I recently got this:

http://www.spamcop.net/sc?id=z3936411477z440d5f10ba6037ca7b75c78c52292721z;action=display

in which you may see that a lot of text, probably meant to deceive and/or
poison the Bayes detector, is embedded in html comments.

Now, this particular message already scores a lot in my SA, so I'm not
going to rage on it again. :)

But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin. However,
it seems to me that it lacks a way to, for example, compare the length of the
commented-out text with that of the uncommented text and trigger a rule if
the ratio is above a given threshold.

Do you think such a rule could be useful? Is there anything close to it?
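
To make the idea concrete, here is a rough standalone sketch of the ratio
test in Python. It is not an SA plugin and does not use the HTMLEval API;
the function names, the regex-based comment stripping and the 2.0 threshold
are all assumptions chosen for illustration only.

```python
import re

# Matches one HTML comment, including comments that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
# Matches any remaining tag, so that only visible text is counted.
TAG_RE = re.compile(r"<[^>]*>")


def non_ws_len(text: str) -> int:
    """Count the characters in text, ignoring all whitespace."""
    return len("".join(text.split()))


def comment_ratio(html: str) -> float:
    """Return commented-out characters divided by visible text characters."""
    commented = sum(non_ws_len(m.group(1)) for m in COMMENT_RE.finditer(html))
    without_comments = COMMENT_RE.sub("", html)
    visible = non_ws_len(TAG_RE.sub("", without_comments))
    # Guard against division by zero when no visible text remains.
    return commented / max(visible, 1)


def looks_comment_stuffed(html: str, threshold: float = 2.0) -> bool:
    """Hypothetical rule: fire when hidden text far outweighs visible text."""
    return comment_ratio(html) > threshold
```

A real implementation would have to work on whatever the HTML parser inside
SA exposes, and the threshold would need tuning against real mail.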

Thanks,

Giampaolo


RE: Interesting use of html comments

Posted by Giampaolo Tomassoni <Gi...@Tomassoni.biz>.
> Giampaolo Tomassoni wrote:
> > But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin.
> > However, it seems to me that it lacks a way to, for example, compare
> > the length of the commented-out text with that of the uncommented
> > text and trigger a rule if the ratio is above a given threshold.
> >
> > Do you think such a rule could be useful? Is there anything close to
> > it?
> 
> I don't know of anything in the stock rules, but in response to missed
> spam reported by customers:
> 
> rawbody LONG_COMMENT    m|<!--[^>{};]{200,}-->|
> describe LONG_COMMENT   HTML comment with 200+ characters of "content"
> score LONG_COMMENT      1.25
> rawbody DUMB_COMMENT_1  m|<!--\n?\s*\d+\s*\n?-->|
> describe DUMB_COMMENT_1 HTML comment with bare chunk of numbers
> score DUMB_COMMENT_1    1.25
> rawbody DUMB_COMMENT_2  m|<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->|
> describe DUMB_COMMENT_2 HTML comment consisting of several lines of dashes
> score DUMB_COMMENT_2    1.25
> rawbody BACK2BACK_COMMENT       m|--!><!--[\n\s\w]+--!><!--|
> describe BACK2BACK_COMMENT      HTML structure that looks a lot like back-to-back empty comments
> score   BACK2BACK_COMMENT       1.75

I see the relevant message doesn't trigger these rules, since the HTML
comments are carefully interspersed in it: the string in each comment is
shorter than 200 characters.
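
A small Python check of that effect, using a made-up sample rather than the
actual message: the LONG_COMMENT pattern is copied from the rule quoted
above, while `filler`, `sample` and `hidden_total` are hypothetical names.
Python's `re` stands in for Perl's regex engine here, and plain string
matching is not how SA feeds rawbody data to rules, so take it only as an
illustration.

```python
import re

# Python translation of the LONG_COMMENT pattern from the quoted rule.
LONG_COMMENT = re.compile(r"<!--[^>{};]{200,}-->")
# Matches one HTML comment, including comments that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)

# Made-up sample: ten short comments interleaved with visible words.
filler = ("poison " * 9).strip()
sample = "".join(f"word<!-- {filler} -->" for _ in range(10))

# No single comment reaches 200 characters, so the pattern never matches.
per_comment_hit = LONG_COMMENT.search(sample) is not None

# Summed over all comments, the hidden text is still large.
hidden_total = sum(len(m.group(1)) for m in COMMENT_RE.finditer(sample))

print(per_comment_hit, hidden_total)
```

Here each comment body is 64 characters long, so the per-comment test stays
silent while the comments together hide 640 characters.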


> (Of course, now that I've published them they'll go totally useless... <g>)

Sorry for voiding your efforts... :)

Giampaolo

> 
> -kgd


Re: Interesting use of html comments

Posted by Kris Deugau <kd...@vianet.ca>.
Giampaolo Tomassoni wrote:
> But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin. However,
> it seems to me that it lacks a way to, for example, compare the length of the
> commented-out text with that of the uncommented text and trigger a rule if
> the ratio is above a given threshold.
> 
> Do you think such a rule could be useful? Is there anything close to it?

I don't know of anything in the stock rules, but in response to missed 
spam reported by customers:

rawbody LONG_COMMENT    m|<!--[^>{};]{200,}-->|
describe LONG_COMMENT   HTML comment with 200+ characters of "content"
score LONG_COMMENT      1.25
rawbody DUMB_COMMENT_1  m|<!--\n?\s*\d+\s*\n?-->|
describe DUMB_COMMENT_1 HTML comment with bare chunk of numbers
score DUMB_COMMENT_1    1.25
rawbody DUMB_COMMENT_2  m|<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->|
describe DUMB_COMMENT_2 HTML comment consisting of several lines of dashes
score DUMB_COMMENT_2    1.25
rawbody BACK2BACK_COMMENT       m|--!><!--[\n\s\w]+--!><!--|
describe BACK2BACK_COMMENT      HTML structure that looks a lot like back-to-back empty comments
score   BACK2BACK_COMMENT       1.75

(Of course, now that I've published them they'll go totally useless...  <g>)

-kgd