Posted to users@spamassassin.apache.org by Giampaolo Tomassoni <g....@libero.it> on 2010/04/21 22:16:11 UTC
Interesting use of html comments
Hi everybody,
I recently got this:
http://www.spamcop.net/sc?id=z3936411477z440d5f10ba6037ca7b75c78c52292721z;action=display
in which you may see that a lot of text, probably meant to deceive and/or
poison the Bayes detector, is embedded in html comments.
This particular message already scores high in my SA setup, so I'm not
going to rant about it again. :)
But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin. However,
it seems to lack a way to, for example, compare the length of the
commented-out text with that of the uncommented text and trigger a rule if
the ratio exceeds a given threshold.
Do you think such a rule could be somehow useful? Is there anything close to
it?
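To make the idea concrete, here is a minimal Python sketch of the ratio described above. It is only an illustration of the computation, not an HTMLEval feature: the function name `comment_ratio`, the regexes, and the threshold value are all assumptions, and a real SA eval rule would have to be written as a Perl plugin.

```python
import re

# Assumed patterns: one HTML comment (non-greedy, spanning lines) and any tag.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")

# Hypothetical threshold; a real value would need tuning against a corpus.
THRESHOLD = 3.0


def non_ws_len(text: str) -> int:
    """Count characters, ignoring all whitespace."""
    return len("".join(text.split()))


def comment_ratio(html: str) -> float:
    """Commented-out characters divided by visible text characters."""
    commented = sum(non_ws_len(m.group(1)) for m in COMMENT_RE.finditer(html))
    # Remove comments first, then the remaining tags, to approximate visible text.
    visible = non_ws_len(TAG_RE.sub("", COMMENT_RE.sub("", html)))
    if visible == 0:
        return float("inf") if commented else 0.0
    return commented / visible


sample = ("<p>Buy now</p>"
          "<!-- lorem ipsum dolor sit amet -->"
          "<!-- more filler words here -->")
print(comment_ratio(sample) > THRESHOLD)
```

The ratio ignores whitespace on both sides so that padding inside comments does not skew the result.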
Thanks,
Giampaolo
RE: Interesting use of html comments
Posted by Giampaolo Tomassoni <Gi...@Tomassoni.biz>.
> Giampaolo Tomassoni wrote:
> > But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin.
> > However, it seems to lack a way to, for example, compare the length of
> > the commented-out text with that of the uncommented text and trigger a
> > rule if the ratio exceeds a given threshold.
> >
> > Do you think such a rule could be somehow useful? Is there anything
> > close to it?
>
> I don't know of anything in the stock rules, but in response to missed
> spam reported by customers:
>
> rawbody LONG_COMMENT m|<!--[^>{};]{200,}-->|
> describe LONG_COMMENT HTML comment with 200+ characters of "content"
> score LONG_COMMENT 1.25
> rawbody DUMB_COMMENT_1 m|<!--\n?\s*\d+\s*\n?-->|
> describe DUMB_COMMENT_1 HTML comment with bare chunk of numbers
> score DUMB_COMMENT_1 1.25
> rawbody DUMB_COMMENT_2 m|<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->|
> describe DUMB_COMMENT_2 HTML comment consisting of several lines of dashes
> score DUMB_COMMENT_2 1.25
> rawbody BACK2BACK_COMMENT m|--!><!--[\n\s\w]+--!><!--|
> describe BACK2BACK_COMMENT HTML structure that looks a lot like back-to-back empty comments
> score BACK2BACK_COMMENT 1.75
I see the message in question doesn't trigger these rules, since the HTML
comments are carefully interspersed throughout it: the string in each
comment is shorter than 200 characters.
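A small Python sketch of why that happens, using a transcription of the LONG_COMMENT pattern. The sample bodies are made up for illustration, and the pattern is applied to one whole string, which may differ from how SA actually feeds rawbody text to a rule.

```python
import re

# Python transcription of the LONG_COMMENT pattern quoted above.
LONG_COMMENT = re.compile(r"<!--[^>{};]{200,}-->")

filler = "word " * 12                      # 60 characters of hidden text
short_comment = "<!--" + filler + "-->"

# Hypothetical spam body: ten short comments interleaved with visible words.
interspersed = "".join(f"visible{i} {short_comment}" for i in range(10))

# For contrast: a single comment holding 250 characters.
single = "<!--" + "word " * 50 + "-->"

# Total amount of text hidden in the interspersed body.
hidden_total = sum(
    len(c) for c in re.findall(r"<!--(.*?)-->", interspersed, re.DOTALL)
)

print(LONG_COMMENT.search(interspersed) is None)   # True: no single comment is long enough
print(LONG_COMMENT.search(single) is not None)     # True: one 250-character comment matches
print(hidden_total)                                # 600
```

So 600 hidden characters slip through, because each comment is checked on its own and the `>` closing one comment ends any match attempt.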
> (Of course, now that I've published them they'll go totally useless... <g>)
Sorry for voiding your efforts... :)
Giampaolo
>
> -kgd
Re: Interesting use of html comments
Posted by Kris Deugau <kd...@vianet.ca>.
Giampaolo Tomassoni wrote:
> But, anyway, I see SA 3.3.1 comes with a very good HTMLEval plugin. However,
> it seems to lack a way to, for example, compare the length of the
> commented-out text with that of the uncommented text and trigger a rule if
> the ratio exceeds a given threshold.
>
> Do you think such a rule could be somehow useful? Is there anything close to
> it?
I don't know of anything in the stock rules, but in response to missed
spam reported by customers:
rawbody LONG_COMMENT m|<!--[^>{};]{200,}-->|
describe LONG_COMMENT HTML comment with 200+ characters of "content"
score LONG_COMMENT 1.25
rawbody DUMB_COMMENT_1 m|<!--\n?\s*\d+\s*\n?-->|
describe DUMB_COMMENT_1 HTML comment with bare chunk of numbers
score DUMB_COMMENT_1 1.25
rawbody DUMB_COMMENT_2 m|<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->|
describe DUMB_COMMENT_2 HTML comment consisting of several lines of dashes
score DUMB_COMMENT_2 1.25
rawbody BACK2BACK_COMMENT m|--!><!--[\n\s\w]+--!><!--|
describe BACK2BACK_COMMENT HTML structure that looks a lot like back-to-back empty comments
score BACK2BACK_COMMENT 1.75
(Of course, now that I've published them they'll go totally useless... <g>)
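For anyone wanting to see what the DUMB_COMMENT_1, DUMB_COMMENT_2 and BACK2BACK_COMMENT patterns catch, here is a rough Python sketch that runs transcriptions of them against a few made-up strings. The helper `hits` and all sample inputs are assumptions for illustration only; Perl and Python treat these particular regex constructs the same way, but SA's handling of rawbody text is not modelled.

```python
import re

# Python transcriptions of three of the rawbody patterns above.
RULES = {
    "DUMB_COMMENT_1": re.compile(r"<!--\n?\s*\d+\s*\n?-->"),
    "DUMB_COMMENT_2": re.compile(r"<!--\n?\s*(?:-{72}\n){2,}-+\n?\s*-->"),
    "BACK2BACK_COMMENT": re.compile(r"--!><!--[\n\s\w]+--!><!--"),
}


def hits(text: str) -> list[str]:
    """Return the names of the rules whose pattern is found in text."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]


# Hypothetical inputs, one per rule, plus an ordinary comment.
bare_number = "<!-- 48213 -->"
dash_lines = "<!--\n" + ("-" * 72 + "\n") * 2 + "-" * 10 + "\n-->"
back_to_back = "<!-- x --!><!-- abc --!><!-- y -->"
ordinary = "<!-- ordinary note -->"

for sample in (bare_number, dash_lines, back_to_back, ordinary):
    print(hits(sample))
```

Note that BACK2BACK_COMMENT looks for the `--!>` closer, not the standard `-->`, so normally closed comments placed back to back do not trigger it.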
-kgd