Posted to dev@spamassassin.apache.org by Jesse Houwing <j....@student.utwente.nl> on 2004/02/25 18:40:51 UTC
html_tag_balance on html and head tags
Hello all,
I've recently started building my own custom ruleset to catch some of the
spam that has eluded the university's SpamAssassin filter. Recently a
lot of the messages I get have unbalanced html and head tags. Some even
start with </html> and </head>.
I tried to use eval:html_tag_balance('html' '<0') in a rule to test against
this, but somehow this never triggers. The same is true for the head tag.
I've included a test message on the following wiki page:
http://www.exit0.us/index.php/UnBalancedHTMLorHEADtags
I was wondering if it is a problem with SpamAssassin, or if I'm just on the
wrong track here.
Jesse
Re: html_tag_balance on html and head tags
Posted by Daniel Quinlan <qu...@pathname.com>.
"Jesse Houwing" <j....@student.utwente.nl> writes:
> I've recently started building my own custom ruleset to catch some of the
> spam that has eluded the university's SpamAssassin filter. Recently a
> lot of the messages I get have unbalanced html and head tags. Some even
> start with </html> and </head>.
>
> I tried to use eval:html_tag_balance('html' '<0') in a rule to test against
> this, but somehow this never triggers. The same is true for the head tag.
>
> I've included a test message on the following wiki page:
> http://www.exit0.us/index.php/UnBalancedHTMLorHEADtags
>
> I was wondering if it is a problem with SpamAssassin, or if I'm just on the
> wrong track here.
(Unfortunately?) html_tag_balance only checks for tags that were opened but
never closed, not for tags closed without being opened. It relies on the
same code that tries to determine when the parser is between two tags (for
example, between <title> and </title>).
The code should not be too complicated to follow. Look at HTML.pm and
maybe EvalTests.pm.
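
To illustrate why a '<0' test might never fire, here is a toy model in
Python. It is not the code from HTML.pm. The function name tag_balance, the
regular expression, and in particular the clamp-at-zero behaviour are
assumptions made for this sketch, chosen to match the description above (a
counter that tracks "inside a tag" can only register tags that were opened
and not closed).

```python
import re
from collections import defaultdict

# Hypothetical, simplified tag matcher: captures an optional "/" and the tag name.
TAG_RE = re.compile(r"<\s*(/?)\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def tag_balance(html, clamp=True):
    """Return {tag: opens minus closes} for every tag seen in html.

    With clamp=True (an ASSUMPTION about how an "am I inside this tag?"
    counter behaves), the count is never allowed to drop below zero, so a
    stray closing tag leaves no trace.
    """
    balance = defaultdict(int)
    for closing, name in TAG_RE.findall(html):
        name = name.lower()
        balance[name] += -1 if closing else 1
        if clamp and balance[name] < 0:
            balance[name] = 0  # a close without a matching open is ignored
    return dict(balance)


# A message that starts with stray closing tags, as in the report.
sample = "</html></head><body>spam</body>"

clamped = tag_balance(sample)                 # html and head stay at 0
unclamped = tag_balance(sample, clamp=False)  # html and head reach -1

# A '<0' comparison can only succeed against the unclamped counts.
triggers_clamped = any(v < 0 for v in clamped.values())
triggers_unclamped = any(v < 0 for v in unclamped.values())
```

Under these assumptions, a rule comparing the balance against '<0' has
nothing to match in the clamped case, while a '>0' comparison still catches
tags that were opened and never closed.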
Daniel
--
Daniel Quinlan anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/ and open source consulting