Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/08/12 20:26:52 UTC

[Bug 3680] New: Empty A HREF tag obfuscations

http://bugzilla.spamassassin.org/show_bug.cgi?id=3680

           Summary: Empty A HREF tag obfuscations
           Product: Spamassassin
           Version: 2.64
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Rules
        AssignedTo: spamassassin-dev@incubator.apache.org
        ReportedBy: hsteoh@debian.org


Hi,

I've found another HTML obfuscation technique in recent spam.
The following is a snippet from an HTML spam I received:

----BEGIN QUOTE----
<font face="verdana" size="+3">A permanent fix to Pe<a href></a>nis Enla<a 
href></a>rgement</font>
----END QUOTE----

As you can see, the <a href></a> tag is being used to obfuscate key spam words
in the message. Apparently, the spammers have caught on to the backhair.cf
SA ruleset, which catches the use of embedded non-existent tags to hide spam
words, so now they are using legal tags that have no effect instead.

As a pre-emptive measure, perhaps SA should strip out all empty tags (i.e.,
anything of the form <tagname ...></tagname> with nothing in between)
before passing the message text to the text analysis rules.
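
To make the stripping idea concrete, here is a rough sketch in Python rather
than SA's own Perl. The helper name strip_empty_tags and the EMPTY_PAIR
pattern are made up for illustration and are not part of SpamAssassin; a
regex like this is only an approximation of real HTML parsing.

```python
import re

# Hypothetical helper, not SpamAssassin code: matches an opening tag followed
# (after optional whitespace only) by its own closing tag, e.g. <a href></a>.
EMPTY_PAIR = re.compile(r"<([a-z][a-z0-9]*)\b[^>]*>\s*</\1\s*>", re.IGNORECASE)

def strip_empty_tags(html: str) -> str:
    """Remove empty element pairs until none are left."""
    previous = None
    while previous != html:  # repeat so nested empties like <b><i></i></b> collapse
        previous = html
        html = EMPTY_PAIR.sub("", html)
    return html

# The obfuscated words from the quoted spam, including the wrapped tag.
sample = "Pe<a href></a>nis Enla<a \nhref></a>rgement"
print(strip_empty_tags(sample))
```

Once the empty pairs are gone, the ordinary body rules would see the
de-obfuscated words. Tags with real content between them are left alone.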

More specifically, there should also be a rule that penalizes the use of such
tags. Here are two rules I wrote to catch suspicious <A HREF> tags:

rawbody   EMPTY_LINK_2  /\<a\b[^\>]*\bhref\s*([^=\s]|=\s*\"\s*\")/i
describe  EMPTY_LINK_2  HTML anchor has empty or missing HREF value

rawbody   EMPTY_LINK_3  /\<a\b[^\>]*\>\s*\<\/a\>/i
describe  EMPTY_LINK_3  HTML anchor has empty text body
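
One way to sanity-check patterns like these outside SA is to run them over a
few sample lines. The sketch below uses Python's re module, whose syntax is
close enough to Perl's for these patterns; the hits helper is hypothetical.
The EMPTY_LINK_2 pattern here uses [^=\s] after "href" so that whitespace
before the "=" is not mistaken for a missing value, which is an assumption
about the intended behaviour.

```python
import re

# Python transcriptions of the two patterns (assumed to behave like the Perl
# regexes on inputs such as these). [^=\s] keeps 'href = "..."' from matching.
EMPTY_LINK_2 = re.compile(r'<a\b[^>]*\bhref\s*(?:[^=\s]|=\s*"\s*")', re.IGNORECASE)
EMPTY_LINK_3 = re.compile(r'<a\b[^>]*>\s*</a>', re.IGNORECASE)

def hits(line: str) -> list:
    """Return the names of the patterns that match the given line."""
    names = []
    if EMPTY_LINK_2.search(line):
        names.append("EMPTY_LINK_2")
    if EMPTY_LINK_3.search(line):
        names.append("EMPTY_LINK_3")
    return names

print(hits('Pe<a href></a>nis'))                         # obfuscation from the spam
print(hits('<a href = "http://example.com/">link</a>'))  # ordinary link
print(hits('<a href="">click</a>'))                      # empty HREF value only
```

The first line should trigger both patterns, the ordinary link neither, and
the last one only EMPTY_LINK_2.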


I know we're trying to move away from rawbody tests, but I don't know SA's
HTML-parsing code well enough to write an eval test.


