Posted to users@spamassassin.apache.org by Kris Deugau <kd...@vianet.ca> on 2014/02/04 16:55:25 UTC

Just when you think you've found a rock-solid rule...

I just received a false-positive report that comes down to a hit on
this local rule:

body    OVERSIZE_COMMENT        eval:html_text_match('comment', '(?s)^(?=.{32000})')
describe OVERSIZE_COMMENT       Excessively long HTML comment
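
For anyone puzzling over the regex: the lookahead only succeeds if at
least 32000 characters follow the start of the text, and (?s) lets "."
cross newlines. Here is a minimal sketch of the same pattern in Python,
outside SA entirely. The is_oversize() helper name is made up for
illustration, and this only mimics the regex, not how SA extracts the
comment text.

```python
import re

# Same pattern as the rule above: (?s) makes "." match newlines too, and
# the zero-width lookahead succeeds only when at least 32000 characters
# follow the start of the string.
OVERSIZE = re.compile(r"(?s)^(?=.{32000})")


def is_oversize(comment_text: str) -> bool:
    """Hypothetical helper: True when the text is 32000+ characters long."""
    return OVERSIZE.search(comment_text) is not None


if __name__ == "__main__":
    just_under = "x" * 31999
    multi_line = "x\n" * 16000  # exactly 32000 characters, with newlines
    print(is_oversize(just_under))
    print(is_oversize(multi_line))
```

Since the lookahead consumes nothing, the match is purely a length test
on whatever text is handed to it.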

I've been seeing spams up to ~750K where the bulk of the byte count is a
very long list of gibberish wrapped in one or more HTML comments. This
rule has been invaluable as one of a small handful in a stripped-down
"lean" SA instance, which files "obvious" spam before spending
processing resources scoring it at 30+ points in the full ruleset. It
also files as spam things that would never be passed to the standard
instance in the first place, as "too large".

I have now seen a (nominally) legitimate email trigger this... and I
can honestly blame Microsoft, because the >32K comments are built around
Microsoft's <!--[if gte mso 9]> hacks that provide different behaviours
for different IE or Outlook HTML rendering engines.
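
To illustrate the collision, here is a rough Python sketch, not
anything SA provides. It assumes the comment text is seen as everything
between "<!--" and "-->", and the MSO_CONDITIONAL pattern and classify()
helper are hypothetical names invented for this example. Spammers could
trivially fake the marker, so treat it as a demonstration rather than a
fix.

```python
import re

# Length test from the OVERSIZE_COMMENT rule.
OVERSIZE = re.compile(r"(?s)^(?=.{32000})")

# Hypothetical marker for Microsoft conditional comments such as
# "[if gte mso 9]>...<![endif]" (the text between "<!--" and "-->").
MSO_CONDITIONAL = re.compile(r"^\[if\s[^\]]*\bmso\b[^\]]*\]>", re.IGNORECASE)


def classify(comment_text: str) -> str:
    """Hypothetical helper: label one comment's text for illustration."""
    if not OVERSIZE.search(comment_text):
        return "ok"
    if MSO_CONDITIONAL.search(comment_text):
        return "oversize-mso"  # long, but looks like Office markup
    return "oversize"


if __name__ == "__main__":
    # Made-up sample data: Office-style XML padding vs. plain gibberish.
    mso = "[if gte mso 9]><xml>" + "<w:LsdException/>" * 3000 + "</xml><![endif]"
    junk = "qwrt zxcv " * 4000
    print(classify(mso), classify(junk), classify("short comment"))
```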

*headdesk*

-kgd