Posted to users@spamassassin.apache.org by Sam Gundry <sa...@its.monash.edu.au> on 2007/09/06 03:37:05 UTC
HTML to text
Hi,
Has there ever been any work on, or consideration of, converting HTML
_as it is rendered_ into plain text so that it can then be scanned
using non-HTML rules? For example, by using 'w3m -dump' on Linux
(although that would probably be too slow).
Just curious, since we've received some 'job' spam in HTML which we
would ordinarily match if the rendered text were plain text.
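
To illustrate the kind of conversion meant here, below is a minimal sketch
in Python using only the standard library's html.parser. It is not how
SpamAssassin or w3m work; the names VisibleText and html_to_text are
made up for this example, and it ignores CSS, tables and anything else a
real renderer would handle. It only drops tags and script/style bodies
and puts line breaks around a few block-level elements.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Hypothetical helper: collect text, skipping script/style bodies."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "tr", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            # Block-level elements start a new line in the output.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Inline tags such as <b> or <span> add nothing, so text that
        # was split across them is joined back together.
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return an approximation of the visible text of an HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

With something like this, a word broken up by empty tags (for example
"j<span></span>ob") comes out as "job", which a plain-text rule could
then match.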
Cheers,
Sam
--
Samuel Gundry
Messaging Administrator/Developer
Information Technology Services, Monash University