Posted to users@spamassassin.apache.org by Sam Gundry <sa...@its.monash.edu.au> on 2007/09/06 03:37:05 UTC

HTML to text

Hi,

Has there ever been any work on, or consideration of, converting HTML
_as it is rendered_ into plain text so that it can then be scanned
using non-HTML rules? For example, using 'w3m -dump' on Linux (although
this would probably be too slow).

Just curious, since we've received some 'job' spam in HTML that our
plain-text rules would ordinarily match if they were run against the
rendered text.
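
A minimal sketch of the idea, assuming Python's standard html.parser as
a crude stand-in for a real renderer such as w3m. The names VisibleText
and html_to_text are hypothetical and are not part of SpamAssassin; the
sketch only shows how stripping markup lets a plain-text pattern match
text that inline tags had split up.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects the text a reader would see (rough approximation only)."""

    SKIP = {"script", "style"}               # content never shown to the reader
    BREAK = {"br", "p", "div", "tr", "li"}   # tags that separate words visually

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BREAK:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BREAK:
            self.parts.append(" ")

    def handle_data(self, data):
        # Inline tags such as <b> add no whitespace, so split words rejoin.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


sample = '<p>Wo<b>rk</b> from <font color="red">ho</font>me</p><script>x</script>'
text = html_to_text(sample)
# A plain-text style rule can now be applied to the extracted text.
matched = re.search(r"work from home", text, re.IGNORECASE) is not None
```

Unlike 'w3m -dump', this does not handle tables, CSS, or hidden
elements, so it only approximates what a reader would actually see.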

Cheers,
Sam
-- 
Samuel Gundry
Messaging Administrator/Developer
Information Technology Services, Monash University