Posted to users@spamassassin.apache.org by Christopher Martin <ch...@ebit.com.au> on 2006/10/24 07:08:16 UTC

Regex for words written over multiple lines

Hi,

SpamAssassin has for a long time been picking up e-mails with content 
like the following (I have changed a few letters to prevent Bayesian 
stuff from picking it up), but it's always based on URIs, HTML structure 
and such, rather than on a plain text match on the body.

 V           LOST PRCE           C
 T         TOP QUITY          N
A         FAT DEVERY WORLDWIDE         A
O         MEY BK GUANTEE         L
R         CETELY SECURE         N
A         Visit our sh op: HERE         S

In theory I could use the following to detect it:

/C\n+[a-zA-Z\s]+I\n+[a-zA-Z\s]+A\n+[a-zA-Z\s]+L\n+[a-zA-Z\s]+I\n+[a-zA-Z\s]+S\n/e
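To make the idea concrete, here is a minimal sketch in Python rather than as an actual SpamAssassin rule. Python's re module stands in for the Perl regexes a rule would use, so this is an assumption-laden illustration and has not been tested inside SpamAssassin. The function name vertical_word_pattern, the sample text and the target words are all made up for the example. It anchors each letter to the start or end of its own line, tolerates leading or trailing spaces, and requires filler text on the same line.

```python
import re

# Hypothetical stand-in for the layout above; the words "SPAM" (right column)
# and "XYZW" (left column) and the filler text are invented for illustration.
SAMPLE = (
    " X           some text           S\n"
    " Y         more text          P\n"
    "Z         even more text         A\n"
    "W         last line here         M\n"
)


def vertical_word_pattern(word, side="right"):
    """Build a regex matching `word` spelled down one edge of consecutive lines.

    side="right": each letter is the last non-blank character on its line.
    side="left":  each letter is the first non-blank character on its line.
    """
    per_line = []
    for ch in word:
        letter = re.escape(ch)
        if side == "right":
            # Any filler, at least one blank, the letter, optional trailing blanks.
            per_line.append(r"^[^\n]*[ \t]" + letter + r"[ \t]*$")
        else:
            # Optional leading blanks, the letter, at least one blank, any filler.
            per_line.append(r"^[ \t]*" + letter + r"[ \t][^\n]*$")
    # One newline between lines, so the letters must sit on consecutive lines.
    return re.compile(r"\n".join(per_line), re.MULTILINE)


right = vertical_word_pattern("SPAM", side="right")
left = vertical_word_pattern("XYZW", side="left")

print(bool(right.search(SAMPLE)))
print(bool(left.search(SAMPLE)))
# A bare column of letters with no filler text does not match.
print(bool(right.search("S\nP\nA\nM\n")))
```

Requiring a blank between the letter and the filler on every line is one way to cut down on false positives, since ordinary paragraphs rarely end several consecutive lines with an isolated capital letter.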

Is there a better way? Can I use a similar rule for the other word, 
and can I get around the leading-space issue? Is there any way of making 
it safer (less likely to generate false positives)?

Thanks!

Chris M