Posted to users@spamassassin.apache.org by Webmaster <we...@crystal3.com> on 2014/02/27 00:53:24 UTC

help with regex

Hi,

I need a regex to match an alphanumeric string with letters and numbers.

example:  48HQZBF404TY2298D1414BB8050022YQ3872444

The pattern is defined as:

A sequence of alphanumeric characters (letters may be upper or lower case), at least 30 characters long, containing at least 10 digits.

This part is easy enough:  [a-zA-Z0-9]{30,}

But I can't figure out how to match only if the string contains at least 10 digits.

thanks,
JC


Re: help with regex

Posted by Amir Caspi <ce...@3phase.com>.
On Feb 26, 2014, at 5:49 PM, Jeff Mincy <je...@delphioutpost.com> wrote:

> Can't you do something like this using a look ahead regexp?
> 
> (?=[A-Z0-9]{30,})(?:[A-Z]*[0-9]){10,}

According to regexpal.com, that matches the OP's example.  The pattern behaves properly in this case: requiring (say) 28 numbers fails, because the example contains only 27 digits, while requiring 10 numbers works.  As long as SA can do lookahead, I think this should work.
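
For anyone who would rather repeat that check outside regexpal.com, here is a minimal Python sketch of the same test. It assumes Python's re module handles this lookahead the same way SA's Perl regexes would for a pattern this simple. EXAMPLE and build_pattern are names made up for the illustration, not anything from SA.

```python
import re

# The example string from the original post.
EXAMPLE = "48HQZBF404TY2298D1414BB8050022YQ3872444"


def build_pattern(min_digits):
    """Hypothetical helper: the quoted pattern with the digit count as a parameter."""
    return re.compile(r"(?=[A-Z0-9]{30,})(?:[A-Z]*[0-9]){%d,}" % min_digits)


# Count the digits in the example so the threshold test below is meaningful.
digit_count = sum(ch.isdigit() for ch in EXAMPLE)

# Requiring 10 digits should match; requiring 28 should not.
matches_10 = build_pattern(10).search(EXAMPLE) is not None
matches_28 = build_pattern(28).search(EXAMPLE) is not None

print(digit_count, matches_10, matches_28)
```

Note that the example is all upper case, so this does not exercise the "upper or lower case" part of the requirement.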

I'm guessing the OP is trying to make a spam template match, much like the AC_SPAMMY_URI_PATTERNS rules.  Typically, though, I've found that there's no real need to be THAT specific when matching the template; almost no hammy emails will have alphanumeric strings of 30+ characters.  As long as the rest of the URI match is sufficiently unique, there shouldn't be a need to be overly specific about the string itself.  Of course, it all depends on the situation, which would be much easier to assess if the OP posted an example. =)

Cheers.

--- Amir


Re: help with regex

Posted by Webmaster <we...@crystal3.com>.
Thanks! That appears to work just fine; I tested it on http://regexpal.com

I will break that down and try to understand how it works.

JC

On 2/26/14 2:49 PM, Jeff Mincy wrote:
>     From: "Kevin A. McGrail" <KM...@PCCC.com>
>     Date: Wed, 26 Feb 2014 19:06:34 -0500
>     
>     On 2/26/2014 6:53 PM, Webmaster wrote:
>     > I need a regex to match an alphanumeric string with letters and numbers.
>     >
>     > example:  48HQZBF404TY2298D1414BB8050022YQ3872444
>     >
>     > The pattern is defined as:
>     >
>     > A sequence of alphanumeric characters, letters are upper or lower
>     > case, at least 30 chars long, containing at least 10 numbers.
>     >
>     > This part is easy enough:  [a-zA-Z0-9]{30,}
>     >
>     > But I can't figure out how to match only if the string contains at
>     > least 10 numbers.
>     Hmm, I think you might need a plugin for that one.
>
> Can't you do something like this using a look ahead regexp?
>
> (?=[A-Z0-9]{30,})(?:[A-Z]*[0-9]){10,}
>
> The look ahead gets the 30 chars.   Then the next part gets the 10 or
> more numbers.   You probably don't need unbounded {10,} but you do need
> the {30,} part to be unbounded.
>
> Is the 10 number part really important?
>
> -jeff
>


Re: help with regex

Posted by Jeff Mincy <je...@delphioutpost.com>.
   From: "Kevin A. McGrail" <KM...@PCCC.com>
   Date: Wed, 26 Feb 2014 19:06:34 -0500
   
   On 2/26/2014 6:53 PM, Webmaster wrote:
   > I need a regex to match an alphanumeric string with letters and numbers.
   >
   > example:  48HQZBF404TY2298D1414BB8050022YQ3872444
   >
   > The pattern is defined as:
   >
   > A sequence of alphanumeric characters, letters are upper or lower 
   > case, at least 30 chars long, containing at least 10 numbers.
   >
   > This part is easy enough:  [a-zA-Z0-9]{30,}
   >
   > But I can't figure out how to match only if the string contains at
   > least 10 numbers. 
   Hmm, I think you might need a plugin for that one.

Can't you do something like this using a look ahead regexp?

(?=[A-Z0-9]{30,})(?:[A-Z]*[0-9]){10,}

The look ahead gets the 30 chars.   Then the next part gets the 10 or
more numbers.   You probably don't need unbounded {10,} but you do need
the {30,} part to be unbounded.
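
A rough way to see the two parts at work is the Python sketch below. Two assumptions are mine rather than anything stated above: re.IGNORECASE stands in for the "upper or lower case" requirement (the pattern as written only lists A-Z), and Python's re is used as a stand-in for SA's Perl regexes. The sample strings and the names PATTERN, BOUNDED and has_token are invented for the illustration.

```python
import re

# Assumption: case-insensitive matching covers "letters are upper or lower case".
PATTERN = re.compile(r"(?=[A-Z0-9]{30,})(?:[A-Z]*[0-9]){10,}", re.IGNORECASE)

# Variant with both quantifiers bounded, for comparison on the same samples.
BOUNDED = re.compile(r"(?=[A-Z0-9]{30})(?:[A-Z]*[0-9]){10}", re.IGNORECASE)


def has_token(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


samples = {
    "original example": "48HQZBF404TY2298D1414BB8050022YQ3872444",
    "lower-case copy": "48hqzbf404ty2298d1414bb8050022yq3872444",
    "30 letters, no digits": "A" * 30,
    "10 digits, only 20 chars": "ABCDEFGHIJ0123456789",
}

results = {name: has_token(PATTERN, text) for name, text in samples.items()}
bounded_results = {name: has_token(BOUNDED, text) for name, text in samples.items()}

# The lookahead consumes nothing; the match itself ends at the last digit taken.
consumed = PATTERN.search("A" * 20 + "0123456789" + "xyz").group(0)

print(results)
print(consumed)
```

On these samples the bounded variant gives the same yes/no answers. That is what I would expect in general, since a lookahead only has to see 30 alphanumerics ahead and does not need to reach the end of the run, but treat it as a spot check rather than a proof.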

Is the 10 number part really important?

-jeff

Re: help with regex

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 2/26/2014 6:53 PM, Webmaster wrote:
> I need a regex to match an alphanumeric string with letters and numbers.
>
> example:  48HQZBF404TY2298D1414BB8050022YQ3872444
>
> The pattern is defined as:
>
> A sequence of alphanumeric characters, letters are upper or lower 
> case, at least 30 chars long, containing at least 10 numbers.
>
> This part is easy enough:  [a-zA-Z0-9]{30,}
>
> But I can't figure out how to match only if the string contains at
> least 10 numbers. 
Hmm, I think you might need a plugin for that one.