Posted to regexp-user@jakarta.apache.org by Giovanni Azua <ga...@kdlabs.com> on 2004/11/26 19:44:28 UTC

optimizing a "thesaurus" or patterns set ...

Hello all,

Using regular expressions to transform text with
a few patterns is not a big deal, but performance problems
are sure to show up when a transformation involves
many patterns, e.g. more than 2000. Applying them
sequentially, when many of them will simply not match,
is a performance killer.

My question is: does anyone know a way to
automatically prune or merge a set of patterns
into a structure suitable for fast searching,
e.g. a search tree? For example, given 2000 regular
expressions with associated replacements, build
a "transformation tree" that will hopefully have
a very low height.
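
To make the idea concrete, here is a rough sketch of the simplest kind
of merging I can think of: joining all patterns into one alternation so
the input is scanned once instead of once per pattern. It uses
java.util.regex rather than Jakarta Regexp, and the class name
MergedRules and its methods are made up for illustration. It assumes
the individual patterns contain no capturing groups of their own and
that the replacements are literal strings. It is not a real search
tree: a backtracking engine still tries the alternatives in order at
each position, so this only removes the repeated passes over the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: merges many rules into one alternation so the
// input is scanned in a single pass.
final class MergedRules {
    private final List<String> replacements = new ArrayList<>();
    private final Pattern merged;

    // patterns[i] is replaced by literals[i]. Assumption: no pattern
    // contains capturing groups, otherwise the group numbering used in
    // apply() no longer lines up with the rule index.
    MergedRules(String[] patterns, String[] literals) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < patterns.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append('(').append(patterns[i]).append(')'); // group i + 1
            replacements.add(literals[i]);
        }
        merged = Pattern.compile(sb.toString());
    }

    String apply(String input) {
        Matcher m = merged.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the unmatched text between the previous match and this one.
            out.append(input, last, m.start());
            // The group that participated in the match identifies the rule.
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    out.append(replacements.get(g - 1));
                    break;
                }
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

For example, new MergedRules(new String[] {"colou?r", "\\bcat\\b"},
new String[] {"hue", "dog"}).apply("the cat has colour") should give
"the dog has hue". Note that this is not equivalent to applying the
rules one after another: here at most one rule fires at each position,
and the output of one rule is never fed to the next. Finding the
matching group is also a linear scan over all rules, so with 2000
patterns that lookup would probably need something smarter.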

Thanks in advance,
Best Regards,
Giovanni

Giovanni Azua
Software Engineer
kdlabs AG
www.kdlabs.com
Flurstrasse 32
Zürich CH-8048
Phone: +4114056619
Mobile: +41788899369

