You are viewing a plain text version of this content. The canonical link for it is here.
Posted to regexp-user@jakarta.apache.org by ey...@post.tau.ac.il on 2011/03/03 10:33:27 UTC

Safe Regexep Engine - Poly time

To Jakarta Regexp Project,

My partner and I are first degree cs students in TAU and in our  
Workshop we needed to fix
Jakarta's some time exponnential run time. We added to the 1.5 release  
a new Class fully
adapted with compiler and debugger, that runs at poly time at the  
expense of not
supporting backreference.

We Improved the Compiler to fully build the state machine when in  
"Safe" mode. After that
the matching function was implemented according to Thompson's NFA  
which assures
polynomial run time. here is a link to Thompson's Documents:
http://swtch.com/~rsc/regexp/regexp1.html

The "Safe" RE is fully working and is passing all of the inlibrary  
tests as in RETest
class. It can not suppurt backreference and always match the shortest  
known string,
however suppurts reluctanct and greedy opertors when have to: as in ^  
and $. behaves
exsactly as original RE including operators, greedy and parenthesis.

Another thing is that we added both in RE and Safe RE support for  
reluctant {m,n}, an
issue needed solving on the original library.

Our Addition to the library is fully working, ducumented and accept on  
backreference
operators even runs faster than the original. A user can choose if to  
use original RE os
the Safe one. So the original library is still intact and working.

We would like to help your project and submit our addition to the  
regexp engine. In order
to do that we need to know if there is any interest in our addition,  
and what are the
formalities in these cases?

Thanks for reading,
Eyal Itkin & Tomer Levy, Tel Aviv University.


---------------------------------------------------------------------
To unsubscribe, e-mail: regexp-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: regexp-user-help@jakarta.apache.org