Posted to dev@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2005/12/28 16:14:16 UTC

regex queries

As I promised earlier, in order to leverage 3rd party regex
implementations I have moved the (Span)RegexQuery code out of the
core and into contrib/regex.  The implementation is vastly different
in that I've abstracted out the regex implementation and provided one
each for JDK 1.4 java.util.regex and Jakarta Regexp.  Jakarta Regexp
has a nice feature: it can report a pattern's static prefix, which
makes it possible to short-circuit term enumeration.  In my project,
I've created a custom implementation that uses Jakarta Regexp for the
prefix but java.util.regex for the actual matching in order to have
the best of both worlds.
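
A minimal sketch of the prefix idea described above, not the actual
contrib/regex code. The names RegexEngine, JavaUtilRegexEngine,
literalPrefix and matchingTerms are hypothetical, a TreeSet stands in
for the index's sorted term dictionary, and the prefix is guessed by a
deliberately conservative helper rather than taken from Jakarta Regexp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class RegexPrefixSketch {

    /** Hypothetical engine abstraction (illustrative, not the contrib/regex interface). */
    public interface RegexEngine {
        void compile(String regex);
        boolean match(String term);
        /** Literal prefix every matching term must start with; "" if unknown. */
        String prefix();
    }

    /** Engine backed by java.util.regex, with a conservative prefix guess. */
    public static class JavaUtilRegexEngine implements RegexEngine {
        private Pattern pattern;
        private String prefix = "";

        @Override
        public void compile(String regex) {
            pattern = Pattern.compile(regex);
            prefix = literalPrefix(regex);
        }

        @Override
        public boolean match(String term) {
            return pattern.matcher(term).matches(); // whole-term match
        }

        @Override
        public String prefix() {
            return prefix;
        }

        /** Assumption: only leading letters/digits count as a literal prefix. */
        public static String literalPrefix(String regex) {
            if (regex.indexOf('|') >= 0) {
                return ""; // alternation: give up rather than guess
            }
            int end = 0;
            while (end < regex.length() && Character.isLetterOrDigit(regex.charAt(end))) {
                end++;
            }
            // A following *, ? or { may make the last literal optional, so drop it.
            if (end > 0 && end < regex.length() && "*?{".indexOf(regex.charAt(end)) >= 0) {
                end--;
            }
            return regex.substring(0, end);
        }
    }

    /** Returns matching terms; examined[0] is incremented once per term tested. */
    public static List<String> matchingTerms(TreeSet<String> terms, RegexEngine engine, int[] examined) {
        List<String> hits = new ArrayList<>();
        String prefix = engine.prefix();
        // Seek to the prefix, then stop at the first term that no longer shares it.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            examined[0]++;
            if (engine.match(term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "apple", "lucene", "lucid", "lucky", "luxury", "solr", "zebra"));
        RegexEngine engine = new JavaUtilRegexEngine();
        engine.compile("luc.*e");
        int[] examined = new int[1];
        List<String> hits = matchingTerms(terms, engine, examined);
        System.out.println("prefix=" + engine.prefix() + " hits=" + hits
                + " examined=" + examined[0] + " of " + terms.size());
    }
}
```

With the sample terms, only the three terms starting with "luc" are
tested against the regex; the loop stops at "luxury". A hybrid engine
along the lines described above would keep match() as is and delegate
prefix() to a library that computes it properly.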

I may have over-engineered it a bit, though I'm not sure.  I'm in the  
process of documenting beyond just the unit tests, and likely will  
also document how to use regex queries along with term rotation in  
order to really minimize term rotation - though I'm still working on  
this feature in my current project.

Yonik may cringe at the .equals/.hashCode that I let IntelliJ  
generate - sorry.  Let me know if you have suggestions for  
improvement in those.

I doubt that moving this out of the core has caused anyone much
trouble; my apologies if it has.  By all means let me know if you
have any issues or suggestions.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: regex queries

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Dec 28, 2005, at 10:14 AM, Erik Hatcher wrote:

> As I promised earlier, in order to leverage 3rd party regex  
> implementations I have removed the (Span)RegexQuery code from the  
> core and into contrib/regex.  The implementation is vastly  
> different in that I've abstracted out the regex implementation and  
> provided one for JDK 1.4 java.util.regex and Jakarta Regexp.   
> Jakarta Regexp has a nice feature of being able to provide the  
> static prefix in order to short-circuit term enumeration.  In my  
> project, I've created a custom implementation that uses Jakarta  
> Regexp for the prefix but java.util.regex for the actual matching  
> in order to have the best of both worlds.
>
> I may have over-engineered it a bit, though I'm not sure.  I'm in  
> the process of documenting beyond just the unit tests, and likely  
> will also document how to use regex queries along with term  
> rotation in order to really minimize term rotation - though I'm  
> still working on this feature in my current project.

Oops... typo.... "to really minimize term _enumeration_"


