Posted to dev@lucene.apache.org by "Martin Braun (JIRA)" <ji...@apache.org> on 2016/08/23 08:42:22 UTC

[jira] [Updated] (LUCENE-7411) Regex Query with Backreferences

     [ https://issues.apache.org/jira/browse/LUCENE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Braun updated LUCENE-7411:
---------------------------------
    Description: 
Hi there,

I am currently working on a regex engine that supports backreferences without losing determinism. It is built on Memory Occurrence Automata (MOAs), which are more powerful than ordinary DFAs/NFAs. The engine does no backtracking and rejects regexes that cannot be evaluated deterministically as malformed. It has matured considerably over the last few weeks, and I have also implemented a Lucene Query that uses these patterns in the background. Now my question is: is there any interest in merging (or adapting) this work into Lucene core?
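
For readers unfamiliar with backreferences, here is a minimal sketch of what one looks like. It uses the JDK's java.util.regex rather than the engine from this issue, so it is evaluated by a backtracking matcher and says nothing about how MOAs work internally. The class and method names are made up for illustration. The pattern accepts strings of the form w-w, a language that is not regular and therefore out of reach for a plain DFA/NFA.

```java
import java.util.regex.Pattern;

// Illustration only: JDK regex, not the MOA-based engine discussed in this issue.
class BackreferenceDemo {
    // (\w+) captures a word; \1 requires exactly the same text to appear again.
    private static final Pattern DOUBLED = Pattern.compile("(\\w+)-\\1");

    // Returns true if the whole input has the form "<word>-<same word>".
    static boolean isDoubled(String input) {
        return DOUBLED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDoubled("foo-foo")); // true
        System.out.println(isDoubled("foo-bar")); // false
    }
}
```

The second call fails because the text after the hyphen does not repeat the captured group.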

EDIT:

The current state is only a proof of concept. Performance can probably be improved considerably by adopting concepts from the Lucene Regexp Query. As Uwe Schindler correctly pointed out, the Query is currently quite "dumb" in that it does not predict which terms to match next.
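
To make the difference concrete, here is a rough sketch of the two strategies. It is not the code of the Query in this issue and not Lucene's implementation: a sorted TreeSet stands in for the term dictionary, java.util.regex stands in for the pattern engine, and the literal prefix is passed in by hand instead of being derived from the pattern. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Illustration only: a TreeSet plays the role of a sorted term dictionary.
class TermScanDemo {
    // "Dumb" strategy: test every term in the dictionary against the pattern.
    static List<String> scanAll(NavigableSet<String> terms, Pattern pattern) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Prefix-aware strategy: seek to the first term >= prefix and stop as soon
    // as a term no longer starts with it. In a sorted set, all terms sharing a
    // prefix are contiguous, so nothing outside that range can match.
    static List<String> scanWithPrefix(NavigableSet<String> terms, String prefix, Pattern pattern) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (pattern.matcher(term).matches()) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(List.of("aacc", "abcc", "abcd", "abxyxy", "zzz"));
        Pattern pattern = Pattern.compile("ab(\\w+)\\1"); // literal prefix "ab", then a doubled word
        System.out.println(scanAll(terms, pattern));
        System.out.println(scanWithPrefix(terms, "ab", pattern));
    }
}
```

Both methods return the same hits, but the second one never looks at "aacc" or "zzz". An automaton-based query can go further and skip within the dictionary based on the automaton's state, which is the kind of prediction the current proof of concept lacks.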

https://github.com/s4ke/moar

Usage example for the Lucene Query:

https://github.com/s4ke/moar/blob/master/lucene/src/test/java/com/github/s4ke/moar/lucene/query/test/MoarQueryTest.java#L126

Cheers,

Martin

  was:
Hi there,

I am currently working on a regex engine that supports backreferences without losing determinism. It is built on Memory Occurrence Automata (MOAs), which are more powerful than ordinary DFAs/NFAs. The engine does no backtracking and rejects regexes that cannot be evaluated deterministically as malformed. It has matured considerably over the last few weeks, and I have also implemented a Lucene Query that uses these patterns in the background. Now my question is: is there any interest in merging (or adapting) this work into Lucene core?

https://github.com/s4ke/moar

Usage example for the Lucene Query:

https://github.com/s4ke/moar/blob/master/lucene/src/test/java/com/github/s4ke/moar/lucene/query/test/MoarQueryTest.java#L126

Cheers,

Martin


> Regex Query with Backreferences
> -------------------------------
>
>                 Key: LUCENE-7411
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7411
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>            Reporter: Martin Braun
>            Priority: Minor


