Posted to issues@lucene.apache.org by "Brian Feldman (Jira)" <ji...@apache.org> on 2021/02/01 15:00:00 UTC
[jira] [Commented] (LUCENE-9718) REGEX Pattern Search, character classes with quantifiers do not work
[ https://issues.apache.org/jira/browse/LUCENE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276387#comment-17276387 ]
Brian Feldman commented on LUCENE-9718:
---------------------------------------
{code:java}
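// Imports assumed for this snippet (Lucene automaton classes and JUnit 5).
import org.apache.lucene.util.automaton.CharacterRunAutomaton;
import org.apache.lucene.util.automaton.RegExp;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertTrue;

// Hypothetical wrapper class so the snippet compiles on its own.
class LuceneRegexTest {

    /**
     * Lucene/Automaton regex check.
     *
     * @param regex      the regular expression, in Lucene RegExp syntax
     * @param checkValue the string to test against the expression
     * @return true if the automaton accepts the whole of checkValue
     */
    public boolean luceneRegexCheck(String regex, String checkValue) {
        // Alternative using the dk.brics.automaton library directly:
        //   import dk.brics.automaton.RegExp;
        //   import dk.brics.automaton.RunAutomaton;
        //   RegExp re = new RegExp(regex);
        //   RunAutomaton ra = new RunAutomaton(re.toAutomaton());
        //   return ra.run(checkValue);
        CharacterRunAutomaton automaton = new CharacterRunAutomaton(new RegExp(regex).toAutomaton());
        return automaton.run(checkValue);
    }

    // Explicit character class with a quantifier (a workaround form from the issue description).
    @Test
    void REGEXTEST() {
        String regex = "[0-9]{2,3}";
        String regexMatches = "11";
        // Lucene Automaton Regex
        assertTrue(luceneRegexCheck(regex, regexMatches), "Lucene Regex Failed to Match");
    }

    // Shorthand class \d with a quantifier (the form the issue reports as not matching).
    @Test
    void REGEXTEST2() {
        String regex = "\\d{2,3}";
        String regexMatches = "11";
        // Lucene Automaton Regex
        assertTrue(luceneRegexCheck(regex, regexMatches), "Lucene Regex Failed to Match");
    }
}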
{code}
> REGEX Pattern Search, character classes with quantifiers do not work
> --------------------------------------------------------------------
>
> Key: LUCENE-9718
> URL: https://issues.apache.org/jira/browse/LUCENE-9718
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 7.7.3, 8.6.3
> Reporter: Brian Feldman
> Priority: Minor
>
> Character classes with a quantifier do not work: no error is given and no results are returned. For example, \d{2} or \d{2,3}, as commonly written in most languages that support regular expressions, silently fails to match. A user workaround is to write them out in full, such as \d\d or [0-9][0-9], or as [0-9]{2,3}.
>
> This inconsistency or limitation is not documented, which wastes users' time because they have to work it out themselves. I believe it should be clearly documented, and fixing the inconsistency would improve pattern searching.
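
The workaround described above (writing \d as an explicit [0-9] range) could be applied mechanically before a pattern is handed to Lucene. The following is only a minimal sketch under stated assumptions: `DigitShorthandWorkaround` and `expandDigitShorthand` are hypothetical names, not part of Lucene, and the sanity check uses java.util.regex rather than Lucene's RegExp, purely to show that the rewritten pattern means the same thing in an engine that does understand \d.

```java
import java.util.regex.Pattern;

public class DigitShorthandWorkaround {

    /**
     * Hypothetical helper (not a Lucene API): rewrites each \d shorthand
     * into an explicit 0-9 range and leaves every other escape untouched.
     */
    static String expandDigitShorthand(String regex) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while scanning between '[' and ']'
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                char next = regex.charAt(++i);
                if (next == 'd') {
                    // Inside [...] only the range is needed; outside, add the brackets.
                    out.append(inClass ? "0-9" : "[0-9]");
                } else {
                    // Any other escape, including an escaped backslash, is copied as-is.
                    out.append(c).append(next);
                }
            } else {
                if (c == '[' && !inClass) {
                    inClass = true;
                } else if (c == ']' && inClass) {
                    inClass = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "\\d{2,3}";
        String rewritten = expandDigitShorthand(original);
        System.out.println(rewritten);                          // prints [0-9]{2,3}
        // java.util.regex is used only as a reference engine here, not Lucene.
        System.out.println(Pattern.matches(original, "11"));    // prints true
        System.out.println(Pattern.matches(rewritten, "11"));   // prints true
    }
}
```

The rewritten string is the [0-9]{2,3} form that the issue description lists as working, so it could then be passed to a check such as luceneRegexCheck from the comment above. The sketch deliberately handles only \d; other shorthands such as \w or \s would need their own, separately verified, expansions.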