Posted to commits@lucene.apache.org by mh...@apache.org on 2020/05/14 10:52:09 UTC

[lucene-solr] branch master updated: Lucene-9336: Changes.txt and migrate.md addition for RegExp enhancements (#1515)

This is an automated email from the ASF dual-hosted git repository.

mharwood pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git


The following commit(s) were added to refs/heads/master by this push:
     new 18bd297  Lucene-9336: Changes.txt and migrate.md addition for RegExp enhancements (#1515)
18bd297 is described below

commit 18bd29715a35e9e0a36766eb74a23613dc7e4622
Author: markharwood <ma...@gmail.com>
AuthorDate: Thu May 14 11:51:59 2020 +0100

    Lucene-9336: Changes.txt and migrate.md addition for RegExp enhancements (#1515)
    
    Added notes for new \w \s etc support
---
 lucene/CHANGES.txt | 4 ++++
 lucene/MIGRATE.md  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index 29c7d32..2f87a5d 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -60,6 +60,10 @@ API Changes
 
 Improvements
 
+* LUCENE-9336: RegExp query now supports \w \W \d \D \s \S expressions.
+  This is a break with previous behaviour where these were (mis)interpreted
+  as literally the characters w W d etc. (Mark Harwood)
+
 * LUCENE-8757: When provided with an ExecutorService to run queries across
   multiple threads, IndexSearcher now groups small segments together, up to
   250k docs per slice. (Atri Sharma via Adrien Grand)
diff --git a/lucene/MIGRATE.md b/lucene/MIGRATE.md
index db188be..0956c8e 100644
--- a/lucene/MIGRATE.md
+++ b/lucene/MIGRATE.md
@@ -1,5 +1,9 @@
 # Apache Lucene Migration Guide
 
+## RegExp: certain regular expressions now match differently (LUCENE-9336)
+
+The commonly used regular expressions \w, \W, \d, \D, \s and \S now behave as they do in [Java Pattern](https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html#CHART) matching. Previously, these expressions were (mis)interpreted as searches for the literal characters w, d, s, etc.
+
 ## NGramFilterFactory "keepShortTerm" option was fixed to "preserveOriginal" (LUCENE-9259)
 
 The factory option name to output the original term was corrected in accordance with its Javadoc.
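
The LUCENE-9336 migration note above says the shorthand classes now behave as they do in Java Pattern matching. The sketch below illustrates that behaviour using only java.util.regex.Pattern from the JDK, not Lucene's own RegExp class. The class name RegExpShorthandDemo and the helper matchesWholeTerm are hypothetical, and the "old" interpretation is only emulated here by dropping the backslash, which is an assumption based on the wording of the note.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; it does not use or reproduce Lucene's RegExp implementation.
public class RegExpShorthandDemo {

    // Pattern.matches requires the whole input to match, which mirrors how a
    // regular expression query has to match a whole term rather than a substring.
    static boolean matchesWholeTerm(String regex, String term) {
        return Pattern.matches(regex, term);
    }

    public static void main(String[] args) {
        // New interpretation: \d is the digit class.
        System.out.println(matchesWholeTerm("\\d+", "2020"));
        System.out.println(matchesWholeTerm("\\d+", "ddd"));

        // Emulated old interpretation (assumption): \d treated as the literal letter d.
        System.out.println(matchesWholeTerm("d+", "ddd"));

        // The other shorthand classes named in the note.
        System.out.println(matchesWholeTerm("\\w+", "foo_42"));
        System.out.println(matchesWholeTerm("\\W", "-"));
        System.out.println(matchesWholeTerm("\\s", " "));
        System.out.println(matchesWholeTerm("\\S+", "a b"));
        System.out.println(matchesWholeTerm("\\D+", "abc"));
    }
}
```

Anyone whose existing queries relied on \w, \d or \s matching the literal letters would presumably need to drop the backslash to keep the old matches, as the emulated case suggests.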