Posted to dev@lucene.apache.org by bu...@apache.org on 2004/03/06 09:36:05 UTC
DO NOT REPLY [Bug 27491] New: - [PATCH] Allowing '-'/'+' in terms
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27491
Summary: [PATCH] Allowing '-'/'+' in terms
Product: Lucene
Version: unspecified
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: Other
Component: QueryParser
AssignedTo: lucene-dev@jakarta.apache.org
ReportedBy: morus.walter@gmx.de
I suggest changing the definition of a term character in QueryParser.jj
from
| <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> ) >
to
| <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> | "-" | "+" ) >
As a result, the query parser will read '-' and '+' within words (such as
tft-monitor or Sysh1-1) as part of a single term. That term is then tokenized
by the analyzer in use and ends up as a term query or a phrase query, depending
on whether the analyzer produces one token or several.
So with StandardAnalyzer, the query tft-monitor would become the phrase query
"tft monitor", and Sysh1-1 would become a term query for "Sysh1-1".
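The term-versus-phrase decision described above can be sketched roughly as
follows. This is only a toy model: toyAnalyze is a hypothetical stand-in that
crudely imitates the behavior reported here for StandardAnalyzer (keep a
hyphenated word whole if it contains a digit, otherwise split it), and it omits
lowercasing. The class and method names are invented for illustration and are
not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

public class TermOrPhraseSketch {

    // Hypothetical toy analyzer, NOT StandardAnalyzer: a word containing a
    // digit is kept as one token; otherwise it is split at '-' and '+'.
    public static List<String> toyAnalyze(String termText) {
        List<String> tokens = new ArrayList<>();
        boolean hasDigit = termText.chars().anyMatch(Character::isDigit);
        if (hasDigit) {
            tokens.add(termText);
            return tokens;
        }
        for (String piece : termText.split("[-+]")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // One token yields a term query, several tokens yield a phrase query.
    public static String describeQuery(String termText) {
        List<String> tokens = toyAnalyze(termText);
        if (tokens.isEmpty()) {
            return "no query";
        }
        if (tokens.size() == 1) {
            return "term query: " + tokens.get(0);
        }
        return "phrase query: \"" + String.join(" ", tokens) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(describeQuery("tft-monitor")); // phrase query: "tft monitor"
        System.out.println(describeQuery("Sysh1-1"));     // term query: Sysh1-1
    }
}
```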
Searching for tft-monitor as the phrase "tft monitor" is not exact, but it is
the best approximation possible once tft-monitor has been indexed as the tokens
tft and monitor.
Currently the query parser interprets every '-' or '+' as an operator, which
means that 'tft-monitor' gets parsed as tft AND NOT monitor, which probably
isn't what the user wanted.
The effect of a '-' or '+' that does not occur within a word is unchanged, so
tft -monitor will still search for 'tft AND NOT monitor'.
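To illustrate the difference between a sign at the start of a term and a sign
inside a term, here is a minimal hand-written scanner. It is a sketch under
simplifying assumptions, not the parser generated from QueryParser.jj: the
class TermCharSketch, the scan method and the allowInTerm flag are hypothetical
names, and everything except whitespace, '-' and '+' is treated as a term
character.

```java
import java.util.ArrayList;
import java.util.List;

public class TermCharSketch {

    // Toy scanner: a '-' or '+' at the start of a term is always an operator.
    // Inside a term it is kept as part of the term only when allowInTerm is
    // true, which models the proposed _TERM_CHAR definition.
    public static List<String> scan(String query, boolean allowInTerm) {
        List<String> out = new ArrayList<>();
        StringBuilder term = new StringBuilder();
        for (char c : query.toCharArray()) {
            boolean sign = (c == '-' || c == '+');
            if (Character.isWhitespace(c)) {
                flush(term, out);
            } else if (sign && (term.length() == 0 || !allowInTerm)) {
                flush(term, out);
                out.add("OP(" + c + ")");
            } else {
                term.append(c);
            }
        }
        flush(term, out);
        return out;
    }

    // Emits the pending term, if any, and resets the buffer.
    private static void flush(StringBuilder term, List<String> out) {
        if (term.length() > 0) {
            out.add("TERM(" + term + ")");
            term.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("tft-monitor", false)); // [TERM(tft), OP(-), TERM(monitor)]
        System.out.println(scan("tft-monitor", true));  // [TERM(tft-monitor)]
        System.out.println(scan("tft -monitor", true)); // [TERM(tft), OP(-), TERM(monitor)]
    }
}
```

With allowInTerm set to false the scanner mimics the current behavior, where
every sign is an operator; with true, only a sign that begins a term is.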
All regression tests pass with the change.
I didn't attach a patch file, because I think the change is easy to make in
QueryParser.jj by hand.