You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2016/06/14 20:32:30 UTC
[jira] [Comment Edited] (LUCENE-2605) queryparser parses on
whitespace
[ https://issues.apache.org/jira/browse/LUCENE-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330012#comment-15330012 ]
Otis Gospodnetic edited comment on LUCENE-2605 at 6/14/16 8:31 PM:
-------------------------------------------------------------------
[~steve_rowe] you are about to become everyone's hero and a household name! :)
Is this going to be in the upcoming 6.1?
was (Author: otis):
[~steve_rowe] you are about to become everyone's here and a household name! :)
Is this going to be in the upcoming 6.1?
> queryparser parses on whitespace
> --------------------------------
>
> Key: LUCENE-2605
> URL: https://issues.apache.org/jira/browse/LUCENE-2605
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Reporter: Robert Muir
> Assignee: Steve Rowe
> Attachments: LUCENE-2605.patch, LUCENE-2605.patch, LUCENE-2605.patch
>
>
> The queryparser parses input on whitespace, and sends each whitespace separated term to its own independent token stream.
> This breaks the following at query-time, because they can't see across whitespace boundaries:
> * n-gram analysis
> * shingles
> * synonyms (especially multi-word for whitespace-separated languages)
> * languages where a 'word' can contain whitespace (e.g. vietnamese)
> Its also rather unexpected, as users think their charfilters/tokenizers/tokenfilters will do the same thing at index and querytime, but
> in many cases they can't. Instead, preferably the queryparser would parse around only real 'operators'.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org