Posted to dev@lucene.apache.org by "Georg Sorst (JIRA)" <ji...@apache.org> on 2017/01/17 22:20:26 UTC
[jira] [Comment Edited] (SOLR-9968) Cannot use special characters in Suggester Context Query
[ https://issues.apache.org/jira/browse/SOLR-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826929#comment-15826929 ]
Georg Sorst edited comment on SOLR-9968 at 1/17/17 10:20 PM:
-------------------------------------------------------------
The attached patch makes the tokenizer for context filter queries configurable. It applies against Solr 6.3.
was (Author: gs):
Make tokenizer for context filter queries configurable
> Cannot use special characters in Suggester Context Query
> --------------------------------------------------------
>
> Key: SOLR-9968
> URL: https://issues.apache.org/jira/browse/SOLR-9968
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Suggester
> Affects Versions: 6.0, 6.3
> Reporter: Georg Sorst
> Attachments: SOLR-9968-configurable-tokenizer.patch, test_context_query_with_special_characters.patch
>
>
> h4. Reproduce
> 1. Configure the Suggester to use a {{contextField}}, e.g. {{context}}
> 2. Add a document containing special characters in that field, e.g. '{{c#x}}'
> 3. Use a context query with the Suggester, e.g. {noformat}suggest.cfq=context:c#x{noformat}
> * Escaping the character makes no difference, e.g.
> {noformat}suggest.cfq=context:c\#x{noformat}
> h4. What happens
> The suggestions are not filtered by the exact context value, because the special character is dropped from the context query.
> h4. What should happen
> The suggestions should be limited to documents where the field {{context}} is exactly '{{c#x}}'.
> ----
> What happens is this:
> 1. {{SolrSuggester.contextFilterQueryAnalyzer}} is hardwired to use {{StandardTokenizer}}
> 2. The context query is parsed like this:
> {code:title=SolrSuggester.parseContextFilterQuery}
> query = new StandardQueryParser(contextFilterQueryAnalyzer).parse(contextFilter, CONTEXTS_FIELD_NAME);
> {code}
> 3. The {{StandardQueryParser}} together with {{StandardTokenizer}} turns the context query into '{{context:c context:x}}'
> 4. This rewritten query is used to filter the suggestions
> 5. Thus, the suggestion where {{context}} is '{{c#x}}' is not returned
> Attached is an extension to {{SuggestComponentContextFilterQueryTest}} to reproduce this behavior.
> So the question is how to get the parser and tokenizer to treat these special characters verbatim. Two ways I can think of:
> * Make {{contextFilterQueryAnalyzer}} configurable so {{KeywordTokenizer}} can be used
> * Use the analyzer defined for the context field in the schema
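
To illustrate steps 1-5 and the first proposal above, here is a minimal sketch in plain Java. It does not use the Lucene classes. {{standardLikeTokens}} is a hypothetical stand-in that only approximates how {{StandardTokenizer}} treats a short value such as '{{c#x}}' (the real tokenizer follows Unicode word-break rules and handles some punctuation differently), and {{keywordTokens}} approximates {{KeywordTokenizer}} by keeping the whole input as one token. All class and method names here are assumptions made for the example, not part of Solr or of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java stand-in for the analysis step; it does NOT use Lucene classes.
final class ContextTokenizationSketch {

    // Rough approximation of StandardTokenizer for short inputs like "c#x":
    // every character that is not a letter or digit ends the current token.
    static List<String> standardLikeTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (Character.isLetterOrDigit(ch)) {
                current.append(ch);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Approximation of KeywordTokenizer: the whole input is a single token.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    // Renders the tokens as a field-qualified query string, in the style of step 3.
    static String toFilterQuery(String field, List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(token);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Word-splitting analysis loses the '#', as described in step 3.
        System.out.println(toFilterQuery("context", standardLikeTokens("c#x")));
        // Keyword-style analysis keeps the context value verbatim.
        System.out.println(toFilterQuery("context", keywordTokens("c#x")));
    }
}
```

With the word-splitting stand-in the filter becomes '{{context:c context:x}}', which no longer identifies the document whose context is '{{c#x}}'. With the keyword-style stand-in the value survives intact, which is the behavior a configurable {{contextFilterQueryAnalyzer}} would be expected to allow.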