Posted to dev@lucene.apache.org by cbeer <gi...@git.apache.org> on 2018/01/17 17:33:09 UTC
[GitHub] lucene-solr pull request #308: Add a suggester that operates on tokenized va...
GitHub user cbeer opened a pull request:
https://github.com/apache/lucene-solr/pull/308
Add a suggester that operates on tokenized values from a field
The `TokenizingSuggester` is suspiciously similar to the `AnalyzingInfixSuggester` (and presumably it could be merged into or extend the `AnalyzingInfixSuggester`), but with one additional feature: a `tokenizingAnalyzer` that lets us pre-tokenize suggestions into units of a manageable size (perhaps single words, shingles of multiple words, or even NLP-extracted noun phrases).
Our use case is providing autocomplete suggestions for searching within the OCR text of a document (the search-within feature is powered by highlighting). Some of our page-level OCR text can easily exceed the 32k size limit of the `AnalyzingInfixSuggester`'s `exacttext` string field.
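To make the pre-tokenizing idea concrete, here is a minimal plain-Java sketch of turning one long field value into word shingles before they are stored as suggestions. It is not the code from this pull request and does not use the Lucene analysis API: the class `ShingleSketch`, the method `shingles`, and the whitespace splitting are all hypothetical stand-ins for whatever the `tokenizingAnalyzer` would emit. The `MAX_TERM_BYTES` constant assumes that the "32k" limit mentioned above is Lucene's 32766-byte maximum term length.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ShingleSketch {
    // Assumption: the 32k limit is Lucene's maximum indexed term length in bytes.
    static final int MAX_TERM_BYTES = 32766;

    // Splits text on whitespace and returns every run of minSize..maxSize
    // consecutive words, in order of starting position.
    static List<String> shingles(String text, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int size = 1; size <= maxSize && start + size <= words.length; size++) {
                if (size > 1) {
                    sb.append(' ');
                }
                sb.append(words[start + size - 1]);
                if (size >= minSize) {
                    String candidate = sb.toString();
                    // Skip anything that would still be too large to store as one term.
                    if (candidate.getBytes(StandardCharsets.UTF_8).length <= MAX_TERM_BYTES) {
                        out.add(candidate);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Each shingle is a small suggestion candidate, unlike the full OCR page.
        for (String s : shingles("the quick brown fox", 1, 2)) {
            System.out.println(s);
        }
    }
}
```

Each returned shingle is a few words long, so a page of OCR text that is too large to store as a single value becomes many small candidates that each fit comfortably under the limit.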
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cbeer/lucene-solr tokenizing-suggester-upstreamable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/308.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #308
----
commit c516bcaabbe6214ba4938859d6775ae7992fed0a
Author: Chris Beer <ca...@...>
Date: 2018-01-16T21:29:51Z
Add a suggester that operates on tokenized values from a field
----
---