Posted to dev@lucene.apache.org by "Ted Sullivan (JIRA)" <ji...@apache.org> on 2015/05/18 22:35:00 UTC

[jira] [Commented] (SOLR-7136) Add an AutoPhrasing TokenFilter

    [ https://issues.apache.org/jira/browse/SOLR-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549170#comment-14549170 ] 

Ted Sullivan commented on SOLR-7136:
------------------------------------

Fixed a bug in which overlapping autophrases erroneously emitted an extra token when the match ended before the overlapping phrase finished. For example, if the autophrases are "foo bar" and "bar baz" and the test phrase is "foo bar", the filter previously emitted "foo_bar" and "bar"; with the fix it emits only "foo_bar".
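
To make the expected behavior concrete, here is a minimal sketch of longest-match autophrasing over a token list. It is not the code in the attached patch and does not use the Lucene TokenFilter API; the function name `auto_phrase`, its parameters, and the "_" separator handling are assumptions made for illustration only.

```python
def auto_phrase(tokens, phrases, sep="_"):
    """Join known multi-word phrases into single tokens, longest match first.

    Illustrative only: a simplified stand-in for the filter's behavior,
    not the implementation from the SOLR-7136 patch.
    """
    phrase_tuples = {tuple(p.split()) for p in phrases}
    max_len = max((len(p) for p in phrase_tuples), default=1)
    out = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate phrase that starts at position i.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrase_tuples:
                match_len = n
                break
        if match_len:
            # Tokens consumed by a completed phrase are never emitted again,
            # even if one of them could also start another phrase.
            out.append(sep.join(tokens[i:i + match_len]))
            i += match_len
        else:
            out.append(tokens[i])
            i += 1
    return out


print(auto_phrase(["foo", "bar"], ["foo bar", "bar baz"]))  # ['foo_bar']
```

In this sketch "bar" is consumed by "foo_bar", so the unfinished "bar baz" candidate never produces a stray "bar" at end of input, which is the outcome described above.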

> Add an AutoPhrasing TokenFilter
> -------------------------------
>
>                 Key: SOLR-7136
>                 URL: https://issues.apache.org/jira/browse/SOLR-7136
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ted Sullivan
>         Attachments: SOLR-7136.patch, SOLR-7136.patch, SOLR-7136.patch
>
>
> Adds an 'autophrasing' token filter designed to let noun phrases that represent a single entity be tokenized as single tokens. Also adds support for ManagedResources and for query parser auto-phrasing, given LUCENE-2605.
> The rationale for this token filter and its use in solving the long-standing multi-term synonym problem in Lucene/Solr has been documented online:
> http://lucidworks.com/blog/automatic-phrase-tokenization-improving-lucene-search-precision-by-more-precise-linguistic-analysis/
> https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/



