Posted to dev@lucene.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/10/10 18:21:00 UTC

[jira] [Created] (SOLR-11462) TokenizerChain's normalize() doesn't work

Tim Allison created SOLR-11462:
----------------------------------

             Summary: TokenizerChain's normalize() doesn't work
                 Key: SOLR-11462
                 URL: https://issues.apache.org/jira/browse/SOLR-11462
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Tim Allison
            Priority: Trivial


TokenizerChain's {{normalize()}} is not currently called anywhere, so this has no negative effect on search today.  It is still a bug, though, and we should fix it.

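If {{normalize()}} is applied to a TokenizerChain with more than one multi-term-aware filter ({{filters.length > 1}}), only the last one takes effect, because each filter is created over the original {{in}} rather than over the accumulated {{result}}.
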
{noformat}
  @Override
  protected TokenStream normalize(String fieldName, TokenStream in) {
    TokenStream result = in;
    for (TokenFilterFactory filter : filters) {
      if (filter instanceof MultiTermAwareComponent) {
        filter = (TokenFilterFactory) ((MultiTermAwareComponent) filter).getMultiTermComponent();
        result = filter.create(in);
      }
    }
    return result;
  }
{noformat}

The fix is trivial:
{noformat}
-        result = filter.create(in);
+        result = filter.create(result);
{noformat}
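
To illustrate the difference without pulling in Lucene, here is a minimal, self-contained sketch. It is not the actual TokenizerChain code: the class and method names ({{NormalizeChainSketch}}, {{normalizeBuggy}}, {{normalizeFixed}}) are hypothetical, and plain string functions stand in for the token filters.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the loop in TokenizerChain#normalize().
// Strings play the role of token streams, and UnaryOperator<String>
// plays the role of a multi-term-aware filter.
public class NormalizeChainSketch {

  // Mirrors the reported loop: every filter is applied to the original
  // input, so the output of all earlier filters is thrown away.
  static String normalizeBuggy(String in, List<UnaryOperator<String>> filters) {
    String result = in;
    for (UnaryOperator<String> filter : filters) {
      result = filter.apply(in);
    }
    return result;
  }

  // Mirrors the proposed fix: every filter is applied to the result of
  // the previous one, so the filters are actually chained.
  static String normalizeFixed(String in, List<UnaryOperator<String>> filters) {
    String result = in;
    for (UnaryOperator<String> filter : filters) {
      result = filter.apply(result);
    }
    return result;
  }

  public static void main(String[] args) {
    List<UnaryOperator<String>> filters = List.of(
        s -> s.toLowerCase(Locale.ROOT), // stand-in for a lowercasing filter
        s -> s.replace("-", ""));        // stand-in for a second normalizing filter

    System.out.println(normalizeBuggy("Wi-Fi", filters)); // prints "WiFi": lowercasing was lost
    System.out.println(normalizeFixed("Wi-Fi", filters)); // prints "wifi": both filters applied
  }
}
```

With a single filter the two loops behave identically, which is why the bug only shows up once a chain has more than one multi-term-aware filter.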

If you'd like to replace {{TextField#analyzeMultiTerm()}} with, say:

{noformat}
  public static BytesRef analyzeMultiTerm(String field, String part, Analyzer analyzerIn) {
    if (part == null || analyzerIn == null) return null;
    return analyzerIn.normalize(field, part);
  }
{noformat}

I'm happy to submit a PR with unit tests.  Let me know.


