Posted to dev@lucene.apache.org by "Jason Gerlowski (JIRA)" <ji...@apache.org> on 2015/11/01 16:51:27 UTC

[jira] [Commented] (SOLR-7981) term based ValueSourceParsers should support an option to run an analyzer for the specified field on the input

    [ https://issues.apache.org/jira/browse/SOLR-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984429#comment-14984429 ] 

Jason Gerlowski commented on SOLR-7981:
---------------------------------------

I ran into a discrepancy between Solr's actual behavior and the "legacy behavior" mentioned in the description of this issue.

The description above states that input terms currently aren't being analyzed in any way.  However, when I sat down to write some tests for a patch I had staged, I noticed that this wasn't true.  When parsing the function query, Solr analyzes the provided input value using the query analyzer.  This can be seen in TermField.getFieldQuery(), which is called by ValueSourceParser.getFieldQuery().

This behavior can be verified by the test found in this patch.

I bring all this up because:

1.) I could use a double check, since this is my first attempt at a contribution.
2.) I'm not sure if it affects the desirability of this change.  To me, it is still valuable to be able to pick your analyzer.  But maybe the fact that the input is already being analyzed really cuts down on the need for this feature.

> term based ValueSourceParsers should support an option to run an analyzer for the specified field on the input
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7981
>                 URL: https://issues.apache.org/jira/browse/SOLR-7981
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>              Labels: newdev
>
> The following functions all take exactly 2 arguments: a field name, and a term value...
> * idf
> * termfreq
> * tf
> * totaltermfreq
> ...we should consider adding an optional third argument to indicate if an analyzer for the specified field should be used on the input to find the real "Term" to consider.
> For example, the following might all result in equivalent numeric values for all docs assuming simple plural stemming and lowercasing...
> {noformat}
> termfreq(foo_t,'Bicycles',query) // use the query analyzer for field foo_t on input Bicycles
> termfreq(foo_t,'Bicycles',index) // use the index analyzer for field foo_t on input Bicycles
> termfreq(foo_t,'bicycle',none) // no analyzer used to construct Term
> termfreq(foo_t,'bicycle') // legacy 2 arg syntax, same as 'none'
> {noformat}
> (Special error checking needed if analyzer creates more than one term for the given input string)
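
As a rough illustration of the proposal quoted above, here is a minimal, self-contained sketch of how an optional analyzer argument could resolve the input to a single term, including the "more than one term" error check. It does not use Solr or Lucene classes: the names TermResolutionSketch, AnalyzerMode, toyAnalyze and resolveTerm are hypothetical, and the toy analyzer (whitespace split, lowercasing, stripping a trailing "s") only stands in for whatever analysis chain a real field would have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class TermResolutionSketch {

    /** Hypothetical modes mirroring the proposed optional third argument. */
    enum AnalyzerMode { QUERY, INDEX, NONE }

    /**
     * Toy stand-in for a field analyzer (an assumption, not Solr's analysis chain):
     * splits on whitespace, lowercases, and strips one trailing "s".
     */
    static List<String> toyAnalyze(String input) {
        List<String> terms = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // nothing to emit for empty input
            }
            String term = token.toLowerCase(Locale.ROOT);
            if (term.length() > 1 && term.endsWith("s")) {
                term = term.substring(0, term.length() - 1);
            }
            terms.add(term);
        }
        return terms;
    }

    /** Resolves the raw input to the single term that would be looked up. */
    static String resolveTerm(String input, AnalyzerMode mode) {
        if (mode == AnalyzerMode.NONE) {
            return input; // no analysis: the input is used verbatim as the term
        }
        // In this sketch the query and index modes share the same toy chain;
        // a real schema may configure them differently.
        List<String> terms = toyAnalyze(input);
        if (terms.size() != 1) {
            throw new IllegalArgumentException(
                "Expected exactly one term for input '" + input
                    + "' but analysis produced " + terms.size());
        }
        return terms.get(0);
    }

    public static void main(String[] args) {
        System.out.println(resolveTerm("Bicycles", AnalyzerMode.QUERY)); // bicycle
        System.out.println(resolveTerm("Bicycles", AnalyzerMode.INDEX)); // bicycle
        System.out.println(resolveTerm("bicycle", AnalyzerMode.NONE));   // bicycle
    }
}
```

With this toy chain, an input such as 'red bicycles' in QUERY or INDEX mode is rejected with an IllegalArgumentException, since it analyzes to two terms.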


