Posted to dev@lucene.apache.org by "Peng Cheng (JIRA)" <ji...@apache.org> on 2014/02/07 09:31:19 UTC

[jira] [Commented] (LUCENE-5417) Solr function query supports reading multiple values from a field.

    [ https://issues.apache.org/jira/browse/LUCENE-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894303#comment-13894303 ] 

Peng Cheng commented on LUCENE-5417:
------------------------------------

Hi, may I ask for some help with the test case implementation? I have finished writing extensions of MultiValueSource that provide the described functionality: they read multiple BytesRef values from the FieldCache using getDocTermOrds() and then convert them into different data types (String, float, etc.) through parsers. While I had no problem running them in a new project, I don't know how to test them in the Solr test framework (apparently it uses JUnit but overrides many default settings). Specifically, I sporadically get a 'fix your classpath to have tests-framework.jar before lucene-core.jar' exception, and when I don't get that exception, I get 'Insane FieldCache usage(s)'. I have googled both messages, but all the resolutions I found apply to 3.x versions (by the way, using '@SuppressCodecs("Lucene3x")' only reduces the frequency of the first problem). Could an expert have a look at my test cases?
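
For readers unfamiliar with the approach, the read path described above (look up a document's term ordinals, resolve each ordinal to its term bytes, then run the bytes through a parser) can be sketched roughly as follows. This is not the patch itself and uses no Lucene classes: MultiValueReadSketch, termDictionary, docToOrds, parseString and parseFloat are hypothetical stand-ins, and the sketch assumes terms are stored as plain UTF-8 decimal strings, which may not match how numeric fields are actually indexed.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** In-memory stand-in for an ordinal-based multi-value reader (hypothetical, no Lucene). */
final class MultiValueReadSketch {
    /** Term bytes of one field, indexed by ordinal (assumed layout). */
    private final byte[][] termDictionary;
    /** For each document id, the ordinals of the terms that document holds. */
    private final int[][] docToOrds;

    MultiValueReadSketch(byte[][] termDictionary, int[][] docToOrds) {
        this.termDictionary = termDictionary;
        this.docToOrds = docToOrds;
    }

    /** Resolves every ordinal of the document to its term bytes and parses each one. */
    <T> List<T> values(int doc, Function<byte[], T> parser) {
        List<T> out = new ArrayList<>();
        for (int ord : docToOrds[doc]) {
            out.add(parser.apply(termDictionary[ord]));
        }
        return out;
    }

    /** Parser that decodes the term bytes as a UTF-8 string. */
    static String parseString(byte[] term) {
        return new String(term, StandardCharsets.UTF_8);
    }

    /** Parser that decodes the term bytes as a decimal float (assumes string-encoded numbers). */
    static Float parseFloat(byte[] term) {
        return Float.valueOf(parseString(term));
    }

    public static void main(String[] args) {
        byte[][] terms = {
            "0.5".getBytes(StandardCharsets.UTF_8),
            "1.25".getBytes(StandardCharsets.UTF_8),
            "3.0".getBytes(StandardCharsets.UTF_8)
        };
        int[][] ords = { {0, 2}, {1} }; // doc 0 has two values, doc 1 has one
        MultiValueReadSketch field = new MultiValueReadSketch(terms, ords);
        System.out.println(field.values(0, MultiValueReadSketch::parseFloat));  // prints [0.5, 3.0]
        System.out.println(field.values(1, MultiValueReadSketch::parseString)); // prints [1.25]
    }
}
```

Passing the parser as a function keeps the ordinal lookup independent of the target data type, which is the same separation the comment describes between reading term bytes and casting them.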

One more thing: the added ValueSource(s) need a dimensionality parameter at construction time, because the current FunctionValues interface does not support reading a variable-length array from a field. To make the solution easier to use, we may need a more powerful FunctionValues that can return the dimensionality when a ***Val(int doc) method is called.
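
To make the idea concrete, here is a minimal sketch of what such a "more powerful" values type could look like. It is purely illustrative and not part of Lucene: the MultiFunctionValues interface, its dimension(int doc) method, and the fromArrays and maxVal helpers are all hypothetical names. The point is only that the caller can ask for the per-document length instead of fixing it at construction time, and can then reduce the values to a single score.

```java
/** Hypothetical sketch of per-document, variable-length function values (not a Lucene API). */
final class VariableLengthValuesSketch {

    /** Assumed extension: values whose length may differ from document to document. */
    interface MultiFunctionValues {
        /** Number of values the given document holds. */
        int dimension(int doc);

        /** Copies the document's values into out[0 .. dimension(doc)). */
        void floatVal(int doc, float[] out);
    }

    /** Builds an in-memory instance backed by one float array per document. */
    static MultiFunctionValues fromArrays(final float[][] perDoc) {
        return new MultiFunctionValues() {
            @Override
            public int dimension(int doc) {
                return perDoc[doc].length;
            }

            @Override
            public void floatVal(int doc, float[] out) {
                System.arraycopy(perDoc[doc], 0, out, 0, perDoc[doc].length);
            }
        };
    }

    /** Reduces all values of a document to one score by taking the maximum. */
    static float maxVal(MultiFunctionValues values, int doc, float defaultValue) {
        int n = values.dimension(doc);
        if (n == 0) {
            return defaultValue; // document has no value in this field
        }
        float[] buf = new float[n];
        values.floatVal(doc, buf);
        float max = buf[0];
        for (int i = 1; i < n; i++) {
            max = Math.max(max, buf[i]);
        }
        return max;
    }

    public static void main(String[] args) {
        MultiFunctionValues values = fromArrays(new float[][] { {0.2f, 0.9f, 0.4f}, {} });
        System.out.println(maxVal(values, 0, -1f)); // prints 0.9
        System.out.println(maxVal(values, 1, -1f)); // prints -1.0
    }
}
```

With an accessor like dimension(int doc), a consuming function would not need the dimensionality up front; whether this belongs in FunctionValues itself or in a subclass is left open here.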

> Solr function query supports reading multiple values from a field.
> ------------------------------------------------------------------
>
>                 Key: LUCENE-5417
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5417
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/query/scoring
>    Affects Versions: 4.6
>         Environment: N/A
>            Reporter: Peng Cheng
>            Priority: Minor
>             Fix For: 4.7
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Solr function query is a powerful tool for customizing search criteria and ranking functions (http://wiki.apache.org/solr/FunctionQuery). However, it cannot effectively make use of the values in a multi-valued field: the field(...) function can only read one value and discards the others.
> This limitation has been associated with FieldCacheSource and the fact that FieldCache could not fetch multiple values from a field, but that constraint has been largely lifted by LUCENE-3354, which allows multiple values to be extracted from one field. Those values can in turn be used as parameters of other functions to yield a single score.
> I personally find this limitation very inconvenient when building a learning-to-rank system that uses many cues and string features. Therefore I would like to post this feature request and (hopefully) work on it myself.


