Posted to solr-user@lucene.apache.org by Mathias Lux <ml...@itec.uni-klu.ac.at> on 2013/09/16 13:24:33 UTC
Re-Ranking results based on DocValues with custom function.
Hi!
I have quite a large index with a lot of text and some binary data in
the documents (numeric vectors of arbitrary size with associated
dissimilarity functions). What I want to do is to search using common
text search and then (optionally) re-rank using some custom function
like
http://localhost:8983/solr/select?q=*:*&sort=myCustomFunction(var1) asc
I've seen that there are hooks in solrconfig.xml, but I did not find
an example or some documentation. I'd be most grateful if anyone could
either point me to one or give me a hint for another way to go :)
Btw., searching on the DocValues alone is handled by a custom
RequestHandler, which works great. But with text as the main search
feature and my DocValues used only for re-ranking, I'd rather just add a
function for sorting and keep using the current, stable and well-performing
request handler.
cheers,
Mathias
ps. a demo of the current system is available at:
http://demo-itec.uni-klu.ac.at/liredemo/
--
Dr. Mathias Lux
Assistant Professor, Klagenfurt University, Austria
http://tinyurl.com/mlux-itec
Re: Re-Ranking results based on DocValues with custom function.
Posted by Mathias Lux <ml...@itec.uni-klu.ac.at>.
Got it! Sharing this in case it helps someone ... and maybe for inclusion
in the Java API docs of ValueSource :)
For sorting one needs to implement the method
public double doubleVal(int) of the FunctionValues object that the
ValueSource returns from getValues(),
then it works like a charm.
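To illustrate why that one method matters: a sort by function reads one double per document and orders the documents by it. The sketch below imitates this in plain Java. `DocValuesView` and `SortByFunctionSketch` are hypothetical stand-ins made up for illustration, not Lucene or Solr classes, and the distances are invented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a per-document value accessor; not a Lucene class. */
interface DocValuesView {
    double doubleVal(int doc);
}

final class SortByFunctionSketch {
    /** Returns the doc ids 0..maxDoc-1 ordered ascending by doubleVal(doc). */
    static List<Integer> sortByValue(DocValuesView values, int maxDoc) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            docs.add(doc);
        }
        // The comparator only ever asks for the double value of a document.
        docs.sort(Comparator.comparingDouble((Integer doc) -> values.doubleVal(doc)));
        return docs;
    }

    public static void main(String[] args) {
        double[] distances = {0.7, 0.1, 0.4}; // invented distances for docs 0, 1, 2
        System.out.println(sortByValue(doc -> distances[doc], 3));
    }
}
```

If the accessor returned a constant from doubleVal, every document would compare equal and the sort would have no visible effect.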
cheers,
Mathias
On Tue, Sep 17, 2013 at 6:28 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : It basically allows for searching for text (which is associated with an
> : image) in an index and then getting the distance to a sample image
> : (base64 encoded byte[] array) based on one of five different low level
> : content based features stored as DocValues.
>
> very cool.
>
> : So there's one little tiny question I still have ;) When I'm trying to
> : do a "sort" I'm getting
> :
> : "msg": "sort param could not be parsed as a query, and is not a field
> : that exists in the index:
> : lirefunc(cl_hi,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=)",
> :
> : for the call http://localhost:9000/solr/lire/select?q=*%3A*&sort=lirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)+asc&fl=id%2Ctitle%2Clirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)&wt=json&indent=true
>
> Hmmm...
>
> i think the crux of the issue is your string literal. function parsing
> tries to make life easy for you by not requiring string literals to be
> quoted unless they conflict with other function names or field names
> etc.... on top of that the sort parsing code is kind of heuristic based
> (because it has to account for both functions or field names or wildcards,
> followed by other sort clauses, etc...) so in that context the special
> characters like '=' in your base64 string literal might be confusing the
> heuristics.
>
> can you try to quote the string literal and see if that works?
>
> For example, when i try using strdist with your base64 string in a sort
> param using the example configs i get the same error...
>
> http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=,jw%29+asc
>
> but if i quote the string literal it works fine...
>
> http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,%27FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=%27,jw%29+asc
>
>
>
> -Hoss
Re: Re-Ranking results based on DocValues with custom function.
Posted by Chris Hostetter <ho...@fucit.org>.
: It basically allows for searching for text (which is associated with an
: image) in an index and then getting the distance to a sample image
: (base64 encoded byte[] array) based on one of five different low level
: content based features stored as DocValues.
very cool.
: So there's one little tiny question I still have ;) When I'm trying to
: do a "sort" I'm getting
:
: "msg": "sort param could not be parsed as a query, and is not a field
: that exists in the index:
: lirefunc(cl_hi,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=)",
:
: for the call http://localhost:9000/solr/lire/select?q=*%3A*&sort=lirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)+asc&fl=id%2Ctitle%2Clirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)&wt=json&indent=true
Hmmm...
i think the crux of the issue is your string literal. function parsing
tries to make life easy for you by not requiring string literals to be
quoted unless they conflict with other function names or field names
etc.... on top of that the sort parsing code is kind of heuristic based
(because it has to account for both functions or field names or wildcards,
followed by other sort clauses, etc...) so in that context the special
characters like '=' in your base64 string literal might be confusing the
heuristics.
can you try to quote the string literal and see if that works?
For example, when i try using strdist with your base64 string in a sort
param using the example configs i get the same error...
http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=,jw%29+asc
but if i quote the string literal it works fine...
http://localhost:8983/solr/select?q=*:*&sort=strdist%28name,%27FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=%27,jw%29+asc
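For anyone building such a request from client code, here is a minimal sketch of the quoting step followed by URL encoding. It assumes the custom function accepts a single-quoted literal the same way strdist does in the example above. `QuotedSortParam` and its method names are made up for illustration; `lirefunc` and `cl_hi` are taken from the failing request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QuotedSortParam {
    /** Builds an ascending sort clause with the string literal in single quotes. */
    static String sortClause(String function, String field, String base64Literal) {
        return function + "(" + field + ",'" + base64Literal + "') asc";
    }

    /** URL-encodes a parameter value as UTF-8 (a space becomes '+'). */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String literal = "FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=";
        String clause = sortClause("lirefunc", "cl_hi", literal);
        System.out.println("sort=" + encode(clause));
    }
}
```

Encoding the whole clause also takes care of '+' and '/', which can occur in base64 data. An unencoded '+' in a query string would be read as a space on the server side.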
-Hoss
Re: Re-Ranking results based on DocValues with custom function.
Posted by Mathias Lux <ml...@itec.uni-klu.ac.at>.
Hi!
Thanks for the directions! I got it up and running with a custom
ValueSourceParser: http://pastebin.com/cz1rJn4A and a custom
ValueSource: http://pastebin.com/j8mhA8e0
It basically allows for searching for text (which is associated with an
image) in an index and then getting the distance to a sample image
(base64 encoded byte[] array) based on one of five different low level
content based features stored as DocValues.
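As a rough illustration of the idea (not the code from the pastebin links), the sketch below decodes a base64 sample into a byte[] and compares it with a stored feature vector. The L1 distance is only a placeholder, since the real features come with their own dissimilarity functions. `FeatureDistanceSketch` and its methods are hypothetical names.

```java
import java.util.Base64;

final class FeatureDistanceSketch {
    /** Decodes the base64 encoded sample feature into its raw bytes. */
    static byte[] decodeFeature(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    /**
     * L1 (city block) distance between two equal-length byte vectors.
     * Assumption: each byte is treated as a signed value; a real feature
     * would use its own layout and distance measure.
     */
    static double l1Distance(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] sample = decodeFeature("FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=");
        // Comparing the sample with itself gives a distance of 0.0.
        System.out.println(l1Distance(sample, sample));
    }
}
```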
A sample result is here: http://pastebin.com/V7kL3DJh
So there's one little tiny question I still have ;) When I'm trying to
do a "sort" I'm getting
"msg": "sort param could not be parsed as a query, and is not a field
that exists in the index:
lirefunc(cl_hi,FQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA=)",
for the call http://localhost:9000/solr/lire/select?q=*%3A*&sort=lirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)+asc&fl=id%2Ctitle%2Clirefunc(cl_hi%2CFQY5DhMYDg0ODg0PEBEPDg4ODg8QEgsgEBAQEBAgEBAQEBA%3D)&wt=json&indent=true
cheers,
Mathias
On Tue, Sep 17, 2013 at 1:01 AM, Chris Hostetter
<ho...@fucit.org> wrote:
> : dissimilarity functions). What I want to do is to search using common
> : text search and then (optionally) re-rank using some custom function
> : like
> :
> : http://localhost:8983/solr/select?q=*:*&sort=myCustomFunction(var1) asc
>
> can you describe what you want your custom function to look like? it may
> already be possible using the existing functions provided out of the box -
> just need to combine them to build up the math expression...
>
> https://wiki.apache.org/solr/FunctionQuery
>
> ...if you really want to write your own, just implement ValueSourceParser
> and register it in solrconfig.xml...
>
> https://wiki.apache.org/solr/SolrPlugins#ValueSourceParser
>
> : I've seen that there are hooks in solrconfig.xml, but I did not find
> : an example or some documentation. I'd be most grateful if anyone could
> : either point me to one or give me a hint for another way to go :)
>
> when writing a custom plugin like this, the best thing to do is look at
> the existing examples of that plugin. almost all of the built in
> ValueSourceParsers are really trivial, and can be found in tiny anonymous
> classes right inside the ValueSourceParser.java...
>
> For example, the function to divide the results of two other functions...
>
> addParser("div", new ValueSourceParser() {
>   @Override
>   public ValueSource parse(FunctionQParser fp) throws SyntaxError {
>     ValueSource a = fp.parseValueSource();
>     ValueSource b = fp.parseValueSource();
>     return new DivFloatFunction(a, b);
>   }
> });
>
> ..or, if you were trying to bundle that up in your own plugin jar and
> register it in solrconfig.xml, you might write it something like...
>
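> package com.you;
> import org.apache.lucene.queries.function.ValueSource;
> import org.apache.lucene.queries.function.valuesource.DivFloatFunction;
> import org.apache.solr.search.FunctionQParser;
> import org.apache.solr.search.SyntaxError;
> import org.apache.solr.search.ValueSourceParser;
>
> public class DivideValueSourceParser extends ValueSourceParser {
>   public DivideValueSourceParser() { }
>   @Override
>   public ValueSource parse(FunctionQParser fp) throws SyntaxError {
>     ValueSource a = fp.parseValueSource(); // first argument (numerator)
>     ValueSource b = fp.parseValueSource(); // second argument (denominator)
>     return new DivFloatFunction(a, b);
>   }
> }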
>
> and then register it as...
>
> <valueSourceParser name="div" class="com.you.DivideValueSourceParser" />
>
>
> depending on your needs, you may also want to write a custom ValueSource
> implementation (ie: instead of DivFloatFunction above) in which case,
> again, the best examples to look at are all of the existing ValueSource
> functions...
>
> https://lucene.apache.org/core/4_4_0/queries/org/apache/lucene/queries/function/ValueSource.html
>
>
> -Hoss
Re: Re-Ranking results based on DocValues with custom function.
Posted by Chris Hostetter <ho...@fucit.org>.
: dissimilarity functions). What I want to do is to search using common
: text search and then (optionally) re-rank using some custom function
: like
:
: http://localhost:8983/solr/select?q=*:*&sort=myCustomFunction(var1) asc
can you describe what you want your custom function to look like? it may
already be possible using the existing functions provided out of the box -
just need to combine them to build up the math expression...
https://wiki.apache.org/solr/FunctionQuery
...if you really want to write your own, just implement ValueSourceParser
and register it in solrconfig.xml...
https://wiki.apache.org/solr/SolrPlugins#ValueSourceParser
: I've seen that there are hooks in solrconfig.xml, but I did not find
: an example or some documentation. I'd be most grateful if anyone could
: either point me to one or give me a hint for another way to go :)
when writing a custom plugin like this, the best thing to do is look at
the existing examples of that plugin. almost all of the built in
ValueSourceParsers are really trivial, and can be found in tiny anonymous
classes right inside the ValueSourceParser.java...
For example, the function to divide the results of two other functions...
addParser("div", new ValueSourceParser() {
  @Override
  public ValueSource parse(FunctionQParser fp) throws SyntaxError {
    ValueSource a = fp.parseValueSource();
    ValueSource b = fp.parseValueSource();
    return new DivFloatFunction(a, b);
  }
});
..or, if you were trying to bundle that up in your own plugin jar and
register it in solrconfig.xml, you might write it something like...
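package com.you;
import org.apache.lucene.queries.function.ValueSource;
import org.apache.lucene.queries.function.valuesource.DivFloatFunction;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.SyntaxError;
import org.apache.solr.search.ValueSourceParser;

public class DivideValueSourceParser extends ValueSourceParser {
  public DivideValueSourceParser() { }
  @Override
  public ValueSource parse(FunctionQParser fp) throws SyntaxError {
    ValueSource a = fp.parseValueSource(); // first argument (numerator)
    ValueSource b = fp.parseValueSource(); // second argument (denominator)
    return new DivFloatFunction(a, b);
  }
}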
and then register it as...
<valueSourceParser name="div" class="com.you.DivideValueSourceParser" />
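The pattern behind both variants is a registry that maps a function name to a small parser which consumes its arguments in order. The sketch below mimics that with plain Java so it runs without Solr on the classpath. `ParserRegistrySketch` and everything in it is a hypothetical stand-in, not the actual Solr implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

final class ParserRegistrySketch {
    // Function name -> code that consumes its arguments and produces a value.
    private final Map<String, ToDoubleFunction<Deque<Double>>> parsers = new HashMap<>();

    void addParser(String name, ToDoubleFunction<Deque<Double>> parser) {
        parsers.put(name, parser);
    }

    /** Looks up the named function and feeds it the arguments in order. */
    double evaluate(String name, double... args) {
        ToDoubleFunction<Deque<Double>> parser = parsers.get(name);
        if (parser == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        Deque<Double> queue = new ArrayDeque<>();
        for (double arg : args) {
            queue.add(arg);
        }
        return parser.applyAsDouble(queue);
    }

    /** Registry with a "div" function, loosely mirroring the example above. */
    static ParserRegistrySketch withDiv() {
        ParserRegistrySketch registry = new ParserRegistrySketch();
        registry.addParser("div", argQueue -> {
            double a = argQueue.remove(); // first argument
            double b = argQueue.remove(); // second argument
            return a / b;
        });
        return registry;
    }
}
```

In this sketch, `ParserRegistrySketch.withDiv().evaluate("div", 6, 3)` looks up the parser by name and divides the first argument by the second.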
depending on your needs, you may also want to write a custom ValueSource
implementation (ie: instead of DivFloatFunction above) in which case,
again, the best examples to look at are all of the existing ValueSource
functions...
https://lucene.apache.org/core/4_4_0/queries/org/apache/lucene/queries/function/ValueSource.html
-Hoss