Posted to solr-user@lucene.apache.org by Michael Sokolov <ms...@safaribooksonline.com> on 2013/11/19 02:38:10 UTC

NPE in function query, was: Re: getting matching term count for a query

So for posterity, what I ended up doing is below.  But I have a
problem I don't understand: when I use fl=*,hitcount(), I get the
results I expect, but when I use fl=*,hitcount(),hitcount('fulltext_t'),
I get an NPE in Solr.  This is with Solr 4.2.0.  Is there a known bug?  I
googled a bit but couldn't find any reference to it.

Caused by: java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.solr.response.BinaryResponseWriter.getParsedResponse(BinaryResponseWriter.java:252)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.getParsedResponse(EmbeddedSolrServer.java:241)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:213)
    ... 37 more
Caused by: java.lang.NullPointerException
    at org.apache.lucene.queries.function.valuesource.MultiFloatFunction.createWeight(MultiFloatFunction.java:95)
    at org.apache.solr.response.transform.ValueSourceAugmenter.setContext(ValueSourceAugmenter.java:71)
    at org.apache.solr.response.transform.DocTransformers.setContext(DocTransformers.java:70)
    at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:139)
    at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:173)
    at org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:86)
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:154)
    at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:144)
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:234)
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
    at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:92)
    at org.apache.solr.response.BinaryResponseWriter.getParsedResponse(BinaryResponseWriter.java:246)
    ... 39 more



/**
 * Defines the Solr function hitcount([field, ...]) which returns the total
 * of termfreq(term) for all terms in the query.  The arguments specify
 * fields whose terms are to be counted.  If no arguments are passed, terms
 * from every field are counted.
 */
public class HitCount extends ValueSourceParser {

    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        HashSet<String> fields = new HashSet<String>();
        while (fp.hasMoreArguments()) {
            fields.add(fp.parseArg());
        }
        Query q = fp.subQuery(fp.getParams().get("q"), "lucene").getQuery();
        HashSet<Term> terms = new HashSet<Term>();
        q.extractTerms(terms);
        ValueSource[] termcounts = new ValueSource[terms.size()];
        int i = 0;
        for (Term t : terms) {
            if (fields.isEmpty() || fields.contains(t.field())) {
                termcounts[i++] = new TermFreqValueSource(t.field(),
                        t.text(), t.field(), t.bytes());
            }
        }
        return new SumFloatFunction(termcounts);
    }
}
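
As a rough illustration of what the function is meant to compute, here
is a Solr-free sketch in plain Java.  The class HitCountModel, the
QueryTerm type and the map-of-tokens "document" are all hypothetical
stand-ins, not Solr or Lucene APIs; in the real parser the per-term
counts come from TermFreqValueSource and the terms from the parsed query.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of hitcount(); not Solr code, every name here is hypothetical. */
class HitCountModel {

    /** A query term: the field it targets and its text. */
    static final class QueryTerm {
        final String field;
        final String text;

        QueryTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /**
     * Sums the in-document frequency of every query term.
     * doc maps a field name to the tokens indexed for that field.
     * An empty fields set means "count terms from every field".
     */
    static int hitCount(Map<String, List<String>> doc,
                        List<QueryTerm> terms,
                        Set<String> fields) {
        int total = 0;
        for (QueryTerm t : terms) {
            if (!fields.isEmpty() && !fields.contains(t.field)) {
                continue; // term belongs to a field we were not asked about
            }
            List<String> tokens = doc.get(t.field);
            if (tokens != null) {
                total += Collections.frequency(tokens, t.text);
            }
        }
        return total;
    }
}
```

Calling hitCount with an empty set corresponds to hitcount(), and
passing a one-element set corresponds to hitcount('fulltext_t').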

On 11/18/13 2:19 PM, Michael Sokolov wrote:
> OK -- I did find SOLR-1298
> <https://issues.apache.org/jira/browse/SOLR-1298>, which explains how to
> request the function as a field value. Still looking for a function
> that does what I asked for ...
>
> On 11/18/2013 11:55 AM, Michael Sokolov wrote:
>> Some of our customers want to display a "number of matches" score 
>> next to each search result.  I think what they want is to list the 
>> number of matches that will be displayed when the entire document is 
>> highlighted.  But this can be slow to do for every search result 
>> (some documents can be very large), so what we'd like to do is to 
>> count the number of terms that match the query for each document, and 
>> display that.
>>
>> It looks like Solr's function query has some support for this - I see 
>> the termfreq function, for example.  My question is:
>>
>> 1. Is it possible to execute the query as usual, retrieving document 
>> stored field values, and also to run a function, and return the 
>> result as the value of a computed "pseudo-field"?
>>
>> 2. Is there an existing function known to the function query parser 
>> that counts the total number of occurrences of all terms in the query 
>> (for the current hit document)?
>>
>> -Mike
>


Re: NPE in function query, was: Re: getting matching term count for a query

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
OK, never mind - I was the one adding the null. Working example
below.  Last question: does anybody know if it's possible to rewrite
MultiTermQueries in this context?  I don't see how to get hold of an
IndexReader to do that, but if it were possible, it would enable this
function to handle wildcards, etc.

-Mike


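import java.util.ArrayList;
import java.util.HashSet;

import org.apache.lucene.index.Term;
import org.apache.lucene.queries.function.ValueSource;
import org.apache.lucene.queries.function.valuesource.DoubleConstValueSource;
import org.apache.lucene.queries.function.valuesource.SumFloatFunction;
import org.apache.lucene.queries.function.valuesource.TermFreqValueSource;
import org.apache.lucene.search.Query;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.SyntaxError;
import org.apache.solr.search.ValueSourceParser;

/**
 * Defines the Solr function hitcount([field, ...]) which returns the total
 * of termfreq(term) for all terms in the query. The arguments specify
 * fields whose terms are to be counted. If no arguments are passed, terms
 * from every field are counted.
 */
public class HitCount extends ValueSourceParser {

    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        // Optional arguments name the fields to count. The query itself is
        // taken from the q parameter; if we wanted to pass a query
        // explicitly we could call fp.parseNestedQuery().
        HashSet<String> fields = new HashSet<String>();
        while (fp.hasMoreArguments()) {
            fields.add(fp.parseArg());
        }
        Query q = fp.subQuery(fp.getParams().get("q"), "lucene").getQuery();
        HashSet<Term> terms = new HashSet<Term>();
        try {
            q.extractTerms(terms);
        } catch (UnsupportedOperationException e) {
            // This query type cannot enumerate its terms; report a constant.
            return new DoubleConstValueSource(1);
        }
        // Collect into a list so the final array has no null slots when
        // some terms are filtered out by field.
        ArrayList<ValueSource> termcounts = new ArrayList<ValueSource>();
        for (Term t : terms) {
            if (fields.isEmpty() || fields.contains(t.field())) {
                termcounts.add(new TermFreqValueSource(t.field(), t.text(),
                        t.field(), t.bytes()));
            }
        }
        return new SumFloatFunction(
                termcounts.toArray(new ValueSource[termcounts.size()]));
    }
}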
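
The difference between the two versions can be reproduced without
Solr.  The sketch below is only an illustration under stated
assumptions: NullSlotDemo, Source and the two build methods are
hypothetical names, and sum() merely stands in for a function that
walks its sources the way the stack trace shows
MultiFloatFunction.createWeight doing.  It assumes the query contains
at least one term in a field other than the one requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Solr-free reproduction of the null-slot problem; all names are hypothetical. */
class NullSlotDemo {

    /** Stand-in for a per-term value source. */
    interface Source {
        float value();
    }

    /** Stand-in for a function that visits every source it was given. */
    static float sum(Source[] sources) {
        float total = 0;
        for (Source s : sources) {
            total += s.value(); // throws NullPointerException on a null slot
        }
        return total;
    }

    /** First version: array sized for every term, filled only for matching fields. */
    static Source[] buildWithArray(String[] termFields, Set<String> wanted) {
        Source[] out = new Source[termFields.length];
        int i = 0;
        for (String f : termFields) {
            if (wanted.isEmpty() || wanted.contains(f)) {
                out[i++] = () -> 1f;
            }
        }
        return out; // trailing slots stay null whenever a term was skipped
    }

    /** Second version: collect matches in a list, then size the array to fit. */
    static Source[] buildWithList(String[] termFields, Set<String> wanted) {
        List<Source> out = new ArrayList<>();
        for (String f : termFields) {
            if (wanted.isEmpty() || wanted.contains(f)) {
                out.add(() -> 1f);
            }
        }
        return out.toArray(new Source[0]);
    }
}
```

With no field filter every slot is filled, which would explain why
hitcount() alone worked; with a filter such as 'fulltext_t' the
array-based version leaves null entries behind and the list-based
version does not.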


