Posted to issues@lucene.apache.org by "Munendra S N (Jira)" <ji...@apache.org> on 2020/03/10 16:20:00 UTC

[jira] [Comment Edited] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet

    [ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056103#comment-17056103 ] 

Munendra S N edited comment on SOLR-13199 at 3/10/20, 4:19 PM:
---------------------------------------------------------------

[~dsmiley]
Thanks for the review

This is in addition to what Mikhail has shared.
Initially I was thinking of throwing an exception, but then I thought of a few cases.

{quote}
Likewise if we parse the query and get null, the query is in error.
{quote}
One case where the query can be null even though parentFilter is specified: the filter is defined on a text field and the value is a stopword. I have seen the query resolve to null in many cases, but this is the only one I can think of right now. Using a text field for parentFilter is not the right choice, but I don't think we can control usage. So, when the user has specified a perfectly fine filter that resolves to null, should we throw an exception?
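
To illustrate the stopword case, here is a toy sketch. It does not call Solr's query parser or any analyzer; the class {{StopwordFilterSketch}}, the method {{parseFilter}} and the stopword list are all hypothetical and only mimic the behaviour described above, where a filter value made up entirely of stopwords leaves nothing to query on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for query parsing on a text field; not the Solr API.
class StopwordFilterSketch {
    // Assumed stopword list, for illustration only.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of"));

    // Returns the terms that survive stopword removal,
    // or null when nothing is left (mimicking a null query).
    static List<String> parseFilter(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !STOPWORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms.isEmpty() ? null : terms;
    }
}
```

In this sketch {{parseFilter("the")}} yields null although the caller did supply a value, which is the situation the question above is about.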

{quote}
If parentsFilter.getBitSet returns null, then we should throw an error that the user didn't supply a parentFilter matching parent documents
{quote}
parentFilter could be something that matches a smaller set of parents rather than the whole parent set. The suggestion to throw an error is good if there is an enforcement that a unique parent condition should be part of each document. Suppose the user is also using pagination. The first page returns properly; then there is one such parent product which fits the bill and we throw an exception. The same query throws an exception or not depending on the limit and start parameters. I'm not sure that would be the right choice.

I understand both cases are either a bit of a stretch or corner cases, but I'm sharing my reasoning behind going with the above approach.
Let me know if these corner cases don't make much sense and it's okay to fail the request; then I will modify the patch accordingly.
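
A minimal sketch of the non-throwing behaviour, for discussion only. It is not the code in the attached patch and does not use the real Lucene/Solr classes: {{ChildLookupSketch}} and {{childDocIds}} are hypothetical names, and {{java.util.BitSet}} stands in for Lucene's BitSet. It assumes the block-join layout where children are indexed immediately before their parent.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; java.util.BitSet stands in for the per-segment parent bit set.
class ChildLookupSketch {
    // Returns the doc ids of the children stored just before the parent at segRootId.
    // A null bit set is read as "no parent matched in this segment", so the
    // document is returned without children instead of failing the request.
    static List<Integer> childDocIds(BitSet segParents, int segRootId) {
        if (segParents == null) {
            return Collections.emptyList();
        }
        // previousSetBit returns -1 when there is no earlier parent, and that's okay.
        int prevParent = segRootId == 0 ? -1 : segParents.previousSetBit(segRootId - 1);
        List<Integer> children = new ArrayList<>();
        for (int docId = prevParent + 1; docId < segRootId; docId++) {
            children.add(docId);
        }
        return children;
    }
}
```

With this shape, a segment with no matching parents gives an empty child list for every page of results, so the outcome does not depend on the limit and start parameters.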

Also, I have a question: if someone uses the nestPathField approach (defined in the schema) but doesn't have any children for the parents, what does childTransformer return? Does it fail the request with a valid error or return just the parent products?

I haven't yet tried nestPathField for indexing parent-children, so I'm just curious.







> NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
> --------------------------------------------------------------------------
>
>                 Key: SOLR-13199
>                 URL: https://issues.apache.org/jira/browse/SOLR-13199
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: master (9.0)
>         Environment: h1. Steps to reproduce
> * Use a Linux machine.
> *  Build commit {{ea2c8ba}} of Solr as described in the section below.
> * Build the films collection as described below.
> * Start the server using the command {{./bin/solr start -f -p 8983 -s /tmp/home}}
> * Request the URL given in the bug description.
> h1. Compiling the server
> {noformat}
> git clone https://github.com/apache/lucene-solr
> cd lucene-solr
> git checkout ea2c8ba
> ant compile
> cd solr
> ant server
> {noformat}
> h1. Building the collection
> We followed [Exercise 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from the [Solr Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. The attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that you will obtain by following the steps below:
> {noformat}
> mkdir -p /tmp/home
> echo '<?xml version="1.0" encoding="UTF-8" ?><solr></solr>' > /tmp/home/solr.xml
> {noformat}
> In one terminal start a Solr instance in foreground:
> {noformat}
> ./bin/solr start -f -p 8983 -s /tmp/home
> {noformat}
> In another terminal, create a collection of movies, with no shards and no replication, and initialize it:
> {noformat}
> bin/solr create -c films
> curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
> curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
> ./bin/post -c films example/films/films.json
> {noformat}
>            Reporter: Johannes Kloos
>            Assignee: Munendra S N
>            Priority: Minor
>              Labels: diffblue, newdev
>         Attachments: SOLR-13199.patch, home.zip
>
>
> Requesting the following URL causes Solr to return an HTTP 500 error response:
> {noformat}
> http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:*
> {noformat}
> The error response seems to be caused by the following uncaught exception:
> {noformat}
> java.lang.NullPointerException
> at org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92)
> at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:103)
> at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:1)
> at org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:184)
> at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:136)
> at org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)
> at org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)
> at org.apache.solr.response.JSONWriter.writeResponse(JSONWriter.java:73)
> {noformat}
> In ChildDocTransformer.transform, we have the following lines:
> {noformat}
> final BitSet segParentsBitSet = parentsFilter.getBitSet(leafReaderContext);
> final int segPrevRootId = segRootId==0? -1: segParentsBitSet.prevSetBit(segRootId - 1); // can return -1 and that's okay
> {noformat}
> But getBitSet can return null if the set of DocIds is empty:
> {noformat}
> return docIdSet == DocIdSet.EMPTY ? null : ((BitDocIdSet) docIdSet).bits();
> {noformat}
> We found this bug using [Diffblue Microservices Testing|https://www.diffblue.com/labs/?utm_source=solr-br]. Find more information on this [fuzz testing campaign|https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results?utm_source=solr-br].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
