Posted to solr-dev@lucene.apache.org by Mike Klaas <mi...@gmail.com> on 2006/09/23 03:36:04 UTC

DisMax and null query strings

Hi,

I have an application which throws both query strings and query
filters at dismax.  I discovered that a NPE is thrown if an empty
query string is passed.

1) This should probably be a SolrException(400, "Missing queryString")

2) If the user specifies a filter, the query still makes sense.
Perhaps for queries that have fq but not q, dismax should use a
MatchAllDocsQuery() and a filter.  I suspect this may be a problem,
because although I'm not familiar with it, the potential performance
implications of something that matches all documents makes my stomach
churn.  Perhaps it would be better to detect this case before reaching
solr, and use a StandardRequestHandler with the filters as a lucene
query?

cheers,
-Mike

Re: DisMax and null query strings

Posted by Chris Hostetter <ho...@fucit.org>.
: > mmmm.... saying it makes sense depends on how you look at it ... you could
: > do a MatchAllDocsQuery or you could just use the DocSet generated by the
: > "fq" to get a DocList ... but either way does that really "make sense"
: > given that there is no scoring information? ... how do you fairly factor
: > in things like the bf and bq if you have no "real" scores for the matching
: > documents?
:
: I was just imagining an arbitrary (though consistent) ordered list of docs.

that works in the simple case -- but as i said, "bq" and "bf" can be used
to "boost" the scores of documents ... how much they boost depends on the
scores of the documents, and without a "query" you don't have any scores.




-Hoss


Re: DisMax and null query strings

Posted by Mike Klaas <mi...@gmail.com>.
On 9/22/06, Chris Hostetter <ho...@fucit.org> wrote:

> : 2) If the user specifies a filter, the query still makes sense.
> : Perhaps for queries that have fq but not q, dismax should use a
> : MatchAllDocsQuery() and a filter.  I suspect this may be a problem,
>
> mmmm.... saying it makes sense depends on how you look at it ... you could
> do a MatchAllDocsQuery or you could just use the DocSet generated by the
> "fq" to get a DocList ... but either way does that really "make sense"
> given that there is no scoring information? ... how do you fairly factor
> in things like the bf and bq if you have no "real" scores for the matching
> documents?

I was just imagining an arbitrary (though consistent) ordered list of docs.

> : churn.  Perhaps it would be better to detect this case before reaching
> : solr, and use a StandardRequestHandler with the filters as a lucene
> : query?
>
> I definitely think it's better to have clients deal with the "no query"
> situation, but most of the time they'd probably just skip the hit to Solr
> altogether.
>
> If there are cases where we want dismax to support a "match all" or "match
> all with fq"  I'd rather see us add support using a special option that
> triggers the behavior than to make clients conditionally hit Standard with
> the fq => q, because that won't work for installations that put the "fq" in
> init params so their clients can be agnostic.

This fall-through behaviour is what I ended up implementing on the
client side.  DisMax needn't take care of _all_ my edge cases <g>.
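
For readers who want to see that client-side fall-through spelled out, here is a minimal sketch. It is not Mike's actual code. The class name HandlerRouting, the handler names "dismax" and "standard", and the use of the "qt" parameter to pick a handler are all assumptions about how a given installation is configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper; nothing here is part of Solr itself.
final class HandlerRouting {
    // Builds request parameters. Handler names and the "qt" selector are
    // assumptions about the installation's configuration.
    static Map<String, String> buildParams(String q, String fq) {
        Map<String, String> params = new LinkedHashMap<>();
        boolean hasQ = q != null && !q.trim().isEmpty();
        boolean hasFq = fq != null && !fq.trim().isEmpty();
        if (hasQ) {
            // Normal case: dismax gets the user query, with the filter alongside.
            params.put("qt", "dismax");
            params.put("q", q.trim());
            if (hasFq) {
                params.put("fq", fq.trim());
            }
        } else if (hasFq) {
            // No user query: fall through to the standard handler and
            // send the filter as the main query (fq => q).
            params.put("qt", "standard");
            params.put("q", fq.trim());
        }
        // Neither q nor fq: the map stays empty and the caller can skip the request.
        return params;
    }
}
```

As noted in the quoted reply, this only works when the client knows the filter; it does not help installations that keep the "fq" in the handler's init params.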

> ...actually, i can see where this would make a lot of sense ...
> supporting filtered "browsing" or filtered "searching" of the full
> index using the same request structure, just leaving off the "q" if you
> are "browsing" ... if bq and bf are used then even sorting by score still
> makes sense.

True, though it isn't hard to throw together a custom request handler
or use the old Lucene trick from before MatchAllDocs existed (index a
constant field on all docs).

Thanks,
-Mike

Re: DisMax and null query strings

Posted by Chris Hostetter <ho...@fucit.org>.
: I have an application which throws both query strings and query
: filters at dismax.  I discovered that a NPE is thrown if an empty
: query string is passed.
:
: 1) This should probably be a SolrException(400, "Missing queryString")

You're probably right ... i've seen it come up before, but i consider the
lack of a "q" to be a "garbage in" situation so i never really worried too
much about it.
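
A minimal sketch of the check being discussed, for illustration only. BadRequestException is a hypothetical stand-in for SolrException so the example compiles without Solr on the classpath, and requireQuery is a made-up helper name, not an existing Solr method.

```java
// Hypothetical stand-in for SolrException; only the status code matters here.
final class BadRequestException extends RuntimeException {
    final int code;

    BadRequestException(int code, String message) {
        super(message);
        this.code = code;
    }
}

final class QueryStringCheck {
    // Returns the trimmed query string, or fails with a 400-style error
    // instead of letting a null reach the query parser and surface as an NPE.
    static String requireQuery(String q) {
        if (q == null || q.trim().isEmpty()) {
            throw new BadRequestException(400, "Missing queryString");
        }
        return q.trim();
    }
}
```

The point of the sketch is only that the "garbage in" case is caught at the edge of the handler and reported as a client error.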

: 2) If the user specifies a filter, the query still makes sense.
: Perhaps for queries that have fq but not q, dismax should use a
: MatchAllDocsQuery() and a filter.  I suspect this may be a problem,

mmmm.... saying it makes sense depends on how you look at it ... you could
do a MatchAllDocsQuery or you could just use the DocSet generated by the
"fq" to get a DocList ... but either way does that really "make sense"
given that there is no scoring information? ... how do you fairly factor
in things like the bf and bq if you have no "real" scores for the matching
documents?

it might if a non-score sort is specified, but to my mind the "q" is the
heart of the request -- it's what builds up the superset of matched
documents, while the "fq" slashes away the cruft you don't want

: because although I'm not familiar with it, the potential performance
: implications of something that matches all documents makes my stomach

FWIW: I remember looking at MatchAll a while back ... i don't remember it
being horribly inefficient by itself, but combined in a boolean query with
the bf and bq queries it might be bad (since skipTo wouldn't ever be used)

we could certainly add support for doing a "match all" type thing when
there is no "q" using an init param if people felt like it was beneficial ..
it wouldn't even need to depend on there being an "fq", it could just be a
slightly alternate code path
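
One way to picture that alternate code path is the sketch below. It is an illustration under stated assumptions, not a proposed patch: the flag matchAllWhenNoQ stands in for a hypothetical init param (the name is invented), and the Plan enum stands in for whatever query object the handler would really build.

```java
// Hypothetical decision helper; the flag and enum are invented for illustration.
final class NoQueryPolicy {
    enum Plan { USER_QUERY, MATCH_ALL }

    // matchAllWhenNoQ models an opt-in init param on the handler.
    static Plan choose(String q, boolean matchAllWhenNoQ) {
        boolean hasQ = q != null && !q.trim().isEmpty();
        if (hasQ) {
            return Plan.USER_QUERY;
        }
        if (matchAllWhenNoQ) {
            // Taken whether or not an fq is present; filters apply afterwards as usual.
            return Plan.MATCH_ALL;
        }
        // Option not enabled: a missing q stays a client error.
        throw new IllegalArgumentException("Missing queryString");
    }
}
```

Keeping the behavior behind an option means existing installations still see a missing "q" rejected unless they ask for the match-all path.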

: churn.  Perhaps it would be better to detect this case before reaching
: solr, and use a StandardRequestHandler with the filters as a lucene
: query?

I definitely think it's better to have clients deal with the "no query"
situation, but most of the time they'd probably just skip the hit to Solr
altogether.

If there are cases where we want dismax to support a "match all" or "match
all with fq"  I'd rather see us add support using a special option that
triggers the behavior than to make clients conditionally hit Standard with
the fq => q, because that won't work for installations that put the "fq" in
init params so their clients can be agnostic.


...actually, i can see where this would make a lot of sense ...
supporting filtered "browsing" or filtered "searching" of the full
index using the same request structure, just leaving off the "q" if you
are "browsing" ... if bq and bf are used then even sorting by score still
makes sense.
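
To illustrate why sorting by score can still be meaningful when browsing, here is a toy model. It is a deliberate simplification and not how Lucene computes scores: it assumes every document gets the same base score from the match-all clause and that the bq/bf contributions are simply added on top. The class BrowseRanking and its inputs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy additive scoring model; real Lucene scoring involves more factors.
final class BrowseRanking {
    // boostByDoc: per-document boost standing in for bq/bf contributions.
    // constantBase: the identical base score every doc gets when there is no "q".
    static List<String> rank(Map<String, Double> boostByDoc, double constantBase) {
        List<String> ids = new ArrayList<>(boostByDoc.keySet());
        ids.sort(Comparator
                .comparingDouble((String id) -> constantBase + boostByDoc.get(id))
                .reversed()
                // Break ties by id so the order is arbitrary but consistent.
                .thenComparing(Comparator.<String>naturalOrder()));
        return ids;
    }
}
```

In this model the base score cancels out of every comparison, so the ordering is decided entirely by the boosts, with ties falling back to a stable but arbitrary order.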




-Hoss