Posted to solr-user@lucene.apache.org by Ofer Fort <of...@gmail.com> on 2011/03/02 18:11:18 UTC

Efficient boolean query

Hey all,
I have an index with a lot of documents containing the term X and no documents
containing the term Y.
If I query for X, it takes a few seconds and returns the results.
If I query for Y, it takes a millisecond and returns an empty set.
If I query for Y AND X, it takes a few seconds and returns an empty set.

I'm guessing that it evaluates both X and Y and only then tries to intersect
them?

Am I wrong? Is there another way to run this query more efficiently?

Thanks for any input

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

Thanks,
I tried it in the past and found out that my hit ratio was pretty low, so it
doesn't help most of my queries

ofer

On Wed, Mar 2, 2011 at 7:16 PM, Geert-Jan Brits <gb...@gmail.com> wrote:

> If you often query X as part of several other queries (e.g: X  | X AND Y |
>  X AND Z)
> you might consider putting X in a filter query (
> http://wiki.apache.org/solr/CommonQueryParameters#fq)
>
> leading to:
> q=*:*&fq=X
> q=Y&fq=X
> q=Z&fq=X
>
> Filter queries are cached seperately which means that after the first query
> involving X, X should be returned quickly.
> So your FIRST query will probably still be in the 'few seconds'- range, but
> all following queries involving X will return much quicker.
>
> hth,
> Geert-Jan
>
> 2011/3/2 Ofer Fort <of...@gmail.com>
>
> > Hey all,
> > I have an index with a lot of documents with the term X and no documents
> > with the term Y.
> > If i query for X it take a few seconds and returns the results.
> > If I query for Y it takes a millisecond and returns an empty set.
> > If i query for Y AND X it takes a few seconds and returns an empty set.
> >
> > I'm guessing that it evaluate both X and Y and only then tries to
> intersect
> > them?
> >
> > Am i wrong? is there another way to run this query more efficiently?
> >
> > thanks for any input
> >
>

Re: Efficient boolean query

Posted by Geert-Jan Brits <gb...@gmail.com>.

If you often query X as part of several other queries (e.g: X  | X AND Y |
 X AND Z)
you might consider putting X in a filter query (
http://wiki.apache.org/solr/CommonQueryParameters#fq)

leading to:
q=*:*&fq=X
q=Y&fq=X
q=Z&fq=X

Filter queries are cached separately, which means that after the first query
involving X, X should be returned quickly.
So your FIRST query will probably still be in the 'few seconds' range, but
all following queries involving X will return much quicker.

hth,
Geert-Jan
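
A minimal sketch of why the later queries get faster, assuming a plain in-memory map keyed by the fq string. FilterCacheSketch and its methods are hypothetical names used only for illustration; this is not Solr's actual filterCache code.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not Solr's filterCache implementation.
final class FilterCacheSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> evaluator;
    private int evaluations = 0;

    // evaluator stands in for the expensive work of running fq against the index.
    FilterCacheSketch(Function<String, BitSet> evaluator) {
        this.evaluator = evaluator;
    }

    // Returns the doc set for fq, evaluating it only the first time it is seen.
    private BitSet filter(String fq) {
        BitSet docs = cache.get(fq);
        if (docs == null) {
            evaluations++;
            docs = evaluator.apply(fq);
            cache.put(fq, docs);
        }
        return docs;
    }

    // Intersects the hits of the main query (q) with the cached filter (fq).
    BitSet search(BitSet queryHits, String fq) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(filter(fq));
        return result;
    }

    // Number of times a filter actually had to be evaluated.
    int evaluations() {
        return evaluations;
    }
}
```

In this sketch, q=Y&fq=X followed by q=Z&fq=X evaluates X once; the second search only pays for the intersection.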

2011/3/2 Ofer Fort <of...@gmail.com>

> Hey all,
> I have an index with a lot of documents with the term X and no documents
> with the term Y.
> If i query for X it take a few seconds and returns the results.
> If I query for Y it takes a millisecond and returns an empty set.
> If i query for Y AND X it takes a few seconds and returns an empty set.
>
> I'm guessing that it evaluate both X and Y and only then tries to intersect
> them?
>
> Am i wrong? is there another way to run this query more efficiently?
>
> thanks for any input
>

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

timestamp is of type:

<field name="timestamp" type="tdate" indexed="true" stored="true"
default="NOW" multiValued="false"/>


Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

That's great, just what I needed. I was debugging and was expecting to
see something like this.
I'll look through the SVN history to see in which version it was added.
Thanks

On Wednesday, March 2, 2011, Yonik Seeley <yo...@lucidimagination.com> wrote:
> On Wed, Mar 2, 2011 at 2:43 PM, Ofer Fort <of...@tra.cx> wrote:
>> I didn't see this behavior, running solr 1.4.1, was that implemented
>> after this release?
>
> I think so.
> It's implemented now in BooleanWeight.scorer()
>
>       for (Weight w  : weights) {
>         BooleanClause c =  cIter.next();
>         Scorer subScorer = w.scorer(context, ScorerContext.def());
>         if (subScorer == null) {
>           if (c.isRequired()) {
>             return null;
>           }
>
> And TermWeight returns null from scorer() if there are no matches for
> the segment.
>
> -Yonik
> http://lucidimagination.com
>

Re: Efficient boolean query

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Wed, Mar 2, 2011 at 2:43 PM, Ofer Fort <of...@tra.cx> wrote:
> I didn't see this behavior, running solr 1.4.1, was that implemented
> after this release?

I think so.
It's implemented now in BooleanWeight.scorer()

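      for (Weight w : weights) {
        BooleanClause c = cIter.next();
        Scorer subScorer = w.scorer(context, ScorerContext.def());
        if (subScorer == null) {
          // This clause has no matches in the current segment.
          if (c.isRequired()) {
            // A required clause cannot match, so the whole query cannot either.
            return null;
          }
        }
        // (rest of the loop not shown)
      }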

And TermWeight returns null from scorer() if there are no matches for
the segment.

-Yonik
http://lucidimagination.com
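
A minimal sketch of the short-circuit described above, in plain Java with no Lucene dependency. ShortCircuitSketch, Clause and buildScorers are hypothetical stand-ins (an int[] of doc ids plays the role of a Scorer, and null means "no matches"); it only illustrates why clause order matters, under the assumption that scorers are built one clause at a time in query order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the loop in BooleanWeight.scorer(); not Lucene code.
final class ShortCircuitSketch {

    static final class Clause {
        final boolean required;
        // Building a "scorer" may be expensive; it returns null when nothing matches.
        final Supplier<int[]> scorerFactory;

        Clause(boolean required, Supplier<int[]> scorerFactory) {
            this.required = required;
            this.scorerFactory = scorerFactory;
        }
    }

    // Builds sub-scorers in clause order and gives up as soon as a
    // required clause has no matches, without touching later clauses.
    static List<int[]> buildScorers(List<Clause> clauses) {
        List<int[]> scorers = new ArrayList<>();
        for (Clause c : clauses) {
            int[] sub = c.scorerFactory.get();
            if (sub == null) {
                if (c.required) {
                    return null;
                }
                continue; // an optional clause with no matches is simply skipped
            }
            scorers.add(sub);
        }
        return scorers;
    }

    public static void main(String[] args) {
        Clause term = new Clause(true, () -> null);
        Clause range = new Clause(true, () -> {
            System.out.println("building expensive range scorer");
            return new int[] {1, 2, 3};
        });
        System.out.println(buildScorers(Arrays.asList(term, range)) == null);
        System.out.println(buildScorers(Arrays.asList(range, term)) == null);
    }
}
```

With the empty term clause first, the range factory is never invoked; with the range clause first, the expensive scorer is built before the query gives up.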

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

I didn't see this behavior running Solr 1.4.1. Was that implemented
after this release?

On Wednesday, March 2, 2011, Yonik Seeley <yo...@lucidimagination.com> wrote:
> On Wed, Mar 2, 2011 at 1:58 PM, Ofer Fort <of...@tra.cx> wrote:
>> Thanks,
>> But each query tries to see if there is something new since the last result
>> that was found, so rounding things will return the same documents over  and
>> over again, till we reach to the next rounded point.
>>
>> Could i use the document id somehow?  or something else that's bigger than
>> my last search?
>>
>> And even it was a simple term query, on the lucene side of things, why would
>> it try to fetch ALL the terms if one of the required ones resulted in an
>> empty set?
>
> In general, all items are fetched for a big multi-term query because
> it's very difficult to answer the question "what's the first document
> after x that matches any of the terms" without doing so.
>
> More specifically, Lucene does do some short-circuiting for
> non-matches (at least in trunk... not sure about other versions).
> If you reorder your query to
> oferiko AND timestamp:[2011-02-01T00:00:00Z TO NOW]
>
> Then when there is no match on oferiko, BooleanScorer will not ask for
> the scorer for the second clause.
>
> -Yonik
> http://lucidimagination.com
>

Re: Efficient boolean query

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Wed, Mar 2, 2011 at 1:58 PM, Ofer Fort <of...@tra.cx> wrote:
> Thanks,
> But each query tries to see if there is something new since the last result
> that was found, so rounding things will return the same documents over  and
> over again, till we reach to the next rounded point.
>
> Could i use the document id somehow?  or something else that's bigger than
> my last search?
>
> And even it was a simple term query, on the lucene side of things, why would
> it try to fetch ALL the terms if one of the required ones resulted in an
> empty set?

In general, all items are fetched for a big multi-term query because
it's very difficult to answer the question "what's the first document
after x that matches any of the terms" without doing so.

More specifically, Lucene does do some short-circuiting for
non-matches (at least in trunk... not sure about other versions).
If you reorder your query to
oferiko AND timestamp:[2011-02-01T00:00:00Z TO NOW]

Then when there is no match on oferiko, BooleanScorer will not ask for
the scorer for the second clause.

-Yonik
http://lucidimagination.com

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

I'm guessing what I was describing is short-circuit evaluation, and I see
that Lucene doesn't have it:
http://lucene.472066.n3.nabble.com/Short-circuit-in-query-td738551.html

I would still love to hear any suggestions for my type of query

ofer

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

Thanks,
But each query checks whether there is something new since the last result
that was found, so rounding will return the same documents over and
over again until we reach the next rounded point.

Could I use the document id somehow? Or something else that's bigger than
my last search?

And even if it were a simple term query, on the Lucene side of things, why would
it try to fetch ALL the terms if one of the required ones resulted in an
empty set?

Thanks for your help, specifically on this matter and in general, to the
search community :-)

On Wed, Mar 2, 2011 at 8:35 PM, Yonik Seeley <yo...@lucidimagination.com>wrote:

> One way to speed things up would be to reduce the resolution on
> timestamps that you index.
> Another way would be to decrease the precisionStep on the tdate field
> type (bigger index, but faster range queries)
> Yet another way is to use "fq" filters that can be reused many times.
>
> One way to increase fq reuse is to round.
> This rounds up to the nearest hour... assumes 2011-02-01T00:00:00Z is
> the same across many queries.
> fq=timestamp:[2011-02-01T00:00:00Z TO NOW/HOUR+1HOUR]
>
> Another way is to split the filter into two parts - a large part that
> doesn't change much + a small part that does.
> Again this assumes that the first endpoint is reused across many queries.
> fq=timestamp:[2011-02-01T00:00:00Z TO
> NOW/HOUR+1HOUR]&fq=timestamp:[NOW/HOUR TO NOW]
>
> If the first endpoint is *not* reused across many queries, then you
> can still use the same strategy as above by adding another small "fq"
> for the lower endpoint.
>
> -Yonik
> http://lucidimagination.com
>
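
A minimal sketch of why rounding increases fq reuse, assuming java.time is available on the client. Solr evaluates the date math itself, so this code is not needed to use it; RoundedFilterSketch and its methods are hypothetical names that only show what NOW/HOUR+1HOUR resolves to for different request times.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration of the NOW/HOUR+1HOUR rounding; not Solr code.
final class RoundedFilterSketch {

    // Equivalent of NOW/HOUR+1HOUR: round down to the hour, then add one hour.
    static Instant roundUpToHour(Instant now) {
        return now.truncatedTo(ChronoUnit.HOURS).plus(1, ChronoUnit.HOURS);
    }

    // The filter string as it would look once the date math is resolved.
    static String coarseFilter(String start, Instant now) {
        return "timestamp:[" + start + " TO " + roundUpToHour(now) + "]";
    }

    public static void main(String[] args) {
        String start = "2011-02-01T00:00:00Z";
        // Two requests within the same hour resolve to the same filter.
        System.out.println(coarseFilter(start, Instant.parse("2011-03-02T12:39:44Z")));
        System.out.println(coarseFilter(start, Instant.parse("2011-03-02T12:59:59Z")));
    }
}
```

Every request in the same hour resolves to the same range, so the cached filter can be reused; an unrounded NOW changes on every request and never matches a cached entry.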

Re: Efficient boolean query

Posted by Ofer Fort <of...@tra.cx>.

You are correct that my query is a range one; I probably should have mentioned
it in the first post.
This is the debug data:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">4173</int>
 <lst name="params">
  <str name="debugQuery">on</str>
  <str name="indent">on</str>

  <str name="start">0</str>
  <str name="q">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
  <str name="version">2.2</str>
  <str name="rows">10</str>
 </lst>
</lst>
<result name="response" numFound="0" start="0"/>
<lst name="debug">

 <str name="rawquerystring">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
 <str name="querystring">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
 <str name="parsedquery">+timestamp:[1296518400000 TO 1299069584823] +contents:oferiko</str>
 <str name="parsedquery_toString">+timestamp:[1296518400000 TO 1299069584823] +contents:oferiko</str>
 <lst name="explain"/>
 <str name="QParser">LuceneQParser</str>

 <lst name="timing">
  <double name="time">4171.0</double>
  <lst name="prepare">
    <double name="time">0.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
     <double name="time">0.0</double>
    </lst>

    <lst name="org.apache.solr.handler.component.FacetComponent">
     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
     <double name="time">0.0</double>

    </lst>
    <lst name="org.apache.solr.handler.component.StatsComponent">
     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
     <double name="time">0.0</double>
    </lst>
  </lst>

  <lst name="process">
    <double name="time">4171.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
     <double name="time">4171.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.FacetComponent">
     <double name="time">0.0</double>

    </lst>
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.StatsComponent">

     <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
     <double name="time">0.0</double>
    </lst>
  </lst>
 </lst>
</lst>

</response>


On Wed, Mar 2, 2011 at 7:48 PM, Yonik Seeley <yo...@lucidimagination.com>wrote:

> On Wed, Mar 2, 2011 at 12:11 PM, Ofer Fort <of...@gmail.com> wrote:
> > Hey all,
> > I have an index with a lot of documents with the term X and no documents
> > with the term Y.
> > If i query for X it take a few seconds and returns the results.
> > If I query for Y it takes a millisecond and returns an empty set.
> > If i query for Y AND X it takes a few seconds and returns an empty set.
>
> This depends on the specifics of what X is.   Some query types must
> generate all hits first internally - an example is a multi-term query
> (like numeric range query, etc) that matches many terms.
>
> Can you show the generated query (i.e. add debugQuery=true to the request)?
>
> -Yonik
> http://lucidimagination.com
>

Re: Efficient boolean query

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Wed, Mar 2, 2011 at 12:11 PM, Ofer Fort <of...@gmail.com> wrote:
> Hey all,
> I have an index with a lot of documents with the term X and no documents
> with the term Y.
> If i query for X it take a few seconds and returns the results.
> If I query for Y it takes a millisecond and returns an empty set.
> If i query for Y AND X it takes a few seconds and returns an empty set.

This depends on the specifics of what X is.   Some query types must
generate all hits first internally - an example is a multi-term query
(like numeric range query, etc) that matches many terms.

Can you show the generated query (i.e. add debugQuery=true to the request)?

-Yonik
http://lucidimagination.com