Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2011/12/16 05:46:03 UTC

NumericRangeQuery.extractTerms and distributed stats

Hi

To run a distributed search in which each searcher's local term
statistics are replaced with the global ones, we do the following:

(1) Receive a query String and parse into a Query object
(2) Call q.extractTerms()
(3) Fetch stats from each Searcher (forget about caching at the moment)
(4) Transmit the fixed statistics to all searchers so they fix their local
stats with the global ones.
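
Steps (3) and (4) can be sketched without any Lucene dependency. The
following is a minimal illustration only: GlobalStats, ShardStats and merge
are hypothetical names, terms are plain strings, and the only statistics
modelled are the document count and per-term docFreq. It is not the code
used in the setup described here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of steps (3) and (4); these are not Lucene classes. */
class GlobalStats {
    /** Statistics of one shard: document count plus docFreq per term (assumed shape). */
    static final class ShardStats {
        final int maxDoc;
        final Map<String, Integer> docFreq;

        ShardStats(int maxDoc, Map<String, Integer> docFreq) {
            this.maxDoc = maxDoc;
            this.docFreq = docFreq;
        }
    }

    /** Sums the per-shard numbers; a term that a shard does not know counts as 0. */
    static ShardStats merge(List<String> terms, List<ShardStats> shards) {
        int totalDocs = 0;
        Map<String, Integer> global = new HashMap<>();
        for (String term : terms) {
            global.put(term, 0);
        }
        for (ShardStats shard : shards) {
            totalDocs += shard.maxDoc;
            for (String term : terms) {
                global.merge(term, shard.docFreq.getOrDefault(term, 0), Integer::sum);
            }
        }
        // In step (4) this object would be sent back to every searcher.
        return new ShardStats(totalDocs, global);
    }
}
```

The list of terms passed to merge is exactly what step (2) is supposed to
produce, which is why a failing extractTerms() blocks the whole procedure.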

If the query contains a range clause, extractTerms() throws
UnsupportedOperationException, because NumericRangeQuery (and MTQ) do not
override Query.extractTerms. I understand that the numeric range terms are
meaningless, and so I would at least expect it to not do anything, rather
than throw UOE. Throwing an exception IMO is too drastic for this
operation, and it prevents useful functionality.

The question is where to fix it -- maybe we should change Query's impl to
not do anything? Or, we fix NRQ/MTQ to override with a silent no-op impl,
and in trunk make Query.extractTerms abstract?

If someone has an idea for obtaining the list of query Terms without
hitting the exception (and without looping through all the clauses),
please share it.

Shai

Re: NumericRangeQuery.extractTerms and distributed stats

Posted by Shai Erera <se...@gmail.com>.
Argh .. I read the Query.extractTerms javadocs more carefully:

"Only works if this query is in its {@link #rewrite rewritten} form."

Sorry for the disturbance :).

Shai

On Fri, Dec 16, 2011 at 6:54 AM, Robert Muir <rc...@gmail.com> wrote:

> I think the exception is correct, you need to call extractTerms after
> rewrite()
>
> Mike has a cool test on https://issues.apache.org/jira/browse/LUCENE-3639,
> and if i recall it tests multitermqueries.
>
> currently his searcher overrides IS.rewrite():
>
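> @Override
> public Query rewrite(Query q) throws IOException {
>   Query rewritten = super.rewrite(q);
>   // terms: a Set<Term> field; extractTerms fills the set passed to it
>   terms = new HashSet<Term>();
>   rewritten.extractTerms(terms);
>   ...
>   return rewritten;
> }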
>
> i added some comments on there about how it's all still a bit funky and
> am hoping we can still make this easier in trunk.

Re: NumericRangeQuery.extractTerms and distributed stats

Posted by Robert Muir <rc...@gmail.com>.
On Thu, Dec 15, 2011 at 11:46 PM, Shai Erera <se...@gmail.com> wrote:
> Hi
>
> To run a distributed search in which each searcher's local term
> statistics are replaced with the global ones, we do the following:
>
> (1) Receive a query String and parse into a Query object
> (2) Call q.extractTerms()
> (3) Fetch stats from each Searcher (forget about caching at the moment)
> (4) Transmit the fixed statistics to all searchers so they fix their local
> stats with the global ones.
>
> If the query contains a range clause, extractTerms() throws
> UnsupportedOperationException, because NumericRangeQuery (and MTQ) do not
> override Query.extractTerms. I understand that the numeric range terms are
> meaningless, and so I would at least expect it to not do anything, rather
> than throw UOE. Throwing an exception IMO is too drastic for this operation,
> and it prevents useful functionality.

I think the exception is correct, you need to call extractTerms after rewrite()

Mike has a cool test on https://issues.apache.org/jira/browse/LUCENE-3639,
and if i recall it tests multitermqueries.

currently his searcher overrides IS.rewrite():

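@Override
public Query rewrite(Query q) throws IOException {
   Query rewritten = super.rewrite(q);
   // terms: a Set<Term> field; extractTerms fills the set passed to it
   terms = new HashSet<Term>();
   rewritten.extractTerms(terms);
   ...
   return rewritten;
}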
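
To make the ordering concrete, here is a library-free toy model of "rewrite
first, then extract". Q, TermQ, OrQ, RangeQ and termsOf are hypothetical
stand-ins, not Lucene's classes, and the sketch assumes a range always
expands into one clause per matching index term. In real Lucene the
rewritten form depends on the rewrite method in use, so treat this only as
an illustration of why the unrewritten query cannot list its terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-ins for the query classes; these are NOT Lucene's implementations. */
class RewriteDemo {
    static abstract class Q {
        /** Default: already in primitive form. */
        Q rewrite(List<String> indexTerms) { return this; }
        /** Default mirrors the behaviour discussed in this thread: unsupported. */
        void extractTerms(Set<String> out) { throw new UnsupportedOperationException(); }
    }

    static final class TermQ extends Q {
        final String term;
        TermQ(String term) { this.term = term; }
        @Override void extractTerms(Set<String> out) { out.add(term); }
    }

    static final class OrQ extends Q {
        final List<Q> clauses;
        OrQ(List<Q> clauses) { this.clauses = clauses; }
        @Override Q rewrite(List<String> indexTerms) {
            List<Q> rewritten = new ArrayList<>();
            for (Q clause : clauses) rewritten.add(clause.rewrite(indexTerms));
            return new OrQ(rewritten);
        }
        @Override void extractTerms(Set<String> out) {
            for (Q clause : clauses) clause.extractTerms(out);
        }
    }

    /** Matches terms in [lo, hi]; only rewrite() sees which concrete terms exist. */
    static final class RangeQ extends Q {
        final String lo, hi;
        RangeQ(String lo, String hi) { this.lo = lo; this.hi = hi; }
        @Override Q rewrite(List<String> indexTerms) {
            List<Q> clauses = new ArrayList<>();
            for (String t : indexTerms) {
                if (t.compareTo(lo) >= 0 && t.compareTo(hi) <= 0) clauses.add(new TermQ(t));
            }
            return new OrQ(clauses);
        }
    }

    /** Rewrites against the index terms, then collects the terms of the result. */
    static Set<String> termsOf(Q query, List<String> indexTerms) {
        Set<String> out = new LinkedHashSet<>();
        query.rewrite(indexTerms).extractTerms(out);
        return out;
    }
}
```

Calling extractTerms directly on a query that still contains a RangeQ throws
in this model, while termsOf succeeds because every clause has been reduced
to term clauses first.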

i added some comments on there about how it's all still a bit funky and
am hoping we can still make this easier in trunk.

-- 
lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org