Posted to java-user@lucene.apache.org by Greg Huber <gr...@gmail.com> on 2015/01/09 17:21:19 UTC

version 4.10.3 AnalyzingInfixSuggester with multiple contexts

Hello,

I am trying to use multiple contexts with
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester, but
there appears to be a mistake in the clause built on CONTEXTS_FIELD_NAME:
the BooleanClause.Occur.SHOULD needs to be BooleanClause.Occur.MUST (see
<<<<<<<<<<<<<<<<<< below).

I noticed that it's been fixed on the trunk but not in the current release.
Does this mean it's not going to be fixed in the 4.x.x releases?

version  4.10.3

...
if (contexts != null) {
  BooleanQuery sub = new BooleanQuery();
  query.add(sub, BooleanClause.Occur.MUST);
  for (BytesRef context : contexts) {
    // NOTE: we "should" wrap this in
    // ConstantScoreQuery, or maybe send this as a
    // Filter instead to search, but since all of
    // these are MUST'd, the change to the score won't
    // affect the overall ranking. Since we indexed
    // as DOCS_ONLY, the perf should be the same
    // either way (no freq int[] blocks to decode):

    // TODO: if we had a BinaryTermField we could fix
    // this "must be valid ut8f" limitation:
    sub.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context.utf8ToString())),
            BooleanClause.Occur.SHOULD);  // <<<<<<<<<<<<<<<<<<
  }
}
..

trunk:

..
// do not make a subquery if all context booleans are must not
if (allMustNot == true) {
  for (Map.Entry<BytesRef, BooleanClause.Occur> entry : contextInfo.entrySet()) {
    query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, entry.getKey().utf8ToString())),
              BooleanClause.Occur.MUST_NOT);
  }

} else {
  BooleanQuery sub = new BooleanQuery();
  query.add(sub, BooleanClause.Occur.MUST);  // <<<<<<<<<<<<<<<<<< fixed!

  for (Map.Entry<BytesRef, BooleanClause.Occur> entry : contextInfo.entrySet()) {
    // NOTE: we "should" wrap this in
    // ConstantScoreQuery, or maybe send this as a
    // Filter instead to search.

    // TODO: if we had a BinaryTermField we could fix
    // this "must be valid ut8f" limitation:
    sub.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, entry.getKey().utf8ToString())),
            entry.getValue());
  }
}
..
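
To make the per-context behaviour of the trunk snippet easier to follow, here is a rough, self-contained sketch in plain Java. It does not use Lucene: the class ContextInfoSketch, its Occur enum, and the matches helper are all hypothetical stand-ins, and the rules assume the default BooleanQuery behaviour (no minimum-should-match set).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the trunk context filter; none of this is Lucene code.
final class ContextInfoSketch {
    // Stand-in for BooleanClause.Occur (not the Lucene enum itself).
    enum Occur { MUST, SHOULD, MUST_NOT }

    // Decides whether a suggestion tagged with docContexts passes the filter.
    static boolean matches(Map<String, Occur> contextInfo, Set<String> docContexts) {
        boolean hasMust = false;
        boolean hasShould = false;
        boolean shouldHit = false;
        for (Map.Entry<String, Occur> e : contextInfo.entrySet()) {
            boolean present = docContexts.contains(e.getKey());
            switch (e.getValue()) {
                case MUST:
                    hasMust = true;
                    if (!present) return false;  // a required context is missing
                    break;
                case MUST_NOT:
                    if (present) return false;   // a prohibited context is present
                    break;
                default: // SHOULD
                    hasShould = true;
                    shouldHit |= present;
            }
        }
        // With no MUST clause, at least one SHOULD has to hit. If every clause
        // is MUST_NOT, the clauses only exclude, which mirrors the allMustNot branch.
        return hasMust || !hasShould || shouldHit;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("books", "english"));
        Map<String, Occur> info = new LinkedHashMap<>();
        info.put("books", Occur.MUST);
        info.put("german", Occur.MUST_NOT);
        System.out.println(matches(info, doc)); // true
        info.put("english", Occur.MUST_NOT);
        System.out.println(matches(info, doc)); // false
    }
}
```

The all-MUST_NOT case is handled separately on trunk because a Lucene BooleanQuery made only of prohibited clauses matches nothing on its own, so those clauses are added straight to the outer query instead.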

Cheers Greg

Re: version 4.10.3 AnalyzingInfixSuggester with multiple contexts

Posted by Michael McCandless <lu...@mikemccandless.com>.
Well, this is by design, really.  I.e., the original intent here (4.10.3)
is to return a suggestion if it has any of the specified contexts.

Maybe for 4.10.3 you could subclass AnalyzingInfixSuggester and override
finishQuery to rewrite the SHOULD to MUST in your case?

Mike McCandless

http://blog.mikemccandless.com
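
The difference between the two readings in this thread (any of the contexts versus all of them) can be illustrated with a small sketch in plain Java. It deliberately avoids the Lucene API: ContextOccurSketch, its Occur enum, the sample suggestions, and the context names are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of SHOULD vs MUST on the contexts clause; not Lucene code.
final class ContextOccurSketch {
    // Stand-in for BooleanClause.Occur, limited to the two values discussed here.
    enum Occur { SHOULD, MUST }

    // True if a suggestion tagged with docContexts passes the context filter.
    static boolean matches(Set<String> docContexts, Set<String> wanted, Occur occur) {
        if (occur == Occur.MUST) {
            return docContexts.containsAll(wanted);  // AND: every context required
        }
        for (String c : wanted) {                     // OR: any one context is enough
            if (docContexts.contains(c)) return true;
        }
        return false;
    }

    // Returns the suggestions that pass the filter, in insertion order.
    static List<String> suggest(Map<String, Set<String>> docs, Set<String> wanted, Occur occur) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : docs.entrySet()) {
            if (matches(e.getValue(), wanted, occur)) out.add(e.getKey());
        }
        return out;
    }

    static Set<String> set(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Made-up suggestions, each tagged with a set of contexts.
    static Map<String, Set<String>> sampleDocs() {
        Map<String, Set<String>> docs = new LinkedHashMap<>();
        docs.put("lucene in action", set("books", "english"));
        docs.put("lucene im einsatz", set("books", "german"));
        docs.put("lucene t-shirt", set("clothing", "english"));
        return docs;
    }

    public static void main(String[] args) {
        Set<String> wanted = set("books", "english");
        // prints [lucene in action, lucene im einsatz, lucene t-shirt]
        System.out.println(suggest(sampleDocs(), wanted, Occur.SHOULD));
        // prints [lucene in action]
        System.out.println(suggest(sampleDocs(), wanted, Occur.MUST));
    }
}
```

With SHOULD, every suggestion that carries at least one of the requested contexts comes back, which is the 4.10.3 behaviour described above; with MUST, only suggestions carrying all of them do.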


On Fri, Jan 9, 2015 at 12:02 PM, Greg Huber <gr...@gmail.com> wrote:
> Mike,
>
> It's correct on the trunk, but it looks like a bug in 4.10.3: the
> BooleanClause.Occur.SHOULD is incorrect, as it creates an OR rather than an
> AND.  As it is, I get the results for every one of the contexts, whereas if I
> change it to MUST it works correctly.
>
> currently:
> sub.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context.utf8ToString())),
>         BooleanClause.Occur.SHOULD);
>
> needs to be:
> sub.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context.utf8ToString())),
>         BooleanClause.Occur.MUST);
>
> Cheers Greg
>
> On 9 January 2015 at 16:52, Michael McCandless <lu...@mikemccandless.com>
> wrote:
>>
>> That change (a new feature, to let you control MUST vs SHOULD for each
>> context) was done with
>> https://issues.apache.org/jira/browse/LUCENE-6050
>>
>> But it's a new feature, not a bug ... and 4.10.x is for bug fixes
>> only, so I don't think we will backport it.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: version 4.10.3 AnalyzingInfixSuggester with multiple contexts

Posted by Michael McCandless <lu...@mikemccandless.com>.
That change (a new feature, to let you control MUST vs SHOULD for each
context) was done with
https://issues.apache.org/jira/browse/LUCENE-6050

But it's a new feature, not a bug ... and 4.10.x is for bug fixes
only, so I don't think we will backport it.

Mike McCandless

http://blog.mikemccandless.com

