Posted to java-user@lucene.apache.org by ba...@oracle.com on 2018/09/18 20:16:41 UTC

MultiPhraseQuery

Hi,

How does MultiPhraseQuery treat synonyms?

Is the following possible?

... (index created with synonyms; indexReader is open on that index)

IndexSearcher is = new IndexSearcher(indexReader);

MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
// MultiPhraseQuery.Builder takes a Term[] when an explicit position is given
builder.add(new Term[] { new Term("body", "one") }, 0);
builder.add(new Term[] { new Term("body", "two") }, 1);
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // top 20 hits

Best regards


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: MultiPhraseQuery

Posted by ba...@oracle.com.

OK, Mike, that was very helpful.

Now I think I should use a BooleanQuery of PhraseQueries, but will
PhraseQuery be able to handle all synonyms, whether multi-term or single-term?

What is the best approach for this:

I have multiple tokens and I want to be able to do a cheap fuzzy search.

Best regards


On 9/18/18 5:28 PM, Michael McCandless wrote:
> Yes, +1 for a patch to improve the docs!
>
> MultiPhraseQuery only works for single term synonyms, and is usually
> produced by query parsers when the incoming query text had single term
> synonyms matching, I think?  The query parser will use other (span?)
> queries for multi token synonyms.
>
> I think the example in the javadoc should be simplified to not use "app*",
> e.g. maybe just matching "Microsoft Excel|Word"?
>
> Mike McCandless
>
> http://blog.mikemccandless.com

Re: MultiPhraseQuery

Posted by Michael McCandless <lu...@mikemccandless.com>.

Yes, +1 for a patch to improve the docs!

MultiPhraseQuery only works for single term synonyms, and is usually
produced by query parsers when the incoming query text had single term
synonyms matching, I think?  The query parser will use other (span?)
queries for multi token synonyms.

I think the example in the javadoc should be simplified to not use "app*",
e.g. maybe just matching "Microsoft Excel|Word"?

Mike McCandless

http://blog.mikemccandless.com
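
[Editorial sketch] The behaviour described above, one set of alternative terms per phrase position, can be modelled in plain Java. This is only an illustration of the matching rule for the "Microsoft Excel|Word" case, not Lucene code: the class MultiPhraseModel and the helpers alt and matches are hypothetical names, and slop and scoring are ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of multi-phrase matching; hypothetical, not part of Lucene. */
public class MultiPhraseModel {

    /** Builds the set of alternative terms allowed at one phrase position. */
    public static Set<String> alt(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    /**
     * Returns true if some run of consecutive tokens in doc matches the phrase,
     * where phrase.get(i) holds the alternatives (OR) allowed at position i.
     */
    public static boolean matches(List<String> doc, List<Set<String>> phrase) {
        for (int start = 0; start + phrase.size() <= doc.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < phrase.size() && ok; i++) {
                ok = phrase.get(i).contains(doc.get(start + i));
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Position 0: "microsoft"; position 1: "excel" OR "word".
        List<Set<String>> phrase = Arrays.asList(alt("microsoft"), alt("excel", "word"));

        // prints "true": "microsoft" is directly followed by "word"
        System.out.println(matches(Arrays.asList("i", "use", "microsoft", "word"), phrase));
        // prints "false": "office" sits between the two positions
        System.out.println(matches(Arrays.asList("microsoft", "office", "word"), phrase));
    }
}
```

Each alternative in this model occupies exactly one position, which is why it fits single-term synonyms only. A synonym that expands to several tokens would span several positions and does not fit this shape.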


On Wed, Sep 19, 2018 at 5:59 AM Erick Erickson <er...@gmail.com>
wrote:

> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> This is where someone coming into the examples for the first time is
> invaluable, javadoc patches are most welcome! It can be hard to back
> off enough to remember what the confusing bits are when you wrote the
> code ;)

Re: MultiPhraseQuery

Posted by ba...@oracle.com.

FuzzyQuery also does not seem suitable for me.

PrefixQuery can handle only a single token, right?

Best


On 9/18/18 5:23 PM, baris.kazar@oracle.com wrote:
> Erick,-
>  i think the reason why MultiPhraseQuery was created was synonyms as 
> far as i understood. am i right?
>
> i want to have a BooleanQuery or MultiPhraseQuery (i cant decide 
> between these two) with an index which considers synonyms already.
> One disadvantage of MultiPhraseQuery is that it needs to match all the 
> terms.
> Then should i go for BooleanQuery with multiple PhraseQueries? but 
> PhraseQuery cannot handle synonyms.
> i know TermQuery is for exact match so i cant use that either in this 
> case.
>
> i have multiple tokens and i want to be able to do a cheap fuzzy search.
> Best regards

Re: MultiPhraseQuery

Posted by ba...@oracle.com.

Erick,

As far as I understand, MultiPhraseQuery was created to handle synonyms.
Am I right?

I want to use a BooleanQuery or a MultiPhraseQuery (I can't decide between
the two) against an index that already includes synonyms.
One disadvantage of MultiPhraseQuery is that every position in the phrase
must match.
Should I then go for a BooleanQuery with multiple PhraseQueries? But
PhraseQuery cannot handle synonyms.
I know TermQuery is for exact matches, so I can't use that either in this case.

I have multiple tokens and I want to be able to do a cheap fuzzy search.
Best regards
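
[Editorial sketch] The trade-off raised above (a phrase query needs every position to match, while a boolean OR of several phrases can match on any one of them) can be illustrated with a small plain-Java model. It is not Lucene code and says nothing about synonym handling or real scoring. AnyPhraseModel and matchingPhrases are hypothetical names, and counting matched phrases is only a rough stand-in for how optional clauses might be combined.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Plain-Java model of an OR over several exact phrases; hypothetical, not Lucene. */
public class AnyPhraseModel {

    /** Counts how many of the given phrases occur as consecutive tokens in doc. */
    public static int matchingPhrases(List<String> doc, List<List<String>> phrases) {
        int count = 0;
        for (List<String> phrase : phrases) {
            // indexOfSubList returns -1 when the phrase is not a contiguous sublist.
            if (Collections.indexOfSubList(doc, phrase) >= 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("new", "york", "city", "hall");
        List<List<String>> phrases = Arrays.asList(
                Arrays.asList("new", "york"),
                Arrays.asList("city", "hall"),
                Arrays.asList("town", "hall"));

        // prints "2": the document matches two of the three optional phrases
        System.out.println(matchingPhrases(doc, phrases));
    }
}
```

A document is a candidate as soon as the count is at least one, and a higher count can be used to rank partial matches first. That is one inexpensive approximation of a fuzzy match over multiple tokens.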


On 9/18/18 4:58 PM, Erick Erickson wrote:
> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> This is where someone coming into the examples for the first time is
> invaluable, javadoc patches are most welcome! It can be hard to back
> off enough to remember what the confusing bits are when you wrote the
> code ;)

Re: MultiPhraseQuery

Posted by Erick Erickson <er...@gmail.com>.

bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.

This is where someone coming into the examples for the first time is
invaluable, javadoc patches are most welcome! It can be hard to back
off enough to remember what the confusing bits are when you wrote the
code ;)
On Tue, Sep 18, 2018 at 1:56 PM <ba...@oracle.com> wrote:
>
> Any suggestions please?
> Two main questions:
> - how do synonyms get utilized by MultiPhraseQuery?
> - how do we get second token "app" applied to the example on
> MultiPhraseQuery javadocs page? (and how do we get Terms[] array from
> Terms object?)
>
> Now three questions :)
>
> i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> Best

Re: MultiPhraseQuery

Posted by ba...@oracle.com.

Any suggestions, please?
Two main questions:
- How are synonyms used by MultiPhraseQuery?
- How do we apply the second token "app" in the example on the
MultiPhraseQuery javadocs page? (And how do we get a Term[] array from a
Terms object?)

Now three questions :)

I wish the Javadocs had examples like the ones in the PhraseQuery Javadocs.

Best

On 9/18/18 4:45 PM, baris.kazar@oracle.com wrote:
> Trying to implement the example on 
> https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html
>
> // A generalized version of PhraseQuery, with the possibility of 
> adding more than one term at the same position that are treated as a 
> disjunction (OR). To use this class to search for the phrase 
> "Microsoft app*" first create a Builder and use
>
> // MultiPhraseQuery.Builder.add(Term) on the term "microsoft" 
> (assuming lowercase analysis), then find all terms that have "app" as 
> prefix using LeafReader.terms(String), seeking to "app" then iterating 
> and collecting terms until there is no longer that prefix,
>
> // and finally use MultiPhraseQuery.Builder.add(Term[]) to add them. 
> MultiPhraseQuery.Builder.build() returns the fully constructed (and 
> immutable) MultiPhraseQuery.
>
>
> IndexSearcher is = new IndexSearcher(indexReader);
>
> MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
> builder.add(new Term("body", "one"), 0);
>
> Terms terms = LeafReader.terms("body"); // will this be slow? and how 
> do we incorporate token/word "app" here?
>
> // i STILL dont see how to get individual Term objects from terms 
> object and plus do i need to declare LeafReader object?
>
> Term[] termArr = new Term[k]; // i will get this filled via using 
> Terms.iterator
> builder.add(termArr);
> MultiPhraseQuery mpq = builder.build();
> TopDocs hits = is.search(mpq, 20);// 20 hits
>
>
> Best regards

Re: MultiPhraseQuery

Posted by ba...@oracle.com.

I am trying to implement the example at
https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html

The Javadoc says:

"A generalized version of PhraseQuery, with the possibility of adding
more than one term at the same position that are treated as a
disjunction (OR). To use this class to search for the phrase "Microsoft
app*" first create a Builder and use
MultiPhraseQuery.Builder.add(Term) on the term "microsoft" (assuming
lowercase analysis), then find all terms that have "app" as prefix using
LeafReader.terms(String), seeking to "app" then iterating and collecting
terms until there is no longer that prefix, and finally use
MultiPhraseQuery.Builder.add(Term[]) to add them.
MultiPhraseQuery.Builder.build() returns the fully constructed (and
immutable) MultiPhraseQuery."

This is what I have so far:

IndexSearcher is = new IndexSearcher(indexReader);

MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
builder.add(new Term("body", "one")); // first position

// Do I need to declare a LeafReader object for this call? Will it be slow?
// And how do we incorporate the token/word "app" here?
Terms terms = leafReader.terms("body");

// I still don't see how to get individual Term objects from the Terms object.
Term[] termArr = new Term[k]; // to be filled using Terms.iterator()
builder.add(termArr);
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // top 20 hits


Best regards
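
[Editorial sketch] The "seek to the prefix, then iterate until the prefix no longer matches" step from the Javadoc can be illustrated with a sorted set in plain Java. This is a model of the technique only, not Lucene code; PrefixTerms and withPrefix are hypothetical names, and a TreeSet stands in for the sorted term dictionary of a field. In Lucene the corresponding walk would presumably use the TermsEnum returned by Terms.iterator(), with seekCeil, term and next, wrapping each collected value in a Term for the field. That mapping is an assumption and is not tested here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Plain-Java model of prefix collection over a sorted term list; not Lucene. */
public class PrefixTerms {

    /** Collects every term that starts with prefix, in sorted order. */
    public static List<String> withPrefix(NavigableSet<String> sortedTerms, String prefix) {
        List<String> out = new ArrayList<>();
        // tailSet(prefix, true) plays the role of "seek to the prefix".
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can carry the prefix
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(Arrays.asList(
                "microsoft", "apple", "app", "banana", "application", "apply", "ape"));

        // prints "[app, apple, application, apply]"
        System.out.println(withPrefix(terms, "app"));
    }
}
```

Because the set is sorted, the loop stops at the first non-matching term, so the cost depends on the number of terms sharing the prefix rather than on the size of the whole dictionary. The collected list is what would be passed, as an array, to the second phrase position.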


On 9/18/18 4:16 PM, baris.kazar@oracle.com wrote:
> Hi,-
>
>  how does MultiPhraseQuery treat synonyms?
>
> is the following possible?
>
> ... (created index with synonyms and indexReader object has the index)
>
> IndexSearcher is = new IndexSearcher(indexReader);
>
> MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
> builder.add(new Term("body", "one"), 0);
> builder.add(new Term("body", "two"), 1);
> MultiPhraseQuery mpq = builder.build();
> TopDocs hits = is.search(mpq, 20);// 20 hits
>
> Best regards