Posted to java-user@lucene.apache.org by ba...@oracle.com on 2018/09/18 20:16:41 UTC
MultiPhraseQuery
Hi,
How does MultiPhraseQuery treat synonyms?
Is the following possible?
... (the index was created with synonyms, and indexReader is open on that index)
IndexSearcher is = new IndexSearcher(indexReader);
MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
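builder.add(new Term[] { new Term("body", "one") }, 0); // terms accepted at position 0
builder.add(new Term[] { new Term("body", "two") }, 1); // terms accepted at position 1
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // retrieve up to the top 20 hits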
Best regards
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: MultiPhraseQuery
Posted by ba...@oracle.com.
OK, Mike, that was very helpful.
Now I think I should use a BooleanQuery of PhraseQueries, but will
PhraseQuery be able to handle all synonyms, whether multi-term or single-term?
What is the best way to do this?
I have multiple tokens and I want to be able to do a cheap fuzzy search.
Best regards
On 9/18/18 5:28 PM, Michael McCandless wrote:
> Yes, +1 for a patch to improve the docs!
>
> MultiPhraseQuery only works for single term synonyms, and is usually
> produced by query parsers when the incoming query text had single term
> synonyms matching, I think? The query parser will use other (span?)
> queries for multi token synonyms.
>
> I think the example in the javadoc should be simplified to not use "app*",
> e.g. maybe just matching "Microsoft Excel|Word"?
>
> Mike McCandless
Re: MultiPhraseQuery
Posted by Michael McCandless <lu...@mikemccandless.com>.
Yes, +1 for a patch to improve the docs!
MultiPhraseQuery only works for single term synonyms, and is usually
produced by query parsers when the incoming query text had single term
synonyms matching, I think? The query parser will use other (span?)
queries for multi token synonyms.
I think the example in the javadoc should be simplified to not use "app*",
e.g. maybe just matching "Microsoft Excel|Word"?
Mike McCandless
http://blog.mikemccandless.com
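A minimal plain-Java sketch of the matching rule described above may help: every position in the phrase accepts a set of alternative single terms (for example "excel" or "word" after "microsoft"), and a document matches when consecutive tokens satisfy each position in turn. This is only a model of the semantics, not Lucene code; the class MultiPhraseModel and its methods matches and phrase are hypothetical names, and slop, scoring, and analysis are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "one set of alternative terms per phrase position".
class MultiPhraseModel {

    /** Builds a phrase from one String[] of accepted terms per position. */
    static List<Set<String>> phrase(String[]... positions) {
        List<Set<String>> result = new ArrayList<>();
        for (String[] alternatives : positions) {
            result.add(new HashSet<>(Arrays.asList(alternatives)));
        }
        return result;
    }

    /**
     * Returns true if some run of consecutive tokens in doc matches the phrase,
     * where phrase.get(i) is the set of terms accepted at position i.
     */
    static boolean matches(List<String> doc, List<Set<String>> phrase) {
        if (phrase.isEmpty()) {
            return false; // an empty phrase matches nothing in this model
        }
        for (int start = 0; start + phrase.size() <= doc.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < phrase.size() && ok; i++) {
                // Any one of the alternatives at position i is acceptable (OR).
                ok = phrase.get(i).contains(doc.get(start + i));
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Models the phrase "Microsoft Excel|Word".
        List<Set<String>> p = phrase(
                new String[] {"microsoft"},
                new String[] {"excel", "word"});
        System.out.println(matches(Arrays.asList("i", "use", "microsoft", "word"), p));   // true
        System.out.println(matches(Arrays.asList("microsoft", "office", "word"), p));     // false
    }
}
```

In Lucene itself the same phrase would presumably be built with MultiPhraseQuery.Builder.add(Term) for "microsoft" followed by MultiPhraseQuery.Builder.add(Term[]) for the two alternatives, as the javadoc quoted in this thread describes.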
On Wed, Sep 19, 2018 at 5:59 AM Erick Erickson <er...@gmail.com>
wrote:
> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> This is where someone coming into the examples for the first time is
> invaluable, javadoc patches are most welcome! It can be hard to back
> off enough to remember what the confusing bits are when you wrote the
> code ;)
Re: MultiPhraseQuery
Posted by ba...@oracle.com.
FuzzyQuery also does not seem suitable for me.
PrefixQuery works on a single token only, right?
Best
On 9/18/18 5:23 PM, baris.kazar@oracle.com wrote:
> Erick,-
> i think the reason why MultiPhraseQuery was created was synonyms as
> far as i understood. am i right?
>
> i want to have a BooleanQuery or MultiPhraseQuery (i cant decide
> between these two) with an index which considers synonyms already.
> One disadvantage of MultiPhraseQuery is that it needs to match all the
> terms.
> Then should i go for BooleanQuery with multiple PhraseQueries? but
> PhraseQuery cannot handle synonyms.
> i know TermQuery is for exact match so i cant use that either in this
> case.
>
> i have multiple tokens and i want to be able to do a cheap fuzzy search.
> Best regards
Re: MultiPhraseQuery
Posted by ba...@oracle.com.
Erick,
As far as I understand, MultiPhraseQuery was created to handle synonyms.
Am I right?
I want to use either a BooleanQuery or a MultiPhraseQuery (I can't decide
between the two) against an index that already includes synonyms.
One disadvantage of MultiPhraseQuery is that it needs to match all the
terms.
Should I then go for a BooleanQuery with multiple PhraseQueries? But
PhraseQuery cannot handle synonyms.
I know TermQuery is for exact matches, so I can't use that either in this case.
I have multiple tokens and I want to be able to do a cheap fuzzy search.
Best regards
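To make the trade-off above concrete, here is a rough plain-Java sketch of one possible reading of "cheap fuzzy search over multiple tokens": each query token is treated as an optional clause that matches if some document token lies within a small edit distance, and the document is judged by how many clauses matched rather than requiring all of them. This is an assumption about the intent, not Lucene code and not how Lucene implements FuzzyQuery (which is likewise edit-distance based, but works differently internally); CheapFuzzyModel, editDistance, and matchedTokens are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: optional per-token clauses with edit-distance matching.
class CheapFuzzyModel {

    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Counts how many query tokens have some document token within maxEdits. */
    static int matchedTokens(List<String> query, List<String> doc, int maxEdits) {
        int matched = 0;
        for (String q : query) {
            for (String d : doc) {
                if (editDistance(q, d) <= maxEdits) {
                    matched++;
                    break; // one close document token is enough for this clause
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("microsft", "word"); // note the typo
        List<String> doc = Arrays.asList("microsoft", "word", "tutorial");
        System.out.println(matchedTokens(query, doc, 1)); // 2
    }
}
```

Unlike a phrase match, this model ignores token order and adjacency, and a document can still score when only some tokens match; whether that is acceptable depends on the application.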
On 9/18/18 4:58 PM, Erick Erickson wrote:
> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> This is where someone coming into the examples for the first time is
> invaluable, javadoc patches are most welcome! It can be hard to back
> off enough to remember what the confusing bits are when you wrote the
> code ;)
Re: MultiPhraseQuery
Posted by Erick Erickson <er...@gmail.com>.
bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.
This is where someone coming to the examples for the first time is
invaluable; javadoc patches are most welcome! It can be hard to step
back far enough to remember what the confusing bits are when you wrote
the code ;)
On Tue, Sep 18, 2018 at 1:56 PM <ba...@oracle.com> wrote:
>
> Any suggestions please?
> Two main questions:
> - how do synonyms get utilized by MultiPhraseQuery?
> - how do we get second token "app" applied to the example on
> MultiPhraseQuery javadocs page? (and how do we get Terms[] array from
> Terms object?)
>
> Now three questions :)
>
> i wish the Javadocs has examples like PhraseQuery Javadocs gave.
>
> Best
Re: MultiPhraseQuery
Posted by ba...@oracle.com.
Any suggestions, please?
Two main questions:
- How do synonyms get utilized by MultiPhraseQuery?
- How do we get the second token "app" applied in the example on the
MultiPhraseQuery javadocs page? (And how do we get a Term[] array from a
Terms object?)
Now three questions :)
I wish the Javadocs had examples like the ones in the PhraseQuery Javadocs.
Best
On 9/18/18 4:45 PM, baris.kazar@oracle.com wrote:
> Trying to implement the example on
> https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html
>
> // A generalized version of PhraseQuery, with the possibility of
> adding more than one term at the same position that are treated as a
> disjunction (OR). To use this class to search for the phrase
> "Microsoft app*" first create a Builder and use
>
> // MultiPhraseQuery.Builder.add(Term) on the term "microsoft"
> (assuming lowercase analysis), then find all terms that have "app" as
> prefix using LeafReader.terms(String), seeking to "app" then iterating
> and collecting terms until there is no longer that prefix,
>
> // and finally use MultiPhraseQuery.Builder.add(Term[]) to add them.
> MultiPhraseQuery.Builder.build() returns the fully constructed (and
> immutable) MultiPhraseQuery.
>
>
> IndexSearcher is = new IndexSearcher(indexReader);
>
> MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
> builder.add(new Term("body", "one"), 0);
>
> Terms terms = LeafReader.terms("body"); // will this be slow? and how
> do we incorporate token/word "app" here?
>
> // i STILL dont see how to get individual Term objects from terms
> object and plus do i need to declare LeafReader object?
>
> Term[] termArr = new Term[k]; // i will get this filled via using
> Terms.iterator
> builder.add(termArr);
> MultiPhraseQuery mpq = builder.build();
> TopDocs hits = is.search(mpq, 20);// 20 hits
>
>
> Best regards
Re: MultiPhraseQuery
Posted by ba...@oracle.com.
Trying to implement the example on
https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html
// A generalized version of PhraseQuery, with the possibility of adding
// more than one term at the same position that are treated as a
// disjunction (OR). To use this class to search for the phrase "Microsoft
// app*" first create a Builder and use
// MultiPhraseQuery.Builder.add(Term) on the term "microsoft" (assuming
// lowercase analysis), then find all terms that have "app" as prefix using
// LeafReader.terms(String), seeking to "app" then iterating and collecting
// terms until there is no longer that prefix,
// and finally use MultiPhraseQuery.Builder.add(Term[]) to add them.
// MultiPhraseQuery.Builder.build() returns the fully constructed (and
// immutable) MultiPhraseQuery.
IndexSearcher is = new IndexSearcher(indexReader);
MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
builder.add(new Term[] { new Term("body", "one") }, 0);
// Will this be slow? And how do we incorporate the token/word "app" here?
Terms terms = LeafReader.terms("body");
// I STILL don't see how to get individual Term objects from the terms object.
// Also, do I need to declare a LeafReader object?
Term[] termArr = new Term[k]; // I will fill this using Terms.iterator
builder.add(termArr);
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // retrieve up to the top 20 hits
Best regards
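The javadoc text above boils down to a seek-then-scan over a sorted term dictionary. The following is a plain-Java sketch of that idea, using a TreeSet<String> as a stand-in for the field's terms; PrefixExpansion and termsWithPrefix are hypothetical names, and no Lucene classes are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for enumerating a field's sorted terms by prefix.
class PrefixExpansion {

    /** Collects every term in the sorted dictionary that starts with prefix. */
    static List<String> termsWithPrefix(TreeSet<String> sortedTerms, String prefix) {
        List<String> result = new ArrayList<>();
        // tailSet(prefix, true) plays the role of "seeking to" the prefix:
        // it starts at the first term that is >= prefix in sorted order.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // terms sharing a prefix are contiguous, so we can stop
            }
            result.add(term);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "apple", "app", "application", "apt", "microsoft", "word"));
        System.out.println(termsWithPrefix(terms, "app")); // [app, apple, application]
    }
}
```

With Lucene, each collected string would become new Term("body", text) in the Term[] handed to MultiPhraseQuery.Builder.add(Term[]); the enumeration itself would go through the iterator obtained from the Terms object, per the javadoc.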
On 9/18/18 4:16 PM, baris.kazar@oracle.com wrote:
> Hi,-
>
> how does MultiPhraseQuery treat synonyms?
>
> is the following possible?
>
> ... (created index with synonyms and indexReader object has the index)
>
> IndexSearcher is = new IndexSearcher(indexReader);
>
> MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
> builder.add(new Term("body", "one"), 0);
> builder.add(new Term("body", "two"), 1);
> MultiPhraseQuery mpq = builder.build();
> TopDocs hits = is.search(mpq, 20);// 20 hits
>
> Best regards