Posted to solr-user@lucene.apache.org by Roy Lim <ro...@gmail.com> on 2018/08/15 22:27:52 UTC

Multi-word Synonyms - how does sow parameter work?

I'm trying to figure out why the multi-word synonym expansion is not working
correctly.  Specifically, when I test a standard query with Solr Admin it is
still splitting on whitespace.

Here is my setup:
- Solr 7.2.1
- synonym LCD => liquid crystal display
- q=myfield:LCD
- added: sow=false
- myfield looks like:


Solr Admin shows the parsed query looks like:

myfield:liquid myfield:crystal myfield:display

(default operator being OR), which would incorrectly match documents
containing any of those words rather than only those containing all of
them, which is what I would expect...





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Multi-word Synonyms - how does sow parameter work?

Posted by Roy Lim <ro...@gmail.com>.
Thanks Andrea for the tip.  I wasn't aware of the autoGeneratePhraseQueries
option for text fields, will definitely keep it in mind.

But I wonder whether this is related to the query parser fix that
introduced the sow parameter: when it is false (which looks like the
default in Solr 7), multi-word input should be sent as a 'single input' (see
https://issues.apache.org/jira/browse/LUCENE-2605).  That issue makes no
mention of autoGeneratePhraseQueries.

I think this is where my confusion lies: as a non-developer, I'm
unfortunately not clear on what 'multiwords will be sent as a single input'
means.  Does it mean the input is treated as a phrase query?  That AND is
used?  So far, as mentioned, I only observe plain OR clauses, which is no
different from the behavior before the fix.

Thanks again!



On Thu, Aug 16, 2018 at 12:39 AM, Andrea Gazzarini <a....@sease.io>
wrote:

> Hi Roy, I think you're missing autoGeneratePhraseQueries=true in the field
> type definition.
> I had a slightly different use case when I ran into the same issue (I was
> using synonym expansion at query time), and honestly I didn't understand
> why this is not the default, implicit behavior. In other words, like
> you, I can't imagine a scenario where I would want a multi-term synonym
> to be split into multiple OR clauses.
>
> Best,
> Andrea

Re: Multi-word Synonyms - how does sow parameter work?

Posted by Andrea Gazzarini <a....@sease.io>.
Hi Roy, I think you're missing autoGeneratePhraseQueries=true in the field
type definition.
I had a slightly different use case when I ran into the same issue (I was
using synonym expansion at query time), and honestly I didn't understand
why this is not the default, implicit behavior. In other words, like
you, I can't imagine a scenario where I would want a multi-term synonym
to be split into multiple OR clauses.
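
For anyone following along, here is a minimal sketch of what that could look
like, based on the fieldType Roy posted. The closing tags are an assumption
(the posted schema was cut off at "..."), and whatever filters were elided
there are simply left out:

```xml
<!-- Sketch only: Roy's fieldType with autoGeneratePhraseQueries added.
     Filters hidden behind "..." in the original post are not reproduced. -->
<fieldType name="myfield" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Expands LCD to the three tokens: liquid, crystal, display -->
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true"
            synonyms="synonyms.txt"/>
  </analyzer>
</fieldType>
```

With this in place (and after reloading the core), the debug output for
q=myfield:LCD should show a phrase query on myfield rather than three
separate OR clauses. Treat that as something to verify in Solr Admin on your
own version, not as a guarantee.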

Best,
Andrea

On 16/08/18 02:07, Roy Lim wrote:
> I am not using edismax (eventually I would like to get there) but I'm just
> testing with standard query right now.  Original posting:
>
> I'm trying to figure out why the multi-word synonym expansion is not
> working correctly (or, at least what I'm misunderstanding).  Specifically,
> when I test a standard query with Solr Admin it appears to still split on
> whitespace.
>
> Here is my setup:
> - Solr 7.2.1
> - synonym example: LCD => liquid crystal display
> - q=myfield:LCD
> - added parameter: sow=false
> - myfield schema looks like (analyzer both applicable to index and query
> time):
> ----
> <fieldType name="myfield" class="solr.TextField" positionIncrementGap="100">
>    <analyzer>
>      <tokenizer class="solr.StandardTokenizerFactory" />
>      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true"
> synonyms="synonyms.txt"/>
>          ...
> ----
>
> When debugging the query, Solr Admin shows the parsed query as:
> ----
> myfield:liquid myfield:crystal myfield:display
> ----
>
> (default operator being OR), as you can see it would incorrectly match on
> any of those words, but not all, which is what I would expect...
>
> Should it not do a phrase query search for the exact translated synonym,
> "liquid crystal display"?


Re: Multi-word Synonyms - how does sow parameter work?

Posted by Roy Lim <ro...@gmail.com>.
I am not using edismax (eventually I would like to get there) but I'm just
testing with standard query right now.  Original posting:

I'm trying to figure out why the multi-word synonym expansion is not
working correctly (or, at least what I'm misunderstanding).  Specifically,
when I test a standard query with Solr Admin it appears to still split on
whitespace.

Here is my setup:
- Solr 7.2.1
- synonym example: LCD => liquid crystal display
- q=myfield:LCD
- added parameter: sow=false
- myfield schema looks like (the same analyzer applies at both index and
query time):
----
<fieldType name="myfield" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt"/>
        ...
----

When debugging the query, Solr Admin shows the parsed query as:
----
myfield:liquid myfield:crystal myfield:display
----

(default operator being OR).  As you can see, it would incorrectly match on
any of those words rather than requiring all of them, which is what I would
expect.

Should it not do a phrase query search for the exact translated synonym,
"liquid crystal display"?



On Wed, Aug 15, 2018 at 5:01 PM, Doug Turnbull <
dturnbull@opensourceconnections.com> wrote:

> Also share your fieldType settings for myfield as well from your schema

Re: Multi-word Synonyms - how does sow parameter work?

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Also, please share the fieldType settings for myfield from your schema.

On Wed, Aug 15, 2018 at 8:00 PM Doug Turnbull <
dturnbull@opensourceconnections.com> wrote:

> Aside from the screenshot issue, one thing to check: are you searching
> with defType=edismax?
>
> As in
> q=lcd&qf=myfield&sow=false&defType=edismax
>
> ?
>
> Also, sow=false should be the default on Solr 7 and above
>
> Doug
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug

Re: Multi-word Synonyms - how does sow parameter work?

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Aside from the screenshot issue, one thing to check: are you searching
with defType=edismax?

As in
q=lcd&qf=myfield&sow=false&defType=edismax

?

Also, sow=false should be the default on Solr 7 and above
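
If those parameters end up being wanted on every request, they could also be
set as handler defaults instead of being passed on the URL. A minimal sketch,
assuming the usual /select handler in solrconfig.xml (the handler name and
the choice to put these under defaults are assumptions for illustration,
not something confirmed for this setup):

```xml
<!-- Sketch only: the URL parameters above expressed as request handler defaults.
     Handler name "/select" is assumed; adjust to the handler actually in use. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">myfield</str>
    <str name="sow">false</str>
  </lst>
</requestHandler>
```

Anything set under defaults can still be overridden per request, so
passing sow=true on the URL would remain possible for comparison.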

Doug

On Wed, Aug 15, 2018 at 6:27 PM Roy Lim <ro...@gmail.com> wrote:

> I'm trying to figure out why the multi-word synonym expansion is not
> working
> correctly.  Specifically, when I test a standard query with Solr Admin it
> is
> still splitting on whitespace.
>
> Here is my setup:
> - Solr 7.2.1
> - synonym LCD => liquid crystal display
> - q=myfield:LCD
> - added: sow=false
> - myfield looks like:
>
>
> Solr Admin shows the parsed query looks like:
>
> myfield:liquid myfield:crystal myfield:display
>
> (default operator being OR), which would incorrectly match documents with
> any of those words, but not all, which is what I would expect...
>
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug

Re: Multi-word Synonyms - how does sow parameter work?

Posted by Steve Rowe <sa...@gmail.com>.
Yes please.  That way we’ll see the whole thing.

--
Steve
www.lucidworks.com

> On Aug 15, 2018, at 7:20 PM, Roy Lim <ro...@gmail.com> wrote:
> 
> I've subscribed, shall I re-post it then via email?
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Multi-word Synonyms - how does sow parameter work?

Posted by Roy Lim <ro...@gmail.com>.
I've subscribed, shall I re-post it then via email?




Re: Multi-word Synonyms - how does sow parameter work?

Posted by Steve Rowe <sa...@gmail.com>.
Roy,

Not sure of the point of Nabble when it strips content before passing messages on to the mailing list.  I’ve emailed them about this problem in the past but they have done nothing about it.

Updating a post on Nabble will never make it to the mailing list.  If you want us to be able to read your post in full, you should subscribe to the mailing list instead of using Nabble.  Instructions here: http://lucene.apache.org/solr/community.html#solr-user-list-solr-userluceneapacheorg

--
Steve
www.lucidworks.com

> On Aug 15, 2018, at 7:00 PM, Roy Lim <ro...@gmail.com> wrote:
> 
> Thanks, updated original post.  It just removed what I surrounded with the
> raw text markup, I've added it back without markup.  Not sure of the point
> of raw text if it's always removed 
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Multi-word Synonyms - how does sow parameter work?

Posted by Roy Lim <ro...@gmail.com>.
Thanks, I've updated the original post.  It had removed everything I
surrounded with the raw text markup, so I've added it back without markup.
Not sure of the point of raw text if it's always removed.




Re: Multi-word Synonyms - how does sow parameter work?

Posted by Erick Erickson <er...@gmail.com>.
The mail server strips pretty much all screenshots and attachments, so
I think some of the data you're trying to provide is missing from the
e-mail.

Best,
Erick

On Wed, Aug 15, 2018 at 3:27 PM, Roy Lim <ro...@gmail.com> wrote:
> I'm trying to figure out why the multi-word synonym expansion is not working
> correctly.  Specifically, when I test a standard query with Solr Admin it is
> still splitting on whitespace.
>
> Here is my setup:
> - Solr 7.2.1
> - synonym LCD => liquid crystal display
> - q=myfield:LCD
> - added: sow=false
> - myfield looks like:
>
>
> Solr Admin shows the parsed query looks like:
>
> myfield:liquid myfield:crystal myfield:display
>
> (default operator being OR), which would incorrectly match documents with
> any of those words, but not all, which is what I would expect...
>
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html