Posted to solr-user@lucene.apache.org by MaryJo Sminkey <mj...@gmail.com> on 2016/06/07 18:24:19 UTC

Solutions for Multi-word Synonyms

> MaryJo you might want to start a new thread, I think we kinda hijacked this
> one. Also if you are interested in tuning queries check out
> http://splainer.io/ and https://www.quepid.com which are interactive tools
> (both of which my company makes) to tune for search relevancy.
>


Okay, I changed the subject. But I don't need a tuning tool; I already know
WHY I'm not getting the results I need. The problem is how to fix it or get
around what the plugin is doing, which is why I was asking whether people
have had success with something other than this particular plugin for the
more advanced queries that it messes around with. It seems to do a good job
if you aren't doing anything particularly complicated with your search
logic, but I don't see a good way to solve the issue I'm having, and a
tuning tool isn't really going to help with that. We were pretty happy with
our search relevancy for the most part *other* than the problem with
multi-term synonyms not working reliably, but I definitely can't lose the
relevancy that we had just to get those working.

When I reviewed your tools previously, the problem, as I recall, was that
they rely on querying Solr directly, while our searches go through multiple
levels of an application that adds a lot of logic determining what actually
gets sent to Solr, so they just aren't going to be much use for us. It was
easier for me to write my own tool that essentially does the same kind of
thing, but with my application logic built in.

Mary Jo

Re: Solutions for Multi-word Synonyms

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Mary Jo,

Honestly, half the time I run into this problem, I end up creating a
QParserPlugin because I need to do something specific. With a QParserPlugin
I can run whatever analysis I want, slicing and dicing the query string to
manually construct whatever query I need:

http://www.supermind.org/blog/1134/custom-solr-queryparsers-for-fun-and-profit

One thing I often do is repeat the functionality of Elasticsearch's match
query. Elasticsearch's match query does the following:

- Analyze the query string using the field's query-time analyzer
- Create an OR query with the tokens that come out of the analysis

You can look at the field query parser as something of a starting point for
this.

I usually do this in the context of a boost query, not as the main edismax
query.
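
The match-query idea above can be sketched roughly as follows. This is a toy
in Python, not a real QParserPlugin (which would be Java running inside
Solr). The `analyze` function is a hypothetical stand-in for the field's
query-time analyzer, the field names `title` and `description` and the boost
value are invented for illustration, and tokens are not escaped for Lucene
query syntax. `defType`, `q`, `qf`, and `bq` are the standard edismax request
parameters.

```python
from urllib.parse import urlencode

# Hypothetical stand-in for a field's query-time analyzer:
# lowercase, split on whitespace, drop a few stopwords.
STOPWORDS = {"a", "an", "the", "of"}


def analyze(text):
    return [t for t in text.lower().split() if t not in STOPWORDS]


def match_clause(field, text):
    """OR together the analyzed tokens, scoped to one field."""
    tokens = analyze(text)
    if not tokens:
        return ""
    return "{}:({})".format(field, " OR ".join(tokens))


def edismax_params(user_query, boost_field, boost=2.0):
    """Leave the main edismax query alone; add the OR clause as a boost query."""
    params = {
        "defType": "edismax",
        "q": user_query,
        "qf": "title description",  # assumed field list
    }
    clause = match_clause(boost_field, user_query)
    if clause:
        params["bq"] = "{}^{}".format(clause, boost)
    return params


if __name__ == "__main__":
    # Prints the URL-encoded parameters that would be sent to Solr.
    print(urlencode(edismax_params("The Red Shoes", "title")))
```

Because the clause only goes into `bq`, it can raise the score of documents
that already match, but it does not change which documents the main query
matches.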

If I have time, this is something I've been meaning to open source.

Best
-Doug

On Tue, Jun 7, 2016 at 2:51 PM Joe Lawson <jl...@opensourceconnections.com>
wrote:

> I'm sorry I wasn't more specific, I meant we were hijacking the thread with
> the question, "Anyone used a different method of
> handling multi-term synonyms that isn't as global?" as the original thread
> was about getting synonym_edismax running.

Re: Solutions for Multi-word Synonyms

Posted by Joe Lawson <jl...@opensourceconnections.com>.
I rounded up some of the discussion here:
http://opensourceconnections.com/blog/2016/06/23/solr-multi-word-synonym-solutions-2016/

Also my colleague pointed me to another project, Querqy,
https://github.com/renekrie/querqy which "is a framework for query
preprocessing in Java-based search engines. It comes with a powerful,
rule-based preprocessor named 'Common Rules Preprocessor', which provides
query-time synonyms, query-dependent boosting and down-ranking, and
query-dependent filters. While the Common Rules
Preprocessor is not specific to any search engine, Querqy provides a plugin
to run it within the Solr search engine."
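
To make the idea of rule-based, query-time synonyms concrete, here is a
minimal sketch in Python of rewriting a query before it is sent to the search
engine. It is not Querqy's API or its rule syntax. The `SYNONYMS` table and
the `rewrite` function are hypothetical, and the sketch assumes plain
whitespace tokenization. Multi-word rules are matched longest-first so that a
phrase such as "usb c" is treated as one unit rather than two separate terms.

```python
# Hypothetical rule table: each key is a tuple of query tokens,
# each value lists phrases to treat as equivalent.
SYNONYMS = {
    ("laptop",): ["notebook"],
    ("usb", "c"): ["type c", "usbc"],
}
MAX_LEN = max(len(key) for key in SYNONYMS)


def quote(phrase):
    """Wrap multi-word alternatives in quotes so they stay phrases."""
    return '"{}"'.format(phrase) if " " in phrase else phrase


def rewrite(query):
    tokens = query.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest possible rule first at this position.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in SYNONYMS:
                alternatives = [" ".join(key)] + SYNONYMS[key]
                out.append("(" + " OR ".join(quote(a) for a in alternatives) + ")")
                i += n
                break
        else:
            # No rule matched here; keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(rewrite("USB C laptop charger"))
```

Since the rewrite happens per request, outside the index-time and query-time
analysis chains, the rules can be applied to only some queries or fields
instead of globally.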

On Fri, Jun 10, 2016 at 2:25 AM, Bernd Fehling <
bernd.fehling@uni-bielefeld.de> wrote:

> As Doug said,
> you should really try to build your own solution for Multi-word Synonyms
> because every need is different and you can customize it for your special
> use case, like adding a Thesaurus.
>
>
> http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html
>
> Regards
> Bernd
>

Re: Solutions for Multi-word Synonyms

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
As Doug said,
you should really try to build your own solution for Multi-word Synonyms
because every need is different and you can customize it for your special
use case, like adding a Thesaurus.

http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html

Regards
Bernd

On 09.06.2016 at 17:06, Doug Turnbull wrote:
> Mary Jo,
> 
> Honestly half the time I run into this problem, I end up creating a
> QParserPlugin because I need to do something specific. With a QParserPlugin
> I can run whatever analysis, slicing and dicing of the query string to
> manually construct whatever I need to
> 
> http://www.supermind.org/blog/1134/custom-solr-queryparsers-for-fun-and-profit
> 
> One thing I often do is repeat the functionality of Elasticsearch's match
> query. Elasticsearch's match query does the following:
> 
> - Analyze the query string using the field's query-time analyzer
> - Create an OR query with the tokens that come out of the analysis
> 
> You can look at the field query parser as something of a starting point for
> this.
> 
> I usually do this in the context of a boost query, not as the main edismax
> query.
> 
> If I have time, this is something I've been meaning to open source.
> 
> Best
> -Doug

-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************

Re: Solutions for Multi-word Synonyms

Posted by Joe Lawson <jl...@opensourceconnections.com>.
I'm sorry I wasn't more specific. I meant we were hijacking the thread with
the question, "Anyone used a different method of
handling multi-term synonyms that isn't as global?", as the original thread
was about getting synonym_edismax running.
