Posted to solr-user@lucene.apache.org by Sven Schönfeldt <sc...@subshell.com> on 2014/07/24 10:07:50 UTC

Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Hi Solr-Users,

What is the best way to find documents when the user misspells a word in the query?

For example, the user searches for „telaviv“. The search result should also include documents whose content is „tel aviv“.

Any tips or keywords on how to do that kind of query?

regards, Sven

Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Jack Krupansky <ja...@basetechnology.com>.
And I should have added that the advantage of the word break approach is 
that it automatically handles both splitting and combining words, all based 
on the index, with no need to mess with creating synonyms.
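
To make the "both splitting and combining, all based on the index" idea concrete, here is a minimal Python sketch. It is not Solr's word break spell checker, only an illustration of the principle; the function name and the small vocabulary are hypothetical.

```python
def word_break_suggestions(query_terms, indexed_terms):
    """Suggest split and combined forms of the query terms that exist in the index."""
    suggestions = []
    # Splitting: try every split point of each term into two indexed terms.
    for term in query_terms:
        for i in range(1, len(term)):
            left, right = term[:i], term[i:]
            if left in indexed_terms and right in indexed_terms:
                suggestions.append(f"{left} {right}")
    # Combining: try joining each adjacent pair of terms into one indexed term.
    for first, second in zip(query_terms, query_terms[1:]):
        if first + second in indexed_terms:
            suggestions.append(first + second)
    return suggestions


# Hypothetical vocabulary standing in for the terms found in the index.
indexed_terms = {"tel", "aviv", "telaviv"}
print(word_break_suggestions(["telaviv"], indexed_terms))      # ['tel aviv']
print(word_break_suggestions(["tel", "aviv"], indexed_terms))  # ['telaviv']
```

Because the candidates are checked against terms that actually occur in the index, no synonym list has to be maintained by hand.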

Also, there is a dictionary-based filter called 
DictionaryCompoundWordTokenFilterFactory which can split combined terms, but 
you do have to explicitly list at least some of the word parts in a 
dictionary file. Again, there are examples in my e-book.
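
As a rough illustration of what dictionary-based splitting does, the following Python sketch keeps the original token and adds every dictionary word found inside it. It is a simplified stand-in, not the Lucene filter itself; the function name, the size limits and the dictionary contents are assumptions made for the example.

```python
def decompound(token, dictionary, min_subword=2, max_subword=15):
    """Return the original token followed by every dictionary word found inside it."""
    tokens = [token]
    lowered = token.lower()
    for start in range(len(lowered)):
        for length in range(min_subword, max_subword + 1):
            if start + length > len(lowered):
                break  # the candidate would run past the end of the token
            part = lowered[start:start + length]
            if part in dictionary:
                tokens.append(part)
    return tokens


# Hypothetical dictionary file contents: the word parts listed explicitly.
dictionary = {"tel", "aviv"}
print(decompound("telaviv", dictionary))  # ['telaviv', 'tel', 'aviv']
```

A token whose parts are missing from the dictionary comes back unchanged, which is why at least some of the word parts have to be listed.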

It would be nice to have a dynamic, index-based filter at query time to 
automatically (but optionally) do the expansion/compression.

-- Jack Krupansky



Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Sven Schönfeldt <sc...@subshell.com>.
Thanks!

That's my core problem: getting Solr to search a bit like the GSA (Google Search Appliance) :-)


Greetz



Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Jack Krupansky <ja...@basetechnology.com>.
Google handles this type of word concatenation quite well... but Solr does 
not do so automatically out of the box. Solr does have a word break spell 
checker:

https://cwiki.apache.org/confluence/display/solr/Spell+Checking

It is described in more detail, with examples, in my e-book:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html

You could at least use this feature to implement a "did you mean..." UI for 
your search app - show the user actual results but also a proposed query 
with the words broken apart.
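
A "did you mean..." request could be built roughly as in the Python sketch below. The spellcheck, spellcheck.dictionary and spellcheck.collate parameters are standard Solr spell check parameters, but the host, core name, handler path and the dictionary name "wordbreak" are assumptions that depend on your own solrconfig.xml.

```python
from urllib.parse import urlencode


def did_you_mean_url(base_url, user_query, dictionary="wordbreak"):
    """Build a search URL that also asks Solr for a collated spelling suggestion."""
    params = {
        "q": user_query,
        "spellcheck": "true",
        # Assumed name of a word break spell checker defined in solrconfig.xml.
        "spellcheck.dictionary": dictionary,
        # Ask for a complete rewritten query, not only per-term suggestions.
        "spellcheck.collate": "true",
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host, core and request handler.
print(did_you_mean_url("http://localhost:8983/solr/collection1/spell", "telaviv"))
```

The application would show the normal results for the query as typed and offer the collated query from the response as a link.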

-- Jack Krupansky



Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Sven Schönfeldt <sc...@subshell.com>.
Thank you, Alex!



Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
You can use SynonymFilterFactory at query time as well, but it is
less reliable. In particular, if the text is "tel aviv" and the query is
telaviv, you need to make sure auto phrase search is enabled as well.
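
To show why the phrase matters, here is a small Python sketch that expands the query on the application side before it is sent to Solr. It is not SynonymFilterFactory, only an illustration of the intended result; the mapping and the function name are hypothetical.

```python
# Hypothetical synonym mapping: single query term -> alternative spellings.
SYNONYMS = {"telaviv": ["tel aviv"]}


def expand_query(user_query, synonyms=SYNONYMS):
    """Rewrite each term as (term OR synonym), quoting multi-word synonyms."""
    clauses = []
    for term in user_query.lower().split():
        alternatives = [term]
        for synonym in synonyms.get(term, []):
            # A multi-word synonym must be quoted so it is searched as a phrase.
            alternatives.append(f'"{synonym}"' if " " in synonym else synonym)
        if len(alternatives) > 1:
            clauses.append("(" + " OR ".join(alternatives) + ")")
        else:
            clauses.append(term)
    return " ".join(clauses)


print(expand_query("telaviv hotels"))  # (telaviv OR "tel aviv") hotels
```

Without the quotes, "tel" and "aviv" would be searched as two independent terms and could match documents that merely contain both words somewhere.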

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853



Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Sven Schönfeldt <sc...@subshell.com>.
So I will need SynonymFilterFactory at index time, right? Is there any chance to get it to work at query time?




Re: Need a tipp, how to find documents where content is "tel aviv" but user query is "telaviv"?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
How often does this happen? You could use synonyms if there are not too many such cases.