Posted to solr-user@lucene.apache.org by Mike Mander <wi...@gmx.de> on 2011/10/05 11:00:27 UTC

How do I get results for querying with separated words?

Hello,

I have configured a catch-all "searchword" field, into which I copy the 
value of the name field. Example name value: "Star Wars".
When I search for "starwars", the document is not found.
The reverse case fails too: name value = "SuperRTL", search term "super rtl".

Removing all whitespace (in both index and query) leads to unsatisfying 
results.

Could someone please give me a short description of how to solve this? 
Maybe there is already a blog post on it.

Thanks for your help
Mike

Re: How do I get results for querying with separated words?

Posted by Ahmet Arslan <io...@yahoo.com>.

Using ShingleFilterFactory at index time may help.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
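
To illustrate why shingles may help here, the following is a minimal Python
sketch of the idea, not Solr code. It assumes whitespace tokenization,
lowercasing, and an empty token separator so that adjacent words are joined.
The function and parameter names are hypothetical and only loosely mirror the
factory's maxShingleSize, outputUnigrams and tokenSeparator options, which
should be checked against the Solr version in use.

```python
def shingles(tokens, max_size=2, separator="", output_unigrams=True):
    """Emit word n-grams ("shingles") up to max_size, joined by separator."""
    out = []
    start = 1 if output_unigrams else 2
    for i in range(len(tokens)):
        for size in range(start, max_size + 1):
            if i + size <= len(tokens):
                out.append(separator.join(tokens[i:i + size]))
    return out


def index_terms(name):
    # Assumption: lowercase + whitespace tokenization before shingling.
    return shingles(name.lower().split())


print(index_terms("Star Wars"))                # ['star', 'starwars', 'wars']
print("starwars" in index_terms("Star Wars"))  # True
```

With the joined shingle in the index, the query "starwars" finds "Star Wars".
The reverse case ("super rtl" against "SuperRTL") is not covered by index-time
shingles alone, since the query side would also have to produce the joined
token.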


Re: How do I get results for querying with separated words?

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Have you tried correcting the spaces with a spelling dictionary?
If you build your dictionary from non-tokenized terms, you'll get the
corrections starwars -> Star Wars and super rtl -> superrtl.

WDYT?

-- 
Sincerely yours
Mikhail (Mike) Khludnev
Developer
Grid Dynamics
tel. 1-415-738-8644
Skype: mkhludnev
<http://www.griddynamics.com>
 <mk...@griddynamics.com>
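
The spelling-dictionary suggestion above can be sketched in Python as follows.
This is only an approximation under stated assumptions: the dictionary is a
hypothetical list of whole, untokenized name values, and
difflib.get_close_matches from the Python standard library stands in for
Solr's spellchecker, which works differently. The 0.8 cutoff is an arbitrary
choice.

```python
import difflib


def build_dictionary(names):
    # Key: the full, lowercased field value (not split into words).
    # Value: the original spelling, so it can be returned as the correction.
    return {name.lower(): name for name in names}


def correct(query, dictionary, cutoff=0.8):
    """Return the closest whole-name entry, or None if nothing is close."""
    candidates = difflib.get_close_matches(
        query.lower(), list(dictionary), n=1, cutoff=cutoff
    )
    return dictionary[candidates[0]] if candidates else None


# Hypothetical dictionary built from the two names in the question.
dictionary = build_dictionary(["Star Wars", "SuperRTL"])

print(correct("starwars", dictionary))   # Star Wars
print(correct("super rtl", dictionary))  # SuperRTL
print(correct("tetris", dictionary))     # None
```

A missing or extra space is a single-character difference, so the full name
is the closest entry in both directions. This only works because the
dictionary entries keep their spaces; a dictionary built from tokenized terms
would contain "star" and "wars" separately.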

Re: How do I get results for querying with separated words?

Posted by elisabeth benoit <el...@gmail.com>.
I think you could define "star wars" and "starwars" as synonyms in
synonyms.txt...

Maybe not generic enough?
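
As a sketch of the synonym approach, the Python below parses comma-separated
lines in the style of synonyms.txt and expands a phrase to all of its listed
variants. The entries are hypothetical, only the simple comma-separated form
is handled (not explicit "=>" mappings), and multi-word synonyms may need
extra care in a real analyzer chain.

```python
def parse_synonyms(lines):
    """Map every variant to the full set of variants on its line."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        variants = [v.strip().lower() for v in line.split(",") if v.strip()]
        for variant in variants:
            groups.setdefault(variant, set()).update(variants)
    return groups


# Hypothetical synonyms.txt content for the two examples in this thread.
SYNONYMS = parse_synonyms([
    "# joined and separated spellings",
    "star wars, starwars",
    "super rtl, superrtl",
])


def expand(phrase):
    key = phrase.lower()
    return sorted(SYNONYMS.get(key, {key}))


print(expand("starwars"))  # ['star wars', 'starwars']
print(expand("SuperRTL"))  # ['super rtl', 'superrtl']
```

Every pair has to be listed by hand, which is probably what "not generic
enough" refers to.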

Re: How do I get results for querying with separated words?

Posted by Mike Mander <wi...@gmx.de>.
Isn't this more a problem of the query string?

Let's assume I have a game name like "Nintendo 3DS - 'Star Wars - Clone 
Wars'".
Can I copy that name to a field, stripping the - and ' characters, 
lowercasing the result and removing the whitespace, so that I get 
"nintendo3dsstarwarsclonewars"?
Is that "findable" with my "starwars" query string?

Thanks for helping me
Mike
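
The question above can be tried out with a small Python sketch. It assumes
the normalization described in the message (lowercase, drop everything that
is not a letter or digit) and models a plain term query as an equality test
against the single squashed token; the function name is hypothetical.

```python
import re


def squash(name):
    # Lowercase, then remove every character that is not a letter or digit.
    return re.sub(r"[^a-z0-9]+", "", name.lower())


indexed = squash("Nintendo 3DS - 'Star Wars - Clone Wars'")
query = squash("starwars")

print(indexed)           # nintendo3dsstarwarsclonewars
print(query == indexed)  # False: the whole token differs from the query term
print(query in indexed)  # True: the query is only a substring of the token
```

So the squashed field alone would not make the title findable with a plain
"starwars" term query, because a term has to match the whole indexed token.
Some form of substring matching (for example a wildcard query or an n-gram
field) would be needed on top of it.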


Re: How do I get results for querying with separated words?

Posted by stockii <st...@googlemail.com>.
Index this field without whitespace? XD

-----
------------------------------- System ----------------------------------------

One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 
1 Core with 45 Million Documents other Cores < 200.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: http://lucene.472066.n3.nabble.com/How-do-i-get-results-for-quering-with-separated-words-tp3395966p3396207.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How do I get results for querying with separated words?

Posted by Mike Mander <wi...@gmx.de>.
Thanks stockii,

but WDFF (WordDelimiterFilterFactory) only splits on numeric or case changes.
With "Star Wars" in the index and "starwars" in the query, the two still 
don't match, do they?

Thanks
Mike
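
The point about WordDelimiterFilterFactory can be checked with a rough Python
emulation. It assumes splitting on lower-to-upper case changes, letter/digit
transitions and non-alphanumeric delimiters, followed by lowercasing. The real
filter has more options (catenation, preserving the original token) that are
ignored here, and the regular expression is only an approximation.

```python
import re

# Approximate split points of a word-delimiter style filter (an assumption,
# not the filter's actual implementation).
_BOUNDARY = re.compile(
    r"(?<=[a-z])(?=[A-Z])"      # lower -> upper case change
    r"|(?<=[A-Za-z])(?=[0-9])"  # letter -> digit
    r"|(?<=[0-9])(?=[A-Za-z])"  # digit -> letter
    r"|[^A-Za-z0-9]+"           # intra-word delimiters such as '-' or '_'
)


def analyze(text):
    """Whitespace-tokenize, split each token at boundaries, lowercase."""
    terms = []
    for token in text.split():
        for part in _BOUNDARY.split(token):
            if part:
                terms.append(part.lower())
    return terms


def matches(indexed_value, query):
    # A document matches if every query term is among its indexed terms.
    indexed_terms = analyze(indexed_value)
    return all(term in indexed_terms for term in analyze(query))


print(analyze("SuperRTL"))               # ['super', 'rtl']
print(matches("SuperRTL", "super rtl"))  # True
print(analyze("starwars"))               # ['starwars']
print(matches("Star Wars", "starwars"))  # False
```

Under these assumptions "SuperRTL" and "super rtl" share the terms super and
rtl, while "Star Wars" and "starwars" share none, because "starwars" contains
no boundary to split on.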

Re: How do I get results for querying with separated words?

Posted by stockii <st...@googlemail.com>.
Which field type do you use in the schema.xml?

Try out WordDelimiterFilterFactory or some of the other filters from this page:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

--
View this message in context: http://lucene.472066.n3.nabble.com/How-do-i-get-results-for-quering-with-separated-words-tp3395966p3395982.html
Sent from the Solr - User mailing list archive at Nabble.com.