Posted to solr-user@lucene.apache.org by ggggGuys <pa...@gmail.com> on 2012/04/23 10:34:03 UTC

Synonyms file in solr

I have some problems with the synonyms file; I can't seem to make it work
the way I want.

Here is an example:

I have these words: cat, animal, dog, living thing, baby shark

If I search for animal OR animals, I'd like to get the results for cat,
animal, dog and baby shark, as well as their plurals cats, dogs, animals and
baby sharks.

If I search for cat, I only want the results with cat or cats. Same for dog.

If I search for living thing, I want the results with living thing, living
things, animal or animals. So no dogs, cats...

So the words are in a hierarchy: living thing(s) -> animal(s) -> [dog(s),
cat(s), baby shark(s)]

I've tried a lot of things but I can't get the results I want, and I really
need your help :-(


--
View this message in context: http://lucene.472066.n3.nabble.com/Synonyms-file-in-solr-tp3931838p3931838.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Synonyms file in solr

Posted by Lee Carroll <le...@googlemail.com>.
Your examples are not synonyms, so I don't think synonyms.txt by itself
is going to work.
This sounds like tagging using a taxonomy. Values written to the field
storing this taxonomy could look like:

livingthing/animal/cat [doc about cats]
livingthing/animal/dog [doc about dogs]
livingthing/animal [doc about animals in general]
livingthing/animal/cat  livingthing/animal/dog [doc about cats and dogs]
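
A minimal sketch of how such path values could be built and matched outside
Solr, assuming the hierarchy from the original question. The names PARENT,
taxonomy_path and matches are hypothetical, and the rule "a term matches
itself and its direct children only" is one reading of the requirement that
living thing should return animals but not cats or dogs.

```python
# Hypothetical child -> parent links, taken from the hierarchy in the question.
PARENT = {
    "animal": "livingthing",
    "cat": "animal",
    "dog": "animal",
    "babyshark": "animal",
}


def taxonomy_path(term):
    """Return the slash-separated path from the root down to `term`."""
    parts = [term]
    while parts[-1] in PARENT:
        parts.append(PARENT[parts[-1]])
    return "/".join(reversed(parts))


def matches(query_term, doc_paths):
    """True if any document path is the query's path or a direct child of it."""
    query_path = taxonomy_path(query_term)
    for path in doc_paths:
        parent_path = path.rsplit("/", 1)[0]
        if path == query_path or parent_path == query_path:
            return True
    return False


if __name__ == "__main__":
    cats_and_dogs = ["livingthing/animal/cat", "livingthing/animal/dog"]
    print(taxonomy_path("cat"))
    print(matches("animal", cats_and_dogs))
    print(matches("livingthing", cats_and_dogs))
```

With this rule, a search for animal reaches the cat and dog documents, while
a search for living thing stops at documents tagged livingthing or
livingthing/animal.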

If you need a free-text search solution rather than a metadata field
search as above, you will need to pre-process your docs, looking for
entities in your taxonomy and replacing the entity tokens with the above
taxonomic tokens, perhaps placing these into a specialist field for
searching. A Solr analysis chain which mimics such pre-processing may
get you some mileage, something like:

copyField content -> taxoKeywords

taxoKeywords field analysis:

tokenise
lowercase
minimal stem (I'm sure there is a minimal English stemmer; I think it's called ...)
keepwords [cat dog animal livingthing]
synonym replacement [cat -> livingthing/animal/cat,
dog -> livingthing/animal/dog, etc]
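
The steps above can be sketched in plain Python to show what ends up in the
taxoKeywords field. This is only an illustration of the idea, not Solr
configuration: KEEP_WORDS, TAXONOMY, minimal_stem and taxo_keywords are
hypothetical names, the stemmer is a crude plural stripper standing in for a
real minimal English stemmer, and multi-word entities such as "living thing"
or "baby shark" are not handled.

```python
import re

# Entities we care about; everything else is dropped (the "keepwords" step).
KEEP_WORDS = {"cat", "dog", "animal", "livingthing"}

# Each kept word is replaced by its taxonomic token (the "synonym" step).
TAXONOMY = {
    "cat": "livingthing/animal/cat",
    "dog": "livingthing/animal/dog",
    "animal": "livingthing/animal",
    "livingthing": "livingthing",
}


def minimal_stem(token):
    """Strip a trailing plural 's'; a rough stand-in for a minimal stemmer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def taxo_keywords(text):
    """Run text through the chain and return its taxonomic tokens in order."""
    tokens = re.findall(r"[A-Za-z]+", text)           # tokenise
    tokens = [t.lower() for t in tokens]              # lowercase
    tokens = [minimal_stem(t) for t in tokens]        # minimal stem
    tokens = [t for t in tokens if t in KEEP_WORDS]   # keepwords
    return [TAXONOMY[t] for t in tokens]              # synonym replacement


if __name__ == "__main__":
    print(taxo_keywords("Cats and dogs are animals."))
```

Because plurals are stemmed before the keepwords step, cats and cat end up
as the same taxonomic token, which covers the singular/plural part of the
original question.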

I'd go for preprocessing outside of Solr, but the keepwords / synonyms
approach might work for you.

cheers lee c






