Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2011/10/05 22:06:11 UTC
Offering search suggestions - a discussion of multi-term phrases
I am trying to figure out how we can begin offering search suggestions
to people, especially when a user types in something that results in few
or zero results. For background, we have an archive of about 60 million
objects, most of which are photographs. There are also a number of text
articles, and most recently, videos. The metadata is kept in a
database, and the database is used as the import source for Solr.
The first thing we're going to try is spellcheck, using the terms
component to generate a wordlist from our catchall field and then doing
what we can with a program to remove undesirable words. I do not
anticipate running into much trouble with this part.
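As a rough illustration of the cleanup step, here is a minimal Python sketch. It assumes the terms component output has already been parsed into (term, count) pairs; the thresholds, the blocklist contents, and the function name are all hypothetical and not taken from any actual setup.

```python
import re

# Hypothetical blocklist; a real one would be curated by hand.
UNDESIRABLE = {"asdf", "xxx"}

# Only purely alphabetic terms are kept in this sketch.
WORD_RE = re.compile(r"^[a-z]+$")


def filter_wordlist(term_counts, min_count=5, min_len=3, blocklist=UNDESIRABLE):
    """Return a sorted, de-duplicated list of terms worth keeping.

    term_counts: iterable of (term, document_count) pairs.
    """
    kept = set()
    for term, count in term_counts:
        term = term.lower()
        if count < min_count:          # rare terms are often typos
            continue
        if len(term) < min_len:        # very short terms add little value
            continue
        if not WORD_RE.match(term):    # drop terms with digits or punctuation
            continue
        if term in blocklist:          # drop explicitly unwanted words
            continue
        kept.add(term)
    return sorted(kept)


sample = [("kidman", 120), ("kidmna", 1), ("a1b2", 50),
          ("to", 900), ("asdf", 10), ("Cruise", 80)]
words = filter_wordlist(sample)
```

Dropping low-count terms is only a heuristic: it removes most misspellings, but it would also remove legitimate rare names, so the threshold would need tuning against real data.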
Another idea we have is search suggestions. One aspect is autocomplete,
the other is similar to the spell-check, but more sophisticated. It
would do things like offer "Nicole Kidman" if the user typed in "Tom
Cruise" and didn't get many search results.
The problem I can see with all of these things is that single terms will
not really be enough, and single terms are all I can get out of the
index. Our distributed index is already quite a bit larger than the
available RAM on the machines that contain it, and it's growing
steadily. Adding analysis complexity or copyFields to the index is not
much of an option, because we have no budget available for new hardware,
but I won't completely rule it out.
Is there any way, even if it's offline analysis of either the index or
the database, to come up with common short phrases specific to our
data? If there is, perhaps I can then give it to Solr and let it make
suggestions with it.
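One possible offline approach, sketched here in Python, is to count word n-grams over metadata text exported from the database. This is only a sketch under assumptions: the tokenizer is a simple regular expression rather than the index's real analysis chain, and the function name, parameters, and sample captions are made up for illustration.

```python
import re
from collections import Counter

# Simplistic tokenizer; the real analysis chain would likely differ.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def common_phrases(texts, n=2, min_count=2):
    """Count n-word phrases across texts, keeping those seen min_count+ times.

    Returns a list of (phrase, count) pairs, most frequent first.
    """
    counts = Counter()
    for text in texts:
        tokens = TOKEN_RE.findall(text.lower())
        # Slide a window of n tokens across the text.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


captions = [
    "Nicole Kidman at premiere",
    "Tom Cruise and Nicole Kidman",
    "Tom Cruise arrives",
]
phrases = common_phrases(captions)
```

On 60 million records an in-memory Counter may not fit, so a real run would probably need to process the data in batches or prune low-count entries as it goes.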
Thanks,
Shawn
Re: Offering search suggestions - a discussion of multi-term phrases
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Shawn,
Have you looked at http://www.sematext.com/products/dym-researcher/index.html as a solution to the ZeroHits problem?
If that doesn't work, then yes, offline word/phrase co-occurrence may work.
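To make the co-occurrence idea concrete, here is a minimal Python sketch. It assumes each object has already been reduced to a list of phrases (for example, names extracted from its metadata); that extraction step, the function names, and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(records):
    """Count how often each pair of phrases appears on the same object.

    records: iterable of phrase lists, one list per object.
    """
    pair_counts = defaultdict(Counter)
    for phrases in records:
        # De-duplicate within a record so each pair counts once per object.
        for a, b in combinations(sorted(set(phrases)), 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    return pair_counts


def related(pair_counts, phrase, limit=3):
    """Return up to `limit` phrases that most often co-occur with `phrase`."""
    counter = pair_counts.get(phrase, Counter())
    return [p for p, _ in counter.most_common(limit)]


records = [
    ["tom cruise", "nicole kidman"],
    ["tom cruise", "nicole kidman", "red carpet"],
    ["tom cruise", "katie holmes"],
]
counts = cooccurrence(records)
suggestions = related(counts, "tom cruise", limit=1)
```

Raw counts favor phrases that are common everywhere, so a real system would likely normalize them in some way before ranking suggestions.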
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/