Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2011/08/04 17:42:46 UTC

How can I create a good autosuggest list with phrases?

I'm at the point in my Solr deployment where I want to start using it 
for autosuggest, but I've run into a snag.  Because the fields that I 
want to use for autosuggest are tokenized, I can only get single terms 
out of them.  I would like to have it find common phrases that are between 
two and five words long, so that if someone starts typing "ang" their 
autosuggest list will include "Angelina Jolie" as well as possibly "Brad 
Pitt and Angelina Jolie."

My index is already quite large, so I do not want to add shingles.  I 
tried to use the clustering component, but that will only give you 
halfway decent results if you make the "rows=" parameter absolutely huge 
and therefore things run very slowly.  Also, it only works against 
stored fields, so I can only run it against the field where we retrieve 
captions, not the full description.  It's impractical to get results 
based on an entire index, much less all seven shards.

I'm OK with offline analysis to generate a list of suggestions, and I'm 
also OK with doing that analysis against the MySQL data source rather 
than Solr.  I just need some pointers about what software and/or 
techniques I can use to generate a good list, and then some idea of how 
to configure Solr to use that list.  Can anyone help?
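
One possible shape for the offline analysis described above, as a minimal sketch rather than a tested recipe: count every run of two to five words in the source text and keep the runs that repeat. The names extract_phrases and suggest and the sample captions are hypothetical, and the sketch assumes the text has already been exported from MySQL as plain strings.

```python
import re
from collections import Counter

def extract_phrases(texts, min_words=2, max_words=5, min_count=2):
    """Count every 2-5 word run and keep those seen at least min_count times."""
    counts = Counter()
    for text in texts:
        # Split on punctuation so a phrase never spans a sentence boundary.
        for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
            words = re.findall(r"[a-z0-9']+", segment)
            for n in range(min_words, max_words + 1):
                for i in range(len(words) - n + 1):
                    counts[" ".join(words[i:i + n])] += 1
    return {p: c for p, c in counts.items() if c >= min_count}

def suggest(phrases, typed, limit=10):
    """Return phrases containing a word that starts with the typed text,
    most frequent first."""
    typed = typed.lower()
    hits = [(c, p) for p, c in phrases.items()
            if any(w.startswith(typed) for w in p.split())]
    hits.sort(key=lambda cp: (-cp[0], cp[1]))
    return [p for _, p in hits[:limit]]

# Hypothetical sample data standing in for caption text.
captions = [
    "Angelina Jolie arrives at the premiere",
    "Brad Pitt and Angelina Jolie at the premiere",
    "Brad Pitt and Angelina Jolie wave to fans",
]
phrases = extract_phrases(captions)
print(suggest(phrases, "ang"))
```

Raw counts like these also keep fragments such as "pitt and angelina", so some filtering (for example, dropping phrases that start or end with a stop word) would probably be needed before the list is usable. Holding every n-gram in memory is also unlikely to scale to a large data set without batching or a higher min_count.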

Thanks,
Shawn


Re: How can I create a good autosuggest list with phrases?

Posted by Shawn Heisey <so...@elyograg.org>.
On 8/4/2011 10:04 AM, Sethi, Parampreet wrote:
> We handled a similar requirement in our product kitchendaily.com by creating
> a list of search terms that were frequently searched over a period of time
> and then building an auto-suggestion index from this data. Updating this list
> regularly lets you support a well-formed auto-suggest feature. This is a
> good, fast solution if you have application logs to start with and the
> volume of data is not very high.

I do have some separate plans to include data from our query logs, but 
I'd also like to get data from the index itself, more than one term at a 
time.

Thanks,
Shawn


Re: How can I create a good autosuggest list with phrases?

Posted by "Sethi, Parampreet" <pa...@teamaol.com>.
We handled a similar requirement in our product kitchendaily.com by creating
a list of search terms that were frequently searched over a period of time
and then building an auto-suggestion index from this data. Updating this list
regularly lets you support a well-formed auto-suggest feature. This is a
good, fast solution if you have application logs to start with and the
volume of data is not very high.
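
As a rough illustration of the log-based approach (not the actual kitchendaily.com code), the sketch below normalizes logged queries, counts them, and keeps the ones searched often enough to be worth suggesting. The function name build_suggestions, the min_count threshold, and the sample log entries are all assumptions made for the example.

```python
from collections import Counter

def build_suggestions(logged_queries, min_count=2):
    """Normalize logged queries and keep those searched at least min_count times.

    Returns (query, count) pairs, most frequent first.
    """
    # Lowercase and collapse whitespace so trivially different spellings
    # of the same query are counted together; skip blank entries.
    counts = Counter(
        " ".join(q.lower().split()) for q in logged_queries if q.strip()
    )
    return sorted(
        ((q, c) for q, c in counts.items() if c >= min_count),
        key=lambda qc: (-qc[1], qc[0]),
    )

# Hypothetical log extract.
log = ["Chicken Soup", "chicken soup", "chicken  salad",
       "Chicken salad", "chicken soup", "tofu", "   "]
suggestions = build_suggestions(log)
print(suggestions)
```

Each (query, count) pair could then be indexed as its own small document, with the count kept as a popularity value for ranking, and the list rebuilt on a schedule as new log data arrives.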

Or you can search Solr with the text the user has entered, which returns all
the matching results. Boost the field that will be shown in the AutoSuggest
box, and display the top 5 items in the dynamic div.
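
A sketch of what such a request might look like, assuming the edismax query parser is available. The host, the field names title and caption, and the boost value are placeholders, not values from this thread; the function only builds the request URL and does not contact Solr.

```python
from urllib.parse import urlencode

def suggest_url(typed, base="http://localhost:8983/solr/select", rows=5):
    """Build a select URL that prefix-matches the typed text and boosts one field."""
    params = {
        # Trailing wildcard turns the typed text into a prefix query.
        "q": typed.strip().lower() + "*",
        "defType": "edismax",
        # Hypothetical fields: matches in "title" count five times as much.
        "qf": "title^5 caption",
        # Only return the field shown in the suggestion box.
        "fl": "title",
        "rows": rows,
        "wt": "json",
    }
    return base + "?" + urlencode(params)

url = suggest_url("Ang")
print(url)
```

Real user input would also need Solr query syntax characters escaped before the wildcard is appended, and a prefix query matches single indexed terms, so this returns documents to display rather than extracted phrases.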

Hope it Helps.

-param

