Posted to solr-user@lucene.apache.org by "Charton, Andre" <ac...@ebay-kleinanzeigen.de> on 2011/05/03 09:06:50 UTC

RE: Ebay Kleinanzeigen and Auto Suggest

Hi,

yes we do. 

If you use a limited number of categories (around 100), you can use dynamic fields with the TermsComponent and choose a category-specific prefix, like:

{schema.xml}
...
<dynamicField name="*_suggestion" type="textAS" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
...
{schema.xml}

And within the data import handler, a script derives the prefixed field name from the given category:

{data-config.xml}
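		function setCatPrefixFields(row) {
			var catId = row.get('category');
			var title = row.get('freetext');
			// Dynamic field name for this category: c<catId>_suggestion
			var cat_prefix = "c" + catId + "_suggestion";
			// Copy the title into the category-specific suggestion field
			row.put(cat_prefix, title);
			return row;
		}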
{data-config.xml}
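
To illustrate the effect of this transformer, here is a small standalone sketch that can be run outside Solr. It assumes the transformer copies the title into the computed field with `row.put`. `FakeRow` is a hypothetical stand-in for the map that the data import handler passes to script functions, and the category id 216 and the title are made-up sample values.

```javascript
// Hypothetical stand-in for the java.util.Map row that the data import
// handler hands to script functions; only get/put are modelled.
class FakeRow {
  constructor(values) { this.values = new Map(Object.entries(values)); }
  get(key) { return this.values.get(key); }
  put(key, value) { this.values.set(key, value); }
}

// Assumed complete form of the transformer: the title is copied into
// the dynamic field named after the row's category.
function setCatPrefixFields(row) {
  var catId = row.get('category');
  var title = row.get('freetext');
  var cat_prefix = "c" + catId + "_suggestion";
  row.put(cat_prefix, title);
  return row;
}

// Sample row with made-up values
var row = setCatPrefixFields(new FakeRow({ category: 216, freetext: "Audi A4 Avant" }));
console.log(row.get("c216_suggestion"));
```

Each category ends up with its own field matching the `*_suggestion` dynamic field pattern, which is why the approach is only practical for a limited number of categories.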

Then we address these fields from our application layer through a specific request handler, taking the category prefix into account.
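
As an illustration of that last step, the sketch below builds a TermsComponent request against the category-specific field. The `/terms` handler path and the `terms.fl`, `terms.prefix` and `terms.sort` parameters follow the query quoted further down in this thread. The function name `buildSuggestUrl`, the base URL, the limit of 10 and the category id are assumptions made for the example, not the actual request handler described here.

```javascript
// Build a terms request restricted to one category by pointing
// terms.fl at that category's dynamic field (c<catId>_suggestion).
function buildSuggestUrl(baseUrl, catId, userInput) {
  var params = new URLSearchParams();
  params.set("terms.fl", "c" + catId + "_suggestion");
  // Assumption: the field's analyzer lowercases indexed terms,
  // so the typed prefix is lowercased to match them.
  params.set("terms.prefix", userInput.toLowerCase());
  params.set("terms.sort", "count");
  params.set("terms.limit", "10");
  return baseUrl + "/terms?" + params.toString();
}

console.log(buildSuggestUrl("http://localhost:8983/solr", 216, "Audi"));
```

Selecting the field by category replaces the filter query that the TermsComponent cannot apply: the category restriction is encoded in the field name instead.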

Pro:
	- works fine for a limited number of categories

Con:
	- the index gets bigger; we measured an increase of ~40 percent

Regards

André Charton


-----Original Message-----
From: Eric Grobler [mailto:impalaherd@googlemail.com] 
Sent: Wednesday, April 27, 2011 9:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Ebay Kleinanzeigen and Auto Suggest

Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.

Is anyone using shingles with faceting on large data?

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:

> Hi Eric,
>
> Before using the terms component, allow me to point out:
>
> * http://sematext.com/products/autocomplete/index.html (used on
> http://search-lucene.com/ for example)
>
> * http://wiki.apache.org/solr/Suggester
>
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message ----
> > From: Eric Grobler <im...@googlemail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Tue, April 26, 2011 1:11:11 PM
> > Subject: Ebay Kleinanzeigen and Auto Suggest
> >
> > Hi
> >
> > Someone told me that ebay is using solr.
> > I was looking at their Auto Suggest implementation and I guess they are
> > using Shingles and the TermsComponent.
> >
> > I managed to get a satisfactory implementation but I have a problem with
> > category specific filtering.
> > Ebay suggestions are sensitive to categories like Cars and Pets.
> >
> > As far as I understand it is not possible to use filters with a term
> > query.
> > Unless one uses multiple fields or special prefixes for the words to
> > index I cannot think how to implement this.
> >
> > Is there perhaps a workaround for this limitation?
> >
> > Best Regards
> > EricZ
> >
> > ---------------------------------------
> >
> > I have a shingle type like:
> > <fieldType name="shingle_text" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="4"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> >
> >
> > and a query like
> >
> http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
> >i
> >
>

RE: Ebay Kleinanzeigen and Auto Suggest

Posted by Andy <an...@yahoo.com>.
--- On Tue, 5/3/11, Charton, Andre <ac...@ebay-kleinanzeigen.de> wrote:
> 
> yes we do. 
> 
> If you use a limit number of categories (like 100) you can
> use dynamic fields with the termscomponent and by choosing a
> category specific prefix, like:
> 
> {schema.xml}
> ...
> <dynamicField name="*_suggestion" type="textAS" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
> ...
> {schema.xml}
> 
> And within data import handler we script prefix from given
> category:
> 
> {data-config.xml}
>         function setCatPrefixFields(row) {
>             var catId = row.get('category');
>             var title = row.get('freetext');
>             var cat_prefix = "c" + catId + "_suggestion";
>             return row;
>         }
> {data-config.xml}
> 
> Then you we adapt these in our application layer by a
> specific request handler, regarding these prefix.
> 
> Pro:
>     - works fine for limit number of
> categories
> 
> Con:
>     - index is getting bigger, we measure
> increasing by ~40 percent


Very interesting.

Why did the index get bigger? You're still indexing the same title, just to different dynamic fields, right? So the total amount of data indexed should still be the same. Adding dynamic fields shouldn't increase the index size. What am I missing?

Andy