Posted to solr-user@lucene.apache.org by Jeff Newburn <jn...@zappos.com> on 2009/04/02 00:33:37 UTC

Additive filter queries

I have a design question for all of those who might be willing to provide an
answer.

We are looking for a way to do a kind of additive filtering.  Each of our
documents represents a single item in a specific color.  We will use shoes as
an example.  Each document contains a multivalued "size" field with all
sizes and a multivalued "width" field with all widths available for a given
color.  Our issue is that the values are not linked to each other.  This
issue can be seen when a user chooses a size (e.g. 7) and we filter the
options down to only size 7.  When the width facet is displayed, it shows
every width available on any document that matches size 7, even though most
of those documents don't come in a wide width in that size.  We are looking
for strategies to filter facets based on other facets in separate queries.
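
The mismatch can be reproduced outside Solr with a few lines of Python.
This is only a sketch: the two sample styles, the "in_stock" pairs and the
helper names are made up for illustration, and the facet is simulated with
a Counter rather than taken from a real Solr response.

```python
from collections import Counter

# Hypothetical style-level documents. Sizes and widths are stored as two
# independent multivalued fields, so the valid pairs are lost at index time.
DOCS = [
    {"id": "style-a", "size": ["6", "7"], "width": ["B", "W"],
     "in_stock": {("6", "W"), ("7", "B")}},  # size 7 exists only in width B
    {"id": "style-b", "size": ["7", "8"], "width": ["B"],
     "in_stock": {("7", "B"), ("8", "B")}},
]


def width_facet(docs, size):
    """Count widths over every document that matches the size filter."""
    counts = Counter()
    for doc in docs:
        if size in doc["size"]:
            counts.update(doc["width"])
    return dict(counts)


def true_widths(docs, size):
    """Count widths using the real (size, width) pairs instead."""
    counts = Counter(
        width
        for doc in docs
        for pair_size, width in doc["in_stock"]
        if pair_size == size
    )
    return dict(counts)


facet = width_facet(DOCS, "7")    # reports width W although no size 7 W exists
actual = true_widths(DOCS, "7")   # only width B is really available in size 7
print(facet)
print(actual)
```

The extra "W" in the first result is the kind of facet value we want to
get rid of.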

-- 
Jeff Newburn
Software Engineer, Zappos.com
jnewburn@zappos.com - 702-943-7562


Re: Additive filter queries

Posted by Matthew Runo <mr...@zappos.com>.
That would work, but the other part of our problem comes in when we
then try to facet on the resulting set. If we filter by size 1, for
example, and then facet on width again, we get facet results that have
no size 1's, because we have not taught Solr what 1_W means, etc.

I think field collapsing might solve this for us, maybe..

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Apr 9, 2009, at 5:23 PM, Chris Hostetter wrote:

> : Right now a document looks like this:
> :
> : <doc>
> : <!-- style level -->
> : <productID>1598548</productID>
> : <styleID>12545</styleID>
> : <brand>Adidas</brand>
> : <size>1, 2, 3, 4, 5, 6, 7</size>
> : <width>AA, A, B, W, WW, WWWW</width>
> : <color>Brown</color>
> : </doc>
> :
> : If we went down a level, it could look like..
> : <doc>
> : <!-- stock level -->
> : <productID>1598548</productID>
> : <styleID>12545</styleID>
> : <stockID>654641654684</stockID>
> : <brand>Adidas</brand>
> : <size>1</size>
> : <width>AA</width>
> : <color>Brown</color>
> : </doc>
>
> If you want results at the "product" level then you don't have to have one
> *doc* per legal size+width pair ... you just need one *term* per
> valid size+width pair....
>
>  <size>1, 2, 3, 4, 5, 6, 7</size>
>  <width>AA, A, B, W, WW, WWWW</width>
>  <opts>1_W 2_W 3_B 3_W 4_AA 4_A 4_B 4_W 4_WW 5_W 5_WWWW 6_WWWW 7_WWWW</opts>
>
> a search for size 4 clogs would look like...
>
>  q=clogs&fq=size:4&facet=true&facet.field=opts&f.opts.facet.prefix=4_
>
> ...and the facet counts for "opts" would tell me what widths were
> available (and how many).
>
> for completeness you typically want to index the pairs in both directions
> (1_W and W_1 ... typically in separate fields) so the user can filter by
> either option first ... for something like size+color this makes sense,
> but I'm guessing with shoes no one expects to narrow by "width" until
> they've narrowed by size first.
>
>
> -Hoss
>


Re: Additive filter queries

Posted by Chris Hostetter <ho...@fucit.org>.
: Right now a document looks like this:
: 
: <doc>
: <!-- style level -->
: <productID>1598548</productID>
: <styleID>12545</styleID>
: <brand>Adidas</brand>
: <size>1, 2, 3, 4, 5, 6, 7</size>
: <width>AA, A, B, W, WW, WWWW</width>
: <color>Brown</color>
: </doc>
: 
: If we went down a level, it could look like..
: <doc>
: <!-- stock level -->
: <productID>1598548</productID>
: <styleID>12545</styleID>
: <stockID>654641654684</stockID>
: <brand>Adidas</brand>
: <size>1</size>
: <width>AA</width>
: <color>Brown</color>
: </doc>

If you want results at the "product" level then you don't have to have one 
*doc* per legal size+width pair ... you just need one *term* per 
valid size+width pair....

  <size>1, 2, 3, 4, 5, 6, 7</size>
  <width>AA, A, B, W, WW, WWWW</width>
  <opts>1_W 2_W 3_B 3_W 4_AA 4_A 4_B 4_W 4_WW 5_W 5_WWWW 6_WWWW 7_WWWW</opts>
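
One way to produce those pair terms at index time is sketched below. It
assumes the indexing code has the real in-stock (size, width) pairs
available; the function name and the sample pairs are hypothetical, and
the field's analysis has to keep each pair intact as a single term for
the facet to be useful.

```python
def pair_terms(in_stock, sep="_"):
    """Return one 'size<sep>width' term per valid pair, de-duplicated, in input order."""
    seen = set()
    terms = []
    for size, width in in_stock:
        term = f"{size}{sep}{width}"
        if term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


# Hypothetical in-stock pairs for one style.
in_stock = [("1", "W"), ("2", "W"), ("3", "B"), ("3", "W"), ("4", "AA")]
opts = pair_terms(in_stock)
print(" ".join(opts))
```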

a search for size 4 clogs would look like...

  q=clogs&fq=size:4&facet=true&facet.field=opts&f.opts.facet.prefix=4_

...and the facet counts for "opts" would tell me what widths were 
available (and how many).  
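
A rough client-side sketch of that request follows. It uses the same size
for the filter and for the facet prefix, and it also sends facet=true,
which Solr needs before it returns any facet counts. The helper names and
the sample counts are made up, and the flat [term, count, ...] list is
assumed to be the shape of the facet counts in the JSON response.

```python
from urllib.parse import urlencode


def width_facet_params(keywords, size):
    """Build parameters that filter by size and facet on the matching pair terms."""
    return [
        ("q", keywords),
        ("fq", f"size:{size}"),
        ("facet", "true"),
        ("facet.field", "opts"),
        ("f.opts.facet.prefix", f"{size}_"),
    ]


def widths_from_facet(flat_counts, size):
    """Turn a flat [term, count, term, count, ...] list into {width: count}."""
    prefix = f"{size}_"
    pairs = zip(flat_counts[0::2], flat_counts[1::2])
    return {
        term[len(prefix):]: count
        for term, count in pairs
        if term.startswith(prefix)
    }


query = urlencode(width_facet_params("clogs", "4"))
print(query)

# Hypothetical facet counts for the "opts" field.
sample = ["4_A", 12, "4_AA", 3, "4_B", 40, "4_W", 7, "4_WW", 1]
widths = widths_from_facet(sample, "4")
print(widths)
```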

for completeness you typically want to index the pairs in both directions 
(1_W and W_1 ... typically in separate fields) so the user can filter by 
either option first ... for something like size+color this makes sense, 
but I'm guessing with shoes no one expects to narrow by "width" until 
they've narrowed by size first.
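
Indexing both directions could look roughly like this. The field names
"size_width" and "width_size" are placeholders, not anything Solr
defines, and the pairs are sample data.

```python
def directional_terms(in_stock, sep="_"):
    """Return sorted pair terms for both drill-down orders, one list per field."""
    size_first = sorted({f"{size}{sep}{width}" for size, width in in_stock})
    width_first = sorted({f"{width}{sep}{size}" for size, width in in_stock})
    return {"size_width": size_first, "width_size": width_first}


fields = directional_terms([("4", "AA"), ("4", "W"), ("5", "W")])
print(fields["size_width"])   # facet here with prefix "4_" after a size filter
print(fields["width_size"])   # facet here with prefix "W_" after a width filter
```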


-Hoss

Re: Additive filter queries

Posted by Matthew Runo <mr...@zappos.com>.
We could do that by going down one level in our inventory, but then we
have other problems. For example:

Right now a document looks like this:

<doc>
<!-- style level -->
<productID>1598548</productID>
<styleID>12545</styleID>
<brand>Adidas</brand>
<size>1, 2, 3, 4, 5, 6, 7</size>
<width>AA, A, B, W, WW, WWWW</width>
<color>Brown</color>
</doc>

If we went down a level, it could look like..
<doc>
<!-- stock level -->
<productID>1598548</productID>
<styleID>12545</styleID>
<stockID>654641654684</stockID>
<brand>Adidas</brand>
<size>1</size>
<width>AA</width>
<color>Brown</color>
</doc>

The trade-offs are these:
- At the stock level, we don't want a search for "brown shoes" to
return all the various size/width combos as separate results;
each productID / styleID combo should be a single result.

- At the stock level, if you filter by "Size: 7" and then "Width: B"
you're assured to only get things that are width B and size 7.

- At the style level, we can't tell for sure which size / width combos
are in stock, since this data is not exposed to Solr.
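
To illustrate the first point, stock-level hits could be collapsed to one
result per productID / styleID on the client side, as in the sketch below.
This is only a stand-in for real field collapsing: it does nothing for
paging or facet counts, and every stockID except the first is invented
for the example.

```python
def collapse_by_style(hits):
    """Keep the first hit per (productID, styleID), preserving rank order."""
    seen = set()
    collapsed = []
    for hit in hits:
        key = (hit["productID"], hit["styleID"])
        if key not in seen:
            seen.add(key)
            collapsed.append(hit)
    return collapsed


# Hypothetical stock-level hits, already in relevance order.
hits = [
    {"productID": "1598548", "styleID": "12545", "stockID": "654641654684",
     "size": "1", "width": "AA"},
    {"productID": "1598548", "styleID": "12545", "stockID": "654641654685",
     "size": "2", "width": "AA"},
    {"productID": "1598548", "styleID": "12546", "stockID": "654641654690",
     "size": "1", "width": "B"},
]
results = collapse_by_style(hits)
print([hit["stockID"] for hit in results])
```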

This seems like a problem that isn't unique to us. Any store that has  
size/width or anything like that will have the same issue. How might  
it be solved?

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Apr 3, 2009, at 1:13 AM, Fergus McMenemie wrote:

>> I have a design question for all of those who might be willing to
>> provide an answer.
>>
>> We are looking for a way to do a kind of additive filtering.  Each of our
>> documents represents a single item in a specific color.  We will use
>> shoes as an example.  Each document contains a multivalued "size" field
>> with all sizes and a multivalued "width" field with all widths available
>> for a given color.  Our issue is that the values are not linked to each
>> other.  This issue can be seen when a user chooses a size (e.g. 7) and we
>> filter the options down to only size 7.  When the width facet is
>> displayed, it shows every width available on any document that matches
>> size 7, even though most of those documents don't come in a wide width in
>> that size.  We are looking for strategies to filter facets based on other
>> facets in separate queries.
>>
>> -- 
>> Jeff Newburn
>> Software Engineer, Zappos.com
>> jnewburn@zappos.com - 702-943-7562
>
> Ditto!
>
> As best I understand, you somehow need to arrange for each different
> combination of colour, size and width to be indexed as a separate Solr
> document.
>
> -- 
>
> ===============================================================
> Fergus McMenemie               Email:fergus@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
>
> Unix/Mac/Intranets             Analyst Programmer
> ===============================================================
>


Re: Additive filter queries

Posted by Fergus McMenemie <fe...@twig.me.uk>.
>I have a design question for all of those who might be willing to provide an
>answer.
>
>We are looking for a way to do a kind of additive filtering.  Each of our
>documents represents a single item in a specific color.  We will use shoes as
>an example.  Each document contains a multivalued "size" field with all
>sizes and a multivalued "width" field with all widths available for a given
>color.  Our issue is that the values are not linked to each other.  This
>issue can be seen when a user chooses a size (e.g. 7) and we filter the
>options down to only size 7.  When the width facet is displayed, it shows
>every width available on any document that matches size 7, even though most
>of those documents don't come in a wide width in that size.  We are looking
>for strategies to filter facets based on other facets in separate queries.
>
>-- 
>Jeff Newburn
>Software Engineer, Zappos.com
>jnewburn@zappos.com - 702-943-7562

Ditto!

As best I understand, you somehow need to arrange for each different 
combination of colour, size and width to be indexed as a separate Solr
document.
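
Something along these lines, perhaps. It is a sketch only: it assumes you
can get at the real in-stock (size, width) pairs for each style, and the
"id" scheme and function name are invented for the example.

```python
def stock_level_docs(style_doc, in_stock):
    """Yield one flat document per in-stock (size, width) pair of a style."""
    shared = {
        key: value
        for key, value in style_doc.items()
        if key not in ("size", "width")
    }
    for size, width in in_stock:
        doc = dict(shared)
        # Hypothetical unique key built from the style and the pair.
        doc["id"] = f'{style_doc["styleID"]}-{size}-{width}'
        doc["size"] = size
        doc["width"] = width
        yield doc


style = {"productID": "1598548", "styleID": "12545", "brand": "Adidas",
         "color": "Brown", "size": ["1", "2"], "width": ["AA", "W"]}
docs = list(stock_level_docs(style, [("1", "AA"), ("2", "W")]))
for doc in docs:
    print(doc)
```

Filtering on size and then on width against documents like these can only
match real combinations, at the cost of many more documents per style.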

-- 

===============================================================
Fergus McMenemie               Email:fergus@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================