Posted to solr-user@lucene.apache.org by Mickael Magniez <mi...@gmail.com> on 2010/08/04 16:17:56 UTC

No "group by"? looking for an alternative.

Hello,

I've been dealing with a problem for a few days: I want to index and search
shoes. Each shoe can have several sizes and colors, at different prices.

So what I want is: when I search for "Converse", I want to retrieve one
shoe per model, i.e. one color and one size, but with colors and sizes in
facets.

My first idea was to copy SQL behaviour with a "SELECT * FROM solr WHERE
text CONTAINS 'converse' GROUP BY model".
But there is no group by in Solr :(. I tried FieldCollapsing, but hit many bugs
(NullPointerException).

Then I tried multivalued facets:
<field name="size" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="color" type="string" indexed="true" stored="true" multiValued="true"/>

It's nearly working, but I have a problem: when I filter on red shoes, the
size facet also shows sizes that are not available in red. I can't find a
way to filter one multivalued facet by the value of another multivalued
facet.
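
To make the problem concrete, here is a minimal sketch in plain Python (not Solr itself) of why two independent multivalued fields lose the color/size pairing. The document contents and the facet_counts helper are made up for illustration; only the field names size and color come from the schema above.

```python
from collections import Counter

# Hypothetical index: one document per shoe model, with multi-valued
# size and color fields as in the schema above.
docs = [
    {
        "model": "converse-hi",
        # Variants actually in stock: (red, 40), (blue, 42), (blue, 44).
        # That pairing is lost once the values are stored in separate fields.
        "color": ["red", "blue"],
        "size": ["40", "42", "44"],
    },
]


def facet_counts(documents, field):
    """Count each value of a multi-valued field once per matching document."""
    counts = Counter()
    for doc in documents:
        counts.update(set(doc[field]))
    return counts


# Filtering on color:red keeps the whole document, not individual variants.
red_docs = [doc for doc in docs if "red" in doc["color"]]
size_facet = facet_counts(red_docs, "size")

# Sizes 42 and 44 show up even though no red shoe exists in those sizes.
print(sorted(size_facet))
```

The filter works at the document level, so every size stored on a matching document is counted, whatever color it actually belongs to.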

So if anyone has an idea for solving this problem...



Mickael.

-- 
View this message in context: http://lucene.472066.n3.nabble.com/No-group-by-looking-for-an-alternative-tp1022738p1022738.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: No "group by"? looking for an alternative.

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Dennis,

Unfortunately it's not ready, at least not SOLR-236. There is another issue
that you could look at. From memory, I think it's SOLR-1682.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Dennis Gearon <ge...@sbcglobal.net>
> To: solr-user@lucene.apache.org
> Sent: Fri, August 6, 2010 10:44:35 AM
> Subject: Re: No "group by"? looking for an alternative.
> 
> I thought that field collapsing was already 'ready for prime time', just not yet
> integrated into the core?

Re: No "group by"? looking for an alternative.

Posted by Dennis Gearon <ge...@sbcglobal.net>.
I thought that field collapsing was already 'ready for prime time', just not yet integrated into the core?


Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


Re: No "group by"? looking for an alternative.

Posted by Lance Norskog <go...@gmail.com>.
I can see how one document per model blows up when you have many
options. But how many models of the shoe do they actually make? They
can't possibly make 5000, one for every metadata combination.

If you go with one document per model, you have to do a second search
on that product ID to get all of the models.

Field Collapsing is exactly for the 'many shoes for one product'
problem, but it is not released, so the second search is what you
want.

-- 
Lance Norskog
goksron@gmail.com

Re: No "group by"? looking for an alternative.

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Mickael Magniez wrote:
> Thanks for your response.
>
> Unfortunately, I don't think it'll be enough. In fact, I have many
> products other than shoes in my index, with many other facet fields.
>
> I simplified my schema: in reality, the facets are dynamic fields.
>   

You could change the way you do indexing, so every product-color-size
combo is its own "document".

Document1:
    product: running shoe
    size: 12
    color: red

Document2:
    product: running shoe
    size: 13
    color: red

That would let you do the kind of faceting drill-down you want to do.
It would of course make other things more complicated. But it's the only
way I can think of to let you do the kind of facet drill-down you want,
if I understand what you want correctly, which I may not.
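
As a rough illustration of the drill-down this enables, here is a small plain-Python sketch (a simulation, not Solr code). The third document and the facet_counts helper are invented for the example; the field names match the documents above.

```python
from collections import Counter

# Hypothetical index: one document per product-color-size combination.
variant_docs = [
    {"product": "running shoe", "size": "12", "color": "red"},
    {"product": "running shoe", "size": "13", "color": "red"},
    {"product": "running shoe", "size": "14", "color": "blue"},
]


def facet_counts(documents, field):
    """Count how many documents carry each value of a single-valued field."""
    return Counter(doc[field] for doc in documents)


# Filtering on color now removes individual variants, so the size facet
# only reflects sizes that really exist in red.
red_variants = [doc for doc in variant_docs if doc["color"] == "red"]
size_facet = facet_counts(red_variants, "size")

print(dict(size_facet))
```

Size 14 drops out of the facet because its only document is blue. The price is that a search now returns several documents per product.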

Jonathan




Re: No "group by"? looking for an alternative.

Posted by Mickael Magniez <mi...@gmail.com>.
Thanks for your response.

Unfortunately, I don't think it'll be enough. In fact, I have many
products other than shoes in my index, with many other facet fields.

I simplified my schema: in reality, the facets are dynamic fields.

Re: No "group by"? looking for an alternative.

Posted by Lance Norskog <go...@gmail.com>.
Hello-

A way to do this is to create one faceting field that includes both the
size and the color. I assume you have a different shoe product
document for each model. Each model would still include the color and size
fields ('red' and '14a'), but you would also add a field with 'red-14a'.
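
A minimal sketch of building such a document at index time, in plain Python. The field name color_size, the helper name, and the "-" separator are assumptions for illustration, not anything Solr prescribes.

```python
def add_combined_field(doc, separator="-"):
    """Return a copy of doc with an extra 'color_size' field such as 'red-14a'.

    'color_size' is a hypothetical field name; the original color and size
    fields are kept so they can still be faceted on separately.
    """
    combined = dict(doc)
    combined["color_size"] = f"{doc['color']}{separator}{doc['size']}"
    return combined


doc = add_combined_field({"model": "converse-hi", "color": "red", "size": "14a"})
print(doc["color_size"])
```

Faceting or filtering on the combined field keeps the color and size tied together, which two separate fields cannot do.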

-- 
Lance Norskog
goksron@gmail.com

Re: No "group by"? looking for an alternative.

Posted by Geert-Jan Brits <gb...@gmail.com>.
If I understand correctly:
1. Products have different product variants (in the case of shoes, a combination
of color and size plus some other fields).
2. Each product is shown once in the result set (so multiple product
variants of the same product are never shown).

This would solve that, IMO:

1. Create one document per product (so not a document per product variant).
2. Create a multivalued field on which to facet, containing all combinations
of: <size>-<color>-<any other field>-<yet another field>
3. Make sure to include combinations in which the user is indifferent to a
particular filter, i.e. "don't care about size (dc)" + "red" --> "dc-red"
4. Filtering on that combination would give you all the products that
satisfy the product-variant constraints (size, color, etc.) plus the extra
product constraints ("converse").
5. On the detail page, show all available product variants, not filtered by
the constraints specified. This would likely be something outside of Solr (a
simple SQL select on a single product).
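
Steps 2 and 3 could be sketched roughly as follows, in plain Python and limited to size and color. The variant_keys helper and the sample variants are invented for illustration; "dc" is the "don't care" marker from step 3.

```python
from itertools import product

DONT_CARE = "dc"  # marker for "the user did not filter on this attribute"


def variant_keys(variants):
    """Build every '<size>-<color>' key a product should match.

    For each real (size, color) variant, also emit the keys where one or
    both attributes are replaced by the 'dc' wildcard.
    """
    keys = set()
    for size, color in variants:
        for s, c in product((size, DONT_CARE), (color, DONT_CARE)):
            keys.add(f"{s}-{c}")
    return sorted(keys)


# Hypothetical product sold as size 40 in red and size 42 in blue.
keys = variant_keys([("40", "red"), ("42", "blue")])
print(keys)
```

A filter on "dc-red" matches this product, while "40-blue" does not, because that combination was never generated. Note that the number of keys doubles with each extra attribute, and that values containing "-" would need a different separator.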

hope that helps,
Geert-Jan

Re: No "group by"? looking for an alternative.

Posted by Mickael Magniez <mi...@gmail.com>.
I've got only one document per shoe, whatever its size or color.

My first try was to create one document per model/size/color, but when I
search for 'converse', for example, the same shoe is retrieved several
times, and I want to show only one record for each model. But I haven't
succeeded in grouping results by shoe model.
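
For reference, the grouping I am after looks roughly like this when done on the client, after the results come back. This is only a plain-Python sketch with made-up documents and a hypothetical collapse_by helper, not something Solr does for me.

```python
def collapse_by(results, field):
    """Keep the first (highest-ranked) result for each distinct value of field."""
    seen = set()
    collapsed = []
    for doc in results:
        key = doc[field]
        if key not in seen:
            seen.add(key)
            collapsed.append(doc)
    return collapsed


# Hypothetical ranked results with one document per model/size/color.
results = [
    {"model": "all-star-hi", "color": "red", "size": "40"},
    {"model": "all-star-hi", "color": "blue", "size": "42"},
    {"model": "one-star", "color": "black", "size": "41"},
]

one_per_model = collapse_by(results, "model")
print([doc["model"] for doc in one_per_model])
```

Doing this outside the search engine breaks pagination and result counts, because a page of raw results can collapse to fewer rows, which is why I would rather have it done at query time.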

If you look at Amazon for Converse All Star Leather Hi Chuck Taylor:
http://www.amazon.com/s/ref=nb_sb_noss?url=node%3D679255011&field-keywords=Converse+All+Star+Leather+Hi+Chuck+Taylor+&x=0&y=0&ih=1_0_0_0_0_0_0_0_0_0.4136_1&fsc=-1
They show the shoe only once, but if you go to the product details, it
exists in several colors and sizes. Now if you filter on color, there are
fewer sizes available.

Re: No "group by"? looking for an alternative.

Posted by kenf_nc <ke...@realestate.com>.
In the size 'facet' you have values that may not be available in red, but in
the size 'field' of any individual document you won't. If you searched on
q=converse&fq=color:red, the shoes returned would have appropriate sizes in
their field. Having a facet value for size 10 means at least one shoe in your
potential result set has that size in red; it doesn't mean the shoe you got
back in position 1 does.
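
The request above, with faceting switched on, could be assembled like this. It is a minimal sketch: q, fq, facet, facet.field and facet.mincount are standard Solr request parameters, the field names size and color come from the schema earlier in the thread, and no host or core name is assumed.

```python
from urllib.parse import urlencode

# A list of pairs is used so that facet.field can be repeated.
params = [
    ("q", "converse"),
    ("fq", "color:red"),       # filter query: restrict to red shoes
    ("facet", "true"),         # turn faceting on
    ("facet.field", "size"),
    ("facet.field", "color"),
    ("facet.mincount", "1"),   # hide facet values with zero matches
]

query_string = urlencode(params)
print(query_string)
```

The resulting string is appended to the select handler URL of whatever Solr instance is being queried.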