Posted to java-user@lucene.apache.org by Ramprakash Ramamoorthy <yo...@gmail.com> on 2013/01/25 11:12:32 UTC

Multiple faceting in lucene

Hello,

        After reading the facet documentation for Lucene 4, I can see that
it supports folder-structure-like (hierarchical) faceting. I would like to
have multi-level/multiple faceting in my application.

        Let me explain: we have a requirement for something like a
contingency table
(Wikipedia: <http://en.wikipedia.org/wiki/Contingency_table>).
Say we have two fields, a and b; I need the facet counts of b
for each distinct value of a. Beyond a two-field contingency matrix, this
could also expand to more than 2 fields.

        Is there a way or workaround by which Lucene can help me with such
multi-level faceting? Please advise. Thanks in advance.

-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
India.

Re: Multiple faceting in lucene

Posted by Shai Erera <se...@gmail.com>.
I'm glad to hear it helped you, Ramprakash.

Don't hesitate to post questions to the list if you need further assistance!

Shai


On Fri, Feb 1, 2013 at 9:12 AM, Ramprakash Ramamoorthy <
youngestachiever@gmail.com> wrote:

> Thank you Shai, I was able to achieve it in a roundabout way. I read this
> blog post of yours
> <http://shaierera.blogspot.com/2012/12/lucene-facets-under-hood.html> and
> came to know about the setDepth method in FacetRequest. I was able to
> solve my use case with it. Thank you for the blog and the answer!

Re: Multiple faceting in lucene

Posted by Ramprakash Ramamoorthy <yo...@gmail.com>.
On Fri, Jan 25, 2013 at 6:23 PM, Shai Erera <se...@gmail.com> wrote:

> Hi
>
> Are the values of 'a' and 'b' known in advance? Is it a limited set of
> values? Are you always interested in a table which covers all values?
>
> If so, one way to do that is to run one search per value of 'a', counting
> it against all values of 'b'. Of course, pick as the pivot the dimension
> with the fewest values. Note however that if the number of values of 'a'
> and 'b' is large, it may be costly to run N searches (even if you optimize
> the query processing).
>
> If the cardinality of 'a' and 'b' is small, and if you index for each
> document only one value of 'a' and one value of 'b' (maybe some documents
> only have 'a' or 'b'...), then there is another way.
> You can run the query 'a AND b', where 'a' and 'b' are the facet dimensions
> and for every matching document count a "fake" facet which is a pair of
> ordinals, e.g. [1,45], [1,23] ...
> In the end, you'll have the counts of the pair of ordinals and you can use
> TaxonomyReader to label the ordinals of each pair and return the table.
> That will work as long as the cardinality of 'a' and 'b' is sane :-)
>
> Shai
>

Thank you Shai, I was able to achieve it in a roundabout way. I read this
blog post of yours
<http://shaierera.blogspot.com/2012/12/lucene-facets-under-hood.html> and
came to know about the setDepth method in FacetRequest. I was able to
solve my use case with it. Thank you for the blog and the answer!
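
The message does not say how setDepth was applied. One plausible reading,
stated here purely as an assumption, is that each document was indexed with
a single hierarchical category whose first level is the value of 'a' and
whose second level is the value of 'b', and counts were then requested two
levels deep. The sketch below mimics that idea in plain Java without any
Lucene calls; the class DepthCounts and the method countToDepth are
hypothetical names, and the depth parameter only imitates what a depth
setting on a facet request would do.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Lucene-free illustration of counting a hierarchy to a given depth.
final class DepthCounts {

    // For every document path (e.g. {"x", "p"} meaning a=x, b=p), count each
    // prefix of length 1..depth. With depth 2, the level-2 entries are the
    // cells of the contingency table and the level-1 entries are row totals.
    static Map<String, Integer> countToDepth(List<String[]> docPaths, int depth) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] path : docPaths) {
            StringBuilder prefix = new StringBuilder();
            int levels = Math.min(depth, path.length);
            for (int level = 0; level < levels; level++) {
                if (level > 0) {
                    prefix.append('/');
                }
                prefix.append(path[level]);
                counts.merge(prefix.toString(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Extending the path with a third level would cover a third field, at the
cost of a category count that grows with the product of the cardinalities.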




-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
India,
+91 9626975420

Re: Multiple faceting in lucene

Posted by Shai Erera <se...@gmail.com>.
Hi

Are the values of 'a' and 'b' known in advance? Is it a limited set of
values? Are you always interested in a table which covers all values?

If so, one way to do that is to run one search per value of 'a', counting
it against all values of 'b'. Of course, pick as the pivot the dimension
with the fewest values. Note however that if the number of values of 'a'
and 'b' is large, it may be costly to run N searches (even if you optimize
the query processing).
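
As a rough illustration of this first approach (not Lucene code), the
sketch below replaces each search with an in-memory filter over documents
represented as field-to-value maps. The class PivotSearches and its methods
are hypothetical; in a real index, each call to searchAndCount would
presumably be a query restricted to one pivot value with facet counting on
the other dimension.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, Lucene-free illustration of "one search per pivot value".
final class PivotSearches {

    // Stand-in for a single search: keep only documents whose pivot field
    // equals pivotValue, then count the values of the other field.
    static Map<String, Integer> searchAndCount(List<Map<String, String>> docs,
                                               String pivotField,
                                               String pivotValue,
                                               String countedField) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> doc : docs) {
            String counted = doc.get(countedField);
            if (pivotValue.equals(doc.get(pivotField)) && counted != null) {
                counts.merge(counted, 1, Integer::sum);
            }
        }
        return counts;
    }

    // N searches, one per known pivot value; each result is one table row.
    static Map<String, Map<String, Integer>> table(List<Map<String, String>> docs,
                                                   String pivotField,
                                                   Set<String> pivotValues,
                                                   String countedField) {
        Map<String, Map<String, Integer>> rows = new TreeMap<>();
        for (String value : pivotValues) {
            rows.put(value, searchAndCount(docs, pivotField, value, countedField));
        }
        return rows;
    }
}
```

The number of searches equals the number of pivot values, which is why the
dimension with the fewest values makes the cheaper pivot.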

If the cardinality of 'a' and 'b' is small, and if you index for each
document only one value of 'a' and one value of 'b' (maybe some documents
only have 'a' or 'b'...), then there is another way.
You can run the query 'a AND b', where 'a' and 'b' are the facet dimensions
and for every matching document count a "fake" facet which is a pair of
ordinals, e.g. [1,45], [1,23] ...
In the end, you'll have the counts of the pair of ordinals and you can use
TaxonomyReader to label the ordinals of each pair and return the table.
That will work as long as the cardinality of 'a' and 'b' is sane :-)
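
A minimal sketch of the pair-counting idea, again without Lucene: each
matching document contributes one (ordinal of 'a', ordinal of 'b') pair,
the pairs are counted in a single pass, and the ordinals are turned back
into labels at the end. The Labels class is a hypothetical stand-in for the
ordinal-to-label lookup that TaxonomyReader would provide, and packing the
pair into a long is just one convenient choice of key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Lucene-free illustration of counting "fake" pair facets.
final class PairOrdinalCounts {

    // Stand-in for a taxonomy: assigns ordinals to labels and resolves them back.
    static final class Labels {
        private final List<String> byOrdinal = new ArrayList<>();
        private final Map<String, Integer> byLabel = new HashMap<>();

        int ordinalOf(String label) {
            Integer ordinal = byLabel.get(label);
            if (ordinal == null) {
                ordinal = byOrdinal.size();
                byOrdinal.add(label);
                byLabel.put(label, ordinal);
            }
            return ordinal;
        }

        String labelOf(int ordinal) {
            return byOrdinal.get(ordinal);
        }
    }

    // One pass over the matching documents; each entry is {ordinalOfA, ordinalOfB}.
    static Map<Long, Integer> countPairs(List<int[]> matchingDocs) {
        Map<Long, Integer> counts = new HashMap<>();
        for (int[] doc : matchingDocs) {
            long key = ((long) doc[0] << 32) | (doc[1] & 0xFFFFFFFFL);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Resolve each ordinal pair to "labelOfA|labelOfB" for the final table.
    static Map<String, Integer> label(Map<Long, Integer> counts, Labels labels) {
        Map<String, Integer> table = new TreeMap<>();
        for (Map.Entry<Long, Integer> entry : counts.entrySet()) {
            long key = entry.getKey();
            int a = (int) (key >>> 32);
            int b = (int) key;
            table.put(labels.labelOf(a) + "|" + labels.labelOf(b), entry.getValue());
        }
        return table;
    }
}
```

The map holds at most one entry per distinct pair, so its size is bounded
by the product of the two cardinalities, which is where the "sane
cardinality" caveat comes from.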

Shai

