Posted to solr-user@lucene.apache.org by Nick Snels <ni...@gmail.com> on 2006/07/06 13:39:42 UTC

Simple faceted browsing

Hi,

is anybody currently working on the following item from the tasklist?

* Simple faceted browsing (grouping) support in the standard query handler
    * group by field (provide counts for each distinct value in that field)
    * group by (query1, query2, query3, query4, query5)

I had intended to do it myself, but my knowledge of Java proved too
small. I had a look at DisMaxRequestHandler and StandardRequestHandler,
and even at Erik Hatcher's code, but couldn't figure out the flow. My
current solution is to send, in my case, 11 queries
(?indent=on&q=%2B#{query}+%2Bprovince:#{province}), each with a different
value for the province parameter. On my development machine that is fast
enough, but it would be much easier if I could add a "group by" parameter
and get an XML response back with the counts, which should also be a bit
faster than sending 11 requests to get the count for each. It would be
great to hear from someone who has implemented this 'simple' grouping in
Solr and could give me some pointers. Thanks.

Kind regards,

Nick Snels
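
For readers following along, the one-request-per-province workaround described above might look roughly like the sketch below. The class name, method name, and sample province values are made up for illustration; only the query-string pattern comes from the message. Note that URLEncoder also encodes the colon as %3A, which differs slightly from the hand-written pattern.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds one Solr query string per province value.
class ProvinceQueries {

    static List<String> build(String query, List<String> provinces) {
        List<String> out = new ArrayList<String>();
        for (String province : provinces) {
            // Both clauses are required: +<query> +province:<value>
            String q = "+" + query + " +province:" + province;
            try {
                out.add("?indent=on&q=" + URLEncoder.encode(q, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is always available on a conforming JVM
                throw new IllegalStateException(e);
            }
        }
        return out;
    }
}
```

Each returned string still has to be sent as its own request, and the count read from each response, which is the overhead a server-side "group by" would remove.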

Re: Simple faceted browsing

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Nick,

I wish I could help more directly, but, alas, my time is compressed  
and I can't shepherd facets into Solr's core just yet.  Since you're  
focused on a single field currently, here's how I recommend you  
proceed (unless someone goes the full distance on this and makes it  
more easily available):

   - Set up an environment where you can add some custom code to a  
Solr WAR file and deploy it.   At first just subclass  
StandardRequestHandler as a custom class and add in something simple  
like rsp.add("test", "test") and ensure you're getting this custom  
value back in the returned XML.

   - Hard code in those 11 queries (for now) as TermQuery's into an  
array or something.

   - Then loop over all those TermQuery's and do this:

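     SolrIndexSearcher searcher = req.getSearcher();
     // DocSet for the user's original query, computed once
     DocSet mainDocSet = searcher.getDocSet(originalQuery);

     for (TermQuery termQuery : termQueries) {
       DocSet valueDocSet = searcher.getDocSet(termQuery);
       // intersectionSize takes a DocSet (not a Query) and returns an int
       int count = valueDocSet.intersectionSize(mainDocSet);
       rsp.add(termQuery.toString(), count);
     }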

This is all off the top of my head with some glancing at the custom  
faceted request handler I created for Collex, so maybe I've  
overlooked something?  But overall it's pretty straightforward to get  
counts per field value.  The question is, where do those field values  
come from?   This is why I suggested you hard-code those 11 queries  
for now, and then when that is working you can ramp up and get those  
field values dynamically from the index (which is what my code does,  
but I'm still fiddling to find the best way to cache those values or  
read them from the index dynamically myself).  The above provides the  
counts.  To group the actual documents by a field value, you could call  
intersection() to get the matching DocSet, rather than just  
intersectionSize().
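
The counting idea can be tried outside Solr with a minimal sketch like the one below. It assumes java.util.BitSet as a stand-in for Solr's DocSet; the class name, method names, and sample data are hypothetical and are not Solr APIs.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: BitSet plays the role of a DocSet (one bit per document id).
class FacetCountSketch {

    // Documents present in both sets; neither input is modified.
    static BitSet intersection(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.and(b);
        return copy;
    }

    // Only the number of shared documents, which is all a facet count needs.
    static int intersectionSize(BitSet a, BitSet b) {
        return intersection(a, b).cardinality();
    }

    // One count per facet value: docs matching that value AND the main query.
    static Map<String, Integer> facetCounts(BitSet mainDocs,
                                            Map<String, BitSet> docsByValue) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, BitSet> e : docsByValue.entrySet()) {
            counts.put(e.getKey(), intersectionSize(e.getValue(), mainDocs));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data: docs 0, 2 and 3 match the user's query.
        BitSet mainDocs = new BitSet();
        mainDocs.set(0); mainDocs.set(2); mainDocs.set(3);

        BitSet provinceA = new BitSet();
        provinceA.set(0); provinceA.set(1); provinceA.set(2);
        BitSet provinceB = new BitSet();
        provinceB.set(3); provinceB.set(4);

        Map<String, BitSet> docsByValue = new LinkedHashMap<String, BitSet>();
        docsByValue.put("province:A", provinceA);
        docsByValue.put("province:B", provinceB);

        // Prints {province:A=2, province:B=1}
        System.out.println(facetCounts(mainDocs, docsByValue));
    }
}
```

The set for the main query is computed once and reused for every facet value, so the per-value cost is a single set intersection rather than a full query.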

	Erik


