Posted to java-user@lucene.apache.org by Dawn Zoë Raison <da...@digitorial.co.uk> on 2011/03/22 11:43:21 UTC

Grouping...

Hi Folks,

Before I run off and reinvent the wheel here - has anyone done any form 
of result grouping with lucene?

My use case looks something like this:
Newspaper pages are stored as documents in the lucene index.
I need to list the newspapers that match my criteria in date order, so
that I can then, in a subsequent search, enumerate the first n pages from
each. n is derived from the UI - in this case the screen width the user
has available.
Ideally I'd want to pull all papers for a given date - so a way to pull 
a result set that identifies a set of dates that have pages stored 
against them would be ideal. It seems to me that the only way to do this 
at present would be to define a custom collector and aggregate such a 
result set on the fly?
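
A minimal sketch of that on-the-fly aggregation, in plain Java rather than against the Lucene API. It assumes each matching page exposes a sortable date string (for example yyyyMMdd), a newspaper identifier and a page number; PageHit, Groups and their methods are hypothetical names for illustration. In a real index the collect step would sit inside a custom collector and read those values from the matching document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of grouping hits by date and paper; no Lucene classes are used. */
final class DateGroupingSketch {

    /** Hypothetical view of one matching page document. */
    static final class PageHit {
        final String date;   // assumed sortable, e.g. "20110322"
        final String paper;  // assumed newspaper identifier
        final int pageNo;    // page number within the paper; gaps are allowed

        PageHit(String date, String paper, int pageNo) {
            this.date = date;
            this.paper = paper;
            this.pageNo = pageNo;
        }
    }

    /** Accumulates hits as date -> paper -> pages, with sorted keys at both levels. */
    static final class Groups {
        private final Map<String, Map<String, List<PageHit>>> byDate = new TreeMap<>();

        /** Call once per matching document, as a collector's collect step would. */
        void collect(PageHit hit) {
            byDate.computeIfAbsent(hit.date, d -> new TreeMap<>())
                  .computeIfAbsent(hit.paper, p -> new ArrayList<>())
                  .add(hit);
        }

        /** Dates that have at least one matching page, in ascending order. */
        List<String> dates() {
            return new ArrayList<>(byDate.keySet());
        }

        /** First n matching pages of one paper on one date, ordered by page number. */
        List<PageHit> firstPages(String date, String paper, int n) {
            Map<String, List<PageHit>> papers = byDate.get(date);
            if (papers == null || !papers.containsKey(paper)) {
                return Collections.emptyList();
            }
            List<PageHit> pages = new ArrayList<>(papers.get(paper));
            pages.sort(Comparator.comparingInt((PageHit h) -> h.pageNo));
            int end = Math.max(0, Math.min(n, pages.size()));
            return new ArrayList<>(pages.subList(0, end));
        }
    }

    public static void main(String[] args) {
        Groups groups = new Groups();
        groups.collect(new PageHit("20110322", "Gazette", 4));
        groups.collect(new PageHit("20110321", "Herald", 1));
        groups.collect(new PageHit("20110322", "Gazette", 1));
        System.out.println(groups.dates());
        System.out.println(groups.firstPages("20110322", "Gazette", 2).size());
    }
}
```

Holding every hit in memory is only reasonable for modest result sets; with millions of pages one would probably keep just the date keys, or a count per date, and fetch the pages in the follow-up search.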

My reason for wanting to group is so that I can easily compute the 
next/previous start indexes as the user browses through the timeline. If 
I have to include the (variable) page count each time it gets 
convoluted. More so since some pages may be missing from each paper.
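
To illustrate why grouping helps here: once the timeline is a sorted list of dates, the next/previous start indexes depend only on how many dates fit on screen, not on how many pages each paper happens to have. The sketch below is plain Java under that assumption; TimelinePager, nextStart, previousStart and datesPerScreen are hypothetical names, not part of Lucene.

```java
import java.util.OptionalInt;

/** Paging over grouped dates instead of over individual page documents. */
final class TimelinePager {

    /** Start index of the next screen, or empty when the current screen is the last. */
    static OptionalInt nextStart(int start, int datesPerScreen, int totalDates) {
        int next = start + datesPerScreen;
        return next < totalDates ? OptionalInt.of(next) : OptionalInt.empty();
    }

    /** Start index of the previous screen, or empty when already at the beginning. */
    static OptionalInt previousStart(int start, int datesPerScreen) {
        if (start <= 0) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Math.max(0, start - datesPerScreen));
    }

    public static void main(String[] args) {
        // 12 dates matched, 5 fit on screen, currently showing dates 5..9.
        System.out.println(nextStart(5, 5, 12));
        System.out.println(previousStart(5, 5));
    }
}
```

Missing pages no longer matter for the arithmetic, because the stride is counted in dates rather than in page documents.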

Any thoughts appreciated.

-- 

Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Grouping...

Posted by Ian Lea <ia...@gmail.com>.
I'm not aware of a particular FAQ on this.

There is something called bobo-browse - "Faceted search library based
on Lucene - Google ...
Bobo Browse is an information retrieval technology that provides
navigational browsing into a semi-structured dataset. Beyond the
result set from queries ...".

http://sbdevel.wordpress.com/2010/09/24/sorting-faceting-index-lookup/
and https://issues.apache.org/jira/browse/LUCENE-2369 sound
interesting.


--
Ian.


On Fri, Mar 25, 2011 at 10:14 AM, Dawn Zoë Raison <da...@digitorial.co.uk> wrote:
>
> On 23/03/2011 17:55, Grant Ingersoll wrote:
>>
>> Have you looked at Solr and date faceting capabilities?  Also, it has
>> result grouping, but I think you are just describing faceting/filtering.
>
> SOLR is not an option; we already have the index (>2 million pages, some
> with 100,000 terms).
> What I'm looking to do is to create some new ways to view the data.
>
> Is there a good FAQ on faceting/filtering I can peruse?
>
> Ta.
> --
>
> Rgds.
> *Dawn Raison*
> Technical Director, Digitorial Ltd.
>
>
>



Re: Grouping...

Posted by Dawn Zoë Raison <da...@digitorial.co.uk>.
On 23/03/2011 17:55, Grant Ingersoll wrote:
> Have you looked at Solr and date faceting capabilities?  Also, it has result grouping, but I think you are just describing faceting/filtering.

SOLR is not an option; we already have the index (>2 million pages,
some with 100,000 terms).
What I'm looking to do is to create some new ways to view the data.

Is there a good FAQ on faceting/filtering I can peruse?

Ta.
-- 

Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.



Re: Grouping...

Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 22, 2011, at 6:43 AM, Dawn Zoë Raison wrote:

> Hi Folks,
> 
> Before I run off and reinvent the wheel here - has anyone done any form of result grouping with lucene?
> 
> My use case looks something like this:
> Newspaper pages are stored as documents in the lucene index.
> I need to list the newspapers that match my criteria in date order, so that I can then, in a subsequent search, enumerate the first n pages from each. n is derived from the UI - in this case the screen width the user has available.
> Ideally I'd want to pull all papers for a given date - so a way to pull a result set that identifies a set of dates that have pages stored against them would be ideal. It seems to me that the only way to do this at present would be to define a custom collector and aggregate such a result set on the fly?
> 
> My reason for wanting to group is so that I can easily compute the next/previous start indexes as the user browses through the timeline. If I have to include the (variable) page count each time it gets convoluted. More so since some pages may be missing from each paper.
> 
> Any thoughts appreciated.

Have you looked at Solr and date faceting capabilities?  Also, it has result grouping, but I think you are just describing faceting/filtering.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search

