Posted to java-user@lucene.apache.org by Dawn Zoë Raison <da...@digitorial.co.uk> on 2011/03/22 11:43:21 UTC
Grouping...
Hi Folks,
Before I run off and reinvent the wheel here - has anyone done any form
of result grouping with lucene?
My use case looks something like this:
Newspaper pages are stored as documents in the lucene index.
I need to list the newspapers that match my criteria in date order, so
that I can then in a subsequent search enumerate the first n pages from
each. n is derived from the UI - in this case the screen width the user
has available.
Ideally I'd want to pull all papers for a given date - so a way to pull
a result set that identifies a set of dates that have pages stored
against them would be ideal. It seems to me that the only way to do this
at present would be to define a custom collector and aggregate such a
result set on the fly?
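
A minimal sketch of the "aggregate on the fly" idea, in plain Java rather than against the Lucene API. It assumes each matching page yields a date, a paper title and a page number; the PageHit class, its field names and the yyyyMMdd date format are hypothetical stand-ins, not anything from the actual index. In a real custom collector, the per-document callback would presumably feed values like these into the same kind of map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class DateGrouping {

    /** Hypothetical stand-in for the fields read from one matching page document. */
    static final class PageHit {
        final String date;   // assumed yyyyMMdd, so string order equals date order
        final String paper;  // newspaper title
        final int pageNo;

        PageHit(String date, String paper, int pageNo) {
            this.date = date;
            this.paper = paper;
            this.pageNo = pageNo;
        }
    }

    /** Groups hits by date, then by paper, keeping the n lowest page numbers per paper. */
    static TreeMap<String, TreeMap<String, List<Integer>>> group(List<PageHit> hits, int n) {
        // Pass 1: collect every page number, sorted, under date -> paper.
        TreeMap<String, TreeMap<String, TreeSet<Integer>>> all = new TreeMap<>();
        for (PageHit h : hits) {
            all.computeIfAbsent(h.date, d -> new TreeMap<>())
               .computeIfAbsent(h.paper, p -> new TreeSet<>())
               .add(h.pageNo);
        }
        // Pass 2: trim each paper to its first n pages (missing pages are simply skipped).
        TreeMap<String, TreeMap<String, List<Integer>>> result = new TreeMap<>();
        for (Map.Entry<String, TreeMap<String, TreeSet<Integer>>> byDate : all.entrySet()) {
            TreeMap<String, List<Integer>> papers = new TreeMap<>();
            for (Map.Entry<String, TreeSet<Integer>> byPaper : byDate.getValue().entrySet()) {
                List<Integer> firstN = new ArrayList<>();
                for (Integer page : byPaper.getValue()) {
                    if (firstN.size() == n) {
                        break;
                    }
                    firstN.add(page);
                }
                papers.put(byPaper.getKey(), firstN);
            }
            result.put(byDate.getKey(), papers);
        }
        return result;
    }

    public static void main(String[] args) {
        List<PageHit> hits = Arrays.asList(
                new PageHit("18990413", "Gazette", 3),
                new PageHit("18990412", "Gazette", 2),
                new PageHit("18990412", "Gazette", 1),
                new PageHit("18990412", "Gazette", 4),
                new PageHit("18990412", "Herald", 1));
        // Dates come back in ascending order; each paper lists at most 2 pages.
        System.out.println(group(hits, 2));
    }
}
```

The key set of the outer map is the "set of dates that have pages stored against them"; holding everything in memory like this is only reasonable if the number of matching pages per query is modest.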
My reason for wanting to group is so that I can easily compute the
next/previous start indexes as the user browses through the timeline. If
I have to include the (variable) page count each time it gets
convoluted. More so since some pages may be missing from each paper.
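
To illustrate why grouping simplifies the paging: if the timeline is indexed by distinct date rather than by page, the next/previous start indexes reduce to fixed-step arithmetic, whatever the page counts are. The sketch below assumes a sorted list of distinct dates is already available; TimelinePager and its method names are hypothetical.

```java
public class TimelinePager {

    /** Start index of the next screen of dates, or -1 if the current screen is the last. */
    static int nextStart(int start, int perScreen, int totalDates) {
        int next = start + perScreen;
        return next < totalDates ? next : -1;
    }

    /** Start index of the previous screen of dates, or -1 if already at the first screen. */
    static int prevStart(int start, int perScreen) {
        return start > 0 ? Math.max(0, start - perScreen) : -1;
    }

    public static void main(String[] args) {
        int totalDates = 10; // number of distinct dates matching the query
        int perScreen = 4;   // dates shown per screen, derived from the UI
        System.out.println(nextStart(0, perScreen, totalDates)); // 4
        System.out.println(nextStart(8, perScreen, totalDates)); // -1
        System.out.println(prevStart(8, perScreen));             // 4
        System.out.println(prevStart(0, perScreen));             // -1
    }
}
```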
Any thoughts appreciated.
--
Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Grouping...
Posted by Ian Lea <ia...@gmail.com>.
I'm not aware of a particular FAQ on this.
There is something called bobo-browse - "Faceted search library based
on Lucene - Google ...
Bobo Browse is an information retrieval technology that provides
navigational browsing into a semi-structured dataset. Beyond the
result set from queries ...".
http://sbdevel.wordpress.com/2010/09/24/sorting-faceting-index-lookup/
and https://issues.apache.org/jira/browse/LUCENE-2369 sound
interesting.
--
Ian.
On Fri, Mar 25, 2011 at 10:14 AM, Dawn Zoë Raison <da...@digitorial.co.uk> wrote:
>
> On 23/03/2011 17:55, Grant Ingersoll wrote:
>>
>> Have you looked at Solr and date faceting capabilities? Also, it has
>> result grouping, but I think you are just describing faceting/filtering.
>
> SOLR is not an option; we already have the index (>2 million pages, some
> with 100,000 terms).
> What I'm looking to do is to create some new ways to view the data.
>
> Is there a good FAQ on faceting/filtering I can peruse?
>
> Ta.
> --
>
> Rgds.
> *Dawn Raison*
> Technical Director, Digitorial Ltd.
>
>
>
Re: Grouping...
Posted by Dawn Zoë Raison <da...@digitorial.co.uk>.
On 23/03/2011 17:55, Grant Ingersoll wrote:
> Have you looked at Solr and date faceting capabilities? Also, it has result grouping, but I think you are just describing faceting/filtering.
SOLR is not an option; we already have the index (>2 million pages,
some with 100,000 terms).
What I'm looking to do is to create some new ways to view the data.
Is there a good FAQ on faceting/filtering I can peruse?
Ta.
--
Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.
Re: Grouping...
Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 22, 2011, at 6:43 AM, Dawn Zoë Raison wrote:
> Hi Folks,
>
> Before I run off and reinvent the wheel here - has anyone done any form of result grouping with lucene?
>
> My use case looks something like this:
> Newspaper pages are stored as documents in the lucene index.
> I need to list the newspapers that match my criteria in date order, so that I can then in a subsequent search enumerate the first n pages from each. n is derived from the UI - in this case the screen width the user has available.
> Ideally I'd want to pull all papers for a given date - so a way to pull a result set that identifies a set of dates that have pages stored against them would be ideal. It seems to me that the only way to do this at present would be to define a custom collector and aggregate such a result set on the fly?
>
> My reason for wanting to group is so that I can easily compute the next/previous start indexes as the user browses through the timeline. If I have to include the (variable) page count each time it gets convoluted. More so since some pages may be missing from each paper.
>
> Any thoughts appreciated.
Have you looked at Solr and date faceting capabilities? Also, it has result grouping, but I think you are just describing faceting/filtering.
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search