Posted to java-user@lucene.apache.org by Sawan Sharma <ss...@chambal.com> on 2007/05/17 11:03:02 UTC

Group the search results by a given field

Hi All,

I was wondering - is it possible to search and group the results by a
given field?

For example, I have an index with several million records. Most of
them are different Features of the same ID.

I'd love to be able to do something like groupby=ID
in the results, and provide the ID as a clickable link to see
all the Features of that ID.

I have used the HitCollector class to accomplish this. In the collect method I
have used the following algorithm:

Collect(doc_id)
{
    id = Searcher.Doc(doc_id).get(ID)    // loads the stored document for every hit
    if id is not yet a key in the hash table then
        add id as a new key and assign value = 1
    else
        increment the value stored under id by 1
}

But the cost depends on the hit count: the more hits a query returns, the
longer the collection takes, and search performance degrades.

How can this be done with the best performance?

Any ideas?

Sawan

Re: Group the search results by a given field

Posted by Erick Erickson <er...@gmail.com>.
There has been significant discussion on this topic (way more than
I can remember clearly) on the mailing list, and as I remember it's
been referred to as "facet" or "faceted". I think you would get a lot
of info by searching for these terms at...

http://www.gossamer-threads.com/lists/lucene/java-user/

Best
Erick
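
For illustration, the per-hit counting from the question can be sketched so that it never loads a stored document inside the collect loop. This is only a minimal sketch under stated assumptions, not the approach recommended in the thread: it assumes the ID value of every document has been loaded once into an in-memory array indexed by document number (in Lucene releases of that period, FieldCache.DEFAULT.getStrings(reader, "ID") is, as far as I know, one way to obtain such an array, but check that against your version). The class name GroupCounts, the method countByField, and the sample data are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupCounts {

    /**
     * Counts hits per group value.
     *
     * @param hitDocIds document numbers of the matching hits (hypothetical input)
     * @param idByDoc   idByDoc[doc] is the ID field value of document doc,
     *                  loaded once up front (assumption: it fits in memory)
     */
    static Map<String, Integer> countByField(int[] hitDocIds, String[] idByDoc) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int doc : hitDocIds) {
            String id = idByDoc[doc];          // array lookup, no stored-document fetch
            if (id == null) {
                continue;                      // document has no ID value
            }
            counts.merge(id, 1, Integer::sum); // insert 1 or increment the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: five documents, four of them match the query.
        String[] idByDoc = {"A", "A", "B", "A", "C"};
        int[] hits = {0, 1, 2, 4};
        System.out.println(countByField(hits, idByDoc)); // prints {A=2, B=1, C=1}
    }
}
```

The work per hit is then one array read and one hash-map update, so the total still grows with the hit count but with a much smaller constant than fetching a stored document for each hit. The trade-off is the memory needed to hold one value per document in the index.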

