Posted to java-user@lucene.apache.org by John john <ze...@yahoo.fr> on 2006/07/24 20:49:34 UTC

index articles with groups

Hello, 
 
 I'm pretty new to Lucene, so I hope my question is not stupid :)
 
 I'd like to index articles, but I want each of them to belong to a group,
 such as:
 
 article1, article2 and article3 are in group1
 article4 and article5 are in group2
 
 Then if I search for a word which is present in article1 and article2, I'd like to retrieve only one result because they are in the same group.
 
 Is it possible?
 
 Thanks
 

 		

RE : Re: index articles with groups

Posted by John john <ze...@yahoo.fr>.
Here are more details, because it seems that I did not give enough information :)
 
 I want to index my message board, where each topic contains several posts. My idea was to index each post as a document with 3 fields (ID, title, post_content).
 
 That way I can search within each post and link to it using the title of its topic. However, I have a problem with this method.
 Imagine I have a topic named "Linux better than Windows?" which contains 10 posts. If someone searches for topics that contain "linux", I want to retrieve only one post from that topic. Do you see what I mean?

karl wettin <ka...@gmail.com> wrote: On Mon, 2006-07-24 at 20:49 +0200, John john wrote:

>  article1, article2 and article3 are in the group1
>  article4 and article5 are in the group2
>  
>  Then if I search for a word which is present in article1 and article2,
> I'd like to retrieve only one result because they are in the same
> group.

This sounds very suspicious. My gut tells me you are attacking your
problem -- whatever it might be -- the wrong way.

But OK, which one of the documents would you want as a result? Any of
them? Create a HitCollector, and if the group field value of the current
document has already been collected, ignore that document.




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



 		

Re: index articles with groups

Posted by karl wettin <ka...@gmail.com>.
On Mon, 2006-07-24 at 20:49 +0200, John john wrote:

>  article1, article2 and article3 are in the group1
>  article4 and article5 are in the group2
>  
>  Then if I search for a word which is present in article1 and article2,
> I'd like to retrieve only one result because they are in the same
> group.

This sounds very suspicious. My gut tells me you are attacking your
problem -- whatever it might be -- the wrong way.

But OK, which one of the documents would you want as a result? Any of
them? Create a HitCollector, and if the group field value of the current
document has already been collected, ignore that document.
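
The deduplication idea can be sketched without any Lucene dependency. The class below is a plain-Java stand-in, not Lucene code: it only mirrors the shape of HitCollector.collect(int doc, float score). The class name, the groupOfDoc array and the keptDocs() accessor are hypothetical, and the sketch assumes the group ID of every document has been loaded into memory beforehand.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for a collector; all names here are hypothetical.
class FirstPerGroupCollector {
    // groupOfDoc[docId] holds the group ID of that document (assumed preloaded).
    private final String[] groupOfDoc;
    private final Set<String> seenGroups = new HashSet<>();
    private final List<Integer> keptDocs = new ArrayList<>();

    FirstPerGroupCollector(String[] groupOfDoc) {
        this.groupOfDoc = groupOfDoc;
    }

    // Same parameter shape as HitCollector.collect(int doc, float score).
    void collect(int doc, float score) {
        String group = groupOfDoc[doc];
        // Set.add returns false when the group was already collected,
        // in which case the current document is ignored.
        if (seenGroups.add(group)) {
            keptDocs.add(doc);
        }
    }

    // Document IDs kept so far, one per group, in arrival order.
    List<Integer> keptDocs() {
        return keptDocs;
    }
}
```

In a real collector the same check would sit inside collect(), with the group lookup backed by whatever per-document cache of the group field is available.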






Re: RE : Re: index articles with groups

Posted by Erick Erickson <er...@gmail.com>.
I think you're back to Karl's suggestion. Implement a HitCollector and
ignore all hits on a group ID after the first one. You even get the most
relevant article in the group that way <G>...

Best
Erick
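
Whether the first hit seen for a group is also the most relevant one depends on the order in which hits reach the collector. A variant that does not rely on that order keeps the highest-scoring document per group instead. This is again a plain-Java sketch with hypothetical names (BestPerGroupCollector, groupOfDoc, bestDocPerGroup), assuming the group ID of each document is already in memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for a collector; all names here are hypothetical.
class BestPerGroupCollector {
    // groupOfDoc[docId] holds the group ID of that document (assumed preloaded).
    private final String[] groupOfDoc;
    private final Map<String, Integer> bestDoc = new LinkedHashMap<>();
    private final Map<String, Float> bestScore = new LinkedHashMap<>();

    BestPerGroupCollector(String[] groupOfDoc) {
        this.groupOfDoc = groupOfDoc;
    }

    // Same parameter shape as HitCollector.collect(int doc, float score).
    void collect(int doc, float score) {
        String group = groupOfDoc[doc];
        Float current = bestScore.get(group);
        // Replace the stored hit only when this one scores strictly higher.
        if (current == null || score > current) {
            bestScore.put(group, score);
            bestDoc.put(group, doc);
        }
    }

    // Maps each group ID to the best-scoring document ID seen so far.
    Map<String, Integer> bestDocPerGroup() {
        return bestDoc;
    }
}
```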

Re: RE : Re: index articles with groups

Posted by Chris Hostetter <ho...@fucit.org>.
: Unfortunately this is not that easy, because I must be able to retrieve
: only one article, and if I index all the content in one document then the
: whole document will be retrieved instead of the single article.

I didn't say you had to *only* index the article contents in "group"
documents ... you can still index "article"-based documents, and use a
Filter to decide whether you are interested in getting back "groups" or
"articles".

If you always want to display article-type information, even when
displaying "group" results, you have to decide which article in each group
is the "flagship" article, and store its data in the group's stored fields.


-Hoss
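
The idea of keeping both kinds of document in one index and filtering on the kind can be sketched in plain Java. Nothing below is Lucene API: TypedDoc, its "type" field and the search method are hypothetical, and the substring test merely stands in for a real query. In Lucene itself the type would be an untokenized field and the restriction would be applied through a Filter on that field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory document with a "type" marker field.
class TypedDoc {
    final String type; // "group" or "article"
    final String id;
    final String text;

    TypedDoc(String type, String id, String text) {
        this.type = type;
        this.id = id;
        this.text = text;
    }
}

class TypeFilterSketch {
    // Returns the IDs of documents of the wanted type whose text contains the term.
    // The case-insensitive substring match is a stand-in for a real query.
    static List<String> search(List<TypedDoc> index, String term, String wantedType) {
        List<String> ids = new ArrayList<>();
        String needle = term.toLowerCase();
        for (TypedDoc d : index) {
            if (d.type.equals(wantedType) && d.text.toLowerCase().contains(needle)) {
                ids.add(d.id);
            }
        }
        return ids;
    }
}
```

Searching with wantedType "group" then yields one result per group, while "article" yields the individual articles from the same index.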




RE : Re: index articles with groups

Posted by John john <ze...@yahoo.fr>.
Unfortunately this is not that easy, because I must be able to retrieve only one article, and if I index all the content in one document then the whole document will be retrieved instead of the single article.

Chris Hostetter <ho...@fucit.org> wrote: 
:  Then if I search for a word which is present in article1 and article2,
: I'd like to retrieve only one result because they are in the same group.

If you only want one result back per group, then odds are you want one
document per group -- and index the text from all of the articles in that
group in that single document.



-Hoss





 		

Re: index articles with groups

Posted by Chris Hostetter <ho...@fucit.org>.
:  Then if I search for a word which is present in article1 and article2,
: I'd like to retrieve only one result because they are in the same group.

If you only want one result back per group, then odds are you want one
document per group -- and index the text from all of the articles in that
group in that single document.



-Hoss
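
Building one document per group amounts to concatenating the text of its articles before indexing. The sketch below shows only that preparation step in plain Java. GroupDocBuilder and the {groupId, articleText} input layout are hypothetical, and feeding the resulting text into an index is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GroupDocBuilder {
    // Each input row is {groupId, articleText} (a hypothetical layout).
    // Returns one combined text per group, preserving first-seen group order.
    static Map<String, String> buildGroupDocs(String[][] articles) {
        Map<String, StringBuilder> combined = new LinkedHashMap<>();
        for (String[] article : articles) {
            StringBuilder sb = combined.get(article[0]);
            if (sb == null) {
                sb = new StringBuilder();
                combined.put(article[0], sb);
            } else {
                sb.append('\n'); // separate the texts of consecutive articles
            }
            sb.append(article[1]);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : combined.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }
}
```

A search over these combined texts returns at most one hit per group, at the cost of no longer knowing which article inside the group matched.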

