Posted to users@jackrabbit.apache.org by alartin <al...@gmail.com> on 2007/04/05 07:50:07 UTC

How to find similar nodes(somewhat like google similar pages)?

Hi all,
Given a node with text content, I want to find nodes that have similar
text content. It is somewhat like finding similar pages using Google (just
think of a page as a kind of node, with its content as a node property). I
think Lucene supports this via term vectors, and I wonder whether a
Jackrabbit query can do it or not. Many thanks.


Re: How to find similar nodes(somewhat like google similar pages)?

Posted by Marcel Reutegger <ma...@gmx.net>.
Christoph Kiehl wrote:
> Right now Jackrabbit doesn't support this, probably mainly because the 
> JCR spec doesn't define such a thing.

that's correct.

> I think Marcel recently added search term highlighting, which isn't 
> defined by the JCR spec either. Maybe one could add a custom XPath 
> function? But I'm not quite sure if this is functionality a content 
> repository should be responsible for.

well, it could be very handy to have such a function. at least the index already 
contains all the information that is required for similarity search 
functionality. If there's more interest in such a feature, please file an 
enhancement request in JIRA.
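
To illustrate why term information in the index would be enough, here is a minimal sketch of term-vector similarity in plain Java. It is not Jackrabbit or Lucene API: the class name TermVectorSimilarity and its methods are hypothetical, the tokenizer is a naive split on non-alphanumeric characters, and a real index would typically also weight terms (for example by inverse document frequency) rather than use raw counts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of Jackrabbit or Lucene).
class TermVectorSimilarity {

    // Build a term-frequency vector: lower-cased token -> number of occurrences.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity of two term vectors; 0.0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    public static void main(String[] args) {
        // Made-up sample texts standing in for the text properties of three nodes.
        Map<String, Integer> source = termVector("Jackrabbit is a content repository for Java");
        Map<String, Integer> close = termVector("A Java content repository called Jackrabbit");
        Map<String, Integer> far = termVector("Recipes for carrot cake");
        System.out.println("close: " + cosine(source, close));
        System.out.println("far:   " + cosine(source, far));
    }
}
```

Ranking candidate nodes by this score against the source node's text would give a rough "similar nodes" list; the open question in this thread is how to expose something like that through the query layer.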

regards
  marcel

Re: How to find similar nodes(somewhat like google similar pages)?

Posted by Christoph Kiehl <ch...@sulu3000.de>.
alartin wrote:
> Hi all,
> Given a node with text content, I want to find nodes that have similar
> text content. It is somewhat like finding similar pages using Google (just
> think of a page as a kind of node, with its content as a node property). I
> think Lucene supports this via term vectors, and I wonder whether a
> Jackrabbit query can do it or not. Many thanks.

Right now Jackrabbit doesn't support this, probably mainly because the JCR spec 
doesn't define such a thing.
I think Marcel recently added search term highlighting, which isn't defined by 
the JCR spec either. Maybe one could add a custom XPath function? But I'm not 
quite sure if this is functionality a content repository should be responsible for.

Cheers,
Chris