Posted to java-user@lucene.apache.org by Lokeya <lo...@gmail.com> on 2007/04/03 04:14:29 UTC

How to calculate centroid from HITS?

Hi All,

I ran a query and got back a Hits object, which is a collection of
documents. I want to find the centroid of these documents, where the
centroid is the top N (for example, 35) most common terms across all the
documents in the Hits object.

Is there any API in Lucene for this?

Thanks in Advance.
-- 
View this message in context: http://www.nabble.com/How-to-calculate-centroid-from-HITS--tf3509432.html#a9802563
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to calculate centroid from HITS?

Posted by Lokeya <lo...@gmail.com>.
I found that a Java API for this already exists. LucQE provides:

QueryExpansion.expandQuery(java.lang.String queryStr,
org.apache.lucene.search.Hits hits, java.util.Properties prop)

which returns the expanded query (it will contain the centroid).



Grant Ingersoll-6 wrote:
> 
> You could use Term Vectors (TVs) to do this, but I don't know of any  
> existing code for it.  Might be a good contrib module, though.   
> Search this list or see Lucene In Action or I have some TV sample  
> code at http://www.cnlp.org/apachecon2005/
> 
> You might also check the Carrot2 project, which has a number of  
> clustering algorithms and some Lucene support, although I don't know  
> if it does specifically what you want.

-- 
View this message in context: http://www.nabble.com/How-to-calculate-centroid-from-HITS--tf3509432.html#a10091264
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: How to calculate centroid from HITS?

Posted by Grant Ingersoll <gs...@apache.org>.
You could use Term Vectors (TVs) to do this, but I don't know of any
existing code for it. It might make a good contrib module, though.
Search this list, see Lucene in Action, or look at the TV sample
code I have at http://www.cnlp.org/apachecon2005/

You might also check the Carrot2 project, which has a number of  
clustering algorithms and some Lucene support, although I don't know  
if it does specifically what you want.
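
For illustration only, the term-vector approach could be sketched roughly
as below. This is a minimal sketch under assumptions, not existing Lucene
or contrib code: the CentroidSketch and DocTerms names are hypothetical,
and the per-document term and frequency arrays are assumed to have been
read from stored term vectors already (in Lucene 2.x, roughly
IndexReader.getTermFreqVector(docId, field) followed by getTerms() and
getTermFrequencies(), for each hits.id(i)). It also assumes that "most
common" means the highest total frequency summed over all hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: picks the N most frequent terms across a set of documents. */
final class CentroidSketch {

    /** One document's term vector as parallel arrays (assumed shape, not a Lucene class). */
    static final class DocTerms {
        final String[] terms;
        final int[] freqs;

        DocTerms(String[] terms, int[] freqs) {
            if (terms.length != freqs.length) {
                throw new IllegalArgumentException("terms and freqs must have the same length");
            }
            this.terms = terms;
            this.freqs = freqs;
        }
    }

    /** Sums term frequencies over all documents and returns the top n terms. */
    static List<String> topTerms(List<DocTerms> docs, int n) {
        Map<String, Integer> totals = new HashMap<>();
        for (DocTerms doc : docs) {
            for (int i = 0; i < doc.terms.length; i++) {
                totals.merge(doc.terms[i], doc.freqs[i], Integer::sum);
            }
        }

        // Sort by total count, highest first; break ties alphabetically so the result is stable.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(totals.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up term vectors standing in for two hits.
        List<DocTerms> docs = new ArrayList<>();
        docs.add(new DocTerms(new String[] {"lucene", "search", "index"}, new int[] {3, 1, 1}));
        docs.add(new DocTerms(new String[] {"lucene", "search", "query"}, new int[] {2, 2, 1}));
        System.out.println(topTerms(docs, 2)); // prints [lucene, search]
    }
}
```

To rank by the number of hits that contain a term rather than by total
occurrences, add 1 per document in the merge call instead of the
in-document frequency. The field must have been indexed with term vectors
enabled for the Lucene calls mentioned above to return anything.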


--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ


