Posted to user@mahout.apache.org by shambhusingh <sh...@gmail.com> on 2011/03/17 07:09:19 UTC

clustering using n-grams

I have created a Lucene index for my database content and added the fields
documentId and content using doc.add. Now I want to use Mahout's lucene.vector
to create the sequence file using n-grams, and then run Mahout clustering on
top of that.
How do I use n-grams instead of TF-IDF when generating vectors from the Lucene
index? Please help.

Or, how do I create clusters using n-grams instead of TF-IDF with a Lucene index?

--
View this message in context: http://lucene.472066.n3.nabble.com/clustering-using-n-grams-tp2692426p2692426.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: clustering using n-grams

Posted by vineet yadav <vi...@gmail.com>.
Hi Shambhu,
Check out Grant's article on the Lucid Imagination blog:
http://www.lucidimagination.com/blog/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/
Thanks
Vineet Yadav
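
A note on the question itself: n-grams and TF-IDF are usually not alternatives. N-grams decide which terms exist (single words, word pairs, and so on), while TF-IDF decides how each term is weighted in the vector, so the two can be combined. The following is only a minimal sketch of that idea in plain Python, not Mahout's or Lucene's implementation; the function names word_ngrams and tfidf_vectors and the sample documents are hypothetical and used for illustration only.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of size 1..n_max (hypothetical helper)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_max=2):
    """Build sparse TF-IDF vectors whose terms are word n-grams."""
    term_lists = [word_ngrams(d, n_max) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        tf = Counter(terms)
        # Plain tf * log(N / df) weighting; real toolkits use variants of this.
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


# Illustrative documents only; not taken from the thread.
docs = ["apache mahout clustering", "apache lucene index"]
vectors = tfidf_vectors(docs)
```

Here the bigram "apache mahout" becomes a term of its own next to the unigrams, and each term still receives a TF-IDF weight. Vectors of this kind are what a clustering step would then consume.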