Posted to java-user@lucene.apache.org by Neha Gupta <gu...@gmail.com> on 2009/06/19 04:15:12 UTC

n-gram word support

Hey,

I was wondering if there is a way to read the index and generate word
n-grams for a document in Lucene? I am quite new to it and am using PyLucene.

Thanks,
Neha

Re: n-gram word support

Posted by Sameer Maggon <ma...@gmail.com>.
Yeah, look at the spellcheck component in Solr. They are doing something
similar.

Sameer.

-- 
http://www.productification.com
(310) 266-6587 (cell)

Re: n-gram word support

Posted by Grant Ingersoll <gs...@apache.org>.
The contrib/analyzers module has several n-gram-based tokenizers and
token filters.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: n-gram word support

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Here is one that builds word n-grams (shingles), ShingleMatrixFilter:

http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/analysis/shingle/ShingleMatrixFilter.html
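
For illustration only, here is a minimal pure-Python sketch of the kind of
output a word n-gram (shingle) filter produces. It does not call the Lucene
or PyLucene API; the function name word_ngrams and its parameters are
hypothetical, and plain whitespace splitting stands in for a real analyzer.

```python
def word_ngrams(tokens, min_n=2, max_n=2):
    """Return word n-grams (shingles) of sizes min_n..max_n.

    Hypothetical helper for illustration; not part of Lucene or PyLucene.
    Shingles are emitted in order of their starting token, shortest first.
    """
    if min_n < 1 or max_n < min_n:
        raise ValueError("need 1 <= min_n <= max_n")
    shingles = []
    for start in range(len(tokens)):
        for n in range(min_n, max_n + 1):
            # Skip sizes that would run past the end of the token list.
            if start + n <= len(tokens):
                shingles.append(" ".join(tokens[start:start + n]))
    return shingles


if __name__ == "__main__":
    # Assumption: whitespace tokenization stands in for a real analyzer.
    tokens = "please divide this sentence".split()
    for shingle in word_ngrams(tokens, 2, 3):
        print(shingle)
```

In a real index you would get the tokens from an analyzer or from stored
term vectors rather than from str.split().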

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


