Posted to java-user@lucene.apache.org by Jawahar Lal <jl...@chambal.com> on 2010/12/22 05:31:59 UTC
Re: Can I generate two word phrases from Lucene Index
On 21 December 2010 11:09, Jawahar Lal <jl...@chambal.com> wrote:
> Hi,
>
> I indexed the content of some web pages. I want to generate the following
> information from the index.
>
>
> 1. All the keywords in the index and their frequencies.
> I can get this using *indexReader.terms()*.
> 2. I also want to generate TWO Word Phrases from the index.
>
>
> For example:-
>
> Document:- This is about how to index contents using lucene.
>
> After Index:-
> 1. *Getting all keywords ==>* this, is, about, how, to, index, contents,
> using, lucene (supposing no stop-word filtering is done)
> 2. *Getting Two Word Phrase ==>* index contents, using lucene etc...
>
> How can I get the second result from the index?
>
> Thanks
>
Re: Can I generate two word phrases from Lucene Index
Posted by Ahmet Arslan <io...@yahoo.com>.
> > 2. *Getting Two Word Phrase ==>* index contents,
> > using lucene etc...
> >
You can add ShingleFilter to your analyzer chain.
http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/shingle/ShingleFilter.html
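To make concrete what two-word shingles look like for the example sentence, here is a minimal plain-Java sketch. It does not use Lucene at all: the class ShingleSketch and its methods are hypothetical names introduced only for illustration, and the tokenization (lowercasing and splitting on whitespace, with no punctuation or stop-word handling) is an assumption rather than what any particular analyzer does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: mimics two-word shingles in plain Java.
class ShingleSketch {

    // Returns every pair of adjacent tokens, joined by a single space.
    // Assumption: tokens are lowercased and separated by whitespace only.
    static List<String> twoWordShingles(String text) {
        List<String> shingles = new ArrayList<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return shingles;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            shingles.add(tokens[i] + " " + tokens[i + 1]);
        }
        return shingles;
    }

    // Counts how often each two-word shingle occurs across all documents,
    // keeping the order in which shingles were first seen.
    static Map<String, Integer> shingleCounts(List<String> documents) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String doc : documents) {
            for (String shingle : twoWordShingles(doc)) {
                counts.merge(shingle, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The example document from the question, without the trailing period.
        System.out.println(
            twoWordShingles("This is about how to index contents using lucene"));
    }
}
```

With Lucene itself, the idea would presumably be to wrap the analyzer's token stream in a ShingleFilter before indexing, so that the two-word shingles are stored as terms in a field and can then be enumerated with their frequencies in the same way as the single keywords, via indexReader.terms().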