Posted to java-user@lucene.apache.org by Jawahar Lal <jl...@chambal.com> on 2010/12/22 05:31:59 UTC

Re: Can I generate two word phrases from Lucene Index

On 21 December 2010 11:09, Jawahar Lal <jl...@chambal.com> wrote:

> Hi,
>
>  I have indexed the content of some web pages. I want to generate the
> following information from the index.
>
>
>    1. All the keywords in the index, and their frequencies.
>    I can get this using indexReader.terms()
>    2. I also want to generate two-word phrases from the index.
>
>
>  For example:-
>
> Document:- This is about how to index contents using lucene.
>
> After Index:-
> 1. *Getting all keywords ==>* this, is, about, how, to, index, contents,
> using, lucene (assuming no stop-word filtering is done)
> 2. *Getting two-word phrases ==>* index contents, using lucene, etc.
>
> How can I get the second result from the index?
>
> Thanks
>

Re: Can I generate two word phrases from Lucene Index

Posted by Ahmet Arslan <io...@yahoo.com>.
> > 2. *Getting Two Word Phrase ==>* index contents, using lucene etc...
> >

You can add ShingleFilter to your analyzer chain.

http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/shingle/ShingleFilter.html
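
To make the idea concrete: a shingle is a run of adjacent tokens joined into one term, so with a shingle size of two, every pair of neighbouring words becomes a term of its own. ShingleFilter does this inside the analyzer chain at index time, which means the documents have to be re-indexed before the two-word terms show up in indexReader.terms(). The sketch below is plain Java and does not call Lucene at all; the class and method names (ShingleSketch, tokenize, bigrams, countBigrams) are made up for illustration, and the tokenizer is only a rough stand-in for a real analyzer. It just shows roughly which two-word terms you could expect for the example sentence from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of two-word shingles. This is NOT Lucene code;
// all names here are hypothetical and exist only for this example.
class ShingleSketch {

    // Crude stand-in for an analyzer: lowercase, split on anything that is
    // not an ASCII letter or digit, and drop empty pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Join each token with its right-hand neighbour to form a two-word shingle.
    static List<String> bigrams(List<String> tokens) {
        List<String> shingles = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            shingles.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return shingles;
    }

    // Count how often each two-word shingle occurs, sorted by shingle text.
    static Map<String, Integer> countBigrams(List<String> tokens) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String shingle : bigrams(tokens)) {
            Integer old = counts.get(shingle);
            counts.put(shingle, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String doc = "This is about how to index contents using lucene.";
        for (Map.Entry<String, Integer> e : countBigrams(tokenize(doc)).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

For the example sentence this yields pairs such as "index contents" and "using lucene", each with a count of 1. Inside Lucene the same pairs would be stored as ordinary terms, so the existing term enumeration would return them along with their frequencies.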


      
