You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Mark Stiner <fa...@yahoo.com> on 2007/03/28 21:13:24 UTC

Using Lucene to apply LSI

                    Hi,

I have a research project where I want to implement LSI technique. The scenario is something as follows.

Search
the news sites for the locally event based news. Cluster the similar
news items together. For example hurricane in New York city.

We want to apply basic LSI as follows

   -Key word extraction
   -Filter using stop list
   -Stemming
   -Option: Synonym detection
   - Frequency Matrix
   - SVD Decomposition

  -Cluster related News items

The input data will be from the web based news sites such as Yahoo, google etc.

How can we use Lucene to achieve this. Please provide me the steps.

Thanks.

Faikeeyes
                    



 
____________________________________________________________________________________
Now that's room service!  Choose from over 150,000 hotels
in 45,000 destinations on Yahoo! Travel to find your fit.
http://farechase.yahoo.com/promo-generic-14795097

Re: Using Lucene to apply LSI

Posted by José Ramón Pérez Agüera <jo...@gmail.com>.
you need to use JAMA combined with Lucene, using the vectors that are
builded by lucene to compute SVD with JAMA

http://math.nist.gov/javanumerics/jama/

Best

jose

On 3/28/07, Mark Stiner <fa...@yahoo.com> wrote:
>
>
>                     Hi,
>
> I have a research project where I want to implement LSI technique. The
> scenario is something as follows.
>
> Search
> the news sites for the locally event based news. Cluster the similar
> news items together. For example hurricane in New York city.
>
> We want to apply basic LSI as follows
>
>    -Key word extraction
>    -Filter using stop list
>    -Stemming
>    -Option: Synonym detection
>    - Frequency Matrix
>    - SVD Decomposition
>
>   -Cluster related News items
>
> The input data will be from the web based news sites such as Yahoo, google
> etc.
>
> How can we use Lucene to achieve this. Please provide me the steps.
>
> Thanks.
>
> Faikeeyes
>
>
>
>
>
>
> ____________________________________________________________________________________
> Now that's room service!  Choose from over 150,000 hotels
> in 45,000 destinations on Yahoo! Travel to find your fit.
> http://farechase.yahoo.com/promo-generic-14795097




-- 
José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid