Posted to java-user@lucene.apache.org by srinivasa raghavan <rg...@yahoo.com> on 2004/08/23 08:04:34 UTC

Lucene for Indian Languages

Hi all,

 Does the Lucene API support Indian languages? I know
that Lucene has stemmers and filters for German and
Russian. I would like to know whether stemmers and
filters are available or being developed for Indian
languages.

Thanks,
Raghavan.




		

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Lucene for Indian Languages

Posted by Praveen Peddi <pp...@contextmedia.com>.
In fact, CJKAnalyzer also works well with Indian languages. Because CJKAnalyzer
treats multi-byte characters as a special case, it works with most Asian
multi-byte scripts. I introduced CJKAnalyzer for Japanese text search, and we
also tested it with Hindi and Telugu. All our search test cases passed.
Give CJKAnalyzer a try. You will find it a better analyzer than
StandardAnalyzer (for any Asian language).

Praveen
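
Note on the CJKAnalyzer suggestion above: as far as I know, CJK-style analysis indexes each run of non-Latin characters as overlapping two-character tokens (bigrams), which is why it needs no language-specific rules. The following is a minimal sketch of that idea in plain Java, not Lucene's actual CJKAnalyzer code. The class name BigramSketch, the isNonLatin test (anything outside the Basic Latin block) and the whitespace splitting are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative overlapping-bigram tokenizer; hypothetical, not a Lucene class. */
class BigramSketch {

    /** Assumption: any code point outside Basic Latin counts as "multi-byte". */
    static boolean isNonLatin(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) != Character.UnicodeBlock.BASIC_LATIN;
    }

    /** Splits on whitespace, then emits bigrams for runs containing non-Latin characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String run : text.trim().split("\\s+")) {
            if (run.isEmpty()) {
                continue;
            }
            int[] cps = run.codePoints().toArray();
            boolean nonLatin = false;
            for (int cp : cps) {
                if (isNonLatin(cp)) {
                    nonLatin = true;
                    break;
                }
            }
            if (!nonLatin) {
                // Plain Latin run: keep the whole word, lowercased.
                tokens.add(run.toLowerCase(Locale.ROOT));
            } else if (cps.length < 2) {
                // A single character cannot form a bigram; keep it as is.
                tokens.add(run);
            } else {
                // Overlapping pairs: positions (0,1), (1,2), (2,3), ...
                for (int i = 0; i + 1 < cps.length; i++) {
                    tokens.add(new String(cps, i, 2));
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The Hindi word "hindi" written as Unicode escapes, followed by a Latin word.
        System.out.println(tokenize("\u0939\u093f\u0902\u0926\u0940 Text"));
    }
}
```

With this sketch, the Hindi word हिंदी (five code points) yields four bigrams. Note that the bigrams cut across vowel signs, so this is character matching rather than linguistic analysis.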

----- Original Message ----- 
From: "Satish Kagathare" <sa...@it.iitb.ac.in>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Monday, August 23, 2004 9:20 AM
Subject: Re: Lucene for Indian Languages


>
> Hi Srinivasa,
>
> Use StandardAnalyzer for indexing and for parsing queries on Indian-language
> documents. It will work. Right now we are searching Hindi and Marathi text,
> but without specific stemmers and filters. We are planning to develop a
> Marathi morphological analyzer.
>
> Thanks,
> Satish.


Re: Lucene for Indian Languages

Posted by srinivasa raghavan <rg...@yahoo.com>.
Hi Satish,

 Morphological analyzers for Hindi, Marathi, Telugu
and Kannada are already available. Please visit

http://ltrc.iiit.net/showfile.php?filename=onlineServices/morph/index.htm

 I think you need not develop one from scratch. I
hope this will solve your problem for Marathi to some
extent.

 By the way, are you planning to develop the
morphological analyzer for Lucene or for some other
purpose? If you plan to integrate it with the Lucene
API, can you share your ideas for implementing it?

Thanks 
Raghavan



Re: Lucene for Indian Languages

Posted by srinivasa raghavan <rg...@yahoo.com>.
Hi Satish,

 Thank you, Satish, for the pointers.

 Actually, I am able to search Indian-language data
by storing the content in the index in ISCII encoding.
When I search, the search words are also converted to
ISCII before they are run against the Lucene index.
It works quite well, but I was just wondering whether
any stemmers and filters are available.

 How are you searching Hindi and Marathi? In which
encoding are you storing the data? Can you give me
some details?

Thanks,
Raghavan.
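
Note on the approach above: the key point is that indexed text and query words go through the same conversion before they meet in the index. The thread does not show the ISCII conversion code, so the sketch below swaps in Unicode NFC normalization (java.text.Normalizer) as a stand-in for that step. The class SameFormIndexSketch and its tiny in-memory index are hypothetical and are not Lucene code.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative index that converts documents and queries to one canonical form. */
class SameFormIndexSketch {

    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Stand-in for the ISCII conversion: here, Unicode NFC normalization. */
    static String canonical(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Indexes whitespace-separated terms after converting them to the canonical form. */
    void add(int docId, String text) {
        for (String term : canonical(text).trim().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
            }
        }
    }

    /** Converts the query word the same way before looking it up. */
    Set<Integer> search(String word) {
        return postings.getOrDefault(canonical(word.trim()), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        SameFormIndexSketch index = new SameFormIndexSketch();
        // Document uses the precomposed letter U+0958.
        index.add(1, "\u0958\u0932\u092e");
        // Query spells the same letter as U+0915 followed by the nukta U+093C.
        System.out.println(index.search("\u0915\u093c\u0932\u092e"));
    }
}
```

Here a document containing the precomposed letter U+0958 is found by a query typed as U+0915 followed by U+093C, because both sides are normalized first. Whatever conversion is used, applying it on only one side would make such lookups miss.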

 



Re: Lucene for Indian Languages

Posted by Satish Kagathare <sa...@it.iitb.ac.in>.
Hi Srinivasa,

Use StandardAnalyzer for indexing and for parsing queries on Indian-language
documents. It will work. Right now we are searching Hindi and Marathi text,
but without specific stemmers and filters. We are planning to develop a
Marathi morphological analyzer.

Thanks,
Satish.
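
Note on searching without stemmers: without a stemming filter only exact word forms match, so a query for लड़का would not find a document that contains only लड़कों. A stemming filter reduces inflected forms to a shared stem before indexing and before querying. The following is a minimal sketch of that idea, assuming a hypothetical three-entry Hindi suffix list chosen just for this example. It is not a linguistically complete stemmer, not the Marathi analyzer mentioned above, and not a Lucene class.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative suffix-stripping stemmer; the suffix list is a made-up sample. */
class SuffixStripSketch {

    // Hypothetical suffixes, longest first, written as Unicode escapes.
    static final List<String> SUFFIXES = Arrays.asList(
            "\u094b\u0902", // vowel sign O + anusvara
            "\u0947",       // vowel sign E
            "\u093e"        // vowel sign AA
    );

    /** Removes the first matching suffix, leaving at least two chars of stem. */
    static String stem(String word) {
        for (String suffix : SUFFIXES) {
            int stemLength = word.length() - suffix.length();
            if (stemLength >= 2 && word.endsWith(suffix)) {
                return word.substring(0, stemLength);
            }
        }
        return word; // no known suffix: leave the word unchanged
    }

    public static void main(String[] args) {
        // Three inflected forms of the Hindi word for "boy".
        String[] forms = {
            "\u0932\u0921\u093c\u0915\u093e",
            "\u0932\u0921\u093c\u0915\u0947",
            "\u0932\u0921\u093c\u0915\u094b\u0902"
        };
        for (String form : forms) {
            System.out.println(form + " -> " + stem(form));
        }
    }
}
```

In this sketch लड़का, लड़के and लड़कों all reduce to the same stem, so a query for any one form would match the others once both index and query pass through the same filter.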



Re: Lucene for Indian Languages

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Srinivasa,

Lucene does not include any Analyzers for any of the Indian languages.

Otis



RE: Lucene for Indian Languages

Posted by Karthik N S <ka...@controlnet.co.in>.
Hi

I do not think so, but there was one request on the forum for the
Devanagari script.

Have a look at the forums; you might find something on this.


Karthik
