Posted to general@lucene.apache.org by Farzad Mahdikhani <fa...@yahoo.com> on 2007/08/13 12:30:34 UTC
cross-lingual IR
Dear All,
I would like to implement a cross-lingual IR system that supports Persian and English for an academic research task. How can I use Lucene for this? How should I proceed? What are the requirements?
Regards,
Farzad
Re: cross-lingual IR
Posted by Grant Ingersoll <gs...@apache.org>.
Hi Farzad,
Hmmm, where to begin... This is a tough question and one that
warrants a fair amount of research. I would start by taking a look
at the TREC cross-language tracks and the CLEF conference.
I have used Lucene to index and search documents in English, Arabic,
French, Spanish, Dutch, and other languages. In general, you need
some way of transforming a source-language query into a
target-language query, OR you need some way of automatically
translating all your documents into the same language. How you do
this is really the matter of research, eh? The most basic approach
to the query transformation problem is to use a dictionary to look up
the source-language terms and get their target-language equivalents.
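To make the dictionary approach concrete, here is a minimal sketch in
plain Java. It is only an illustration, not a tested recipe: the class
name DictionaryQueryTranslator, its methods, and the two English to
Persian entries are all made up for this example, and nothing in it is
part of Lucene. It splits the query on whitespace, replaces each term
with every translation the dictionary lists, and passes unknown terms
(names, for instance) through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not a Lucene class.
public class DictionaryQueryTranslator {

    // Maps a lowercased source-language term to its target-language equivalents.
    private final Map<String, List<String>> dictionary = new HashMap<>();

    // Registers one or more translations for a source term.
    public void addEntry(String sourceTerm, String... targetTerms) {
        dictionary
            .computeIfAbsent(sourceTerm.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
            .addAll(Arrays.asList(targetTerms));
    }

    // Translates a whitespace-separated query term by term.
    // Terms missing from the dictionary are kept as-is.
    public List<String> translate(String sourceQuery) {
        List<String> targetTerms = new ArrayList<>();
        for (String term : sourceQuery.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty query yields a single empty token
            }
            List<String> translations = dictionary.get(term);
            if (translations == null) {
                targetTerms.add(term);
            } else {
                targetTerms.addAll(translations);
            }
        }
        return targetTerms;
    }

    public static void main(String[] args) {
        DictionaryQueryTranslator translator = new DictionaryQueryTranslator();
        // Sample entries, written as Unicode escapes to avoid encoding trouble.
        translator.addEntry("book", "\u06A9\u062A\u0627\u0628"); // "ketab"
        translator.addEntry("library",
            "\u06A9\u062A\u0627\u0628\u062E\u0627\u0646\u0647"); // "ketabkhaneh"

        // The translated terms would then be used to build the target-language query.
        System.out.println(translator.translate("library book catalog"));
    }
}
```

A real system would also have to deal with terms that have several
senses, multi-word phrases, and morphological variants, which is where
the research part comes in.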
As for Lucene, you will need an Analyzer that handles Persian (try
googling "Persian Lucene Analyzer"); you may very well have to write
your own. The actual indexing and search tasks are relatively
straightforward as Lucene tasks, and there are a number of good
tutorials and books on how to do that.
Good luck,
Grant
On Aug 13, 2007, at 6:30 AM, Farzad Mahdikhani wrote:
> Dear All,
>
> I would like to implement a cross-lingual IR system with support
> for Persian and English languages for an academic research task.
> How can I use Lucene for my task? How shall I proceed? what are the
> requirements?
>
> Regards,
> Farzad
--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ