Posted to solr-user@lucene.apache.org by abiratsis <ab...@gmail.com> on 2011/03/18 17:09:35 UTC

How to get stopwords and synonyms files for several languages

Hello everyone,

I am developing a multilingual index, so I need support for several
languages. I have a few questions:

1. Which steps should I follow to get (download) the stopwords and
synonyms files for several languages?

2. Is there any site containing them? 

3. Do I need to download them somehow, or are they already embedded in
solr.war?

Thanx,
Alex

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-get-stopwords-and-synonyms-files-for-several-lanuages-tp2698494p2698494.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to get stopwords and synonyms files for several languages

Posted by abiratsis <ab...@gmail.com>.
OK, thanks Markus, it is clear enough now.


Re: How to get stopwords and synonyms files for several languages

Posted by Markus Jelsma <ma...@openindex.io>.
No, it's not a matter of implementation; it depends more on your business
domain. I mean, there is no need to expand synonyms for biology terms while
you're indexing physics documents.

On Friday 18 March 2011 17:31:23 abiratsis wrote:
> Basically, I have one more question: by saying that "Synonyms largely depend
> on what you're indexing", do you mean that I probably need to implement a
> mechanism for handling synonyms? If so, do you have any suggestions on how
> to implement this?
> 
> Thanx,
> Alex
> 

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: How to get stopwords and synonyms files for several languages

Posted by abiratsis <ab...@gmail.com>.
Basically, I have one more question: by saying that "Synonyms largely depend
on what you're indexing", do you mean that I probably need to implement a
mechanism for handling synonyms? If so, do you have any suggestions on how
to implement this?

Thanx,
Alex



Re: How to get stopwords and synonyms files for several languages

Posted by Markus Jelsma <ma...@openindex.io>.
On Friday 18 March 2011 17:09:35 abiratsis wrote:
> Hello everyone,
> 
> I am developing a multilingual index, so I need support for several
> languages. I have a few questions:
> 
> 1. Which steps should I follow to get (download) the stopwords and
> synonyms files for several languages?

Synonyms largely depend on what you're indexing. There is no general list of 
synonyms. Also, if you expand synonyms at index time, the index can grow 
considerably, because every matching term is stored together with its synonyms.
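
To illustrate the index-growth point, here is a minimal Python sketch. It is
not Solr's own implementation; the function names and the sample synonym
groups are made up for illustration. It assumes a simplified synonyms file
with one comma-separated group per line and '#' comment lines, and it expands
every token to its whole group, which is roughly what index-time expansion
does to the token stream.

```python
def parse_synonyms(text):
    """Parse comma-separated synonym groups, one group per line.

    Assumption: simplified format, '#' starts a comment line, and
    explicit 'a => b' mappings are not handled in this sketch.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        group = [term.strip().lower() for term in line.split(",") if term.strip()]
        for term in group:
            table[term] = group
    return table


def expand(tokens, table):
    """Replace each token by its full synonym group (or itself if it has none)."""
    expanded = []
    for token in tokens:
        expanded.extend(table.get(token, [token]))
    return expanded


# Hypothetical physics-domain synonyms, for illustration only.
SAMPLE = """\
# physics synonyms
velocity, speed
photon, light quantum
"""

table = parse_synonyms(SAMPLE)
tokens = "the velocity of a photon".split()
expanded = expand(tokens, table)
print(len(tokens), "terms before expansion,", len(expanded), "after")
```

The more groups apply to a document, and the larger they are, the more terms
end up in the index. That is also why a domain-specific list (physics only,
in this sketch) is preferable to a general one.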

> 
> 2. Is there any site containing them?

The wiki has a nice list for many languages: which stemmer to use, whether 
special lowercasing is needed, and where to find stopwords.

http://wiki.apache.org/solr/LanguageAnalysis
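
As a rough sketch of what per-language stopword handling amounts to (plain
Python, not Solr code): the tiny word lists and the function names below are
hypothetical, and the file format is assumed to be one word per line with '#'
comment lines.

```python
def load_stopwords(text):
    """Read one stopword per line, skipping blank lines and '#' comments."""
    return {
        line.strip().lower()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


# Tiny hypothetical lists; real lists per language are much longer.
STOPWORDS_BY_LANG = {
    "en": load_stopwords("# English\nthe\nof\nand\n"),
    "de": load_stopwords("# German\nder\ndie\nund\n"),
}


def remove_stopwords(tokens, lang):
    """Drop tokens found in the stopword list for the given language code."""
    stop = STOPWORDS_BY_LANG.get(lang, set())
    return [token for token in tokens if token.lower() not in stop]


print(remove_stopwords("the speed of light".split(), "en"))
```

In a multilingual index, each language would get its own list, which is why
the files have to be collected per language.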

> 
> 3. Do I need to download them somehow, or are they already embedded in
> solr.war?

They're stored in your SOLR_HOME/conf directory.
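
To check which of these files you already have, a small Python sketch like
the following could help. It assumes that the Solr home path is available in
an environment variable named SOLR_HOME and that the files start with
"stopwords" or "synonyms"; both are assumptions for illustration, so adjust
them to your setup.

```python
import os
from pathlib import Path


def find_analysis_files(solr_home):
    """Return the sorted names of stopwords/synonyms files under <solr_home>/conf."""
    conf = Path(solr_home) / "conf"
    if not conf.is_dir():
        return []
    return sorted(
        path.name
        for path in conf.iterdir()
        # Assumption: file names begin with "stopwords" or "synonyms".
        if path.is_file() and path.name.startswith(("stopwords", "synonyms"))
    )


# Assumption: SOLR_HOME is set; otherwise the current directory is used.
solr_home = os.environ.get("SOLR_HOME", ".")
print(find_analysis_files(solr_home))
```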

> 
> Thanx,
> Alex
> 
