Posted to solr-user@lucene.apache.org by vidya <vi...@tcs.com> on 2016/01/12 07:15:30 UTC

solr in action - multiple language content in one field

Hi

I have gone through chapter 14 of Solr in Action, "Searching content in
multiple languages". My doubt is this: when I add documents through the
Solr web UI, Solr seems to handle every language and returns results
when I query for them. So what exactly does that chapter describe?
Can't Solr recognise and process all languages at once?



--
View this message in context: http://lucene.472066.n3.nabble.com/solr-in-action-multiple-language-content-in-one-field-tp4250071.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr in action - multiple language content in one field

Posted by Erick Erickson <er...@gmail.com>.
Well, Solr _can_ put all the languages in one field... it's just that
the user experience is sub-optimal.

Stopwords, stemming rules, even tokenization vary between
languages, and using, say, the English stopwords for Catalan
strips the wrong words and leaves the Catalan ones in place.
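
To make the stopword point concrete, here is a minimal sketch in plain
Python (not Solr code). The two tiny stopword lists and the function name
remove_stopwords are made up for illustration; real analyzers use much
longer, per-language lists.

```python
# Illustrative only: tiny, hand-picked stopword lists, not the lists Solr ships.
ENGLISH_STOPWORDS = {"the", "a", "of", "and", "in", "is"}
FRENCH_STOPWORDS = {"le", "la", "les", "de", "et", "est", "dans"}


def remove_stopwords(text, stopwords):
    """Lowercase, split on whitespace, and drop any token in the stopword set."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


french = "le chat est dans la maison"

# English rules leave every French function word in the token stream.
print(remove_stopwords(french, ENGLISH_STOPWORDS))

# French rules keep only the content words.
print(remove_stopwords(french, FRENCH_STOPWORDS))
```

With the English list nothing is removed, so words like "le" and "est"
end up indexed as if they carried meaning.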

And CJK text (Chinese, Japanese and Korean) can't be split into words
on whitespace alone, so a whitespace-based tokenizer doesn't even give
you "words".
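
A small sketch of why whitespace splitting fails here, again in plain
Python rather than Solr. The helper names are hypothetical, and
char_bigrams is only loosely modelled on the overlapping-bigram idea
commonly used for CJK text; it is not Lucene's implementation.

```python
def whitespace_tokens(text):
    """Split on runs of whitespace, which works tolerably for English."""
    return text.split()


def char_bigrams(text):
    """Return overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


english = "Solr searches text"
chinese = "我喜欢搜索引擎"  # "I like search engines", written without spaces

print(whitespace_tokens(english))  # three separate tokens
print(whitespace_tokens(chinese))  # the whole sentence comes back as one token
print(char_bigrams(chinese))       # overlapping pairs such as 搜索 and 引擎
```

The single giant token from the whitespace split can only match a query
for the entire sentence, while the bigrams at least let a query for
搜索 ("search") find the document.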

And what about languages that read right-to-left instead of left-to-right?
The latter is what the default tokenizers assume...

Basically, it's a tradeoff. If there are only a few documents in a
different language, supporting them well may not be worth it. But
native-language speakers will notice, believe me, when you apply,
say, English rules to French. Or German. Or Arabic.
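
One way to act on that tradeoff is to give each well-supported language
its own field and send everything else to a general-purpose one. The
sketch below is plain Python under stated assumptions: every document
already carries a language code in a "lang" key, and the field names
(content_en, content_general and so on) are hypothetical, not anything
Solr defines by default.

```python
# Hypothetical: languages we have decided are worth dedicated analysis.
SUPPORTED_LANGUAGES = {"en", "fr", "de", "ar"}
# Hypothetical catch-all field for everything else.
FALLBACK_FIELD = "content_general"


def route_content(doc):
    """Return a copy of doc with 'content' moved to a per-language field."""
    routed = {k: v for k, v in doc.items() if k != "content"}
    lang = doc.get("lang")
    field = f"content_{lang}" if lang in SUPPORTED_LANGUAGES else FALLBACK_FIELD
    routed[field] = doc["content"]
    return routed


print(route_content({"id": "1", "lang": "fr", "content": "bonjour le monde"}))
print(route_content({"id": "2", "lang": "ca", "content": "bon dia"}))
```

Each per-language field can then be given its own stopwords, stemmer and
tokenizer in the schema, while the rarely seen languages share the
fallback field.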

Best,
Erick
