Posted to solr-user@lucene.apache.org by Blargy <zm...@hotmail.com> on 2010/07/13 18:55:03 UTC
Foreign characters question
I am trying to add the following synonym while indexing/searching
swimsuit, bañadores, bañador
I tested searching for "bañadores", but it didn't return any results.
After further inspection I noticed in the field analysis admin that swimsuit
gets expanded to ba�adores. Not sure if it will show up, but the "ñ" is
replaced by a black diamond with a white question mark in it.
So basically, how can I add support for foreign characters? Thanks
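
The symptom described above (ñ turning into a black diamond with a question mark, U+FFFD) is what typically happens when bytes written in a single-byte encoding are read back as UTF-8. The sketch below reproduces it under the assumption that the synonyms file was saved as Latin-1 (ISO-8859-1) and then decoded as UTF-8; the thread does not state which encoding the file actually had, so treat that as a guess.

```python
# Assumption: the file was saved as Latin-1 and later read as UTF-8.
# "\u00f1" is the letter ñ, written as an escape so this script itself
# does not depend on the editor's encoding.
line = "swimsuit, ba\u00f1adores, ba\u00f1ador"

# What ends up on disk if the editor writes Latin-1: ñ becomes the single byte 0xF1.
latin1_bytes = line.encode("latin-1")

# Reading those bytes as UTF-8: 0xF1 is not a valid sequence on its own,
# so a lenient decoder substitutes U+FFFD, the replacement character.
misread = latin1_bytes.decode("utf-8", errors="replace")

# The same text round-tripped through UTF-8 survives unchanged.
correct = line.encode("utf-8").decode("utf-8")

print(misread)
print(correct)
```

If this is the cause, the data is already damaged once it has been read, so the fix belongs in how the file is saved rather than in the query or the application server.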
--
View this message in context: http://lucene.472066.n3.nabble.com/Foreign-characters-question-tp964078p964078.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Foreign characters question
Posted by Robert Muir <rc...@gmail.com>.
On Wed, Jul 14, 2010 at 12:59 PM, Blargy <zm...@hotmail.com> wrote:
>
> Nevermind. Apparently my IDE (Netbeans) was set to "No encoding"... wtf.
> Changed it to UTF-8 and recreated the file and all is good now. Thanks!
>
>
FYI, I created an issue with your example here:
https://issues.apache.org/jira/browse/SOLR-2003
In this case, the wrong encoding could have been detected and saved you some
time...
--
Robert Muir
rcmuir@gmail.com
Re: Foreign characters question
Posted by Blargy <zm...@hotmail.com>.
Nevermind. Apparently my IDE (Netbeans) was set to "No encoding"... wtf.
Changed it to UTF-8 and recreated the file and all is good now. Thanks!
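
Instead of recreating the file by hand, an existing file can be re-encoded by script. This is only a sketch: the helper names `reencode_to_utf8` and `convert_file` are made up for illustration, and the source encoding (`cp1252` here) is an assumption, since the thread only says the IDE was set to "No encoding".

```python
from pathlib import Path


def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from an assumed source encoding and return them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of src to dst; src is left untouched."""
    Path(dst).write_bytes(reencode_to_utf8(Path(src).read_bytes(), source_encoding))
```

For example, `convert_file("synonyms.txt", "synonyms.utf8.txt")` writes a converted copy next to the original (both file names are placeholders). If the guessed source encoding is wrong the output will contain the wrong characters, so it is worth opening the result before swapping it in.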
Re: Foreign characters question
Posted by Blargy <zm...@hotmail.com>.
How can I tell whether a synonyms file is UTF-8, and how do I create one? Do
I have to instruct Solr that this file is UTF-8?
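
One way to answer the "how can I tell" part is to try decoding the file's bytes strictly as UTF-8. The sketch below does that; the function names are hypothetical, and a passing check only shows the bytes are valid UTF-8, not that the characters are the intended ones.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict mode: raises on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True


def file_is_valid_utf8(path: str) -> bool:
    """Same check applied to a file on disk."""
    return is_valid_utf8(Path(path).read_bytes())
```

A file containing only ASCII always passes, because ASCII is a subset of UTF-8. A file with ñ saved as Latin-1 fails, because the lone byte 0xF1 is not a valid UTF-8 sequence.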
Re: Foreign characters question
Posted by Robert Muir <rc...@gmail.com>.
Is your synonyms file in UTF-8 encoding?
On Wed, Jul 14, 2010 at 11:11 AM, Blargy <zm...@hotmail.com> wrote:
>
> Thanks for the reply, but that didn't help.
>
> Tomcat is accepting foreign characters, but for some reason when it reads the
> synonyms file and encounters the character ñ, it doesn't appear correctly
> in the Field Analysis admin. It shows up as �. If I query exactly for ñ it
> will work, but the synonyms file is screwy.
>
--
Robert Muir
rcmuir@gmail.com
RE: Foreign characters question
Posted by Blargy <zm...@hotmail.com>.
Thanks for the reply, but that didn't help.
Tomcat is accepting foreign characters, but for some reason when it reads the
synonyms file and encounters the character ñ, it doesn't appear correctly
in the Field Analysis admin. It shows up as �. If I query exactly for ñ it
will work, but the synonyms file is screwy.
RE: Foreign characters question
Posted by Tim Gilbert <TI...@morningstar.com>.
I had the same problem; the fix depends on which application server you are using.
If it's Tomcat, try here: http://wiki.apache.org/solr/SolrTomcat and look for the part about URI charset.
I use GlassFish, and I added this entry to the wiki after getting help from this group: http://wiki.apache.org/solr/SolrGlassfish
I hope this helps.
Tim