Posted to solr-user@lucene.apache.org by "Villacorta Peral, Eva" <e....@ibermatica.com> on 2011/07/07 09:16:19 UTC

Problem with first letter accented

Hi

 

I'm using Solr 3.3 for searching in different languages, one of which is Spanish. The ASCIIFoldingFilterFactory works fine, but if a word begins with an accented letter, like "ágora" or "ínclito", it can't find anything. I have to search for the word without the accent in order to get any results. For instance:

 

- Title: Imágenes del ágora de la plaza central.

- Search text "imágenes" or "imagenes": both return the same result, the title above

- Search text "ágora": returns no results, while "agora" returns the right result

 

Thx in advance

Eva


RE: Problem with first letter accented

Posted by "Villacorta Peral, Eva" <e....@ibermatica.com>.
I'm sorry if this mail is a repeat; my mail server gave me an error.

Hi!

I've changed server.xml to add the URI encoding, changed the schema version to 1.4, and reindexed my DB, but nothing has changed.

In analysis.jsp I've searched for "más" to see what happens with that word, and its accented letter is also recognized as two characters, just like in "ágora". But searching works for "más".

Could the order in which the filters are applied be relevant? I haven't read anything about it, but...



-----Original message-----
From: Ahmet Arslan [mailto:iorixxx@yahoo.com] 
Sent: Friday, July 8, 2011 9:57
To: solr-user@lucene.apache.org
Subject: RE: Problem with first letter accented


Hello,

As I see from analysis.jsp, your letter á is not converted to 'a' by the ASCII folding filter. For some reason it is recognized as two characters, 'Ã¡', before it reaches the ASCII folding filter.

First of all, I would check the URI encoding of the servlet container. It should be UTF-8. See Tomcat's config:
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config

Not related to this issue, but I recommend using 1.4 as the schema version: <schema name="ca_objects" version="1.4">



RE: Problem with first letter accented

Posted by "Villacorta Peral, Eva" <e....@ibermatica.com>.
> Okay, just to make sure, the correct connector should be this:
>
>  <Connector port="8080" protocol="HTTP/1.1" 
>               connectionTimeout="20000" 
>               redirectPort="8443"
>               URIEncoding="UTF-8" />
>
>Can you confirm this? Did you restart tomcat?

	This is my connector:

	    <Connector port="83" protocol="HTTP/1.1"
	               connectionTimeout="20000"
	               redirectPort="8443"
	               URIEncoding="UTF-8" />

	Yes, I restarted Tomcat.

>Also can you paste the output of &debugQuery=on ?

	I don't know where I have to paste it. On the admin/analysis page? I tried that, but got no result.
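
For reference, debugQuery is a request parameter, so it goes on the select URL rather than into the analysis page. A minimal sketch of building such a URL follows. Port 83 and the /Solr33 webapp are taken from the messages in this thread; the host localhost, the choice of the ca_objects core, and the helper name build_debug_url are assumptions.

```python
from urllib.parse import urlencode

# Assumed base URL: port 83 and /Solr33 appear in the thread,
# "localhost" and the ca_objects core are guesses for illustration.
BASE_URL = "http://localhost:83/Solr33/ca_objects/select"


def build_debug_url(query):
    """Return a select URL with debugQuery=on and the query percent-encoded as UTF-8."""
    params = {"q": query, "debugQuery": "on", "rows": 10}
    # urlencode percent-encodes non-ASCII text as UTF-8 by default,
    # so "á" is sent as %C3%A1
    return BASE_URL + "?" + urlencode(params)


print(build_debug_url("ágora"))
# http://localhost:83/Solr33/ca_objects/select?q=%C3%A1gora&debugQuery=on&rows=10
```

Opening the printed URL in a browser should return the normal response plus a debug section showing how the query was parsed.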

	But I've been looking at the logs, and I've noticed that an accented letter in the middle of a word works fine because it is treated as a wildcard, but this fails at the beginning of a word. It never finds the exact word when the first letter is accented.

This is part of the log. There are two searches, one for "marín" and the other for "ágora":

INFO: [ca_entities] webapp=/Solr33 path=/select params={start=0&q=mar?&wt=phps&rows=2000} hits=0 status=0 QTime=78 
Jul 8, 2011 12:48:24 PM org.apache.solr.core.SolrCore execute
INFO: [ca_places] webapp=/Solr33 path=/select params={start=0&q=mar?&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:24 PM org.apache.solr.core.SolrCore execute
INFO: [ca_collections] webapp=/Solr33 path=/select params={start=0&q=mar?&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:26 PM org.apache.solr.core.SolrCore execute
INFO: [ca_entities] webapp=/Solr33 path=/select params={start=0&q=?go&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:26 PM org.apache.solr.core.SolrCore execute
INFO: [ca_places] webapp=/Solr33 path=/select params={start=0&q=?go&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:26 PM org.apache.solr.core.SolrCore execute
INFO: [ca_collections] webapp=/Solr33 path=/select params={start=0&q=?go&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:27 PM org.apache.solr.core.SolrCore execute
INFO: [ca_entities] webapp=/Solr33 path=/select params={start=0&q=?gor&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:27 PM org.apache.solr.core.SolrCore execute
INFO: [ca_places] webapp=/Solr33 path=/select params={start=0&q=?gor&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:27 PM org.apache.solr.core.SolrCore execute
INFO: [ca_collections] webapp=/Solr33 path=/select params={start=0&q=?gor&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:28 PM org.apache.solr.core.SolrCore execute
INFO: [ca_entities] webapp=/Solr33 path=/select params={start=0&q=?gora&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:28 PM org.apache.solr.core.SolrCore execute
INFO: [ca_places] webapp=/Solr33 path=/select params={start=0&q=?gora&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:28 PM org.apache.solr.core.SolrCore execute
INFO: [ca_collections] webapp=/Solr33 path=/select params={start=0&q=?gora&wt=phps&rows=2000} hits=0 status=0 QTime=0 
Jul 8, 2011 12:48:29 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse '(?gora)': '*' or '?' not allowed as first character in WildcardQuery
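
The q values in this log contain a literal '?' where the accented letter was typed, which suggests the character is already lost before the request reaches Solr. The thread does not establish where that happens. Purely as an illustration, a lossy conversion to a charset without accented letters produces exactly these strings; lossy_ascii is a hypothetical helper, not code from the client application.

```python
def lossy_ascii(text):
    """Encode to ASCII, replacing every unencodable character with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


# The prefixes below are the ones that appear in the log, with the accent restored
for typed in ["marí", "ágo", "ágor", "ágora"]:
    print(typed, "->", lossy_ascii(typed))
# marí -> mar?
# ágo -> ?go
# ágor -> ?gor
# ágora -> ?gora
```

A '?' inside or at the end of a term is parsed as a single-character wildcard, while a leading '?' is rejected by the query parser, which matches the ParseException above.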


>> In analysis.jsp I've searched for "más", in order to
>> find what happens with that word, and it's also recognized
>> as two characters, just like "ágora". But it works for
>> "más".

>Sometimes the browser caches things and misleads you. Can you clear your browser cache?

	My browser cache is deleted, too. And my brain, almost :)

RE: Problem with first letter accented

Posted by Ahmet Arslan <io...@yahoo.com>.
> I've changed the server.xml to add the URI Encoding. I've
> changed the schema version to 1.4. And I've reindexed my DB.
> But nothing has changed.


Okay, just to make sure, the correct connector should be this:

  <Connector port="8080" protocol="HTTP/1.1" 
               connectionTimeout="20000" 
               redirectPort="8443"
               URIEncoding="UTF-8" />

Can you confirm this? Did you restart tomcat?
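
To see why URIEncoding matters: a client typically sends "ágora" in a GET query string as the percent-encoded UTF-8 bytes %C3%A1gora, and the container has to pick a charset to decode them. Older Tomcat versions default to ISO-8859-1 when URIEncoding is not set. The sketch below is Python, not Tomcat code, and decode_query is a hypothetical helper; it only shows the two possible outcomes.

```python
from urllib.parse import quote, unquote


def decode_query(encoded, charset):
    """Decode a percent-encoded query value using the given charset."""
    return unquote(encoded, encoding=charset)


encoded = quote("ágora")  # percent-encodes the UTF-8 bytes of the word
print(encoded)                           # %C3%A1gora
print(decode_query(encoded, "utf-8"))    # ágora
print(decode_query(encoded, "latin-1"))  # Ã¡gora
```

The second decoding turns the single letter á into the two characters Ã and ¡, which is the pattern visible in the analysis output quoted in this thread.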

Also can you paste the output of &debugQuery=on ?


> In analysis.jsp I've searched for "más", in order to
> find what happens with that word, and it's also recognized
> as two characters, just like "ágora". But it works for
> "más".

Sometimes the browser caches things and misleads you. Can you clear your browser cache?

> The order of filter application may be relevant?? I don't
> read anything about it, but...

No, the order is not the issue here.

RE: Problem with first letter accented

Posted by Ahmet Arslan <io...@yahoo.com>.
Hello,

As I see from analysis.jsp, your letter á is not converted to 'a' by the ASCII folding filter. For some reason it is recognized as two characters, 'Ã¡', before it reaches the ASCII folding filter.

First of all, I would check the URI encoding of the servlet container. It should be UTF-8. See Tomcat's config:
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config

Not related to this issue, but I recommend using 1.4 as the schema version: <schema name="ca_objects" version="1.4">



RE: Problem with first letter accented

Posted by "Villacorta Peral, Eva" <e....@ibermatica.com>.
I'm using CollectiveAccess and its DB structure. Perhaps this is useful...

My type definition is:

<schema name="ca_objects" version="1.1">
	<types>
		<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
			<analyzer>
				<tokenizer class="solr.WhitespaceTokenizerFactory"/>
				<filter class="solr.LowerCaseFilterFactory"/>
				<filter class="solr.ASCIIFoldingFilterFactory"/> 
				<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
			</analyzer>
		</fieldType>
		<fieldType name="string" class="solr.StrField" />
		<fieldtype name="ignored" stored="false" indexed="false" class="solr.StrField" /> 
	</types>

And the analysis of "ágora" is:

Index Analyzer
org.apache.solr.analysis.WhitespaceTokenizerFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	Ã¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.LowerCaseFilterFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	ã¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.ASCIIFoldingFilterFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	a¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.EdgeNGramFilterFactory {maxGramSize=15, side=front, minGramSize=2, luceneMatchVersion=LUCENE_24}
position 	1 	2 	3 	4 		5 
term text 	a¡ 	a¡g 	a¡go 	a¡gor 	a¡gora 
startOffset 0 	0 	0 	0 		0 
endOffset 	2 	3	4 	5 		6 

Query Analyzer
org.apache.solr.analysis.WhitespaceTokenizerFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	Ã¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.LowerCaseFilterFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	ã¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.ASCIIFoldingFilterFactory {luceneMatchVersion=LUCENE_24}
position 	1 
term text 	a¡gora 
startOffset 0 
endOffset 	6 

org.apache.solr.analysis.EdgeNGramFilterFactory {maxGramSize=15, side=front, minGramSize=2, luceneMatchVersion=LUCENE_24}
position 	1 	2 	3 	4 	5 
term text 	a¡ 	a¡g 	a¡go 	a¡gor a¡gora 
startOffset 0 	0 	0 	0 	0 
endOffset 	2 	3 	4 	5 	6

I hope this was enough... Ask me whatever you need. Thx
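
The analysis output above can be reproduced outside Solr. The sketch below assumes the input reached the analyzer as UTF-8 bytes misread as ISO-8859-1, and uses Unicode decomposition as a rough stand-in for ASCIIFoldingFilterFactory (the real filter uses its own mapping table). The function names are hypothetical.

```python
import unicodedata


def fold_to_ascii(text):
    """Rough stand-in for ASCII folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_size=2, max_size=15):
    """Front-edge n-grams, mirroring minGramSize=2, maxGramSize=15, side=front."""
    return [token[:n] for n in range(min_size, min(max_size, len(token)) + 1)]


def analyze(token):
    """Lowercase, fold, then emit front-edge n-grams, in the schema's filter order."""
    return edge_ngrams(fold_to_ascii(token.lower()))


# "ágora" as UTF-8 bytes misread as ISO-8859-1: six characters instead of five
misread = "ágora".encode("utf-8").decode("latin-1")
print(len(misread), analyze(misread))
# 6 ['a¡', 'a¡g', 'a¡go', 'a¡gor', 'a¡gora']

print(analyze("ágora"))
# ['ag', 'ago', 'agor', 'agora']
```

The first result matches the terms and the endOffset of 6 shown in the analysis page; the second is what the same chain yields when the input is decoded correctly.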



Re: Problem with first letter accented

Posted by Ahmet Arslan <io...@yahoo.com>.
> I'm using Solr 3.3 for searching in different languages,
> one of them is Spanish. The ASCIIFoldingFilterFactory works
> fine, but if word begins with a letter accented, like
> "ágora" or "ínclito", it can't find anything. I have to
> search word without accent in order to find some result. For
> instance:
> 
>  
> 
> -          Title: Imágenes del
> ágora de la plaza central.
> 
> -          Searching text:
> "imágenes" or "imagenes" returns the same result, the title
> above
> 
> -          Searching text:
> "ágora" returns no results, while "agora" returns the right
> result

That's quite strange. Your field type definition would be needed.

Also, admin/analysis.jsp shows the step-by-step output of the analysis.
What happens to the words "ágora" or "ínclito" at index time and query time?