Posted to solr-user@lucene.apache.org by da...@correo.aeat.es on 2013/09/17 08:16:38 UTC
Problem with SynonymFilter and StopFilterFactory
Hi,
I have encountered a problem when applying StopFilterFactory and
SynonymFilterFactory together. SynonymFilter removes the position gaps
that StopFilterFactory left behind. I'm applying the filters at
query time, because users need to change the synonym lists frequently.
This is my schema, and an example of the issue:
String: "documentacion para agentes"
org.apache.solr.analysis.WhitespaceTokenizerFactory
{luceneMatchVersion=LUCENE_35}
position 1 2 3
term text documentación para agentes
startOffset 0 14 19
endOffset 13 18 26
org.apache.solr.analysis.LowerCaseFilterFactory
{luceneMatchVersion=LUCENE_35}
position 1 2 3
term text documentación para agentes
startOffset 0 14 19
endOffset 13 18 26
org.apache.solr.analysis.StopFilterFactory {words=stopwords_intranet.txt,
ignoreCase=true, enablePositionIncrements=true,
luceneMatchVersion=LUCENE_35}
position 1 3
term text documentación agentes
startOffset 0 19
endOffset 13 26
org.apache.solr.analysis.SynonymFilterFactory
{synonyms=sinonimos_intranet.txt, expand=true, ignoreCase=true,
luceneMatchVersion=LUCENE_35}
position     1              2
term text    documentación  agente
             archivo        agentes
type         SYNONYM        SYNONYM
             SYNONYM        SYNONYM
startOffset  0              19
             0              19
endOffset    13             26
             13             26
As you can see, the positions should be 1 and 3, but SynonymFilter removes
the gap and moves the token from position 3 to position 2.
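
To make the expected behaviour concrete, here is a toy Python model of
position increments. It is only a sketch, not Lucene's code: all function
names are made up for illustration, and gap_dropping_filter merely imitates
the behaviour seen in the analysis output above. It says nothing about how
SynonymFilter is actually implemented.

```python
# Toy model of Lucene-style position increments; NOT Solr's actual code.
# Each token is a (term, position_increment) pair.

def tokenize(text):
    """Whitespace tokenizer: every token advances the position by 1."""
    return [(term, 1) for term in text.split()]

def stop_filter(tokens, stopwords):
    """Drop stopwords and add their increments to the next kept token,
    which is what enablePositionIncrements=true is expected to do."""
    out, pending = [], 0
    for term, inc in tokens:
        if term in stopwords:
            pending += inc
        else:
            out.append((term, inc + pending))
            pending = 0
    return out

def gap_dropping_filter(tokens):
    """Hypothetical filter that resets every increment to 1,
    imitating the observed loss of the gap."""
    return [(term, 1) for term, _ in tokens]

def positions(tokens):
    """Turn increments into absolute positions, as the analysis page shows."""
    pos, out = 0, []
    for term, inc in tokens:
        pos += inc
        out.append((term, pos))
    return out

tokens = stop_filter(tokenize("documentación para agentes"), {"para"})
print(positions(tokens))                       # [('documentación', 1), ('agentes', 3)]
print(positions(gap_dropping_filter(tokens)))  # [('documentación', 1), ('agentes', 2)]
```

The first line of output is what I expect after the stop filter; the second
matches what I actually get after the synonym filter.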
I've got the same problem with Solr 3.5 and 4.0.
I don't know whether this is a bug or a mistake in my configuration. In other
schemas that I have worked with, I always put the SynonymFilter
before the StopFilter, but here I preferred this order because the
synonym list is very large (i.e. I don't want to generate
a lot of synonyms for a word that I want to remove anyway).
Thanks,
David Dávila Atienza
AEAT - Departamento de Informática Tributaria