Posted to solr-user@lucene.apache.org by Jan Simon Winkelmann <wi...@newsfactory.de> on 2010/02/24 13:00:02 UTC
Strange search behavior
Hi,
I'm having some problems understanding why certain search queries don't return any results.
I have a field of type "text", which is defined like this:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
I have a total of about 3.2 million documents indexed, of which a few hundred have titles in the format "Tagesergebnisse der Oddset-Spiele vom 18.02.2010".
My problem is that if I search for "oddset-spiele", I get no results, but when I search for "oddsetspiele" or "oddset*spiele", I get lots of results. As far as I understand, the WordDelimiterFilter converts each phrase into "name:oddset (spiel oddsetspiel)"; at least that's what the analyzer says. What I don't get is why a search for "oddset-spiele" then returns no results at all.
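To make the token stream described above concrete, here is a rough Python sketch of what generateWordParts="1" together with catenateWords="1" does to a single whitespace-delimited token. It is only an approximation written for illustration, not Solr's implementation: the function name word_delimiter_sketch is made up, splitting on case changes and letter/digit boundaries is ignored, and the German Snowball stemming that turns "spiele" into "spiel" is not modeled. The catenated token is placed at the position of the last part, which matches the "oddset (spiel oddsetspiel)" grouping reported by the analyzer.

```python
import re


def word_delimiter_sketch(token):
    """Approximate word-part splitting plus catenation for one token.

    Returns (term, position) pairs, where position is the offset relative
    to the start of the original token. Terms are lowercased to mimic the
    LowerCaseFilter that follows in the analyzer chain.
    """
    # Split on runs of non-word characters (and underscores), drop empties.
    parts = [p for p in re.split(r"[\W_]+", token) if p]

    # generateWordParts="1": emit every part at its own position.
    tokens = [(part.lower(), pos) for pos, part in enumerate(parts)]

    # catenateWords="1": also emit the joined parts, stacked on the
    # position of the last part (assumption based on the analyzer output
    # quoted in the message).
    if len(parts) > 1:
        tokens.append(("".join(parts).lower(), len(parts) - 1))

    return tokens


if __name__ == "__main__":
    for raw in ("Oddset-Spiele", "oddsetspiele"):
        print(raw, "->", word_delimiter_sketch(raw))
```

Under these assumptions "Oddset-Spiele" yields three terms over two positions, while "oddsetspiele" stays a single term. The hyphenated query therefore ends up as a multi-position query, whereas the unhyphenated one is a plain term query.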
I would appreciate any help or insight anyone could provide.
Best
Jan
Re: Strange search behavior
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Jan,
If you go to Solr Admin Analysis page and enter your problematic query, what do you see?
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/