Posted to solr-user@lucene.apache.org by Jean-Sebastien Vachon <js...@videotron.ca> on 2010/11/11 16:39:26 UTC

problem with wildcard

Hi All,

I'm having trouble with a wildcard query, and I was wondering if anyone could tell me why these two
similar queries do not return the same number of results. Basically, the query should return all docs whose title starts with
(or contains) the string "lowe'". I suspect some analyzer is causing this behaviour, and I'd like to know if there is a way to fix it.

1) select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0

	<result name="response" numFound="302" start="0"/>
		<lst name="debug">
		<str name="rawquerystring">*:*</str>
		<str name="querystring">*:*</str>
		<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
		<str name="parsedquery_toString">*:*</str>
		<lst name="explain"/>
		<str name="QParser">LuceneQParser</str>
		<arr name="filter_queries">
			<str>title:(  lowe')</str>
		</arr>
		<arr name="parsed_filter_queries">
			<str>title:low</str>
		</arr>

2) select?q=*:*&fq=title:(+lowe'*)&debugQuery=on&rows=0 

		<result name="response" numFound="0" start="0"/>
		<lst name="debug">
			<str name="rawquerystring">*:*</str>
			<str name="querystring">*:*</str>
			<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
			<str name="parsedquery_toString">*:*</str>
			<lst name="explain"/>
			<str name="QParser">LuceneQParser</str>
			<arr name="filter_queries">
				<str>title:(  lowe'*)</str>
			</arr>
			<arr name="parsed_filter_queries">
				<str>title:lowe'*</str>
			</arr>
			...
		</lst>
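
A side note on the debug output above: the filter_queries entries show a space where the request had a +. That is consistent with standard URL decoding, where an unencoded + in a query string is read as a space, so the "required" operator probably never reached the query parser. The sketch below uses Python's urllib.parse purely as an illustration (the thread itself does not use Python) and shows that encoding the + as %2B preserves it. For a single-clause filter like this one, losing the + most likely does not change the result count.

```python
from urllib.parse import parse_qs, quote

# Query string from request 2), exactly as posted.
raw = "q=*:*&fq=title:(+lowe'*)&debugQuery=on&rows=0"

# Standard form decoding turns an unencoded '+' into a space.
fq_decoded = parse_qs(raw)["fq"][0]
print(repr(fq_decoded))

# Percent-encoding the value keeps the '+' intact ('+' becomes %2B).
encoded = quote("title:(+lowe'*)", safe="")
print(encoded)

# Decoding the encoded form gives back the original filter text.
round_trip = parse_qs("fq=" + encoded)["fq"][0]
print(repr(round_trip))
```

If the + operator matters for a multi-clause filter, sending it as %2B is the usual way to get it through.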


The <title> field is defined as:

<field name="title" type="text" indexed="true" stored="true" required="false"/>

where the text type is:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <!-- Case insensitive stop word removal.
          add enablePositionIncrements=true in both the index and query
          analyzers to leave a 'gap' for more accurate phrase queries.
        -->
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
    </fieldType>





Re: problem with wildcard

Posted by Ahmet Arslan <io...@yahoo.com>.
> select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0
> > 
> > "wildcard queries are not analyzed" http://search-lucene.com/m/pnmlH14o6eM1/
> > 
> 
> Yeah I found out about this a couple of minutes after I
> posted my problem. If there is no analyzer then
> why is Solr not finding any documents when a single quote
> precedes the wildcard?


Your index analyzer (most likely WordDelimiterFilterFactory) is probably eating that single quote. You can verify this on the admin/analysis.jsp page. In other words, there is no term in your index that begins with lowe'. You can try searching for just lowe*
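
To make the mismatch concrete, here is a minimal Python sketch of the idea. It is a toy model, not Solr code: toy_index_analyze is a hypothetical stand-in for the index analyzer chain, the title "Lowe's Home Improvement" is a made-up document, and the only "stemming" it does is the lowe -> low mapping taken from the parsed_filter_queries output of the first request (title:low). The wildcard side is deliberately left unanalyzed, as the linked message describes.

```python
import re

# Assumption: mirrors the one stem visible in the debug output (lowe' -> low).
TOY_STEMS = {"lowe": "low"}

def toy_index_analyze(text):
    """Rough stand-in for the index-time chain of the 'text' field type."""
    terms = []
    for token in text.split():  # whitespace tokenizer
        # Word-delimiter-like step: drop a trailing possessive 's ...
        token = re.sub(r"'s$", "", token, flags=re.IGNORECASE)
        # ... and split on anything that is not a letter or digit.
        for part in re.split(r"[^A-Za-z0-9]+", token):
            if not part:
                continue
            part = part.lower()               # lowercase filter
            part = TOY_STEMS.get(part, part)  # toy stemmer (one entry only)
            terms.append(part)
    return terms

def prefix_matches(wildcard_query, indexed_terms):
    """Wildcard text is NOT analyzed: the prefix is compared as typed."""
    prefix = wildcard_query.rstrip("*")
    return [t for t in indexed_terms if t.startswith(prefix)]

# Hypothetical document title.
indexed = toy_index_analyze("Lowe's Home Improvement")
print(indexed)

# The quote was removed at index time, so this prefix cannot match.
print(prefix_matches("lowe'*", indexed))

# In this toy model the stemmed term is "low", so the shorter prefix matches.
print(prefix_matches("low*", indexed))
```

Note that in this toy model lowe* would also find nothing, because the stored term is the stemmed form low. Whether that holds for your real index depends on what the stemmer does at index time, which the admin/analysis.jsp page will show.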


      

Re: problem with wildcard

Posted by Jean-Sebastien Vachon <js...@videotron.ca>.
On 2010-11-11, at 3:45 PM, Ahmet Arslan wrote:

>> I'm having some trouble with a query using some wildcard
>> and I was wondering if anyone could tell me why these two
>> similar queries do not return the same number of results.
>> Basically, the query I'm making should return all docs whose
>> title starts
>> (or contain) the string "lowe'". I suspect some analyzer is
>> causing this behaviour and I'd like to know if there is a
>> way to fix this problem.
>> 
>> 1)
>> select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0
> 
> "wildcard queries are not analyzed" http://search-lucene.com/m/pnmlH14o6eM1/
> 

Yeah I found out about this a couple of minutes after I posted my problem. If there is no analyzer then
why is Solr not finding any documents when a single quote precedes the wildcard?

Re: problem with wildcard

Posted by Ahmet Arslan <io...@yahoo.com>.
> I'm having some trouble with a query using some wildcard
> and I was wondering if anyone could tell me why these two
> similar queries do not return the same number of results.
> Basically, the query I'm making should return all docs whose
> title starts
> (or contain) the string "lowe'". I suspect some analyzer is
> causing this behaviour and I'd like to know if there is a
> way to fix this problem.
> 
> 1)
> select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0

"wildcard queries are not analyzed" http://search-lucene.com/m/pnmlH14o6eM1/