Posted to solr-user@lucene.apache.org by raonalluri <na...@gmail.com> on 2012/08/01 18:23:00 UTC

StandardTokenizerFactory is behaving differently in Solr 3.6?

I have a field type like the following:

<fieldType name="text_general_name" class="solr.TextField"
positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>


This type behaves differently in Solr 3.3 and 3.6. In 3.3, the following
query returns no records, because there is no author called 'Gerri Killis'.
There is, however, an author called 'Gerri Jonathan'.

/select/?q=author:Gerri\ Killis

In 3.6, the same query returns records, because there is an author called
'Gerri Jonathan'. Is something wrong in 3.6? I didn't expect any records
here, because there is no author called 'Gerri Killis'.

/select/?q=author:Gerri\ Killis


Your help is appreciated.

Thanks
Srini



--
View this message in context: http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-is-behaving-differently-in-Solr-3-6-tp3998623.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Posted by david3s <da...@hotmail.com>.
Hello Jack,

We found that the problem is related to the *lucene* query parser in 3.6:

select?q=author:David\ Duke&defType=lucene
Would render the same results as:
select?q=author:(David OR Duke)&defType=lucene

But
select?q=author:David\ Duke&defType=edismax
Would render the same results as:
select?q=author:"David Duke"&defType=lucene

Thanks a lot, Jack
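
To illustrate why the OR interpretation returns 'Gerri Jonathan' for a search on 'Gerri Killis' while the phrase interpretation does not, here is a toy model in Python. It is only a sketch and not Solr code: the `tokens` helper is a hypothetical stand-in for the analyzer (whitespace split plus lowercasing, no stop words), and the three match functions are made up for illustration.

```python
def tokens(text):
    # Crude stand-in for the analyzer chain: split on whitespace, lowercase.
    return text.lower().split()


def matches_or(query, value):
    # True if ANY query token occurs in the field value.
    value_tokens = tokens(value)
    return any(t in value_tokens for t in tokens(query))


def matches_and(query, value):
    # True only if EVERY query token occurs in the field value.
    value_tokens = tokens(value)
    return all(t in value_tokens for t in tokens(query))


def matches_phrase(query, value):
    # True only if the query tokens occur adjacently and in order.
    q, v = tokens(query), tokens(value)
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))


author = "Gerri Jonathan"
print("OR:    ", matches_or("Gerri Killis", author))
print("AND:   ", matches_and("Gerri Killis", author))
print("phrase:", matches_phrase("Gerri Killis", author))
```

Only the OR variant matches, because the token "gerri" alone is enough for it. That corresponds to the 3.6 behaviour described in this thread.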




Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Posted by Jack Krupansky <ja...@basetechnology.com>.
This may simply be a matter of changing the default query operator from "OR" 
to "AND". Try adding &q.op=AND to your request.

-- Jack Krupansky
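
For anyone who builds the request URL in code, a minimal Python sketch of adding q.op=AND follows. The host, port and path in `SOLR_SELECT` are assumptions (adjust them to your installation), and `select_url` is a hypothetical helper name, not part of any Solr client library.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; change to match your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def select_url(query, default_operator="AND"):
    # q.op sets the default operator applied between unquoted terms.
    params = {"q": query, "q.op": default_operator}
    # urlencode percent-encodes the backslash and turns the space into '+'.
    return SOLR_SELECT + "?" + urlencode(params)


print(select_url(r"author:Gerri\ Killis"))
```

Whether this restores the 3.3 results depends on how the parser treats the escaped space, so it is worth checking against a known author first.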

-----Original Message----- 
From: raonalluri
Sent: Wednesday, August 01, 2012 4:26 PM
To: solr-user@lucene.apache.org
Subject: Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Jack, thanks a lot for your reply. We are using the LuceneQParser query parser.
I agree that if I turn the string into a phrase by adding double quotes, it works.

But I am checking whether there is a fix that does not require changing the query.
As we are in a production environment, we would need to change the queries in
different places.

Can we avoid this issue by changing the query parser?

regards
Srini





Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Posted by raonalluri <na...@gmail.com>.
Jack, thanks a lot for your reply. We are using the LuceneQParser query parser.
I agree that if I turn the string into a phrase by adding double quotes, it works.

But I am checking whether there is a fix that does not require changing the query.
As we are in a production environment, we would need to change the queries in
different places.

Can we avoid this issue by changing the query parser?

regards
Srini




Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Posted by Jack Krupansky <ja...@basetechnology.com>.
Which query parser do you have set in your request handler?

There was a problem with edismax in 3.6 involving the WordDelimiterFilter that
sounds exactly like your symptom. The workaround is to enclose the term in
quotes (to make it a phrase); otherwise the terms are "OR"ed rather
than "AND"ed.

-- Jack Krupansky
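
If the queries are assembled in application code, the quoting workaround can be applied in one place. Below is a small Python sketch under that assumption; `phrase_query` is a hypothetical helper, not a Solr API.

```python
def phrase_query(field, value):
    # Escape backslashes first, then double quotes, so the whole value
    # stays inside a single quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


print(phrase_query("author", "Gerri Killis"))
```

The result would be passed as the q parameter in place of the backslash-escaped form.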

-----Original Message----- 
From: raonalluri
Sent: Wednesday, August 01, 2012 3:25 PM
To: solr-user@lucene.apache.org
Subject: Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

I noticed that the escape character in the query is being ignored in Solr 3.6.

For the following query, 3.3 gives results where 'Featuring Chimp' is matched. But
in 3.6, it gives results where Featuring or Chimp or Featuring Chimp is
matched. Any idea what difference between my 3.3 and 3.6 environments
would cause these inconsistent results?

/select/?q=title:Featuring\ Chimp





Re: StandardTokenizerFactory is behaving differently in Solr 3.6?

Posted by raonalluri <na...@gmail.com>.
I noticed that the escape character in the query is being ignored in Solr 3.6.

For the following query, 3.3 gives results where 'Featuring Chimp' is matched. But
in 3.6, it gives results where Featuring or Chimp or Featuring Chimp is
matched. Any idea what difference between my 3.3 and 3.6 environments
would cause these inconsistent results?

/select/?q=title:Featuring\ Chimp


