Posted to solr-user@lucene.apache.org by Steven White <sw...@gmail.com> on 2019/08/07 23:32:02 UTC

Lower case "or" is being treated as operator OR?

Hi everyone,

My schema is set up to index all words (no stop words such as "or", "and",
etc. are removed).  My default operator is AND.  But when I search for "one
or two" (without the quotes as this is not a phrase search) I'm getting
hits on documents that have either "one" or "two".  It has the same effect
as if I searched for "one OR two".  Any idea why?

Where should I look to see what's causing this issue?  What part of my
schema or request handler do you need to see?

In case this helps.  Searching for just "or" or "OR" (with or without
quotes) gives me the same set of hits and ranking.  The same is also true
for "and" or "AND".

Thanks.

Steven

Re: Lower case "or" is being treated as operator OR?

Posted by Steven White <sw...@gmail.com>.
Hi Chris,

I was able to fix the issue by adding the line
<str name="lowercaseOperators">false</str> to my request handler.  Here is
what my request handler looks like:

  <requestHandler name="/select_test" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="defType">edismax</str>
      <str name="q.alt">*:*</str>
      <str name="rows">100</str>
      <str name="indent">true</str>
      <str name="fl">CC_UNIQUE_FIELD,CC_FILE_PATH,score</str>
      <str name="qf">CC_ALL_FIELDS_DATA</str>
      <str name="wt">xml</str>
      <str name="lowercaseOperators">false</str>
    </lst>
  </requestHandler>

So I am all set.  However, earlier you said  "lowercaseOperators" is set to
"false" by default for 8.1.  Looks like that's not the case.

Thanks.

Steven



On Wed, Aug 7, 2019 at 8:26 PM Chris Hostetter <ho...@fucit.org>
wrote:

>
> : I think by "what query parser" you mean this:
>
> no, that's the fieldType -- what i was referring to is that you are in fact
> using "edismax", but with solr 8.1 lowercaseOperators should default to
> "false", so my initial guess is probably wrong.
>
> : By "request parameter" I think you are asking what I'm sending to Solr?  If
> : so, I'm sending it the raw text of "or" or "OR".  In case you mean my
> : request-handler, it is this:
>
> i mean all of it -- including any other request params your client may be
> sending to solr that overrides those defaults you just posted.
>
> the best thing to do to make sense of this is add
> "echoParams=all" and "debug=true" to your request, and show us the
> full response, along with some details of what docs in that result you
> don't expect to match, so we can look at:
>
> 1) what params come back in the responseHeader, so we can sanity check
> exactly what query string(s) are getting sent to solr, and that
> nothing is overriding lowercaseOperators, etc...
>
> 2) what comes back in the query debug section, so we can sanity check how
> your query strings are getting parsed
>
> 3) what the "explain" output looks like for those docs you are getting
> that you don't expect, so we can see why they matched.
>
>
> FWIW: you mentioned "My default operator is AND" ... but that's not
> visible in the requestHandler defaults you posted -- so where is it being
> set?  (maybe it's not being set like you think it is?)
>
>
>
> -Hoss
> http://www.lucidworks.com/
>

Re: Lower case "or" is being treated as operator OR?

Posted by Chris Hostetter <ho...@fucit.org>.
: I think by "what query parser" you mean this:

no, that's the fieldType -- what i was referring to is that you are in fact
using "edismax", but with solr 8.1 lowercaseOperators should default to 
"false", so my initial guess is probably wrong.

: By "request parameter" I think you are asking what I'm sending to Solr?  If
: so, I'm sending it the raw text of "or" or "OR".  In case you mean my
: request-handler, it is this:

i mean all of it -- including any other request params your client may be 
sending to solr that overrides those defaults you just posted.

the best thing to do to make sense of this is add 
"echoParams=all" and "debug=true" to your request, and show us the 
full response, along with some details of what docs in that result you 
don't expect to match, so we can look at:

1) what params come back in the responseHeader, so we can sanity check 
exactly what query string(s) are getting sent to solr, and that 
nothing is overriding lowercaseOperators, etc...

2) what comes back in the query debug section, so we can sanity check how 
your query strings are getting parsed

3) what the "explain" output looks like for those docs you are getting
that you don't expect, so we can see why they matched.
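
One way to collect that output without changing the client is a throwaway
handler whose defaults turn on both parameters.  This is a sketch under
assumptions, not something configured anywhere in this thread: the handler
name "/select_debug" is invented for the example, while defType and qf are
copied from the handler posted in the thread.

```xml
<!-- Hypothetical diagnostic handler; "/select_debug" is an invented name. -->
<requestHandler name="/select_debug" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">CC_ALL_FIELDS_DATA</str>
    <!-- echo every parameter in effect, including handler defaults -->
    <str name="echoParams">all</str>
    <!-- include the parsed-query and explain sections in the response -->
    <str name="debug">true</str>
  </lst>
</requestHandler>
```

Because these are only defaults, any value the client sends for the same
parameter still wins, and that kind of override is what the echoed params
in the responseHeader would reveal.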


FWIW: you mentioned "My default operator is AND" ... but that's not 
visible in the requestHandler defaults you posted -- so where is it being 
set?  (maybe it's not being set like you think it is?)
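
For illustration only: if the default operator were set in the handler
defaults, it would normally show up as a "q.op" entry next to the others,
roughly as sketched here.  The handler name and field are reused from
elsewhere in this thread; the q.op line is the hypothetical part, since
none of the posted configs contain it.

```xml
<!-- Sketch only: q.op is NOT in any config posted in this thread;
     it is added here to show where a default operator would appear. -->
<requestHandler name="/select_test" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">CC_ALL_FIELDS_DATA</str>
    <!-- hypothetical: make AND the default operator for this handler -->
    <str name="q.op">AND</str>
  </lst>
</requestHandler>
```

With echoParams=all, a q.op set this way would be listed among the params
in the responseHeader; if it is missing there, the operator is being set
somewhere else, or not at all.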



-Hoss
http://www.lucidworks.com/

Re: Lower case "or" is being treated as operator OR?

Posted by Steven White <sw...@gmail.com>.
Hi Chris,

This is on Solr 8.1.1

I think by "what query parser" you mean this:

  <fieldType name="test_text" class="solr.TextField"
             autoGeneratePhraseQueries="true" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory"
              catenateNumbers="1" generateNumberParts="1"
              stemEnglishPossessive="1" splitOnCaseChange="0"
              generateWordParts="1" splitOnNumerics="1"
              preserveOriginal="1" catenateAll="1" catenateWords="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

By "request parameter" I think you are asking what I'm sending to Solr?  If
so, I'm sending it the raw text of "or" or "OR".  In case you mean my
request-handler, it is this:

{"requestHandler":{"/select_hcl":{
      "class":"solr.SearchHandler",
      "name":"/select_hcl",
      "defaults":{
        "defType":"edismax",
        "echoParams":"explicit",
        "fl":"CC_UNIQUE_FIELD,CC_FILE_PATH,score",
        "indent":"true",
        "qf":"CC_ALL_FIELDS_DATA",
        "rows":"100",
        "wt":"xml"}}}}

Yes, I'm using edismax.  So if that's what's causing it, how do I tell it
to treat only uppercase OR and AND as operators?

Thanks in advance.

Steven


On Wed, Aug 7, 2019 at 7:39 PM Chris Hostetter <ho...@fucit.org>
wrote:

>
> what version of solr?
> what query parser are you using?
> what do all of your request params (including defaults) look like?
>
> it's possible you are seeing the effects of edismax's "lowercaseOperators"
> param, which _should_ default to "false" in modern solr, but
> in very old versions it defaulted to "true" (in spite of what the docs at
> the time said)...
>
>
> https://lucene.apache.org/solr/guide/8_1/the-extended-dismax-query-parser.html
> https://issues.apache.org/jira/browse/SOLR-4646
>
>
> -Hoss
> http://www.lucidworks.com/
>

Re: Lower case "or" is being treated as operator OR?

Posted by Chris Hostetter <ho...@fucit.org>.
what version of solr?
what query parser are you using?
what do all of your request params (including defaults) look like?

it's possible you are seeing the effects of edismax's "lowercaseOperators" 
param, which _should_ default to "false" in modern solr, but 
in very old versions it defaulted to "true" (in spite of what the docs at
the time said)...

https://lucene.apache.org/solr/guide/8_1/the-extended-dismax-query-parser.html
https://issues.apache.org/jira/browse/SOLR-4646


-Hoss
http://www.lucidworks.com/