Posted to issues@lucene.apache.org by "KOUTA SETSU (Jira)" <ji...@apache.org> on 2020/03/03 06:19:00 UTC
[jira] [Updated] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases

     [ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

KOUTA SETSU updated SOLR-14300:
-------------------------------
    Description: 
In some specific cases, conditional clauses on an unindexed field are ignored by the query parser.
 * For a query like q=A:1 OR B:1 OR A:2 OR B:2,
 if field B is not indexed (but has docValues="true"), "B:1" will be lost.
  
 * But if you write the query as q=A:1 OR A:2 OR B:1 OR B:2,
 it works correctly.

The only difference between the two queries is the order of the clauses:
 one is *ABAB*, the other is *AABB*.
 

*Steps to reproduce*
 You can easily reproduce this problem on a Solr collection with the _default configset and the exampledocs/books.csv data.
 # Create a collection with the _default configset.
{code:java}
bin/solr create -c books -s 2 -rf 2{code}
 # Post books.csv.
{code:java}
bin/post -c books example/exampledocs/books.csv{code}
 # Run the following queries.
 ** query1: [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)&debug=query]
 ** query2: [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)&debug=query]
 ** The parsed queries in the two debug outputs differ.
 *** query1 ("name_str:Foundation" is lost):
{code:json}
 "debug":{
     "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
     "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
     "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
     "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
     "QParser":"LuceneQParser"}}{code}
 *** query2 ("name_str:Foundation" is not lost):
{code:json}
   "debug":{
     "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
     "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
     "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])))",
     "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
     "QParser":"LuceneQParser"}{code}
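
The name_str terms appear in the debug output as hex-encoded bytes, which makes the two parsed queries hard to compare by eye. The following is a minimal sketch that decodes them and reports which clauses are missing from query1. The regular expression and helper names are assumptions based only on the output format shown above; the two strings are copied from the "parsedquery" values.

```python
import re

# Matches one range clause as printed in the debug output above, e.g.
# name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]
RANGE_RE = re.compile(r"(\w+):\[\[([0-9a-f ]+)\] TO \[([0-9a-f ]+)\]\]")


def decode_hex(term):
    """Turn a space-separated hex byte string into text (assumes UTF-8)."""
    return bytes.fromhex(term.replace(" ", "")).decode("utf-8")


def range_clauses(parsedquery):
    """Return (field, value) pairs for ranges whose lower and upper bounds match."""
    found = set()
    for field, lower, upper in RANGE_RE.findall(parsedquery):
        if lower == upper:
            found.add((field, decode_hex(lower)))
    return found


QUERY1 = (
    "+(cat:book cat:cd "
    "(name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))"
)
QUERY2 = (
    "+(cat:book cat:cd "
    "((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) "
    "(name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])))"
)

# Clauses present in query2 but absent from query1.
missing = range_clauses(QUERY2) - range_clauses(QUERY1)
print(sorted(missing))  # [('name_str', 'Foundation')]
```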


> Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14300
>                 URL: https://issues.apache.org/jira/browse/SOLR-14300
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 7.3.1
>         Environment: Solr 7.3.1 
> centos7.5
>            Reporter: KOUTA SETSU
>            Priority: Minor


