Posted to issues@lucene.apache.org by "Chris Schneider (Jira)" <ji...@apache.org> on 2019/10/22 17:36:00 UTC

[jira] [Comment Edited] (SOLR-6376) Edismax field alias bug

    [ https://issues.apache.org/jira/browse/SOLR-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957233#comment-16957233 ] 

Chris Schneider edited comment on SOLR-6376 at 10/22/19 5:35 PM:
-----------------------------------------------------------------

I want to add more details: I ran into this, and it was annoying to track down because the query looked correct. I'm not sure exactly what triggers it, but it does seem to require multiple fields.

The steps below reproduce it. I tested on 7.7.2.

1. Define a couple of fields. They can be anything; I made them text_general with the names *field_1* and *field_2*.

2. Index a couple documents:

 
{noformat}
{   
"field_1":"some field",   
"field_2":"some other field" 
},
{   
"field_1":"2.0",   
"field_2":"~" 
}{noformat}
 

3. Run a query with edismax:
{code:java}
defType=edismax
q=field_1:query^2.0 OR field_2:query~2 OR alias:query
f.alias.qf=invalid_field
qf=field_1 field_2
stopwords=true{code}
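
For anyone who wants to script steps 2 and 3, here is a minimal Python sketch that builds the two HTTP requests. The base URL, the collection name `test`, and the extra `debugQuery=true` parameter are assumptions, not part of the setup above. Nothing is sent until the results are passed to `urllib.request.urlopen`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a local Solr with a collection named "test" (hypothetical).
SOLR = "http://localhost:8983/solr/test"

# The two documents from step 2.
DOCS = [
    {"field_1": "some field", "field_2": "some other field"},
    {"field_1": "2.0", "field_2": "~"},
]

# The request parameters from step 3.
PARAMS = {
    "defType": "edismax",
    "q": "field_1:query^2.0 OR field_2:query~2 OR alias:query",
    "f.alias.qf": "invalid_field",
    "qf": "field_1 field_2",
    "stopwords": "true",
    "debugQuery": "true",  # assumption: added so the parsed query is returned
}


def build_update_request(docs):
    """Build a POST to the JSON update handler that indexes and commits docs."""
    return Request(
        SOLR + "/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_url(params):
    """Build the /select URL for the edismax query, URL-encoding each value."""
    return SOLR + "/select?" + urlencode(params)
```

Building the query string with `urlencode` avoids hand-escaping the `^`, `~`, and spaces in `q`.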
 

The query returns:

 
{noformat}
"response":{
   "numFound":1,
   "start":0,
   "docs":[
{
   "field_1":["2.0"],       
   "field_2":["~"],       
   "id":"2c5f0f77-5c19-42b0-ad20-367187ed13ac", 
   "_version_":1648113233049944064   
}
]
{noformat}
 

Debug response (note that the boost 2.0, the fuzzy distance 2, and the OR operator are all parsed as terms):

 
{noformat}
"parsedquery":"+(field_1:query field_1:2.0 DisjunctionMaxQuery((field_1:or | field_2:or)) field_2:query field_2:2 DisjunctionMaxQuery((field_1:or | field_2:or)))", 
 "parsedquery_toString":"+(field_1:query field_1:2.0 (field_1:or | field_2:or) field_2:query field_2:2 (field_1:or | field_2:or))",
  {noformat}
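
Since the query itself looks correct, one way to spot this is to scan the debug output for boolean operators that ended up as terms. The following is only a rough sketch: the function name is hypothetical, and it would also flag a query that legitimately searches for the word "or" or "and".

```python
import re

# Matches field:term pairs where the term is a boolean operator word,
# e.g. "field_1:or" or "id:AND".
OPERATOR_AS_TERM = re.compile(r"\b(\w+):(and|or|not)\b", re.IGNORECASE)


def operator_terms(parsed_query):
    """Return the sorted, de-duplicated field:operator clauses in a parsed query."""
    matches = OPERATOR_AS_TERM.findall(parsed_query)
    return sorted({f"{field}:{term}" for field, term in matches})


# The parsedquery_toString value from the debug response above.
parsed = ("+(field_1:query field_1:2.0 (field_1:or | field_2:or) "
          "field_2:query field_2:2 (field_1:or | field_2:or))")
print(operator_terms(parsed))  # ['field_1:or', 'field_2:or']
```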
 


> Edismax field alias bug
> -----------------------
>
>                 Key: SOLR-6376
>                 URL: https://issues.apache.org/jira/browse/SOLR-6376
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 4.6.1, 4.7, 4.7.2, 4.8, 4.9, 4.10.1
>            Reporter: Thomas Egense
>            Priority: Minor
>              Labels: difficulty-easy, edismax, impact-low
>         Attachments: SOLR-6376.patch, SOLR-6376.patch
>
>
> If you create a field alias that maps to a nonexistent field, the query will be parsed to utter garbage.
> The bug can be reproduced very easily. Add the following line to the /browse request handler in the tutorial example solrconfig.xml:
> <str name="f.name_features.qf">name features XXX</str>
> (XXX is a nonexistent field)
> This simple query will actually work correctly: 
> name_features:video
> and it will be parsed to  (features:video | name:video) and return 3 results. It has simply discarded the nonexistent field and the result set is correct.
> However if you change the query to:
> name_features:video AND name_features:video
> you will now get 0 results and the query is parsed to 
> +(((features:video | name:video) (id:AND^10.0 | author:and^2.0 | title:and^10.0 | cat:AND^1.4 | text:and^0.5 | keywords:and^5.0 | manu:and^1.1 | description:and^5.0 | resourcename:and | name:and^1.2 | features:and) (features:video | name:video))~3)
> Notice the AND operator is now used as a term! The parsed query can turn out even worse and produce query parts such as:
> title:2~2    
> title:and^2.0^10.0  
> Preferred solution: During startup, shut down Solr if there is a nonexistent field alias, just as is the case if the cycle detection detects a cycle.
> Acceptable solution: Ignore the nonexistent field totally.
> Thomas Egense
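
The two proposed behaviours in the description can be illustrated with a small Python sketch. This is not Solr's code and not the attached patch; `check_aliases`, its arguments, and the `strict` flag are hypothetical names, and it assumes the set of schema fields is known up front (dynamic fields are ignored).

```python
def check_aliases(aliases, schema_fields, strict=True):
    """Validate alias -> target-field mappings against the known schema fields.

    strict=True  mirrors the preferred solution: fail fast on a bad alias.
    strict=False mirrors the acceptable solution: drop nonexistent targets.
    """
    cleaned = {}
    for alias, targets in aliases.items():
        missing = [t for t in targets if t not in schema_fields]
        if missing and strict:
            raise ValueError(
                f"alias {alias!r} maps to nonexistent field(s): {missing}"
            )
        # Keep only the targets that exist, preserving their order.
        cleaned[alias] = [t for t in targets if t in schema_fields]
    return cleaned


# The alias from the description: XXX is the nonexistent field.
aliases = {"name_features": ["name", "features", "XXX"]}
fields = {"name", "features"}
print(check_aliases(aliases, fields, strict=False))
# {'name_features': ['name', 'features']}
```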


