Posted to issues@calcite.apache.org by "Liya Fan (Jira)" <ji...@apache.org> on 2022/03/19 09:22:00 UTC

[jira] [Closed] (CALCITE-4292) Wrong results in ElasticSearch when query contains NOT EQUAL

     [ https://issues.apache.org/jira/browse/CALCITE-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liya Fan closed CALCITE-4292.
-----------------------------

Resolved in release 1.30.0 (2022-03-20)

> Wrong results in ElasticSearch when query contains NOT EQUAL
> ------------------------------------------------------------
>
>                 Key: CALCITE-4292
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4292
>             Project: Calcite
>          Issue Type: Bug
>          Components: elasticsearch-adapter
>            Reporter: Shlok Srivastava
>            Assignee: Bill Neil
>            Priority: Major
>              Labels: ElasticSearch, NotEquals, QueryBuilder, calcite, pull-request-available
>             Fix For: 1.30.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently the Elasticsearch adapter ignores records that do not have the field specified in a NOT EQUAL clause, although Elasticsearch itself would include them.
> *Query:*
> {code:java}
> SELECT * FROM zips WHERE name <> 'NMAX'{code}
> The Elasticsearch query generated for the above is:
> {code:java}
> {
>   "query": {
>     "constant_score": {
>       "filter": {
>         "bool": {
>           "must": {
>             "exists": {
>               "field": "name"
>             }
>           },
>           "must_not": {
>             "term": {
>               "name": "NMAX"
>             }
>           }
>         }
>       }
>     }
>   }
> }
> {code}
> *Problem*: The above query ignores documents that do not have the _name_ field. Elasticsearch itself would include them for a not-equal condition, but the exists condition under must filters them out.
> *Data:*
> {noformat}
> { "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 ], "pop" : 65046, "state" : "MA" }
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, "state" : "MA" }
> {noformat}
>  
>  
>  *Expected result:* 
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, "state" : "MA" }
> {noformat}
>  
>  *Current Result:*
> {noformat}
>  { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> {noformat}
> RelNode for the same query:
> {code:java}
> relB.not(relB.equals(relB.literal("name"), relB.literal("NMAX"))){code}
>  
> *Solution*: Remove the exists condition from the not-equals query expression. Elasticsearch itself does not add this condition, so removing it keeps the adapter's results in sync with Elasticsearch. [Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]
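
The difference between the two filter shapes can be illustrated with a minimal sketch. This is not the adapter's code and does not call Elasticsearch; it only models the two bool filters in plain Java over in-memory maps, using the three sample documents from the report reduced to their _id and name fields. The class and method names (NotEqualFilterSketch, mustExistsAndMustNot, mustNotOnly) are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NotEqualFilterSketch {
  /** Sample documents from the report, reduced to _id and name (assumption: other fields are irrelevant here). */
  static final List<Map<String, String>> DOCS = List.of(
      Map.of("_id", "01701", "name", "NMAX"),
      Map.of("_id", "02154", "name", "NORTH WALTHAM"),
      Map.of("_id", "02401")); // this document has no "name" field

  /** Models bool { must: exists(field), must_not: term(field, value) }. */
  static List<String> mustExistsAndMustNot(String field, String value) {
    List<String> ids = new ArrayList<>();
    for (Map<String, String> doc : DOCS) {
      // The exists clause drops every document that lacks the field.
      if (doc.containsKey(field) && !value.equals(doc.get(field))) {
        ids.add(doc.get("_id"));
      }
    }
    return ids;
  }

  /** Models bool { must_not: term(field, value) } with no exists clause. */
  static List<String> mustNotOnly(String field, String value) {
    List<String> ids = new ArrayList<>();
    for (Map<String, String> doc : DOCS) {
      // A missing field cannot match the term, so the document is kept.
      if (!value.equals(doc.get(field))) {
        ids.add(doc.get("_id"));
      }
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(mustExistsAndMustNot("name", "NMAX")); // [02154]
    System.out.println(mustNotOnly("name", "NMAX"));          // [02154, 02401]
  }
}
```

The first method reproduces the current result (only 02154); the second reproduces the expected result, which also includes 02401, the document without a name field.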



--
This message was sent by Atlassian Jira
(v8.20.1#820001)