Posted to issues@calcite.apache.org by "Liya Fan (Jira)" <ji...@apache.org> on 2022/03/19 09:22:00 UTC
[jira] [Closed] (CALCITE-4292) Wrong results in ElasticSearch when query contains NOT EQUAL
[ https://issues.apache.org/jira/browse/CALCITE-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Liya Fan closed CALCITE-4292.
-----------------------------
Resolved in release 1.30.0 (2022-03-20)
> Wrong results in ElasticSearch when query contains NOT EQUAL
> ------------------------------------------------------------
>
> Key: CALCITE-4292
> URL: https://issues.apache.org/jira/browse/CALCITE-4292
> Project: Calcite
> Issue Type: Bug
> Components: elasticsearch-adapter
> Reporter: Shlok Srivastava
> Assignee: Bill Neil
> Priority: Major
> Labels: ElasticSearch, NotEquals, QueryBuilder, calcite, pull-request-available
> Fix For: 1.30.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently the Elasticsearch adapter ignores records that do not have the field named in a NOT EQUAL clause, although Elasticsearch itself would include them.
> *Query:*
> {code:java}
> SELECT * FROM zips WHERE name <> 'NMAX'{code}
> The Elasticsearch query generated for the above is:
> {code:java}
> {
>   "query": {
>     "constant_score": {
>       "filter": {
>         "bool": {
>           "must": {
>             "exists": {
>               "field": "Name"
>             }
>           },
>           "must_not": {
>             "term": {
>               "Name": "NMQAX"
>             }
>           }
>         }
>       }
>     }
>   }
> }
> {code}
> *Problem*: The above query drops documents that do not have the _Name_ field. Elasticsearch would normally include them, but they are excluded by the must/exists condition.
> *Data:*
> {noformat}
> { "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 ], "pop" : 65046, "state" : "MA" }
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, "state" : "MA" }
> {noformat}
>
>
> *Expected result:*
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, "state" : "MA" }
> {noformat}
>
> *Current Result:*
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], "pop" : 57871, "state" : "MA" }
> {noformat}
> RelNode for the same query:
> {code:java}
> relB.not(relB.equals(relB.literal("Name"), relB.literal("NMQAX"))){code}
>
> *Solution*: Remove the exists condition from the NOT EQUAL query expression. Elasticsearch does not add this condition itself, so removing it keeps the adapter's results in sync with Elasticsearch. [Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]
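The difference between the two filter shapes can be illustrated with a small, self-contained Java sketch. It does not use Calcite or an Elasticsearch client. The class and method names below are hypothetical, the documents are reduced to their _id and name fields, and a term query is approximated by exact string equality, which is a simplification of how Elasticsearch matches terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Calcite code base.
public class NotEqualFilterSketch {

    /** The three sample documents from the report, reduced to _id and name. */
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("01701", "NMAX"));
        docs.add(doc("02154", "NORTH WALTHAM"));
        docs.add(doc("02401", null)); // this document has no name field
        return docs;
    }

    /** Builds a document; a null name means the field is absent. */
    static Map<String, String> doc(String id, String name) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("_id", id);
        if (name != null) {
            d.put("name", name);
        }
        return d;
    }

    /** Models bool { must: exists(field), must_not: term(field, value) }. */
    static List<String> withExists(List<Map<String, String>> docs,
                                   String field, String value) {
        List<String> ids = new ArrayList<>();
        for (Map<String, String> d : docs) {
            // The document must contain the field AND must not match the value.
            if (d.containsKey(field) && !value.equals(d.get(field))) {
                ids.add(d.get("_id"));
            }
        }
        return ids;
    }

    /** Models bool { must_not: term(field, value) } with no exists clause. */
    static List<String> withoutExists(List<Map<String, String>> docs,
                                      String field, String value) {
        List<String> ids = new ArrayList<>();
        for (Map<String, String> d : docs) {
            // A document without the field cannot match the term, so it is kept.
            if (!value.equals(d.get(field))) {
                ids.add(d.get("_id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = sampleDocs();
        System.out.println(withExists(docs, "name", "NMAX"));    // [02154]
        System.out.println(withoutExists(docs, "name", "NMAX")); // [02154, 02401]
    }
}
```

Under these assumptions, the first filter returns only 02154 (the current result in the report), and the second also returns 02401 (the expected result).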
--
This message was sent by Atlassian Jira
(v8.20.1#820001)