You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Jacky Yin (Jira)" <ji...@apache.org> on 2021/06/10 01:12:00 UTC

[jira] [Commented] (CALCITE-4645) In Elasticsearch adapter, a range predicate should be translated to a range query

    [ https://issues.apache.org/jira/browse/CALCITE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360479#comment-17360479 ] 

Jacky Yin commented on CALCITE-4645:
------------------------------------

This case is similar with CALCITE-4606 but not same. 

For SEARCH RexCall, there are three subtypes according to the Sarg value included. 
 # Sarg isPoints. (map to In ('a', 'b', 'c'))
 # Sarg isComplementedPoints. (map to Not in ('a', 'b', 'c'))
 # Sarg is real Range set. (map to (f1 > 10 and f1 < 20) or (f1 > 30 and f1 < 40)).

For the 1) and 2) cases, they can be translated into terms query of es.

And for the 3) case, it should be translated into range query of es. 

In CALCITE-4606, I implemented the cases 1) and 2).  This Jira is for tracking the implementation of case 3).

> In Elasticsearch adapter, a range predicate should be translated to a range query
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-4645
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4645
>             Project: Calcite
>          Issue Type: Bug
>          Components: elasticsearch-adapter
>    Affects Versions: 1.27.0
>            Reporter: Jacky Yin
>            Priority: Critical
>
> SQL patterns like below will be converted to Search in range which is not supported in the Elasticsearch adapter. 
> {code:java}
> select count(*) from view where (f1 > 10 and f1 < 20) or (f1 > 30 and f1 < 40))
> select * from view where f1 > 10 and f1 < 20
> select * from view where f1 < 10 or f1 > 20 {code}
> Here is one example of the query plan change history for SQL
> {code:java}
>  select count(*) from view where val1 >= 10 and val1 <=20{code}
> *Original:*
> {noformat}
>  LogicalAggregate(group=[{}], EXPR$0=[COUNT()]): rowcount = 1.0, cumulative cost = {136.125 rows, 810.1 cpu, 0.0 io}, id = 14
>   LogicalFilter(condition=[AND(>=($5, 10), <=($5, 20))]): rowcount = 25.0, cumulative cost = {135.0 rows, 810.1 cpu, 0.0 io}, id = 12
>     LogicalProject(cat1=[ITEM($0, 'cat1')], cat2=[ITEM($0, 'cat2')], cat3=[ITEM($0, 'cat3')], cat4=[ITEM($0, 'cat4')], cat5=[ITEM($0, 'cat5')], val1=[ITEM($0, 'val1')], val2=[ITEM($0, 'val2')]): rowcount = 100.0, cumulative cost = {110.0 rows, 710.1 cpu, 0.0 io}, id = 11
>       ElasticsearchTableScan(table=[[elastic, aggs]]): rowcount = 100.0, cumulative cost = {10.0 rows, 10.100000000000001 cpu, 0.0 io}, id = 10{noformat}
>  
> *Simplified*:
> {noformat}
> LogicalAggregate(group=[{}], EXPR$0=[COUNT()]): rowcount = 2.5, cumulative cost = {2.8125 rows, 0.0 cpu, 0.0 io}, id = 32
>   LogicalFilter(subset=[rel#31:RelSubset#2.NONE.[]], condition=[SEARCH($0, Sarg[[10..20]])]): rowcount = 25.0, cumulative cost = {25.0 rows, 100.0 cpu, 0.0 io}, id = 30
>     LogicalProject(subset=[rel#29:RelSubset#1.NONE.[]], val1=[ITEM($0, 'val1')]): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io}, id = 28
>       ElasticsearchTableScan(subset=[rel#27:RelSubset#0.ELASTICSEARCH.[]], table=[[elastic, aggs]]): rowcount = 100.0, cumulative cost = {10.0 rows, 10.100000000000001 cpu, 0.0 io}, id = 10
>  {noformat}
>  
> *Optimized by the rules in es adapter:*
> {noformat}
> ElasticsearchToEnumerableConverter: rowcount = 1.0, cumulative cost = {12.7125 rows, 20.200000000000003 cpu, 0.0 io}, id = 84
>   ElasticsearchAggregate(group=[{}], EXPR$0=[COUNT()]): rowcount = 1.0, cumulative cost = {12.6125 rows, 20.1 cpu, 0.0 io}, id = 83
>     ElasticsearchFilter(condition=[SEARCH(ITEM($0, 'val1'), Sarg[[10..20]])]): rowcount = 25.0, cumulative cost = {12.5 rows, 20.1 cpu, 0.0 io}, id = 82
>       ElasticsearchTableScan(table=[[elastic, aggs]]): rowcount = 100.0, cumulative cost = {10.0 rows, 10.100000000000001 cpu, 0.0 io}, id = 10
>  {noformat}
>  
> The LogicalFilter(condition=*[AND(>=($5, 10), <=($5, 20))])* is converted to LogicalFilter(subset=[rel#31:RelSubset#2.NONE.[]], condition=*[SEARCH($0, Sarg[[10..20]])]*) which is not supported in the predicate analyzer of elastic search adapter.
>  
> Below is the exception output
> {noformat}
>  org.apache.calcite.adapter.elasticsearch.PredicateAnalyzer$PredicateAnalyzerException: Unsupported call: [SEARCH(ITEM($0, 'val1'), Sarg[[10..20]])]org.apache.calcite.adapter.elasticsearch.PredicateAnalyzer$PredicateAnalyzerException: Unsupported call: [SEARCH(ITEM($0, 'val1'), Sarg[[10..20]])] at org.apache.calcite.adapter.elasticsearch.PredicateAnalyzer$Visitor.visitCall(PredicateAnalyzer.java:238) at org.apache.calcite.adapter.elasticsearch.PredicateAnalyzer$Visitor.visitCall(PredicateAnalyzer.java:127) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.calcite.adapter.elasticsearch.PredicateAnalyzer.analyze(PredicateAnalyzer.java:112) at org.apache.calcite.adapter.elasticsearch.ElasticsearchFilter$PredicateAnalyzerTranslator.translateMatch(ElasticsearchFilter.java:102) at org.apache.calcite.adapter.elasticsearch.ElasticsearchFilter.implement(ElasticsearchFilter.java:67) at org.apache.calcite.adapter.elasticsearch.ElasticsearchRel$Implementor.visitChild(ElasticsearchRel.java:129) at org.apache.calcite.adapter.elasticsearch.ElasticsearchAggregate.implement(ElasticsearchAggregate.java:128) at org.apache.calcite.adapter.elasticsearch.ElasticsearchRel$Implementor.visitChild(ElasticsearchRel.java:129) at org.apache.calcite.adapter.elasticsearch.ElasticsearchToEnumerableConverter.implement(ElasticsearchToEnumerableConverter.java:67) at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:111) ... 77 more{noformat}
>  
> Based on the query types in elastic search, search in range should be translated to range query of ES. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)