Posted to issues@calcite.apache.org by "ZheHu (Jira)" <ji...@apache.org> on 2021/11/19 14:38:00 UTC

[jira] [Commented] (CALCITE-2689) ES Adapter. Grouping on date / number fields fails

    [ https://issues.apache.org/jira/browse/CALCITE-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446516#comment-17446516 ] 

ZheHu commented on CALCITE-2689:
--------------------------------

Hi [~julianhyde], [~sereda]. While working on a fix for [CALCITE-4868|https://issues.apache.org/jira/browse/CALCITE-4868?filter=-1], I noticed something interesting about this issue.
Take the Integer type as an example: a missing value is replaced by Integer.MIN_VALUE.
{code:java}
doc1 = {"int_field1":1, "int_field2": -2147483648}
doc2 = {"int_field1":2}
{code}
When a query groups by "int_field2", doc1 and doc2 end up in the same group, because the value substituted for doc2's missing field equals doc1's actual value. Such cases are rare, but they may confuse users and be mistaken for a bug.
I did some research on how ES scripts could aggregate missing values and, unfortunately, found no solution.
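
The collision described above can be reproduced without Elasticsearch. The following is only an in-memory sketch, not the adapter's code: the class {{MissingValueCollision}}, the constant {{MISSING_INT}} and the helpers {{groupBy}} and {{sampleDocs}} are hypothetical names, and the grouping is a plain Java map standing in for a terms aggregation whose missing value is Integer.MIN_VALUE.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MissingValueCollision {
  /** Hypothetical stand-in for the sentinel substituted for a missing integer field. */
  static final int MISSING_INT = Integer.MIN_VALUE;

  /** Groups document names by the value of {@code field}, using the sentinel when the field is absent. */
  static Map<Integer, List<String>> groupBy(Map<String, Map<String, Integer>> docs, String field) {
    Map<Integer, List<String>> groups = new LinkedHashMap<>();
    for (Map.Entry<String, Map<String, Integer>> doc : docs.entrySet()) {
      // A document without the field gets the same key as one that really stores MIN_VALUE.
      int key = doc.getValue().getOrDefault(field, MISSING_INT);
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(doc.getKey());
    }
    return groups;
  }

  /** Builds the two documents from the example above. */
  static Map<String, Map<String, Integer>> sampleDocs() {
    Map<String, Integer> doc1 = new LinkedHashMap<>();
    doc1.put("int_field1", 1);
    doc1.put("int_field2", Integer.MIN_VALUE);
    Map<String, Integer> doc2 = new LinkedHashMap<>();
    doc2.put("int_field1", 2); // int_field2 is missing here

    Map<String, Map<String, Integer>> docs = new LinkedHashMap<>();
    docs.put("doc1", doc1);
    docs.put("doc2", doc2);
    return docs;
  }

  public static void main(String[] args) {
    // prints {-2147483648=[doc1, doc2]}
    System.out.println(groupBy(sampleDocs(), "int_field2"));
  }
}
```

Both documents land under the single key -2147483648, so once grouped there is no way to tell the stored value from the substituted one. A test case along these lines could document the limitation.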

So maybe we should add some test cases to document this potential problem/limitation. What do you think?

> ES Adapter. Grouping on date / number fields fails
> --------------------------------------------------
>
>                 Key: CALCITE-2689
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2689
>             Project: Calcite
>          Issue Type: Improvement
>          Components: elasticsearch-adapter
>            Reporter: Andrei Sereda
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: 1.18.0
>
>
> For [Terms Aggregation|https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html], the missing value must have the same type as the group key:
> {code:sql}
> select max(amount), date from orders group by date -- date column is of type date (in ES)
> {code}
> Currently a single (text) key, {{__MISSING__}}, is used, which fails when grouping on non-string fields (e.g. dates, numbers or booleans).
> When using the {{missing}} value, the query converter should take the field type into account.
> This logic should be reviewed once we migrate to [composite aggregations|https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html] (available since [6.1|https://www.elastic.co/guide/en/elasticsearch/reference/6.1/release-notes-6.1.0.html]; see PR [26800|https://github.com/elastic/elasticsearch/pull/26800]).


