You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Andrei Sereda <an...@sereda.cc> on 2018/12/27 00:22:45 UTC

"single or null" aggregation function for elastic

Hello,

I’m looking for a way to simulate “single or null” (on distinct values)
aggregation function in elastic.
Example of a query

-- for multiple distinct values return nullselect date,
single_value(value) from table group by date

Some Options

   1. ANY_VALUE. For multiple values returns one of them (I need null).
   2. SINGLE_VALUE. For multiple (or empty) values throws exception (I need
   null).
   3. COLLECT / JSON_ARRAYAGG (see below).
   4. Manually add column COUNT(distinct ...) and post-process the result.

COLLECT / JSON_ARRAYAGG

select date, collect(distinct value) from table group by date

This query might potentially work but the problem with elastic is that it
doesn’t return all values by default (similar issue to scrolling. see size
<https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size>).
Currently, scrolling is not possible with aggregations.
Therefore I have to impose usage of DISTINCT / LIMIT with COLLECT :

-- enforce DISTINCT and LIMITselect date, collect(distinct value limit
2) from table group by date

Do you think it is reasonable to enforce such restriction to make it work
with elastic ?
Adding COUNT function

One can append count(distinct value) aggregation and check if it is equal
to 1.

-- use combination of ANY_VALUE and COUNTselect date,
any_value(value), count(distinct value) from table group by date

What do you think ?

Regards,
Andrei.

Re: "single or null" aggregation function for elastic

Posted by Andrei Sereda <an...@sereda.cc>.
Thanks for the hint, Julian.
Unfortunately min / max aggregations work only on numeric types
<https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html>
in elastic.

On Thu, Dec 27, 2018 at 1:28 AM Julian Hyde <jh...@gmail.com> wrote:

> case when max(x) = min(x) then max(x) end
>
> Julian
>
> > On Dec 26, 2018, at 16:22, Andrei Sereda <an...@sereda.cc> wrote:
> >
> > Hello,
> >
> > I’m looking for a way to simulate “single or null” (on distinct values)
> > aggregation function in elastic.
> > Example of a query
> >
> > -- for multiple distinct values return nullselect date,
> > single_value(value) from table group by date
> >
> > Some Options
> >
> >   1. ANY_VALUE. For multiple values returns one of them (I need null).
> >   2. SINGLE_VALUE. For multiple (or empty) values throws exception (I
> need
> >   null).
> >   3. COLLECT / JSON_ARRAYAGG (see below).
> >   4. Manually add column COUNT(distinct ...) and post-process the result.
> >
> > COLLECT / JSON_ARRAYAGG
> >
> > select date, collect(distinct value) from table group by date
> >
> > This query might potentially work but the problem with elastic is that it
> > doesn’t return all values by default (similar issue to scrolling. see
> size
> > <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size
> >).
> > Currently, scrolling is not possible with aggregations.
> > Therefore I have to impose usage of DISTINCT / LIMIT with COLLECT :
> >
> > -- enforce DISTINCT and LIMITselect date, collect(distinct value limit
> > 2) from table group by date
> >
> > Do you think it is reasonable to enforce such restriction to make it work
> > with elastic ?
> > Adding COUNT function
> >
> > One can append count(distinct value) aggregation and check if it is equal
> > to 1.
> >
> > -- use combination of ANY_VALUE and COUNTselect date,
> > any_value(value), count(distinct value) from table group by date
> >
> > What do you think ?
> >
> > Regards,
> > Andrei.
>

Re: "single or null" aggregation function for elastic

Posted by Julian Hyde <jh...@gmail.com>.
case when max(x) = min(x) then max(x) end 

Julian

> On Dec 26, 2018, at 16:22, Andrei Sereda <an...@sereda.cc> wrote:
> 
> Hello,
> 
> I’m looking for a way to simulate “single or null” (on distinct values)
> aggregation function in elastic.
> Example of a query
> 
> -- for multiple distinct values return nullselect date,
> single_value(value) from table group by date
> 
> Some Options
> 
>   1. ANY_VALUE. For multiple values returns one of them (I need null).
>   2. SINGLE_VALUE. For multiple (or empty) values throws exception (I need
>   null).
>   3. COLLECT / JSON_ARRAYAGG (see below).
>   4. Manually add column COUNT(distinct ...) and post-process the result.
> 
> COLLECT / JSON_ARRAYAGG
> 
> select date, collect(distinct value) from table group by date
> 
> This query might potentially work but the problem with elastic is that it
> doesn’t return all values by default (similar issue to scrolling. see size
> <https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size>).
> Currently, scrolling is not possible with aggregations.
> Therefore I have to impose usage of DISTINCT / LIMIT with COLLECT :
> 
> -- enforce DISTINCT and LIMITselect date, collect(distinct value limit
> 2) from table group by date
> 
> Do you think it is reasonable to enforce such restriction to make it work
> with elastic ?
> Adding COUNT function
> 
> One can append count(distinct value) aggregation and check if it is equal
> to 1.
> 
> -- use combination of ANY_VALUE and COUNTselect date,
> any_value(value), count(distinct value) from table group by date
> 
> What do you think ?
> 
> Regards,
> Andrei.