You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Andres de la Peña (Jira)" <ji...@apache.org> on 2022/11/25 14:14:00 UTC

[jira] [Updated] (CASSANDRA-18073) AVG CQL function of an empty set of values returns zero

     [ https://issues.apache.org/jira/browse/CASSANDRA-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andres de la Peña updated CASSANDRA-18073:
------------------------------------------
    Description: 
The CQL native aggregate function {{avg}} returns zero when it's applied to an empty set of values:
{code:java}
> CREATE TABLE t (k int PRIMARY KEY, v int);
> SELECT avg(v) FROM t;

 system.avg(v)
---------------
             0
{code}
The new {{collection_avg}} that has just been added by CASSANDRA-18060 is based on the {{avg}} implementation, so both are consistent. Thus, it will also return zero for an empty collection:
{code:java}
> CREATE TABLE t (k int PRIMARY KEY, v frozen<set<int>>);
> INSERT INTO t (k,v) VALUES(1, {});
> SELECT collection_avg(v) FROM t;

 system.collection_avg(v)
--------------------------
                        0
{code}
I think these functions should probably better return {{NaN}} instead of zero.

However, returning zero is not terribly incorrect, and returning {{NaN}} might be problematic for backward compatibility.

Also, to further complicate things, {{BigInteger}} and {{BigDecimal}} don't have a {{NaN}} value, so {{avg}} for {{varint}} and {{decimal}} should have a different behaviour, such as:
 * Keep returning zero
 * Return {{null}}
 * Throw an exception
 

  was:
The CQL native aggregate function {{avg}} returns zero when it's applied to an empty set of values:
{code:java}
> CREATE TABLE t (k int PRIMARY KEY, v int);
> SELECT avg(v) FROM t;

 system.avg(v)
---------------
             0
{code}
The {{collection_avg}} that is about to be added by CASSANDRA-18060 is based on the {{avg}} implementation, so both are consistent. Thus, it will also return zero for an empty collection:
{code:java}
> CREATE TABLE t (k int PRIMARY KEY, v frozen<set<int>>);
> INSERT INTO t (k,v) VALUES(1, {});
> SELECT collection_avg(v) FROM t;

 system.collection_avg(v)
--------------------------
                        0
{code}
I think these functions should probably better return {{NaN}} instead of zero.

However, returning zero is not terribly incorrect, and returning {{NaN}} might be problematic for backward compatibility.

Also, to further complicate things, {{BigInteger}} and {{BigDecimal}} don't have a {{NaN}} value, so {{avg}} for {{varint}} and {{decimal}} should have a different behaviour, such as:
 * Keep returning zero
 * Return {{null}}
 * Throw an exception
 


> AVG CQL function of an empty set of values returns zero
> -------------------------------------------------------
>
>                 Key: CASSANDRA-18073
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18073
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL/Semantics
>            Reporter: Andres de la Peña
>            Priority: Normal
>
> The CQL native aggregate function {{avg}} returns zero when it's applied to an empty set of values:
> {code:java}
> > CREATE TABLE t (k int PRIMARY KEY, v int);
> > SELECT avg(v) FROM t;
>  system.avg(v)
> ---------------
>              0
> {code}
> The new {{collection_avg}} that has just been added by CASSANDRA-18060 is based on the {{avg}} implementation, so both are consistent. Thus, it will also return zero for an empty collection:
> {code:java}
> > CREATE TABLE t (k int PRIMARY KEY, v frozen<set<int>>);
> > INSERT INTO t (k,v) VALUES(1, {});
> > SELECT collection_avg(v) FROM t;
>  system.collection_avg(v)
> --------------------------
>                         0
> {code}
> I think these functions should probably better return {{NaN}} instead of zero.
> However, returning zero is not terribly incorrect, and returning {{NaN}} might be problematic for backward compatibility.
> Also, to further complicate things, {{BigInteger}} and {{BigDecimal}} don't have a {{NaN}} value, so {{avg}} for {{varint}} and {{decimal}} should have a different behaviour, such as:
>  * Keep returning zero
>  * Return {{null}}
>  * Throw an exception
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org