Posted to issues@arrow.apache.org by "Andrew Lamb (Jira)" <ji...@apache.org> on 2021/04/09 12:59:00 UTC

[jira] [Created] (ARROW-12312) [Rust][DataFusion] COUNT DISTINCT not supported for `Float64`

Andrew Lamb created ARROW-12312:
-----------------------------------

             Summary: [Rust][DataFusion] COUNT DISTINCT not supported for `Float64`
                 Key: ARROW-12312
                 URL: https://issues.apache.org/jira/browse/ARROW-12312
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust - DataFusion
            Reporter: Andrew Lamb


Running a `COUNT(DISTINCT ..)` query on a float column causes a panic with the following error:

thread 'tokio-runtime-worker' panicked at 'Unexpected DataType for list', datafusion/src/scalar.rs:342:22

Reproducer:
{code}
 echo "foo,1.23" > /tmp/foo.csv
 ./target/debug/datafusion-cli

> CREATE EXTERNAL TABLE t (a varchar, b float) STORED AS CSV LOCATION '/tmp/foo.csv';
0 rows in set. Query took 0 seconds.
> select count(distinct a) from t;
+-------------------+
| COUNT(DISTINCT a) |
+-------------------+
| 1                 |
+-------------------+
1 rows in set. Query took 0 seconds.
> select count(distinct b) from t;
thread 'tokio-runtime-worker' panicked at 'Unexpected DataType for list', datafusion/src/scalar.rs:342:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
ArrowError(ExternalError(Canceled))
{code}
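For context on why float columns can need separate handling when counting distinct values: Rust's `f64` implements neither `Eq` nor `Hash`, so it cannot be placed in a `HashSet` directly. The following is a minimal sketch of one common workaround, hashing the bit pattern obtained from `f64::to_bits`. It is not DataFusion's implementation and is not claimed to be the cause of the panic above; the helper name `count_distinct_f64` and the treatment of `-0.0` and `NaN` are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Count distinct non-null values in a column of optional f64 values.
/// Hypothetical helper for illustration only; not part of DataFusion.
fn count_distinct_f64(values: &[Option<f64>]) -> usize {
    // f64 is not Hash/Eq, so store the u64 bit pattern instead.
    let mut seen: HashSet<u64> = HashSet::new();
    // `flatten` skips None entries (SQL NULLs are not counted).
    for v in values.iter().flatten() {
        // Assumption: treat -0.0 and 0.0 as the same value.
        let normalized = if *v == 0.0 { 0.0 } else { *v };
        // Note: NaNs with different bit patterns would count separately.
        seen.insert(normalized.to_bits());
    }
    seen.len()
}

fn main() {
    let column = [Some(1.23), Some(1.23), None, Some(4.5), Some(-0.0), Some(0.0)];
    println!("{}", count_distinct_f64(&column));
}
```

Comparing bit patterns keeps the sketch dependency-free; a real engine would also have to decide how `NaN` values should be grouped.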


