Posted to github@arrow.apache.org by "NGA-TRAN (via GitHub)" <gi...@apache.org> on 2023/11/14 20:19:45 UTC
[I] Count distinct with date_part/date_bin does not work [arrow-datafusion]
NGA-TRAN opened a new issue, #8175:
URL: https://github.com/apache/arrow-datafusion/issues/8175
### Describe the bug
After IOx recently upgraded DataFusion, queries that combine `COUNT(DISTINCT ...)` with a `date_bin`/`date_part` group-by expression fail in the optimizer.
### To Reproduce
After some investigation, here is a reproducer in the DataFusion CLI:
```SQL
create table t1(state string, city string, min_temp float, area int, time timestamp) as values
('MA', 'Boston', 70.4, 1, 50),
('MA', 'Bedford', 71.59, 2, 150);
select date_part('year', time) as bla, count(distinct state) as count from t1 group by bla;
-- Optimizer rule 'single_distinct_aggregation_to_group_by' failed caused by Schema error: No field named "date_part(Utf8(""year""),t1.time)". Valid fields are group_alias_0, "COUNT(DISTINCT t1.state)".
-- this query has the same issue
select date_bin(interval '1 year', time) as bla, count(distinct state) as count from t1 group by bla;
```
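For context, the failing rule's name suggests it rewrites a single `COUNT(DISTINCT ...)` into two nested `GROUP BY`s, and the error indicates the outer query still refers to the original `date_part(...)` expression instead of the inner alias `group_alias_0`. The sketch below illustrates that kind of rewrite by hand. It is not DataFusion's implementation: it assumes Python's built-in `sqlite3` as a stand-in engine, uses `strftime('%Y', ...)` in place of `date_part('year', ...)`, and the column alias `alias1` is hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for DataFusion.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t1(state text, city text, min_temp real, area int, time int)"
)
conn.executemany(
    "insert into t1 values (?, ?, ?, ?, ?)",
    [("MA", "Boston", 70.4, 1, 50), ("MA", "Bedford", 71.59, 2, 150)],
)

# Direct form: one aggregate with COUNT(DISTINCT ...).
direct = conn.execute(
    """
    select strftime('%Y', time, 'unixepoch') as bla,
           count(distinct state) as cnt
    from t1
    group by bla
    """
).fetchall()

# Hand-written two-level form: the inner query deduplicates
# (group key, state) pairs, and the outer query counts them.
# The outer query must use the inner alias (group_alias_0),
# not the original expression.
rewritten = conn.execute(
    """
    select group_alias_0 as bla, count(alias1) as cnt
    from (
        select strftime('%Y', time, 'unixepoch') as group_alias_0,
               state as alias1
        from t1
        group by group_alias_0, alias1
    )
    group by group_alias_0
    """
).fetchall()

# Both forms should agree on the sample data.
assert direct == rewritten
print(direct)
```

If the outer query selected the original expression rather than `group_alias_0`, the lookup would fail in the same way as the schema error reported above.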
### Expected behavior
Both queries should run successfully and return results.
### Additional context
After I backed out https://github.com/apache/arrow-datafusion/commit/15d8c9bf48a56ae9de34d18becab13fd1942dc4a locally, both queries work.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Count distinct with date_part/date_bin does not work [arrow-datafusion]
Posted by "NGA-TRAN (via GitHub)" <gi...@apache.org>.
NGA-TRAN commented on issue #8175:
URL: https://github.com/apache/arrow-datafusion/issues/8175#issuecomment-1811188501
I am working on two PRs:
1. Reverting https://github.com/apache/arrow-datafusion/commit/15d8c9bf48a56ae9de34d18becab13fd1942dc4a
2. Adding the queries above as tests
Re: [I] Count distinct with date_part/date_bin does not work [arrow-datafusion]
Posted by "NGA-TRAN (via GitHub)" <gi...@apache.org>.
NGA-TRAN commented on issue #8175:
URL: https://github.com/apache/arrow-datafusion/issues/8175#issuecomment-1811185436
CC @alamb @haohuaijin
Re: [I] Regression: Count distinct with date_part/date_bin does not work [arrow-datafusion]
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #8175: Regression: Count distinct with date_part/date_bin does not work
URL: https://github.com/apache/arrow-datafusion/issues/8175