You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "cyrilou242 (via GitHub)" <gi...@apache.org> on 2023/12/04 17:56:11 UTC
[I] DATETIMECONVERT output type is incorrect when grouping by the column on which the function is applied [pinot]
cyrilou242 opened a new issue, #12091:
URL: https://github.com/apache/pinot/issues/12091
## Problem observed
`DATETIMECONVERT(date_col, config...)` does not return the correct type when `date_col` is used in the `GROUP BY`clause.
**The issue may not be specific to the DATETIMECONVERT transform, but I have not tested with other transforms.**
## Example
Using DATETIMECONVERT to transform a string column in an EPOCH format, the expected output type is LONG
```
SELECT
DATETIMECONVERT("date", '1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd', '1:MILLISECONDS:EPOCH', '1:DAYS') AS ts,
FROM pageviews
LIMIT 10
```
returns
```
"dataSchema": {
"columnNames": [
"ts"
],
"columnDataTypes": [
"LONG"
]
},
"rows": [
[
1580601600000
],
...
```
The `ts` column datatype is `LONG` ✅
When we use the ts column to group by, the column datatype is still a long
```
SELECT
DATETIMECONVERT("date", '1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd', '1:MILLISECONDS:EPOCH', '1:DAYS') AS ts,
count(*) AS met
FROM pageviews
GROUP BY ts
LIMIT 10
```
Returns
```
"dataSchema": {
"columnNames": [
"ts",
"met"
],
"columnDataTypes": [
"LONG",
"LONG"
]
},
"rows": [
[
1581638400000,
1800
],
...
```
The `ts` column datatype is `LONG` ✅
But when grouping by the transformed column directly, the type is different:
```
SELECT
DATETIMECONVERT("date", '1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd', '1:MILLISECONDS:EPOCH', '1:DAYS') AS ts,
count(*) AS met
FROM pageviews
GROUP BY "date" -- change is here
LIMIT 10
```
(notice: `GROUP by "date"`)
returns
```
"dataSchema": {
"columnNames": [
"ts",
"met"
],
"columnDataTypes": [
"STRING",
"LONG"
]
},
"rows": [
[
"1581206400000",
1800
],
...
```
--> `ts` column is of type `STRING` ⚠️
**Bug**: It looks like the returned type is the type of of the input time column instead of the type of the output.
The data to reproduce is available here:
https://github.com/startreedata/thirdeye/tree/master/examples/pageviews
If this helps: in most recent versions - or maybe it depends on the type of the original column - the incorrect type column can also be of type `OBJECT`.
I can provide an internal online env to reproduce if this helps.
## Versions:
The issue can be reproduced in `0.12.1`, `1.1.0-ST.19.3` and `1.1.0-ST.19.5`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [I] DATETIMECONVERT output type is incorrect when grouping by the column on which the function is applied [pinot]
Posted by "cyrilou242 (via GitHub)" <gi...@apache.org>.
cyrilou242 commented on issue #12091:
URL: https://github.com/apache/pinot/issues/12091#issuecomment-1839272926
from @Jackie-Jiang :
> the real problem is that our scalar function (not implementing TransformFunction) is not able to return different type based on the argument, and the post-aggregation (e.g. group by date and return the result of a function on date) is using the scalar function instead of TransformFunction. This is a known issue, and we are exploring a fix
closing as duplicate.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [I] DATETIMECONVERT output type is incorrect when grouping by the column on which the function is applied [pinot]
Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on issue #12091:
URL: https://github.com/apache/pinot/issues/12091#issuecomment-1839375710
related to https://github.com/apache/pinot/issues/11734
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [I] DATETIMECONVERT output type is incorrect when grouping by the column on which the function is applied [pinot]
Posted by "cyrilou242 (via GitHub)" <gi...@apache.org>.
cyrilou242 closed issue #12091: DATETIMECONVERT output type is incorrect when grouping by the column on which the function is applied
URL: https://github.com/apache/pinot/issues/12091
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org