You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/14 06:10:23 UTC
[GitHub] [arrow-datafusion] ZuoTiJia opened a new issue, #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
ZuoTiJia opened a new issue, #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832
**Describe the bug**
Paniced at to_timestamp_micros function.
**To Reproduce**
```shell
DataFusion CLI v13.0.0
❯ SELECT to_timestamp_micros(9065525203050843594);
thread 'main' panicked at 'invalid or out-of-range datetime',
.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.22/src/naive/datetime/mod.rs:131:18
```
**Additional context**
DataFusion 13.0
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] avantgardnerio commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
avantgardnerio commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1284317925
When chrono:0.4.23 is out we can kill all these:
```
ScalarValue::IntervalDayTime(Some(i)) => add_day_time(prior, *i, sign),
ScalarValue::IntervalYearMonth(Some(i)) => shift_months(prior, *i * sign),
ScalarValue::IntervalMonthDayNano(Some(i)) => add_m_d_nano(prior, *i, sign),
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] avantgardnerio commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
avantgardnerio commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1284318636
Presently we can't remove them yet because there is no Timestamp + Months, only Date + months in 0.4.22.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Jefffrey commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
Jefffrey commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1344947912
Issue seems to be resolved now, doesn't panic but returns error:
```bash
DataFusion CLI v15.0.0
❯ SELECT to_timestamp_micros(9065525203050843594);
+-----------------------------------------------+
| totimestampmicros(Int64(9065525203050843594)) |
+-----------------------------------------------+
| ERROR CONVERTING DATE |
+-----------------------------------------------+
1 row in set. Query took 0.002 seconds.
❯
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1345235026
Thank you for checking @Jefffrey
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] retikulum commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
retikulum commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1279461383
Hi. I think this problem is about [chronos timestamp_nano function](https://github.com/chronotope/chrono/blob/a62da032f94b8553d8860d777ed0166b5f8396af/src/offset/mod.rs#L397) because it uses i64. Datafusion convert microsecond to nanosecond and passed it to timestamp_nano function. `9065525203050843594` microsecond is `9065525203050843594000` and it doesn't fit into i64 and overflowed. You can also see an example in [playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=6156756059d862af06621e9cf6478bbd). I guess the function call stack is followed:
- [to_timestamp_micros](https://github.com/apache/arrow-datafusion/blob/c59a76754e7e45063cb51c273f2a53aa325676a4/datafusion/physical-expr/src/datetime_expressions.rs#L152)
- [string_to_timestamp_nanos_shim](https://github.com/apache/arrow-datafusion/blob/c59a76754e7e45063cb51c273f2a53aa325676a4/datafusion/physical-expr/src/datetime_expressions.rs#L129)
- [arrow's string_to_timestamp_nanos](https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/cast_utils.rs#L69)
- [chrono's timestamp_nanos](https://github.com/chronotope/chrono/blob/a62da032f94b8553d8860d777ed0166b5f8396af/src/offset/mod.rs#L397)
It seems interesting case but I am not sure about the solution.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282976820
> I think it is not releated to this issue but datafusion uses `from_timestamp` function in following line:
>
> https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304
>
> So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?
@retikulum yes, i think we need to fix it as well.
@alamb @avantgardnerio
do you think we should eventually move these datetime arithmetic functions into arrow-rs?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] retikulum commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
retikulum commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282285012
Hi. I am asking this because I want to understand the codebase better. I build datafusion-cli with debug, run ` SELECT to_timestamp_micros(9065525203050843594);` and get the error. However part of the call stack is shown as follows:
```
19: 0x7ff66b198d9e - chrono::naive::datetime::NaiveDateTime::from_timestamp
at .cargo\registry\src\github.com-1ecc6299db9ec823\chrono-0.4.22\src\naive\datetime\mod.rs:131
20: 0x7ff66b1d14db - arrow_array::temporal_conversions::timestamp_us_to_datetime
at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:123
21: 0x7ff66b1d3bb5 - arrow_array::temporal_conversions::as_datetime<arrow_array::types::TimestampMicrosecondType>
at i\.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:181
22: 0x7ff66a9be6ea - arrow_array::array::primitive_array::PrimitiveArray<arrow_array::types::TimestampMicrosecondType>::value_as_datetime<arrow_array::types::TimestampMicrosecondType>
at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\array\primitive_array.rs:496
```
So arrow is using `from_timestamp` function in [temporal_conversions](https://github.com/apache/arrow-rs/blob/02cab5443c0703ce5fa86647b834f184bba172ba/arrow-array/src/temporal_conversions.rs#L123) which means there is a need for a change in this part of the code.
I think it is not releated to this issue but datafusion uses `from_timestamp` function in following line:https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304
So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1285521384
> do you think we should eventually move these datetime arithmetic functions into arrow-rs?
Yes, I do think eventually the date/time arithmetic should be moved into arrow-rs kernels
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282014768
i submitted a pr here apache/arrow-rs#2892
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1281993464
https://github.com/chronotope/chrono/blob/main/src/naive/datetime/mod.rs#L117
`from_timestamp` will be deprecated since 0.4.23.
i think we could change it to `from_timestamp_opt` and notify the user it's out-of-range
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb closed issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
Posted by GitBox <gi...@apache.org>.
alamb closed issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
URL: https://github.com/apache/arrow-datafusion/issues/3832
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org