You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/14 06:10:23 UTC

[GitHub] [arrow-datafusion] ZuoTiJia opened a new issue, #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

ZuoTiJia opened a new issue, #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832

   **Describe the bug**
   Paniced at to_timestamp_micros function.
   
   **To Reproduce**
   
   ```shell
   DataFusion CLI v13.0.0
   ❯ SELECT to_timestamp_micros(9065525203050843594);
   thread 'main' panicked at 'invalid or out-of-range datetime', 
   .cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.22/src/naive/datetime/mod.rs:131:18
   ```
   
   **Additional context**
   DataFusion 13.0
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
avantgardnerio commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1284317925

   When chrono:0.4.23 is out we can kill all these:
   
   ```
           ScalarValue::IntervalDayTime(Some(i)) => add_day_time(prior, *i, sign),
           ScalarValue::IntervalYearMonth(Some(i)) => shift_months(prior, *i * sign),
           ScalarValue::IntervalMonthDayNano(Some(i)) => add_m_d_nano(prior, *i, sign),
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
avantgardnerio commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1284318636

   Presently we can't remove them yet because there is no Timestamp + Months, only Date + months in 0.4.22.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Jefffrey commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
Jefffrey commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1344947912

   Issue seems to be resolved now, doesn't panic but returns error:
   
   ```bash
   DataFusion CLI v15.0.0
   ❯ SELECT to_timestamp_micros(9065525203050843594);
   +-----------------------------------------------+
   | totimestampmicros(Int64(9065525203050843594)) |
   +-----------------------------------------------+
   | ERROR CONVERTING DATE                         |
   +-----------------------------------------------+
   1 row in set. Query took 0.002 seconds.
   ❯
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1345235026

   Thank you for checking @Jefffrey 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] retikulum commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
retikulum commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1279461383

   Hi. I think this problem is about [chronos timestamp_nano function](https://github.com/chronotope/chrono/blob/a62da032f94b8553d8860d777ed0166b5f8396af/src/offset/mod.rs#L397) because it uses i64. Datafusion convert microsecond to nanosecond and passed it to timestamp_nano function. `9065525203050843594` microsecond is `9065525203050843594000` and it doesn't fit into i64 and overflowed. You can also see an example in [playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=6156756059d862af06621e9cf6478bbd). I guess the function call stack is followed:
   
   - [to_timestamp_micros](https://github.com/apache/arrow-datafusion/blob/c59a76754e7e45063cb51c273f2a53aa325676a4/datafusion/physical-expr/src/datetime_expressions.rs#L152)
   - [string_to_timestamp_nanos_shim](https://github.com/apache/arrow-datafusion/blob/c59a76754e7e45063cb51c273f2a53aa325676a4/datafusion/physical-expr/src/datetime_expressions.rs#L129)
   - [arrow's string_to_timestamp_nanos](https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/cast_utils.rs#L69)
   - [chrono's timestamp_nanos](https://github.com/chronotope/chrono/blob/a62da032f94b8553d8860d777ed0166b5f8396af/src/offset/mod.rs#L397)
   
   It seems interesting case but I am not sure about the solution.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282976820

   > I think it is not releated to this issue but datafusion uses `from_timestamp` function in following line:
   > 
   > https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304
   > 
   > So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?
   
   @retikulum yes, i think we need to fix it as well.
   
   @alamb @avantgardnerio 
   do you think we should eventually move these datetime arithmetic functions into arrow-rs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] retikulum commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
retikulum commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282285012

   Hi. I am asking this because I want to understand the codebase better. I build datafusion-cli with debug, run ` SELECT to_timestamp_micros(9065525203050843594);` and get the error. However part of the call stack is shown as follows:
   
   ```
     19:     0x7ff66b198d9e - chrono::naive::datetime::NaiveDateTime::from_timestamp
                                  at .cargo\registry\src\github.com-1ecc6299db9ec823\chrono-0.4.22\src\naive\datetime\mod.rs:131
     20:     0x7ff66b1d14db - arrow_array::temporal_conversions::timestamp_us_to_datetime
                                  at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:123
     21:     0x7ff66b1d3bb5 - arrow_array::temporal_conversions::as_datetime<arrow_array::types::TimestampMicrosecondType>
                                  at i\.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:181
     22:     0x7ff66a9be6ea - arrow_array::array::primitive_array::PrimitiveArray<arrow_array::types::TimestampMicrosecondType>::value_as_datetime<arrow_array::types::TimestampMicrosecondType>
                                  at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\array\primitive_array.rs:496
   ```
   So arrow is using `from_timestamp` function in [temporal_conversions](https://github.com/apache/arrow-rs/blob/02cab5443c0703ce5fa86647b834f184bba172ba/arrow-array/src/temporal_conversions.rs#L123) which means there is a need for a change in this part of the code. 
   
   I think it is not releated to this issue but datafusion uses `from_timestamp` function in following line:https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304
   
    So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1285521384

   > do you think we should eventually move these datetime arithmetic functions into arrow-rs?
   
   Yes, I do think eventually the date/time arithmetic should be moved into arrow-rs kernels


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1282014768

   i submitted a pr here apache/arrow-rs#2892


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] waitingkuo commented on issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
waitingkuo commented on issue #3832:
URL: https://github.com/apache/arrow-datafusion/issues/3832#issuecomment-1281993464

   https://github.com/chronotope/chrono/blob/main/src/naive/datetime/mod.rs#L117
   
   `from_timestamp` will be deprecated since 0.4.23.
   i think we could change it to `from_timestamp_opt` and notify the user it's out-of-range 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.

Posted by GitBox <gi...@apache.org>.
alamb closed issue #3832: Paniced at to_timestamp_micros function when the timestamp is too large.
URL: https://github.com/apache/arrow-datafusion/issues/3832


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org