You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "2010YOUY01 (via GitHub)" <gi...@apache.org> on 2023/05/19 19:56:05 UTC

[GitHub] [arrow-datafusion] 2010YOUY01 commented on issue #6133: panic in round() math function

2010YOUY01 commented on issue #6133:
URL: https://github.com/apache/arrow-datafusion/issues/6133#issuecomment-1555164645

   Here is a more detailed description of the alternative fix:
   Currently, 2-args round function has a signature `round(num: Float, decimal_places: Int64) -> Float`, the underlying std library implementation of `round()` math functions can only handle 32-bit decimal places. I think changing the `decimal_places` type to `Int32` in the function signature can better solve this panic issue for these reasons:
   1. Better error handling:
   Suppose there is a batch of data for Int64 decimal places, for the error-in-execution approach, one bad value can kill the whole query. The alternative approach can make sure there won't be any execution error, potential overflow will be caught during binding.
   ```sql
   CREATE TABLE round_data (c1 BIGINT);
   INSERT INTO round_data (c1) VALUES (1), (2), (2147483648);
   SELECT round(3.1415, c1) from round_data;
   ```
   2. There are 3 other unimplemented Postgres math functions that have arguments with `Int32` type (for example 2-args `trunc()`), they don't have to worry about the same overflow issue when implementing.
   
   Potential solution:
   1. Change function argument to `Int32` in function definition and implementation 
   2. Since all scalar integer literals(like this 2147483648 in `select round(3.14, 2147483648);` ) are by default bind to `Int64` type inside datafusion, we can cast them when binding functions [here](https://github.com/apache/arrow-datafusion/blob/dd5e1dbbfd20539b40ae65acb8883f7e392cba92/datafusion/sql/src/expr/function.rs#L53), and throw planning error if overflows.
   (If by default parse integer literal into `Int32`, there will be hundreds of test fails to fix, not so doable. Then I think handling this edge case during planning is better than during execution.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org