You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "jonahgao (via GitHub)" <gi...@apache.org> on 2023/10/22 15:40:04 UTC

[PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

jonahgao opened a new pull request, #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904

   ## Which issue does this PR close?
   
   N/A
   
   ## Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   ## What changes are included in this PR?
   - Support `Decimal256` for the `abs` function.
   - The `abs` function of `Decimal128` checks for overflow.
   - Minor fixes in the math sql logical test.
   
   ## Are these changes tested?
   Yes
   
   ## Are there any user-facing changes?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368160834


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {
+                        ArrowError::InvalidArgumentError(format!(
+                            "Invalid {} value {} when calculating abs",
+                            stringify!($ARRAY_TYPE),
+                            x

Review Comment:
   Looks like this is unnecessary if we use `wrapping_abs`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368162811


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   That's said, if you assume that input decimal array could be "invalid", we will have this kind of check in every operator/expressions. šŸ˜„ 



##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   That's said, if you assume that input decimal array could be "invalid", we will have this kind of check in every operator/expressions. šŸ˜„ 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368159697


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   If so, this `checked_abs` looks unnecessary. Maybe just `wrapping_abs`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1367940364


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   Could you add a test for overflow case?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368200017


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   @viirya 
   Not all operators/expressions require this. Overflow in `abs` here is a bug, while others like `sum` may overflow even if inputs are valid.
   
   I have considered using `wrapping_abs`, but Iā€™m worried that if invalid values are introduced elsewhere, they might be hidden and not easily detected. 
   
   It seems to be a trade-off between performance and safety, but for me, both are ok.
   
   I will follow the suggestion and change it to `wrapping_abs`.



##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   @viirya 
   Not all operators/expressions require this. Overflow in `abs` here is a bug, while others like `sum` may overflow even if inputs are valid.
   
   I have considered using `wrapping_abs`, but Iā€™m worried that if invalid values are introduced elsewhere, they might be hidden and not easily detected. 
   
   It seems to be a trade-off between performance and safety, but for me, both are ok.
   
   I will follow the suggestion and change it to `wrapping_abs`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368048094


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {
+                        ArrowError::InvalidArgumentError(format!(
+                            "Invalid {} value {} when calculating abs",
+                            stringify!($ARRAY_TYPE),
+                            x

Review Comment:
   @viirya I would like to emphasize that the input is illegal for the same reason mentioned above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1367940470


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644

Review Comment:
   This comment looks useless.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368046836


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644

Review Comment:
   Thank you @viirya. Removed since the linked issue has already been closed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368046407


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   @viirya The `abs` function only panics on `MIN`, which is not a valid decimal number unless there is a bug somewhere.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1367940607


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {
+                        ArrowError::InvalidArgumentError(format!(
+                            "Invalid {} value {} when calculating abs",
+                            stringify!($ARRAY_TYPE),
+                            x

Review Comment:
   ```suggestion
                               "{} value {} overflows when calculating abs",
                               stringify!($ARRAY_TYPE),
                               x
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368214057


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
   It is not a bug...If you feed invalid data, the operators/expressions won't perform correctly.
   
   I mean, if you worry about `abs`'s input could be invalid decimal values, you will worry other operator/expression takes invalid input too. It doesn't make sense.
   
   If you have a decimal input with invalid values (for example, it is out of the representation range of the precision/scale), some operators/expressions will perform wrongly. But it is not a reason to add such check to all operators/expressions to verify if input is valid or not. It means that you will have to validate inputs of each operator/expression.
   
   I think `wrapping_abs` is enough here.
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on code in PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#discussion_r1368221509


##########
datafusion/physical-expr/src/math_expressions.rs:
##########
@@ -701,6 +701,29 @@ macro_rules! make_try_abs_function {
     }};
 }
 
+// Decimal types should keep the same precision and scale by using `with_data_type()`.
+// https://github.com/apache/arrow-rs/issues/4644
+macro_rules! make_decimal_abs_function {
+    ($ARRAY_TYPE:ident) => {{
+        |args: &[ArrayRef]| {
+            let array = downcast_arg!(&args[0], "abs arg", $ARRAY_TYPE);
+            let res: $ARRAY_TYPE = array
+                .try_unary(|x| {
+                    // Valid Decimal values should never overflow
+                    x.checked_abs().ok_or_else(|| {

Review Comment:
    Thank you @viirya, it makes sense to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat: support `Decimal256` for the `abs` function [arrow-datafusion]

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.
jonahgao commented on PR #7904:
URL: https://github.com/apache/arrow-datafusion/pull/7904#issuecomment-1774317629

   > Perhaps its better to use `plan_err` or `exec_err!` instead of `ArrowError`
   
   Thank you @comphead, but the `try_unary` requires the returned error type to be `ArrowError`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org