You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/08 08:17:46 UTC

[GitHub] [arrow] edponce opened a new pull request #10274: Arrow 12685 compute add unary absolute value kernel

edponce opened a new pull request #10274:
URL: https://github.com/apache/arrow/pull/10274


   This PR adds the arithmetic absolute value kernels to the compute layer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630690033



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1057,5 +1058,109 @@ TYPED_TEST(TestUnaryArithmeticFloating, Negate) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1, null, -10]", "[1, null, 10]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0, -0]", "[0, 0]");
+    this->AssertUnaryOp(AbsoluteValue, -0, 0);
+    this->AssertUnaryOp(AbsoluteValue, 0, 0);
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[1, 10, 127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, 1, 1);
+    this->AssertUnaryOp(AbsoluteValue, this->MakeScalar(1), this->MakeScalar(1));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[-1, -10, -127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, -1, 1);
+    this->AssertUnaryOp(AbsoluteValue, MakeArray(-1), "[1]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+    if (check_overflow) {
+      this->AssertUnaryOpRaises(AbsoluteValue, MakeArray(min), "overflow");
+    } else {
+      this->AssertUnaryOp(AbsoluteValue, min, min);
+    }
+  }
+
+  // Overflow should not be checked on underlying value slots when output would be null
+  this->SetOverflowCheck(true);
+  auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, max, min));
+  arg = TweakValidityBit(arg, 1, false);
+  arg = TweakValidityBit(arg, 2, false);
+  this->AssertUnaryOp(AbsoluteValue, arg, "[1, null, null]");
+}
+
+TYPED_TEST(TestUnaryArithmeticUnsigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty arrays
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Array with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Ordinary arrays
+    this->AssertUnaryOp(AbsoluteValue, "[0, 1, 10, 127]", "[0, 1, 10, 127]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, min, min);
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+  }
+}
+
+TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::lowest();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1.3, null, -10.80]", "[1.3, null, 10.80]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalars/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0.0, -0.0]", "[0.0, 0.0]");
+    this->AssertUnaryOp(AbsoluteValue, -0.0F, 0.0F);

Review comment:
       As `+0.0 == -0.0`, looks this test doesn't cover the purpose.
   `std::signbit` does returns 1 for -0.0, and 0 or 0.0.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-838763550


   I named the compute function as `AbsoluteValue` and kernels as "absolute value" but this feels like too long a name. Convention across other libraries is "abs" but Arrow's compute kernels tend to follow the longer format and "absolute" seems ambiguous. @bkietz please advise


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r631529736



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -79,22 +79,20 @@ struct AbsoluteValue {
 
   template <typename T, typename Arg>
   static constexpr enable_if_signed_integer<T> Call(KernelContext*, Arg arg, Status* st) {
-    return (arg < static_cast<T>(0)) ? arrow::internal::SafeSignedNegate(arg) : arg;
+    return std::abs(arg);
   }
 };
 
 struct AbsoluteValueChecked {
   template <typename T, typename Arg>
   static enable_if_signed_integer<T> Call(KernelContext*, Arg arg, Status* st) {
     static_assert(std::is_same<T, Arg>::value, "");
-    if (arg < static_cast<T>(0)) {
-      T result = 0;
-      if (ARROW_PREDICT_FALSE(NegateWithOverflow(arg, &result))) {
-        *st = Status::Invalid("overflow");
-      }
-      return result;
+    T result = std::abs(arg);
+    if (result < 0) {
+      *st = Status::Invalid("overflow");
+      return 0;

Review comment:
       Ok, I had noticed that other kernels returned 0 on error (e.g. divide, power) and thought it might be a convention. But I agree that the invalid value should be returned.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] github-actions[bot] commented on pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-838700289


   https://issues.apache.org/jira/browse/ARROW-12685


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630827515



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -160,6 +160,18 @@ struct ARROW_EXPORT ProjectOptions : public FunctionOptions {
 
 /// @}
 
+/// \brief Get the absolute value of a value. Array values can be of arbitrary
+/// length. If argument is null the result will be null.
+///
+/// \param[in] arg the value transformed
+/// \param[in] options arithmetic options (overflow handling), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the elementwise absolute value

Review comment:
       values?

##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -47,6 +47,7 @@ namespace compute {
     return CallFunction(func_name, {arg}, ctx);                                        \
   }
 
+SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "absolute_value", "absolute_value_checked")

Review comment:
       "abs" and "abs_checked" perhaps?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r632209010



##########
File path: docs/source/cpp/compute.rst
##########
@@ -260,6 +260,10 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | Function name            | Arity      | Input types        | Output type         |
 +==========================+============+====================+=====================+
+| abs                      | Unary      | Numeric            | Numeric             |

Review comment:
       We also have to add compute functions manually to python api doc
   https://github.com/apache/arrow/blob/master/docs/source/python/api/compute.rst

##########
File path: cpp/src/arrow/util/int_util_internal.h
##########
@@ -63,9 +63,9 @@ OPS_WITH_OVERFLOW(DivideWithOverflow, div)
 #undef OP_WITH_OVERFLOW
 #undef OPS_WITH_OVERFLOW
 
-// Define functions NegateWithOverflow with the signature `bool(T u, T* out)`
-// where T is a signed integer type.  On overflow, these functions return true.
-// Otherwise, false is returned and `out` is updated with the result of the
+// Define functions NegateWithOverflow with the signature
+// `bool(T u, T* out)` where T is a signed integer type.  On overflow, these functions
+// return true. Otherwise, false is returned and `out` is updated with the result of the

Review comment:
       Nit: is this intended or spurious change?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r631428116



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -160,6 +160,18 @@ struct ARROW_EXPORT ProjectOptions : public FunctionOptions {
 
 /// @}
 
+/// \brief Get the absolute value of a value. Array values can be of arbitrary
+/// length. If argument is null the result will be null.
+///
+/// \param[in] arg the value transformed
+/// \param[in] options arithmetic options (overflow handling), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the elementwise absolute value

Review comment:
       Hmm, you're right, the other docstrings are written this way.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630698902



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) {
   return static_cast<Unsigned>(signed_);
 }
 
+struct AbsoluteValue {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return (arg < static_cast<T>(0)) ? -arg : arg;

Review comment:
       I agree, thanks for the suggestion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840258476


   @cyb70289  I noticed that *clang* removes the check of `if (result < 0)` because it seems that during the analysis it has a rule that the result of `std::abs()` is never negative, but this is not true if the input is `std::numeric_limits<T>::min()`. In other words, it does not considers the overflow case, although the value returned by `std::abs()` is negative. See https://godbolt.org/z/7bsM4oTMG to observe this effect. I also tested the godbolt example using unknown values at compile-time and observe the same behavior. This behavior is observed across compilers. My conclusion is to follow the form of `abs_2()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840258476


   @cyb70289  I noticed that *clang* removes the check of `if (result < 0)` because it seems that during the analysis it has a rule that the result of `abs` is never negative, but this is not true if the input is `std::numeric_limits<T>::min()`. In other words, it does not considers the overflow case. See https://godbolt.org/z/7bsM4oTMG to observe this effect. I also tested the godbolt example using unknown values at compile-time and observe the same behavior. This behavior is observed across compilers. My conclusion is to follow the form of `abs_2()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630683291



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) {
   return static_cast<Unsigned>(signed_);
 }
 
+struct AbsoluteValue {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return (arg < static_cast<T>(0)) ? -arg : arg;

Review comment:
       Any reason don't use `std::fabs` directly?
   std::fabs provides more optimized code: https://godbolt.org/z/P84WbxvGP
   std::fabs(-0.0) = 0.0, but `(x < 0) ? -x : x` outputs -0.0 if x = -0.0.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840258476


   @cyb70289  I noticed that *clang* removes the check of `if (result < 0)` because it seems that during the analysis it has a rule that the result of `std::abs()` is never negative, but this is not true if the input is `std::numeric_limits<T>::min()`. In other words, it does not considers the overflow case, although the value returned by `std::abs()` is negative. See https://godbolt.org/z/7bsM4oTMG to observe this effect. I also tested the godbolt example using unknown values at compile-time and observe the same behavior. This behavior is observed across compilers. My conclusion is to follow the form of `abs_2()`, that is, `if (arg == std::numeric_limits<T>::min())`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840507006


   For such a simple kernel we found:
   1. We need to handle overflow case manually because the [standard](https://en.cppreference.com/w/cpp/numeric/math/abs) states it is undefined behavior
   2. Compilers can prune checks that test for negative value the result of an absolute value operation. The purpose of such check is to detect overflow. For example,
   ```
   result = std::abs(arg);
   if (result < 0) {
     ERROR(...);
   }
   return result;
   ```
   becomes `result = std::abs(arg)`. The solution is to first check if input is a value that will overflow
   ```
   if (arg == std::numeric_limits<T>::min()) {
      ERROR(...);
      return arg;
   }
   result = std::abs(arg);
   ```
   3. Tests performing checks for special cases such as `-0.0 == 0.0` should also check the sign bit, `std::signbit(val)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840965703


   Also, would you rebase to fix some CI failures? https://github.com/apache/arrow/pull/10310


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r631425707



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -160,6 +160,18 @@ struct ARROW_EXPORT ProjectOptions : public FunctionOptions {
 
 /// @}
 
+/// \brief Get the absolute value of a value. Array values can be of arbitrary
+/// length. If argument is null the result will be null.
+///
+/// \param[in] arg the value transformed
+/// \param[in] options arithmetic options (overflow handling), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the elementwise absolute value

Review comment:
       Not sure what you refer to as "values"? The documentation is written for a single input "value", except for the comment that refers to array inputs.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r631452874



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1057,5 +1058,109 @@ TYPED_TEST(TestUnaryArithmeticFloating, Negate) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1, null, -10]", "[1, null, 10]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0, -0]", "[0, 0]");
+    this->AssertUnaryOp(AbsoluteValue, -0, 0);
+    this->AssertUnaryOp(AbsoluteValue, 0, 0);
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[1, 10, 127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, 1, 1);
+    this->AssertUnaryOp(AbsoluteValue, this->MakeScalar(1), this->MakeScalar(1));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[-1, -10, -127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, -1, 1);
+    this->AssertUnaryOp(AbsoluteValue, MakeArray(-1), "[1]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+    if (check_overflow) {
+      this->AssertUnaryOpRaises(AbsoluteValue, MakeArray(min), "overflow");
+    } else {
+      this->AssertUnaryOp(AbsoluteValue, min, min);
+    }
+  }
+
+  // Overflow should not be checked on underlying value slots when output would be null
+  this->SetOverflowCheck(true);
+  auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, max, min));
+  arg = TweakValidityBit(arg, 1, false);
+  arg = TweakValidityBit(arg, 2, false);
+  this->AssertUnaryOp(AbsoluteValue, arg, "[1, null, null]");
+}
+
+TYPED_TEST(TestUnaryArithmeticUnsigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty arrays
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Array with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Ordinary arrays
+    this->AssertUnaryOp(AbsoluteValue, "[0, 1, 10, 127]", "[0, 1, 10, 127]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, min, min);
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+  }
+}
+
+TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::lowest();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1.3, null, -10.80]", "[1.3, null, 10.80]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalars/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0.0, -0.0]", "[0.0, 0.0]");
+    this->AssertUnaryOp(AbsoluteValue, -0.0F, 0.0F);

Review comment:
       Created [ARROW-12768](https://issues.apache.org/jira/browse/ARROW-12768) to add the sign bit test cases.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840269487


   Thanks @edponce for deep investigation.
   
   From std::abs() [document](https://en.cppreference.com/w/cpp/numeric/math/abs), it is *UB* if input is `numeic_limit::min()`. So compiler can do whatever optimization that doesn't violate the standard.
   
   `Computes the absolute value of an integer number. The behavior is undefined if the result cannot be represented by the return type. `
   
   For checked abs, you can check `arg == min()` before calling `std::abs()`.
   For non-checked abs, thought compiler generated code is safe (no UB), I think it's better to go back to `return v < 0 ? (~(unsigned)v + 1) : v;` approach. Though the assembly code is exactly the same, compiler may decide to change behaviour in the future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-838763550


   I named the compute function as `AbsoluteValue` and kernels as "absolute_value" but this feels like too long a name. Convention across other libraries is "abs" but Arrow's compute kernels tend to follow the longer format and "absolute" seems ambiguous. @bkietz please advise


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630686322



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) {
   return static_cast<Unsigned>(signed_);
 }
 
+struct AbsoluteValue {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return (arg < static_cast<T>(0)) ? -arg : arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_unsigned_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_signed_integer<T> Call(KernelContext*, Arg arg, Status* st) {
+    return (arg < static_cast<T>(0)) ? arrow::internal::SafeSignedNegate(arg) : arg;

Review comment:
       This produces exactly same code as `std::abs`: https://godbolt.org/z/exvd8nxWc
   I suggest use `std::abs` directly and combine this function with floating point.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630707124



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1057,5 +1058,109 @@ TYPED_TEST(TestUnaryArithmeticFloating, Negate) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1, null, -10]", "[1, null, 10]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0, -0]", "[0, 0]");
+    this->AssertUnaryOp(AbsoluteValue, -0, 0);
+    this->AssertUnaryOp(AbsoluteValue, 0, 0);
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[1, 10, 127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, 1, 1);
+    this->AssertUnaryOp(AbsoluteValue, this->MakeScalar(1), this->MakeScalar(1));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(AbsoluteValue, "[-1, -10, -127]", "[1, 10, 127]");
+    this->AssertUnaryOp(AbsoluteValue, -1, 1);
+    this->AssertUnaryOp(AbsoluteValue, MakeArray(-1), "[1]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+    if (check_overflow) {
+      this->AssertUnaryOpRaises(AbsoluteValue, MakeArray(min), "overflow");
+    } else {
+      this->AssertUnaryOp(AbsoluteValue, min, min);
+    }
+  }
+
+  // Overflow should not be checked on underlying value slots when output would be null
+  this->SetOverflowCheck(true);
+  auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, max, min));
+  arg = TweakValidityBit(arg, 1, false);
+  arg = TweakValidityBit(arg, 2, false);
+  this->AssertUnaryOp(AbsoluteValue, arg, "[1, null, null]");
+}
+
+TYPED_TEST(TestUnaryArithmeticUnsigned, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty arrays
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Array with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Ordinary arrays
+    this->AssertUnaryOp(AbsoluteValue, "[0, 1, 10, 127]", "[0, 1, 10, 127]");
+    // Min/max
+    this->AssertUnaryOp(AbsoluteValue, min, min);
+    this->AssertUnaryOp(AbsoluteValue, max, max);
+  }
+}
+
+TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
+  using CType = typename TestFixture::CType;
+
+  auto min = std::numeric_limits<CType>::lowest();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto check_overflow : {false, true}) {
+    this->SetOverflowCheck(check_overflow);
+    // Empty array
+    this->AssertUnaryOp(AbsoluteValue, "[]", "[]");
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(AbsoluteValue, "[null]", "[null]");
+    this->AssertUnaryOp(AbsoluteValue, "[1.3, null, -10.80]", "[1.3, null, 10.80]");
+    this->AssertUnaryOp(AbsoluteValue, this->MakeNullScalar(), this->MakeNullScalar());
+    // Scalars/arrays with zeros
+    this->AssertUnaryOp(AbsoluteValue, "[0.0, -0.0]", "[0.0, 0.0]");
+    this->AssertUnaryOp(AbsoluteValue, -0.0F, 0.0F);

Review comment:
       I will update the tests to check for the sign bit instead. I think the sign bit test for zero applies to all the basic arithmetic kernels (add, sub, mult, div, neg) because all of these operations can result in 0 and -0.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r631509260



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -79,22 +79,20 @@ struct AbsoluteValue {
 
   template <typename T, typename Arg>
   static constexpr enable_if_signed_integer<T> Call(KernelContext*, Arg arg, Status* st) {
-    return (arg < static_cast<T>(0)) ? arrow::internal::SafeSignedNegate(arg) : arg;
+    return std::abs(arg);
   }
 };
 
 struct AbsoluteValueChecked {
   template <typename T, typename Arg>
   static enable_if_signed_integer<T> Call(KernelContext*, Arg arg, Status* st) {
     static_assert(std::is_same<T, Arg>::value, "");
-    if (arg < static_cast<T>(0)) {
-      T result = 0;
-      if (ARROW_PREDICT_FALSE(NegateWithOverflow(arg, &result))) {
-        *st = Status::Invalid("overflow");
-      }
-      return result;
+    T result = std::abs(arg);
+    if (result < 0) {
+      *st = Status::Invalid("overflow");
+      return 0;

Review comment:
       Returning 0 looks strange. What about just returning the negative result?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-841924720


   @pitrou, do you have other comments to this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r630683291



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) {
   return static_cast<Unsigned>(signed_);
 }
 
+struct AbsoluteValue {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return (arg < static_cast<T>(0)) ? -arg : arg;

Review comment:
       Any reason don't use `std::fabs` directly?
   std::fabs provides more optimized code: https://godbolt.org/z/nd38arvTe
   std::fabs(-0.0) = 0.0, but `(x < 0) ? -x : x` outputs -0.0 if x = -0.0.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-841924720


   @pitrou, do you have other comments to this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#discussion_r632247362



##########
File path: cpp/src/arrow/util/int_util_internal.h
##########
@@ -63,9 +63,9 @@ OPS_WITH_OVERFLOW(DivideWithOverflow, div)
 #undef OP_WITH_OVERFLOW
 #undef OPS_WITH_OVERFLOW
 
-// Define functions NegateWithOverflow with the signature `bool(T u, T* out)`
-// where T is a signed integer type.  On overflow, these functions return true.
-// Otherwise, false is returned and `out` is updated with the result of the
+// Define functions NegateWithOverflow with the signature
+// `bool(T u, T* out)` where T is a signed integer type.  On overflow, these functions
+// return true. Otherwise, false is returned and `out` is updated with the result of the

Review comment:
       It was a spurious change because I had initially added an `AbsoluteValueWithOverflow` but `NegateWithOverflow` sufficed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840258476


   @cyb70289  I noticed that *clang* removes the check of `if (result < 0)` because it seems that during the analysis it has a rule that the result of `std::abs()` is never negative, but this is not true if the input is `std::numeric_limits<T>::min()`. In other words, it does not considers the overflow case, although the value returned by `std::abs()` is negative and correct. See https://godbolt.org/z/7bsM4oTMG to observe this effect. I also tested the godbolt example using unknown values at compile-time and observe the same behavior. This behavior is observed across compilers. My conclusion is to follow the form of `abs_2()`, that is, `if (arg == std::numeric_limits<T>::min())`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840269487


   Thanks @edponce for deep investigation.
   
   From std::abs() [document](https://en.cppreference.com/w/cpp/numeric/math/abs), it is *UB* if input is `numeic_limit::min()`. So compiler can do whatever optimization that doesn't violate the standard.
   
   `Computes the absolute value of an integer number. The behavior is undefined if the result cannot be represented by the return type. `
   
   For checked abs, you can check `arg == min()` before calling `std::abs()`.
   For non-checked abs, thought compiler generated code is safe, I think it's better to go back to `return v < 0 ? (~(unsigned)v + 1) : v;` approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-840507006


   For such a simple kernel we found:
   1. We need to handle overflow case manually because the standard states it is undefined behavior
   2. Compilers can prune checks that test for negative value the result of an absolute value operation. The purpose of such check is to detect overflow. For example,
   ```
   result = std::abs(arg);
   if (result < 0) {
     ERROR(...);
   }
   return result;
   ```
   becomes `result = std::abs(arg)`. The solution is to first check if input is a value that will overflow
   ```
   if (arg == std::numeric_limits<T>::min()) {
      ERROR(...);
      return arg;
   }
   result = std::abs(arg);
   ```
   3. Tests performing checks for special cases such as `-0.0 == 0.0` should also check the sign bit, `std::signbit(val)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] cyb70289 closed pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

cyb70289 closed pull request #10274:
URL: https://github.com/apache/arrow/pull/10274


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] github-actions[bot] commented on pull request #10274: Arrow 12685 compute add unary absolute value kernel

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #10274:
URL: https://github.com/apache/arrow/pull/10274#issuecomment-835201506


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for opening a pull request!
   
   If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW
   
   Opening JIRAs ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
   
   Then could you also rename pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
       MINOR: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org