You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/25 06:29:50 UTC

[GitHub] [arrow] edponce opened a new pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

edponce opened a new pull request #10395:
URL: https://github.com/apache/arrow/pull/10395


   This PR adds the sign function to the compute layer as a unary scalar function registered as "sign".
   The implementations results depend on the input type:
   * Floating-points: sign(x) = [-1,1]
   * Signed integrals: sign(x) = [-1,0,1]
   * Unsigned integrals: sign(x) = [0,1]


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-860021183


   The [`Sign` kernels](https://github.com/edponce/arrow/blob/ARROW-12861-Compute-Add-sign-function-kernels/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L402-L442) were created with a [fixed output type (Int8)](https://github.com/edponce/arrow/blob/ARROW-12861-Compute-Add-sign-function-kernels/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L784-L790) using a new [kernel generator dispatcher](https://github.com/edponce/arrow/blob/ARROW-12861-Compute-Add-sign-function-kernels/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L476-L505) but the [tests for `DispatchBest`](https://github.com/edponce/arrow/blob/ARROW-12861-Compute-Add-sign-function-kernels/cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc#L985-L998) fail. The output value from `DispatchBest` is the same as its input and the input/output values for `DispatchExact` are always `int8`. Could someone help with this issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849529970


   The test for `double` type failed in AMD64-MinGW32 with a RapidJSON error message "Number too big to be stored in double". [I am not the first to encounter this error from RapidJSON](https://github.com/Tencent/rapidjson/issues/849). I disabled the tests that use `std::numeric_limits<double>::max()` to check if this is the triggering case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] bkietz commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r652073154



##########
File path: docs/source/cpp/compute.rst
##########
@@ -255,41 +255,52 @@ Input(s) will be cast to the :ref:`common numeric type <common-numeric-type>`
 (and dictionary decoded, if applicable) before the operation is applied.
 
 The default variant of these functions does not detect overflow (the result
-then typically wraps around).  Each function is also available in an
+then typically wraps around).  Most functions are also available in an
 overflow-checking variant, suffixed ``_checked``, which returns
 an ``Invalid`` :class:`Status` when overflow is detected.
 
-+--------------------------+------------+--------------------+---------------------+
-| Function name            | Arity      | Input types        | Output type         |
-+==========================+============+====================+=====================+
-| abs                      | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| abs_checked              | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add                      | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add_checked              | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide                   | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide_checked           | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate                   | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |
-+--------------------------+------------+--------------------+---------------------+
-| power                    | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| power_checked            | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
++--------------------------+------------+--------------------+---------------------+-------+
+| Function name            | Arity      | Input types        | Output type         | Notes |
++==========================+============+====================+=====================+=======+
+| abs                      | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| abs_checked              | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add                      | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add_checked              | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide                   | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide_checked           | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate                   | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power                    | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power_checked            | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign                     | Unary      | Numeric            | Int8                | \(1)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign_with_signed_zero    | Unary      | Numeric            | Int8                | \(2)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+
+* \(1) Output is an integral representing the signedness of the input.
+  For nonzero inputs, output is any of (-1,1) and 0 for zero input.
+
+* \(2) Similar to ``sign`` but zero input is considered as a signed number.

Review comment:
       ```suggestion
   * \(2) Similar to ``sign`` but zero is considered signed for floating point inputs.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] bkietz commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

bkietz commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849537461


   I'd say that even if this could be called an arithmetic function it's not ideal to require ArithmeticOptions to carry fields which only apply to sign. Please write SignOptions instead


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656530353



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,48 @@ struct PowerChecked {
   }
 };
 
+struct Sign {
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : (std::signbit(arg) ? -1 : 1);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_unsigned<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value && std::is_signed<Arg>::value,
+                        bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg > 0) ? 1 : ((arg == 0) ? 0 : -1);
+  }
+};
+
+struct SignWithSignedZero {
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return std::signbit(arg) ? -1 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_unsigned<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value && std::is_signed<Arg>::value,
+                        bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg > 0) ? 1 : ((arg == 0) ? 0 : -1);

Review comment:
       If zero is treated as a signed value, FP output will always be one of (-1,1).
   If zero is not signed, FP output can be (-1,0,1).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849554473


   @nealrichardson What do you think? Is this useful?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656048916



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -149,7 +173,23 @@ template <typename T>
 class TestUnaryArithmeticUnsigned : public TestUnaryArithmeticIntegral<T> {};
 
 template <typename T>
-class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T> {};
+class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T, ArithmeticOptions> {};
+
+template <typename T>
+class TestUnaryArithmetic<T, SignOptions>
+    : public TestBaseUnaryArithmetic<T, SignOptions> {};
+
+template <typename T>
+class TestUnaryArithmeticIntegral_Sign : public TestUnaryArithmetic<T, SignOptions> {};

Review comment:
       This could be called e.g. `TestSignIntegral`, and the classes below `TestSignUnsigned`, etc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r646772641



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1161,5 +1180,93 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, Sign) {
+  using CType = typename TestFixture::CType;
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto allow_signed_zero : {false, true}) {
+    this->SetSignedZero(allow_signed_zero);
+    // Empty array
+    this->AssertUnaryOp(Sign, "[]", ArrayFromJSON(int8(), "[]"));
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(Sign, "[null]", ArrayFromJSON(int8(), "[null]"));
+    this->AssertUnaryOp(Sign, "[1, null, -10]", ArrayFromJSON(int8(), "[1, null, -1]"));
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(Sign, "[0]", ArrayFromJSON(int8(), "[0]"));
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(Sign, "[1, 10, 127]", ArrayFromJSON(int8(), "[1, 1, 1]"));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(Sign, "[-1, -10, -127]", ArrayFromJSON(int8(), "[-1, -1, -1]"));
+    // Min/max
+    this->AssertUnaryOp(Sign, ArrayFromJSON(this->type_singleton(), MakeArray(min, max)),
+                        ArrayFromJSON(int8(), "[-1, 1]"));
+
+    auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, min, max, 1));
+    arg = TweakValidityBit(arg, 1, false);
+    arg = TweakValidityBit(arg, 2, false);
+    this->AssertUnaryOp(Sign, arg, ArrayFromJSON(int8(), "[-1, null, null, 1]"));
+  }
+}
+
+TYPED_TEST(TestUnaryArithmeticUnsigned, Sign) {
+  using CType = typename TestFixture::CType;
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto allow_signed_zero : {false, true}) {
+    this->SetSignedZero(allow_signed_zero);
+    // Empty arrays
+    this->AssertUnaryOp(Sign, "[]", ArrayFromJSON(int8(), "[]"));
+    // Array with nulls
+    this->AssertUnaryOp(Sign, "[null]", ArrayFromJSON(int8(), "[null]"));
+    // Ordinary arrays
+    this->AssertUnaryOp(Sign, "[0, 1, 10, 127]", ArrayFromJSON(int8(), "[0, 1, 1, 1]"));
+    // Min/max
+    this->AssertUnaryOp(Sign, ArrayFromJSON(this->type_singleton(), MakeArray(min, max)),
+                        ArrayFromJSON(int8(), "[0, 1]"));
+
+    auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(0, min, max, 1));
+    arg = TweakValidityBit(arg, 1, false);
+    arg = TweakValidityBit(arg, 2, false);
+    this->AssertUnaryOp(Sign, arg, ArrayFromJSON(int8(), "[0, null, null, 1]"));
+  }
+}
+
+TYPED_TEST(TestUnaryArithmeticFloating, Sign) {
+  using CType = typename TestFixture::CType;
+  auto min = std::numeric_limits<CType>::lowest();
+  auto max = std::numeric_limits<CType>::max();
+
+  // Empty array
+  this->AssertUnaryOp(Sign, "[]", ArrayFromJSON(int8(), "[]"));
+  // Scalar/arrays with nulls
+  this->AssertUnaryOp(Sign, "[null]", ArrayFromJSON(int8(), "[null]"));
+  this->AssertUnaryOp(Sign, "[1.3, null, -10.80]",
+                      ArrayFromJSON(int8(), "[1, null, -1]"));
+  // Scalars/arrays with zeros
+  this->AssertUnaryOp(Sign, "[0.0, -0.0]", ArrayFromJSON(int8(), "[0, 0]"));
+  this->SetSignedZero(true);
+  this->AssertUnaryOp(Sign, "[0.0, -0.0]", ArrayFromJSON(int8(), "[1, -1]"));
+  this->SetSignedZero(false);
+  // Ordinary scalars/arrays (positive inputs)
+  this->AssertUnaryOp(Sign, "[1.3, 10.80, 12748.001]",
+                      ArrayFromJSON(int8(), "[1, 1, 1]"));
+  // Ordinary scalars/arrays (negative inputs)
+  this->AssertUnaryOp(Sign, "[-1.3, -10.80, -12748.001]",
+                      ArrayFromJSON(int8(), "[-1, -1, -1]"));
+  // Arrays with infinites
+  this->AssertUnaryOp(Sign, "[Inf, -Inf]", ArrayFromJSON(int8(), "[1, -1]"));
+  // Arrays with NaNs
+  this->AssertUnaryOp(Sign, "[NaN]", ArrayFromJSON(int8(), "[1]"));
+  // Min/max
+  // this->AssertUnaryOp(Sign, ArrayFromJSON(this->type_singleton(), MakeArray(min, max)),
+  //                     ArrayFromJSON(int8(), "[-1, 1]"));
+
+  // auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, min, max, 1));
+  // arg = TweakValidityBit(arg, 1, false);
+  // arg = TweakValidityBit(arg, 2, false);
+  // this->AssertUnaryOp(Sign, arg, ArrayFromJSON(int8(), "[-1, null, null, 1]"));

Review comment:
       Please remove commented out code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r652118835



##########
File path: docs/source/cpp/compute.rst
##########
@@ -255,41 +255,52 @@ Input(s) will be cast to the :ref:`common numeric type <common-numeric-type>`
 (and dictionary decoded, if applicable) before the operation is applied.
 
 The default variant of these functions does not detect overflow (the result
-then typically wraps around).  Each function is also available in an
+then typically wraps around).  Most functions are also available in an
 overflow-checking variant, suffixed ``_checked``, which returns
 an ``Invalid`` :class:`Status` when overflow is detected.
 
-+--------------------------+------------+--------------------+---------------------+
-| Function name            | Arity      | Input types        | Output type         |
-+==========================+============+====================+=====================+
-| abs                      | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| abs_checked              | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add                      | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add_checked              | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide                   | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide_checked           | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate                   | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |
-+--------------------------+------------+--------------------+---------------------+
-| power                    | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| power_checked            | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
++--------------------------+------------+--------------------+---------------------+-------+
+| Function name            | Arity      | Input types        | Output type         | Notes |
++==========================+============+====================+=====================+=======+
+| abs                      | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| abs_checked              | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add                      | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add_checked              | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide                   | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide_checked           | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate                   | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power                    | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power_checked            | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign                     | Unary      | Numeric            | Int8                | \(1)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign_with_signed_zero    | Unary      | Numeric            | Int8                | \(2)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+
+* \(1) Output is an integral representing the signedness of the input.
+  For nonzero inputs, output is any of (-1,1) and 0 for zero input.
+
+* \(2) Similar to ``sign`` but zero input is considered as a signed number.

Review comment:
       Hmm...I agree with this change. The `sign_with_signed_zero` variant only applies to floating-point numbers resulting in either (-1,1). For integral values, the result will be either of (-1,0,1).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] ianmcook commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

ianmcook commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-852154037


   It's a thing in NumPy too https://numpy.org/doc/stable/reference/generated/numpy.sign.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-856093165


   Ok, all three versions of the `sign` function (Python, R, SQL) return (0,-1,+1), so this function should probably do the same.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849552023


   While working on the rounding functions, I make use of the signedness of the input for specific rounding modes (e.g., round-half-away-from-zero). After making `sign` utility functions, I found that many database systems provide such a function (although I do not have a particular use case). The option for treating zero as a signed value was suggested in [issue discussion](https://issues.apache.org/jira/browse/ARROW-12861?jql=order%20by%20lastViewed%20DESC).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656044314



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       I'm still not sure why we need both variants. Is this a user request? @ianmcook ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656470969



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       Please, can we answer the original question? Why are *two* variants needed?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r646772300



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1161,5 +1180,93 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, Sign) {
+  using CType = typename TestFixture::CType;
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto allow_signed_zero : {false, true}) {
+    this->SetSignedZero(allow_signed_zero);
+    // Empty array
+    this->AssertUnaryOp(Sign, "[]", ArrayFromJSON(int8(), "[]"));
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(Sign, "[null]", ArrayFromJSON(int8(), "[null]"));
+    this->AssertUnaryOp(Sign, "[1, null, -10]", ArrayFromJSON(int8(), "[1, null, -1]"));
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(Sign, "[0]", ArrayFromJSON(int8(), "[0]"));
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(Sign, "[1, 10, 127]", ArrayFromJSON(int8(), "[1, 1, 1]"));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(Sign, "[-1, -10, -127]", ArrayFromJSON(int8(), "[-1, -1, -1]"));
+    // Min/max
+    this->AssertUnaryOp(Sign, ArrayFromJSON(this->type_singleton(), MakeArray(min, max)),
+                        ArrayFromJSON(int8(), "[-1, 1]"));
+
+    auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, min, max, 1));
+    arg = TweakValidityBit(arg, 1, false);
+    arg = TweakValidityBit(arg, 2, false);

Review comment:
       These tests which tweak the validity bits by hand are only meaningful if the kernel may return an error, which is not the case here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r646771585



##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,6 +286,8 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
+| sign                     | Unary      | Numeric            | Int8                |

Review comment:
       Should add a note explaining how the output is computed. You can look at other sections for examples.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] bkietz closed pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

bkietz closed pull request #10395:
URL: https://github.com/apache/arrow/pull/10395


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849481217


   The `sign` function is an arithmetic function, therefore, I added its option `allow_signed_zero` to the [`ArithmeticOptions`](https://github.com/apache/arrow/pull/10395/files#diff-6bc7ecec6a4f7bcefc2511cde3bd809340ad0d94bb8f7cc5f4994063c798f2faR41) class.
   Is this valid? given that `sign` does not overflows so it will not make use of the `check_overflow` and currently, other compute functions do not use the `allow_signed_zero` option.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-861297199


   @bkietz may have the answer to your question, @edponce .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r650364204



##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,6 +286,8 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
+| sign                     | Unary      | Numeric            | Int8                |

Review comment:
       I added a "Notes" column to the table, and included notes describing the output values and special case of 0.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1161,5 +1180,93 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+TYPED_TEST(TestUnaryArithmeticSigned, Sign) {
+  using CType = typename TestFixture::CType;
+  auto min = std::numeric_limits<CType>::min();
+  auto max = std::numeric_limits<CType>::max();
+
+  for (auto allow_signed_zero : {false, true}) {
+    this->SetSignedZero(allow_signed_zero);
+    // Empty array
+    this->AssertUnaryOp(Sign, "[]", ArrayFromJSON(int8(), "[]"));
+    // Scalar/arrays with nulls
+    this->AssertUnaryOp(Sign, "[null]", ArrayFromJSON(int8(), "[null]"));
+    this->AssertUnaryOp(Sign, "[1, null, -10]", ArrayFromJSON(int8(), "[1, null, -1]"));
+    // Scalar/arrays with zeros
+    this->AssertUnaryOp(Sign, "[0]", ArrayFromJSON(int8(), "[0]"));
+    // Ordinary scalar/arrays (positive inputs)
+    this->AssertUnaryOp(Sign, "[1, 10, 127]", ArrayFromJSON(int8(), "[1, 1, 1]"));
+    // Ordinary scalar/arrays (negative inputs)
+    this->AssertUnaryOp(Sign, "[-1, -10, -127]", ArrayFromJSON(int8(), "[-1, -1, -1]"));
+    // Min/max
+    this->AssertUnaryOp(Sign, ArrayFromJSON(this->type_singleton(), MakeArray(min, max)),
+                        ArrayFromJSON(int8(), "[-1, 1]"));
+
+    auto arg = ArrayFromJSON(this->type_singleton(), MakeArray(-1, min, max, 1));
+    arg = TweakValidityBit(arg, 1, false);
+    arg = TweakValidityBit(arg, 2, false);

Review comment:
       Thanks for the explanation. These tests were removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849542668


   Can you explain in which context this function is useful? Furthermore, can you explain what the purpose is of having two different variants?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656452847



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -149,7 +173,23 @@ template <typename T>
 class TestUnaryArithmeticUnsigned : public TestUnaryArithmeticIntegral<T> {};
 
 template <typename T>
-class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T> {};
+class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T, ArithmeticOptions> {};
+
+template <typename T>
+class TestUnaryArithmetic<T, SignOptions>
+    : public TestBaseUnaryArithmetic<T, SignOptions> {};
+
+template <typename T>
+class TestUnaryArithmeticIntegral_Sign : public TestUnaryArithmetic<T, SignOptions> {};

Review comment:
       Oops! Thanks for catching this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656468614



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       Also, I consider that `copysign` should not be exposed as a compute function but rather used internally, and instead we should expose `sign` and `sign_with_signed_zero` (implementations may use `copysign(x, 1)`. The rationale for this is that `copysign(x, 1)` is not common in any SQL, dataframes library but `sign` is.
   
   A use case of `sign` is to build a custom ranking based on the signedness of values.
   For example, consider ranking values relative to a constant value (e.g., 3):
   ```
   SELECT CASE 
       WHEN @a < 3 THEN 0
       WHEN @a = 3 THEN 1
       WHEN @a > 3 THEN 2
   END
   ```
   is not supported because expressions are not allowed in `CASE-WHEN` constructs, so a workaround is
   ```
   SELECT CASE SIGN(@a - 3)
       WHEN -1 THEN 0
       WHEN 0 THEN 1
       WHEN 1 THEN 2
   END
   ``` 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656048180



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -149,7 +173,23 @@ template <typename T>
 class TestUnaryArithmeticUnsigned : public TestUnaryArithmeticIntegral<T> {};
 
 template <typename T>
-class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T> {};
+class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T, ArithmeticOptions> {};
+
+template <typename T>
+class TestUnaryArithmetic<T, SignOptions>
+    : public TestBaseUnaryArithmetic<T, SignOptions> {};
+
+template <typename T>
+class TestUnaryArithmeticIntegral_Sign : public TestUnaryArithmetic<T, SignOptions> {};

Review comment:
       Can you use `CamelCase` without exceptions?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656046253



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,48 @@ struct PowerChecked {
   }
 };
 
+struct Sign {
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : (std::signbit(arg) ? -1 : 1);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_unsigned<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value && std::is_signed<Arg>::value,
+                        bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg > 0) ? 1 : ((arg == 0) ? 0 : -1);
+  }
+};
+
+struct SignWithSignedZero {
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return std::signbit(arg) ? -1 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_unsigned<Arg>::value, bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg == 0) ? 0 : 1;
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value && std::is_signed<Arg>::value,
+                        bool> = true>
+  static constexpr T Call(KernelContext*, Arg arg, Status*) {
+    return (arg > 0) ? 1 : ((arg == 0) ? 0 : -1);

Review comment:
       This isn't consistent with the floating-point definition above, where `0` gives `1` in the output.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] github-actions[bot] commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-847588512


   https://issues.apache.org/jira/browse/ARROW-12861


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849529970


   The test for `double` type failed in AMD64-MinGW32 with a RapidJSON error message "Number too big to be stored in double". [I am not the first to encounter this error from RapidJSON](https://github.com/Tencent/rapidjson/issues/849). I disabled the tests that use `std::numeric_limits<double>::max()` to check if this is the triggering case.
   
   Indeed this was the case. Not sure of a current solution for this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656451863



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       @pitrou The sign kernel is to be used by the rounding function. I agree with the fact that `sign_with_signed_zero` is equivalent to `copysign`. Actually, `copysign` is the variant needed for rounding function.
   
   Nevertheless, `sign` is a common function exposed in frontend tools, so I consider it worthwhile to provide.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656525276



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       After careful thought, there is not a general need for having two variants. The only case where I see that `sign_with_signed_zero` can be relevant is if data stored is used for calculus-related operations (e.g., integral limits).
   I will remove the `sign_with_signed_zero` variant.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] pitrou commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r656045319



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -50,6 +50,11 @@ namespace compute {
 SCALAR_ARITHMETIC_UNARY(AbsoluteValue, "abs", "abs_checked")
 SCALAR_ARITHMETIC_UNARY(Negate, "negate", "negate_checked")
 
+Result<Datum> Sign(const Datum& arg, SignOptions options, ExecContext* ctx) {
+  auto func_name = options.with_signed_zero ? "sign_with_signed_zero" : "sign";

Review comment:
       Note that we could expose a `copysign` instead of `sign_with_signed_zero`, which would be more generic.
   (`sign_with_signed_zero(x)` would simply be `copysign(x, 1)`, AFAICT)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] bkietz commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r652077054



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -928,14 +974,32 @@ TEST(TestUnaryArithmetic, DispatchBest) {
     }
   }
 
+  // Signed arithmetic
   for (std::string name : {"negate_checked"}) {
     for (const auto& ty : {int8(), int16(), int32(), int64(), float32(), float64()}) {
       CheckDispatchBest(name, {ty}, {ty});
       CheckDispatchBest(name, {dictionary(int8(), ty)}, {ty});
     }
   }
 
-  for (std::string name : {"negate", "negate_checked", "abs", "abs_checked"}) {
+  // Functions with fixed output type
+  struct FuncAndOutType {
+    std::string name;
+    std::shared_ptr<DataType> out_type;
+  };
+
+  FuncAndOutType funcs[] = {{"sign", int8()}};
+  for (const auto& func : funcs) {
+    for (const auto& ty : {int8(), int16(), int32(), int64(), uint8(), uint16(), uint32(),
+                           uint64(), float32(), float64()}) {
+      CheckDispatchBest(func.name, {ty}, {func.out_type});
+      CheckDispatchBest(func.name, {dictionary(int8(), ty)}, {func.out_type});
+    }
+  }

Review comment:
       This seems unnecessarily general, especially given we have only "sign". Additionally, note that CheckDispatchBest does not assert the output type: it asserts the type to which arguments must be implicitly cast:
   ```suggestion
     // Sign always outputs to int8
     for (std::string name : {"sign", "sign_with_negative_zero"}) {
       for (const auto& ty : {int8(), int16(), int32(), int64(), uint8(), uint16(), uint32(),
                              uint64(), float32(), float64()}) {
         CheckDispatchBest(name, {ty}, {ty});
         CheckDispatchBest(name, {dictionary(int8(), ty)}, {ty});
       }
     }
   ```
   In the example above, `CheckDispatchBest(name, {ty}, {int8()});` would imply that we must first cast `ty -> int8()` inputs before invoking the kernel, which is incorrect. Compare this with `CheckDispatchBest(name, {dictionary(int8(), ty)}, {ty});` which asserts that we can't directly extract the sign bit from dictionary encoded data; encoded arrays must first be decoded for the kernel to operate on them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#discussion_r652077943



##########
File path: docs/source/cpp/compute.rst
##########
@@ -255,41 +255,52 @@ Input(s) will be cast to the :ref:`common numeric type <common-numeric-type>`
 (and dictionary decoded, if applicable) before the operation is applied.
 
 The default variant of these functions does not detect overflow (the result
-then typically wraps around).  Each function is also available in an
+then typically wraps around).  Most functions are also available in an
 overflow-checking variant, suffixed ``_checked``, which returns
 an ``Invalid`` :class:`Status` when overflow is detected.
 
-+--------------------------+------------+--------------------+---------------------+
-| Function name            | Arity      | Input types        | Output type         |
-+==========================+============+====================+=====================+
-| abs                      | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| abs_checked              | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add                      | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| add_checked              | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide                   | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| divide_checked           | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| multiply_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate                   | Unary      | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |
-+--------------------------+------------+--------------------+---------------------+
-| power                    | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| power_checked            | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
-| subtract_checked         | Binary     | Numeric            | Numeric             |
-+--------------------------+------------+--------------------+---------------------+
++--------------------------+------------+--------------------+---------------------+-------+
+| Function name            | Arity      | Input types        | Output type         | Notes |
++==========================+============+====================+=====================+=======+
+| abs                      | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| abs_checked              | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add                      | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| add_checked              | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide                   | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| divide_checked           | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| multiply_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate                   | Unary      | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| negate_checked           | Unary      | Signed Numeric     | Signed Numeric      |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power                    | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| power_checked            | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign                     | Unary      | Numeric            | Int8                | \(1)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| sign_with_signed_zero    | Unary      | Numeric            | Int8                | \(2)  |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract                 | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+| subtract_checked         | Binary     | Numeric            | Numeric             |       |
++--------------------------+------------+--------------------+---------------------+-------+
+
+* \(1) Output is an integral representing the signedness of the input.
+  For nonzero inputs, output is any of (-1,1) and 0 for zero input.
+
+* \(2) Similar to ``sign`` but zero input is considered as a signed number.

Review comment:
       The `sign_with_signed_zero` variant treats 0 as a positive value for integral types, but either -0 or +0 for floating-point types.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce edited a comment on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce edited a comment on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849481217


   The `sign` function is an arithmetic function, therefore, I added its option `allow_signed_zero` to the [`ArithmeticOptions`](https://github.com/apache/arrow/pull/10395/files#diff-6bc7ecec6a4f7bcefc2511cde3bd809340ad0d94bb8f7cc5f4994063c798f2faR41) class.
   Is this valid? given that `sign` does not overflows so it will not make use of `check_overflow` and currently, other compute functions do not use the `allow_signed_zero` option.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] nealrichardson commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function kernels

Posted by GitBox <gi...@apache.org>.

nealrichardson commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-850504023


   I wouldn't say it's the _most_ useful kernel to be adding, if I were prioritizing kernels to write, but it is something that people do. In R for example there is a `sign()` function: https://stat.ethz.ch/R-manual/R-devel/library/base/html/sign.html, and it's a thing in SQL too. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10395: ARROW-12861: [C++][Compute] Add sign function and tests

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10395:
URL: https://github.com/apache/arrow/pull/10395#issuecomment-849529970


   The test for `double` type failed in AMD64-WinGW32 with a RapidJSON error message "Number too big to be stored in double". [I am not the first to encounter this error from RapidJSON](https://github.com/Tencent/rapidjson/issues/849). I disabled the tests that use `std::numeric_limits<double>::max()` to check if this is the triggering case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org