
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/17 19:12:36 UTC

[GitHub] [arrow] edponce opened a new pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

edponce opened a new pull request #10349:
URL: https://github.com/apache/arrow/pull/10349


   This PR adds a rounding compute function registered as "round" and "round_checked". The rounding function supports integral and floating-point types and returns a value of the same type.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-884465731


   There are 2 round functions (`Round` and `MRound`) and [both use different `Options` but make use of the same `enum RoundMode`](https://github.com/apache/arrow/pull/10349/files#diff-6bc7ecec6a4f7bcefc2511cde3bd809340ad0d94bb8f7cc5f4994063c798f2faR78-R96), therefore I defined [`enum RoundMode` in global space of `api_scalar.h`](https://github.com/apache/arrow/pull/10349/files#diff-6bc7ecec6a4f7bcefc2511cde3bd809340ad0d94bb8f7cc5f4994063c798f2faR58). Based on the recent `FunctionOptions` changes, I added [`EnumTraits<RoundMode>` to `api_scalar.cc`](https://github.com/apache/arrow/pull/10349/files#diff-926d239191ef87c5aa7bbf1ead3e7ad0a7e2d3d099648efda42812e649c2e030R105) along with necessary type and registration code.
   
   For tests, I wanted to use the `values()` method of `EnumTraits` to iterate through the enum values, but I am not sure how to access `EnumTraits<RoundMode>` since it is not exposed in a header file. My solution was to create an [array (`kRoundModes`) with the enum values in the global space of tests](https://github.com/apache/arrow/pull/10349/files#diff-670d19cdae75caec7c47902285bf28449b6f188eddfb3a366ada22f3d7fe98b5R1403-R1409).
   
   Also, I could not find a way to create the [generator dispatchers without explicitly using the `enum RoundMode` values as template parameters](https://github.com/apache/arrow/pull/10349/files#diff-3eafd7246f6a8c699f10d46e3276852fe44b6853b5517ef10396e561730c09f4R1385-R1410) (I do not think we can do this in C++11 because the value depends on the `ty` loop variable).
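
On the dispatcher point: a template argument has to be a compile-time constant, so a runtime `RoundMode` must be mapped to an instantiation somewhere. A minimal sketch of that mapping with a `switch`, assuming a reduced three-mode enum and a simplified `RoundImpl` (both hypothetical stand-ins, not the PR's code):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

namespace dispatch_sketch {

// Reduced, hypothetical copy of the enum; the PR defines ten modes.
enum class RoundMode : int8_t { DOWN, UP, TOWARDS_ZERO };

// Primary template is only declared; each mode gets an explicit specialization.
template <RoundMode Mode>
struct RoundImpl;

template <>
struct RoundImpl<RoundMode::DOWN> {
  static double Round(double val) { return std::floor(val); }
};

template <>
struct RoundImpl<RoundMode::UP> {
  static double Round(double val) { return std::ceil(val); }
};

template <>
struct RoundImpl<RoundMode::TOWARDS_ZERO> {
  static double Round(double val) { return std::trunc(val); }
};

// Maps the runtime mode to a compile-time instantiation. Every enumerator
// still has to be spelled out once, but only in this one place.
inline double DispatchRound(RoundMode mode, double val) {
  switch (mode) {
    case RoundMode::DOWN:
      return RoundImpl<RoundMode::DOWN>::Round(val);
    case RoundMode::UP:
      return RoundImpl<RoundMode::UP>::Round(val);
    case RoundMode::TOWARDS_ZERO:
      return RoundImpl<RoundMode::TOWARDS_ZERO>::Round(val);
  }
  return val;  // not reached for valid enumerators
}

}  // namespace dispatch_sketch
```

The sketch does not remove the explicit enumerator list; it only confines it to a single function.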
   
   @lidavidm @bkietz Any comments or suggestions would be greatly appreciated.



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692356691



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       It is the primary template for which explicit specializations are provided for `ArithmeticOptions`, `RoundOptions`, and `MRoundOptions`. I simplified these subclasses in the latest commit.
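
A minimal sketch of the pattern described here, assuming hypothetical stand-in option structs and fixture names (none of them are the PR's actual test classes). Because `T` stays generic, the per-options variants in this sketch are partial specializations of the primary template:

```cpp
#include <cassert>
#include <cstdint>

namespace fixture_sketch {

// Hypothetical stand-ins for the real options classes.
struct ArithmeticOptions {
  bool check_overflow = false;
};
struct RoundOptions {
  int64_t ndigits = 0;
};

// Shared base: holds the options object for any options type.
template <typename T, typename Options>
struct UnaryFixtureBase {
  Options options_ = Options();
};

// Primary template, declared but never defined: only the options types
// specialized below are usable.
template <typename T, typename Options>
struct UnaryFixture;

// Variant for ArithmeticOptions adds the overflow-check setter.
template <typename T>
struct UnaryFixture<T, ArithmeticOptions> : UnaryFixtureBase<T, ArithmeticOptions> {
  void SetOverflowCheck(bool value = true) { this->options_.check_overflow = value; }
};

// Variant for RoundOptions adds a setter for the rounding precision.
template <typename T>
struct UnaryFixture<T, RoundOptions> : UnaryFixtureBase<T, RoundOptions> {
  void SetRoundNdigits(int64_t value) { this->options_.ndigits = value; }
};

}  // namespace fixture_sketch
```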





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692131309



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       I see there's a table in compute.rst so maybe a comment mentioning that would be good - at least I know I only grep header files when I'm looking for something.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       Is there any value to this subclass?





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703602112



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,

Review comment:
       Could probably be `constexpr`?
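
A minimal sketch of what a `constexpr` table could look like, assuming a namespace-scope array and the hypothetical names `kPow10Lut` and `kPow10LutSize` (not the PR's code). It also uses `double` literals rather than `F`-suffixed ones, on the assumption that `T` may be `double`: a `float` literal such as `1e11F` is not exactly `1e11` once converted to `double`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

namespace pow10_sketch {

// Powers of ten up to 1e22, all exactly representable as double.
constexpr double kPow10Lut[] = {1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
                                1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
                                1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
constexpr int64_t kPow10LutSize =
    static_cast<int64_t>(sizeof(kPow10Lut) / sizeof(kPow10Lut[0]));

// Returns 10^power converted to T, or NaN if power is outside the table.
template <typename T>
T Pow10(int64_t power) {
  return (power >= 0 && power < kPow10LutSize)
             ? static_cast<T>(kPow10Lut[power])
             : std::numeric_limits<T>::quiet_NaN();
}

}  // namespace pow10_sketch
```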

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1352,6 +1412,212 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+const std::initializer_list<RoundMode> kRoundModes{

Review comment:
       Nit, but you can avoid the peculiarities of `initializer_list` by using `std::vector`...
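
A minimal sketch of the suggestion, using a standalone copy of the enum for illustration (the names mirror the diff, but this is not the PR's test file). A `std::vector` owns its elements and supports indexing, which `std::initializer_list` does not:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace round_modes_sketch {

// Standalone copy of the enum from the quoted diff, for illustration only.
enum class RoundMode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// All modes in declaration order, so tests can iterate over them or index them.
const std::vector<RoundMode> kRoundModes{
    RoundMode::DOWN,
    RoundMode::UP,
    RoundMode::TOWARDS_ZERO,
    RoundMode::TOWARDS_INFINITY,
    RoundMode::HALF_DOWN,
    RoundMode::HALF_UP,
    RoundMode::HALF_TOWARDS_ZERO,
    RoundMode::HALF_TOWARDS_INFINITY,
    RoundMode::HALF_TO_EVEN,
    RoundMode::HALF_TO_ODD,
};

}  // namespace round_modes_sketch
```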

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;

Review comment:
       I'd make this a member of `RoundOptions` and `MRoundOptions` respectively.

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -583,10 +639,32 @@ Result<Datum> MinElementWise(
 ///
 /// \param[in] arg the value to extract sign from
 /// \param[in] ctx the function execution context, optional
-/// \return the elementwise sign function
+/// \return the element-wise sign function
 ARROW_EXPORT
 Result<Datum> Sign(const Datum& arg, ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value to the given number of digits. Array values can be of
+/// arbitrary length. If argument is null the result will be null.

Review comment:
       Hmm... what does "of arbitrary length" imply here? I'm not sure I understand.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;

Review comment:
       It would be much better to make the kernel stateful and precompute this in the init function, IMHO.
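
A minimal sketch of the precompute-once idea, independent of Arrow's kernel machinery. `RoundState`, `Make`, and `RoundWithState` are hypothetical names, and the sketch only handles half-to-even rounding: the digit count is validated and the power of ten computed a single time, and the per-value function only reads the cached state.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

namespace round_state_sketch {

// Hypothetical state built once from the options, before any value is rounded.
struct RoundState {
  int64_t ndigits = 0;
  double pow10 = 1.0;

  // Validates ndigits and caches 10^|ndigits|; returns false if out of range.
  static bool Make(int64_t ndigits, RoundState* out) {
    if (ndigits < -22 || ndigits > 22) return false;
    const int64_t magnitude = ndigits < 0 ? -ndigits : ndigits;
    double p = 1.0;
    for (int64_t i = 0; i < magnitude; ++i) p *= 10.0;  // exact up to 1e22
    out->ndigits = ndigits;
    out->pow10 = p;
    return true;
  }
};

// Per-value work: no validation and no table lookup, only the cached factor.
// std::nearbyint rounds ties to even under the default rounding mode.
inline double RoundWithState(const RoundState& state, double value) {
  if (!std::isfinite(value)) return value;
  const double scaled =
      state.ndigits >= 0 ? value * state.pow10 : value / state.pow10;
  const double rounded = std::nearbyint(scaled);
  return state.ndigits >= 0 ? rounded / state.pow10 : rounded * state.pow10;
}

}  // namespace round_state_sketch
```

With this split, an invalid digit count is reported once at setup time rather than once per element.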

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1667,6 +1930,22 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments to a given precision element-wise",
+    ("Options are used to control the number of digits and rounding mode.\n"
+     "Default behavior is to round to the nearest integer and use half-to-even "
+     "rule to break ties."),
+    {"x"},
+    "RoundOptions"};
+
+const FunctionDoc mround_doc{
+    "Round the arguments to a given multiple element-wise",

Review comment:
       "Round to a given multiple"?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       Why don't you also store inverse powers of 10? Is it because of rounding issues?
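
One possible reason, offered as an assumption since the author's answer is not part of this excerpt: negative powers of ten such as `0.1` are not exactly representable in binary floating point, so multiplying by a stored inverse can differ from dividing by the exact power. A small sketch with hypothetical helper names:

```cpp
#include <cassert>

namespace inverse_sketch {

// Scales down by dividing by an exactly representable power of ten.
inline double ScaleByDivision(double value, double pow10) { return value / pow10; }

// Scales down by multiplying with a stored inverse, which is itself already
// rounded (0.1 has no exact binary representation).
inline double ScaleByInverse(double value, double inv_pow10) {
  return value * inv_pow10;
}

}  // namespace inverse_sketch
```

In IEEE 754 double precision, `ScaleByDivision(3.0, 10.0)` yields the double nearest to 0.3, while `ScaleByInverse(3.0, 0.1)` yields the next representable value above it.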

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       Can't we just mandate that `mult` is finite and strictly positive? It doesn't seem useful to support degenerate cases such as `mult == 0` or `mult == +inf`, etc.
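
A minimal sketch of such a precondition, assuming hypothetical helper names and half-to-even rounding only (not the PR's `MRound` kernel):

```cpp
#include <cassert>
#include <cmath>

namespace multiple_sketch {

// Accepts only finite, strictly positive multiples; rejects 0, negatives,
// infinities, and NaN (every comparison with NaN is false).
inline bool IsValidMultiple(double multiple) {
  return std::isfinite(multiple) && multiple > 0.0;
}

// Rounds value to the nearest multiple, ties to even under the default
// rounding mode. Returns false and leaves *out untouched for invalid input.
inline bool RoundToMultiple(double value, double multiple, double* out) {
  if (!IsValidMultiple(multiple)) return false;
  *out = std::nearbyint(value / multiple) * multiple;
  return true;
}

}  // namespace multiple_sketch
```

Rejecting the degenerate multiples up front means the rounding path never has to special-case division by zero or infinity.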

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       Is it really desirable to implement this? If I have "1.500000000001", will it really be considered equal to 1.5?
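
To make the concern concrete, here is a simplified model of tolerance-based tie detection. It reuses the `IsApproxHalfInt` formula from the quoted diff, but `RoundHalfToEvenWithTolerance` is a hypothetical helper and not the PR's kernel. With the default tolerance of 1e-5, a value slightly above a midpoint is classified as a tie, which can change the result under half-to-even:

```cpp
#include <cassert>
#include <cmath>

namespace tolerance_sketch {

// Same test as the diff's IsApproxHalfInt: is the fractional part within
// abs_tol of 0.5?
inline bool IsApproxHalfInt(double val, double abs_tol) {
  return std::fabs((val - std::floor(val)) - 0.5) <= abs_tol;
}

// Simplified model: values classified as ties go to the even neighbour,
// everything else goes to the nearest integer.
inline double RoundHalfToEvenWithTolerance(double val, double abs_tol) {
  if (IsApproxHalfInt(val, abs_tol)) {
    const double lower = std::floor(val);
    return std::fmod(lower, 2.0) == 0.0 ? lower : lower + 1.0;
  }
  return std::round(val);
}

}  // namespace tolerance_sketch
```

In this model, 2.500000000001 rounds to 2 with a tolerance of 1e-5 but to 3 with a tolerance of 0, even though it lies above the midpoint.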

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -422,6 +472,16 @@ TYPED_TEST_SUITE(TestUnaryArithmeticSigned, SignedIntegerTypes);
 TYPED_TEST_SUITE(TestUnaryArithmeticUnsigned, UnsignedIntegerTypes);
 TYPED_TEST_SUITE(TestUnaryArithmeticFloating, FloatingTypes);
 
+TYPED_TEST_SUITE(TestUnaryRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundFloating, FloatingTypes);
+
+TYPED_TEST_SUITE(TestUnaryMRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryMRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryMRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryMRoundFloating, FloatingTypes);

Review comment:
       Can you move these declarations near the corresponding `TYPED_TEST` definitions? Otherwise, things will be difficult to follow.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -20,14 +20,16 @@
 #include <limits>
 #include <utility>
 
-#include "arrow/compute/kernels/codegen_internal.h"
+#include "arrow/compare.h"
+#include "arrow/compute/api_scalar.h"
 #include "arrow/compute/kernels/common.h"
 #include "arrow/compute/kernels/util_internal.h"
 #include "arrow/type.h"
 #include "arrow/type_traits.h"
 #include "arrow/util/decimal.h"
 #include "arrow/util/int_util_internal.h"
 #include "arrow/util/macros.h"
+#include "arrow/util/map.h"

Review comment:
       This one doesn't seem used?

##########
File path: cpp/src/arrow/util/map.h
##########
@@ -17,13 +17,21 @@
 
 #pragma once
 
+#include <type_traits>
 #include <utility>
 
 #include "arrow/result.h"
 
 namespace arrow {
 namespace internal {
 
+/// Functor to make enums hashable.
+template <typename T, typename R = typename std::underlying_type<T>::type,
+          typename = std::enable_if<std::is_enum<T>::value>>
+struct EnumHash {
+  R operator()(T val) const { return static_cast<R>(val); }
+};

Review comment:
       Is it actually used in this PR?
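
For reference, a functor like this matters mainly for C++11 builds, where `std::hash` is not guaranteed to accept enum types (that guarantee arrived with C++14). The sketch below shows one way it could be used as the hasher of a `std::unordered_map`. It is a variation on the quoted functor rather than the PR's code: it returns `std::size_t`, and it applies `::type` to `std::enable_if` so that the constraint actually rejects non-enum types. `ModeName` and the two-value enum are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <unordered_map>

namespace enum_hash_sketch {

// Reduced, hypothetical enum for illustration.
enum class RoundMode : int8_t { DOWN, UP };

// Hashes an enum through its underlying integer value.
template <typename T,
          typename = typename std::enable_if<std::is_enum<T>::value>::type>
struct EnumHash {
  std::size_t operator()(T val) const {
    using Underlying = typename std::underlying_type<T>::type;
    return static_cast<std::size_t>(static_cast<Underlying>(val));
  }
};

// Example use: an enum-keyed lookup table that also compiles as C++11.
inline std::string ModeName(RoundMode mode) {
  static const std::unordered_map<RoundMode, std::string, EnumHash<RoundMode>>
      kNames{{RoundMode::DOWN, "down"}, {RoundMode::UP, "up"}};
  const auto it = kNames.find(mode);
  return it == kNames.end() ? std::string("unknown") : it->second;
}

}  // namespace enum_hash_sketch
```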

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1667,6 +1930,22 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments to a given precision element-wise",

Review comment:
       I don't think "element-wise" is informative (what would a non-element-wise round function be?).

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");

Review comment:
       You can probably put this at the top level instead of repeating it in both kernels.
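
A minimal sketch of this suggestion, assuming the enum layout shown in `api_scalar.h`: the ordering check is written once at namespace scope, next to the enum, so neither kernel has to repeat it. The helper `IsTieBreakingMode` is hypothetical and only illustrates code that relies on the ordering.

```cpp
#include <cstdint>

// Mirrors the RoundMode enum from the PR; the HALF_* modes come last.
enum class RoundMode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// Stated once at namespace scope instead of inside every kernel's Call().
static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
                  RoundMode::HALF_DOWN > RoundMode::UP &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
              "Round modes prefixed with HALF need to be defined last in enum and "
              "the first HALF entry has to be HALF_DOWN.");

// Hypothetical helper (not in the PR) that depends on the ordering above.
constexpr bool IsTieBreakingMode(RoundMode mode) {
  return mode >= RoundMode::HALF_DOWN;
}
```

If the enum is ever reordered, the build fails at the single assertion rather than in each kernel that includes it.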

##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1316,6 +1318,67 @@ def test_arithmetic_multiply():
     assert result.equals(expected)
 
 
+@pytest.mark.parametrize("ty", ["round", "mround"])
+def test_round_to_integer(ty):
+    if ty == "round":
+        round = pc.round
+        RoundOptions = partial(pc.RoundOptions, ndigits=0)
+    elif ty == "mround":
+        round = pc.mround
+        RoundOptions = partial(pc.MRoundOptions, multiple=1)
+
+    values = [3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None]
+    rmode_and_expected_map = {
+        "down": [3, 3, 3, 4, -4, -4, -4, None],
+        "up": [4, 4, 4, 5, -3, -3, -3, None],
+        "towards_zero": [3, 3, 3, 4, -3, -3, -3, None],
+        "towards_infinity": [4, 4, 4, 5, -4, -4, -4, None],
+        "half_down": [3, 3, 4, 4, -3, -4, -4, None],
+        "half_up": [3, 4, 4, 5, -3, -3, -4, None],
+        "half_towards_zero": [3, 3, 4, 4, -3, -3, -4, None],
+        "half_towards_infinity": [3, 4, 4, 5, -3, -4, -4, None],
+        "half_to_even": [3, 4, 4, 4, -3, -4, -4, None],
+        "half_to_odd": [3, 3, 4, 5, -3, -3, -4, None],
+    }
+    for round_mode, expected in rmode_and_expected_map.items():
+        options = RoundOptions(round_mode=round_mode)
+        result = round(values, options=options)
+        assert np.all(np.isclose(result, pa.array(expected), equal_nan=True))

Review comment:
       Given the results are all integers, we should check for exact equality here, not approximate.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1352,6 +1412,212 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+const std::initializer_list<RoundMode> kRoundModes{
+    RoundMode::DOWN,
+    RoundMode::UP,
+    RoundMode::TOWARDS_ZERO,
+    RoundMode::TOWARDS_INFINITY,
+    RoundMode::HALF_DOWN,
+    RoundMode::HALF_UP,
+    RoundMode::HALF_TOWARDS_ZERO,
+    RoundMode::HALF_TOWARDS_INFINITY,
+    RoundMode::HALF_TO_EVEN,
+    RoundMode::HALF_TO_ODD,
+};
+
+TYPED_TEST(TestUnaryRoundSigned, Round) {
+  // Test different rounding modes for integer rounding
+  std::string values("[0, 1, -13, -50, 115]");
+  this->SetRoundNdigits(0);
+  for (const auto& round_mode : kRoundModes) {
+    this->SetRoundMode(round_mode);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), values));
+  }
+
+  // Test different round N-digits for nearest rounding mode
+  std::unordered_map<int64_t, std::string> ndigits_and_expected_map({

Review comment:
       This can trivially be a `std::vector<std::pair>`...
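
A rough sketch of what the suggested container could look like. The names `MakeNdigitsCases` and `JoinLabels` and the string labels are placeholders for illustration; they are not the expected arrays used in the PR's tests.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical test table: (ndigits, label). A vector of pairs needs no
// hashing and is iterated in exactly the order the cases are written.
std::vector<std::pair<int64_t, std::string>> MakeNdigitsCases() {
  return {{-2, "hundreds"}, {-1, "tens"}, {0, "ones"}, {1, "tenths"}};
}

// Concatenate the labels in iteration order to make that order visible.
std::string JoinLabels(
    const std::vector<std::pair<int64_t, std::string>>& cases) {
  std::string out;
  for (const auto& ndigits_and_label : cases) {
    if (!out.empty()) out += ",";
    out += ndigits_and_label.second;
  }
  return out;
}
```

With `std::unordered_map` the iteration order is unspecified, so a failing case can show up at a different position from run to run; the vector keeps test output stable.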

##########
File path: docs/source/cpp/compute.rst
##########
@@ -457,18 +458,100 @@ Bit-wise functions
 Rounding functions
 ~~~~~~~~~~~~~~~~~~
 
-Rounding functions convert a numeric input into an approximate value with a
-simpler representation based on the rounding strategy.
-
-+------------------+--------+----------------+-----------------+-------+
-| Function name    | Arity  | Input types    | Output type     | Notes |
-+==================+========+================+=================+=======+
-| floor            | Unary  | Numeric        | Float32/Float64 |       |
-+------------------+--------+----------------+-----------------+-------+
-| ceil             | Unary  | Numeric        | Float32/Float64 |       |
-+------------------+--------+----------------+-----------------+-------+
-| trunc            | Unary  | Numeric        | Float32/Float64 |       |
-+------------------+--------+----------------+-----------------+-------+
+Rounding functions displace numeric inputs to an approximate value with a simpler
+representation based on the rounding criterion.
+
++---------------+------------+-------------+------------------+-------------------------+--------+
+| Function name | Arity      | Input types | Output type      | Options class           | Notes  |
++===============+============+=============+==================+=========================+========+
+| ceil          | Unary      | Numeric     | Float32/Float64  |                         |        |
++---------------+------------+-------------+------------------+-------------------------+--------+
+| floor         | Unary      | Numeric     | Float32/Float64  |                         |        |
++---------------+------------+-------------+------------------+-------------------------+--------+
+| mround        | Unary      | Numeric     | Float32/Float64  | :struct:`MRoundOptions` | (1)(2) |
++---------------+------------+-------------+------------------+-------------------------+--------+
+| round         | Unary      | Numeric     | Float32/Float64  | :struct:`RoundOptions`  | (1)(3) |
++---------------+------------+-------------+------------------+-------------------------+--------+
+| trunc         | Unary      | Numeric     | Float32/Float64  |                         |        |
++---------------+------------+-------------+------------------+-------------------------+--------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding
+  functions displace a value to the nearest integer using HALF_TO_EVEN
+  to resolve ties.  Options are available to control the rounding criterion.
+  Both ``round`` and ``mround`` have the ``round_mode`` option to set the
+  rounding mode.
+* \(2) Round to a multiple where the ``multiple`` option of :struct:`MRoundOptions`
+  specifies the rounding scale.  Only the absolute value of the rounding
+  multiple is used, that is, its sign is ignored.  For example, 100 corresponds
+  to rounding to the nearest multiple of 100 (zeroing the ones and tens digits).
+  Default value of ``multiple`` is 1 which rounds to the nearest integer.
+  Also, an absolute tolerance ``abs_tol`` can be set to control equality
+  comparisons of floating-point values.
+* \(3) Round to a number of digits where the ``ndigits`` option of
+  :struct:`RoundOptions` specifies the rounding precision in terms of number of
+  digits.  A negative value corresponds to digits in the non-fractional part.
+  For example, -2 corresponds to rounding to the nearest multiple of 100
+  (zeroing the ones and tens digits).  Default value of ``ndigits`` is 0 which
+  rounds to the nearest integer.  Also, an absolute tolerance ``abs_tol`` can be
+  set to control equality comparisons of floating-point values.
+
+The following tables present the rounding operations performed for various
+rounding options and modes.
+
++--------------------+---------------+-------------------------------+
+| Round ``multiple`` | Round ``ndigits`` | Operation performed       |

Review comment:
       Looks like the first table line has an alignment issue above?

##########
File path: python/pyarrow/_compute.pyx
##########
@@ -699,6 +699,78 @@ class ElementWiseAggregateOptions(_ElementWiseAggregateOptions):
         self._set_options(skip_nulls)
 
 
+cdef class _RoundOptions(FunctionOptions):
+    def _set_options(self, int64_t ndigits, round_mode, double abs_tol):
+        cdef:
+            CRoundMode c_round_mode = CRoundMode_HALF_TO_EVEN
+        if round_mode == 'down':
+            c_round_mode = CRoundMode_DOWN
+        elif round_mode == 'up':
+            c_round_mode = CRoundMode_UP
+        elif round_mode == 'towards_zero':
+            c_round_mode = CRoundMode_TOWARDS_ZERO
+        elif round_mode == 'towards_infinity':
+            c_round_mode = CRoundMode_TOWARDS_INFINITY
+        elif round_mode == 'half_down':
+            c_round_mode = CRoundMode_HALF_DOWN
+        elif round_mode == 'half_up':
+            c_round_mode = CRoundMode_HALF_UP
+        elif round_mode == 'half_towards_zero':
+            c_round_mode = CRoundMode_HALF_TOWARDS_ZERO
+        elif round_mode == 'half_towards_infinity':
+            c_round_mode = CRoundMode_HALF_TOWARDS_INFINITY
+        elif round_mode == 'half_to_even':
+            c_round_mode = CRoundMode_HALF_TO_EVEN
+        elif round_mode == 'half_to_odd':
+            c_round_mode = CRoundMode_HALF_TO_ODD
+        else:
+            raise ValueError(
+                '"{}" is not a valid round mode'
+                .format(round_mode))

Review comment:
       Can probably factor all this out in a `cdef` function:
   ```cython
   
   cdef CRoundMode unwrap_round_mode(round_mode) except *:
       if round_mode == 'down':
           return CRoundMode_DOWN
       elif round_mode == 'up':
           return CRoundMode_UP
       elif round_mode == 'towards_zero':
           return CRoundMode_TOWARDS_ZERO
       elif round_mode == 'towards_infinity':
           return CRoundMode_TOWARDS_INFINITY
       elif round_mode == 'half_down':
           return CRoundMode_HALF_DOWN
       elif round_mode == 'half_up':
           return CRoundMode_HALF_UP
       elif round_mode == 'half_towards_zero':
           return CRoundMode_HALF_TOWARDS_ZERO
       elif round_mode == 'half_towards_infinity':
           return CRoundMode_HALF_TOWARDS_INFINITY
       elif round_mode == 'half_to_even':
           return CRoundMode_HALF_TO_EVEN
       elif round_mode == 'half_to_odd':
           return CRoundMode_HALF_TO_ODD
       else:
           raise ValueError(
               '"{}" is not a valid round mode'
               .format(round_mode))
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -66,18 +65,20 @@ class TestUnaryArithmetic : public TestBase {
     return *arrow::MakeScalar(type_singleton(), value);
   }
 
+  void SetUp() override {}
+
   // (CScalar, CScalar)
   void AssertUnaryOp(UnaryFunction func, CType argument, CType expected) {
     auto arg = MakeScalar(argument);
     auto exp = MakeScalar(expected);
-    ASSERT_OK_AND_ASSIGN(auto actual, func(arg, options_, nullptr));
+    ASSERT_OK_AND_ASSIGN(auto actual, func(arg, options_, NULLPTR));

Review comment:
       Those changes shouldn't be necessary; `NULLPTR` is only required in header files.

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       And in case an optional tolerance is actually useful, I would really make it 0 by default.
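
A small sketch of what a zero default would mean for the helpers, assuming they keep the `fabs(a - b) <= abs_tol` form shown in the diff. It is simplified to `double` only and is not the PR's code: with `abs_tol = 0` the comparison is exact, and callers opt in to fuzziness explicitly.

```cpp
#include <cmath>

// Tolerance helpers with a default of 0, i.e. exact comparison.
inline bool IsApproxEqual(double a, double b, double abs_tol = 0.0) {
  return std::fabs(a - b) <= abs_tol;
}

// True when the fractional part of val is 0.5, within abs_tol.
inline bool IsApproxHalfInt(double val, double abs_tol = 0.0) {
  return IsApproxEqual(val - std::floor(val), 0.5, abs_tol);
}
```

For example, `IsApproxHalfInt(2.5)` holds under the exact default, while `IsApproxHalfInt(2.500001)` only holds once a tolerance such as `1e-5` is passed.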

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1667,6 +1930,22 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments to a given precision element-wise",

Review comment:
       So just "Round to a given precision" perhaps?

##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1316,6 +1318,67 @@ def test_arithmetic_multiply():
     assert result.equals(expected)
 
 
+@pytest.mark.parametrize("ty", ["round", "mround"])
+def test_round_to_integer(ty):
+    if ty == "round":
+        round = pc.round
+        RoundOptions = partial(pc.RoundOptions, ndigits=0)
+    elif ty == "mround":
+        round = pc.mround
+        RoundOptions = partial(pc.MRoundOptions, multiple=1)
+
+    values = [3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None]
+    rmode_and_expected_map = {
+        "down": [3, 3, 3, 4, -4, -4, -4, None],
+        "up": [4, 4, 4, 5, -3, -3, -3, None],
+        "towards_zero": [3, 3, 3, 4, -3, -3, -3, None],
+        "towards_infinity": [4, 4, 4, 5, -4, -4, -4, None],
+        "half_down": [3, 3, 4, 4, -3, -4, -4, None],
+        "half_up": [3, 4, 4, 5, -3, -3, -4, None],
+        "half_towards_zero": [3, 3, 4, 4, -3, -3, -4, None],
+        "half_towards_infinity": [3, 4, 4, 5, -3, -4, -4, None],
+        "half_to_even": [3, 4, 4, 4, -3, -4, -4, None],
+        "half_to_odd": [3, 3, 4, 5, -3, -3, -4, None],
+    }
+    for round_mode, expected in rmode_and_expected_map.items():
+        options = RoundOptions(round_mode=round_mode)
+        result = round(values, options=options)
+        assert np.all(np.isclose(result, pa.array(expected), equal_nan=True))

Review comment:
       Also, the NumPy [testing assertions](https://numpy.org/doc/stable/reference/routines.testing.html) should give a more informative output on failure ;-)





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659706688



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       Ah, glad you got it figured out!





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674397590



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,52 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding modes for Round and MRound functions. General modes are prefixed
+/// with TOWARDS and tie-breaker modes are prefixed with HALF. Common aliases
+/// are available for several modes.

Review comment:
       So long as they are consistent it should be ok. It sounds like the general definition would be best.
   
   I think the duplication is ok. If there's no difference we might actually prefer to have Floor/Ceil/Trunc be aliases or call into the more general kernel.
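
One way to read the "call into the more general kernel" idea, sketched with plain functions instead of Arrow kernels. The three-mode enum is a subset of the PR's `RoundMode`, and `Floor`, `Ceil` and `Trunc` here are hypothetical free functions, not the registered compute functions.

```cpp
#include <cmath>

// Subset of the PR's RoundMode, enough for the three existing functions.
enum class RoundMode { DOWN, UP, TOWARDS_ZERO };

// General implementation, specialized per rounding mode.
template <RoundMode Mode>
struct RoundImpl;

template <>
struct RoundImpl<RoundMode::DOWN> {
  static double Round(double val) { return std::floor(val); }
};

template <>
struct RoundImpl<RoundMode::UP> {
  static double Round(double val) { return std::ceil(val); }
};

template <>
struct RoundImpl<RoundMode::TOWARDS_ZERO> {
  static double Round(double val) { return std::trunc(val); }
};

// Thin aliases that forward to the general implementation.
inline double Floor(double val) { return RoundImpl<RoundMode::DOWN>::Round(val); }
inline double Ceil(double val) { return RoundImpl<RoundMode::UP>::Round(val); }
inline double Trunc(double val) {
  return RoundImpl<RoundMode::TOWARDS_ZERO>::Round(val);
}
```

The aliases then cannot drift from the general definitions, which is the consistency concern raised above.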





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703916892



##########
File path: cpp/src/arrow/util/map.h
##########
@@ -17,13 +17,21 @@
 
 #pragma once
 
+#include <type_traits>
 #include <utility>
 
 #include "arrow/result.h"
 
 namespace arrow {
 namespace internal {
 
+/// Functor to make enums hashable.
+template <typename T, typename R = typename std::underlying_type<T>::type,
+          typename = std::enable_if<std::is_enum<T>::value>>
+struct EnumHash {
+  R operator()(T val) const { return static_cast<R>(val); }
+};

Review comment:
       It was used for `std::unordered_map<RoundMode, ...>`, but that map has been removed and replaced with a switch statement inside a function. It is also still used in `scalar_arithmetic_test.cc`, but that use will go away once the maps there are converted to `std::vector<std::pair<...>>`.
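
For context, a sketch of the kind of map the functor enabled. The two-value enum and the `RoundModeName` lookup are illustrative assumptions, and the `std::is_enum` guard from the diff is left out to keep the example short.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <unordered_map>

// Subset of the PR's RoundMode, for illustration only.
enum class RoundMode : int8_t { DOWN, UP };

// Hash an enum by its underlying integer value, as in the diff above.
template <typename T, typename R = typename std::underlying_type<T>::type>
struct EnumHash {
  R operator()(T val) const { return static_cast<R>(val); }
};

// Hypothetical lookup that uses the enum as a hash-map key.
std::string RoundModeName(RoundMode mode) {
  static const std::unordered_map<RoundMode, std::string, EnumHash<RoundMode>>
      names{{RoundMode::DOWN, "down"}, {RoundMode::UP, "up"}};
  auto it = names.find(mode);
  return it != names.end() ? it->second : "unknown";
}
```

Once every such map is replaced by a switch or a vector of pairs, nothing references `EnumHash` and it can be dropped from `map.h`.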





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703465795



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1444,7 +1463,7 @@ std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
   auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
                                                                 &kDefaultOptions);
   for (const auto& ty : FloatingPointTypes()) {
-    std::unordered_map<RoundMode, ArrayKernelExec, EnumHash<RoundMode>>
+    std::unordered_map<RoundMode, ArrayKernelExec, ::arrow::internal::EnumHash<RoundMode>>

Review comment:
       I changed it to a switch-case inside a function.
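
A hedged sketch of the switch-based dispatch described here, using plain function pointers rather than `ArrayKernelExec`. `RoundFn`, `RoundWith` and `SelectRoundFn` are hypothetical names, and only four of the ten modes are shown.

```cpp
#include <cmath>

// Subset of the PR's RoundMode.
enum class RoundMode { DOWN, UP, TOWARDS_ZERO, TOWARDS_INFINITY };

using RoundFn = double (*)(double);

// One instantiation per rounding mode.
template <RoundMode Mode>
double RoundWith(double val);

template <>
double RoundWith<RoundMode::DOWN>(double val) { return std::floor(val); }

template <>
double RoundWith<RoundMode::UP>(double val) { return std::ceil(val); }

template <>
double RoundWith<RoundMode::TOWARDS_ZERO>(double val) { return std::trunc(val); }

template <>
double RoundWith<RoundMode::TOWARDS_INFINITY>(double val) {
  return std::signbit(val) ? std::floor(val) : std::ceil(val);
}

// Map the runtime option to a compile-time instantiation with a switch,
// so no hash map (and no EnumHash) is needed.
RoundFn SelectRoundFn(RoundMode mode) {
  switch (mode) {
    case RoundMode::DOWN:
      return &RoundWith<RoundMode::DOWN>;
    case RoundMode::UP:
      return &RoundWith<RoundMode::UP>;
    case RoundMode::TOWARDS_ZERO:
      return &RoundWith<RoundMode::TOWARDS_ZERO>;
    case RoundMode::TOWARDS_INFINITY:
      return &RoundWith<RoundMode::TOWARDS_INFINITY>;
  }
  return nullptr;  // Not reached for valid enum values.
}
```

Leaving out a `default:` label lets compilers that warn on unhandled enumerators flag a newly added mode.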





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692358623



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       I added comments to individual enum members and a reference to `compute.rst`.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703722756



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       Initially, that is what I did, to avoid division operations and use only multiplies (`round(val * 10^3) * 10^-3`). But I found out that this was the culprit behind the floating-point precision errors (well, more of them than there should be). After looking at several rounding implementations (e.g., NumPy and CPython), I noticed that they do divide. So I applied the division and the results were much less noisy (I wish I could explain this, but it seems the errors cancel out better). I also extracted CPython's round function, converted its divisions into reciprocal multiplies, and observed the same noisy output. For this reason, the round functions multiply and divide using power-of-10 factors no smaller than 1, rather than multiplying by a reciprocal.
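
The effect can be reproduced outside Arrow with a small sketch, assuming IEEE 754 `double` arithmetic and a non-negative `ndigits`. The function names are made up for illustration, and `std::round` stands in for whichever rounding mode is selected.

```cpp
#include <cmath>
#include <cstdint>

// Power of ten for a small non-negative exponent, built by repeated
// multiplication so that the result is exact for the exponents used here.
inline double Pow10(int64_t power) {
  double result = 1.0;
  for (int64_t i = 0; i < power; ++i) result *= 10.0;
  return result;
}

// Scale up, round, then divide by the same factor.
inline double RoundWithDivide(double val, int64_t ndigits) {
  const double scale = Pow10(ndigits);
  return std::round(val * scale) / scale;
}

// Multiply-only variant: scale up, round, then multiply by the reciprocal.
inline double RoundWithReciprocal(double val, int64_t ndigits) {
  const double scale = Pow10(ndigits);
  const double inv_scale = 1.0 / scale;  // 0.1 is not exactly representable
  return std::round(val * scale) * inv_scale;
}
```

Rounding `0.3` to one digit shows the difference: `RoundWithDivide(0.3, 1)` gives back `0.3`, because `3.0 / 10.0` is correctly rounded to the nearest double, whereas `RoundWithReciprocal(0.3, 1)` yields `0.30000000000000004`, because the rounding error already present in `0.1` is carried into the product.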





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705496648



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&

Review comment:
       Good catch! Non-type template parameters with an underlying integer type (such as enum values) can be used in "runtime" conditionals. I had done this elsewhere, but it escaped me here.
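
   To illustrate the point: a non-type template parameter of enum type is a compile-time constant, so it can appear in an ordinary `if` and the branch is resolved per instantiation. This is a minimal sketch, assuming a hypothetical `Mode` enum and `RoundWith` function that are not part of the Arrow code base.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical enum for illustration only. Tie-breaking (HALF_*) modes are
// declared last so that a single ordered comparison can detect them.
enum class Mode : int { DOWN, UP, HALF_DOWN, HALF_UP };

// M is a non-type template parameter, yet it can be compared in a plain
// `if`: the condition is a constant expression for each instantiation.
template <Mode M>
double RoundWith(double val) {
  if (M >= Mode::HALF_DOWN) {
    // Tie-breaking modes: only exact halves need special handling.
    const double frac = val - std::floor(val);
    if (frac != 0.5) return std::round(val);
    return (M == Mode::HALF_DOWN) ? std::floor(val) : std::ceil(val);
  }
  return (M == Mode::DOWN) ? std::floor(val) : std::ceil(val);
}
```

   The ordered comparison only works if the enum keeps the tie-breaking modes grouped at the end, which is an assumption of this sketch.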





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r707507850



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1352,6 +1403,213 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+TYPED_TEST_SUITE(TestUnaryRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundFloating, FloatingTypes);
+
+const std::vector<RoundMode> kRoundModes{
+    RoundMode::DOWN,
+    RoundMode::UP,
+    RoundMode::TOWARDS_ZERO,
+    RoundMode::TOWARDS_INFINITY,
+    RoundMode::HALF_DOWN,
+    RoundMode::HALF_UP,
+    RoundMode::HALF_TOWARDS_ZERO,
+    RoundMode::HALF_TOWARDS_INFINITY,
+    RoundMode::HALF_TO_EVEN,
+    RoundMode::HALF_TO_ODD,
+};
+
+TYPED_TEST(TestUnaryRoundSigned, Round) {
+  // Test different rounding modes for integer rounding
+  std::string values("[0, 1, -13, -50, 115]");
+  this->SetRoundNdigits(0);
+  for (const auto& round_mode : kRoundModes) {
+    this->SetRoundMode(round_mode);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), values));
+  }
+
+  // Test different round N-digits for nearest rounding mode
+  std::vector<std::pair<int64_t, std::string>> ndigits_and_expected{{
+      {-2, "[0, 0, -0, -100, 100]"},
+      {-1, "[0, 0, -10, -50, 120]"},
+      {0, values},
+      {1, values},
+      {2, values},
+  }};
+  this->SetRoundMode(RoundMode::HALF_TOWARDS_INFINITY);
+  for (const auto& pair : ndigits_and_expected) {
+    this->SetRoundNdigits(pair.first);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), pair.second));
+  }
+}
+
+TYPED_TEST(TestUnaryRoundUnsigned, Round) {
+  // Test different rounding modes for integer rounding
+  std::string values("[0, 1, 13, 50, 115]");
+  this->SetRoundNdigits(0);
+  for (const auto& round_mode : kRoundModes) {
+    this->SetRoundMode(round_mode);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), values));
+  }
+
+  // Test different round N-digits for nearest rounding mode
+  std::vector<std::pair<int64_t, std::string>> ndigits_and_expected{{
+      {-2, "[0, 0, 0, 100, 100]"},
+      {-1, "[0, 0, 10, 50, 120]"},
+      {0, values},
+      {1, values},
+      {2, values},
+  }};
+  this->SetRoundMode(RoundMode::HALF_TOWARDS_INFINITY);
+  for (const auto& pair : ndigits_and_expected) {
+    this->SetRoundNdigits(pair.first);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), pair.second));
+  }
+}
+
+TYPED_TEST(TestUnaryRoundFloating, Round) {
+  this->SetNansEqual(true);
+
+  // Test different rounding modes for integer rounding

Review comment:
       Yes, the first set of test cases covers options configured for rounding to an integer.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704874792



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;

Review comment:
       I was able to solve this issue by creating and specializing `RoundOptionsWrapper` (which is a `KernelState`) to:
   * validate the `multiple` option of `round_to_multiple` in `Init()`
   * extend kernel state with `pow10` data member for `round`
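
   The idea can be sketched independently of Arrow's kernel machinery. This is a simplified illustration only: `RoundOpts`, `RoundToMultipleOpts`, `RoundState`, `Pow10`, and `ValidateMultiple` are hypothetical stand-ins, not the actual `RoundOptionsWrapper` or `KernelState` types. The point is that `pow10` is computed once at initialization rather than per element, and an invalid `multiple` is rejected before any value is processed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical stand-ins for the options classes.
struct RoundOpts { int64_t ndigits; };
struct RoundToMultipleOpts { double multiple; };

// Repeated multiplication keeps small powers of ten exact in double.
inline double Pow10(int64_t n) {
  double p = 1.0;
  for (int64_t i = 0; i < n; ++i) p *= 10.0;
  return p;
}

// State built once per kernel invocation: caches 10^|ndigits|.
struct RoundState {
  RoundOpts options;
  double pow10;

  explicit RoundState(RoundOpts opts)
      : options(opts), pow10(Pow10(std::llabs(opts.ndigits))) {}

  // Per-element work reuses the cached power: multiply for ndigits >= 0,
  // divide otherwise, round, then undo the scaling.
  double Apply(double val) const {
    if (!std::isfinite(val)) return val;
    const bool positive = options.ndigits >= 0;
    const double scaled = positive ? val * pow10 : val / pow10;
    const double rounded = std::round(scaled);
    return positive ? rounded / pow10 : rounded * pow10;
  }
};

// Validation performed at initialization; an empty string means success.
inline std::string ValidateMultiple(const RoundToMultipleOpts& opts) {
  if (!(opts.multiple > 0)) {
    return "Rounding multiple has to be a positive value";
  }
  return "";
}
```

   In the real kernel the validation result would be reported through `Status` from `Init()`; the string return here only keeps the sketch self-contained.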





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659965886



##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |

Review comment:
       Well, I was trying to mimic the [string transforms](https://arrow.apache.org/docs/cpp/compute.html#string-transforms) table, but noticed that there was a typo, so now *Notes* and *Options class* are different columns.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674818236



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       `ndigits` and `multiple` can take any value (except NaN or Inf). If they are too large or too small, they can produce `NaN` or `Inf` results; what counts as "too large" or "too small" depends on the floating-point type of the output.
   
   If any validation were added, I would recommend doing it when the `RoundOptions` value is set, but since the data members can be assigned directly (not through `set` methods), this is not feasible.
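
   A small sketch of why the acceptable range depends on the output type: a scale factor that is representable in `double` can already overflow `float`. `ScaleByPow10` is a hypothetical helper used only for illustration, not a function from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: scales val by 10^ndigits in the precision of T, the
// same kind of scaling a round kernel performs before rounding.
template <typename T>
T ScaleByPow10(T val, int ndigits) {
  T pow10 = T(1);
  for (int i = 0; i < std::abs(ndigits); ++i) pow10 *= T(10);
  return (ndigits >= 0) ? val * pow10 : val / pow10;
}
```

   With `T = float`, `ndigits = 39` pushes the scale factor past the largest finite `float`, so a non-zero input becomes `Inf` and a zero input becomes `NaN`, whereas `double` still handles the same `ndigits`. A single fixed limit on `ndigits` would therefore be too strict for one type or too lenient for the other.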





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r693986079



##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1299,6 +1301,98 @@ def test_arithmetic_multiply():
     assert result.equals(expected)
 
 
+@pytest.mark.parametrize("ty", ["round", "mround"])
+def test_round_to_integer(ty):
+    if ty == "round":
+        func = pc.round
+        RoundOptions = pc.RoundOptions
+    elif ty == "mround":
+        func = pc.mround
+        RoundOptions = pc.MRoundOptions
+
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_modes = [
+        "down", "up", "towards_zero", "towards_infinity", "half_down",
+        "half_up", "half_towards_zero", "half_towards_infinity",
+        "half_to_even", "half_to_odd",
+    ]
+    arr = pa.array([3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None])
+    for round_mode in round_modes:
+        options = RoundOptions(round_mode=round_mode)
+        result = func(arr, options=options).to_numpy(zero_copy_only=False)
+        if round_mode == "down":
+            expected_ = [3, 3, 3, 4, -4, -4, -4, None]
+        elif round_mode == "up":
+            expected_ = [4, 4, 4, 5, -3, -3, -3, None]
+        elif round_mode == "towards_zero":
+            expected_ = [3, 3, 3, 4, -3, -3, -3, None]
+        elif round_mode == "towards_infinity":
+            expected_ = [4, 4, 4, 5, -4, -4, -4, None]
+        elif round_mode == "half_down":
+            expected_ = [3, 3, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_up":
+            expected_ = [3, 4, 4, 5, -3, -3, -4, None]
+        elif round_mode == "half_towards_zero":
+            expected_ = [3, 3, 4, 4, -3, -3, -4, None]
+        elif round_mode == "half_towards_infinity":
+            expected_ = [3, 4, 4, 5, -3, -4, -4, None]
+        elif round_mode == "half_to_even":
+            expected_ = [3, 4, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_to_odd":
+            expected_ = [3, 3, 4, 5, -3, -3, -4, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_round():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_ndigits = [-2, -1, 0, 1, 2]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for ndigits in round_ndigits:
+        options = pc.RoundOptions(
+            ndigits=ndigits, round_mode="half_towards_infinity")
+        result = pc.round(arr, options=options).to_numpy(zero_copy_only=False)
+        if ndigits == -2:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+        elif ndigits == -1:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif ndigits == 0:
+            expected_ = [320, 4, 3, 5, -3, -35, -3, None]
+        elif ndigits == 1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif ndigits == 2:
+            expected_ = [320, 3.5, 3.08, 4.5, -3.21, -35.12, -3.05, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_mround():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    multiples = [-2, -0.05, 0.1, 0, 10, 100]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for multiple in multiples:
+        options = pc.MRoundOptions(
+            multiple=multiple, round_mode="half_towards_infinity")
+        result = pc.mround(arr, options=options).to_numpy(zero_copy_only=False)
+        if multiple == -2:
+            expected_ = [320, 4, 4, 4, -4, -36, -4, None]
+        elif multiple == -0.05:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3.05, None]
+        elif multiple == 0.1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif multiple == 0:
+            expected_ = [0, 0, 0, 0, 0, 0, 0, None]
+        elif multiple == 10:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif multiple == 100:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)

Review comment:
       Is there a reason why we have to unset zero_copy_only? False is the default value anyways.

Review comment:
       And this could just be `np.array(...)` instead of bouncing through PyArrow (though as noted below, you don't even need to do that).

##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1299,6 +1301,98 @@ def test_arithmetic_multiply():
     assert result.equals(expected)
 
 
+@pytest.mark.parametrize("ty", ["round", "mround"])
+def test_round_to_integer(ty):
+    if ty == "round":
+        func = pc.round
+        RoundOptions = pc.RoundOptions
+    elif ty == "mround":
+        func = pc.mround
+        RoundOptions = pc.MRoundOptions
+
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_modes = [
+        "down", "up", "towards_zero", "towards_infinity", "half_down",
+        "half_up", "half_towards_zero", "half_towards_infinity",
+        "half_to_even", "half_to_odd",
+    ]
+    arr = pa.array([3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None])
+    for round_mode in round_modes:
+        options = RoundOptions(round_mode=round_mode)
+        result = func(arr, options=options).to_numpy(zero_copy_only=False)
+        if round_mode == "down":
+            expected_ = [3, 3, 3, 4, -4, -4, -4, None]
+        elif round_mode == "up":
+            expected_ = [4, 4, 4, 5, -3, -3, -3, None]
+        elif round_mode == "towards_zero":
+            expected_ = [3, 3, 3, 4, -3, -3, -3, None]
+        elif round_mode == "towards_infinity":
+            expected_ = [4, 4, 4, 5, -4, -4, -4, None]
+        elif round_mode == "half_down":
+            expected_ = [3, 3, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_up":
+            expected_ = [3, 4, 4, 5, -3, -3, -4, None]
+        elif round_mode == "half_towards_zero":
+            expected_ = [3, 3, 4, 4, -3, -3, -4, None]
+        elif round_mode == "half_towards_infinity":
+            expected_ = [3, 4, 4, 5, -3, -4, -4, None]
+        elif round_mode == "half_to_even":
+            expected_ = [3, 4, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_to_odd":
+            expected_ = [3, 3, 4, 5, -3, -3, -4, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_round():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_ndigits = [-2, -1, 0, 1, 2]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for ndigits in round_ndigits:
+        options = pc.RoundOptions(
+            ndigits=ndigits, round_mode="half_towards_infinity")
+        result = pc.round(arr, options=options).to_numpy(zero_copy_only=False)
+        if ndigits == -2:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+        elif ndigits == -1:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif ndigits == 0:
+            expected_ = [320, 4, 3, 5, -3, -35, -3, None]
+        elif ndigits == 1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif ndigits == 2:
+            expected_ = [320, 3.5, 3.08, 4.5, -3.21, -35.12, -3.05, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_mround():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    multiples = [-2, -0.05, 0.1, 0, 10, 100]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for multiple in multiples:
+        options = pc.MRoundOptions(
+            multiple=multiple, round_mode="half_towards_infinity")
+        result = pc.mround(arr, options=options).to_numpy(zero_copy_only=False)
+        if multiple == -2:
+            expected_ = [320, 4, 4, 4, -4, -36, -4, None]
+        elif multiple == -0.05:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3.05, None]
+        elif multiple == 0.1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif multiple == 0:
+            expected_ = [0, 0, 0, 0, 0, 0, 0, None]
+        elif multiple == 10:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif multiple == 100:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))

Review comment:
       nit: it would be more natural to use np.all(np.isclose(arr1, arr2, equal_nan=True)) instead of using map/partial/all.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703909808



##########
File path: cpp/src/arrow/python/python_test.cc
##########
@@ -194,7 +194,7 @@ TEST(PyBuffer, InvalidInputObject) {
 // ("unresolved external symbol arrow_ARRAY_API referenced").
 #ifndef _WIN32
 TEST(PyBuffer, NumpyArray) {
-  const npy_intp dims[1] = {10};
+  npy_intp dims[1] = {10};

Review comment:
       I do not recall whether it was in CI or with a compiler on my side, but the build was triggering an error w.r.t. `const npy_intp *` in `PyArray_SimpleNew`. I had not seen this issue before; the `const` qualifier is present in the latest NumPy API but was not there in previous versions, so most likely I compiled with an older NumPy and got these messages. I reverted this change, but note that this TEST uses `const npy_intp` and the one following it does not.





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674765714



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       We're not validating ndigits anywhere.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {

Review comment:
       nit: I think you can write this as
   
   ```cpp
   template <typename T>
   static constexpr enable_if_t<std::is_floating_point<T>::value, bool> ApproxEqual(...)
   ```
   
   if I'm not mistaken.
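   
   A minimal sketch of the two spellings side by side. It assumes `std::enable_if_t` from `<type_traits>` (C++14) as a stand-in for Arrow's own `enable_if_t` alias, and the names `ApproxEqualParam` and `ApproxEqualRet` are hypothetical. `constexpr` is left out here because `std::fabs` is not a constant expression before C++23.
   
   ```cpp
   #include <cmath>
   #include <limits>
   #include <type_traits>
   
   // SFINAE through a defaulted non-type template parameter (the form in the diff).
   template <typename T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
   bool ApproxEqualParam(const T x, const T y) {
     return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
   }
   
   // SFINAE through the return type (the suggested form): the enable_if_t
   // resolves to `bool` for floating-point T and removes the overload otherwise.
   template <typename T>
   std::enable_if_t<std::is_floating_point<T>::value, bool> ApproxEqualRet(const T x,
                                                                          const T y) {
     return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
   }
   ```
   
   Both forms reject non-floating-point arguments at overload resolution; the return-type form just keeps the template parameter list shorter.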

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       I think it's fine to keep the docstring, and it might also be nice to document the individual enum members (e.g. note that TOWARDS_NEG_INFINITY is also known as 'floor' or 'downward'). Also the comment about the indices is probably redundant - that's the default behavior for C++ and presumably we aren't adding more modes.
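   
   As a rough illustration of per-member documentation, something like the sketch below could work. The member names are the ones used by the `RoundImpl` specializations in this PR; the declaration order and the wording of each comment are assumptions made for the example, not the actual header contents.
   
   ```cpp
   /// Rounding and tie-breaking modes (illustrative documentation only).
   enum class RoundMode {
     /// Round to the nearest integer less than or equal to the value
     /// (aka "floor", "downward")
     TOWARDS_NEG_INFINITY,
     /// Round to the nearest integer greater than or equal to the value
     /// (aka "ceil", "upward")
     TOWARDS_POS_INFINITY,
     /// Discard the fractional part (aka "truncate")
     TOWARDS_ZERO,
     /// Round away from zero
     TOWARDS_INFINITY,
     /// Round to nearest, ties go towards negative infinity
     HALF_TOWARDS_NEG_INFINITY,
     /// Round to nearest, ties go towards positive infinity
     HALF_TOWARDS_POS_INFINITY,
     /// Round to nearest, ties go towards zero
     HALF_TOWARDS_ZERO,
     /// Round to nearest, ties go away from zero
     HALF_TOWARDS_INFINITY,
     /// Round to nearest, ties go to the nearest even integer
     HALF_TO_EVEN,
     /// Round to nearest, ties go to the nearest odd integer
     HALF_TO_ODD,
   };
   ```
   
   No comment about indices is needed: without explicit initializers, the enumerators are numbered from 0 in declaration order.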





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r695036917



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -52,6 +53,13 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
+/// Functor to calculate hash of an enum.

Review comment:
       nit: maybe this should go in arrow/util, e.g. arrow/util/map.h or possibly arrow/util/hashing.h?
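   
   For context, a generic enum-hash functor of the kind that could live in a utility header might look like the sketch below. The names `EnumHash`, `Color`, and `MakeNames` are hypothetical and not taken from the PR. Such a functor is mainly useful with standard libraries that do not provide `std::hash` for enum types, which is only guaranteed from C++14 on.
   
   ```cpp
   #include <cstddef>
   #include <functional>
   #include <string>
   #include <type_traits>
   #include <unordered_map>
   
   // Hypothetical functor: hashes an enum value through its underlying integer type.
   struct EnumHash {
     template <typename E>
     std::size_t operator()(E value) const {
       static_assert(std::is_enum<E>::value, "EnumHash requires an enum type");
       using U = typename std::underlying_type<E>::type;
       return std::hash<U>{}(static_cast<U>(value));
     }
   };
   
   enum class Color { RED, GREEN };  // stand-in enum for the example
   
   // Using the functor as the hasher of an unordered_map keyed by an enum.
   std::unordered_map<Color, std::string, EnumHash> MakeNames() {
     return {{Color::RED, "red"}, {Color::GREEN, "green"}};
   }
   ```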





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703917203



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -20,14 +20,16 @@
 #include <limits>
 #include <utility>
 
-#include "arrow/compute/kernels/codegen_internal.h"
+#include "arrow/compare.h"
+#include "arrow/compute/api_scalar.h"
 #include "arrow/compute/kernels/common.h"
 #include "arrow/compute/kernels/util_internal.h"
 #include "arrow/type.h"
 #include "arrow/type_traits.h"
 #include "arrow/util/decimal.h"
 #include "arrow/util/int_util_internal.h"
 #include "arrow/util/macros.h"
+#include "arrow/util/map.h"

Review comment:
       Not anymore.





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r658733299



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       The default options instance needs to be declared static, e.g. see JoinOptions: https://github.com/apache/arrow/blob/c4a20e98a3294b32e51c879e927878e9fb6e799b/cpp/src/arrow/compute/kernels/scalar_string.cc#L3566
   
   Currently, the default options are allocated on the stack in RegisterScalarArithmetic and will go away when the function returns. Then when you call the function, it references the already-deallocated options and crashes.
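   
   The lifetime issue can be sketched without Arrow at all. In the snippet below, `RoundOptionsSketch`, `FunctionSketch`, and `RegisterSketch` are hypothetical stand-ins; the only assumption carried over from the real code is that the function object keeps a non-owning pointer to its default options.
   
   ```cpp
   // Hypothetical stand-in for the options class.
   struct RoundOptionsSketch {
     int ndigits = 0;
   };
   
   // Hypothetical stand-in for a registered function: it stores a pointer
   // to the default options but does not own them.
   struct FunctionSketch {
     const RoundOptionsSketch* default_options = nullptr;
   };
   
   // Static storage duration: this object outlives every call to RegisterSketch().
   static const RoundOptionsSketch kDefaultRoundOptions{};
   
   FunctionSketch RegisterSketch() {
     // BAD (deliberately not written out): declaring a local
     // `RoundOptionsSketch opts;` here and storing `&opts` would leave
     // `default_options` dangling as soon as this function returns.
     FunctionSketch func;
     func.default_options = &kDefaultRoundOptions;
     return func;
   }
   ```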





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r660073364



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Actually, this was something I had thought about, but I did not know how/when to resolve function options that conditionally control which kernel is dispatched. With this knowledge, I make the following observations about conditionally controlled function and kernel dispatch, to keep such checks out of the hot loop of execution:
   1. If multiple function variants are available, they are selected explicitly by name when invoking `CallFunction`. Nevertheless, in the public API (e.g. [scalar](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.h)), the [function options can resolve the name of the variant to call](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.cc#L44-L48).
   2. If multiple kernel variants are available (and not resolved by input type), then function options can be resolved from `KernelContext` when selecting kernel generators (`ArrayKernelExec`). This may require the kernels to take the function option of interest as a template parameter.
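   
   A simplified, Arrow-free sketch of the second observation: the runtime option is turned into a template argument once, when the kernel is selected, so the per-element loop contains no branch on the mode. All names here (`Mode`, `RoundOp`, `Exec`, `SelectKernel`) are hypothetical, and the two-mode enum only stands in for `RoundMode`.
   
   ```cpp
   #include <cmath>
   #include <cstddef>
   #include <vector>
   
   enum class Mode { FLOOR, CEIL };  // hypothetical two-mode stand-in for RoundMode
   
   template <Mode M>
   struct RoundOp;
   
   template <>
   struct RoundOp<Mode::FLOOR> {
     static double Call(double v) { return std::floor(v); }
   };
   
   template <>
   struct RoundOp<Mode::CEIL> {
     static double Call(double v) { return std::ceil(v); }
   };
   
   // The "kernel": the mode is a compile-time parameter, so the loop body
   // does not inspect it per element.
   template <Mode M>
   void Exec(const std::vector<double>& in, std::vector<double>* out) {
     out->resize(in.size());
     for (std::size_t i = 0; i < in.size(); ++i) {
       (*out)[i] = RoundOp<M>::Call(in[i]);
     }
   }
   
   using ExecFn = void (*)(const std::vector<double>&, std::vector<double>*);
   
   // Resolve the runtime option to a concrete kernel once, before execution.
   ExecFn SelectKernel(Mode mode) {
     switch (mode) {
       case Mode::FLOOR:
         return Exec<Mode::FLOOR>;
       case Mode::CEIL:
         return Exec<Mode::CEIL>;
     }
     return nullptr;
   }
   ```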





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674819207



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       ndigits gets used as an index in Pow10 so it needs validation - right now it'll just crash.
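   
   One possible shape for the validation, as a sketch only: `std::array::at` throws `std::out_of_range` on a bad index, so a kernel with no handler would terminate. The version below checks the bounds first and reports an error instead. `Pow10Result` and `CheckedPow10` are hypothetical names, with the result struct standing in for a `Status`-style return; the table bounds (-19 to 19) are the ones in the diff.
   
   ```cpp
   #include <array>
   #include <cstddef>
   #include <string>
   
   // Hypothetical result type standing in for a Status/Result pair.
   struct Pow10Result {
     bool ok;
     double value;
     std::string error;
   };
   
   Pow10Result CheckedPow10(int power) {
     static constexpr int kMaxPower = 19;  // table covers 1e-19 .. 1e19
     static constexpr std::array<double, 2 * kMaxPower + 1> kPow10 = {{
         1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
         1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
         1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
         1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19}};
     // Validate before indexing so an unsupported value becomes an error
     // rather than an exception or an out-of-bounds read.
     if (power < -kMaxPower || power > kMaxPower) {
       return {false, 0.0, "power must be in [-19, 19], got " + std::to_string(power)};
     }
     return {true, kPow10[static_cast<std::size_t>(power + kMaxPower)], ""};
   }
   ```
   
   In the kernel itself, the same check on `ndigits` could be done once when the options are initialized rather than per element.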





[GitHub] [arrow] edponce removed a comment on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce removed a comment on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-869762925


   This PR is missing an implementation of `pow(10, x)` using a LUT (precomputed), instead of `std::pow`.



[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-868420891


   @bkietz @jorisvandenbossche Need feedback on this PR. Specifically, the rounding options provided and kernel implementations.



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659133597



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       I was able to find the issues triggering a segfault when calling `RoundState::Get(ctx)` in kernels. The first issue was resolved by @lidavidm's fix: `kRoundOptions` needs to be defined in global space. The second issue occurred because no `KernelInit` parameter (in this case `RoundOptions::Init`) was passed to `AddKernel`; the [default value is null](https://github.com/edponce/arrow/blob/master/cpp/src/arrow/compute/function.h#L267-L268), which [prevented options from being set in the `KernelState`](https://github.com/edponce/arrow/blob/master/cpp/src/arrow/compute/function.cc#L181-L184).





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674398043



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1606,6 +1774,20 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments element-wise",
+    ("Options are used to control the rounding mode and number of digits.\n"
+     "Default behavior is to round to nearest integer."),
+    {"x"},
+    "RoundOptions"};
+
+const FunctionDoc mround_doc{

Review comment:
       I was talking about the name used with CallFunction, sorry.
   
   I am OK with round/mround; you are right that there's not much consistency. It was just a thought, since it took a moment for mround to click, but of course I'm not so familiar with Excel et al.





[GitHub] [arrow] rok commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

rok commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-915993687


   > @rok It is a possibility, if we have the semantics pinned down w.r.t. how to shift specific timestamps (forward, backward, delta).
   
   Indeed that would probably need some work.
   
   Does this currently support rounding timestamps to an arbitrary multiple? If so, it might be good to limit it to timezone-less and UTC timestamps to avoid issues with ambiguous and nonexistent timestamps.
   



[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674333665



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1241,6 +1373,45 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode RndMode> class Op, typename Opts>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(
+    std::string name, const FunctionDoc* doc,
+    const FunctionOptions* default_options = NULLPTR, KernelInit init = NULLPTR) {
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                default_options);
+  for (const auto& ty : FloatingPointTypes()) {
+    // Order of ArrayKernelExec w.r.t. RoundMode needs to follow the order of
+    // values in RoundMode definition (see api_scalar.h) because they are used as
+    // indexing values.
+    std::vector<ArrayKernelExec> execs = {
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_ZERO>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TOWARDS_ZERO>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary,
+                                        Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_EVEN>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_ODD>>(ty),
+    };

Review comment:
       I wonder, given the branch predictor, if a switch-case might perform similarly to generating separate kernels for every type and rounding mode, if you're concerned about having to place the rounding mode in the template parameters.
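   
   For comparison, a minimal sketch of the switch-based alternative. It assumes a cut-down `RoundMode` and two hypothetical helpers, `RoundValue` and `RoundBatch`, none of which are the PR's code. The mode is a runtime value and the switch runs once per element; since the mode is constant across a batch, the branch should be easy to predict, though only a benchmark can tell whether it matches the templated kernels.
   
   ```cpp
   #include <cassert>
   #include <cmath>
   #include <cstddef>
   #include <vector>
   
   // Hypothetical subset of the PR's RoundMode enum, for illustration only.
   enum class RoundMode { DOWN, UP, TOWARDS_ZERO, HALF_TO_EVEN };
   
   // Single kernel body: the mode is a runtime value, so the switch is
   // evaluated for every element.
   inline double RoundValue(double val, RoundMode mode) {
     switch (mode) {
       case RoundMode::DOWN:
         return std::floor(val);
       case RoundMode::UP:
         return std::ceil(val);
       case RoundMode::TOWARDS_ZERO:
         return std::trunc(val);
       case RoundMode::HALF_TO_EVEN:
         // Relies on the default FE_TONEAREST floating-point environment,
         // under which nearbyint rounds ties to the even neighbor.
         return std::nearbyint(val);
     }
     assert(false && "unhandled RoundMode");
     return val;
   }
   
   // Hot loop: the mode never changes within a batch.
   inline std::vector<double> RoundBatch(const std::vector<double>& in,
                                         RoundMode mode) {
     std::vector<double> out(in.size());
     for (std::size_t i = 0; i < in.size(); ++i) {
       out[i] = RoundValue(in[i], mode);
     }
     return out;
   }
   ```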

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1606,6 +1774,20 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments element-wise",

Review comment:
       Maybe "Round arguments to a [decimal place/multiple] element-wise"?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1333,6 +1400,305 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+constexpr std::initializer_list<RoundMode> kRoundModes = {

Review comment:
       Having a list like this is fine. We probably don't want to expose EnumTraits<Foo> (unless you really want to make another internal header/translation unit to hold the definitions).

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1606,6 +1774,20 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments element-wise",
+    ("Options are used to control the rounding mode and number of digits.\n"
+     "Default behavior is to round to nearest integer."),
+    {"x"},
+    "RoundOptions"};
+
+const FunctionDoc mround_doc{

Review comment:
       nit: perhaps RoundToMultiple (and possibly RoundToPlace) might be clearer than round/mround (unless mround is a common name? I see Excel uses it at least)

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,52 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding modes for Round and MRound functions. General modes are prefixed
+/// with TOWARDS and tie-breaker modes are prefixed with HALF. Common aliases
+/// are available for several modes.

Review comment:
       Are the aliases worth keeping? Maybe it's better to just document the various names for them but keep only a single name for each possible case.

##########
File path: docs/source/python/api/compute.rst
##########
@@ -127,6 +129,18 @@ variants which detect domain errors where appropriate.
    tan
    tan_checked
 
+Rounding

Review comment:
       Did you mean to add rounding functions twice in this file?





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703722756



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       Initially, that is what I did, to avoid division operations and use only multiplies (`round(val * 10^3) * 10^-3`). But I found that this was the culprit of the floating-point precision errors (well, more of them than there should be). After looking at several rounding implementations (e.g., [NumPy](https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589) and [CPython](https://hg.python.org/cpython/file/1f1498fe50e5/Objects/floatobject.c#l571)), I noticed that they do divide. So I applied the division and the results were much less noisy (I wish I could explain this, but it seems that the errors cancelled out better). I also extracted CPython's round function, converted the divisions into reciprocal multiplies, and observed the same noisy output. For this reason, the round functions multiply and divide using only power-of-10 factors greater than or equal to 1.
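   
   A small sketch of the difference, using two hypothetical helpers (`RoundViaDivide` and `RoundViaReciprocal`, not the PR's functions). One plausible explanation for the noise: IEEE 754 division is correctly rounded, so dividing the rounded integer by an exactly representable power of ten yields the double closest to the decimal result, whereas `10^-n` is itself inexact and multiplying by it compounds two rounding errors.
   
   ```cpp
   #include <cassert>
   #include <cmath>
   
   // Scale up, round, then divide by the same power of ten (scale = 10^ndigits).
   inline double RoundViaDivide(double val, double scale) {
     assert(scale >= 1.0);
     return std::round(val * scale) / scale;
   }
   
   // Scale up, round, then multiply by the reciprocal (inv_scale = 10^-ndigits).
   // inv_scale is generally not exactly representable in binary floating point.
   inline double RoundViaReciprocal(double val, double scale, double inv_scale) {
     assert(scale >= 1.0);
     return std::round(val * scale) * inv_scale;
   }
   ```
   
   With IEEE 754 doubles, `RoundViaDivide(0.29, 10.0)` compares equal to the literal `0.3`, while `RoundViaReciprocal(0.29, 10.0, 0.1)` does not, because `3.0 * 0.1` evaluates to `0.30000000000000004`.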





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705571901



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       I templatized the `Init` method of `KernelState`-derived classes so that derived constructors can be invoked without duplicating the `Init` definition. This makes `KernelState` fully capable of supporting options validation and of extending kernel state via the constructor, which will most likely benefit other scalar kernels as well.
   
   For example, by extending `OptionsWrapper` as follows:
   ```c++
   template <typename OptionsType>
   struct OptionsWrapper : public KernelState {
     template <typename KernelStateType = OptionsWrapper>
     static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx, const KernelInitArgs& args) {
       if (auto options = static_cast<const OptionsType*>(args.options)) {
           return ::arrow::internal::make_unique<KernelStateType>(*options);
       }
       ...
     }
     ...
   };
   ```
   now we can extend custom states as follows:
   ```c++
   template <>
   struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
     using OptionsType = RoundOptions;
     using State = RoundOptionsWrapper<OptionsType>;
     double pow10;
   
     explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
       pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
     }
   
     static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx, const KernelInitArgs& args) {
       return OptionsWrapper<OptionsType>::Init<State>(ctx, args);
     }
   };
   ```
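   
   The same pattern, reduced to a self-contained sketch that compiles without Arrow. `KernelState`, `RoundOptions`, `RoundState`, and `GetRoundState` here are simplified, hypothetical stand-ins, and a null pointer replaces the `Result`/`Status::Invalid` error path of the real code.
   
   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <cstdlib>
   #include <memory>
   #include <utility>
   
   // Hypothetical stand-ins for Arrow's KernelState and RoundOptions.
   struct KernelState {
     virtual ~KernelState() = default;
   };
   
   struct RoundOptions {
     int64_t ndigits = 0;
   };
   
   template <typename OptionsType>
   struct OptionsWrapper : public KernelState {
     explicit OptionsWrapper(OptionsType opts) : options(std::move(opts)) {}
   
     // KernelStateType selects which (possibly derived) state is constructed,
     // so derived classes do not need to repeat the null check.
     template <typename KernelStateType = OptionsWrapper>
     static std::unique_ptr<KernelState> Init(const OptionsType* opts) {
       if (opts == nullptr) return nullptr;  // stands in for Status::Invalid
       return std::unique_ptr<KernelState>(new KernelStateType(*opts));
     }
   
     OptionsType options;
   };
   
   struct RoundState : public OptionsWrapper<RoundOptions> {
     explicit RoundState(RoundOptions opts) : OptionsWrapper<RoundOptions>(opts) {
       // Precompute 10^|ndigits| once, when the state is created.
       pow10 = 1.0;
       for (int64_t i = 0; i < std::llabs(options.ndigits); ++i) pow10 *= 10.0;
     }
   
     static std::unique_ptr<KernelState> Init(const RoundOptions* opts) {
       return OptionsWrapper<RoundOptions>::Init<RoundState>(opts);
     }
   
     double pow10;
   };
   
   // Downcast helper, loosely analogous to fetching the state from a context.
   inline const RoundState& GetRoundState(const KernelState* state) {
     assert(state != nullptr);
     return static_cast<const RoundState&>(*state);
   }
   ```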





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r660073364



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Actually, this was something I thought about, but I did not know how or when to resolve function options that conditionally control which kernel is dispatched. With this knowledge, I make the following observations on conditionally controlled function and kernel dispatching, which keep such checks out of the hot loop of execution:
   1. If multiple function variants are available, then these are explicitly selected by name when invoking `CallFunction`. Nevertheless, in the public API (e.g., [scalar](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.h)) the function options can resolve the name of the variant to call.
   2. If multiple kernel variants are available (and not resolved by input type), then function options can be resolved from `KernelContext` when selecting kernel generators (`ArrayKernelExec`). This may require the kernels to have a template parameter for the function option of interest.





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r634139326



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       But should a `round` kernel ever change the type? IMO, when rounding floats, the result can still be floats, regardless of the number of decimals to round to.





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-869762925


   This PR is missing an implementation of `pow(10, x)` using a LUT (precomputed), instead of `std::pow`.
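   
   A minimal sketch of what such a lookup could look like, assuming a hypothetical helper named `Pow10Lut` limited to non-negative exponents. The table stops at 10^22 because that is the largest power of ten exactly representable as a double.
   
   ```cpp
   #include <cassert>
   #include <cstdint>
   
   // Returns 10^power from a precomputed table instead of calling std::pow.
   // Only exponents in [0, 22] are supported in this sketch.
   inline double Pow10Lut(int64_t power) {
     static constexpr double lut[] = {
         1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,  1e8,  1e9,  1e10, 1e11,
         1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
     constexpr int64_t lut_size =
         static_cast<int64_t>(sizeof(lut) / sizeof(lut[0]));
     assert(power >= 0 && power < lut_size);
     return lut[power];
   }
   ```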



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703727560



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       Another option is to keep the tolerance internal, for comparing floating-point numbers, without exposing it to client code as an option.
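   
   Roughly, that option could look like the sketch below. The namespace `round_sketch` and the constant `kAbsoluteTolerance` are hypothetical names; the value `1e-5` mirrors `kDefaultAbsoluteTolerance` from the diff, and the helpers follow the `IsApprox*` functions from the earlier revision.
   
   ```cpp
   #include <cassert>
   #include <cmath>
   
   namespace round_sketch {
   
   // Internal constant: callers cannot change it through an options struct.
   constexpr double kAbsoluteTolerance = 1e-5;
   
   inline bool IsApproxEqual(double a, double b) {
     return std::fabs(a - b) <= kAbsoluteTolerance;
   }
   
   // True if val is within the tolerance of an integer.
   inline bool IsApproxInt(double val) {
     assert(std::isfinite(val));
     return IsApproxEqual(val, std::round(val));
   }
   
   // True if the fractional part of val is within the tolerance of 0.5.
   inline bool IsApproxHalfInt(double val) {
     assert(std::isfinite(val));
     return IsApproxEqual(val - std::floor(val), 0.5);
   }
   
   }  // namespace round_sketch
   ```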





[GitHub] [arrow] edponce removed a comment on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce removed a comment on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-842580763


   There is an unresolved discussion on rounding modes and on possibly providing options to specify them; refer to https://issues.apache.org/jira/browse/ARROW-12744



[GitHub] [arrow] bkietz commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659917735



##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |

Review comment:
       Nit: using `=` like this is confusing
   ```suggestion
   | TOWARDS_NEG_INFINITY    | 3.7 -> 3, -3.2 -> -4            |
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Let's ensure this switch is outside the hot loop. Here that would probably entail making `round_mode` a template parameter of `RoundUtils::Round`, `struct Round`, and `struct MRound`. You can then build a vector of `ArrayKernelExec`s and select from it outside the loop, something like:
   
   ```c++
     auto func =
         std::make_shared<ArithmeticUnaryFloatOnlyFunction>(name, Arity::Unary(), doc, default_options);
     for (const auto& ty : {float32(), float64()}) {
       std::vector<ArrayKernelExec> execs = {
         Round<RoundMode::DOWNWARD>,
         Round<RoundMode::UPWARD>,
         //...
       };
       auto exec = [execs](KernelContext* ctx, const ExecBatch& batch, Datum* out) {
         RoundMode round_mode = OptionsWrapper<RoundOptions>::Get(ctx).round_mode;
         return execs[static_cast<size_t>(round_mode)](ctx, batch, out);
       };
       DCHECK_OK(func->AddKernel({ty}, out_ty, exec, init));
     }
     return func;
   ```
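
As a self-contained illustration of the idea above (choosing the implementation once, outside the per-element loop), here is a minimal sketch that does not depend on Arrow. The names `Mode`, `RoundOne`, `RoundAll`, and `RoundDispatch` are hypothetical, only three modes are shown, and plain `std::vector<double>` stands in for Arrow arrays.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the PR's RoundMode; only three modes are shown.
enum class Mode : std::size_t { DOWN = 0, UP = 1, TOWARDS_ZERO = 2 };

// One rounding primitive per mode, selected at compile time.
template <Mode M>
double RoundOne(double v);
template <>
double RoundOne<Mode::DOWN>(double v) { return std::floor(v); }
template <>
double RoundOne<Mode::UP>(double v) { return std::ceil(v); }
template <>
double RoundOne<Mode::TOWARDS_ZERO>(double v) { return std::trunc(v); }

// One loop instantiation per mode: the loop body never branches on the mode.
template <Mode M>
void RoundAll(const std::vector<double>& in, std::vector<double>* out) {
  out->resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) (*out)[i] = RoundOne<M>(in[i]);
}

using Exec = void (*)(const std::vector<double>&, std::vector<double>*);

// The mode is inspected once per call; the table order must match the enum.
inline void RoundDispatch(Mode mode, const std::vector<double>& in,
                          std::vector<double>* out) {
  static constexpr std::array<Exec, 3> kExecs = {
      &RoundAll<Mode::DOWN>, &RoundAll<Mode::UP>, &RoundAll<Mode::TOWARDS_ZERO>};
  const auto idx = static_cast<std::size_t>(mode);
  assert(idx < kExecs.size());
  kExecs[idx](in, out);
}
```

The cast to `std::size_t` is needed because a scoped enum does not convert implicitly to an index.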

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,

Review comment:
       ```suggestion
     scale and precision.  Only the absolute value of the ``rounding multiple`` is used,
   ```
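
To illustrate why only the absolute value of the multiple matters, here is a minimal sketch of rounding to a multiple. `RoundToMultiple` is a hypothetical helper, not the PR's `mround` kernel, and it always breaks ties away from zero (via `std::round`) instead of honoring a configurable round mode.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not Arrow API: rounds val to the nearest multiple of
// |multiple|. Ties go away from zero because std::round is used.
inline double RoundToMultiple(double val, double multiple) {
  assert(multiple != 0.0);               // a zero multiple is not meaningful here
  const double m = std::fabs(multiple);  // the sign of the multiple is ignored
  return std::round(val / m) * m;
}
```

With this sketch, `RoundToMultiple(17.0, 5.0)` and `RoundToMultiple(17.0, -5.0)` both yield `15.0`.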

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -485,6 +647,36 @@ ArrayKernelExec ArithmeticExecFromOp(detail::GetTypeId get_id) {
   }
 }
 
+template <template <typename... Args> class KernelGenerator, typename Op>
+ArrayKernelExec GenerateArithmeticWithFloatOutType(detail::GetTypeId get_id) {

Review comment:
       Instead, wouldn't it make more sense to just insert an implicit cast to floating point?

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |

Review comment:
       ```suggestion
   | Function name | Arity      | Input types | Output type | Notes  | Options class           |
   ```

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -42,6 +42,55 @@ struct ArithmeticOptions : public FunctionOptions {
   bool check_overflow;
 };
 
+enum class RoundMode {
+  // floor (towards negative infinity)
+  DOWNWARD,
+  TOWARDS_NEG_INFINITY,

Review comment:
       If these are equivalent, shouldn't they have the same value?
   ```suggestion
     DOWNWARD,
     TOWARDS_NEG_INFINITY = DOWNWARD,
   ```
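
A small standalone sketch of the aliasing suggested above, using a hypothetical `ModeSketch` enum instead of the PR's `RoundMode`. It shows that an alias shares its value with the original enumerator and that the next enumerator continues counting from the alias.

```cpp
// Hypothetical enum for illustration only; not the PR's RoundMode.
enum class ModeSketch : signed char {
  DOWNWARD,                         // 0
  TOWARDS_NEG_INFINITY = DOWNWARD,  // alias, also 0
  UPWARD,                           // 1: continues from the alias's value
  TOWARDS_POS_INFINITY = UPWARD,    // alias, also 1
};

static_assert(ModeSketch::DOWNWARD == ModeSketch::TOWARDS_NEG_INFINITY,
              "aliases compare equal");
static_assert(static_cast<int>(ModeSketch::UPWARD) == 1,
              "numbering continues after an alias");
```

One consequence of aliasing is that a `switch` may list only one `case` label per value; listing both `DOWNWARD` and `TOWARDS_NEG_INFINITY` would be a duplicate-case error.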

##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,7 +287,7 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric (1)         |
+| subtract                 | Binary     | Numeric            | Numeric             |

Review comment:
       Was this intentional?

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |
++-------------------------+---------------------------------+
+| UPWARD                  | Equivalent to ``ceil(x)``       |
+| TOWARDS_POS_INFINITY    | 3.2 = 4, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_ZERO            | Equivalent to ``trunc(x)``      |
+| AWAY_FROM_INFINITY      | 3.7 = 3, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_INFINITY        | 3.2 = 4, -3.2 = -4              |
+| AWAY_FROM_ZERO          |                                 |
++-------------------------+---------------------------------+
+| HALF_UP                 | 3.5 = 4, 4.5 = 5, -3.5 = -3     |
+| HALF_POS_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_DOWN               | 3.5 = 3, 4.5 = 4, -3.5 = -4     |
+| HALF_NEG_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_TO_EVEN            | 3.5 = 4, 4.5 = 4, -3.5 = -4     |
++-------------------------+---------------------------------+
+| HALF_TO_ODD             | 3.5 = 3, 4.5 = 5, -3.5 = -3     |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_ZERO       | 3.5 = 3, 4.5 = 4, -3.5 = -3     |
+| HALF_AWAY_FROM_INFINITY |                                 |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_INFINITY   | Round nearest integer           |
+| HALF_AWAY_FROM_ZERO     | 3.5 = 4, 4.5 = 5, -3.5 = -4     |
+| NEAREST                 |                                 |
++-------------------------+---------------------------------+
+
++----------------+---------------+----------------------------+
+| Round multiple | Round ndigits | Description                |
++================================+============================+
+| 1.0            | 0             | Round to integer           |

Review comment:
       Could you add some examples here too? Additionally, could you break this into two tables (one for `mround` and one for `round`)? As written, the table suggests that one could pass both `multiple` and `ndigits` in a single function call.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -20,6 +20,8 @@
 #include <limits>
 #include <utility>
 
+#include "arrow/compute/api_scalar.h"
+// #include "arrow/compute/kernels/codegen_internal.h"

Review comment:
       ```suggestion
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |
++-------------------------+---------------------------------+
+| UPWARD                  | Equivalent to ``ceil(x)``       |
+| TOWARDS_POS_INFINITY    | 3.2 = 4, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_ZERO            | Equivalent to ``trunc(x)``      |
+| AWAY_FROM_INFINITY      | 3.7 = 3, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_INFINITY        | 3.2 = 4, -3.2 = -4              |
+| AWAY_FROM_ZERO          |                                 |
++-------------------------+---------------------------------+
+| HALF_UP                 | 3.5 = 4, 4.5 = 5, -3.5 = -3     |
+| HALF_POS_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_DOWN               | 3.5 = 3, 4.5 = 4, -3.5 = -4     |
+| HALF_NEG_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_TO_EVEN            | 3.5 = 4, 4.5 = 4, -3.5 = -4     |
++-------------------------+---------------------------------+
+| HALF_TO_ODD             | 3.5 = 3, 4.5 = 5, -3.5 = -3     |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_ZERO       | 3.5 = 3, 4.5 = 4, -3.5 = -3     |
+| HALF_AWAY_FROM_INFINITY |                                 |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_INFINITY   | Round nearest integer           |
+| HALF_AWAY_FROM_ZERO     | 3.5 = 4, 4.5 = 5, -3.5 = -4     |
+| NEAREST                 |                                 |
++-------------------------+---------------------------------+
+
++----------------+---------------+----------------------------+
+| Round multiple | Round ndigits | Description                |
++================================+============================+
+| 1.0            | 0             | Round to integer           |
++----------------+--------------------------------------------+
+| 0.001          | 3             | Round to 3 decimal places  |
++----------------+--------------------------------------------+
+| 10             | -2            | Round to multiple of 10    |

Review comment:
       ```suggestion
   | 10             | -2            | Round to multiple of 100   |
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.

Review comment:
       ```suggestion
     non-fractional part. For example -2 corresponds to rounding to the nearest multiple of 100
     (zeroing the ones and tens digits).
   ```
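
The effect of the sign of `ndigits` can be sketched as follows. `RoundToDigits` is a hypothetical helper rather than the PR's `round` kernel, and it assumes ties are broken away from zero (via `std::round`) instead of using a configurable round mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper, not Arrow API. Ties round away from zero (std::round).
inline double RoundToDigits(double val, int ndigits) {
  // 10^22 is the largest power of ten that a double represents exactly.
  assert(ndigits >= -22 && ndigits <= 22);
  double pow10 = 1.0;
  for (int i = 0; i < std::abs(ndigits); ++i) pow10 *= 10.0;
  // ndigits >= 0: keep that many fractional digits.
  // ndigits <  0: zero out that many digits of the integer part.
  return ndigits >= 0 ? std::round(val * pow10) / pow10
                      : std::round(val / pow10) * pow10;
}
```

For example, `RoundToDigits(1234.0, -2)` gives `1200.0`: the ones and tens digits are zeroed.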

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Related: https://issues.apache.org/jira/browse/ARROW-13122
   
   Kernel variants *should not* occur; instead, we should have distinct kernels for each `RoundMode`. We can't do that currently because function options are not available at dispatch time. If they were, then instead of building a vector of execs we could simply construct one kernel per mode and let dispatch select the appropriate kernel for the given `RoundMode`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704005402



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;

Review comment:
       I agree. I tried to use [`KernelStateFromFunctionOptions`](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/codegen_internal.h#L96) but could not make it work. The issues I ran into were:
   * I do not know how to invoke `Init` from this wrapper while still allowing the existing `Init` to run first, so I computed `pow10` in the constructor instead.
   * An invalid `Status` cannot be reported from a constructor; it has to come from `Init`.
   * `Call` is a static function, but the `state_` member is not static, so `Call` cannot read it.
   
   Here is roughly what I tried:
   ```c++
   struct RoundState {
     RoundOptions options_;
     Status status_ = Status::OK();
     double pow10_;
   
     explicit RoundState(KernelContext* ctx, RoundOptions options) : options_(std::move(options)) {
       pow10_ = RoundUtil::Pow10<double>(std::llabs(options_.ndigits));
       if (std::isnan(pow10_)) {
         // NOTE: There is no way to signal this invalid status outside of Call (e.g., a PreExec method like in StringTransforms).
         status_ = Status::Invalid("out-of-range value for rounding digits");
       }
     }
   };
   
   template <RoundMode RndMode>
   struct Round {
     using State = KernelStateFromFunctionOptions<RoundState, RoundOptions>;
     const State& state_;
   
     // NOTE: `Call` is static, so it cannot read the non-static `state_` member;
     // this is the part that does not compile.
     template <typename T, typename Arg>
     static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
       static_assert(std::is_same<T, Arg>::value, "");
       // NOTE: There is no way to signal this invalid status outside of Call (e.g., a PreExec method).
       if (!state_.status_.ok()) {
         *st = state_.status_;
         return arg;
       } else if (!std::isfinite(arg)) {
         return arg;
       }
   
       auto pow10 = T(state_.pow10_);
       ...
     }
   };
   ```
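
One way around the constructor limitation described above is to validate the option in a factory that can report failure, and to keep only the per-value arithmetic in the loop. The following is a minimal sketch that does not use Arrow; `RoundStateSketch`, `MakeRoundState`, and `RoundWithState` are hypothetical names, and the error is returned through a plain `std::string` instead of a `Status`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical state object; the names are illustrative, not Arrow's.
struct RoundStateSketch {
  int64_t ndigits;
  double pow10;  // 10^|ndigits|, computed once instead of once per value
};

// Factory that validates the option and reports failure to the caller.
// Returns nullptr and fills *error when ndigits is out of range.
inline std::unique_ptr<RoundStateSketch> MakeRoundState(int64_t ndigits,
                                                        std::string* error) {
  assert(error != nullptr);
  constexpr int64_t kMaxDigits = 22;  // 10^22 is still exact in a double
  if (ndigits < -kMaxDigits || ndigits > kMaxDigits) {
    *error = "out-of-range value for rounding digits";
    return nullptr;
  }
  const int64_t n = ndigits < 0 ? -ndigits : ndigits;
  double pow10 = 1.0;
  for (int64_t i = 0; i < n; ++i) pow10 *= 10.0;
  return std::unique_ptr<RoundStateSketch>(new RoundStateSketch{ndigits, pow10});
}

// Per-value work: no validation is left in the loop body.
inline double RoundWithState(const RoundStateSketch& state, double val) {
  if (!std::isfinite(val)) return val;
  return state.ndigits >= 0 ? std::round(val * state.pow10) / state.pow10
                            : std::round(val / state.pow10) * state.pow10;
}
```

The range check happens before taking the absolute value, which also avoids the undefined behavior of negating the most negative `int64_t`.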





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703591017



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -53,16 +52,10 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
-/// Functor to calculate hash of an enum.
-template <typename T, typename R = typename std::underlying_type<T>::type,
-          typename = std::enable_if<std::is_enum<T>::value>>
-struct EnumHash {
-  R operator()(T val) const { return static_cast<R>(val); }
-};
-
 /// Rounding and tie-breaking modes for round compute functions.
 /// Additional details and examples are provided in compute.rst.
 enum class RoundMode : int8_t {
+  // Note: The HALF values need to be last and the first HALF entry is HALF_DOWN.

Review comment:
       Actually, I think the asserts for enum order would fit best in the Round/MRoundOptions constructors.
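
Wherever the checks end up, the ordering invariant can be expressed once next to the enum and reused. The sketch below mirrors the enumerator names from the `static_assert` quoted earlier in this thread, but `ModeSketch` and `IsHalfMode` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical copy of the mode list; the HALF_* entries come last and
// HALF_DOWN is the first of them.
enum class ModeSketch : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// Relies on the ordering above: every tie-breaking mode is >= HALF_DOWN.
constexpr bool IsHalfMode(ModeSketch mode) { return mode >= ModeSketch::HALF_DOWN; }

// Compile-time guards that fail if someone reorders the enum.
static_assert(!IsHalfMode(ModeSketch::TOWARDS_INFINITY),
              "non-HALF modes must precede HALF_DOWN");
static_assert(IsHalfMode(ModeSketch::HALF_DOWN) && IsHalfMode(ModeSketch::HALF_TO_ODD),
              "HALF modes must come last");
```

Because the guards are `static_assert`s at namespace scope, they cost nothing at run time and do not need to live in a constructor.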





[GitHub] [arrow] bkietz commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r660102749



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Related: https://issues.apache.org/jira/browse/ARROW-13122
   
   Kernel variants *should not* occur; instead we should have distinct kernels for each RoundMode. We currently can't, because we don't have access to function options at dispatch time. If we did, instead of building a vector of execs we could simply construct a kernel for each mode and let dispatch select the appropriate kernel for the given RoundMode.
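
A minimal sketch of the idea in this comment, not Arrow's actual dispatch code: the rounding mode becomes a template parameter of each kernel, and a runtime value is mapped to the matching kernel once, before the loop runs. `Mode`, `RoundOne`, `RoundArray`, `ArrayExec`, and `SelectExec` are hypothetical names introduced only for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the PR's RoundMode; only three modes are shown.
enum class Mode { kDown, kUp, kTowardsZero };

// One rounding routine per mode, selected at compile time.
template <Mode M>
double RoundOne(double val);

template <>
inline double RoundOne<Mode::kDown>(double val) {
  return std::floor(val);
}

template <>
inline double RoundOne<Mode::kUp>(double val) {
  return std::ceil(val);
}

template <>
inline double RoundOne<Mode::kTowardsZero>(double val) {
  return std::trunc(val);
}

// The mode is baked into the kernel, so the loop body has no branch on it.
template <Mode M>
void RoundArray(const double* in, double* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = RoundOne<M>(in[i]);
}

using ArrayExec = void (*)(const double*, double*, std::size_t);

// Resolve the runtime mode to a kernel once, outside the hot loop.
inline ArrayExec SelectExec(Mode mode) {
  switch (mode) {
    case Mode::kDown:
      return RoundArray<Mode::kDown>;
    case Mode::kUp:
      return RoundArray<Mode::kUp>;
    case Mode::kTowardsZero:
      return RoundArray<Mode::kTowardsZero>;
  }
  return nullptr;  // not reached for valid enum values
}
```

In this sketch the `switch` runs once per call rather than once per element, which is the property the comment asks for.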




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674278374



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -485,6 +647,36 @@ ArrayKernelExec ArithmeticExecFromOp(detail::GetTypeId get_id) {
   }
 }
 
+template <template <typename... Args> class KernelGenerator, typename Op>
+ArrayKernelExec GenerateArithmeticWithFloatOutType(detail::GetTypeId get_id) {

Review comment:
       Used [`ArithmeticFloatingPointFunction`](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L1151) instead, which casts integers to floating-point.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674818621



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {

Review comment:
       Yes, you are right. Thanks!





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674365913



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1606,6 +1774,20 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments element-wise",
+    ("Options are used to control the rounding mode and number of digits.\n"
+     "Default behavior is to round to nearest integer."),
+    {"x"},
+    "RoundOptions"};
+
+const FunctionDoc mround_doc{

Review comment:
       I am still unsure how to name functions, as I have seen reviews from both camps:
   * use the short name commonly used by other tools
   * use a long form that is explicit and unambiguous
   
   Are you referring solely to the C++ classes/variables, or also to the name used with `CallFunction`?
   `mround` does come from Excel/Office-like tools.
   
   I am OK with being explicit in the C++ code by using `RoundToMultiple` and `RoundToDigits`. Do you also suggest that the names for `CallFunction` be `round_to_multiple` and `round_to_digits`, and that the options be `RoundToMultipleOptions` and `RoundToDigitsOptions`? The long form may be too verbose.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r660073364



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Actually, this was something I had thought about, but I did not know how or when to resolve function options that conditionally control which kernel is dispatched. With this knowledge, I make the following observations on conditionally controlled function and kernel dispatching, with the goal of keeping such checks out of the hot loop of execution:
   1. If multiple function variants are available, they are explicitly selected by name when invoking `CallFunction`. Nevertheless, in the public API (e.g. [scalar](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.h)) the function options can be used to resolve the name of the variant to call.
   2. If multiple kernel variants are available (and not resolved by input type), the function options can be resolved from `KernelContext` when creating the kernel generators (`ArrayKernelExec`). This may require the kernels to take the function option of interest as a template parameter.
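
The first observation above can be sketched as follows. This is a toy model under stated assumptions, not Arrow's API: `RoundCallOptions`, `ResolveFunctionName`, `CallByName`, and this `Round` wrapper are hypothetical, and treating "checked" as "reject non-finite input" is an assumption made only so the two variants differ. The names `round` and `round_checked` come from the PR description.

```cpp
#include <cmath>
#include <stdexcept>
#include <string>

// Hypothetical options struct; the real options classes in the PR differ.
struct RoundCallOptions {
  explicit RoundCallOptions(bool check = false) : check_finite(check) {}
  bool check_finite;
};

// The options pick the registered name before any data is touched.
inline std::string ResolveFunctionName(const RoundCallOptions& options) {
  return options.check_finite ? "round_checked" : "round";
}

// Toy replacement for a registry lookup followed by a call.
inline double CallByName(const std::string& name, double val) {
  if (name == "round") return std::round(val);
  if (name == "round_checked") {
    // Assumed meaning of "checked" for this sketch only.
    if (!std::isfinite(val)) throw std::invalid_argument("non-finite input");
    return std::round(val);
  }
  throw std::invalid_argument("unknown function: " + name);
}

// Convenience wrapper in the style of a public API entry point.
inline double Round(double val,
                    const RoundCallOptions& options = RoundCallOptions()) {
  return CallByName(ResolveFunctionName(options), val);
}
```

The caller only sees `Round(val, options)`; which variant runs is decided once per call from the options, not per element.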





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703695272



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");

Review comment:
       I moved it to the Round/MRoundOptions constructor as the enum pertains more to the options.
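
A small sketch of what moving the check into an options constructor could look like. The enum below is a hypothetical copy of the layout implied by the assertion message, `ModeOptions` and `IsHalfMode` are names invented here, and the idea that the ordering exists to allow a single comparison for "is this a HALF mode" is an assumption.

```cpp
#include <cstdint>

// Hypothetical enum: non-HALF modes first, HALF modes last, HALF_DOWN first
// among the HALF modes.
enum class Mode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD
};

struct ModeOptions {
  explicit ModeOptions(Mode m = Mode::HALF_TO_EVEN) : mode(m) {
    // Compile-time guard: reordering the enum breaks the build here.
    static_assert(Mode::HALF_DOWN > Mode::DOWN && Mode::HALF_DOWN > Mode::UP &&
                      Mode::HALF_DOWN > Mode::TOWARDS_ZERO &&
                      Mode::HALF_DOWN > Mode::TOWARDS_INFINITY &&
                      Mode::HALF_DOWN < Mode::HALF_UP &&
                      Mode::HALF_DOWN < Mode::HALF_TOWARDS_ZERO &&
                      Mode::HALF_DOWN < Mode::HALF_TOWARDS_INFINITY &&
                      Mode::HALF_DOWN < Mode::HALF_TO_EVEN &&
                      Mode::HALF_DOWN < Mode::HALF_TO_ODD,
                  "HALF modes must come last and start with HALF_DOWN");
  }
  Mode mode;
};

// With the ordering guaranteed, one comparison classifies a mode.
constexpr bool IsHalfMode(Mode mode) { return mode >= Mode::HALF_DOWN; }
```

Placing the assertion next to the options keeps the invariant close to the enum it protects, whichever kernels end up relying on it.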





[GitHub] [arrow] github-actions[bot] commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-842568791


   https://issues.apache.org/jira/browse/ARROW-12744



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703909808



##########
File path: cpp/src/arrow/python/python_test.cc
##########
@@ -194,7 +194,7 @@ TEST(PyBuffer, InvalidInputObject) {
 // ("unresolved external symbol arrow_ARRAY_API referenced").
 #ifndef _WIN32
 TEST(PyBuffer, NumpyArray) {
-  const npy_intp dims[1] = {10};
+  npy_intp dims[1] = {10};

Review comment:
       I do not recall whether it was in CI or with a compiler on my side, but the build was triggering an error related to `const npy_intp *` in `PyArray_SimpleNew`. I had not seen this issue before; the `const` qualifier is present in the latest NumPy API but was not there at some point. I reverted this change, but note that this TEST uses `const npy_intp` and the one below it does not.
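
The kind of error described here can be reproduced without NumPy. In the sketch below, `LegacyElementCount` is a hypothetical stand-in for a C API whose dimensions parameter is a non-const pointer, and `long` stands in for `npy_intp`; none of this is NumPy code.

```cpp
// Hypothetical older-style C API: the dims pointer is not const-qualified.
inline long LegacyElementCount(int nd, long* dims) {
  long count = 1;
  for (int i = 0; i < nd; ++i) count *= dims[i];
  return count;
}

// A non-const array is accepted whether or not the API is const-qualified.
inline long CountWithMutableDims() {
  long dims[1] = {10};
  return LegacyElementCount(1, dims);
}

inline long CountWithConstDims() {
  const long dims[1] = {10};
  // LegacyElementCount(1, dims);  // does not compile: const long* to long*
  // The cast is acceptable only because the callee never writes to dims.
  return LegacyElementCount(1, const_cast<long*>(dims));
}
```

Dropping `const` from the local array, as the diff hunk does, is the less invasive of the two workarounds when a build has to support both signatures.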





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674365913



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1606,6 +1774,20 @@ const FunctionDoc trunc_doc{
     ("Calculate the nearest integer not greater in magnitude than to the "
      "argument element-wise."),
     {"x"}};
+
+const FunctionDoc round_doc{
+    "Round the arguments element-wise",
+    ("Options are used to control the rounding mode and number of digits.\n"
+     "Default behavior is to round to nearest integer."),
+    {"x"},
+    "RoundOptions"};
+
+const FunctionDoc mround_doc{

Review comment:
       I am still unsure how to name functions, as I have seen reviews from both camps:
   * use the short name commonly used by other tools
   * use a long form that is explicit and unambiguous
   
   Are you referring solely to the C++ classes/variables, or also to the name used with `CallFunction`?
   `mround` does come from Excel/Office-like tools.
   
   I am OK with being explicit in the C++ code by using `RoundToMultiple` and `RoundToPlace`. Do you also suggest that the names for `CallFunction` be `round_to_multiple` and `round_to_place`?





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692372557



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       Right, I just don't see why you couldn't have them all inherit from the Base variant.

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       Awesome, thanks.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705487276



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       No, it is `RoundOptionsWrapper` so that its constructor is invoked via `make_unique`, which initializes the `pow10` data member. If we use `OptionsWrapper`, then `pow10` is not available. I tried invoking `OptionsWrapper::Init`, but it returns a `std::unique_ptr`, which would first require "casting" to `RoundOptionsWrapper` and then to `KernelState` to match the return type. The `unique_ptr` casting caused too many issues, so I reverted to mimicking the `OptionsWrapper::Init` method.
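
The pointer-conversion point in this reply can be shown with a reduced model. `State`, `Options`, `OptionsState`, `RoundState`, and `InitRound` below are hypothetical stand-ins for `KernelState`, `RoundOptions`, `OptionsWrapper`, `RoundOptionsWrapper`, and its `Init`; the sketch assumes C++14 for `std::make_unique` and returns `nullptr` where the PR returns a `Status`.

```cpp
#include <cstdint>
#include <memory>

struct State {
  virtual ~State() = default;
};

struct Options {
  int64_t ndigits = 0;
};

struct OptionsState : State {
  explicit OptionsState(Options o) : options(o) {}
  Options options;
};

struct RoundState : OptionsState {
  explicit RoundState(Options o) : OptionsState(o), pow10(1.0) {
    // Precompute 10^|ndigits| once, so the kernel does not redo it per value.
    const int64_t n = o.ndigits < 0 ? -o.ndigits : o.ndigits;
    for (int64_t i = 0; i < n; ++i) pow10 *= 10.0;
  }
  double pow10;
};

// Constructing the derived type directly is the easy direction:
// unique_ptr<RoundState> converts implicitly to unique_ptr<State>.
inline std::unique_ptr<State> InitRound(const Options* options) {
  if (options == nullptr) return nullptr;
  return std::make_unique<RoundState>(*options);
}
```

The reverse direction, turning a `unique_ptr` to the base wrapper into one to the derived wrapper, has no implicit conversion and needs a `release()` plus `static_cast`, which is consistent with the difficulty described above.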





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692386394



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       Ok, I reworked the template specialization as subclasses of the Base variant. Thanks for this review!





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-842580763


   There is an unresolved discussion on rounding modes and on possibly providing options to specify them; refer to https://issues.apache.org/jira/browse/ARROW-12744



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659957016



##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |

Review comment:
       Agree, good observation.
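
To make notes (2) and (3) of the documentation hunk above concrete, here is a hedged sketch of rounding to a multiple and to a number of digits. It is not the PR's kernel: `RoundToMultiple` and `RoundToDigits` are used as hypothetical free functions, ties are broken with `std::nearbyint`, which rounds half to even only under the default floating-point rounding mode, and a zero `multiple` is not handled.

```cpp
#include <cmath>
#include <cstdlib>

// Round to the nearest multiple; only the magnitude of `multiple` is used.
inline double RoundToMultiple(double val, double multiple) {
  const double m = std::fabs(multiple);
  return std::nearbyint(val / m) * m;
}

// Round to `ndigits` decimal places; a negative value rounds digits to the
// left of the decimal point. Only positive powers of ten are formed, and the
// direction of the scaling is chosen by the sign of `ndigits`.
inline double RoundToDigits(double val, int ndigits) {
  double pow10 = 1.0;
  for (int i = 0; i < std::abs(ndigits); ++i) pow10 *= 10.0;
  return (ndigits >= 0) ? std::nearbyint(val * pow10) / pow10
                        : std::nearbyint(val / pow10) * pow10;
}
```

For example, `RoundToDigits(1234.0, -2)` scales down to 12.34, rounds to 12, and scales back up to 1200.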

##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,7 +287,7 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric (1)         |
+| subtract                 | Binary     | Numeric            | Numeric             |

Review comment:
       Not intentional at all. This was sloppy on my part.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |

Review comment:
       Well, I was trying to mimic the [string transforms](https://arrow.apache.org/docs/cpp/compute.html#string-transforms) table, but noticed that it had a typo, so now *Notes* and *Options class* are separate columns.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Actually, this was something I thought about, but I did not know how or when to resolve function options that conditionally control which kernel is dispatched. With this knowledge, I make the following observations on conditionally controlled function and kernel dispatching, so that such checks stay out of the hot loop of execution:
   1. If multiple function variants are available, then these are explicitly controlled by their name when invoking `CallFunction`. Nevertheless, in the public API (e.g. [scalar](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.h)) the [function options can resolve the variant's name to call](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/api_scalar.cc#L44-L48).
   2. If multiple kernel variants are available (and not resolved by input type), then function options can be resolved from `KernelContext` when selecting kernel generators (`ArrayKernelExec`). This may require the kernels to have a template parameter for the function option of interest.
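
The second observation can be sketched as follows. This is a minimal, self-contained illustration and not Arrow's kernel machinery: `RoundMode` is a reduced, hypothetical copy of the enum, and `RoundImpl`, `RoundArray`, `SelectRoundKernel`, and `RoundAll` are illustrative names. The runtime option is resolved once to a template instantiation, so the per-element loop contains no `switch`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reduced stand-in for the PR's RoundMode enum (hypothetical subset).
enum class RoundMode { TOWARDS_NEG_INFINITY, TOWARDS_POS_INFINITY, TOWARDS_ZERO };

// One specialization per mode; the mode is a compile-time parameter.
template <RoundMode Mode>
struct RoundImpl;

template <>
struct RoundImpl<RoundMode::TOWARDS_NEG_INFINITY> {
  static double Round(double v) { return std::floor(v); }
};

template <>
struct RoundImpl<RoundMode::TOWARDS_POS_INFINITY> {
  static double Round(double v) { return std::ceil(v); }
};

template <>
struct RoundImpl<RoundMode::TOWARDS_ZERO> {
  static double Round(double v) { return std::trunc(v); }
};

// The "kernel": a tight loop with no branching on the rounding mode.
template <RoundMode Mode>
void RoundArray(const double* in, double* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = RoundImpl<Mode>::Round(in[i]);
}

using RoundKernel = void (*)(const double*, double*, std::size_t);

// Resolve the runtime option to a kernel once, before execution starts.
inline RoundKernel SelectRoundKernel(RoundMode mode) {
  switch (mode) {
    case RoundMode::TOWARDS_NEG_INFINITY:
      return RoundArray<RoundMode::TOWARDS_NEG_INFINITY>;
    case RoundMode::TOWARDS_POS_INFINITY:
      return RoundArray<RoundMode::TOWARDS_POS_INFINITY>;
    case RoundMode::TOWARDS_ZERO:
      return RoundArray<RoundMode::TOWARDS_ZERO>;
  }
  return nullptr;  // not reached for valid enum values
}

// Convenience wrapper used to exercise the dispatch.
inline std::vector<double> RoundAll(const std::vector<double>& in, RoundMode mode) {
  std::vector<double> out(in.size());
  RoundKernel kernel = SelectRoundKernel(mode);
  assert(kernel != nullptr);
  kernel(in.data(), out.data(), in.size());
  return out;
}
```

The only branch on `mode` is in `SelectRoundKernel`, which runs once per call rather than once per element.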





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674353363



##########
File path: docs/source/python/api/compute.rst
##########
@@ -127,6 +129,18 @@ variants which detect domain errors where appropriate.
    tan
    tan_checked
 
+Rounding

Review comment:
       Not at all.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r634865428



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       I agree, but I was not sure if there was interest in mimicking the C++ [lround](https://en.cppreference.com/w/cpp/numeric/math/round) variants. For now, it seems reasonable to support only floats and return a result of the same type.
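
For context, a small sketch of the distinction being discussed: `std::round` keeps the floating-point type of its argument, whereas `std::lround` converts the rounded value to `long`. `RoundSameType` and `RoundToLong` are hypothetical wrapper names, not functions from the PR.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical wrapper: rounds half away from zero and keeps the input type.
template <typename T>
T RoundSameType(T value) {
  static_assert(std::is_floating_point<T>::value, "floating-point input expected");
  return std::round(value);
}

// Hypothetical wrapper: same rounding rule, but the result is an integer.
// The result is unspecified if the rounded value does not fit in a long.
template <typename T>
long RoundToLong(T value) {
  static_assert(std::is_floating_point<T>::value, "floating-point input expected");
  assert(std::isfinite(value));  // NaN and infinities have no integer result
  return std::lround(value);
}

static_assert(std::is_same<decltype(RoundSameType(1.5f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(RoundToLong(1.5)), long>::value,
              "double in, long out");
```

Returning the same type avoids the range problem of the `lround` family, since every rounded `float` or `double` is still representable in its own type.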





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659957016



##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |

Review comment:
       Agree, good observation.





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692127936



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       We don't use exceptions; I think NaN is fine.
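
One way to get that behaviour, sketched under the assumption that the lookup table keeps the 39 entries shown in the diff (powers -19 through 19): replace the throwing `std::array::at` call with an explicit bounds check that yields NaN. `Pow10OrNaN` is a hypothetical name, not the function that ended up in the PR.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical replacement for a throwing `pow10.at(power + 19)` lookup.
// Returns 10^power for -19 <= power <= 19 and a quiet NaN otherwise.
inline double Pow10OrNaN(int power) {
  static constexpr double kPow10[] = {
      1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
      1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
      1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
      1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
  constexpr int kOffset = 19;
  constexpr int kSize = static_cast<int>(sizeof(kPow10) / sizeof(kPow10[0]));
  static_assert(kSize == 2 * kOffset + 1, "table must cover -19..19");
  // Check the range before indexing so that no exception path is needed.
  if (power < -kOffset || power > kOffset) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return kPow10[power + kOffset];
}
```

Because NaN propagates through the later division and multiplication, an out-of-range `ndigits` would surface as NaN results rather than as a thrown `std::out_of_range`.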





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674997744



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       I would say either provide checked and unchecked variants, or else raise an error (since really, it's about the option being invalid). I don't think it's too useful to round to 0 or not round.
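
The second alternative (reject the invalid option up front) might look like the sketch below. It is deliberately detached from Arrow: `ToyStatus`, `ValidateMultiple`, and `RoundToMultiple` are hypothetical names, and the sketch assumes that a rounding multiple of zero (or a non-finite one) is the invalid case under discussion.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <utility>

// Toy stand-in for an error-carrying status object (not arrow::Status).
struct ToyStatus {
  bool ok;
  std::string message;
  static ToyStatus OK() { return {true, ""}; }
  static ToyStatus Invalid(std::string msg) { return {false, std::move(msg)}; }
};

// Validate the option once, before any value is rounded.
inline ToyStatus ValidateMultiple(double multiple) {
  if (!std::isfinite(multiple) || multiple == 0.0) {
    return ToyStatus::Invalid("rounding multiple must be finite and non-zero");
  }
  return ToyStatus::OK();
}

// Round `value` to the nearest multiple; only the magnitude of `multiple` is used.
// On an invalid option, `*out` is left untouched and an error is returned.
inline ToyStatus RoundToMultiple(double value, double multiple, double* out) {
  ToyStatus st = ValidateMultiple(multiple);
  if (!st.ok) return st;
  const double mult = std::fabs(multiple);
  *out = std::round(value / mult) * mult;
  return ToyStatus::OK();
}
```

Validating once per call keeps the check out of the per-element path and gives the caller an explicit error instead of a silently zeroed result.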





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659133597



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       I was able to find the issues triggering a segfault when calling `RoundState::Get(ctx)` in kernels. The first issue was resolved by @lidavidm's fix: `kRoundOptions` needs to be defined in global space. The second issue occurred because no `KernelInit` parameter (in this case `RoundOptions::Init`) was passed to `AddKernel`, and the [default value is null](https://github.com/edponce/arrow/blob/master/cpp/src/arrow/compute/function.h#L267-L268), which [prevented options from being set in the `KernelState`](https://github.com/edponce/arrow/blob/master/cpp/src/arrow/compute/function.cc#L181-L184).
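
The second issue can be illustrated with a toy model. Nothing below is Arrow's API: `ToyOptions`, `ToyState`, `ToyKernel`, `ToyContext`, `PrepareContext`, `GetOptions`, and `InitFromOptions` are hypothetical stand-ins, under the assumption that kernel state is only created when the kernel carries a non-null init callback.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct ToyOptions {
  int ndigits = 0;
};

// State that holds a copy of the options for the kernel to read.
struct ToyState {
  ToyOptions options;
};

using ToyInit = std::function<std::unique_ptr<ToyState>(const ToyOptions&)>;

struct ToyKernel {
  ToyInit init;  // empty by default, mirroring a null init callback
};

struct ToyContext {
  std::unique_ptr<ToyState> state;  // stays null unless init ran
};

// Called once before execution: state is populated only if init is set.
inline ToyContext PrepareContext(const ToyKernel& kernel, const ToyOptions& options) {
  ToyContext ctx;
  if (kernel.init) ctx.state = kernel.init(options);
  return ctx;
}

// Safe accessor: returns nullptr instead of dereferencing missing state.
inline const ToyOptions* GetOptions(const ToyContext& ctx) {
  return ctx.state ? &ctx.state->options : nullptr;
}

// An init callback that copies the options into fresh state.
inline std::unique_ptr<ToyState> InitFromOptions(const ToyOptions& options) {
  std::unique_ptr<ToyState> state(new ToyState());
  state->options = options;
  return state;
}
```

In this model, a kernel registered without `init` leaves `ctx.state` null, so an accessor that dereferences it unconditionally would crash, which is the failure mode described above.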




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703657703



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,

Review comment:
       The method, no, but the array, yes, which it should be by default. I initially had it as a static variable so that it is not created on every function call, but my reasoning is that, since it consists of constant literals, it should be readily available in the stack frame.
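
For comparison, one way to keep the table out of the per-call path is a function-local `static constexpr` array. This is a minimal sketch and not the PR's code: the name `Pow10Sketch`, the `double`-only signature, and the table size are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns 10^power for 0 <= power <= 15.
// The table is a function-local static constexpr array, so it is
// initialized at compile time and is not rebuilt on each call.
inline double Pow10Sketch(int64_t power) {
  static constexpr double kLut[] = {1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
                                    1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15};
  constexpr int64_t kLutSize =
      static_cast<int64_t>(sizeof(kLut) / sizeof(kLut[0]));
  // Out-of-range exponents are a caller error in this sketch.
  assert(power >= 0 && power < kLutSize);
  return kLut[power];
}
```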





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r698884590



##########
File path: cpp/src/arrow/util/map.h
##########
@@ -17,13 +17,21 @@
 
 #pragma once
 
+#include <type_traits>
 #include <utility>
 
 #include "arrow/result.h"
 
 namespace arrow {
 namespace internal {
 
+/// Functor to calculate hash of an enum.

Review comment:
       Change description to "Functor to make enums hashable."
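
As a minimal sketch of what such a functor can look like (the names `EnumHashSketch`, `Color`, and `ColorName` are hypothetical, and this is not necessarily the PR's implementation), the enum value is cast to its underlying integer type and hashed with `std::hash`, which makes the enum usable as a key in unordered containers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Hypothetical functor: hashes an enum through its underlying integer type.
template <typename T,
          typename = typename std::enable_if<std::is_enum<T>::value>::type>
struct EnumHashSketch {
  std::size_t operator()(T val) const {
    using U = typename std::underlying_type<T>::type;
    return std::hash<U>{}(static_cast<U>(val));
  }
};

// Hypothetical enum used only to exercise the functor.
enum class Color : int8_t { kRed, kGreen };

// Looks up a name in an unordered_map keyed by the enum.
inline const std::string& ColorName(Color color) {
  static const std::unordered_map<Color, std::string, EnumHashSketch<Color>>
      kNames = {{Color::kRed, "red"}, {Color::kGreen, "green"}};
  auto it = kNames.find(color);
  assert(it != kNames.end());
  return it->second;
}
```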





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674368136



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,52 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding modes for Round and MRound functions. General modes are prefixed
+/// with TOWARDS and tie-breaker modes are prefixed with HALF. Common aliases
+/// are available for several modes.

Review comment:
       IMHO, aliases are not worth keeping, but I was not sure what API to use:
   * aliases are based on [C++ rounding modes](https://en.cppreference.com/w/cpp/numeric/fenv/FE_round) (which are a subset)
   * "true" rounding modes are based on [general definition](https://en.wikipedia.org/wiki/Rounding#Rounding_to_integer)





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705489459



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       The way I use these `OptionsWrapper` classes:
   * `Init` performs options validation, because `Init` can return a `Status::Invalid`
   * the constructor initializes non-options state, which can then be accessed in the kernels' `Call` via `ctx->state()`
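
The split described above can be sketched in isolation as follows. This is a simplified illustration with hypothetical types (`OptionsSketch`, `StateSketch`); it uses a null return plus an error string in place of Arrow's `Result`/`Status`, and is not the PR's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical options type for illustration.
struct OptionsSketch {
  int64_t ndigits = 0;
};

// Hypothetical state wrapper: Init is the only place where creation can
// fail, and the constructor derives extra (non-options) state.
struct StateSketch {
  OptionsSketch options;
  double pow10;  // derived state: 10^|ndigits|

  explicit StateSketch(const OptionsSketch& opts) : options(opts), pow10(1.0) {
    const int64_t n = opts.ndigits < 0 ? -opts.ndigits : opts.ndigits;
    for (int64_t i = 0; i < n; ++i) {
      pow10 *= 10.0;
    }
  }

  // Validation step: returns nullptr and fills *error on invalid input.
  static std::unique_ptr<StateSketch> Init(const OptionsSketch* opts,
                                           std::string* error) {
    assert(error != nullptr);
    if (opts == nullptr) {
      *error = "Attempted to initialize state from null options";
      return nullptr;
    }
    return std::unique_ptr<StateSketch>(new StateSketch(*opts));
  }
};
```

With this shape, a kernel body only reads the already-validated state and never needs to report an options error itself.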





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703619282



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,235 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }

Review comment:
       Since these methods are not exposed, I will keep the ones in `RoundUtil`.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705526272



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1276,6 +1496,65 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Generate a kernel given an arithmetic rounding functor
+template <template <RoundMode> class Op>
+ArrayKernelExec GenerateArithmeticRound(RoundMode rmode, detail::GetTypeId ty) {
+  switch (rmode) {
+    case RoundMode::DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::DOWN>>(ty);
+    case RoundMode::UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::UP>>(ty);
+    case RoundMode::TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_ZERO>>(ty);
+    case RoundMode::TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_DOWN>>(ty);
+    case RoundMode::HALF_UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::HALF_UP>>(
+          ty);
+    case RoundMode::HALF_TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_ZERO>>(ty);
+    case RoundMode::HALF_TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_TO_EVEN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_EVEN>>(ty);
+    case RoundMode::HALF_TO_ODD:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_ODD>>(ty);
+    default:
+      DCHECK(false);
+      return ExecFail;
+  }
+}
+
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode> class Op, typename OptionsType>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
+                                                       const FunctionDoc* doc) {
+  using State = RoundOptionsWrapper<OptionsType>;
+
+  static const OptionsType kDefaultOptions = OptionsType::Defaults();
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                &kDefaultOptions);
+  for (const auto& ty : FloatingPointTypes()) {
+    auto exec = [&](KernelContext* ctx, const ExecBatch& batch, Datum* out) {
+      auto options = State::Get(ctx);
+      auto exec_ = GenerateArithmeticRound<Op>(options.round_mode, ty);

Review comment:
       Actually, it is doing the same thing as the other kernels. Other kernels pass `exec` (an `ArrayKernelExec`) to the `AddKernel` method without invoking it. `exec` is invoked later, during kernel dispatching, because it requires `KernelContext`, `ExecBatch`, and `Datum` parameters.
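
The distinction between registering a callable and invoking it can be sketched with plain `std::function`. The names `ExecSketch` and `FunctionSketch` are hypothetical and the signature is deliberately simplified to `int(int)`; this is not Arrow's kernel API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for an exec callable.
using ExecSketch = std::function<int(int)>;

struct FunctionSketch {
  std::vector<ExecSketch> kernels;

  // Registration only stores the callable; nothing is executed here.
  void AddKernel(ExecSketch exec) { kernels.push_back(std::move(exec)); }

  // Dispatch is where the stored callable finally receives its arguments.
  int Dispatch(std::size_t index, int arg) const {
    assert(index < kernels.size());
    return kernels[index](arg);
  }
};
```

A lambda passed to `AddKernel` in this sketch runs only when `Dispatch` is called, so any work placed inside the lambda body is deferred until then.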





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704874792



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;

Review comment:
       I was able to solve this issue by creating and specializing `RoundOptionsWrapper` (which is a `KernelState`) to:
   * validate the options for `round_to_multiple` in `Init()`
   * extend the state with a `pow10` data member for `round`
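
A rough sketch of how a precomputed `pow10` can then be used per value, assuming the scale-round-unscale approach and `std::round` (ties away from zero) as the rounding rule. The function names `Pow10Abs` and `RoundToDigits` are hypothetical and this is not the PR's kernel code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: computes 10^|ndigits| once, e.g. at state creation.
inline double Pow10Abs(int64_t ndigits) {
  const int64_t n = ndigits < 0 ? -ndigits : ndigits;
  assert(n <= 15);  // keep the sketch within a small, safe exponent range
  double result = 1.0;
  for (int64_t i = 0; i < n; ++i) {
    result *= 10.0;
  }
  return result;
}

// Hypothetical per-value step. pow10 is always a positive power of ten:
// multiply then divide for ndigits >= 0, divide then multiply otherwise.
inline double RoundToDigits(double val, int64_t ndigits, double pow10) {
  return ndigits >= 0 ? std::round(val * pow10) / pow10
                      : std::round(val / pow10) * pow10;
}
```

For example, with `ndigits = 2` the value 1234.5678 is scaled to about 123456.78, rounded to 123457, and scaled back to about 1234.57; with `ndigits = -2` it becomes 1200.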





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674371608



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,52 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding modes for Round and MRound functions. General modes are prefixed
+/// with TOWARDS and tie-breaker modes are prefixed with HALF. Common aliases
+/// are available for several modes.

Review comment:
       Related to aliases, there are [rounding modes that are equivalent to other compute functions](https://github.com/apache/arrow/pull/10349/files#diff-3eafd7246f6a8c699f10d46e3276852fe44b6853b5517ef10396e561730c09f4R874-R887):
   * `TOWARDS_NEG_INFINITY` -> `Floor`
   * `TOWARDS_POS_INFINITY` -> `Ceil`
   * `TOWARDS_ZERO` -> `Trunc`
   
   Do you recommend that `Round/MRound` invoke `Floor/Ceil/Trunc`, or that they invoke `std::floor/ceil/trunc` directly? Currently, there is a bit of code duplication in the function bodies because both use the `std` variants, but this is done to bypass unneeded overhead.
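
The equivalences listed above can be sketched as a direct mapping onto the `<cmath>` functions. `RoundModeSketch` and `RoundSketch` are hypothetical names covering only these three modes; this is not the PR's dispatch code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical subset of rounding modes, for illustration only.
enum class RoundModeSketch {
  TOWARDS_NEG_INFINITY,
  TOWARDS_POS_INFINITY,
  TOWARDS_ZERO
};

// Each mode maps onto one standard library function.
inline double RoundSketch(double val, RoundModeSketch mode) {
  switch (mode) {
    case RoundModeSketch::TOWARDS_NEG_INFINITY:
      return std::floor(val);
    case RoundModeSketch::TOWARDS_POS_INFINITY:
      return std::ceil(val);
    case RoundModeSketch::TOWARDS_ZERO:
      return std::trunc(val);
  }
  assert(false && "unhandled rounding mode");
  return val;
}
```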





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703661023



##########
File path: cpp/src/arrow/python/python_test.cc
##########
@@ -194,7 +194,7 @@ TEST(PyBuffer, InvalidInputObject) {
 // ("unresolved external symbol arrow_ARRAY_API referenced").
 #ifndef _WIN32
 TEST(PyBuffer, NumpyArray) {
-  const npy_intp dims[1] = {10};
+  npy_intp dims[1] = {10};

Review comment:
       What are these changes for?

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       If NumPy doesn't support it, is it super useful to implement at all?





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703934380



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -583,10 +639,32 @@ Result<Datum> MinElementWise(
 ///
 /// \param[in] arg the value to extract sign from
 /// \param[in] ctx the function execution context, optional
-/// \return the elementwise sign function
+/// \return the element-wise sign function
 ARROW_EXPORT
 Result<Datum> Sign(const Datum& arg, ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value to the given number of digits. Array values can be of
+/// arbitrary length. If argument is null the result will be null.
+///
+/// \param[in] arg the value rounded
+/// \param[in] options rounding options (rounding mode and number of digits), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the element-wise rounded value
+ARROW_EXPORT
+Result<Datum> Round(const Datum& arg, RoundOptions options = RoundOptions::Defaults(),
+                    ExecContext* ctx = NULLPTR);
+
+/// \brief Round a value to the given multiple. Array values can be of arbitrary
+/// length. If argument is null the result will be null.
+///
+/// \param[in] arg the value to round
+/// \param[in] options rounding options (rounding mode and multiple), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the element-wise rounded value
+ARROW_EXPORT
+Result<Datum> MRound(const Datum& arg, MRoundOptions options = MRoundOptions::Defaults(),

Review comment:
       The [`mround`](https://support.microsoft.com/en-us/office/mround-function-c299c3b0-15a5-426d-aa4b-d2d5b3baf427) function comes from Microsoft Excel. Do you mean changing the name only in this C++ public API, or everywhere? I would prefer to leave the options as `MRoundOptions`, or should I rename `MRoundOptions` to `RoundToMultipleOptions`?
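
For readers unfamiliar with the operation, rounding to a multiple can be sketched as divide, round, multiply. This is a minimal illustration that assumes `std::round` (ties away from zero) as the rounding rule; the name `RoundToMultipleSketch` is hypothetical and the sketch ignores rounding modes and Excel's sign rules.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: rounds val to the nearest multiple of `multiple`.
inline double RoundToMultipleSketch(double val, double multiple) {
  assert(multiple != 0.0);  // a zero multiple would divide by zero
  return std::round(val / multiple) * multiple;
}
```

For instance, rounding 10 to a multiple of 3 gives 9, and rounding 7.3 to a multiple of 0.5 gives 7.5.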





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703611914



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+

Review comment:
       I am including the `compare.h` header file to access `kDefaultAbsoluteTolerance` from there. This way, if `compare.h` is included at some later point, there is no name clash.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659082281



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       I tried placing `kRoundOptions` outside of the function, and inside it with `static`, and I still get a segfault. I was missing `ARROW_EXPORT` on `RoundOptions`, but adding it did not fix the issue. Still looking...





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r658690788



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       @bkietz @lidavidm How do you pass FunctionOptions to scalar arithmetic kernels? This currently segfaults when the tests are run. I am passing [`RoundOptions::Defaults()`](https://github.com/edponce/arrow/blob/ARROW-12744-Add-rounding-kernel/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L1117-L1119) to the [`ArithmeticFunction` constructor](https://github.com/edponce/arrow/blob/ARROW-12744-Add-rounding-kernel/cpp/src/arrow/compute/kernels/scalar_arithmetic.cc#L911-L915), but it seems the options are not set properly in `KernelContext` and `RoundState::Get(ctx)` does not work properly.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692358623



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       I added comments to the individual enum values and a reference to `compute.rst`.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703275754



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -856,10 +860,23 @@ struct LogbChecked {
 
 struct RoundUtil {
   template <typename T>
-  static enable_if_t<std::is_floating_point<T>::value, bool> ApproxHalfInt(
-      const T val, const T atol = 1e-5) {
-    // |frac| ~ 0.5?
-    return std::fabs(std::fmod(std::fabs(val), T(1)) - T(0.5)) <= std::fabs(atol);
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T atol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= std::fabs(atol);

Review comment:
       Just playing it safe. Most implementations do not use an absolute tolerance and simply compare directly against 0.5, but I found special cases where that fails. This parameter can be controlled via Options, but a suitable value depends on the magnitude of the values being rounded.
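
       A minimal sketch of the kind of special case described above, assuming hypothetical helpers `IsHalfExact` and `IsHalfApprox` (these names are not from the PR) and the 1e-5 tolerance that appears in the quoted diff. Scaling 1.005 by 100, as rounding to two digits would, yields a double slightly below 100.5, so the exact comparison against 0.5 misses the tie while the tolerant one catches it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only; not the PR's actual code.
constexpr double kAbsTol = 1e-5;  // assumed value, mirrors the 1e-5 in the diff

// Exact test: the fractional part is exactly 0.5.
inline bool IsHalfExact(double val) { return (val - std::floor(val)) == 0.5; }

// Tolerant test: the fractional part is within `atol` of 0.5.
inline bool IsHalfApprox(double val, double atol = kAbsTol) {
  return std::fabs((val - std::floor(val)) - 0.5) <= std::fabs(atol);
}

inline void HalfDetectionExamples() {
  // 2.5 is exactly representable, so both tests agree.
  assert(IsHalfExact(2.5) && IsHalfApprox(2.5));

  // 1.005 is stored slightly below 1.005, and the product with 100 rounds to
  // the double just below 100.5, so only the tolerant test sees a tie.
  const double scaled = 1.005 * 100;
  assert(!IsHalfExact(scaled));
  assert(IsHalfApprox(scaled));

  // A value far from a tie is rejected by both tests.
  assert(!IsHalfExact(2.4) && !IsHalfApprox(2.4));
}
```

       The trade-off is that any value whose fractional part falls within the tolerance of 0.5 is treated as a tie, which is why the right tolerance depends on the magnitude of the data.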





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r695049732



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -52,6 +53,13 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
+/// Functor to calculate hash of an enum.

Review comment:
       Will place in `arrow/util/map.h`





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705527509



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -43,8 +42,20 @@
 namespace arrow {
 namespace compute {
 
-template <typename T>
-class TestUnaryArithmetic : public TestBase {
+// InputType - OutputType pairs

Review comment:
       WTF was I thinking with this comment! Ughh.





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-914370213


   @pitrou If you get a chance, please help review this PR.



[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-917048763


   Are there any additional comments/reviews? cc @pitrou @bkietz @jorisvandenbossche 



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705528007



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -172,6 +190,64 @@ class TestUnaryArithmeticUnsigned : public TestUnaryArithmeticIntegral<T> {};
 template <typename T>
 class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T> {};
 
+TYPED_TEST_SUITE(TestUnaryArithmeticIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticFloating, FloatingTypes);
+
+template <typename T>
+class TestUnaryRound : public TestBaseUnaryArithmetic<T, RoundOptions> {
+ protected:
+  using Base = TestBaseUnaryArithmetic<T, RoundOptions>;
+  using Base::options_;
+  void SetRoundMode(RoundMode value) { options_.round_mode = value; }
+  void SetRoundNdigits(int64_t value) { options_.ndigits = value; }
+};
+
+template <typename T>
+class TestUnaryRoundIntegral : public TestUnaryRound<T> {};
+
+template <typename T>
+class TestUnaryRoundSigned : public TestUnaryRoundIntegral<T> {};
+
+template <typename T>
+class TestUnaryRoundUnsigned : public TestUnaryRoundIntegral<T> {};
+
+template <typename T>
+class TestUnaryRoundFloating : public TestUnaryRound<T> {};
+
+TYPED_TEST_SUITE(TestUnaryRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundFloating, FloatingTypes);

Review comment:
       Oops, I misunderstood and moved it to the wrong place.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704000408



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -583,10 +639,32 @@ Result<Datum> MinElementWise(
 ///
 /// \param[in] arg the value to extract sign from
 /// \param[in] ctx the function execution context, optional
-/// \return the elementwise sign function
+/// \return the element-wise sign function
 ARROW_EXPORT
 Result<Datum> Sign(const Datum& arg, ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value to the given number of digits. Array values can be of
+/// arbitrary length. If argument is null the result will be null.

Review comment:
       This phrase is used several times in the function descriptions; I simply followed the others as a "template". It refers to the input array, in that it is not required to have a specific length.





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674356914



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1241,6 +1373,45 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode RndMode> class Op, typename Opts>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(
+    std::string name, const FunctionDoc* doc,
+    const FunctionOptions* default_options = NULLPTR, KernelInit init = NULLPTR) {
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                default_options);
+  for (const auto& ty : FloatingPointTypes()) {
+    // Order of ArrayKernelExec w.r.t. RoundMode needs to follow the order of
+    // values in RoundMode definition (see api_scalar.h) because they are used as
+    // indexing values.
+    std::vector<ArrayKernelExec> execs = {
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_ZERO>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TOWARDS_ZERO>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary,
+                                        Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_EVEN>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_ODD>>(ty),
+    };

Review comment:
       Ah whoops, thanks for the context.
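
       For readers following the quoted hunk, the ordering constraint it mentions (kernels stored in the same order as the `RoundMode` enumerators, which then serve as indices) can be sketched in isolation. This is only an illustration under stated assumptions: the enum is reduced to four values, and `RoundFn`, `kRoundFns`, `RoundWith`, and the four `Round...` functions are hypothetical names, not Arrow APIs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Reduced, hypothetical enum; the declaration order defines the table index.
enum class RoundMode : int8_t {
  TOWARDS_NEG_INFINITY,
  TOWARDS_POS_INFINITY,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
};

using RoundFn = double (*)(double);

inline double RoundNegInf(double v) { return std::floor(v); }
inline double RoundPosInf(double v) { return std::ceil(v); }
inline double RoundZero(double v) { return std::trunc(v); }
inline double RoundInf(double v) {
  // Away from zero: floor for negative values, ceil otherwise.
  return std::signbit(v) ? std::floor(v) : std::ceil(v);
}

// Entries must appear in the same order as the enumerators above,
// because the enum value is used directly as the index.
constexpr RoundFn kRoundFns[] = {RoundNegInf, RoundPosInf, RoundZero, RoundInf};
static_assert(sizeof(kRoundFns) / sizeof(kRoundFns[0]) == 4,
              "one table entry per RoundMode value");

inline double RoundWith(RoundMode mode, double v) {
  return kRoundFns[static_cast<std::size_t>(mode)](v);
}
```

       Reordering the enumerators without reordering the table would silently select the wrong function, which is why the hunk carries a comment about it.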





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703643077



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       With the current tolerance, 1.500000000001 will be equal to 1.5. *The tolerance makes it possible to control equality approximations, since floating-point numbers are not equally spaced.* Minor floating-point errors are introduced when the value is scaled during the rounding process. So if you have very small numbers (e.g., 0.00000034, 0.0000003401) you can set the tolerance accordingly so that they are not equal, but if you have very large numbers (e.g., 10000.000001, 10000.000000) you may want to consider them equal. In reality, this parameter should be used in the general case, but I did find that NumPy's rounding also has its corner cases, so this is a best-effort approach. Setting it to 0 will allow some of those corner cases, so what about a smaller tolerance, 1e-8? If not, that is fine; 0 it is, which also mimics the behavior of other libraries.
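
       The cases in this comment can be checked with a small sketch. `IsApproxEqual` here is a standalone stand-in with an explicit tolerance argument, and `SpacingAbove` is a hypothetical helper added only to show that the gap between adjacent doubles grows with magnitude; neither is claimed to match the PR's code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-in for an absolute-tolerance comparison (hypothetical signature).
inline bool IsApproxEqual(double a, double b, double atol) {
  return std::fabs(a - b) <= std::fabs(atol);
}

// Gap between `x` and the next representable double above it.
inline double SpacingAbove(double x) {
  return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

inline void ToleranceExamples() {
  // With atol = 1e-5, 1.500000000001 is treated as the tie value 1.5.
  assert(IsApproxEqual(1.500000000001, 1.5, 1e-5));
  // With atol = 0, the comparison is exact and the two values differ.
  assert(!IsApproxEqual(1.500000000001, 1.5, 0.0));

  // Small magnitudes: 1e-5 merges these values, a tighter tolerance does not.
  assert(IsApproxEqual(0.00000034, 0.0000003401, 1e-5));
  assert(!IsApproxEqual(0.00000034, 0.0000003401, 1e-12));

  // Larger magnitudes: the two values are considered equal under 1e-5.
  assert(IsApproxEqual(10000.000001, 10000.000000, 1e-5));

  // Doubles are not equally spaced: the gap grows with magnitude.
  assert(SpacingAbove(1e15) > SpacingAbove(1.0));
}
```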





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705850030



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       Ok, I tried the template variant of `OptionsWrapper` and, although it made the code cleaner, it failed to compile on some systems, so I reverted to duplicating the `OptionsWrapper::Init` definition. I think KernelState and related parts need refactoring to support validating kernel options and extending kernel state in a simpler manner. Different patterns are currently used in the code to achieve these. But this is a separate issue from this PR.
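
       The pattern under discussion (a kernel state that wraps the options and adds a value precomputed from them) might be sketched as follows. Every type here is a simplified stand-in written for this illustration, not Arrow's real `KernelState`, `OptionsWrapper`, or `RoundOptions`, and `Init` returns `nullptr` where the quoted diff returns a `Status`. The `Round` member uses `std::nearbyint` as an assumed rounding step purely to show how the stored positive power of ten is applied.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <utility>

// Simplified stand-ins; not Arrow's real classes.
struct KernelState {
  virtual ~KernelState() = default;
};

struct RoundOptions {
  int64_t ndigits = 0;
};

template <typename Options>
struct OptionsWrapper : KernelState {
  explicit OptionsWrapper(Options opts) : options(std::move(opts)) {}
  Options options;
};

// Extends the generic wrapper with a value derived once from the options.
struct RoundState : OptionsWrapper<RoundOptions> {
  explicit RoundState(RoundOptions opts)
      : OptionsWrapper<RoundOptions>(std::move(opts)), pow10(1.0) {
    // Only the positive power of ten is stored; the sign of ndigits decides
    // between multiply-then-divide and divide-then-multiply when rounding.
    for (int64_t i = 0; i < std::abs(options.ndigits); ++i) pow10 *= 10.0;
  }

  // Returns nullptr when no options are supplied (assumption of this sketch).
  static std::unique_ptr<KernelState> Init(const RoundOptions* opts) {
    if (opts == nullptr) return nullptr;
    return std::unique_ptr<KernelState>(new RoundState(*opts));
  }

  // Rounds to the nearest value under the current floating-point rounding
  // mode (ties to even by default); assumed here for illustration.
  double Round(double val) const {
    return options.ndigits >= 0 ? std::nearbyint(val * pow10) / pow10
                                : std::nearbyint(val / pow10) * pow10;
  }

  double pow10;
};
```

       The duplication mentioned in the comment comes from `Init`: each derived state needs its own factory so that the derived type, not the generic wrapper, is constructed.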





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r696160814



##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1299,6 +1301,98 @@ def test_arithmetic_multiply():
     assert result.equals(expected)
 
 
+@pytest.mark.parametrize("ty", ["round", "mround"])
+def test_round_to_integer(ty):
+    if ty == "round":
+        func = pc.round
+        RoundOptions = pc.RoundOptions
+    elif ty == "mround":
+        func = pc.mround
+        RoundOptions = pc.MRoundOptions
+
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_modes = [
+        "down", "up", "towards_zero", "towards_infinity", "half_down",
+        "half_up", "half_towards_zero", "half_towards_infinity",
+        "half_to_even", "half_to_odd",
+    ]
+    arr = pa.array([3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None])
+    for round_mode in round_modes:
+        options = RoundOptions(round_mode=round_mode)
+        result = func(arr, options=options).to_numpy(zero_copy_only=False)
+        if round_mode == "down":
+            expected_ = [3, 3, 3, 4, -4, -4, -4, None]
+        elif round_mode == "up":
+            expected_ = [4, 4, 4, 5, -3, -3, -3, None]
+        elif round_mode == "towards_zero":
+            expected_ = [3, 3, 3, 4, -3, -3, -3, None]
+        elif round_mode == "towards_infinity":
+            expected_ = [4, 4, 4, 5, -4, -4, -4, None]
+        elif round_mode == "half_down":
+            expected_ = [3, 3, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_up":
+            expected_ = [3, 4, 4, 5, -3, -3, -4, None]
+        elif round_mode == "half_towards_zero":
+            expected_ = [3, 3, 4, 4, -3, -3, -4, None]
+        elif round_mode == "half_towards_infinity":
+            expected_ = [3, 4, 4, 5, -3, -4, -4, None]
+        elif round_mode == "half_to_even":
+            expected_ = [3, 4, 4, 4, -3, -4, -4, None]
+        elif round_mode == "half_to_odd":
+            expected_ = [3, 3, 4, 5, -3, -3, -4, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_round():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    round_ndigits = [-2, -1, 0, 1, 2]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for ndigits in round_ndigits:
+        options = pc.RoundOptions(
+            ndigits=ndigits, round_mode="half_towards_infinity")
+        result = pc.round(arr, options=options).to_numpy(zero_copy_only=False)
+        if ndigits == -2:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+        elif ndigits == -1:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif ndigits == 0:
+            expected_ = [320, 4, 3, 5, -3, -35, -3, None]
+        elif ndigits == 1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif ndigits == 2:
+            expected_ = [320, 3.5, 3.08, 4.5, -3.21, -35.12, -3.05, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)
+        assert all(map(approx_equals, result, expected))
+
+
+def test_mround():
+    approx_equals = partial(np.isclose, equal_nan=True)
+    multiples = [-2, -0.05, 0.1, 0, 10, 100]
+    arr = pa.array([320, 3.5, 3.075, 4.5, -3.212, -35.1234, -3.045, None])
+    for multiple in multiples:
+        options = pc.MRoundOptions(
+            multiple=multiple, round_mode="half_towards_infinity")
+        result = pc.mround(arr, options=options).to_numpy(zero_copy_only=False)
+        if multiple == -2:
+            expected_ = [320, 4, 4, 4, -4, -36, -4, None]
+        elif multiple == -0.05:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3.05, None]
+        elif multiple == 0.1:
+            expected_ = [320, 3.5, 3.1, 4.5, -3.2, -35.1, -3, None]
+        elif multiple == 0:
+            expected_ = [0, 0, 0, 0, 0, 0, 0, None]
+        elif multiple == 10:
+            expected_ = [320, 0, 0, 0, 0, -40, 0, None]
+        elif multiple == 100:
+            expected_ = [300, 0, 0, 0, 0, 0, 0, None]
+
+        expected = pa.array(expected_).to_numpy(zero_copy_only=False)

Review comment:
       `to_numpy` defaults to `zero_copy_only=True`, but the implicit array conversion via `__array__` uses `zero_copy_only=False`. I followed your suggestion and used implicit casts. Thanks for the suggestion!
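
The tie-breaking expectations in the quoted test can be reproduced with a small standard-library sketch. The input values below are an assumption inferred from the expected outputs (the test's input array is not shown in this excerpt), and the helper names are hypothetical, not part of pyarrow.

```python
import math


def half_down(x):
    # Ties go towards negative infinity.
    return float(math.ceil(x - 0.5))


def half_up(x):
    # Ties go towards positive infinity.
    return float(math.floor(x + 0.5))


def half_towards_zero(x):
    # Ties go towards zero; the sign is restored afterwards.
    return math.copysign(math.ceil(abs(x) - 0.5), x)


def half_towards_infinity(x):
    # Ties go away from zero; the sign is restored afterwards.
    return math.copysign(math.floor(abs(x) + 0.5), x)


def half_to_even(x):
    # Python's built-in round() breaks ties towards the even integer.
    return float(round(x))


def half_to_odd(x):
    lower = math.floor(x)
    if x - lower == 0.5:  # exact tie: keep an odd floor, otherwise step up
        return float(lower if lower % 2 else lower + 1)
    return float(round(x))


ROUND_MODES = {
    "half_down": half_down,
    "half_up": half_up,
    "half_towards_zero": half_towards_zero,
    "half_towards_infinity": half_towards_infinity,
    "half_to_even": half_to_even,
    "half_to_odd": half_to_odd,
}


def round_all(values, round_mode):
    # Nulls (None) propagate unchanged, as in the compute function.
    fn = ROUND_MODES[round_mode]
    return [None if v is None else fn(v) for v in values]


# Assumed input, inferred from the expected outputs in the test.
values = [3.2, 3.5, 3.7, 4.5, -3.2, -3.5, -3.7, None]
```

Only the tie values (3.5, 4.5, -3.5) differ between modes; the other inputs round to the nearest integer in every half-mode.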





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703584951



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -896,7 +915,8 @@ struct RoundImpl<T, RoundMode::UP> {
 template <typename T>
 struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
   static constexpr enable_if_floating_point<T> Round(const T val) {
-    return std::trunc(val);
+    // return std::trunc(val);

Review comment:
       This was a placeholder while I thought more about how to solve the problem.





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703656776



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       I think 0 as a default value is best (especially as `ndigits` is 0 by default as well); it will match the common expectation.
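
To illustrate why a zero default matches the common expectation, here is a minimal sketch of tolerance-based tie detection. `is_half` and `round_half_to_even` are hypothetical helpers written for this note; they mirror the idea behind `abs_tol` but are not the kernel's implementation.

```python
import math


def is_half(val, abs_tol=0.0):
    # True when the fractional part is within abs_tol of 0.5.
    frac = abs(math.fmod(val, 1.0))
    return abs(frac - 0.5) <= abs_tol


def round_half_to_even(val, abs_tol=0.0):
    if is_half(val, abs_tol):
        lower = math.floor(val)
        # Treated as a tie: pick whichever neighbouring integer is even.
        return float(lower if lower % 2 == 0 else lower + 1)
    return float(round(val))
```

With `abs_tol=0.0`, `2.500001` rounds to `3.0`, as most users would expect. With `abs_tol=1e-5`, the same value is treated as a tie and rounds to `2.0`.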





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r658733636



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,159 @@ struct PowerChecked {
   }
 };
 
+using RoundState = internal::OptionsWrapper<RoundOptions>;
+
+struct Round {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(T x, T y, int ulp = 8) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    return (std::fabs(x - y) <=
+            (std::numeric_limits<T>::epsilon() * std::fabs(x + y) * ulp))
+           // unless the result is subnormal
+           || (std::fabs(x - y) < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> RoundWithMultiple(T val, T mult) {
+    return (val / mult) * mult;
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_integral<Arg>::value, bool> = true>
+  static constexpr enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg,
+                                                    Status* st) {
+    return Call<T, T>(ctx, T(arg), st);
+  }
+
+  template <typename T, typename Arg,
+            enable_if_t<std::is_floating_point<Arg>::value, bool> = true>
+  static enable_if_floating_point<T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    const RoundOptions& options = RoundState::Get(ctx);

Review comment:
       You could even do `static auto kRoundOptions = RoundOptions::Defaults()` inside the function.





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704417208



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -583,10 +639,32 @@ Result<Datum> MinElementWise(
 ///
 /// \param[in] arg the value to extract sign from
 /// \param[in] ctx the function execution context, optional
-/// \return the elementwise sign function
+/// \return the element-wise sign function
 ARROW_EXPORT
 Result<Datum> Sign(const Datum& arg, ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value to the given number of digits. Array values can be of
+/// arbitrary length. If argument is null the result will be null.
+///
+/// \param[in] arg the value rounded
+/// \param[in] options rounding options (rounding mode and number of digits), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the element-wise rounded value
+ARROW_EXPORT
+Result<Datum> Round(const Datum& arg, RoundOptions options = RoundOptions::Defaults(),
+                    ExecContext* ctx = NULLPTR);
+
+/// \brief Round a value to the given multiple. Array values can be of arbitrary
+/// length. If argument is null the result will be null.
+///
+/// \param[in] arg the value to round
+/// \param[in] options rounding options (rounding mode and multiple), optional
+/// \param[in] ctx the function execution context, optional
+/// \return the element-wise rounded value
+ARROW_EXPORT
+Result<Datum> MRound(const Datum& arg, MRoundOptions options = MRoundOptions::Defaults(),

Review comment:
       The names should be consistent as far as possible, so the function should be registered as "round_to_multiple" and the options class named `RoundToMultipleOptions` ;-)





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659960836



##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,7 +287,7 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric (1)         |
+| subtract                 | Binary     | Numeric            | Numeric             |

Review comment:
       Not intentional at all. This was sloppy on my part.





[GitHub] [arrow] bkietz commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r633838367



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -933,22 +933,23 @@ TEST(TestBinaryArithmetic, AddWithImplicitCastsUint64EdgeCase) {
 }
 
 TEST(TestUnaryArithmetic, DispatchBest) {
-  for (std::string name : {"negate"}) {
+  for (std::string name : {"negate", "round", "round_checked"}) {
     for (const auto& ty : {int8(), int16(), int32(), int64(), uint8(), uint16(), uint32(),

Review comment:
       It seems more than a little odd to support rounding for integer types. I suppose the no-op kernels aren't taking up much space, but I'd really expect that rounding would only support floating-point or decimal inputs (and would either raise an error or require an implicit cast for integral inputs).

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -238,6 +238,17 @@ Result<Datum> Power(const Datum& left, const Datum& right,
                     ArithmeticOptions options = ArithmeticOptions(),
                     ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value. Array values can be of arbitrary length. If argument
+/// is null the result will be null.
+///
+/// \param[in] arg the value rounded
+/// \param[in] options arithmetic options (rounding precision, round-to-nearest method,
+/// overflow handling), optional \param[in] ctx the function execution context, optional
+/// \return the elementwise rounded value
+ARROW_EXPORT
+Result<Datum> Round(const Datum& arg, ArithmeticOptions options = ArithmeticOptions(),

Review comment:
       Round will require separate options, including the number of digits that should remain in the rounded output.
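
As a rough illustration of what a digits option would control, the sketch below rounds to a given number of decimal digits by scaling with a power of ten. The function name `round_to_digits` is hypothetical, and Python's built-in `round` (ties to even) stands in for whichever rounding mode the kernel would use.

```python
def round_to_digits(val, ndigits=0):
    pow10 = 10.0 ** abs(ndigits)
    if ndigits >= 0:
        # Shift the digits to keep in front of the decimal point,
        # round to an integer, then shift back.
        return round(val * pow10) / pow10
    # A negative ndigits rounds digits to the left of the decimal point.
    return round(val / pow10) * pow10
```

Here `round_to_digits(3.14159, 2)` gives `3.14` and `round_to_digits(320.0, -2)` gives `300.0`.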

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -238,6 +238,17 @@ Result<Datum> Power(const Datum& left, const Datum& right,
                     ArithmeticOptions options = ArithmeticOptions(),
                     ExecContext* ctx = NULLPTR);
 
+/// \brief Round a value. Array values can be of arbitrary length. If argument
+/// is null the result will be null.
+///
+/// \param[in] arg the value rounded
+/// \param[in] options arithmetic options (rounding precision, round-to-nearest method,
+/// overflow handling), optional \param[in] ctx the function execution context, optional
+/// \return the elementwise rounded value
+ARROW_EXPORT
+Result<Datum> Round(const Datum& arg, ArithmeticOptions options = ArithmeticOptions(),

Review comment:
       Since overflow doesn't pertain to rounding, I think it doesn't make sense to provide ArithmeticOptions (which only contains `bool check_overflow` here).

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       It's not worthwhile to have a _checked version of this function if there is no difference in runtime behavior between the two functions.





[GitHub] [arrow] bkietz commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

bkietz commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r659917735



##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |

Review comment:
       Nit: using `=` like this is confusing
   ```suggestion
   | TOWARDS_NEG_INFINITY    | 3.7 -> 3, -3.2 -> -4            |
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -454,6 +456,166 @@ struct PowerChecked {
   }
 };
 
+struct RoundUtils {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool ApproxEqual(const T x, const T y, const int ulp = 7) {
+    // https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
+    // The machine epsilon has to be scaled to the magnitude of the values used
+    // and multiplied by the desired precision in ULPs (units in the last place)
+    const auto eps_ulp = std::numeric_limits<T>::epsilon() * ulp;
+    const auto xy_diff = std::fabs(x - y);
+    const auto xy_sum = std::fabs(x + y);
+    return (xy_diff <= (xy_sum * eps_ulp))
+           // unless the result is subnormal
+           || (xy_diff < std::numeric_limits<T>::min());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fabs(std::fmod(val, T(1))), T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Floor(T val) {
+    return std::floor(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Ceiling(T val) {
+    return std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Truncate(T val) {
+    return std::trunc(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> TowardsInfinity(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfDown(T val) {
+    return std::ceil(val - T(0.5));
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfUp(T val) {
+    return std::floor(val + T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToEven(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 1, Even + 0
+      return floor + (std::fmod(std::fabs(floor), T(2)) >= T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> HalfToOdd(T val) {
+    if (IsHalf(val)) {
+      auto floor = std::floor(val);
+      // Odd + 0, Even + 1
+      return floor + (std::fmod(std::fabs(floor), T(2)) < T(1));
+    }
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> Nearest(T val) {
+    return std::round(val);
+  }
+
+  template <typename T>
+  static constexpr enable_if_floating_point<T> HalfTowardsZero(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Round(T val, T mult, RoundMode round_mode,
+                                           Status* st) {
+    val /= mult;
+
+    T result;
+    switch (round_mode) {

Review comment:
       Let's ensure this switch is outside the hot loop. In this case, that would probably entail: make `round_mode` a template parameter of `RoundUtils::Round`, `struct Round`, and `struct MRound`. Then you can produce a vector of `ArrayKernelExec` which can be selected from outside the loop, something like:
   
   ```c++
     auto func =
         std::make_shared<ArithmeticUnaryFloatOnlyFunction>(name, Arity::Unary(), doc, default_options);
     for (const auto& ty : {float32(), float64()}) {
       std::vector<ArrayKernelExec> execs = {
         Round<RoundMode::DOWNWARD>,
         Round<RoundMode::UPWARD>,
         //...
       };
       auto exec = [execs](KernelContext* ctx, const ExecBatch& batch, Datum* out) {
         RoundMode round_mode = OptionsWrapper<RoundOptions>::Get(ctx).round_mode;
         // A scoped enum does not convert implicitly to an index, so cast it.
         return execs[static_cast<size_t>(round_mode)](ctx, batch, out);
       };
       DCHECK_OK(func->AddKernel({ty}, out_ty, exec, init));
     }
     return func;
   ```
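
The same idea, sketched in Python for brevity: resolve the rounding routine once per batch from a table indexed by the mode, so the per-element loop contains no branching on `round_mode`. The enum members mirror the names used in the snippet above; `ROUNDERS`, `towards_infinity`, and `round_batch` are hypothetical names, and Python does not reproduce the performance characteristics of the C++ kernels.

```python
import math
from enum import IntEnum


class RoundMode(IntEnum):
    DOWNWARD = 0
    UPWARD = 1
    TOWARDS_ZERO = 2
    TOWARDS_INFINITY = 3


def towards_infinity(val):
    # Round away from zero: floor for negatives, ceil otherwise.
    return math.floor(val) if val < 0 else math.ceil(val)


# One routine per mode, stored in enum order so the mode value is the index.
ROUNDERS = [math.floor, math.ceil, math.trunc, towards_infinity]


def round_batch(values, round_mode):
    rounder = ROUNDERS[round_mode]  # selected once, outside the loop
    return [None if v is None else float(rounder(v)) for v in values]
```

Adding a mode then means adding one enum member and one table entry, with no change to the loop itself.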

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,

Review comment:
       ```suggestion
     scale and precision.  Only the absolute value of the ``rounding multiple`` is used,
   ```
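
A minimal sketch of why the absolute value is the relevant quantity: rounding to a multiple divides by the multiple, rounds to an integer, and multiplies back. Taking the absolute value first keeps the result independent of the multiple's sign, even for direction-sensitive rounders such as floor. The name `round_to_multiple` and the `rounder` parameter are hypothetical, and Python's built-in `round` stands in for the kernel's default mode.

```python
import math


def round_to_multiple(val, multiple, rounder=round):
    # Only the absolute value of the multiple is used; its sign is ignored.
    # A zero multiple is not handled here and raises ZeroDivisionError.
    m = abs(multiple)
    return rounder(val / m) * m
```

For example, `round_to_multiple(3.5, -2, math.floor)` and `round_to_multiple(3.5, 2, math.floor)` both give `2`, whereas flooring `3.5 / -2` directly and multiplying back by `-2` would give `4`.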

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -485,6 +647,36 @@ ArrayKernelExec ArithmeticExecFromOp(detail::GetTypeId get_id) {
   }
 }
 
+template <template <typename... Args> class KernelGenerator, typename Op>
+ArrayKernelExec GenerateArithmeticWithFloatOutType(detail::GetTypeId get_id) {

Review comment:
       Instead, wouldn't it make more sense to just insert an implicit cast to floating point?

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |

Review comment:
       ```suggestion
   | Function name | Arity      | Input types | Output type | Notes  | Options class           |
   ```

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -42,6 +42,55 @@ struct ArithmeticOptions : public FunctionOptions {
   bool check_overflow;
 };
 
+enum class RoundMode {
+  // floor (towards negative infinity)
+  DOWNWARD,
+  TOWARDS_NEG_INFINITY,

Review comment:
       If these are equivalent, shouldn't they have the same value?
   ```suggestion
     DOWNWARD,
     TOWARDS_NEG_INFINITY = DOWNWARD,
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -286,7 +287,7 @@ an ``Invalid`` :class:`Status` when overflow is detected.
 +--------------------------+------------+--------------------+---------------------+
 | power_checked            | Binary     | Numeric            | Numeric             |
 +--------------------------+------------+--------------------+---------------------+
-| subtract                 | Binary     | Numeric            | Numeric (1)         |
+| subtract                 | Binary     | Numeric            | Numeric             |

Review comment:
       Was this intentional?

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |
++-------------------------+---------------------------------+
+| UPWARD                  | Equivalent to ``ceil(x)``       |
+| TOWARDS_POS_INFINITY    | 3.2 = 4, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_ZERO            | Equivalent to ``trunc(x)``      |
+| AWAY_FROM_INFINITY      | 3.7 = 3, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_INFINITY        | 3.2 = 4, -3.2 = -4              |
+| AWAY_FROM_ZERO          |                                 |
++-------------------------+---------------------------------+
+| HALF_UP                 | 3.5 = 4, 4.5 = 5, -3.5 = -3     |
+| HALF_POS_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_DOWN               | 3.5 = 3, 4.5 = 4, -3.5 = -4     |
+| HALF_NEG_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_TO_EVEN            | 3.5 = 4, 4.5 = 4, -3.5 = -4     |
++-------------------------+---------------------------------+
+| HALF_TO_ODD             | 3.5 = 3, 4.5 = 5, -3.5 = -3     |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_ZERO       | 3.5 = 3, 4.5 = 4, -3.5 = -3     |
+| HALF_AWAY_FROM_INFINITY |                                 |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_INFINITY   | Round nearest integer           |
+| HALF_AWAY_FROM_ZERO     | 3.5 = 4, 4.5 = 5, -3.5 = -4     |
+| NEAREST                 |                                 |
++-------------------------+---------------------------------+
+
++----------------+---------------+----------------------------+
+| Round multiple | Round ndigits | Description                |
++================================+============================+
+| 1.0            | 0             | Round to integer           |

Review comment:
       Could you add some examples here too? Additionally, could you break this into two tables (one for `mround` and one for `round`)? From this table, it looks like one might specify both `multiple` and `ndigits` in a single function call.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -20,6 +20,8 @@
 #include <limits>
 #include <utility>
 
+#include "arrow/compute/api_scalar.h"
+// #include "arrow/compute/kernels/codegen_internal.h"

Review comment:
       ```suggestion
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.
+
++-------------------------+---------------------------------+
+| Round mode              | Description/Examples            |
++=========================+=================================+
+| DOWNWARD                | Equivalent to ``floor(x)``      |
+| TOWARDS_NEG_INFINITY    | 3.7 = 3, -3.2 = -4              |
++-------------------------+---------------------------------+
+| UPWARD                  | Equivalent to ``ceil(x)``       |
+| TOWARDS_POS_INFINITY    | 3.2 = 4, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_ZERO            | Equivalent to ``trunc(x)``      |
+| AWAY_FROM_INFINITY      | 3.7 = 3, -3.7 = -3              |
++-------------------------+---------------------------------+
+| TOWARDS_INFINITY        | 3.2 = 4, -3.2 = -4              |
+| AWAY_FROM_ZERO          |                                 |
++-------------------------+---------------------------------+
+| HALF_UP                 | 3.5 = 4, 4.5 = 5, -3.5 = -3     |
+| HALF_POS_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_DOWN               | 3.5 = 3, 4.5 = 4, -3.5 = -4     |
+| HALF_NEG_INFINITY       |                                 |
++-------------------------+---------------------------------+
+| HALF_TO_EVEN            | 3.5 = 4, 4.5 = 4, -3.5 = -4     |
++-------------------------+---------------------------------+
+| HALF_TO_ODD             | 3.5 = 3, 4.5 = 5, -3.5 = -3     |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_ZERO       | 3.5 = 3, 4.5 = 4, -3.5 = -3     |
+| HALF_AWAY_FROM_INFINITY |                                 |
++-------------------------+---------------------------------+
+| HALF_TOWARDS_INFINITY   | Round nearest integer           |
+| HALF_AWAY_FROM_ZERO     | 3.5 = 4, 4.5 = 5, -3.5 = -4     |
+| NEAREST                 |                                 |
++-------------------------+---------------------------------+
+
++----------------+---------------+----------------------------+
+| Round multiple | Round ndigits | Description                |
++================================+============================+
+| 1.0            | 0             | Round to integer           |
++----------------+--------------------------------------------+
+| 0.001          | 3             | Round to 3 decimal places  |
++----------------+--------------------------------------------+
+| 10             | -2            | Round to multiple of 10    |

Review comment:
       ```suggestion
   | 10             | -2            | Round to multiple of 100   |
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -312,6 +313,79 @@ precision of `divide` is at least the sum of precisions of both operands with
 enough scale kept. Error is returned if the result precision is beyond the
 decimal value range.
 
+Rounding functions
+~~~~~~~~~~~~~~~~~~
+
+These functions displace numeric input(s) to approximate and shorter numeric
+representation(s).  Integral input(s) produce floating-point output(s) of same value.
+If any of the input element(s) is null, the corresponding output element is null.
+
++---------------+------------+-------------+-------------+----------------------------------+
+| Function name | Arity      | Input types | Output type | Notes | Options class            |
++===============+============+=============+=============+==================================+
+| mround        | Unary      | Numeric     | Float32/64  | (1)(2) | :struct:`MRoundOptions` |
++---------------+------------+-------------+-------------+----------------------------------+
+| round         | Unary      | Numeric     | Float32/64  | (1)(3) | :struct:`RoundOptions`  |
++---------------+------------+-------------+-------------+----------------------------------+
+
+* \(1) Output value is a 64-bit floating-point for integral inputs and the
+  retains the same type for floating-point inputs.  By default rounding functions
+  displace a value to the nearest integer with a round to even for breaking ties.
+  Options are available to control the rounding behavior.
+* \(2) The ``multiple`` option specifies the rounding
+  scale and precision.  Only the magnitude of the ``rounding multiple`` is used,
+  its sign is ignored.
+* \(3) The ``ndigits`` option specifies the rounding precision in
+  terms of number of digits.  A negative value corresponds to digits in the
+  non-decimal part.

Review comment:
       ```suggestion
     non-fractional part. For example -2 corresponds to rounding to the nearest multiple of 100
     (zeroing the ones and tens digits).
   ```
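
The relationship between `ndigits` and `multiple` described in this suggestion can be sketched as follows. This is a simplified illustration, not the PR's kernel: `RoundToDigits` and `RoundToMultiple` are hypothetical names, only round-half-to-even is shown (via `std::nearbyint` under the default rounding mode), and overflow and out-of-range `ndigits` are not handled.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: round to `ndigits` digits. A negative `ndigits`
// zeroes digits in the non-fractional part (e.g. -2 -> multiples of 100).
inline double RoundToDigits(double x, int64_t ndigits) {
  const double scale = std::pow(10.0, static_cast<double>(std::llabs(ndigits)));
  const double scaled = (ndigits >= 0) ? (x * scale) : (x / scale);
  const double rounded = std::nearbyint(scaled);  // ties to even by default
  return (ndigits >= 0) ? (rounded / scale) : (rounded * scale);
}

// Hypothetical helper: round to the nearest multiple of |multiple|.
// Only the magnitude of `multiple` is used; it is assumed to be non-zero.
inline double RoundToMultiple(double x, double multiple) {
  const double m = std::fabs(multiple);
  return std::nearbyint(x / m) * m;
}
```

Under these assumptions, `RoundToDigits(1234.0, -2)` and `RoundToMultiple(1234.0, 100.0)` both produce 1200, which is the equivalence the table row is meant to convey.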




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674356447



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1241,6 +1373,45 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode RndMode> class Op, typename Opts>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(
+    std::string name, const FunctionDoc* doc,
+    const FunctionOptions* default_options = NULLPTR, KernelInit init = NULLPTR) {
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                default_options);
+  for (const auto& ty : FloatingPointTypes()) {
+    // Order of ArrayKernelExec w.r.t. RoundMode needs to follow the order of
+    // values in RoundMode definition (see api_scalar.h) because they are used as
+    // indexing values.
+    std::vector<ArrayKernelExec> execs = {
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_ZERO>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_NEG_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_POS_INFINITY>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TOWARDS_ZERO>>(
+            ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary,
+                                        Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_EVEN>>(ty),
+        GenerateArithmeticFloatingPoint<ScalarUnary, Op<RoundMode::HALF_TO_ODD>>(ty),
+    };

Review comment:
       A switch-case was the initial approach, but after code review the conclusion was to resolve the kernel during dispatch rather than performing the check internally for each value/invocation. I am not overly concerned about the explicit rounding modes in template parameters, but the order of the `vector` entries needs to match the order of the `RoundMode` values.
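
A reduced sketch of this dispatch idea, under stated assumptions: `Mode`, `Op`, `ExecAll`, `kExecs`, and `RoundBatch` are hypothetical stand-ins, not Arrow's `RoundMode`, `ArrayKernelExec`, or `GenerateArithmeticFloatingPoint`. The rounding mode selects a function pointer once per batch by using the enum value as an index, and a `static_assert` guards the ordering requirement mentioned above.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, reduced stand-in for RoundMode.
enum class Mode : int8_t { DOWN, UP, TOWARDS_ZERO };

// One per-value operation per mode, chosen at compile time.
template <Mode M>
struct Op;
template <>
struct Op<Mode::DOWN> {
  static double Call(double x) { return std::floor(x); }
};
template <>
struct Op<Mode::UP> {
  static double Call(double x) { return std::ceil(x); }
};
template <>
struct Op<Mode::TOWARDS_ZERO> {
  static double Call(double x) { return std::trunc(x); }
};

using Exec = void (*)(const std::vector<double>&, std::vector<double>*);

// Batch loop with no per-value branching on the mode.
template <Mode M>
void ExecAll(const std::vector<double>& in, std::vector<double>* out) {
  out->resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) (*out)[i] = Op<M>::Call(in[i]);
}

// The table is indexed by the enum value, so its order must follow the enum.
static_assert(static_cast<int>(Mode::DOWN) == 0 && static_cast<int>(Mode::UP) == 1 &&
                  static_cast<int>(Mode::TOWARDS_ZERO) == 2,
              "kExecs order must match the Mode enum order");

static const std::array<Exec, 3> kExecs = {
    {&ExecAll<Mode::DOWN>, &ExecAll<Mode::UP>, &ExecAll<Mode::TOWARDS_ZERO>}};

// The mode is inspected once per batch, not once per value.
inline void RoundBatch(Mode mode, const std::vector<double>& in,
                       std::vector<double>* out) {
  kExecs[static_cast<std::size_t>(mode)](in, out);
}
```

The trade-off is the one raised in the comment: the inner loop carries no branch on the mode, but the table silently breaks if the enum is reordered, hence the compile-time check.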





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r696590014



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -53,16 +52,10 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
-/// Functor to calculate hash of an enum.
-template <typename T, typename R = typename std::underlying_type<T>::type,
-          typename = std::enable_if<std::is_enum<T>::value>>
-struct EnumHash {
-  R operator()(T val) const { return static_cast<R>(val); }
-};
-
 /// Rounding and tie-breaking modes for round compute functions.
 /// Additional details and examples are provided in compute.rst.
 enum class RoundMode : int8_t {
+  // Note: The HALF values need to be last and the first HALF entry is HALF_DOWN.

Review comment:
       Why?

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -85,26 +78,32 @@ enum class RoundMode : int8_t {
   HALF_TO_ODD,
 };
 
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
 class ARROW_EXPORT RoundOptions : public FunctionOptions {
  public:
   explicit RoundOptions(int64_t ndigits = 0,
-                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double atol = kDefaultAbsoluteTolerance);
   constexpr static char const kTypeName[] = "RoundOptions";
   static RoundOptions Defaults() { return RoundOptions(); }
   /// Rounding precision (number of digits to round to).
   int64_t ndigits;
   RoundMode round_mode;
+  double atol;

Review comment:
       Also it needs Python bindings.

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -85,26 +78,32 @@ enum class RoundMode : int8_t {
   HALF_TO_ODD,
 };
 
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
 class ARROW_EXPORT RoundOptions : public FunctionOptions {
  public:
   explicit RoundOptions(int64_t ndigits = 0,
-                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double atol = kDefaultAbsoluteTolerance);
   constexpr static char const kTypeName[] = "RoundOptions";
   static RoundOptions Defaults() { return RoundOptions(); }
   /// Rounding precision (number of digits to round to).
   int64_t ndigits;
   RoundMode round_mode;
+  double atol;

Review comment:
       It would be good to document what this field does.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -856,10 +860,23 @@ struct LogbChecked {
 
 struct RoundUtil {
   template <typename T>
-  static enable_if_t<std::is_floating_point<T>::value, bool> ApproxHalfInt(
-      const T val, const T atol = 1e-5) {
-    // |frac| ~ 0.5?
-    return std::fabs(std::fmod(std::fabs(val), T(1)) - T(0.5)) <= std::fabs(atol);
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T atol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= std::fabs(atol);

Review comment:
       I would assume atol is always nonnegative.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1444,7 +1463,7 @@ std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
   auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
                                                                 &kDefaultOptions);
   for (const auto& ty : FloatingPointTypes()) {
-    std::unordered_map<RoundMode, ArrayKernelExec, EnumHash<RoundMode>>
+    std::unordered_map<RoundMode, ArrayKernelExec, ::arrow::internal::EnumHash<RoundMode>>

Review comment:
       Is a map really necessary here vs a switch-case?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1444,7 +1463,7 @@ std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
   auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
                                                                 &kDefaultOptions);
   for (const auto& ty : FloatingPointTypes()) {
-    std::unordered_map<RoundMode, ArrayKernelExec, EnumHash<RoundMode>>
+    std::unordered_map<RoundMode, ArrayKernelExec, ::arrow::internal::EnumHash<RoundMode>>

Review comment:
       I suppose it's not a big deal since this is only run once.

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -53,16 +52,10 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
-/// Functor to calculate hash of an enum.
-template <typename T, typename R = typename std::underlying_type<T>::type,
-          typename = std::enable_if<std::is_enum<T>::value>>
-struct EnumHash {
-  R operator()(T val) const { return static_cast<R>(val); }
-};
-
 /// Rounding and tie-breaking modes for round compute functions.
 /// Additional details and examples are provided in compute.rst.
 enum class RoundMode : int8_t {
+  // Note: The HALF values need to be last and the first HALF entry is HALF_DOWN.

Review comment:
       Ok, I see why below - but it is a little confusing to put this into a public API header. Some static asserts in the cc file might serve you better?

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -85,26 +78,32 @@ enum class RoundMode : int8_t {
   HALF_TO_ODD,
 };
 
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
 class ARROW_EXPORT RoundOptions : public FunctionOptions {
  public:
   explicit RoundOptions(int64_t ndigits = 0,
-                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double atol = kDefaultAbsoluteTolerance);

Review comment:
       'atol' makes me think of std::atol even though it seems it's 'absolute tolerance' - maybe expand the name?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -896,7 +915,8 @@ struct RoundImpl<T, RoundMode::UP> {
 template <typename T>
 struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
   static constexpr enable_if_floating_point<T> Round(const T val) {
-    return std::trunc(val);
+    // return std::trunc(val);

Review comment:
       nit: commented code?





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-915894544


   Ready for review cc @pitrou 



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692358623



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,44 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+enum class RoundMode {

Review comment:
       I added comments to individual enum members and reference to `compute.rst`.





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-914374628


   It is basically complete and undrafted. I left a few minor comments regarding doubts that I have.



[GitHub] [arrow] pitrou commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-914371893


   Should you undraft it or is it still WIP?



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704536187



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -175,6 +219,42 @@ ElementWiseAggregateOptions::ElementWiseAggregateOptions(bool skip_nulls)
       skip_nulls(skip_nulls) {}
 constexpr char ElementWiseAggregateOptions::kTypeName[];
 
+RoundOptions::RoundOptions(int64_t ndigits, RoundMode round_mode)
+    : FunctionOptions(internal::kRoundOptionsType),
+      ndigits(ndigits),
+      round_mode(round_mode) {
+  static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                    RoundMode::HALF_DOWN > RoundMode::UP &&
+                    RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                    RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                "Invalid order of round modes. Modes prefixed with HALF need to be "
+                "enumerated last with HALF_DOWN being the first among them.");
+}
+constexpr char RoundOptions::kTypeName[];
+
+MRoundOptions::MRoundOptions(double multiple, RoundMode round_mode)
+    : FunctionOptions(internal::kMRoundOptionsType),
+      multiple(multiple),
+      round_mode(round_mode) {
+  static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&

Review comment:
       Not really, will leave the assert only in `Round`.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703465473



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -85,26 +78,32 @@ enum class RoundMode : int8_t {
   HALF_TO_ODD,
 };
 
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
 class ARROW_EXPORT RoundOptions : public FunctionOptions {
  public:
   explicit RoundOptions(int64_t ndigits = 0,
-                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double atol = kDefaultAbsoluteTolerance);
   constexpr static char const kTypeName[] = "RoundOptions";
   static RoundOptions Defaults() { return RoundOptions(); }
   /// Rounding precision (number of digits to round to).
   int64_t ndigits;
   RoundMode round_mode;
+  double atol;

Review comment:
       Thanks for noticing!





[GitHub] [arrow] lidavidm commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-916323381


   @edponce looks like you need to rebase again here as well.



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703994513



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       Well, the sign of `mult` is irrelevant. We can mandate that `mult` is non-zero and finite, and remove the check, but I think we should leave it as-is for now. Ideally, we would be able to have multiple CallXXX variants for a kernel, selected (at the batch level) based on options. In this case, if `mult` is zero, then it would invoke
   ```c++
   static ... CallZero(...) {
       return std::isfinite(arg) ? 0 : std::nanf("");
   }
   ```
   and all other cases would invoke the current `Call()`. This is not the only compute function with special cases; in fact, most of them have some. Having this capability would improve performance and also make the code a bit more amenable to SIMD. I am working on a refactoring to support these ideas.
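
The batch-level selection described above could look roughly like the following. This is only a sketch under stated assumptions: `RoundBatchToMultiple`, `CallGeneral`, and the use of `std::vector` are hypothetical stand-ins, not Arrow's kernel API, and the general case is simplified to `std::round` rather than the PR's rounding modes.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Special case: a zero multiple maps every finite input to 0 and the rest to NaN.
inline double CallZero(double arg) {
  return std::isfinite(arg) ? 0.0 : std::numeric_limits<double>::quiet_NaN();
}

// General case, simplified here to round-half-away-from-zero via std::round.
inline double CallGeneral(double arg, double mult) {
  return std::round(arg / mult) * mult;
}

// Hypothetical batch-level dispatch: the option is inspected once per batch,
// so neither inner loop carries the `mult == 0` branch.
inline std::vector<double> RoundBatchToMultiple(const std::vector<double>& batch,
                                                double multiple) {
  const double mult = std::fabs(multiple);  // the sign of the multiple is irrelevant
  std::vector<double> out;
  out.reserve(batch.size());
  if (mult == 0) {
    for (double v : batch) out.push_back(CallZero(v));
  } else {
    for (double v : batch) out.push_back(CallGeneral(v, mult));
  }
  return out;
}
```

Keeping the branch outside the loops is what would make each loop body uniform, which is the property that tends to help vectorization.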




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-916228453


   @rok This PR only supports rounding for basic arithmetic data types (unsigned/signed int and floating-point).



[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704418301



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       Yes, adding a comment sounds good.





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705299721



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       Should this just return OptionsWrapper<OptionsType>::Init like below?





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r633845512



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       I knew this comment was coming with the PR as is. This is currently a placeholder for cases where the rounding options request rounding to the nearest integer and the output is of integral type, in which case overflow/underflow can occur.
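
For illustration only, a checked variant for that case might look like the sketch below. `RoundToInt64Checked` is a hypothetical helper that is not part of this PR, and it assumes the output type is `int64_t` and that ties are rounded with `std::round`.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of the PR: round to the nearest integer and
// report failure instead of overflowing when the result does not fit in int64_t.
inline bool RoundToInt64Checked(double value, int64_t* out) {
  const double rounded = std::round(value);
  // 2^63 is exactly representable as a double; valid results lie in [-2^63, 2^63).
  constexpr double kTwoPow63 = 9223372036854775808.0;
  if (!(rounded >= -kTwoPow63 && rounded < kTwoPow63)) {
    return false;  // also rejects NaN, because comparisons with NaN are false
  }
  *out = static_cast<int64_t>(rounded);
  return true;
}
```

In a real kernel the `false` branch would presumably set the `Status*` argument instead of returning a flag.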





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-915971328


   @rok It is a possibility, if we have the semantics pinned down w.r.t. how to shift specific timestamps (forward, backward, delta). This would be a new set of compute functions: "round_time" and "round_time_to_multiple", where the latter could be a quaternary/varargs function to support multiples for hour, min, sec, ms/ns.



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705805393



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&
+        !RoundUtil::IsHalfInteger(round_val)) {
+      round_val = std::round(round_val);
+    } else if (!RoundUtil::IsInteger(round_val)) {

Review comment:
       The integer logic is now fixed, so if the scaled value `round_val` is an integer then no rounding needs to take place. I also inlined the logic of `IsInteger` and `IsHalfInteger` so that the internal `floor` operation is done once.
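
A rough sketch of what computing `floor` once and reusing it for both checks could look like. The names `Fraction`, `ClassifyFraction`, and `RoundHalfToEven` are hypothetical and this is not the code pushed to the PR; only the HALF_TO_EVEN expression is taken from the diff above.

```cpp
#include <cmath>

// Hypothetical names throughout; this is not the code from the PR.
enum class Fraction { kNone, kHalf, kOther };

// Classify the fractional part of a scaled value using a single floor() call.
inline Fraction ClassifyFraction(double scaled) {
  const double frac = scaled - std::floor(scaled);
  if (frac == 0.0) return Fraction::kNone;  // already an integer
  if (frac == 0.5) return Fraction::kHalf;  // exact tie
  return Fraction::kOther;
}

// HALF_TO_EVEN rounding built on the classification above.
inline double RoundHalfToEven(double scaled) {
  switch (ClassifyFraction(scaled)) {
    case Fraction::kNone:
      return scaled;  // integer: no rounding needed
    case Fraction::kHalf:
      return std::round(scaled * 0.5) * 2;  // tie: same expression as in the diff
    default:
      return std::round(scaled);  // not a tie: nearest integer
  }
}
```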





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703917625



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       Ok, I am removing the absolute tolerance parameter.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703722756



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       Initially, that is what I did, so as to avoid division operations and use only multiplies (`round(val * 10^3) * 10^-3`). But I found that this was the culprit behind the floating-point precision errors (well, more of them than there should be). After looking at several rounding implementations (e.g., [NumPy](https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589) and [CPython](https://hg.python.org/cpython/file/1f1498fe50e5/Objects/floatobject.c#l924)), I noticed that they do divide. So I applied the division and the results were much less noisy (I wish I could explain this, but it seems that the errors cancel out better). I also extracted CPython's round function, converted its divisions into reciprocal multiplies, and observed the same noisy output. For this reason, the round functions apply a multiply and a divide using only powers of 10 that are greater than or equal to 1.
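
The two strategies compared above can be sketched side by side as follows. The function names are hypothetical and `Pow10` here is a simple loop rather than the lookup table in the diff; the sketch also fixes the rounding rule to `std::round` to keep the comparison focused on the scaling step.

```cpp
#include <cmath>
#include <cstdint>

// Power of ten >= 1 for a non-negative exponent (plain loop; assumes small exponents).
inline double Pow10(int64_t exponent) {
  double result = 1.0;
  for (int64_t i = 0; i < exponent; ++i) result *= 10.0;
  return result;
}

// Multiply-and-divide variant: only factors >= 1 are used, and those are
// exactly representable for small exponents.
inline double RoundDigits(double value, int64_t ndigits) {
  const double scale = Pow10(ndigits >= 0 ? ndigits : -ndigits);
  return (ndigits >= 0) ? std::round(value * scale) / scale
                        : std::round(value / scale) * scale;
}

// Multiply-only variant: uses the reciprocal 10^-n, which is inexact in binary.
inline double RoundDigitsMultiplyOnly(double value, int64_t ndigits) {
  const double scale = Pow10(ndigits >= 0 ? ndigits : -ndigits);
  const double inv = 1.0 / scale;
  return (ndigits >= 0) ? std::round(value * scale) * inv
                        : std::round(value * inv) * scale;
}
```

One plausible reading of the observed noise is that dividing two exactly representable values yields the correctly rounded quotient, whereas the reciprocal is rounded once on its own and again in the product; that explanation is an assumption, not something established in this thread.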





[GitHub] [arrow] rok commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

rok commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-915951669


   Looks great!
   Would we consider [pandas-like time unit rounding](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.round.html) once we have this in?



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703464394



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -53,16 +52,10 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   bool skip_nulls;
 };
 
-/// Functor to calculate hash of an enum.
-template <typename T, typename R = typename std::underlying_type<T>::type,
-          typename = std::enable_if<std::is_enum<T>::value>>
-struct EnumHash {
-  R operator()(T val) const { return static_cast<R>(val); }
-};
-
 /// Rounding and tie-breaking modes for round compute functions.
 /// Additional details and examples are provided in compute.rst.
 enum class RoundMode : int8_t {
+  // Note: The HALF values need to be last and the first HALF entry is HALF_DOWN.

Review comment:
       Having the HALF_XXX enums grouped at one end of the enum makes it possible to check whether the rounding mode is tie-breaking with a single comparison instead of enumerating all the cases.
   ```c++
   if (rmode >= RoundMode::HALF_DOWN) { ... }
   // vs
   if (rmode == RoundMode::HALF_DOWN || rmode == RoundMode::HALF_UP || ... 4 more times) { ... }
   ```
   I added asserts in the Round/MRound kernel implementations, but I am not sure where else I could place them. Round/MRound constructor?
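
One possible placement, sketched under the assumption that a namespace-scope `static_assert` next to the enum is acceptable in the header: the ordering check and the comparison then live in one place. `IsTieBreaking` is a hypothetical helper, not something defined in this PR.

```cpp
#include <cstdint>

enum class RoundMode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  // HALF_* modes must stay grouped at the end, starting with HALF_DOWN.
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// Namespace-scope check placed next to the enum, so reordering fails to compile.
static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
                  RoundMode::HALF_DOWN > RoundMode::UP &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
              "HALF_* modes must be last in RoundMode, starting with HALF_DOWN");

// Hypothetical helper: a single comparison identifies every tie-breaking mode.
constexpr bool IsTieBreaking(RoundMode mode) { return mode >= RoundMode::HALF_DOWN; }
```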





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703473200



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")

Review comment:
       I would have preferred for this enum to be scoped inside the Round/MRound Options, but that would require duplicating it. Any suggestions?





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704417660



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN,
+                        double abs_tol = kDefaultAbsoluteTolerance);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+  /// Absolute tolerance for approximating values as integers and mid-point decimals
+  double abs_tol;

Review comment:
       Sounds good. If people ask for it, we can put it back later.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703472702



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+static constexpr double kDefaultAbsoluteTolerance = 1E-5;
+

Review comment:
       Should this be part of the Round/MRound Options? There is an existing atol in [compare.h](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compare.h#L36).





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704420538



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       My inclination would be to check for `mult > 0` in the kernel init function and return an error otherwise. I don't think there's any use case for passing zero or a negative number, is there?
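
A minimal sketch of the up-front check suggested above, written without Arrow types so that it stands alone. `MRoundOptionsSketch` and `ValidateMultiple` are hypothetical names, not part of the PR; in Arrow the check would presumably return a `Status` from the kernel init function instead of a string.

```cpp
#include <string>

// Hypothetical stand-in for MRoundOptions; only the field needed here.
struct MRoundOptionsSketch {
  double multiple;
};

// Returns an empty string when the options are usable, otherwise an error
// message. The test is written as !(x > 0) so that NaN is rejected as well
// as zero and negative values. It runs once, before any per-element work.
std::string ValidateMultiple(const MRoundOptionsSketch& options) {
  if (!(options.multiple > 0)) {
    return "rounding multiple must be a positive number";
  }
  return "";
}
```

With a check like this at init time, the per-element `Call()` would no longer need its own `mult == 0` branch.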
   





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-916916498


   This PR is ready for a (hopefully) final review. cc @lidavidm @pitrou 



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703994513



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       Well, the only value not allowed is `mult = 0`. We can mandate that `mult` is non-zero and remove the check, but I think we should leave as-is for now. Ideally, we would be able to have multiple CallXXX variants for a kernel where they would be selected based on options. In this case, if `mult` is zero, then it would invoke
   ```c++
   static ... CallZero(...) {
         return std::isfinite(arg) ? 0 : std::nanf("");
   }
   ```
   and all other cases would invoke the current `Call()`. This is not the only compute function with special cases (in fact, most of them have some), and this capability would improve performance and make the code a bit more amenable to SIMD. I am working on a refactoring to support these ideas.
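
To make the idea concrete, here is a rough standalone sketch of selecting a variant once from the option value and then applying it to every element. It does not use Arrow's kernel machinery; `CallNonZero` and `RoundToMultiple` are hypothetical names, and the non-zero variant only implements `std::round` tie-breaking for brevity.

```cpp
#include <cmath>
#include <vector>

// Variant for mult == 0, mirroring the CallZero idea above.
double CallZero(double arg, double /*mult*/) {
  return std::isfinite(arg) ? 0.0 : std::nan("");
}

// Variant for every other multiple: round to the nearest multiple, with ties
// going away from zero because that is what std::round does.
double CallNonZero(double arg, double mult) {
  return std::round(arg / mult) * mult;
}

// Picks the variant once, which keeps the `mult == 0` test out of the
// per-element loop.
std::vector<double> RoundToMultiple(const std::vector<double>& values, double mult) {
  mult = std::fabs(mult);
  double (*call)(double, double) = (mult == 0) ? CallZero : CallNonZero;
  std::vector<double> out;
  out.reserve(values.size());
  for (double v : values) out.push_back(call(v, mult));
  return out;
}
```

A function pointer is used here only to keep the sketch short; a templated or SIMD-friendly design would more likely select between separately compiled loops.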





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r634972664



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       Ah, that might also be interesting, but I would indeed start with the type-preserving round for now (there are already enough complexities regarding the round mode, whether we want ceil/floor/trunc, etc.)





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r634139326



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -355,6 +355,32 @@ struct PowerChecked {
   }
 };
 
+struct Round {
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    return std::round(arg);
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    return arg;
+  }
+};
+
+struct RoundChecked {
+  template <typename T, typename Arg>
+  static constexpr enable_if_integer<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    return arg;
+  }
+
+  template <typename T, typename Arg>
+  static constexpr enable_if_floating_point<T> Call(KernelContext*, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");

Review comment:
       But should a `round` kernel ever change the type? IMO when rounding floats, the result can still be floats, regardless of the number of decimals to round (in which case overflow will never be a problem?)
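
As an illustration of the type-preserving behaviour described above, the following standalone sketch rounds a floating-point value to a number of decimal digits and returns the same type it was given. `RoundDigitsSketch` is a hypothetical helper, not the PR's kernel, and it only uses `std::round` tie-breaking.

```cpp
#include <cmath>
#include <type_traits>

// Rounds `value` to `ndigits` decimal places and returns the input type.
// Negative `ndigits` rounds to tens, hundreds, and so on.
// Assumes a small |ndigits|; no range checking is done in this sketch.
template <typename T>
T RoundDigitsSketch(T value, int ndigits) {
  static_assert(std::is_floating_point<T>::value, "floating-point input expected");
  if (!std::isfinite(value)) return value;
  // Build 10^|ndigits| by repeated multiplication.
  T pow10 = T(1);
  for (int i = 0, n = (ndigits < 0 ? -ndigits : ndigits); i < n; ++i) pow10 *= T(10);
  return ndigits >= 0 ? std::round(value * pow10) / pow10
                      : std::round(value / pow10) * pow10;
}
```

A `float` input yields a `float` and a `double` input yields a `double`, whatever the value of `ndigits`.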





[GitHub] [arrow] lidavidm commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704368213



##########
File path: cpp/src/arrow/compute/api_scalar.cc
##########
@@ -175,6 +219,42 @@ ElementWiseAggregateOptions::ElementWiseAggregateOptions(bool skip_nulls)
       skip_nulls(skip_nulls) {}
 constexpr char ElementWiseAggregateOptions::kTypeName[];
 
+RoundOptions::RoundOptions(int64_t ndigits, RoundMode round_mode)
+    : FunctionOptions(internal::kRoundOptionsType),
+      ndigits(ndigits),
+      round_mode(round_mode) {
+  static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                    RoundMode::HALF_DOWN > RoundMode::UP &&
+                    RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                    RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                    RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                "Invalid order of round modes. Modes prefixed with HALF need to be "
+                "enumerated last with HALF_DOWN being the first among them.");
+}
+constexpr char RoundOptions::kTypeName[];
+
+MRoundOptions::MRoundOptions(double multiple, RoundMode round_mode)
+    : FunctionOptions(internal::kMRoundOptionsType),
+      multiple(multiple),
+      round_mode(round_mode) {
+  static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&

Review comment:
       nit: do we really need to have this here twice?
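
One possible way to avoid the repetition, sketched outside of Arrow: state the ordering requirement once, next to the enum, and expose the property that callers rely on through a small helper. The enum is copied from the diff above; `IsTieBreakingMode` is a hypothetical name.

```cpp
#include <cstdint>

enum class RoundMode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// Single ordering check, placed beside the enum instead of in each
// constructor and each kernel.
static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
                  RoundMode::HALF_DOWN > RoundMode::UP &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
                  RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
              "Invalid order of round modes. Modes prefixed with HALF need to be "
              "enumerated last with HALF_DOWN being the first among them.");

// The property the ordering exists to provide: HALF_* modes compare >= HALF_DOWN.
constexpr bool IsTieBreakingMode(RoundMode mode) {
  return mode >= RoundMode::HALF_DOWN;
}
```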

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;

Review comment:
       It seems you'll have to use ScalarUnaryNotNullStateful and do some more refactoring on top of that (i.e. you'd be implementing a static Exec which validates the options and instantiates the Op, then calls out to ScalarUnaryNotNullStateful).
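
The shape of that refactoring can be sketched without Arrow as follows. This is only an analogy: `RoundOpSketch`, `ExecResultSketch` and `ExecRoundSketch` are hypothetical names, the plain loop stands in for what `ScalarUnaryNotNullStateful` would do, and only `std::round` tie-breaking is implemented.

```cpp
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Stateful op: holds values derived from the options, computed once.
struct RoundOpSketch {
  double pow10;
  bool scale_up;  // true when ndigits >= 0

  double Call(double arg) const {
    if (!std::isfinite(arg)) return arg;
    return scale_up ? std::round(arg * pow10) / pow10
                    : std::round(arg / pow10) * pow10;
  }
};

struct ExecResultSketch {
  bool ok;
  std::string error;
  std::vector<double> values;
};

// "Exec": validates the options, instantiates the op, then maps it over the
// input. Invalid options are reported once rather than per element.
ExecResultSketch ExecRoundSketch(const std::vector<double>& input, int64_t ndigits) {
  if (ndigits < -22 || ndigits > 22) {
    return {false, "out-of-range value for rounding digits", {}};
  }
  double pow10 = 1.0;
  for (int64_t i = 0, n = (ndigits < 0 ? -ndigits : ndigits); i < n; ++i) pow10 *= 10.0;
  const RoundOpSketch op{pow10, ndigits >= 0};

  ExecResultSketch result{true, "", {}};
  result.values.reserve(input.size());
  for (double v : input) result.values.push_back(op.Call(v));
  return result;
}
```

The limit of 22 digits follows the size of the lookup table in the diff above.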

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");

Review comment:
       In that case, can we reference NumPy's implementation and note that division is more stable for us than a precomputed table?
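
A small self-contained sketch of why dividing by a positive power of ten can behave better than multiplying by a precomputed negative power. The function names and the tiny lookup tables are made up for this example; they are not the PR's `Pow10`, and `std::round` stands in for the mode-specific rounding.

```c++
#include <cassert>
#include <cmath>

// Positive powers of ten are exactly representable as doubles in this range;
// their reciprocals (0.1, 0.01, ...) are not.
constexpr int kNumPowers = 4;
constexpr double kPow10[kNumPowers] = {1e0, 1e1, 1e2, 1e3};
constexpr double kInvPow10[kNumPowers] = {1e0, 1e-1, 1e-2, 1e-3};

// Scale up with a multiply, scale back down with a divide.
inline double RoundWithDivide(double value, int ndigits) {
  assert(ndigits >= 0 && ndigits < kNumPowers);
  return std::round(value * kPow10[ndigits]) / kPow10[ndigits];
}

// Scale up with a multiply, scale back down with another multiply
// by the (inexact) reciprocal.
inline double RoundMultiplyOnly(double value, int ndigits) {
  assert(ndigits >= 0 && ndigits < kNumPowers);
  return std::round(value * kPow10[ndigits]) * kInvPow10[ndigits];
}
```

For example, with IEEE 754 doubles `RoundWithDivide(0.3, 1)` compares equal to `0.3`, while `RoundMultiplyOnly(0.3, 1)` computes `3 * 0.1`, which is not equal to `0.3`.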





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705731494



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&
+        !RoundUtil::IsHalfInteger(round_val)) {
+      round_val = std::round(round_val);

Review comment:
       My mistake, I moved the check incorrectly during a revision. Thanks for being meticulous in your review!





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r698884590



##########
File path: cpp/src/arrow/util/map.h
##########
@@ -17,13 +17,21 @@
 
 #pragma once
 
+#include <type_traits>
 #include <utility>
 
 #include "arrow/result.h"
 
 namespace arrow {
 namespace internal {
 
+/// Functor to calculate hash of an enum.

Review comment:
       Change description to "Functor to make enums hashable."
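
For context, a functor of this kind typically hashes the enum through its underlying integer type, which is useful when the standard library in use does not provide `std::hash` for enum types. The sketch below is a generic version under that assumption; `EnumHashSketch`, `ModeSketch`, and `ModeName` are hypothetical names, not the code added to `map.h`.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Makes an enum usable as an unordered_map key by hashing its underlying value.
template <typename Enum>
struct EnumHashSketch {
  static_assert(std::is_enum<Enum>::value, "EnumHashSketch requires an enum type");
  std::size_t operator()(Enum value) const {
    using Underlying = typename std::underlying_type<Enum>::type;
    return std::hash<Underlying>{}(static_cast<Underlying>(value));
  }
};

// Stand-in enum for the example; not the PR's RoundMode.
enum class ModeSketch : int8_t { DOWN, UP, HALF_TO_EVEN };

inline std::string ModeName(ModeSketch mode) {
  static const std::unordered_map<ModeSketch, std::string, EnumHashSketch<ModeSketch>>
      kNames = {{ModeSketch::DOWN, "DOWN"},
                {ModeSketch::UP, "UP"},
                {ModeSketch::HALF_TO_EVEN, "HALF_TO_EVEN"}};
  const auto it = kNames.find(mode);
  assert(it != kNames.end());  // every enumerator is listed above
  return it->second;
}
```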





[GitHub] [arrow] lidavidm commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

lidavidm commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-884495434


   There seem to be some Windows-specific test failures :/



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703994513



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       Well, the sign of `mult` is irrelevant. We could mandate that `mult` be non-zero and finite and remove the check, but I think we should leave it as-is for now. Ideally, a kernel would be able to have multiple CallXXX variants that are selected based on the options. In this case, if `mult` is zero, it would invoke
   ```c++
   static ... CallZero(...) {
       return std::isfinite(arg) ? 0 : std::nanf("");
   }
   ```
   and all other cases would invoke the current `Call()`. This is not the only compute function with special cases; in fact, most of them have some. Having this capability would improve performance and also make the code more amenable to SIMD. I am working on a refactoring to support these ideas.
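
A rough, self-contained sketch of that idea: the variant is chosen once from the options, so the per-value path carries no `mult == 0` branch. All names here (`CallZeroSketch`, `CallGeneralSketch`, `RoundToMultipleSketch`) are hypothetical, `std::round` stands in for the mode-specific rounding, and the multiple is assumed to be finite.

```c++
#include <cassert>
#include <cmath>

using RoundToMultipleFn = double (*)(double arg, double mult);

// Special case for a zero multiple: finite inputs round to 0.
inline double CallZeroSketch(double arg, double /*mult*/) {
  return std::isfinite(arg) ? 0.0 : std::nan("");
}

// General case: mult is known to be positive, so no zero check is needed here.
inline double CallGeneralSketch(double arg, double mult) {
  assert(mult > 0.0);
  if (!std::isfinite(arg)) return arg;
  return std::round(arg / mult) * mult;  // ties away from zero
}

struct RoundToMultipleSketch {
  double mult;
  RoundToMultipleFn call;
  double operator()(double arg) const { return call(arg, mult); }
};

// Selects the variant once, based on the option value (assumed finite).
inline RoundToMultipleSketch MakeRoundToMultipleSketch(double multiple) {
  const double mult = std::fabs(multiple);  // the sign is irrelevant
  return RoundToMultipleSketch{mult,
                               mult == 0.0 ? &CallZeroSketch : &CallGeneralSketch};
}
```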





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703474375



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,235 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }

Review comment:
       Also, [compare.cc](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compare.cc#L78-L86) contains an approximate equal for floating point values.
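
As a point of comparison, an absolute-tolerance check in its simplest form might look like the sketch below. It is only an illustration and is not taken from `compare.cc`; the name `IsApproxEqualSketch` and the explicit `a == b` shortcut for equal infinities are choices made for this example.

```c++
#include <cassert>
#include <cmath>
#include <type_traits>

// Returns true when a and b differ by at most abs_tol.
template <typename T>
bool IsApproxEqualSketch(T a, T b, T abs_tol) {
  static_assert(std::is_floating_point<T>::value, "floating-point types only");
  assert(abs_tol >= T(0));
  // Equal infinities would otherwise give inf - inf = NaN, which fails the
  // tolerance test below, so handle exact equality first.
  if (a == b) return true;
  return std::fabs(a - b) <= abs_tol;
}
```

NaN inputs compare unequal to everything here, including another NaN.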





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705514453



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1276,6 +1496,65 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Generate a kernel given an arithmetic rounding functor
+template <template <RoundMode> class Op>
+ArrayKernelExec GenerateArithmeticRound(RoundMode rmode, detail::GetTypeId ty) {
+  switch (rmode) {
+    case RoundMode::DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::DOWN>>(ty);
+    case RoundMode::UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::UP>>(ty);
+    case RoundMode::TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_ZERO>>(ty);
+    case RoundMode::TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_DOWN>>(ty);
+    case RoundMode::HALF_UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::HALF_UP>>(
+          ty);
+    case RoundMode::HALF_TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_ZERO>>(ty);
+    case RoundMode::HALF_TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_TO_EVEN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_EVEN>>(ty);
+    case RoundMode::HALF_TO_ODD:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_ODD>>(ty);
+    default:
+      DCHECK(false);
+      return ExecFail;
+  }
+}
+
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode> class Op, typename OptionsType>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
+                                                       const FunctionDoc* doc) {
+  using State = RoundOptionsWrapper<OptionsType>;
+
+  static const OptionsType kDefaultOptions = OptionsType::Defaults();
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                &kDefaultOptions);
+  for (const auto& ty : FloatingPointTypes()) {
+    auto exec = [&](KernelContext* ctx, const ExecBatch& batch, Datum* out) {
+      auto options = State::Get(ctx);
+      auto exec_ = GenerateArithmeticRound<Op>(options.round_mode, ty);

Review comment:
       I agree. The purpose of doing this here is to generate the dispatchers once, prior to kernel invocation. Previously, I tried 2 solutions:
   * Have a vector of precomputed `GenerateArithmeticFloatingPoint` results and use `options.round_mode` to index this vector. But this requires the vector to be ordered identically to the round modes in `enum RoundMode`.
   * Similar to the above, but using an `unordered_map` keyed by `options.round_mode`. This requires adding hash support for the `RoundMode` data type. This approach does not impose a strict constraint on the ordering of `RoundMode`.
   
   Which one do you think is best?
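
To make the ordering constraint of the first option concrete, here is a minimal sketch of an array-indexed dispatch table. `DispatchMode`, `RoundFn`, and `LookupRound` are hypothetical stand-ins for `RoundMode` and the generated kernels; the `static_assert` is one way to catch an enum reordering at compile time.

```c++
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Stand-in enum for the example; not the PR's RoundMode.
enum class DispatchMode : int8_t { DOWN, UP, TOWARDS_ZERO };

using RoundFn = double (*)(double);

inline double RoundDown(double v) { return std::floor(v); }
inline double RoundUp(double v) { return std::ceil(v); }
inline double RoundTowardsZero(double v) { return std::trunc(v); }

// The table below is indexed by the enum value, so its order must match the
// enum declaration exactly.
static_assert(static_cast<int>(DispatchMode::DOWN) == 0 &&
                  static_cast<int>(DispatchMode::UP) == 1 &&
                  static_cast<int>(DispatchMode::TOWARDS_ZERO) == 2,
              "dispatch table order must match DispatchMode");

inline RoundFn LookupRound(DispatchMode mode) {
  static const std::array<RoundFn, 3> kTable = {
      {&RoundDown, &RoundUp, &RoundTowardsZero}};
  const auto index = static_cast<std::size_t>(mode);
  assert(index < kTable.size());
  return kTable[index];
}
```

The map-based option trades this compile-time ordering check for a hash lookup and the need for an enum hash functor.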





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705487276



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }

Review comment:
       No, it is `RoundOptionsWrapper` so that its constructor is invoked via `make_unique`, which initializes the `pow10` data member. If we use `OptionsWrapper`, then `pow10` is not available. I tried invoking `OptionsWrapper::Init`, but it returns a `std::unique_ptr`, which would require "casting" to `RoundOptionsWrapper` first and then to `KernelState` to match the return type. The `unique_ptr` casting caused too many issues, so I reverted to mimicking the `OptionsWrapper::Init` method.
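
The asymmetry described above can be shown with plain `std::unique_ptr`: converting derived-to-base is implicit, while base-to-derived needs a `release()` plus a cast and is only valid if the object really was constructed as the derived type. The class names below are simplified, hypothetical stand-ins for `KernelState`, `OptionsWrapper`, and `RoundOptionsWrapper`.

```c++
#include <cassert>
#include <memory>
#include <utility>

struct KernelStateSketch {
  virtual ~KernelStateSketch() = default;
};

struct OptionsStateSketch : KernelStateSketch {
  explicit OptionsStateSketch(int ndigits) : ndigits(ndigits) {}
  int ndigits;
};

struct RoundStateSketch : OptionsStateSketch {
  explicit RoundStateSketch(int ndigits) : OptionsStateSketch(ndigits), pow10(1.0) {
    const int abs_digits = (ndigits < 0) ? -ndigits : ndigits;
    for (int i = 0; i < abs_digits; ++i) pow10 *= 10.0;
  }
  double pow10;  // only exists if the derived constructor ran
};

// Constructing the derived type directly: the upcast to the base unique_ptr
// happens implicitly on return.
inline std::unique_ptr<KernelStateSketch> InitSketch(int ndigits) {
  return std::unique_ptr<RoundStateSketch>(new RoundStateSketch(ndigits));
}

// Going the other way requires giving up ownership and casting the raw
// pointer. This is only valid when the object was created as RoundStateSketch.
inline std::unique_ptr<RoundStateSketch> DowncastSketch(
    std::unique_ptr<KernelStateSketch> state) {
  assert(dynamic_cast<RoundStateSketch*>(state.get()) != nullptr);
  return std::unique_ptr<RoundStateSketch>(
      static_cast<RoundStateSketch*>(state.release()));
}
```

A state object created by the base wrapper's own factory would fail the `dynamic_cast` check, which is consistent with constructing the derived wrapper directly in `Init`.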





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705326962



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,58 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+};
+
+class ARROW_EXPORT RoundToMultipleOptions : public FunctionOptions {
+ public:
+  explicit RoundToMultipleOptions(double multiple = 1.0,
+                                  RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+  constexpr static char const kTypeName[] = "RoundToMultipleOptions";
+  static RoundToMultipleOptions Defaults() { return RoundToMultipleOptions(); }
+  /// Rounding scale (multiple to round to, only the absolute value is used).

Review comment:
       Can remove the comment about the absolute value being used, since negative values are disallowed.

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&
+        !RoundUtil::IsHalfInteger(round_val)) {
+      round_val = std::round(round_val);
+    } else if (!RoundUtil::IsInteger(round_val)) {

Review comment:
       I wonder: is this condition still useful? If `std::floor(round_val) == round_val`, then presumably the same also holds for `std::ceil` and `std::trunc`, so `RoundImpl::Round` will be a no-op.
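
To illustrate the question above: for any finite value with no fractional part, `std::floor`, `std::ceil` and `std::trunc` all return their input, so guarding the directed-rounding call with `IsInteger` should not change the result. The sketch below is standalone C++, not code from the PR; `DirectedRoundingIsNoOp` is a hypothetical helper, and `IsInteger` only mirrors the check done in `RoundUtil::IsInteger`.

```cpp
#include <cmath>

// Mirrors the check in the PR's RoundUtil::IsInteger (double only, for brevity).
inline bool IsInteger(double val) { return std::floor(val) == val; }

// Hypothetical helper, not part of the PR: true when every directed rounding
// primitive used by the DOWN / UP / TOWARDS_ZERO implementations leaves `val`
// unchanged.
inline bool DirectedRoundingIsNoOp(double val) {
  return std::floor(val) == val && std::ceil(val) == val &&
         std::trunc(val) == val;
}
```

For finite inputs the two predicates agree, which is the reviewer's point. Whether the guard is still worth keeping as a fast path is a separate performance question that this sketch does not address.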

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&

Review comment:
       Hmm... why don't you use `RndMode` here instead of branching dynamically on `options.round_mode`?
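
A minimal sketch of what branching on the template parameter could look like, assuming the enum ordering shown in `api_scalar.h` (all `HALF_*` modes declared after the directed modes). `IsTieBreakingMode` and `UsesNearestRounding` are hypothetical names used only for illustration; they are not part of the PR.

```cpp
#include <cmath>
#include <cstdint>

// Stand-in for the PR's enum, with the same declaration order.
enum class RoundMode : int8_t {
  DOWN,
  UP,
  TOWARDS_ZERO,
  TOWARDS_INFINITY,
  HALF_DOWN,
  HALF_UP,
  HALF_TOWARDS_ZERO,
  HALF_TOWARDS_INFINITY,
  HALF_TO_EVEN,
  HALF_TO_ODD,
};

// Hypothetical trait: the mode is known at compile time, so this is a constant.
template <RoundMode RndMode>
struct IsTieBreakingMode {
  static constexpr bool value = RndMode >= RoundMode::HALF_DOWN;
};

// Mirrors the check in the PR's RoundUtil::IsHalfInteger (double only).
inline bool IsHalfInteger(double val) { return (val - std::floor(val)) == 0.5; }

// Hypothetical helper: true when the scaled value can go straight to
// std::round(). The left operand is a compile-time constant, so the compiler
// can drop the whole test for the non-tie-breaking instantiations.
template <RoundMode RndMode>
bool UsesNearestRounding(double scaled) {
  return IsTieBreakingMode<RndMode>::value && !IsHalfInteger(scaled);
}
```

Because the kernel is already instantiated per `RoundMode`, a test of this shape would not need to read `options.round_mode` for every element.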

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -1276,6 +1496,65 @@ std::shared_ptr<ScalarFunction> MakeUnaryArithmeticFunctionNotNull(
   return func;
 }
 
+// Generate a kernel given an arithmetic rounding functor
+template <template <RoundMode> class Op>
+ArrayKernelExec GenerateArithmeticRound(RoundMode rmode, detail::GetTypeId ty) {
+  switch (rmode) {
+    case RoundMode::DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::DOWN>>(ty);
+    case RoundMode::UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::UP>>(ty);
+    case RoundMode::TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_ZERO>>(ty);
+    case RoundMode::TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_DOWN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_DOWN>>(ty);
+    case RoundMode::HALF_UP:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull, Op<RoundMode::HALF_UP>>(
+          ty);
+    case RoundMode::HALF_TOWARDS_ZERO:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_ZERO>>(ty);
+    case RoundMode::HALF_TOWARDS_INFINITY:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TOWARDS_INFINITY>>(ty);
+    case RoundMode::HALF_TO_EVEN:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_EVEN>>(ty);
+    case RoundMode::HALF_TO_ODD:
+      return GenerateArithmeticFloatingPoint<ScalarUnaryNotNull,
+                                             Op<RoundMode::HALF_TO_ODD>>(ty);
+    default:
+      DCHECK(false);
+      return ExecFail;
+  }
+}
+
+// Like MakeUnaryArithmeticFunction, but for unary rounding functions that control
+// kernel dispatch based on RoundMode, only on non-null output.
+template <template <RoundMode> class Op, typename OptionsType>
+std::shared_ptr<ScalarFunction> MakeUnaryRoundFunction(std::string name,
+                                                       const FunctionDoc* doc) {
+  using State = RoundOptionsWrapper<OptionsType>;
+
+  static const OptionsType kDefaultOptions = OptionsType::Defaults();
+  auto func = std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Unary(), doc,
+                                                                &kDefaultOptions);
+  for (const auto& ty : FloatingPointTypes()) {
+    auto exec = [&](KernelContext* ctx, const ExecBatch& batch, Datum* out) {
+      auto options = State::Get(ctx);
+      auto exec_ = GenerateArithmeticRound<Op>(options.round_mode, ty);

Review comment:
       Just for the record, it feels a bit weird to call `GenerateArithmeticRound` at kernel execution time. That said, a quick benchmark in Python suggests there is no large overhead:
   ```python
   >>> import pyarrow as pa, pyarrow.compute as pc
   >>> floor = pc.get_function("floor")
   >>> round = pc.get_function("round")
   >>> arr = pa.array([None], type=pa.float64())
   >>> %timeit floor.call([arr])
   2.57 µs ± 10.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
   >>> %timeit floor.call([arr])
   2.58 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
   >>> %timeit round.call([arr])
   2.65 µs ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
   >>> %timeit round.call([arr])
   2.53 µs ± 12.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&
+        !RoundUtil::IsHalfInteger(round_val)) {
+      round_val = std::round(round_val);

Review comment:
       So this is called even if `IsInteger(round_val)` would be true?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -43,8 +42,20 @@
 namespace arrow {
 namespace compute {
 
-template <typename T>
-class TestUnaryArithmetic : public TestBase {
+// InputType - OutputType pairs

Review comment:
       I don't see any pairs here?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -172,6 +190,64 @@ class TestUnaryArithmeticUnsigned : public TestUnaryArithmeticIntegral<T> {};
 template <typename T>
 class TestUnaryArithmeticFloating : public TestUnaryArithmetic<T> {};
 
+TYPED_TEST_SUITE(TestUnaryArithmeticIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryArithmeticFloating, FloatingTypes);
+
+template <typename T>
+class TestUnaryRound : public TestBaseUnaryArithmetic<T, RoundOptions> {
+ protected:
+  using Base = TestBaseUnaryArithmetic<T, RoundOptions>;
+  using Base::options_;
+  void SetRoundMode(RoundMode value) { options_.round_mode = value; }
+  void SetRoundNdigits(int64_t value) { options_.ndigits = value; }
+};
+
+template <typename T>
+class TestUnaryRoundIntegral : public TestUnaryRound<T> {};
+
+template <typename T>
+class TestUnaryRoundSigned : public TestUnaryRoundIntegral<T> {};
+
+template <typename T>
+class TestUnaryRoundUnsigned : public TestUnaryRoundIntegral<T> {};
+
+template <typename T>
+class TestUnaryRoundFloating : public TestUnaryRound<T> {};
+
+TYPED_TEST_SUITE(TestUnaryRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundFloating, FloatingTypes);

Review comment:
       As I said, can you move the `TYPED_TEST_SUITE` declarations near the `TYPED_TEST` definitions?

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,243 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsInteger(
+      const T val) {
+    // |frac| ~ 0.0?
+    return std::floor(val) == val;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsHalfInteger(
+      const T val) {
+    // |frac| ~ 0.5?
+    return (val - std::floor(val)) == T(0.5);
+  }
+
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+// Specializations of kernel state for round kernels
+template <typename>
+struct RoundOptionsWrapper;
+
+template <>
+struct RoundOptionsWrapper<RoundOptions> : public OptionsWrapper<RoundOptions> {
+  using OptionsType = RoundOptions;
+  double pow10;
+
+  explicit RoundOptionsWrapper(OptionsType options) : OptionsWrapper(std::move(options)) {
+    // Only positive of powers of 10 are used because combining multiply and
+    // division operations produced more stable rounding than using multiply-only.
+    // Refer to NumPy's round implementation:
+    // https://github.com/numpy/numpy/blob/7b2f20b406d27364c812f7a81a9c901afbd3600c/numpy/core/src/multiarray/calculation.c#L589
+    pow10 = RoundUtil::Pow10(std::abs(options.ndigits));
+  }
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    if (auto options = static_cast<const OptionsType*>(args.options)) {
+      return arrow::internal::make_unique<RoundOptionsWrapper<OptionsType>>(*options);
+    }
+
+    return Status::Invalid(
+        "Attempted to initialize KernelState from null FunctionOptions");
+  }
+};
+
+template <>
+struct RoundOptionsWrapper<RoundToMultipleOptions>
+    : public OptionsWrapper<RoundToMultipleOptions> {
+  using OptionsType = RoundToMultipleOptions;
+
+  static Result<std::unique_ptr<KernelState>> Init(KernelContext* ctx,
+                                                   const KernelInitArgs& args) {
+    ARROW_ASSIGN_OR_RAISE(auto state, OptionsWrapper<OptionsType>::Init(ctx, args));
+    auto options = Get(*state);
+    if (options.multiple <= 0) {
+      return Status::Invalid("Rounding multiple has to be a non-zero positive value");
+    }
+    return std::move(state);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using State = RoundOptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+    auto state = static_cast<State*>(ctx->state());
+    auto options = state->options;
+    auto pow10 = T(state->pow10);
+    auto round_val = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&
+        !RoundUtil::IsHalfInteger(round_val)) {
+      round_val = std::round(round_val);
+    } else if (!RoundUtil::IsInteger(round_val)) {
+      round_val = RoundImpl<T, RndMode>::Round(round_val);
+    }
+    // Equality check is ommitted so that the common case of 10^0 (integer rounding) uses
+    // multiply-only
+    round_val = (options.ndigits > 0) ? (round_val / pow10) : (round_val * pow10);
+    if (!std::isfinite(round_val)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    return round_val;
+  }
+};
+
+template <RoundMode RndMode>
+struct RoundToMultiple {
+  using State = RoundOptionsWrapper<RoundToMultipleOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    // Do not process Inf or NaN because they will trigger the overflow error at end of
+    // function.
+    if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    auto options = State::Get(ctx);
+    auto round_val = arg / T(options.multiple);
+    // Use std::round() if in tie-breaking mode and scaled value is not 0.5.
+    if ((options.round_mode >= RoundMode::HALF_DOWN) &&

Review comment:
       Same questions as above.





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r703587363



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,66 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")

Review comment:
       Well, let's keep it like this.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r705492169



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -49,10 +49,58 @@ class ARROW_EXPORT ElementWiseAggregateOptions : public FunctionOptions {
   explicit ElementWiseAggregateOptions(bool skip_nulls = true);
   constexpr static char const kTypeName[] = "ElementWiseAggregateOptions";
   static ElementWiseAggregateOptions Defaults() { return ElementWiseAggregateOptions{}; }
-
   bool skip_nulls;
 };
 
+/// Rounding and tie-breaking modes for round compute functions.
+/// Additional details and examples are provided in compute.rst.
+enum class RoundMode : int8_t {
+  /// Round to nearest integer less than or equal in magnitude (aka "floor")
+  DOWN,
+  /// Round to nearest integer greater than or equal in magnitude (aka "ceil")
+  UP,
+  /// Get the integral part without fractional digits (aka "trunc")
+  TOWARDS_ZERO,
+  /// Round negative values with DOWN rule and positive values with UP rule
+  TOWARDS_INFINITY,
+  /// Round ties with DOWN rule
+  HALF_DOWN,
+  /// Round ties with UP rule
+  HALF_UP,
+  /// Round ties with TOWARDS_ZERO rule
+  HALF_TOWARDS_ZERO,
+  /// Round ties with TOWARDS_INFINITY rule
+  HALF_TOWARDS_INFINITY,
+  /// Round ties to nearest even integer
+  HALF_TO_EVEN,
+  /// Round ties to nearest odd integer
+  HALF_TO_ODD,
+};
+
+class ARROW_EXPORT RoundOptions : public FunctionOptions {
+ public:
+  explicit RoundOptions(int64_t ndigits = 0,
+                        RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+  constexpr static char const kTypeName[] = "RoundOptions";
+  static RoundOptions Defaults() { return RoundOptions(); }
+  /// Rounding precision (number of digits to round to).
+  int64_t ndigits;
+  /// Rounding and tie-breaking mode
+  RoundMode round_mode;
+};
+
+class ARROW_EXPORT RoundToMultipleOptions : public FunctionOptions {
+ public:
+  explicit RoundToMultipleOptions(double multiple = 1.0,
+                                  RoundMode round_mode = RoundMode::HALF_TO_EVEN);
+  constexpr static char const kTypeName[] = "RoundToMultipleOptions";
+  static RoundToMultipleOptions Defaults() { return RoundToMultipleOptions(); }
+  /// Rounding scale (multiple to round to, only the absolute value is used).

Review comment:
       Thanks for catching this! I also removed a similar comment from the compute docs.





[GitHub] [arrow] edponce commented on pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#issuecomment-869762925


   This PR is still missing an implementation of `pow(10, x)` that uses a precomputed lookup table (LUT) instead of `std::pow`.
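   
   As a rough sketch of the idea (the name `Pow10Lut` and the table size are assumptions made for illustration, not code from this PR), a precomputed table could stand in for `std::pow` roughly like this:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the PR): 10^power read from a precomputed table.
// Exponents outside [-15, 15] return NaN instead of indexing past the table.
inline double Pow10Lut(int64_t power) {
  static constexpr std::array<double, 16> kLut = {
      1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
      1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15};
  constexpr int64_t kMaxPower = 15;
  if (power < -kMaxPower || power > kMaxPower) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  const auto index = static_cast<std::size_t>(power < 0 ? -power : power);
  // Negative exponents reuse the same entries through a reciprocal.
  return (power >= 0) ? kLut[index] : 1.0 / kLut[index];
}
```

   Out-of-range exponents yield NaN here so that a caller can decide how to report them; that choice is also an assumption of the sketch.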



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r674995912



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       Would you recommend resolving invalid values or explicitly triggering an error?
   For cases where `ndigits` is out of range, we can provide a corresponding meaningful result.
   
   For example,
   * If `ndigits` is "too large" and positive, the request is to round to an extremely small decimal precision = `10^-(large number)` (most likely less than epsilon), so the rounded value is the input unchanged.
   * If `ndigits` is "too large" and negative, the request is to round to an extremely large integral value = `10^(large number)` (most likely beyond the range of floating-point values), so the rounded value is 0.
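   
   The "resolve instead of error" option described in the two bullets could be sketched as follows. `RoundResolved` and both thresholds are hypothetical and chosen only for illustration; ties round away from zero because the sketch relies on `std::round`.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not from the PR). Thresholds are illustrative assumptions.
inline double RoundResolved(double value, int64_t ndigits) {
  constexpr int64_t kMaxDigits = 17;    // beyond the decimal precision of double
  constexpr int64_t kMinDigits = -308;  // 10^308 is near the largest finite double
  if (!std::isfinite(value) || ndigits > kMaxDigits) {
    return value;  // precision treated as finer than the type: no change
  }
  if (ndigits < kMinDigits) {
    return 0.0;  // multiple larger than any finite value: result resolves to 0
  }
  // Build 10^|ndigits| by repeated multiplication (at most 308 steps).
  double scale = 1.0;
  for (int64_t i = (ndigits < 0 ? -ndigits : ndigits); i > 0; --i) {
    scale *= 10.0;
  }
  const double result = (ndigits >= 0) ? std::round(value * scale) / scale
                                       : std::round(value / scale) * scale;
  // Fall back to the input if scaling overflowed.
  return std::isfinite(result) ? result : value;
}
```

   With these assumed thresholds, `RoundResolved(1.25, 1000)` returns the input and `RoundResolved(1.25, -1000)` returns 0, matching the two cases above.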





[GitHub] [arrow] pitrou commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r707400565



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +853,238 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  // Calculate powers of ten with arbitrary integer exponent
+  template <typename T = double>
+  static enable_if_floating_point<T> Pow10(int64_t power) {
+    static constexpr T lut[] = {1e0F, 1e1F, 1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                                1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F};
+    int64_t lut_size = (sizeof(lut) / sizeof(*lut));
+    int64_t abs_power = std::abs(power);
+    auto pow10 = lut[std::min(abs_power, lut_size - 1)];
+    while (abs_power-- >= lut_size) {
+      pow10 *= 1e1F;
+    }
+    return (power >= 0) ? pow10 : (1 / pow10);
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename, RoundMode>
+struct RoundImpl;
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::trunc(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {

Review comment:
       Can you add a comment that the implementations below are only invoked if the input has a fractional part equal to 0.5? Otherwise reading this is a bit confusing.
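   
   To illustrate the precondition being asked about, here is a minimal sketch (the names `TieMode`, `BreakTie` and `RoundNearest` are hypothetical, and only three modes are shown) in which the tie-breaking code runs only when the fractional part is exactly 0.5:

```cpp
#include <cmath>

// Hypothetical subset of tie-breaking modes, for illustration only.
enum class TieMode { kHalfDown, kHalfUp, kHalfToEven };

// Tie-breakers: each branch assumes the caller already verified that the
// fractional part of `val` is exactly 0.5.
inline double BreakTie(double val, TieMode mode) {
  switch (mode) {
    case TieMode::kHalfDown:
      return std::floor(val);
    case TieMode::kHalfUp:
      return std::ceil(val);
    case TieMode::kHalfToEven:
      return std::round(val * 0.5) * 2.0;
  }
  return val;
}

// Dispatcher: non-ties go through std::round, ties through BreakTie.
inline double RoundNearest(double val, TieMode mode) {
  const double frac = val - std::floor(val);
  return (frac == 0.5) ? BreakTie(val, mode) : std::round(val);
}
```

   Read in isolation, `BreakTie` looks like a plain floor or ceil, which is why a comment stating the 0.5 precondition helps.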

##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -1352,6 +1403,213 @@ TYPED_TEST(TestUnaryArithmeticFloating, AbsoluteValue) {
   }
 }
 
+TYPED_TEST_SUITE(TestUnaryRoundIntegral, IntegralTypes);
+TYPED_TEST_SUITE(TestUnaryRoundSigned, SignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundUnsigned, UnsignedIntegerTypes);
+TYPED_TEST_SUITE(TestUnaryRoundFloating, FloatingTypes);
+
+const std::vector<RoundMode> kRoundModes{
+    RoundMode::DOWN,
+    RoundMode::UP,
+    RoundMode::TOWARDS_ZERO,
+    RoundMode::TOWARDS_INFINITY,
+    RoundMode::HALF_DOWN,
+    RoundMode::HALF_UP,
+    RoundMode::HALF_TOWARDS_ZERO,
+    RoundMode::HALF_TOWARDS_INFINITY,
+    RoundMode::HALF_TO_EVEN,
+    RoundMode::HALF_TO_ODD,
+};
+
+TYPED_TEST(TestUnaryRoundSigned, Round) {
+  // Test different rounding modes for integer rounding
+  std::string values("[0, 1, -13, -50, 115]");
+  this->SetRoundNdigits(0);
+  for (const auto& round_mode : kRoundModes) {
+    this->SetRoundMode(round_mode);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), values));
+  }
+
+  // Test different round N-digits for nearest rounding mode
+  std::vector<std::pair<int64_t, std::string>> ndigits_and_expected{{
+      {-2, "[0, 0, -0, -100, 100]"},
+      {-1, "[0, 0, -10, -50, 120]"},
+      {0, values},
+      {1, values},
+      {2, values},
+  }};
+  this->SetRoundMode(RoundMode::HALF_TOWARDS_INFINITY);
+  for (const auto& pair : ndigits_and_expected) {
+    this->SetRoundNdigits(pair.first);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), pair.second));
+  }
+}
+
+TYPED_TEST(TestUnaryRoundUnsigned, Round) {
+  // Test different rounding modes for integer rounding
+  std::string values("[0, 1, 13, 50, 115]");
+  this->SetRoundNdigits(0);
+  for (const auto& round_mode : kRoundModes) {
+    this->SetRoundMode(round_mode);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), values));
+  }
+
+  // Test different round N-digits for nearest rounding mode
+  std::vector<std::pair<int64_t, std::string>> ndigits_and_expected{{
+      {-2, "[0, 0, 0, 100, 100]"},
+      {-1, "[0, 0, 10, 50, 120]"},
+      {0, values},
+      {1, values},
+      {2, values},
+  }};
+  this->SetRoundMode(RoundMode::HALF_TOWARDS_INFINITY);
+  for (const auto& pair : ndigits_and_expected) {
+    this->SetRoundNdigits(pair.first);
+    this->AssertUnaryOp(Round, values, ArrayFromJSON(float64(), pair.second));
+  }
+}
+
+TYPED_TEST(TestUnaryRoundFloating, Round) {
+  this->SetNansEqual(true);
+
+  // Test different rounding modes for integer rounding

Review comment:
       "integer"?





[GitHub] [arrow] pitrou closed pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

pitrou closed pull request #10349:
URL: https://github.com/apache/arrow/pull/10349


   



[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692356691



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       It is the primary template for which explicit specializations are provided for `ArithmeticOptions`, `RoundOptions`, and `MRoundOptions`. I simplified these subclasses in the latest commit.





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692356691



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##########
@@ -150,18 +149,32 @@ class TestUnaryArithmetic : public TestBase {
     AssertArraysApproxEqual(*expected, *actual, /*verbose=*/true, equal_options_);
   }
 
-  void SetOverflowCheck(bool value = true) { options_.check_overflow = value; }
-
   void SetNansEqual(bool value = true) {
     this->equal_options_ = equal_options_.nans_equal(value);
   }
 
-  ArithmeticOptions options_ = ArithmeticOptions();
+  Options options_ = Options();
   EqualOptions equal_options_ = EqualOptions::Defaults();
 };
 
+template <typename T, typename Options>

Review comment:
       It is the primary template for which explicit specializations are provided for `ArithmeticOptions`, `RoundOptions`, and `MRoundOptions`. I emptied the primary template in the latest commit.
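   
   For readers unfamiliar with the pattern, a stripped-down sketch of an empty primary template with one specialization per options type might look like the following. All names here (`UnaryFixture`, `ArithmeticOptionsLike`, `RoundOptionsLike`) are stand-ins, not the test classes in this PR:

```cpp
#include <cstdint>

// Stand-ins for the real options structs, for illustration only.
struct ArithmeticOptionsLike {
  bool check_overflow = false;
};
struct RoundOptionsLike {
  int64_t ndigits = 0;
};

// Primary template: intentionally empty, so an unsupported options type
// exposes no setters at all.
template <typename T, typename Options>
struct UnaryFixture {};

// Specialization carrying the setter that only makes sense for arithmetic options.
template <typename T>
struct UnaryFixture<T, ArithmeticOptionsLike> {
  ArithmeticOptionsLike options;
  void SetOverflowCheck(bool value = true) { options.check_overflow = value; }
};

// Specialization carrying the setter that only makes sense for round options.
template <typename T>
struct UnaryFixture<T, RoundOptionsLike> {
  RoundOptionsLike options;
  void SetRoundNdigits(int64_t ndigits) { options.ndigits = ndigits; }
};
```

   Calling `SetRoundNdigits` on the arithmetic fixture then fails at compile time rather than silently doing nothing.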





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r704651704



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -852,24 +854,232 @@ struct LogbChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxEqual(
+      const T a, const T b, const T abs_tol = kDefaultAbsoluteTolerance) {
+    return std::fabs(a - b) <= abs_tol;
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.0?
+    return IsApproxEqual(val, std::round(val), abs_tol);
+  }
+
+  template <typename T>
+  static constexpr enable_if_t<std::is_floating_point<T>::value, bool> IsApproxHalfInt(
+      const T val, const T abs_tol = kDefaultAbsoluteTolerance) {
+    // |frac| ~ 0.5?
+    return IsApproxEqual(val - std::floor(val), T(0.5), abs_tol);
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int64_t power) {
+    const T lut[]{1e0F,  1e1F,  1e2F,  1e3F,  1e4F,  1e5F,  1e6F,  1e7F,
+                  1e8F,  1e9F,  1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F,
+                  1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F};
+    // Return NaN if index is out-of-range.
+    auto lut_size = (int64_t)(sizeof(lut) / sizeof(*lut));
+    return (power >= 0 && power < lut_size) ? lut[power] : std::nanf("");
+  }
+};
+
+// Specializations of rounding implementations for round kernels
+template <typename T, RoundMode>
+struct RoundImpl {};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::ceil(val) : std::floor(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_DOWN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::DOWN>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_UP> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::UP>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_ZERO>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return RoundImpl<T, RoundMode::TOWARDS_INFINITY>::Round(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::round(val * T(0.5)) * 2;
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static constexpr enable_if_floating_point<T> Round(const T val) {
+    return std::floor(val * T(0.5)) + std::ceil(val * T(0.5));
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  using RoundState = OptionsWrapper<RoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+
+    auto options = RoundState::Get(ctx);
+    auto pow10 = RoundUtil::Pow10<T>(std::llabs(options.ndigits));
+    if (std::isnan(pow10)) {
+      *st = Status::Invalid("out-of-range value for rounding digits");
+      return arg;
+    } else if (!std::isfinite(arg)) {
+      return arg;
+    }
+
+    T scaled_arg = (options.ndigits >= 0) ? (arg * pow10) : (arg / pow10);
+    // Use std::round if scaled value is an integer or not 0.5 when a tie-breaking mode
+    // was set.
+    T result;
+    if (RoundUtil::IsApproxInt(scaled_arg, T(options.abs_tol)) ||
+        (options.round_mode >= RoundMode::HALF_DOWN &&
+         !RoundUtil::IsApproxHalfInt(scaled_arg, T(options.abs_tol)))) {
+      result = std::round(scaled_arg);
+    } else {
+      result = RoundImpl<T, RndMode>::Round(scaled_arg);
+    }
+    result = (options.ndigits >= 0) ? (result / pow10) : (result * pow10);
+    if (!std::isfinite(result)) {
+      *st = Status::Invalid("overflow occurred during rounding");
+      return arg;
+    }
+    // If rounding didn't change value, return original value
+    return RoundUtil::IsApproxEqual(arg, result, T(options.abs_tol)) ? arg : result;
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  using MRoundState = OptionsWrapper<MRoundOptions>;
+
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status* st) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    static_assert(RoundMode::HALF_DOWN > RoundMode::DOWN &&
+                      RoundMode::HALF_DOWN > RoundMode::UP &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN > RoundMode::TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_UP &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_ZERO &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TOWARDS_INFINITY &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_EVEN &&
+                      RoundMode::HALF_DOWN < RoundMode::HALF_TO_ODD,
+                  "Round modes prefixed with HALF need to be defined last in enum and "
+                  "the first HALF entry has to be HALF_DOWN.");
+    auto options = MRoundState::Get(ctx);
+    auto mult = std::fabs(T(options.multiple));
+    if (mult == 0) {

Review comment:
       No, there is not. I will add the check for `mult > 0`.
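   
   A minimal sketch of what such a check could look like, assuming a hypothetical free function `RoundToMultiple` that reports failure through its return value (the real kernel reports errors through `Status`), with ties rounded away from zero by `std::round`:

```cpp
#include <cmath>

// Hypothetical helper (not from the PR): rounds `arg` to the nearest multiple.
// Returns false for a multiple that is not strictly positive and finite.
inline bool RoundToMultiple(double arg, double multiple, double* out) {
  if (!(multiple > 0.0) || !std::isfinite(multiple)) {
    return false;  // also rejects NaN, since NaN > 0 is false
  }
  if (!std::isfinite(arg)) {
    *out = arg;  // pass Inf and NaN through unchanged
    return true;
  }
  *out = std::round(arg / multiple) * multiple;  // ties away from zero
  return true;
}
```

   Rejecting zero up front also avoids the division by zero in `arg / multiple`.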





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692108407



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       Given that an invalid `ndigits` value can be detected in the `RoundUtil::Pow10(...)` helper function, which is invoked by `Round::Call(...)`, I decided to propagate the error by returning NaN from `Pow10` and setting `Status::Invalid` in the caller when that value is NaN. AFAIK, exceptions are not caught here; otherwise, an "out-of-range" exception could be triggered via `std::array::at(...)`.
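   
   The NaN-sentinel approach can be sketched in isolation as follows. `SimpleStatus` is a stand-in for `arrow::Status`, and `Pow10Abs` and `RoundDigitsChecked` are hypothetical names; the table bound of 22 and the use of `std::round` are assumptions of the sketch.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <string>

// Stand-in for arrow::Status, only for this sketch.
struct SimpleStatus {
  bool ok = true;
  std::string message;
};

// Returns 10^|power|, or NaN when |power| is outside the table.
inline double Pow10Abs(int64_t power) {
  static const double kLut[] = {1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
                                1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
                                1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
  constexpr int64_t kMaxPower = 22;
  if (power < -kMaxPower || power > kMaxPower) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return kLut[power < 0 ? -power : power];
}

// The helper has no access to the status object, so the caller translates the
// NaN sentinel into an error.
inline double RoundDigitsChecked(double arg, int64_t ndigits, SimpleStatus* st) {
  const double pow10 = Pow10Abs(ndigits);
  if (std::isnan(pow10)) {
    st->ok = false;
    st->message = "out-of-range value for rounding digits";
    return arg;
  }
  if (!std::isfinite(arg)) {
    return arg;
  }
  const double scaled = (ndigits >= 0) ? arg * pow10 : arg / pow10;
  const double rounded = std::round(scaled);  // ties away from zero
  return (ndigits >= 0) ? rounded / pow10 : rounded * pow10;
}
```

   The sentinel keeps the helper free of any dependency on the status type, at the cost of an extra `std::isnan` check per call.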





[GitHub] [arrow] edponce commented on a change in pull request #10349: ARROW-12744: [C++][Compute] Add rounding kernel

Posted by GitBox <gi...@apache.org>.

edponce commented on a change in pull request #10349:
URL: https://github.com/apache/arrow/pull/10349#discussion_r692108407



##########
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc
##########
@@ -817,24 +818,158 @@ struct Log1pChecked {
   }
 };
 
+struct RoundUtil {
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool ApproxEqual(const T x, const T y) {
+    return (x == y) || (std::fabs(x - y) <= std::numeric_limits<T>::epsilon());
+  }
+
+  template <typename T, enable_if_t<std::is_floating_point<T>::value, bool> = true>
+  static constexpr bool IsHalf(T val) {
+    // |frac| == 0.5?
+    return ApproxEqual(std::fmod(std::fabs(val), T(1)), T(0.5));
+  }
+
+  template <typename T>
+  static enable_if_floating_point<T> Pow10(const int power) {
+    static constexpr auto pow10 = std::array<T, 39>{
+        1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
+        1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
+        1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
+        1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
+    return pow10.at(power + 19);
+  }
+};
+
+// Specializations of rounding implementations for kernels
+template <typename T, RoundMode RndMode>
+struct RoundImpl {
+  static constexpr enable_if_floating_point<T> Round(T) { return T(0); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::floor(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::ceil(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) { return std::trunc(val); }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::TOWARDS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::signbit(val) ? std::floor(val) : std::ceil(val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_NEG_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::ceil(val - T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_POS_INFINITY> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::floor(val + T(0.5));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_ZERO> {
+  static constexpr enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::ceil(std::fabs(val) - T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TOWARDS_INFINITY> {
+  static enable_if_floating_point<T> Round(T val) {
+    return std::copysign(std::floor(std::fabs(val) + T(0.5)), val);
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_EVEN> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 1, Even + 0
+    return floor + T(std::fmod(std::fabs(floor), T(2)) >= T(1));
+  }
+};
+
+template <typename T>
+struct RoundImpl<T, RoundMode::HALF_TO_ODD> {
+  static enable_if_floating_point<T> Round(T val) {
+    if (!RoundUtil::IsHalf(val)) {
+      return std::round(val);
+    }
+    auto floor = std::floor(val);
+    // Odd + 0, Even + 1
+    return floor + T(std::fmod(std::fabs(floor), T(2)) < T(1));
+  }
+};
+
+template <RoundMode RndMode>
+struct MRound {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    if (std::isnan(arg)) {
+      return arg;
+    }
+    auto options = OptionsWrapper<MRoundOptions>::Get(ctx);
+    const auto mult = std::fabs(T(options.multiple));
+    return (mult == T(0)) ? T(0) : (RoundImpl<T, RndMode>::Round(arg / mult) * mult);
+  }
+};
+
+template <RoundMode RndMode>
+struct Round {
+  template <typename T, typename Arg>
+  static enable_if_floating_point<Arg, T> Call(KernelContext* ctx, Arg arg, Status*) {
+    static_assert(std::is_same<T, Arg>::value, "");
+    auto options = OptionsWrapper<RoundOptions>::Get(ctx);
+    const auto mult = RoundUtil::Pow10<T>(-options.ndigits);

Review comment:
       The `ndigits` value can be detected as invalid in the `Pow10(...)` helper function, which is invoked by `Round::Call(...)`, so I decided to propagate the error by returning a NaN value from `Pow10` and setting `Status::Invalid` when the returned value is NaN. AFAIK, exceptions are not caught; otherwise, an "out-of-range" exception could be triggered via `std::array::at(...)`.
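
For context on the `std::array::at(...)` remark: `at()` throws `std::out_of_range` whenever the index is outside the array, which is what the table lookup in the diff above would do for an out-of-range `ndigits`. The sketch below only demonstrates that standard-library behaviour; `LookupThrows` is a hypothetical helper, not part of the PR.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: reports whether looking up 10^power through
// std::array::at() would throw std::out_of_range.
inline bool LookupThrows(int power) {
  static constexpr std::array<double, 39> kPow10 = {
      1e-19, 1e-18, 1e-17, 1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11, 1e-10,
      1e-9,  1e-8,  1e-7,  1e-6,  1e-5,  1e-4,  1e-3,  1e-2,  1e-1,  1e0,
      1e1,   1e2,   1e3,   1e4,   1e5,   1e6,   1e7,   1e8,   1e9,   1e10,
      1e11,  1e12,  1e13,  1e14,  1e15,  1e16,  1e17,  1e18,  1e19};
  try {
    // A negative offset wraps to a very large std::size_t, so it is also
    // rejected by the bounds check inside at().
    (void)kPow10.at(static_cast<std::size_t>(power + 19));
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

Powers from -19 to 19 are served from the table; anything outside that range throws, so a caller that cannot catch exceptions needs an explicit range check before the lookup.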



