Posted to issues@arrow.apache.org by "mroeschke (via GitHub)" <gi...@apache.org> on 2023/04/21 17:47:17 UTC

[GitHub] [arrow] mroeschke opened a new issue, #35273: BUG: pyarrow.compute.round fails for large integer within int64

mroeschke opened a new issue, #35273:
URL: https://github.com/apache/arrow/issues/35273

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   ```
   In [11]: import pyarrow as pa
   
   In [12]: pa.compute.round(pa.array([1372636858620000589]))
   ArrowInvalid: Integer value 1372636858620000589 not in range: -9007199254740992 to 9007199254740992
   
   In [13]: pa.__version__
   Out[13]: '11.0.0'
   ```
   
   Not sure why the rounding range (2^53) is narrower than the int64 range.
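
For context on the 2^53 bound in the error message: a float64 has a 53-bit significand, so integers beyond that magnitude are not all exactly representable. The following is a minimal sketch using only the Python standard library (Python's `float` is a float64); the helper name `survives_float64` is made up for illustration and is not part of pyarrow.

```python
# float64 has a 53-bit significand, so every integer up to 2**53 in
# magnitude is exact, but not every integer beyond it.
LIMIT = 2 ** 53  # 9007199254740992, the bound quoted in the ArrowInvalid message

value = 1372636858620000589  # the int64 value from the report


def survives_float64(n: int) -> bool:
    """Hypothetical helper: True if n converts to float64 and back unchanged."""
    return int(float(n)) == n


print(LIMIT)
print(survives_float64(LIMIT))          # the bound itself is exact
print(survives_float64(value))          # the reported value is not
print(float(LIMIT) == float(LIMIT + 1)) # 2**53 + 1 collapses onto 2**53
```

Between 2^60 and 2^61, where the reported value lies, adjacent float64 values are 256 apart, so an odd integer in that range cannot survive the conversion.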
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche commented on issue #35273: BUG: pyarrow.compute.round fails for large integer within int64

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1518257796

   > So on the short term, I think pandas can just avoid calling the kernel on integers, as it should be a no-op anyway. And it might be more convenient that pyarrow adds a dummy kernel for the integer types as well.
   
   Actually, rounding to a `ndigits` >= 0 is a no-op, but of course rounding to a negative `ndigits` is also useful for integers (and not a no-op):
   
   ```
   In [23]: pa.compute.round([2023], ndigits=-1)
   Out[23]: 
   <pyarrow.lib.DoubleArray object at 0x7ff8aa3b8b80>
   [
     2020
   ]
   ```
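
To illustrate why a negative `ndigits` is meaningful for integers, and how it could be computed without going through floats, here is a minimal pure-Python sketch. It is not the Arrow kernel; `round_int` is a hypothetical name, and half-to-even tie-breaking is an assumption chosen to match Python's built-in `round`.

```python
def round_int(value: int, ndigits: int = 0) -> int:
    """Round an integer to 10**(-ndigits) using integer arithmetic only.

    Hypothetical helper for illustration; ties go to the even multiple.
    """
    if ndigits >= 0:
        # No fractional digits to drop: rounding an integer is a no-op.
        return value
    multiple = 10 ** (-ndigits)
    # divmod floors, so 0 <= remainder < multiple even for negative values.
    quotient, remainder = divmod(value, multiple)
    twice = 2 * remainder
    if twice > multiple or (twice == multiple and quotient % 2 == 1):
        quotient += 1
    return quotient * multiple


print(round_int(2023, ndigits=-1))
print(round_int(1372636858620000589, ndigits=-1))
```

Because no float conversion is involved, the large value from the original report rounds exactly, and the result stays an integer rather than becoming a double.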




[GitHub] [arrow] westonpace commented on issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1523833616

   I've added `good-second-issue` as a label.  Adding a new kernel is not trivial but it is rather self-contained.  Someone could also use the existing round kernels as inspiration.




[GitHub] [arrow] pitrou closed issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou closed issue #35273: [C++] Support "round" kernel for integer inputs
URL: https://github.com/apache/arrow/issues/35273




[GitHub] [arrow] westonpace commented on issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1613231728

   > If we don't, then there will be inconsistency between them and round. Which way do you think would be better?
   
   I don't think I would be opposed to dummy implementations but I don't think we need them.  Planners are typically smart enough to recognize that these are no-ops and remove them.




[GitHub] [arrow] js8544 commented on issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "js8544 (via GitHub)" <gi...@apache.org>.
js8544 commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1605914674

   @jorisvandenbossche @westonpace While it would be meaningful to implement `round` and `round_to_multiple` kernels for integers, how about `floor`, `ceil` and `trunc`? These functions currently also cast ints to floats. If we do add kernels for ints, they will essentially be no-ops. If we don't, then there will be inconsistency between them and `round`. Which way do you think would be better?




[GitHub] [arrow] westonpace commented on issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1523835735

   https://github.com/apache/arrow/pull/13933 would likely be useful for anyone new that is looking to author a kernel.




[GitHub] [arrow] jorisvandenbossche commented on issue #35273: BUG: pyarrow.compute.round fails for large integer within int64

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1518256692

   The issue here is that the "round" kernel is only implemented for floats / decimals:
   
   ```
   In [20]: pa.compute.get_function("round").kernels
   Out[20]: 
   [ScalarKernel<(Type::FLOAT) -> float>,
    ScalarKernel<(Type::DOUBLE) -> double>,
    ScalarKernel<(Type::DECIMAL128) -> computed>,
    ScalarKernel<(Type::DECIMAL256) -> computed>,
    ScalarKernel<(Type::NA) -> null>]
   ```
   
   And so for numeric types we do have some automatic type casting to find a kernel, which in this case doesn't really do what would be expected (the failure you see is from trying to cast int to float, and the int is too large to be faithfully represented as a float).
   
   So on the short term, I think pandas can just avoid calling the kernel on integers, as it should be a no-op anyway. And it might be more convenient that pyarrow adds a dummy kernel for the integer types as well.




[GitHub] [arrow] js8544 commented on issue #35273: [C++] Support "round" kernel for integer inputs

Posted by "js8544 (via GitHub)" <gi...@apache.org>.
js8544 commented on issue #35273:
URL: https://github.com/apache/arrow/issues/35273#issuecomment-1605913824

   take

