You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/27 17:09:57 UTC

[GitHub] [arrow] bkietz commented on a change in pull request #9294: ARROW-8919: [C++][Compute][Dataset] Add Function::DispatchBest to accomodate implicit casts

bkietz commented on a change in pull request #9294:
URL: https://github.com/apache/arrow/pull/9294#discussion_r565483217



##########
File path: docs/source/cpp/compute.rst
##########
@@ -744,3 +749,34 @@ Structural transforms
 * \(2) For each value in the list child array, the index at which it is found
   in the list array is appended to the output.  Nulls in the parent list array
   are discarded.
+
+.. _common-numeric-type:
+
+Common numeric type
+~~~~~~~~~~~~~~~~~~~
+
+The common numeric type of a set of input numeric types is the smallest numeric
+type which can accommodate any value of any input. If any input is a floating
+point type the common numeric type is the widest floating point type among the
+inputs. Otherwise the common numeric type is integral, is signed if any input
+is signed, and its width is the maximum width of any input. For example:
+
++-------------------+----------------------+-------------------------------------------+
+| Input types       | Common numeric type  | Notes                                     |
++===================+======================+===========================================+
+| int32, int32      | int32                |                                           |
++-------------------+----------------------+-------------------------------------------+
+| int16, int32      | int32                | Max width is 32, promote LHS to int32     |
++-------------------+----------------------+-------------------------------------------+
+| uint16, uint32    | uint32               | All inputs unsigned, maintain unsigned    |
++-------------------+----------------------+-------------------------------------------+
+| uint16, int32     | int32                | One input signed, override unsigned       |
++-------------------+----------------------+-------------------------------------------+
+| int16, uint32     | int32                |                                           |

Review comment:
       I chose to preserve bit width to keep the implicitly cast intermediates a bit smaller (and therefore faster). The downside is direct comparison may raise: `pc.equal(pa.array([0], type='int32'), pa.array([2**31], type='uint32'))` would overflow when casting the RHS to `int32`.
   
   We can certainly use the rule you suggest instead, but note that integer overflow is still a danger: we have nowhere wider to go when comparing `int64, uint64`, and arithmetic can of course overflow as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org