Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/13 10:07:34 UTC

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7695: ARROW-8989: [C++][Doc] Document available compute functions

jorisvandenbossche commented on a change in pull request #7695:
URL: https://github.com/apache/arrow/pull/7695#discussion_r453541169



##########
File path: docs/source/cpp/compute.rst
##########
@@ -0,0 +1,419 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. default-domain:: cpp
+.. highlight:: cpp
+.. cpp:namespace:: arrow::compute
+
+=================
+Compute Functions
+=================
+
+.. TODO: describe API and how to invoke compute functions
+
+Available functions
+===================
+
+Aggregations
+------------
+
++--------------------------+------------+--------------------+-----------------------+--------------------------------------------+
+| Function name            | Arity      | Input types        | Output type           | Options class                              |
++==========================+============+====================+=======================+============================================+
+| count                    | Unary      | Any                | Scalar Int64          | :struct:`CountOptions`                     |
++--------------------------+------------+--------------------+-----------------------+--------------------------------------------+
+| mean                     | Unary      | Numeric            | Scalar Float64        |                                            |
++--------------------------+------------+--------------------+-----------------------+--------------------------------------------+
+| minmax                   | Unary      | Numeric            | Scalar Struct  (1)    | :struct:`MinMaxOptions`                    |
++--------------------------+------------+--------------------+-----------------------+--------------------------------------------+
+| sum                      | Unary      | Numeric            | Scalar Numeric (2)    |                                            |
++--------------------------+------------+--------------------+-----------------------+--------------------------------------------+
+
+Notes:
+
+* \(1) Output is a ``{"min": input type, "max": input type}`` Struct
+
+* \(2) Output is Int64, UInt64 or Float64, depending on the input type
+
+
+Element-wise ("scalar") functions
+---------------------------------
+
+Arithmetic functions
+~~~~~~~~~~~~~~~~~~~~
+
+Those functions expect two inputs of the same type and apply a given binary
+operation to each pair of elements gathered from the inputs.  Each function
+is also available in an overflow-checking variant, suffixed ``_checked``.
+
+If any of the input elements in a pair is null, the corresponding output
+element is null.
+
++--------------------------+------------+--------------------+---------------------+
+| Function name            | Arity      | Input types        | Output type         |
++==========================+============+====================+=====================+
+| add                      | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+| add_checked              | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+| multiply                 | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+| multiply_checked         | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+| subtract                 | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+| subtract_checked         | Binary     | Numeric            | Numeric             |
++--------------------------+------------+--------------------+---------------------+
+
+Comparisons
+~~~~~~~~~~~
+
+Those functions expect two inputs of the same type and apply a given
+comparison operator.  If any of the input elements in a pair is null,
+the corresponding output element is null.
+
++--------------------------+------------+---------------------------------+---------------------+
+| Function names           | Arity      | Input types                     | Output type         |
++==========================+============+=================================+=====================+
+| equal, not_equal         | Binary     | Numeric                         | Boolean             |
++--------------------------+------------+---------------------------------+---------------------+
+| equal, not_equal         | Binary     | Binary- and String-like         | Boolean             |
++--------------------------+------------+---------------------------------+---------------------+
+| equal, not_equal         | Binary     | Temporal                        | Boolean             |
++--------------------------+------------+---------------------------------+---------------------+
+| greater, greater_equal,  | Binary     | Numeric                         | Boolean             |
+| less, less_equal         |            |                                 |                     |
++--------------------------+------------+---------------------------------+---------------------+
+| greater, greater_equal,  | Binary     | Binary- and String-like         | Boolean             |
+| less, less_equal         |            |                                 |                     |
++--------------------------+------------+---------------------------------+---------------------+
+| greater, greater_equal,  | Binary     | Temporal                        | Boolean             |
+| less, less_equal         |            |                                 |                     |
++--------------------------+------------+---------------------------------+---------------------+
+
+Logical functions
+~~~~~~~~~~~~~~~~~~
+
+The normal behaviour for these functions is to emit a null if any of the
+inputs is null.
+
+Some of them are also available in a "`Kleene logic`_" variant (suffixed

Review comment:
       Yes, numpy doesn't do this. But numpy has no concept of "nulls", so the question doesn't really arise for it. (pandas only started doing this in its new experimental opt-in nullable dtypes; for its default numpy-based dtypes, it follows numpy.)
   
   Kleene logic is also what Julia does by default when working with their "Missing" union type (https://docs.julialang.org/en/v1/manual/missing/index.html#Logical-operators-1, they use the term "three-valued logic"). 
   I am personally not aware of a system that has "null" support and doesn't use Kleene logic.
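
To make the difference concrete, here is a minimal sketch in plain Python of the two behaviours being discussed. It uses `None` as a stand-in for null; the function names `kleene_and`, `kleene_or` and `propagating_and` are made up for this illustration and are not Arrow, pandas or Julia APIs.

```python
from typing import Optional

NullableBool = Optional[bool]  # None plays the role of "null" in this sketch


def propagating_and(a: NullableBool, b: NullableBool) -> NullableBool:
    """Null-propagating AND: any null input gives a null output."""
    if a is None or b is None:
        return None
    return a and b


def kleene_and(a: NullableBool, b: NullableBool) -> NullableBool:
    """Kleene (three-valued) AND: False wins even against null."""
    if a is False or b is False:
        return False  # the result is False whatever the unknown value is
    if a is None or b is None:
        return None   # the result really depends on the unknown value
    return True


def kleene_or(a: NullableBool, b: NullableBool) -> NullableBool:
    """Kleene (three-valued) OR: True wins even against null."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

The two variants only disagree when one input is null and the other already determines the answer: `False AND null` is `False` under Kleene logic but null under plain propagation, and likewise `True OR null` is `True` under Kleene logic.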




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org