Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/19 12:59:14 UTC

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #13155: ARROW-16469: [Python] Table.filter accepts a boolean expression in addition to boolean array

jorisvandenbossche commented on code in PR #13155:
URL: https://github.com/apache/arrow/pull/13155#discussion_r877024037


##########
python/pyarrow/table.pxi:
##########
@@ -2882,24 +2882,27 @@ cdef class Table(_PandasConvertible):
 
         return pyarrow_wrap_table(result)
 
-    def filter(self, mask, object null_selection_behavior="drop"):
+    def filter(self, mask_or_expr, object null_selection_behavior="drop"):
         """
         Select rows from the table.
 
-        See :func:`pyarrow.compute.filter` for full usage.
+        The Table can be filtered based on a mask, which will be passed to
+        :func:`pyarrow.compute.filter` to perform the filtering, or it can
+        be filtered through a boolean :class:`.Expression`
 
         Parameters
         ----------
-        mask : Array or array-like
-            The boolean mask to filter the table with.
+        mask_or_expr : Array or array-like or .Expression
+            The boolean mask or the :class:`.Expression` to filter the table with.
         null_selection_behavior
-            How nulls in the mask should be handled.
+            How nulls in the mask should be handled, does nothing if
+            an :class:`.Expression` is used.

Review Comment:
   > I think that if you care about special handling nulls, you probably want to build an expression that evaluates as you wish for nulls
   
   I don't think it is possible to get the "emit null" behaviour by changing the expression (for dropping/keeping, you can explicitly fill the nulls with False/True, but preserving the row as null is only possible through this option). I suppose that is a good reason why this is an option of the filter kernel and not of, e.g., the comparison kernels.
   
   Anyway, this is not that important given that the "drop" behaviour is the default for both (and is the typical behaviour you want, I think), but it might be worth opening a JIRA to add `FilterOptions` to the `FilterNodeOptions` (cc @westonpace would that make sense?)
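
To illustrate the point about nulls, here is a minimal pure-Python sketch of the semantics, not pyarrow's implementation. The helpers `filter_rows` and `fill_null` are hypothetical names that model the filter kernel and a null-filling step on plain lists, with `None` standing in for null. Once the nulls in a mask are filled, every entry is True or False, so each row is either kept or dropped. Emitting a null row is a third outcome that only the filter step itself can produce.

```python
def filter_rows(values, mask, null_selection_behavior="drop"):
    """Toy model of filtering with a nullable boolean mask (None = null)."""
    if null_selection_behavior not in ("drop", "emit_null"):
        raise ValueError("expected 'drop' or 'emit_null'")
    out = []
    for value, keep in zip(values, mask):
        if keep is None:
            # A null mask entry: skip the row, or emit a null in its place.
            if null_selection_behavior == "emit_null":
                out.append(None)
        elif keep:
            out.append(value)
    return out


def fill_null(mask, fill_value):
    """Toy model of replacing nulls in a mask with a fixed boolean."""
    return [fill_value if m is None else m for m in mask]


values = [1, 2, 3]
mask = [True, None, False]

# Dropping or keeping null rows can be expressed by rewriting the mask.
dropped = filter_rows(values, fill_null(mask, False))
kept = filter_rows(values, fill_null(mask, True))

# Preserving the row as null needs the option on the filter itself.
emitted = filter_rows(values, mask, null_selection_behavior="emit_null")
```

In this model, no choice of fill value reproduces `emitted`, which is why the behaviour lives on the filter rather than on the expression that builds the mask.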


