Posted to commits@arrow.apache.org by jo...@apache.org on 2021/02/05 16:25:52 UTC

[arrow] branch master updated: ARROW-11412: [Python] Improve Expression docs

This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 6a1687b  ARROW-11412: [Python] Improve Expression docs
6a1687b is described below

commit 6a1687b59999a674d4d11b6e91c3fe5746f1b71c
Author: Roman Karlstetter <11...@users.noreply.github.com>
AuthorDate: Fri Feb 5 17:25:01 2021 +0100

    ARROW-11412: [Python] Improve Expression docs
    
    Slightly improve the documentation on how to create and combine
    expressions for filtering datasets, including one example.
    
    Closes #9351 from romankarlstetter/master
    
    Authored-by: Roman Karlstetter <11...@users.noreply.github.com>
    Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
 docs/source/python/dataset.rst |  4 +++-
 python/pyarrow/_dataset.pyx    | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 2 deletions(-)
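
Background for the note this patch adds: Python's ``and``, ``or`` and ``not``
keywords cannot be overloaded. They only ask an object for its truth value,
whereas ``&``, ``|`` and ``~`` dispatch to ``__and__``, ``__or__`` and
``__invert__``. The sketch below illustrates that mechanism with a
hypothetical ``Expr`` class. It is a toy stand-in, not pyarrow's
implementation, and the choice to raise ``TypeError`` from ``__bool__`` is an
assumption made for this example only.

```python
class Expr:
    """Hypothetical expression node, used only to illustrate operator dispatch."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        # Called for `self & other`.
        return Expr(f"({self.text} and {other.text})")

    def __or__(self, other):
        # Called for `self | other`.
        return Expr(f"({self.text} or {other.text})")

    def __invert__(self):
        # Called for `~self`.
        return Expr(f"(not {self.text})")

    def __bool__(self):
        # `and`, `or` and `not` end up here; they cannot build a new Expr.
        raise TypeError("use &, | or ~ to combine expressions")

    def __repr__(self):
        return f"<Expr {self.text}>"


a = Expr("a < 3")
b = Expr("b > 7")

# The overloaded operators build a combined expression tree.
combined = (a & ~b) | b
print(combined)

# The keyword form only asks for the truth value of `a`, which this toy
# class rejects explicitly.
try:
    a and b
except TypeError as exc:
    keyword_error = str(exc)
```

Without a ``__bool__`` that raises, ``a and b`` would silently evaluate to
``b``, because objects are truthy by default. Note also that ``&`` and ``|``
bind more tightly than comparison operators, which is why the comparisons in
the documented examples are wrapped in parentheses.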

diff --git a/docs/source/python/dataset.rst b/docs/source/python/dataset.rst
index e77c611..29503af 100644
--- a/docs/source/python/dataset.rst
+++ b/docs/source/python/dataset.rst
@@ -182,7 +182,7 @@ The easiest way to construct those :class:`Expression` objects is by using the
 referenced using the :func:`field` function (which creates a
 :class:`FieldExpression`). Operator overloads are provided to compose filters
 including the comparisons (equal, larger/less than, etc), set membership
-testing, and boolean combinations (and, or, not):
+testing, and boolean combinations (``&``, ``|``, ``~``):
 
 .. ipython:: python
 
@@ -190,6 +190,8 @@ testing, and boolean combinations (and, or, not):
     ds.field('a').isin([1, 2, 3])
     (ds.field('a') > ds.field('b')) & (ds.field('b') > 1)
 
+Note that :class:`Expression` objects can **not** be combined with Python's
+logical operators ``and``, ``or`` and ``not``.
 
 Reading partitioned data
 ------------------------
diff --git a/python/pyarrow/_dataset.pyx b/python/pyarrow/_dataset.pyx
index 151ae81..087d421 100644
--- a/python/pyarrow/_dataset.pyx
+++ b/python/pyarrow/_dataset.pyx
@@ -84,7 +84,37 @@ cdef CFileSource _make_file_source(object file, FileSystem filesystem=None):
 
 
 cdef class Expression(_Weakrefable):
-
+    """
+    A logical expression to be evaluated against some input.
+
+    To create an expression:
+
+    - Use the factory function ``pyarrow.dataset.scalar()`` to create a
+      scalar (not necessary when combined, see example below).
+    - Use the factory function ``pyarrow.dataset.field()`` to reference
+      a field (column in table).
+    - Compare fields and scalars with ``<``, ``<=``, ``==``, ``>=``, ``>``.
+    - Combine expressions using Python operators ``&`` (logical and),
+      ``|`` (logical or) and ``~`` (logical not).
+      Note: Python keywords ``and``, ``or`` and ``not`` cannot be used
+      to combine expressions.
+    - Check whether the expression is contained in a list of values with
+      the ``pyarrow.dataset.Expression.isin()`` member function.
+
+    Examples
+    --------
+    >>> import pyarrow.dataset as ds
+    >>> (ds.field("a") < ds.scalar(3)) | (ds.field("b") > 7)
+    <pyarrow.dataset.Expression ((a < 3:int64) or (b > 7:int64))>
+    >>> ds.field('a') != 3
+    <pyarrow.dataset.Expression (a != 3)>
+    >>> ds.field('a').isin([1, 2, 3])
+    <pyarrow.dataset.Expression (a is in [
+      1,
+      2,
+      3
+    ])>
+    """
     cdef:
         CExpression expr