Posted to github@arrow.apache.org by "jorisvandenbossche (via GitHub)" <gi...@apache.org> on 2023/03/15 11:51:53 UTC

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #34570: GH-34568: [C++][Python] Make RunEndEncodeOptions available to Python Arrow

jorisvandenbossche commented on code in PR #34570:
URL: https://github.com/apache/arrow/pull/34570#discussion_r1136795600


##########
python/pyarrow/_compute.pyx:
##########
@@ -1335,6 +1335,27 @@ class DictionaryEncodeOptions(_DictionaryEncodeOptions):
         self._set_options(null_encoding)
 
 
+cdef class _RunEndEncodeOptions(FunctionOptions):
+    def _set_options(self, run_end_type):
+        self.wrapped.reset(new CRunEndEncodeOptions(pyarrow_unwrap_data_type(run_end_type)))

Review Comment:
   ```suggestion
           self.wrapped.reset(new CRunEndEncodeOptions(pyarrow_unwrap_data_type(ensure_type(run_end_type))))
   ```
   
   This ensures the type can also be passed as a string alias, and raises a proper error when something that is not a type is passed (otherwise `pyarrow_unwrap_data_type` would return a null pointer, potentially crashing the encoding?)
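
A minimal sketch of the normalization step suggested above, written in plain Python so it runs without pyarrow. `FakeDataType`, `ALIASES`, and `ensure_type_sketch` are hypothetical stand-ins for illustration only, not pyarrow APIs, and the error messages are assumptions. The point is that validation happens before anything is handed to the C++ layer:

```python
class FakeDataType:
    """Hypothetical stand-in for a pyarrow DataType (not a pyarrow class)."""

    def __init__(self, name):
        self.name = name


# Hypothetical alias table; a real one would cover many more types.
ALIASES = {name: FakeDataType(name) for name in ("int16", "int32", "int64")}


def ensure_type_sketch(ty):
    """Return a type object for `ty`, or raise instead of passing junk along."""
    if isinstance(ty, FakeDataType):
        # Already a type object: pass it through unchanged.
        return ty
    if isinstance(ty, str):
        # String alias such as "int64": look it up, fail loudly if unknown.
        try:
            return ALIASES[ty]
        except KeyError:
            raise ValueError(f"No type alias for {ty}") from None
    # Anything else is rejected here, before it could be unwrapped to null.
    raise TypeError(f"DataType expected, got {type(ty)!r}")
```

With this kind of guard, both `ensure_type_sketch("int64")` and `ensure_type_sketch(ALIASES["int64"])` yield the same type object, while `ensure_type_sketch(42)` raises a `TypeError` rather than silently producing an invalid value.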



##########
python/pyarrow/tests/test_compute.py:
##########
@@ -3108,3 +3109,11 @@ def test_list_slice_bad_parameters():
         pc.list_slice(arr, 0, 1, step=0)
     with pytest.raises(pa.ArrowInvalid, match=msg + "-1"):
         pc.list_slice(arr, 0, 1, step=-1)
+
+
+def test_run_end_encode():
+    arr = pa.array([1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3])
+    encoded = pc.run_end_encode(arr, options=pc.RunEndEncodeOptions(pa.int64()))

Review Comment:
   So this actually fails at the moment, because we can't yet put a run-end encoded (REE) C++ array in a `pyarrow.Array` object (`pyarrow_wrap_array` fails here).
   
   In the short term, we could maybe just put it in the base class `Array` (I don't know if there is already an issue about adding proper Python bindings?)
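
For reference, a pure-Python sketch of what run-end encoding should produce for the array in `test_run_end_encode`, assuming the usual REE layout of cumulative run ends plus one value per run. `run_end_encode_sketch` is a hypothetical helper, not a pyarrow function, and it ignores nulls and the run-end integer type:

```python
def run_end_encode_sketch(values):
    """Return (run_ends, run_values) for a list of comparable values."""
    run_ends = []
    run_values = []
    for i, v in enumerate(values):
        if run_values and run_values[-1] == v:
            # Same value as the current run: extend the run's exclusive end.
            run_ends[-1] = i + 1
        else:
            # New run starts at index i and currently ends at i + 1.
            run_values.append(v)
            run_ends.append(i + 1)
    return run_ends, run_values


data = [1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3]
run_ends, run_values = run_end_encode_sketch(data)
print(run_ends)    # [3, 5, 10, 19]
print(run_values)  # [1, 2, 1, 3]
```

Once the result can be wrapped on the Python side, the test could compare against these run ends and values.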


