Posted to issues@arrow.apache.org by "vyasr (via GitHub)" <gi...@apache.org> on 2024/03/25 18:40:51 UTC

[I] replace_with_mask does not properly handle chunked target array [arrow]

vyasr opened a new issue, #40780:
URL: https://github.com/apache/arrow/issues/40780

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   `pyarrow.compute.replace_with_mask` requires that the inputs not be chunked arrays. The default calling pattern works as expected:
   ```
   >>> import pyarrow as pa
   >>> import pyarrow.compute
   >>> pa.compute.replace_with_mask(pa.array([False]), pa.array([True]), pa.array([True]))
   <pyarrow.lib.BooleanArray object at 0x7f30ccbc8940>
   [
     true
   ]
   ```
   while specifying the replacement values as a chunked array raises a comprehensible exception:
   ```
   >>> pa.compute.replace_with_mask(pa.array([False]), pa.array([True]), pa.chunked_array([[True]]))
   Traceback (most recent call last):
     File "/home/coder/.conda/envs/rapids/lib/python3.10/code.py", line 90, in runcode
       exec(code, self.locals)
     File "<console>", line 1, in <module>
     File "/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/pyarrow/compute.py", line 246, in wrapper
       return func.call(args, None, memory_pool)
     File "pyarrow/_compute.pyx", line 385, in pyarrow._compute.Function.call
     File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
   pyarrow.lib.ArrowInvalid: Replacements must be array or scalar, not ChunkedArray([
     [
       true
     ]
   ])
   ```
   However, passing the values (the first argument) as a chunked array raises no error and instead silently produces an invalid output:
   ```
   >>> pa.compute.replace_with_mask(pa.chunked_array([[False]]), pa.array([True]), pa.array([True]))
   <pyarrow.lib.ChunkedArray object at 0x7f30ccbcc5e0>
   [
   <Invalid array: Buffer #1 too small in array of type bool and length 1: expected at least 1 byte(s), got 0>
   ]
   ```
   
   The latter case should also be handled by the same error-checking logic used to validate the replacements (unless chunked arrays can in fact be supported, which would be nice but isn't too important).
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org