Posted to jira@arrow.apache.org by "Kevin Crouse (Jira)" <ji...@apache.org> on 2022/04/04 16:39:00 UTC

[jira] [Comment Edited] (ARROW-15928) [C++][Python] replace_with_mask seg faults when passed ChunkedArray

    [ https://issues.apache.org/jira/browse/ARROW-15928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516938#comment-17516938 ] 

Kevin Crouse edited comment on ARROW-15928 at 4/4/22 4:38 PM:
--------------------------------------------------------------

[~lidavidm], as I noted in my comment above, I believe this is not a ChunkedArray problem: I see the same issue with regular Arrow arrays. I also provided an example that consistently seg faults on our machine, but if the cause is a memory overflow you may not get the same result.


was (Author: JIRAUSER286896):
[~lidavidm] , as I noted in my comment above I believe this is not a ChunkedArray problem and I have the same issue with regular arrays. I also provided an example that seg faults on our machine, but if it's a memory overflow you may not have the same result.

> [C++][Python] replace_with_mask seg faults when passed ChunkedArray
> -------------------------------------------------------------------
>
>                 Key: ARROW-15928
>                 URL: https://issues.apache.org/jira/browse/ARROW-15928
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 5.0.0, 6.0.0, 6.0.1, 7.0.0
>            Reporter: Luke Manley
>            Priority: Critical
>              Labels: good-second-issue, kernel
>             Fix For: 8.0.0
>
>
> It looks like most compute functions that take an array-like can also accept a ChunkedArray. One exception appears to be replace_with_mask, which seems to seg fault when the array-like is a ChunkedArray. Here is an example:
>
> {code:python}
> import pyarrow as pa
> import pyarrow.compute as pc
>
> ca = pa.chunked_array([[1, 2], [3, 4]])
> mask = [False, True, False, True]
>
> # works (when we first combine chunks into a single array)
> a_new = pc.replace_with_mask(ca.combine_chunks(), mask, 0)
>
> # seg fault (if we try to pass the chunked array)
> ca_new = pc.replace_with_mask(ca, mask, 0)
> {code}
>  


