Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/10/07 16:42:00 UTC

[jira] [Updated] (ARROW-10172) [Python] pyarrow.concat_arrays segfaults if a resulting StringArray's capacity overflows

     [ https://issues.apache.org/jira/browse/ARROW-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-10172:
---------------------------------
    Summary: [Python] pyarrow.concat_arrays segfaults if a resulting StringArray's capacity overflows  (was: [Python] cancat_arrays requires upcast for large array)

> [Python] pyarrow.concat_arrays segfaults if a resulting StringArray's capacity overflows
> ----------------------------------------------------------------------------------------
>
>                 Key: ARROW-10172
>                 URL: https://issues.apache.org/jira/browse/ARROW-10172
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 1.0.1
>            Reporter: Artem KOZHEVNIKOV
>            Priority: Major
>
> I'm sorry if this was already reported, but there is an overflow issue when concatenating large arrays:
> {code:python}
> In [1]: import pyarrow as pa
> In [2]: str_array = pa.array(['a' * 128] * 10**8)
> In [3]: large_array = pa.concat_arrays([str_array] * 50)
> Segmentation fault (core dumped)
> {code}
> I suppose that this should be handled by upcasting to large_string.
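
For context on why the capacity overflows, here is a rough sketch of the arithmetic, assuming the documented Arrow layout in which the string type stores 32-bit signed offsets and large_string stores 64-bit offsets. The helper name fits_in_string_offsets is hypothetical and is not part of pyarrow; the sizes are taken from the report above.

```python
# Largest byte offset a 32-bit signed integer can represent.
INT32_MAX = 2**31 - 1


def fits_in_string_offsets(total_value_bytes: int) -> bool:
    """Return True if the total character data fits under 32-bit offsets.

    Hypothetical helper for illustration only; not a pyarrow function.
    """
    return total_value_bytes <= INT32_MAX


# Sizes from the report: 128-byte strings, 10**8 per array, 50 arrays.
bytes_per_value = 128
values_per_array = 10**8
num_arrays = 50

total_bytes = bytes_per_value * values_per_array * num_arrays

print(total_bytes)                          # 640000000000
print(fits_in_string_offsets(total_bytes))  # False
```

Under that assumption, the concatenated data cannot be addressed with 32-bit offsets. Casting each input with .cast(pa.large_string()) before calling pa.concat_arrays may work around the crash, though that is not verified here.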


