Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/12/16 09:23:00 UTC

[jira] [Comment Edited] (ARROW-10910) [Python] Segmentation Fault when None given to read_table with legacy dataset

    [ https://issues.apache.org/jira/browse/ARROW-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250197#comment-17250197 ] 

Joris Van den Bossche edited comment on ARROW-10910 at 12/16/20, 9:22 AM:
--------------------------------------------------------------------------

The same segfault can also be reproduced with the {{ParquetFile}} API, which is something we should certainly still fix, so I am reopening this issue:

{code}
In [1]: import pyarrow.parquet as pq

In [2]: pq.ParquetFile(None)
Segmentation fault (core dumped)
{code}


was (Author: jorisvandenbossche):
You can also get the same segfault with the {{ParquetFile}} API, so that is something we should certainly still fix:

{code}
In [1]: import pyarrow.parquet as pq

In [2]: pq.ParquetFile(None)
Segmentation fault (core dumped)
{code}

> [Python] Segmentation Fault when None given to read_table with legacy dataset
> -----------------------------------------------------------------------------
>
>                 Key: ARROW-10910
>                 URL: https://issues.apache.org/jira/browse/ARROW-10910
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.17.0
>         Environment: python: 3.8.3.final.0
> python-bits: 64
> OS: Linux
> OS-release: 5.4.0-56-generic
> machine: x86_64
> processor: x86_64
> byteorder: little
> LC_ALL: None
> LANG: en_US.UTF-8
> LOCALE: en_US.UTF-8
> pyarrow: 0.17.0
>            Reporter: Charles Burkland
>            Priority: Major
>              Labels: Bug:Generic, Python3, Segmenation_Fault, pyarrow
>             Fix For: 3.0.0
>
>
> h3. Code Sample (copy-pasteable)
> {code:python}
> import pyarrow.parquet as pq
> pq.read_table(None)
> {code}
> h3. Description
> The above snippet produces a segmentation fault, which is highly undesirable. I discovered this because I had a function that was supposed to return a file path, but in my first iteration I forgot the return statement, so it returned *None*. Thus, when I ran my module with
> {code:python}
> pq.read_table(generate_fp()){code}
> it produced a Segmentation Fault.
> h3. Expected Output
> Ideally this would raise a *ValueError*, indicating to the user that *None* is an invalid source/file path. In my opinion, this is much more desirable than a violent segfault.
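
Until a fix is available in the installed pyarrow version, a caller-side guard can turn the crash described above into an ordinary exception. The following is only a minimal sketch, not the fix applied in pyarrow itself: {{require_source}} and {{safe_read_table}} are hypothetical helper names introduced here for illustration, and the only pyarrow call assumed is {{pyarrow.parquet.read_table}}.

```python
from typing import Any


def require_source(source: Any) -> Any:
    """Return `source` unchanged, or raise ValueError if it is None.

    Hypothetical helper (not part of pyarrow).
    """
    if source is None:
        raise ValueError(
            "source is None; expected a file path or file-like object "
            "(did a helper function forget to return its path?)"
        )
    return source


def safe_read_table(source: Any, **kwargs: Any):
    """Validate `source`, then delegate to pyarrow.parquet.read_table.

    Hypothetical wrapper (not part of pyarrow).
    """
    checked = require_source(source)  # runs before pyarrow is touched
    import pyarrow.parquet as pq  # imported lazily, only after the check passed
    return pq.read_table(checked, **kwargs)
```

With this guard in place, a call such as {{safe_read_table(generate_fp())}} would raise a *ValueError* in Python when {{generate_fp()}} returns *None*, instead of passing *None* down to the native reader. The same check could be placed in front of {{pq.ParquetFile}}.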


