Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2018/06/27 20:35:00 UTC

[jira] [Assigned] (ARROW-2683) [Python] Resource Warning (Unclosed File) when using pyarrow.parquet.read_table()

     [ https://issues.apache.org/jira/browse/ARROW-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney reassigned ARROW-2683:
-----------------------------------

    Assignee: Krisztian Szucs

> [Python] Resource Warning (Unclosed File) when using pyarrow.parquet.read_table()
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-2683
>                 URL: https://issues.apache.org/jira/browse/ARROW-2683
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>            Reporter: Aldrin
>            Assignee: Krisztian Szucs
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>         Attachments: parquetread_test.py, simple.parquet
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> pyarrow version from python repl:
> {noformat}
> >>> import pyarrow
> >>> pyarrow.__version__
> '0.9.0.post1'{noformat}
> python interpreter information:
> {noformat}
> Python 3.6.5 (default, Mar 30 2018, 06:42:10)
> [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin{noformat}
> arbitrary, potentially relevant system information:
> {noformat}
> OS                       : macOS High Sierra (10.13.4)
> homebrew package         : python: stable 3.6.5 (bottled), devel 3.7.0b4, HEAD
> pip version              : pip 10.0.1
> pipenv version           : pipenv, version 2018.05.18
> pyarrow version (via pip): pyarrow         0.9.0.post1
> cython version (via pip) : Cython          0.28.2{noformat}
>  
> Issue Description:
> I see a ResourceWarning that doesn't appear to be an error, but it seemed important enough (a.k.a. annoying enough) to ask about. [~xhochy] was kind enough to respond in #general on the Arrow Slack.
> The main problem is as follows:
> # With this code in a Python unittest:
> {noformat}
> def test_arrow_from_parquet(self):
>     table = parquet.read_table(<path as str>){noformat}
> I see this warning:
> {noformat}
> ResourceWarning: unclosed file <_io.BufferedReader name=<path_to_file>>{noformat}
> # I tried adding the following, per Uwe's request:
> {noformat}
> warnings.simplefilter("error"){noformat}
> # I then see this information:
> {noformat}
> test_arrow_from_parquet (tests.datalayer_test.TestFileReader) ... Exception ignored in: <_io.FileIO name=<path_to_file> mode='rb' closefd=True>
> ResourceWarning: unclosed file <_io.BufferedReader name=<path_to_file>>{noformat}
> # Uwe's thoughts:
> {noformat}
> That could be a valid error. We don’t seem to close the file we open in `ParquetFile.__init__`{noformat}
>  
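The behaviour described in the report can be reproduced with the standard library alone. The sketch below does not use pyarrow and is not the Arrow implementation; `leaky_read`, `careful_read`, and `resource_warnings_from` are hypothetical helpers introduced only for illustration. It assumes CPython, where an open file object emits a ResourceWarning when it is finalized without having been closed. Because that warning is raised inside a finalizer, promoting it with warnings.simplefilter("error") cannot propagate an exception to the caller, which is consistent with the "Exception ignored in:" output quoted above.

```python
import gc
import os
import tempfile
import warnings


def leaky_read(path):
    # Hypothetical helper: opens the file and never closes it explicitly,
    # similar in spirit to the pattern suspected in the report.
    f = open(path, "rb")
    return f.read()


def careful_read(path):
    # Hypothetical helper: the context manager closes the file deterministically.
    with open(path, "rb") as f:
        return f.read()


def resource_warnings_from(func, path):
    # Record every ResourceWarning emitted while func(path) runs.
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default, so enable it explicitly.
        warnings.simplefilter("always", ResourceWarning)
        func(path)
        # Force finalization on interpreters without reference counting.
        gc.collect()
    return [w for w in caught if issubclass(w.category, ResourceWarning)]


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, b"example bytes")
    os.close(fd)
    try:
        print("leaky:", len(resource_warnings_from(leaky_read, demo_path)))
        print("careful:", len(resource_warnings_from(careful_read, demo_path)))
    finally:
        os.remove(demo_path)
```

Until a fix is released, one possible workaround (an assumption, not verified against 0.9.0) is to open the file in a `with` block and pass the file object to parquet.read_table, so that the caller controls when the handle is closed.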



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)