Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/12/21 16:11:00 UTC
[jira] [Created] (ARROW-11000) [Python] Enable random access reading for Python file objects (if supported)
Joris Van den Bossche created ARROW-11000:
---------------------------------------------
Summary: [Python] Enable random access reading for Python file objects (if supported)
Key: ARROW-11000
URL: https://issues.apache.org/jira/browse/ARROW-11000
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Joris Van den Bossche
{{arrow::py::PyReadableFile::ReadAt}} is commented as thread-safe (it takes a lock on the underlying Python file) and should therefore allow random access from parallel code (for example, reading a subset, e.g. a single column, of a Parquet file).
However, based on experimentation, this does not seem to work (e.g. when reading a specific Parquet column through an s3fs filesystem).
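
For illustration, here is a minimal pure-Python sketch of the locking pattern described above: a positional read that holds a lock around seek() and read() on a seekable file object, so that several threads can request different byte ranges. This is not the Arrow implementation; the class name {{LockedRandomAccessReader}} and its {{read_at}} method are hypothetical, and an in-memory {{io.BytesIO}} stands in for a real (e.g. s3fs) file.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedRandomAccessReader:
    """Hypothetical wrapper giving positional reads on a seekable file object."""

    def __init__(self, fileobj):
        # Random access is only possible if the file object can seek.
        if not fileobj.seekable():
            raise ValueError("file object does not support random access")
        self._file = fileobj
        self._lock = threading.Lock()

    def read_at(self, offset, nbytes):
        # seek() and read() share a single file position, so the pair
        # must run under the lock to be safe across threads.
        with self._lock:
            self._file.seek(offset)
            return self._file.read(nbytes)


# Stand-in for a remote file: 1024 bytes of in-memory data.
data = bytes(range(256)) * 4
reader = LockedRandomAccessReader(io.BytesIO(data))

# Request several (offset, length) ranges from parallel threads.
ranges = [(0, 16), (100, 32), (512, 64), (1000, 24)]
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(lambda r: reader.read_at(*r), ranges))
```

Because the lock serializes each seek/read pair, the reads themselves do not overlap; the point of the pattern is only that each caller gets exactly the byte range it asked for, rather than the whole file being read sequentially.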
--
This message was sent by Atlassian Jira
(v8.3.4#803005)