Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2021/07/22 17:32:00 UTC

[jira] [Created] (ARROW-13436) [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns

Weston Pace created ARROW-13436:
-----------------------------------

             Summary: [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns
                 Key: ARROW-13436
                 URL: https://issues.apache.org/jira/browse/ARROW-13436
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Weston Pace


The documentation for pyarrow.parquet.read_table states:

 
 * *columns* (_list_) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.

 

It is not clear what the expected result is when columns is an empty list.  In pyarrow 3.0 an empty list read in all columns (as long as use_legacy_dataset=False).  In pyarrow 4.0 it reads in no columns.  I think the 4.0 behavior (not reading in any columns) is correct, since None can already be used to request all columns, but we should state that explicitly in the docs.
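
The distinction between None and an empty list can be sketched in plain Python. This is only an illustration of the semantics proposed above, not pyarrow code: the helper name select_columns and the flattened field names are assumptions made for the example, and the prefix rule is modeled loosely on the documentation quoted earlier.

```python
def select_columns(available, columns=None):
    """Hypothetical helper (not part of pyarrow) modeling the proposed rule.

    available: flattened field names, e.g. "a.b" for nested field b of a.
    columns:   None to keep everything, or a list of names/prefixes to keep.
    """
    if columns is None:
        # None means "no filter": every column is selected.
        return list(available)
    # Any list, including an empty one, is an explicit filter.
    # A requested name matches itself or any nested field beneath it.
    return [
        name
        for name in available
        if any(name == c or name.startswith(c + ".") for c in columns)
    ]


fields = ["a.b", "a.c", "a.d.e", "x"]

print(select_columns(fields, None))   # all columns
print(select_columns(fields, ["a"]))  # nested fields under 'a'
print(select_columns(fields, []))     # no columns
```

The key design point is testing "columns is None" rather than the truthiness of columns: a truthiness check would treat an empty list the same as None and select every column.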



--
This message was sent by Atlassian Jira
(v8.3.4#803005)