Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/02/08 05:06:00 UTC
[jira] [Updated] (ARROW-4143) [Python] Skip rows while reading parquet file
[ https://issues.apache.org/jira/browse/ARROW-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-4143:
--------------------------------
Summary: [Python] Skip rows while reading parquet file (was: Skip rows while reading parquet file)
> [Python] Skip rows while reading parquet file
> ---------------------------------------------
>
> Key: ARROW-4143
> URL: https://issues.apache.org/jira/browse/ARROW-4143
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Developer Tools
> Reporter: Sanchit
> Priority: Minor
> Labels: newbie
>
> Is there any functionality in pyarrow that allows reading a file partially, for example only the first 10 rows of a parquet file?
> I ran into this while trying the following:
> `df = pd.read_parquet(path= 'filepath', nrows = 10)` # gave me an error
> I wanted to read just the first 10 rows into a pandas DataFrame using read_parquet (read_parquet uses pyarrow as one of its engines for reading parquet files). Since the parquet file is very large, is there any functionality we could add to the engine to read only the first n rows?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)