Posted to dev@arrow.apache.org by "Adam Lippai (Jira)" <ji...@apache.org> on 2019/10/02 22:40:00 UTC
[jira] [Created] (ARROW-6774) Reading parquet file is slow
Adam Lippai created ARROW-6774:
----------------------------------
Summary: Reading parquet file is slow
Key: ARROW-6774
URL: https://issues.apache.org/jira/browse/ARROW-6774
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Affects Versions: 0.15.0
Reporter: Adam Lippai
Reading a parquet file with the example at [https://github.com/apache/arrow/tree/master/rust/parquet] is slow.
The snippet
{code:none}
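use std::time::Instant;
use parquet::file::reader::{FileReader, SerializedFileReader};

// `file` is an already opened std::fs::File for the parquet file.
let reader = SerializedFileReader::new(file).unwrap();
let mut iter = reader.get_row_iter(None).unwrap();
let start = Instant::now();
// Drain the row iterator without using the records; only the iteration is timed.
while let Some(_record) = iter.next() {}
let duration = start.elapsed();
println!("{:?}", duration);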
{code}
It runs for 17 seconds on a ~160 MB parquet file.
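For scale, the figures above work out to roughly 9.4 MB/s. Below is a minimal, standard-library-only sketch of that measurement and arithmetic. The helpers `time_drain` and `throughput_mb_per_s` are hypothetical names made up for illustration and are not part of the parquet crate, a plain integer range stands in for the row iterator, and 1 MB is assumed to mean 1,000,000 bytes.

```rust
use std::time::{Duration, Instant};

/// Drains any iterator and returns how many items it yielded and how long that took.
/// Hypothetical helper for illustration; not part of the parquet crate.
fn time_drain<I: Iterator>(iter: I) -> (usize, Duration) {
    let start = Instant::now();
    let count = iter.count();
    (count, start.elapsed())
}

/// Throughput in MB/s for `bytes` processed in `elapsed`,
/// assuming 1 MB = 1_000_000 bytes.
fn throughput_mb_per_s(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / 1_000_000.0 / elapsed.as_secs_f64()
}

fn main() {
    // Stand-in for the row iterator; any Iterator can be passed here.
    let (rows, elapsed) = time_drain(0..1_000_000u32);
    println!("{} rows in {:?}", rows, elapsed);

    // The figures from this report: ~160 MB read in 17 seconds.
    let reported = throughput_mb_per_s(160_000_000, Duration::from_secs(17));
    println!("{:.1} MB/s", reported);
}
```

Passing the real row iterator to `time_drain` instead of the range would give the row count along with the duration, which makes runs on different files easier to compare.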
If there is a more efficient way to load a parquet file, it would be nice to add it to the README.
P.S.: My goal is to construct an ndarray from the file, and I'd be happy to get any tips.