Posted to dev@parquet.apache.org by "Steven Paster (JIRA)" <ji...@apache.org> on 2018/11/29 09:27:00 UTC
[jira] [Created] (PARQUET-1465) CLONE - Add a way to append encoded blocks in ParquetFileWriter
Steven Paster created PARQUET-1465:
--------------------------------------
Summary: CLONE - Add a way to append encoded blocks in ParquetFileWriter
Key: PARQUET-1465
URL: https://issues.apache.org/jira/browse/PARQUET-1465
Project: Parquet
Issue Type: New Feature
Components: parquet-mr
Affects Versions: 1.8.0
Reporter: Steven Paster
Assignee: Ryan Blue
Fix For: 1.9.0, 1.8.2
Concatenating two files currently requires reading the source files and rewriting their content from scratch. This uses a lot of memory, even when the data is already encoded correctly and the blocks only need to be appended and have their metadata updated. Merging two files should be fast and should not take much memory.
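To illustrate the idea of appending encoded blocks and updating only their metadata, here is a minimal sketch in plain Java. It is a toy model, not the Parquet file format and not the parquet-mr API: BlockAppendSketch, BlockMeta, ToyFile, and ToyWriter are all hypothetical names invented for this example, and the "footer" is just an in-memory list. The point it shows is that the block bytes are copied verbatim and only the offsets are recomputed, so nothing is decoded or re-encoded.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are parquet-mr classes.
final class BlockAppendSketch {

    /** Footer entry for one encoded block: its position, size, and row count. */
    static final class BlockMeta {
        final int offset;
        final int length;
        final long rowCount;

        BlockMeta(int offset, int length, long rowCount) {
            this.offset = offset;
            this.length = length;
            this.rowCount = rowCount;
        }
    }

    /** A finished toy file: encoded block bytes plus the footer describing them. */
    static final class ToyFile {
        final byte[] data;
        final List<BlockMeta> blocks;

        ToyFile(byte[] data, List<BlockMeta> blocks) {
            this.data = data;
            this.blocks = blocks;
        }

        long totalRows() {
            long rows = 0;
            for (BlockMeta b : blocks) rows += b.rowCount;
            return rows;
        }
    }

    /** Writer that copies encoded blocks verbatim and rewrites only their offsets. */
    static final class ToyWriter {
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();
        private final List<BlockMeta> footer = new ArrayList<>();

        /** Append one already-encoded block (hypothetical helper for building inputs). */
        void appendBlock(byte[] encoded, long rowCount) {
            footer.add(new BlockMeta(out.size(), encoded.length, rowCount));
            out.write(encoded, 0, encoded.length);
        }

        /** Append every block of an existing file without decoding its contents. */
        void appendFile(ToyFile src) {
            for (BlockMeta b : src.blocks) {
                // Only the offset changes; length and row count carry over as-is.
                footer.add(new BlockMeta(out.size(), b.length, b.rowCount));
                out.write(src.data, b.offset, b.length);
            }
        }

        /** Finish writing and return the bytes together with the updated footer. */
        ToyFile end() {
            return new ToyFile(out.toByteArray(), new ArrayList<>(footer));
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ToyWriter a = new ToyWriter();
        a.appendBlock(bytes("AAAA"), 4);

        ToyWriter b = new ToyWriter();
        b.appendBlock(bytes("BB"), 2);
        b.appendBlock(bytes("CCC"), 3);

        // Merge the two files by appending their blocks in order.
        ToyWriter merged = new ToyWriter();
        merged.appendFile(a.end());
        merged.appendFile(b.end());
        ToyFile result = merged.end();

        // prints "3 blocks, 9 rows"
        System.out.println(result.blocks.size() + " blocks, " + result.totalRows() + " rows");
    }
}
```

In this sketch the extra memory needed for a merge is bounded by the size of one block copy plus the footer entries, which is the property the issue asks for. A real implementation would also have to reconcile schemas and file-level metadata across the inputs, which the toy model ignores.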