Posted to dev@parquet.apache.org by "Steven Paster (JIRA)" <ji...@apache.org> on 2018/11/29 09:27:00 UTC

[jira] [Created] (PARQUET-1465) CLONE - Add a way to append encoded blocks in ParquetFileWriter

Steven Paster created PARQUET-1465:
--------------------------------------

             Summary: CLONE - Add a way to append encoded blocks in ParquetFileWriter
                 Key: PARQUET-1465
                 URL: https://issues.apache.org/jira/browse/PARQUET-1465
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-mr
    Affects Versions: 1.8.0
            Reporter: Steven Paster
            Assignee: Ryan Blue
             Fix For: 1.9.0, 1.8.2


Concatenating two files currently requires reading the source files and rewriting their content from scratch. This takes a lot of memory, even when the data is already encoded correctly and the blocks only need to be appended and have their metadata updated. Merging two files should be fast and should not take much memory.
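
To illustrate the idea, here is a minimal sketch of appending already-encoded blocks without decoding them. It is a toy model, not parquet-mr code: BlockAppendSketch, ToyFile, BlockMeta and concat are hypothetical names invented for this example, and the "file" is just a byte array plus a list of footer entries. The point it shows is that each block's bytes are copied verbatim and only the footer offsets are rewritten.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names come from parquet-mr.
class BlockAppendSketch {

    /** Footer entry: where one already-encoded block sits and how many rows it holds. */
    static final class BlockMeta {
        final int offset;
        final int length;
        final long rowCount;

        BlockMeta(int offset, int length, long rowCount) {
            this.offset = offset;
            this.length = length;
            this.rowCount = rowCount;
        }
    }

    /** A "file": opaque encoded bytes plus the footer that indexes them. */
    static final class ToyFile {
        final byte[] data;
        final List<BlockMeta> footer;

        ToyFile(byte[] data, List<BlockMeta> footer) {
            this.data = data;
            this.footer = footer;
        }

        long totalRows() {
            long rows = 0;
            for (BlockMeta b : footer) {
                rows += b.rowCount;
            }
            return rows;
        }
    }

    /** Appends every block of every source as-is; only the offsets change. */
    static ToyFile concat(List<ToyFile> sources) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<BlockMeta> merged = new ArrayList<>();
        for (ToyFile src : sources) {
            for (BlockMeta b : src.footer) {
                int newOffset = out.size();              // position in the merged output
                out.write(src.data, b.offset, b.length); // raw copy, no decode or re-encode
                merged.add(new BlockMeta(newOffset, b.length, b.rowCount));
            }
        }
        return new ToyFile(out.toByteArray(), merged);
    }
}
```

The sketch holds everything in memory for brevity. A real implementation would presumably stream each block from the source file to the output, so that memory use is bounded by a copy buffer rather than by the size of the data.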



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)