Posted to dev@parquet.apache.org by "Tham (JIRA)" <ji...@apache.org> on 2019/03/12 10:55:00 UTC

[jira] [Commented] (PARQUET-1022) [C++] Append mode in parquet-cpp

    [ https://issues.apache.org/jira/browse/PARQUET-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790435#comment-16790435 ] 

Tham commented on PARQUET-1022:
-------------------------------

Our application would benefit from this feature. Our use case is crash recovery: if the application crashes (admittedly, no system is perfect), we cannot close the file that is open at that moment. We would like to be able to open and close the same file multiple times, opening it only when we have data to write and closing it right afterwards. That way, we reduce the risk of losing data. Any suggestions?
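
As a rough illustration of the open/write/close pattern described above, here is a minimal sketch that does not rely on append support: each batch goes to its own file, which is written under a temporary name and renamed only after it has been closed. The function name WriteBatchAtomically, the part-N.parquet naming scheme, and the payload parameter (standing in for the bytes a real Parquet writer would produce) are all hypothetical and not part of parquet-cpp.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes one batch to "<dir>/part-<seq>.parquet".
// The data is first written to a ".tmp" file and renamed only after the
// stream has been closed, so a crash mid-write leaves at most a stray
// ".tmp" file and never a half-written part file.
// `payload` is a stand-in for the output of a real Parquet writer.
bool WriteBatchAtomically(const std::string& dir, int seq,
                          const std::string& payload) {
  const std::string final_path =
      dir + "/part-" + std::to_string(seq) + ".parquet";
  const std::string tmp_path = final_path + ".tmp";
  {
    std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out.close();
    if (!out) {
      // Writing or closing failed: discard the partial temporary file.
      std::remove(tmp_path.c_str());
      return false;
    }
  }
  // Publish the finished file under its final name.
  return std::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}
```

With this approach, a reader would treat the directory of part files as one dataset; whether that is acceptable depends on the application.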

> [C++] Append mode in parquet-cpp
> --------------------------------
>
>                 Key: PARQUET-1022
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1022
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-cpp
>    Affects Versions: cpp-1.1.0
>            Reporter: yugu
>            Assignee: Wes McKinney
>            Priority: Major
>
> As the title says, I am currently trying to work out an append feature for Parquet files in C++.
> (I have been searching through the repo etc., but can't find an example.)
> My current solution (assuming no schema changes) is to:
> Read in the metadata
> Update the metadata based on the appended rows + original rows
> Append a new row group (or multiple row group writers)
> Write the new rows.
> ---
> The problem is that, if approached this way, the original last row group may not be completely filled. I was wondering if there is a fix, or whether I'm using the API wrong...
> Thanks ! : D
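
For context on the "read in metadata" step quoted above: a Parquet file ends with the serialized file metadata (the footer), followed by a 4-byte little-endian footer length and the 4-byte magic "PAR1". The sketch below locates that trailer using only the standard library; it does not use or represent the parquet-cpp API, and the function name ReadFooterLength is hypothetical. It also does not decode the Thrift-encoded metadata itself.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: returns the footer (file metadata) length in bytes,
// or -1 if the file is too short or does not end with the "PAR1" magic.
long long ReadFooterLength(const std::string& path) {
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return -1;
  const std::streamoff size = in.tellg();
  // Smallest plausible file: leading magic (4) + length (4) + trailing magic (4).
  if (size < 12) return -1;

  unsigned char tail[8];
  in.seekg(size - 8);
  if (!in.read(reinterpret_cast<char*>(tail), 8)) return -1;
  if (tail[4] != 'P' || tail[5] != 'A' || tail[6] != 'R' || tail[7] != '1') {
    return -1;
  }

  // Decode the 4-byte little-endian footer length.
  const std::uint32_t len = static_cast<std::uint32_t>(tail[0]) |
                            (static_cast<std::uint32_t>(tail[1]) << 8) |
                            (static_cast<std::uint32_t>(tail[2]) << 16) |
                            (static_cast<std::uint32_t>(tail[3]) << 24);
  // The footer must fit between the leading magic and the 8-byte trailer.
  if (static_cast<std::streamoff>(len) > size - 12) return -1;
  return static_cast<long long>(len);
}
```

Because the footer sits after all row groups, appending a row group in place means writing the new data over the old footer and then writing a new footer that describes both the old and the new row groups.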



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)