Posted to dev@parquet.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2017/06/22 11:32:00 UTC

[jira] [Commented] (PARQUET-1037) Allow final RowGroup to be unfilled

    [ https://issues.apache.org/jira/browse/PARQUET-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16059204#comment-16059204 ] 

Uwe L. Korn commented on PARQUET-1037:
--------------------------------------

[~tobyshaw] Can you give us more insight into your actual use case here? My gut feeling would be to close this as "should be handled in the code of the caller", but there might be a situation where this may work. I'm not sure from the title what the situation actually is.

One thing to keep in mind is that we currently encode the data of a column when we move on to the next column. Leaving a column unfilled is not possible with this approach, because the encoding process needs to know the actual data. Still, to propose a solution, I need to understand the context of the problem.
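
To illustrate what "handled in the code of the caller" could look like, here is a minimal sketch in Python. It is a toy model only: ToyRowGroupWriter, append_row_group and write_buffered are hypothetical names made up for this example and are not part of the parquet-cpp API. The assumption is that the caller can buffer rows in memory until the size of a row group is known, and only then hand the exact count to the writer.

```python
from typing import Dict, Iterable, List, Tuple


class ToyRowGroupWriter:
    """Hypothetical stand-in for a writer that needs the row count up front."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns
        self.row_groups: List[Tuple[int, Dict[str, list]]] = []

    def append_row_group(self, num_rows: int, data: Dict[str, list]) -> None:
        # Columns are handled one after another; each must hold exactly
        # num_rows values, otherwise the row group is rejected.
        for name in self.columns:
            if len(data[name]) != num_rows:
                raise ValueError(
                    f"column {name!r} has {len(data[name])} rows, expected {num_rows}"
                )
        copied = {name: list(data[name]) for name in self.columns}
        self.row_groups.append((num_rows, copied))


def write_buffered(
    writer: ToyRowGroupWriter,
    rows: Iterable[Dict[str, object]],
    max_rows_per_group: int,
) -> None:
    """Buffer rows in the caller and flush a row group once its size is known."""
    buffer: Dict[str, list] = {name: [] for name in writer.columns}
    buffered = 0
    for row in rows:
        for name in writer.columns:
            buffer[name].append(row[name])
        buffered += 1
        if buffered == max_rows_per_group:
            writer.append_row_group(buffered, buffer)
            buffer = {name: [] for name in writer.columns}
            buffered = 0
    if buffered:
        # The final row group is simply smaller; its exact size is known here.
        writer.append_row_group(buffered, buffer)


writer = ToyRowGroupWriter(["id", "value"])
# A generator: the total number of rows is not known in advance.
rows = ({"id": i, "value": i * 0.5} for i in range(7))
write_buffered(writer, rows, max_rows_per_group=3)
row_group_sizes = [num_rows for num_rows, _ in writer.row_groups]
```

With this approach the writer never sees a partially filled row group, at the cost of holding up to max_rows_per_group rows in memory on the caller's side. Whether that trade-off is acceptable depends on the actual use case.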

> Allow final RowGroup to be unfilled
> -----------------------------------
>
>                 Key: PARQUET-1037
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1037
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-cpp
>            Reporter: Toby Shaw
>
> When a RowGroup is added with AppendRowGroup, it must be filled with exactly the number of rows specified, or an exception will be thrown.
> It should be possible to go back and modify the RowGroupSize metadata after a column has completed prematurely. This would be useful in scenarios where the total number of rows to be written is not known in advance.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)