Posted to dev@parquet.apache.org by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/08/16 07:38:36 UTC

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #507: PARQUET-1364: Invalid row indexes for pages starting with nulls

gszadovszky commented on code in PR #507:
URL: https://github.com/apache/parquet-mr/pull/507#discussion_r1295498921


##########
parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnWriterBase.java:
##########
@@ -84,6 +84,10 @@ private void definitionLevel(int definitionLevel) {
 
   private void repetitionLevel(int repetitionLevel) {
     repetitionLevelColumn.writeInteger(repetitionLevel);
+    assert pageRowCount == 0 ? repetitionLevel == 0 : true : "Every page shall start on record boundaries";

Review Comment:
   @zhaochengzhch, I think the message describes it. We require pages to start and end at record boundaries, so the repetition level must be `0` when the page row count is `0` (which means we are starting a page). If the repetition level is not `0` at that point, the requirement is broken, and column indexes depend on it.
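
To illustrate the invariant outside of the PR, here is a minimal standalone sketch. It is not the `ColumnWriterBase` code: the class `PageBoundarySketch` and the methods `writeRepetitionLevel`, `startNewPage` and `getPageRowCount` are hypothetical names, and it assumes that a row is counted whenever a repetition level of `0` is written. It also throws an exception instead of using `assert`, so the check fires even when assertions are disabled.

```java
// Hypothetical, simplified model of the check; not the parquet-mr implementation.
class PageBoundarySketch {
    // Number of records (rows) started in the current page.
    private int pageRowCount = 0;

    // Records one repetition level and enforces the page-start invariant.
    void writeRepetitionLevel(int repetitionLevel) {
        // pageRowCount == 0 means this is the first value of a page, so the
        // value must begin a new record, i.e. its repetition level must be 0.
        if (pageRowCount == 0 && repetitionLevel != 0) {
            throw new IllegalStateException("Every page shall start on record boundaries");
        }
        // Assumption: a repetition level of 0 marks the start of a new record.
        if (repetitionLevel == 0) {
            pageRowCount++;
        }
    }

    // Simulates flushing the current page and starting an empty one.
    void startNewPage() {
        pageRowCount = 0;
    }

    int getPageRowCount() {
        return pageRowCount;
    }

    public static void main(String[] args) {
        PageBoundarySketch writer = new PageBoundarySketch();
        // Two records: the first has three values, the second has one.
        for (int level : new int[] {0, 1, 1, 0}) {
            writer.writeRepetitionLevel(level);
        }
        System.out.println("rows in page: " + writer.getPageRowCount());

        writer.startNewPage();
        try {
            // A continuation value (level 1) as the first value of a page
            // would split a record across pages, so it is rejected.
            writer.writeRepetitionLevel(1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch a page that begins with a non-zero repetition level is rejected because it would continue a record started in the previous page, which is exactly the situation the assertion in the diff guards against.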


