You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/01 12:08:12 UTC

[GitHub] [arrow-rs] alamb commented on a diff in pull request #4327: Don't split record across pages (#3680)

alamb commented on code in PR #4327:
URL: https://github.com/apache/arrow-rs/pull/4327#discussion_r1213053147


##########
parquet/src/column/writer/mod.rs:
##########
@@ -308,6 +308,17 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
         max: Option<&E::T>,
         distinct_count: Option<u64>,
     ) -> Result<usize> {
+        // Check if number of definition levels is the same as number of repetition levels.
+        if let (Some(def), Some(rep)) = (def_levels, rep_levels) {

Review Comment:
   Is there any way to test this error condition (and the one below it)? 



##########
parquet/tests/arrow_writer_layout.rs:
##########
@@ -502,3 +503,45 @@ fn test_string() {
         },
     });
 }
+
+#[test]
+fn test_list() {

Review Comment:
   I ran this test without the changes in this PR and verified it failed:
   
   ```
   
   test test_list ... FAILED
   
   failures:
   
   ---- test_list stdout ----
   thread 'test_list' panicked at 'assertion failed: `(left == right)`
     left: `11`,
    right: `10`: index page count mismatch', parquet/tests/arrow_writer_layout.rs:95:13
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
      1: core::panicking::panic_fmt
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
      2: core::panicking::assert_failed_inner
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:251:23
      3: core::panicking::assert_failed
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:211:5
      4: arrow_writer_layout::assert_layout
                at ./tests/arrow_writer_layout.rs:95:13
      5: arrow_writer_layout::do_test
                at ./tests/arrow_writer_layout.rs:77:5
      6: arrow_writer_layout::test_list
                at ./tests/arrow_writer_layout.rs:527:5
      7: arrow_writer_layout::test_list::{{closure}}
                at ./tests/arrow_writer_layout.rs:508:16
      8: core::ops::function::FnOnce::call_once
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:250:5
      9: core::ops::function::FnOnce::call_once
                at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:250:5
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   ```
   
   Which seems to imply that the data would be written with one more page.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org