You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/21 00:24:57 UTC

[GitHub] [arrow] rtyler commented on a change in pull request #9240: ARROW-10766: [Rust] [Parquet] Compute nested list definitions

rtyler commented on a change in pull request #9240:
URL: https://github.com/apache/arrow/pull/9240#discussion_r561361029



##########
File path: rust/parquet/src/arrow/array_reader.rs
##########
@@ -912,11 +912,36 @@ impl<OffsetSize: OffsetSizeTrait> ArrayReader for ListArrayReader<OffsetSize> {
             ));
         }
 
-        // Need to remove from the values array the nulls that represent null lists rather than null items
-        // null lists have def_level = 0
+        // List definitions can be encoded as 4 values:
+        // - n + 0: the list slot is null
+        // - n + 1: the list slot is not null, but is empty (i.e. [])
+        // - n + 2: the list slot is not null, but its child is empty (i.e. [ null ])
+        // - n + 3: the list slot is not null, and its child is not empty
+        // Where n is the max definition level of the list's parent.
+        // If a Parquet schema's only leaf is the list, then n = 0.
+
+        // TODO: ARROW-10391 - add a test case with a non-nullable child, check if max is 3
+        let list_field_type = match self.get_data_type() {
+            ArrowType::List(field)
+            | ArrowType::FixedSizeList(field, _)
+            | ArrowType::LargeList(field) => field,
+            _ => {
+                // Panic: this is safe as we only write lists from list datatypes
+                unreachable!()

Review comment:
       For my own curiosity, is there a specific reason to use `unreachable!` rather than `panic!` in cases like this? I understand the outcome will be the same.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org