You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/16 09:51:32 UTC

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4627: Parquet: Fixes get null values for the nested field partition column

szehon-ho commented on code in PR #4627:
URL: https://github.com/apache/iceberg/pull/4627#discussion_r1023069744


##########
parquet/src/main/java/org/apache/iceberg/data/parquet/BaseParquetReaders.java:
##########
@@ -149,11 +153,14 @@ public ParquetValueReader<?> struct(Types.StructType expected, GroupType struct,
       List<ParquetValueReader<?>> reorderedFields = Lists.newArrayListWithExpectedSize(
           expectedFields.size());
       List<Type> types = Lists.newArrayListWithExpectedSize(expectedFields.size());
+      // Inferring MaxDefinitionLevel from parent field
+      int inferredMaxDefinitionLevel = type.getMaxDefinitionLevel(currentPath());
       for (Types.NestedField field : expectedFields) {
         int id = field.fieldId();
         if (idToConstant.containsKey(id)) {
           // containsKey is used because the constant may be null
-          reorderedFields.add(ParquetValueReaders.constant(idToConstant.get(id)));
+          int fieldMaxDefinitionLevel = maxDefinitionLevelsById.getOrDefault(id, inferredMaxDefinitionLevel);

Review Comment:
   Im kind of new to this code, so curious, why do we take from parent?  Is it because the field is indeed null here and we will thus just take parent's definition level?
   
   Should we call it parentMaxDefinitionLevel?



##########
parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java:
##########
@@ -116,9 +120,45 @@ public void setPageSource(PageReadStore pageStore, long rowPosition) {
 
   static class ConstantReader<C> implements ParquetValueReader<C> {
     private final C constantValue;
+    private final TripleIterator<?> column;
+    private final List<TripleIterator<?>> children;
 
     ConstantReader(C constantValue) {
       this.constantValue = constantValue;
+      this.column = NullReader.NULL_COLUMN;
+      this.children = NullReader.COLUMNS;
+    }
+
+    ConstantReader(C constantValue, int definitionLevel) {
+      this.constantValue = constantValue;
+      this.column = new TripleIterator<Object>() {
+        @Override
+        public int currentDefinitionLevel() {
+          return definitionLevel;
+        }
+
+        @Override
+        public int currentRepetitionLevel() {
+          return 0;
+        }
+
+        @Override
+        public <N> N nextNull() {
+          return null;
+        }
+
+        @Override
+        public boolean hasNext() {
+          return false;
+        }
+
+        @Override
+        public Object next() {
+          return null;
+        }
+      };
+
+      this.children = ImmutableList.of(column);

Review Comment:
   Trying to understand how this works, but this previously is empty list, but we need it here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org