You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2021/05/29 12:00:59 UTC

[GitHub] [drill] vvysotskyi commented on a change in pull request #2238: DRILL-7934: Fix NullPointerException error when reading parquet files

vvysotskyi commented on a change in pull request #2238:
URL: https://github.com/apache/drill/pull/2238#discussion_r641929654



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScanStatistics.java
##########
@@ -115,7 +118,7 @@ public void collect(Collection<T> metadataList) {
           previousCount.setValue(Statistic.NO_COLUMN_STATS);
         }
         ColumnMetadata columnMetadata = SchemaPathUtils.getColumnMetadata(schemaPath, metadata.getSchema());
-        TypeProtos.MajorType majorType = columnMetadata != null ? columnMetadata.majorType() : null;
+        TypeProtos.MajorType majorType = columnMetadata != null ? columnMetadata.majorType() : NULL;

Review comment:
       Specifying the `NULL` type here may cause issues later when this type is used for obtaining values comparator... 
   But do we actually support partitioning list columns, i.e. if `majorType` is null, maybe we should set `partitionColumn` to false instead of calling the `checkForPartitionColumn()` method?

##########
File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java
##########
@@ -745,4 +747,41 @@ public void testLimitMultipleRowGroupsBeyondRowCount() throws Exception {
     assertTrue(String.format("Number of records in output is wrong: expected=%d, actual=%s", 300, recordsInOutput), 300 == recordsInOutput);
   }
 
+  @Test
+  public void testTypeNull() throws Exception {

Review comment:
       Please note that this class has `@Ignore` annotation, so its tests wouldn't be running. Please add the test to another class.

##########
File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java
##########
@@ -745,4 +747,41 @@ public void testLimitMultipleRowGroupsBeyondRowCount() throws Exception {
     assertTrue(String.format("Number of records in output is wrong: expected=%d, actual=%s", 300, recordsInOutput), 300 == recordsInOutput);
   }
 
+  @Test
+  public void testTypeNull() throws Exception {
+    /* the `features` schema is:
+    optional group features {
+      required int32 type (INTEGER(8,true));
+      optional int32 size;
+      optional group indices (LIST) {
+        repeated group list {
+          required int32 element;
+        }
+      }
+      optional group values (LIST) {
+        repeated group list {
+          required double element;
+        }
+      }
+    }
+    base on metastore/metastore-api/src/main/java/org/apache/drill/metastore/util/SchemaPathUtils.java
+    list schema is skipped, so that in ParquetGroupScanStatistics drill can not get ColumnMetadata by schemaPath
+    */
+    List<QueryDataBatch> results = testSqlWithResults("SELECT * FROM cp.`parquet/test_type_null.parquet`");

Review comment:
       Please use `testBuilder()` for running and verifying query results.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org