Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/17 21:55:56 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #3902: Parquet: unit test for parquet table properties

rdblue commented on a change in pull request #3902:
URL: https://github.com/apache/iceberg/pull/3902#discussion_r786305268



##########
File path: flink/v1.14/flink/src/test/java/org/apache/iceberg/flink/data/TestFlinkParquetWriter.java
##########
@@ -86,4 +117,141 @@ protected void writeAndValidate(Schema schema) throws IOException {
         RandomGenericData.generateFallbackRecords(schema, NUM_RECORDS, 21124, NUM_RECORDS / 20)),
         schema);
   }
+
+  @Test
+  public void testParquetProperties() throws Exception {
+    final MessageType schemaSimple =
+        MessageTypeParser.parseMessageType(
+            "message m {" +
+                "  optional int32 id = 1;" +
+                "  optional binary data (STRING) = 2;" +
+                "}");
+
+    final ColumnDescriptor colADesc = schemaSimple.getColumns().get(0);
+    final ColumnDescriptor colBDesc = schemaSimple.getColumns().get(1);
+
+    List<ColumnDescriptor> columnDescriptors = Arrays.asList(colADesc, colBDesc);
+
+    int expectedRowCount = 100000;
+    List<RowData> rows = Lists.newArrayListWithCapacity(expectedRowCount);
+    for (int i = 0; i < expectedRowCount; i++) {
+      rows.add(SimpleDataUtil.createRowData(1, UUID.randomUUID().toString().substring(0, 10)));
+    }
+
+    String location = temp.getRoot().getAbsolutePath();
+
+    ImmutableMap<String, String> properties = ImmutableMap.of(
+        TableProperties.PARQUET_ROW_GROUP_SIZE_BYTES, String.valueOf(expectedRowCount * 10),
+        TableProperties.PARQUET_PAGE_SIZE_BYTES, String.valueOf(expectedRowCount),
+        TableProperties.PARQUET_DICT_SIZE_BYTES, String.valueOf(expectedRowCount),
+        TableProperties.PARQUET_COMPRESSION, "uncompressed");
+
+    Table table = SimpleDataUtil.createTable(location, properties, false);
+
+    writeAndCommit(table, ImmutableList.of(), false, rows);
+    table.refresh();
+
+    CloseableIterator<DataFile> iterator =
+        FindFiles.in(table).collect().iterator();
+
+    Assert.assertTrue(iterator.hasNext());
+
+    DataFile dataFile = iterator.next();
+    Path path = new Path((String) dataFile.path());

Review comment:
       This should use the Table's IO instead of a Hadoop FileSystem.
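
       As a hedged sketch of the suggested direction (not the final patch), the test could obtain the file through the table's `FileIO` rather than constructing a Hadoop `Path`/`FileSystem`. `table.io()` and `FileIO#newInputFile(String)` are real Iceberg APIs; how the Parquet footer is then opened from the resulting `InputFile` is left out here, since the adapter for that lives inside the iceberg-parquet module:

       ```java
       // Sketch only: resolve the written data file via the table's configured IO.
       // dataFile.path() returns a CharSequence, so convert it to a String first.
       org.apache.iceberg.io.InputFile inputFile =
           table.io().newInputFile(dataFile.path().toString());
       Assert.assertTrue("written data file should exist", inputFile.exists());
       ```

       This keeps the test independent of the Hadoop filesystem classes and works the same way regardless of which `FileIO` implementation the table is configured with.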




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
For additional commands, e-mail: issues-help@iceberg.apache.org