Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/04/01 17:32:41 UTC

[GitHub] [incubator-iceberg] sudssf commented on a change in pull request #882: fix: Failed to get status issue because of s3 eventual consistency

URL: https://github.com/apache/incubator-iceberg/pull/882#discussion_r401789045
 
 

 ##########
 File path: spark/src/test/java/org/apache/iceberg/spark/source/TestDataSourceOptions.java
 ##########
 @@ -201,6 +201,36 @@ public void testSplitOptionsOverridesTableProperties() throws IOException {
     Assert.assertEquals("Spark partitions should match", 2, resultDf.javaRDD().getNumPartitions());
   }
 
+  @Test
+  public void testSplitOptionsOverridesTablePropertiesWithWriterLength() throws IOException {
+    String tableLocation = temp.newFolder("iceberg-table").toString();
+
+    HadoopTables tables = new HadoopTables(CONF);
+    PartitionSpec spec = PartitionSpec.unpartitioned();
+    Map<String, String> options = Maps.newHashMap();
 +    options.put(TableProperties.SPLIT_SIZE, String.valueOf(128L * 1024 * 1024)); // 128 MB
+    tables.create(SCHEMA, spec, options, tableLocation);
+
+    List<SimpleRecord> expectedRecords = Lists.newArrayList(
+        new SimpleRecord(1, "a"),
+        new SimpleRecord(2, "b")
+    );
+    Dataset<Row> originalDf = spark.createDataFrame(expectedRecords, SimpleRecord.class);
+    originalDf.select("id", "data").write()
+        .format("iceberg")
+        .mode("append")
+        .option("use-writer-length-as-file-size", true)
+        .save(tableLocation);
+
+    Dataset<Row> resultDf = spark.read()
+        .format("iceberg")
+        .option("split-size", String.valueOf(611 + 103)) // 611 bytes is the size of SimpleRecord(1,"a")
 
 Review comment:
   @rdblue there is an additional 103 bytes in the writer length; could this be a bug in the writer factory, which returns the buffer size after a flush?
   (No rush on merging this PR, I am just trying to make sure the changes are OK.)
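For context, the idea behind `use-writer-length-as-file-size` is to trust the byte count the writer itself tracked, rather than asking the filesystem for the file's size after close, since an S3 `getFileStatus` immediately after a write can fail or return stale data under eventual consistency. The sketch below is a minimal, self-contained illustration of that pattern using a hypothetical `LengthTrackingOutputStream`; it is not Iceberg's actual writer code, and the class and method names are invented for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch: count bytes as they are written so the commit path
// can report the file size without a follow-up stat call to the store.
class LengthTrackingOutputStream extends FilterOutputStream {
  private long bytesWritten = 0;

  LengthTrackingOutputStream(OutputStream out) {
    super(out);
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    bytesWritten++;
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // Write the range directly so bytes are not double-counted via write(int).
    out.write(b, off, len);
    bytesWritten += len;
  }

  long bytesWritten() {
    return bytesWritten;
  }
}

public class WriterLengthDemo {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    LengthTrackingOutputStream out = new LengthTrackingOutputStream(sink);
    out.write("hello iceberg".getBytes());
    out.close();
    // The writer-reported length matches the bytes written; no stat needed.
    System.out.println(out.bytesWritten()); // prints 13
  }
}
```

If the tracked count and the data actually flushed ever disagree (for example, a count taken before a final buffer flush), the reported file size would be off, which is the kind of discrepancy the "+103" question above is probing.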

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org