
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/04/22 16:44:48 UTC

[GitHub] [incubator-iceberg] aokolnychyi opened a new pull request #951: Inherit snapshot id and sequence number in entries metadata table

aokolnychyi opened a new pull request #951:
URL: https://github.com/apache/incubator-iceberg/pull/951


   This PR migrates the entries metadata table to `DataTask` so that we can inherit snapshot ids and sequence numbers.
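
To illustrate what "inheriting" means here, below is a minimal sketch in plain Java. It assumes that an entry stored without a snapshot id or sequence number takes those values from the manifest it was read from. The names `InheritanceSketch`, `Manifest`, `Entry`, and `inherit` are hypothetical stand-ins and are not Iceberg's classes. The sketch also ignores details such as entry status that the real rules may depend on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only; NOT Iceberg's API.
final class InheritanceSketch {

  /** Manifest-level values that entries may inherit. */
  static final class Manifest {
    final long snapshotId;
    final long sequenceNumber;

    Manifest(long snapshotId, long sequenceNumber) {
      this.snapshotId = snapshotId;
      this.sequenceNumber = sequenceNumber;
    }
  }

  /** An entry as read from a manifest; a null field means "inherit". */
  static final class Entry {
    final String path;
    final Long snapshotId;
    final Long sequenceNumber;

    Entry(String path, Long snapshotId, Long sequenceNumber) {
      this.path = path;
      this.snapshotId = snapshotId;
      this.sequenceNumber = sequenceNumber;
    }
  }

  /** Returns copies of the entries with null fields filled from the manifest. */
  static List<Entry> inherit(Manifest manifest, List<Entry> entries) {
    List<Entry> resolved = new ArrayList<>();
    for (Entry e : entries) {
      // Explicit values win; missing values fall back to the manifest's.
      long snapshotId = e.snapshotId != null ? e.snapshotId : manifest.snapshotId;
      long sequenceNumber = e.sequenceNumber != null ? e.sequenceNumber : manifest.sequenceNumber;
      resolved.add(new Entry(e.path, snapshotId, sequenceNumber));
    }
    return resolved;
  }
}
```

With this shape, a reader of the entries table would see fully populated snapshot ids and sequence numbers even when the underlying manifest left them unset.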


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #951: Inherit snapshot id and sequence number in entries metadata table

Posted by GitBox <gi...@apache.org>.

aokolnychyi commented on a change in pull request #951:
URL: https://github.com/apache/incubator-iceberg/pull/951#discussion_r413146297



##########
File path: core/src/main/java/org/apache/iceberg/ManifestEntriesTable.java
##########
@@ -100,11 +102,39 @@ protected long targetSplitSize(TableOperations ops) {
     protected CloseableIterable<FileScanTask> planFiles(
         TableOperations ops, Snapshot snapshot, Expression rowFilter, boolean caseSensitive, boolean colStats) {
       CloseableIterable<ManifestFile> manifests = CloseableIterable.withNoopClose(snapshot.manifests());
+      Schema fileSchema = new Schema(schema().findType("data_file").asStructType().fields());

Review comment:
       I need this for the projection while reading manifest entries.





[GitHub] [incubator-iceberg] rdblue commented on issue #951: Inherit snapshot id and sequence number in entries metadata table

Posted by GitBox <gi...@apache.org>.

rdblue commented on issue #951:
URL: https://github.com/apache/incubator-iceberg/pull/951#issuecomment-617918702


   +1 when tests pass



[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #951: Inherit snapshot id and sequence number in entries metadata table

Posted by GitBox <gi...@apache.org>.

aokolnychyi commented on a change in pull request #951:
URL: https://github.com/apache/incubator-iceberg/pull/951#discussion_r413144301



##########
File path: spark/src/test/java/org/apache/iceberg/spark/source/TestDataSourceOptions.java
##########
@@ -288,68 +289,48 @@ public void testIncrementalScanOptions() throws IOException {
   @Test
   public void testMetadataSplitSizeOptionOverrideTableProperties() throws IOException {
     String tableLocation = temp.newFolder("iceberg-table").toString();
-    int splitSize = 2 * 1024;
 
     HadoopTables tables = new HadoopTables(CONF);
     PartitionSpec spec = PartitionSpec.unpartitioned();
     Map<String, String> options = Maps.newHashMap();
-    options.put(TableProperties.SPLIT_SIZE, String.valueOf(128L * 1024 * 1024)); // 128Mb
-    options.put(TableProperties.METADATA_SPLIT_SIZE, String.valueOf(32L * 1024 * 1024)); // 32MB
-    tables.create(SCHEMA, spec, options, tableLocation);
+    Table table = tables.create(SCHEMA, spec, options, tableLocation);
 
     List<SimpleRecord> expectedRecords = Lists.newArrayList(
         new SimpleRecord(1, "a"),
         new SimpleRecord(2, "b")
     );
     Dataset<Row> originalDf = spark.createDataFrame(expectedRecords, SimpleRecord.class);
+    // produce 1st manifest
     originalDf.select("id", "data").write()
         .format("iceberg")
         .mode("append")
         .save(tableLocation);
-

Review comment:
       GitHub renders this diff oddly, but I squashed two tests into one and made modifications on top.



