Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/09 03:18:08 UTC

[GitHub] [iceberg] wg1026688210 opened a new issue #2445: What is the meaning of `delete_rows_count` and `delete_data_count_file ` at manifest

wg1026688210 opened a new issue #2445:
URL: https://github.com/apache/iceberg/issues/2445


   I am confused about `delete_rows_count` and `delete_data_count_file`. They do not seem to be associated with `table format v2`, judging by a unit test I wrote to check my guess:
   ```java
   @Test
   public void test() throws IOException {
     PartitionSpec spec = PartitionSpec.builderFor(SCHEMA)
         .identity("c1")
         .truncate("c2", 2)
         .build();
     Table table = TABLES.create(SCHEMA, spec, ImmutableMap.of(), tableLocation);
     upgradeToFormatV2(table);

     // Write one equality-delete file that deletes a single row, then commit it.
     Schema deleteRowSchema = table.schema().select("c1", "c2", "c3");
     Record dataDelete = GenericRecord.create(deleteRowSchema);
     List<Record> deletions = Lists.newArrayList(
         dataDelete.copy("c1", 1, "c2", "AAAAAAAAAA", "c3", "CCCC")
     );
     DeleteFile eqDeletes1 = FileHelpers.writeDeleteFile(table, newOutputFile(),
         TestHelpers.Row.of(1, "AA"), deletions.subList(0, 1), deleteRowSchema);
     table.newRowDelta()
         .addDeletes(eqDeletes1)
         .commit();

     List<ManifestFile> manifestFiles = Lists.newArrayList(table.currentSnapshot().deleteManifests());
     Assert.assertEquals("There should be 1 delete manifest", 1, manifestFiles.size());

     // My guess under test: the "deleted" counters count delete files and the rows they delete.
     ManifestFile deleteManifest = manifestFiles.get(0);
     int deletedFilesCount = deleteManifest.deletedFilesCount();
     Assert.assertEquals("deletedFilesCount should be 1", 1, deletedFilesCount);
     // Fixed: this assertion previously re-checked the file count instead of the row count.
     long deletedRowsCount = deleteManifest.deletedRowsCount();
     Assert.assertEquals("deletedRowsCount should be 1", 1L, deletedRowsCount);
   }
   ```
   I found that these counts are used during manifest merging and snapshot removal, and that `ManifestWriter` accumulates them whenever a `ManifestEntry` has the status "deleted".
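
To make that observation concrete, here is a minimal sketch of how I understand the accumulation. It is a simplified model and not the actual `ManifestWriter` code: the class `ManifestCountsSketch` and its `Status`, `Entry`, `Counts`, and `summarize` members are hypothetical names made up for illustration, and the per-status bookkeeping is my assumption about the behavior.

```java
import java.util.Arrays;
import java.util.List;

public class ManifestCountsSketch {
  // Hypothetical stand-in for a manifest entry's status (not the Iceberg type).
  enum Status { EXISTING, ADDED, DELETED }

  // Hypothetical, simplified manifest entry: a status plus the tracked file's record count.
  static final class Entry {
    final Status status;
    final long recordCount;

    Entry(Status status, long recordCount) {
      this.status = status;
      this.recordCount = recordCount;
    }
  }

  // Hypothetical per-manifest summary counters.
  static final class Counts {
    int addedFiles;
    long addedRows;
    int existingFiles;
    long existingRows;
    int deletedFiles;
    long deletedRows;
  }

  // Assumption: each counter pair is bumped according to the entry's status only,
  // regardless of whether the tracked file is a data file or a delete file.
  static Counts summarize(List<Entry> entries) {
    Counts counts = new Counts();
    for (Entry entry : entries) {
      switch (entry.status) {
        case ADDED:
          counts.addedFiles += 1;
          counts.addedRows += entry.recordCount;
          break;
        case EXISTING:
          counts.existingFiles += 1;
          counts.existingRows += entry.recordCount;
          break;
        case DELETED:
          counts.deletedFiles += 1;
          counts.deletedRows += entry.recordCount;
          break;
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // A freshly committed file enters its manifest as ADDED in this model.
    Counts counts = summarize(Arrays.asList(new Entry(Status.ADDED, 1L)));
    System.out.println("added files = " + counts.addedFiles
        + ", deleted files = " + counts.deletedFiles
        + ", deleted rows = " + counts.deletedRows);
  }
}
```

If this model is right, a delete file that has just been committed would only raise the "added" counters of its manifest, and the "deleted" counters would change only once an entry is marked as deleted, for example by a later merge or snapshot removal.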
   
   
   
   
   What are the intended semantics of `delete_rows_count` and `delete_data_count_file` by design? And are these names not easily confused with `equality delete` and `position delete`?
   

