Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/22 22:46:54 UTC

[GitHub] [iceberg] flyrain commented on a diff in pull request #4945: Add table spec changes for statistics information in table snapshot

flyrain commented on code in PR #4945:
URL: https://github.com/apache/iceberg/pull/4945#discussion_r904352829


##########
format/spec.md:
##########
@@ -631,6 +632,29 @@ When expiring snapshots, retention policies in table and snapshot references are
     2. The snapshot is not one of the first `min-snapshots-to-keep` in the branch (including the branch's referenced snapshot)
 5. Expire any snapshot not in the set of snapshots to retain.
 
+#### Statistics file
+
+Statistics files are valid [Puffin files](../puffin-spec). Statistics are informational. A reader can choose to
+ignore statistics information. Statistics support is not required to read the table correctly.
+
+Statistics files' metadata within `statistics` table snapshot field is a struct with the following fields:
+
+| v1         | v2         | Field name                      | Type                              | Description                                                                                                                               |
+|------------|------------|---------------------------------|-----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|
+| _required_ | _required_ | **`statistics-path`**           | `string`                          | Path of the statistics file. See [Puffin file format](../puffin-spec).                                                                    |
+| _required_ | _required_ | **`file-size-in-bytes`**        | `long`                            | Size of the statistics file.                                                                                                              |
+| _required_ | _required_ | **`file-footer-size-in-bytes`** | `long`                            | Total size of the statistics file's footer (not the footer payload size). See [Puffin file format](../puffin-spec) for footer definition. |
+| _required_ | _required_ | **`source-snapshot-id`**        | `long`                            | Table sequence number at which the stats were calculated                                                                                  |

Review Comment:
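   The field name and description seem inconsistent: the field is named `source-snapshot-id`, but the description says "Table sequence number at which the stats were calculated". Is it a snapshot ID or a sequence number? I assume it is a snapshot ID.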
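
   To make the distinction concrete, here is a minimal Python sketch of the struct from the quoted table. It assumes `source-snapshot-id` holds a snapshot ID. The `Snapshot` and `StatisticsFile` classes, the `find_source_snapshot` helper, and the example values are hypothetical illustrations, not Iceberg APIs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a table snapshot; only the two
    # identifiers under discussion are modeled.
    snapshot_id: int
    sequence_number: int


@dataclass(frozen=True)
class StatisticsFile:
    # Field names mirror the four rows of the quoted table.
    statistics_path: str
    file_size_in_bytes: int
    file_footer_size_in_bytes: int
    # Assumption: this is a snapshot ID, not a sequence number.
    source_snapshot_id: int


def find_source_snapshot(
    stats: StatisticsFile, snapshots_by_id: Dict[int, Snapshot]
) -> Optional[Snapshot]:
    # Resolve the snapshot the statistics were computed from, by snapshot ID.
    return snapshots_by_id.get(stats.source_snapshot_id)


# Made-up example values.
snapshots = {
    3051729675574597004: Snapshot(snapshot_id=3051729675574597004, sequence_number=1),
    3055729675574597004: Snapshot(snapshot_id=3055729675574597004, sequence_number=2),
}
stats = StatisticsFile(
    statistics_path="s3://bucket/table/metadata/stats.puffin",
    file_size_in_bytes=4096,
    file_footer_size_in_bytes=512,
    source_snapshot_id=3055729675574597004,
)
source = find_source_snapshot(stats, snapshots)
```

   If the field held a sequence number instead (for example `2`), the same lookup by snapshot ID would find nothing, which is why the name and the description need to agree.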



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@iceberg.apache.org