Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/21 17:26:49 UTC

[GitHub] [iceberg] sudssf commented on a change in pull request #1221: Spark: Fix estimateStatistics when called without filters

sudssf commented on a change in pull request #1221:
URL: https://github.com/apache/iceberg/pull/1221#discussion_r458266469



##########
File path: site/docs/configuration.md
##########
@@ -109,14 +110,14 @@ spark.read
     .table("catalog.db.table")
 ```
 
-| Spark option    | Default               | Description                                                                               |
-| --------------- | --------------------- | ----------------------------------------------------------------------------------------- |
-| snapshot-id     | (latest)              | Snapshot ID of the table snapshot to read                                                 |
-| as-of-timestamp | (latest)              | A timestamp in milliseconds; the snapshot used will be the snapshot current at this time. |
-| split-size      | As per table property | Overrides this table's read.split.target-size and read.split.metadata-target-size         |
-| lookback        | As per table property | Overrides this table's read.split.planning-lookback                                       |
-| file-open-cost  | As per table property | Overrides this table's read.split.open-file-cost                                          |
-
+| Spark option               | Default               | Description                                                                               |
+| -------------------------- | --------------------- | ----------------------------------------------------------------------------------------- |
+| snapshot-id                | (latest)              | Snapshot ID of the table snapshot to read                                                 |
+| as-of-timestamp            | (latest)              | A timestamp in milliseconds; the snapshot used will be the snapshot current at this time. |
+| split-size                 | As per table property | Overrides this table's read.split.target-size and read.split.metadata-target-size         |
+| lookback                   | As per table property | Overrides this table's read.split.planning-lookback                                       |
+| file-open-cost             | As per table property | Overrides this table's read.split.open-file-cost                                          |
+| use-approximate-statistics | As per table property | Overrides this table's read.spark.read.spark.use-approximate-statistics                   |

Review comment:
       What if the manifest size is still large after filters are applied? A user may want to disable scanning the entire manifest when not trying to broadcast the table.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


