Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/28 22:37:31 UTC

[GitHub] [iceberg] jackye1995 opened a new pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

jackye1995 opened a new pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178


   @rdblue as we discussed in Slack, this adds more details for the bug fixes in 0.11. It also fixes some typos and adds more details for Flink, as suggested by #2168.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566459626



##########
File path: site/docs/aws.md
##########
@@ -51,6 +51,8 @@ spark-sql --packages $DEPENDENCIES \
 
 As you can see, In the shell command, we use `--packages` to specify the additional AWS bundle and HTTP client dependencies with their version as `2.15.40`.
 
+For integration with other engines such as Flink, please read their engine documentation pages that explain loading a custom catalog. 

Review comment:
       I think it should be "explains how to load".





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566461898



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.

Review comment:
       I don't think it is true that duplicate data files are ignored. I thought it just fixed the encryption map problem. Can you double-check this? Also, I think that the duplication would need to be in a single task. We don't guarantee deduplication in split planning.





[GitHub] [iceberg] jackye1995 commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

jackye1995 commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566469259



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.

Review comment:
       If a duplicate exists, the old code path uses an immutable map builder, which throws an exception. That is why it was changed to `hashMap.putIfAbsent()`, so a duplicated location is ignored. Yes, it is within a single task; let me add that.
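
       A minimal sketch of the difference described in this comment, using only the Java standard library. `Map.ofEntries` stands in for the immutable map builder here (an assumption: the actual Iceberg code path is not shown in this thread), and the class name, method names, and file locations are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateLocationSketch {

    // Strict construction: Map.ofEntries rejects duplicate keys with an
    // IllegalArgumentException, similar in spirit to an immutable map builder.
    static Map<String, String> buildStrict(List<Map.Entry<String, String>> files) {
        @SuppressWarnings("unchecked")
        Map.Entry<String, String>[] entries = files.toArray(new Map.Entry[0]);
        return Map.ofEntries(entries);
    }

    // Lenient construction: putIfAbsent keeps the first value seen for a
    // location and ignores any later entry with the same location.
    static Map<String, String> buildLenient(List<Map.Entry<String, String>> files) {
        Map<String, String> byLocation = new HashMap<>();
        for (Map.Entry<String, String> file : files) {
            byLocation.putIfAbsent(file.getKey(), file.getValue());
        }
        return byLocation;
    }

    public static void main(String[] args) {
        // Hypothetical task contents: the same file location appears twice.
        List<Map.Entry<String, String>> files = List.of(
            Map.entry("s3://bucket/data/a.parquet", "key-1"),
            Map.entry("s3://bucket/data/a.parquet", "key-1"),
            Map.entry("s3://bucket/data/b.parquet", "key-2"));

        try {
            buildStrict(files);
        } catch (IllegalArgumentException e) {
            System.out.println("strict build failed on duplicate location");
        }
        System.out.println("lenient build size: " + buildLenient(files).size());
    }
}
```

       Note that this only deduplicates within the one collection being mapped, which matches the "single task" caveat above; it says nothing about duplicates spread across tasks.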





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566461343



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.

Review comment:
       This may be a notable feature update, but is not a serious bug that we need to draw attention to.





[GitHub] [iceberg] rdblue merged pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue merged pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178


   



[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566462563



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.
+* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in `CachingCatalog`. When a table is dropped, all the metadata tables associated with it are also invalidated in the cache.
+* [\#1960](https://github.com/apache/iceberg/pull/1960) fixes bug that ORC writer does not read metrics config and always use the default. Now customized metrics config is respected.

Review comment:
       I think this is a new feature, not really a bug. Notable though, so maybe move this to the next section.





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566460940



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.

Review comment:
       I probably wouldn't include this. It was needed for Beam, but it wasn't needed for other engines before now, so it is not really a bug.





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566459946



##########
File path: site/docs/flink.md
##########
@@ -144,6 +144,18 @@ CREATE CATALOG my_catalog WITH (
 );
 ```
 
+### Create through YAML config
+
+Catalog can also be registered in `sql-client-defaults.yaml` before starting the SQL client. Here is an example:

Review comment:
       Typo: should be "Catalogs can be ..."





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566463006



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.
+* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in `CachingCatalog`. When a table is dropped, all the metadata tables associated with it are also invalidated in the cache.
+* [\#1960](https://github.com/apache/iceberg/pull/1960) fixes bug that ORC writer does not read metrics config and always use the default. Now customized metrics config is respected.
+* [\#1936](https://github.com/apache/iceberg/pull/1936) fixes parallelism setting in Flink. Before, the default Flink parallelism was used which cause performance issue or resource waste. Now the parallelism is set to the number of Iceberg read splits. 
 
 Other notable changes:
 
+* PrestoSQL is renamed to [Trino](https://trino.io/)

Review comment:
       This isn't an Iceberg change, so I would remove it. It is also not the most important change so I would not put it first.





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566463144



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.

Review comment:
       This is a major fix. Could you list it first?
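
The off-by-one described for \#1981 can be reproduced with a small sketch. This is an illustration only, not the Iceberg transform code: it assumes the incorrect values came from integer division that truncates toward zero, and the class and method names (`EpochDayDemo`, `truncatedDay`, `flooredDay`) are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class EpochDayDemo {
    private static final long SECONDS_PER_DAY = 86_400L;

    // Truncating division rounds toward zero, so an instant partway through
    // a pre-1970 day lands one day later than it should.
    public static long truncatedDay(long epochSeconds) {
        return epochSeconds / SECONDS_PER_DAY;
    }

    // Floor division rounds toward negative infinity, which yields the
    // expected day ordinal on both sides of the epoch.
    public static long flooredDay(long epochSeconds) {
        return Math.floorDiv(epochSeconds, SECONDS_PER_DAY);
    }

    public static void main(String[] args) {
        // 1969-12-31 10:00:00 UTC is 14 hours before the epoch.
        long ts = LocalDateTime.of(1969, 12, 31, 10, 0, 0).toEpochSecond(ZoneOffset.UTC);
        System.out.println("truncated: " + truncatedDay(ts));
        System.out.println("floored:   " + flooredDay(ts));
    }
}
```

Under that assumption, the truncating version maps the example timestamp to day 0 and the floored version maps it to day -1, matching the values quoted in the release note. For timestamps at or after the epoch the two methods agree.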





[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566461898



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.

Review comment:
       I don't think it is true that duplicate data files are ignored. I thought it just fixed the encryption map problem.





[GitHub] [iceberg] jackye1995 commented on pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

jackye1995 commented on pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#issuecomment-769464933


   @rdblue All comments should now be addressed. Please let me know if you have any further comments. Thank you!



[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566462745



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.
+* [\#2011](https://github.com/apache/iceberg/pull/2011) fixes equality comparison for `BaseSnapshot`. For engines such as Beam that serialize snapshots, now snapshots can be compared through Java equality operator.
+* [\#1998](https://github.com/apache/iceberg/pull/1998) fixes bug in `HiveTableOperation` that `unlock` is not called if new metadata cannot be deleted. Now it is guaranteed that `unlock` is always called for Hive catalog users.
+* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs. Now field level documentation is also preserved when converting from Avro schemas to Iceberg schemas.
+* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, `day(1969-12-31 10:00:00)` produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.
+* [\#1979](https://github.com/apache/iceberg/pull/1979) fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.
+* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files.
+* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in `CachingCatalog`. When a table is dropped, all the metadata tables associated with it are also invalidated in the cache.
+* [\#1960](https://github.com/apache/iceberg/pull/1960) fixes bug that ORC writer does not read metrics config and always use the default. Now customized metrics config is respected.
+* [\#1936](https://github.com/apache/iceberg/pull/1936) fixes parallelism setting in Flink. Before, the default Flink parallelism was used which cause performance issue or resource waste. Now the parallelism is set to the number of Iceberg read splits. 

Review comment:
       This is also a new feature, not a bug. I'm not sure whether I consider it notable or not.





[GitHub] [iceberg] rdblue commented on pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#issuecomment-769507504


   Thanks, @jackye1995! I'll deploy this now.



[GitHub] [iceberg] rdblue commented on a change in pull request #2178: Doc: fix miscellaneous comments and add more details in release note for 0.11

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #2178:
URL: https://github.com/apache/iceberg/pull/2178#discussion_r566460640



##########
File path: site/docs/releases.md
##########
@@ -74,14 +74,21 @@ High-level features:
 
 Important bug fixes:
 
-* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes Parquet vectorized reads when column types are promoted
-* [\#1991](https://github.com/apache/iceberg/pull/1991) fixes Avro schema conversions to preserve field docs
-* [\#1981](https://github.com/apache/iceberg/pull/1981) fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times
-* [\#1798](https://github.com/apache/iceberg/pull/1798) fixes read failure when encountering duplicate entries of data files
-* [\#1785](https://github.com/apache/iceberg/pull/1785) fixes invalidation of metadata tables in CachingCatalog
+* [\#2091](https://github.com/apache/iceberg/pull/2091) fixes `ClassCastException` for type promotion `int` to `long` or `float` to `double` during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for `int` and `float` fields.
+* [\#2031](https://github.com/apache/iceberg/pull/2031) fixes bug in Flink that custom catalog property causes catalog initialization failure. Now Flink catalog can support arbitrary custom catalog properties.

Review comment:
       I don't think this is a bug because custom catalogs were not supported in 0.10.0. I wouldn't mention it here. These should be serious or correctness errors.



