Posted to dev@iceberg.apache.org by Ryan Blue <bl...@apache.org> on 2021/06/15 22:56:39 UTC

[DISCUSS] June board report

Hi everyone,

Time for another board report. I’m a bit late this month, so we may need to
report next month as well. Let me know if there are any updates or comments!

Ryan

Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

Issues:

There are no issues requiring board attention.

Apologies that this report is late.
The community will report next month if needed.

Membership Data:

Apache Iceberg was founded on 2020-05-19 (a year ago).
There are currently 17 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:

   - No new PMC members. Last addition was Anton Okolnychyi on 2020-05-19.
   - Ted Gooch was added as committer on 2021-05-11
   - Russell Spitzer was added as committer on 2021-04-02
   - Ryan Murray was added as committer on 2021-03-26
   - Yan Yan was added as committer on 2021-03-23

Project Activity:

0.11.1 was released on 2021-04-03.

The community is currently working on the 0.12.0 release, which will
update support for Spark 3.1 to fix the Iceberg SQL extensions.

Several features were finished:

   - Spark UPDATE support was committed
   - Row identifier fields were added to schemas to support Flink UPSERT
   - An action to import existing data files was added
   - Hive integration has been updated to allow using multiple catalogs

In addition, there are several on-going projects:

   - The community is working on updates for Spark 3.1
   - Spark data file compaction strategies and a new implementation have
   been discussed and should be available in 0.12.0
   - A design for encryption support has been proposed that will support
   Parquet and ORC encryption, as well as encryption for the metadata tree
   - There have been design discussions for adding secondary indexes that
   can be updated asynchronously to keep commit latency low
   - There have been design discussions for adding default field values
   - Support for Spark 3.0 structured streaming with the DSv2 API is under
   review
   - A DynamoDB catalog has been submitted as a PR

Community Health:

The community is healthy and showed an increase in contributors in the past
quarter. New contributors are working on significant projects, like Spark
streaming support and default values.

The community also added 4 new committers in the past quarter!
-- 
Ryan Blue