Posted to dev@hudi.apache.org by Udit Mehrotra <ud...@apache.org> on 2021/08/04 00:12:45 UTC
[DISCUSS] Hudi 0.9.0 Release
Hi Community,
As we draw close to the Hudi 0.9.0 release, I am happy to share a summary
of the key features/improvements going into the release and the current
blockers, for everyone's visibility.
*Highlights*
- [HUDI-1729] Asynchronous Hive sync and commits cleaning for Flink
writer
- [HUDI-1738] Detect and emit deleted records for Flink MOR table
streaming read
- [HUDI-1867] Support streaming reads for Flink COW table
- [HUDI-1908] Global index for flink writer
- [HUDI-1788] Support Insert Overwrite with Flink Writer
- [HUDI-2209] Bulk insert for flink writer
- [HUDI-1591] Support querying using non-globbed paths for Hudi Spark
DataSource queries
- [HUDI-1591] Partition pruning support for read optimized queries via
Hudi Spark DataSource
- [HUDI-1415] Register Hudi Table as a Spark DataSource Table with
metastore. Queries via Spark SQL will be routed through Hudi DataSource
(instead of InputFormat), thus making it more performant due to Spark's
native/optimized readers
- [HUDI-1591] Partition pruning support for snapshot queries via Hudi
Spark DataSource
- [HUDI-1658] DML and DDL support via Spark SQL
- [HUDI-1790] Add SqlSource for DeltaStreamer to support backfill use
cases
- [HUDI-251] Add JDBC Source support for DeltaStreamer
- [HUDI-1910] Support Kafka based checkpointing for HoodieDeltaStreamer
- [HUDI-1371] Support metadata based listing for Spark DataSource and
Spark SQL
- [HUDI-2013] [HUDI-1717] [HUDI-2089] [HUDI-2016] Improvements to
Metadata based listing
- [HUDI-89] Introduce a HoodieConfig/ConfigProperty framework to bring
all configs under one roof
- [HUDI-2124] Grafana dashboard for Hudi
- [HUDI-1104] [HUDI-1105] [HUDI-2009] Improvements to Bulk Insert via
row writing
- [HUDI-1483] Async clustering for Delta Streamer
- [HUDI-2235] Add virtual key support to Hudi
- [HUDI-1848] Add support for Hive Metastore in Hive-sync-tool
- In addition, there have been significant improvements and bug fixes to
improve the overall stability of Flink Hudi integration
*Current Blockers*
- [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
- [HUDI-1256] Follow on improvements to HFile tables for metadata based
listing (Owner: None)
- [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi
(Owner: pengzhiwei)
- [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
pengzhiwei)
- [HUDI-1138] Re-implement marker files via timeline server (Owner:
Ethan Guo)
- [HUDI-1985] Website redesign implementation (Owner: Vinoth
Govindarajan)
- [HUDI-2232] MERGE INTO fails with table having nested struct (Owner:
pengzhiwei)
- [HUDI-1468] incremental read support with clustering (Owner: Liwei)
- [HUDI-2250] Bulk insert support for tables w/ primary key (Owner: None)
- [HUDI-2222] [SQL] Test catalog integration (Owner: Sagar Sumit)
- [HUDI-2221] [SQL] Functionality testing with Spark 2 (Owner: Sagar
Sumit)
- [HUDI-1887] Setting default value to false for enabling schema post
processor (Owner: Sivabalan)
- [HUDI-1850] Fixing read of an empty table but with failed write (Owner:
Sivabalan)
- [HUDI-2151] Enable defaults for out of box performance (Owner: Udit
Mehrotra)
- [HUDI-2119] Ensure the rolled-back instance was previously synced to
the Metadata Table when syncing a Rollback Instant (Owner: Prashant Wason)
- [HUDI-1458] Support custom clustering strategies and preserve commit
time to support incremental read (Owner: Satish Kotha)
- [HUDI-1763] Fixing honoring of Ordering val in
DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
- [HUDI-1129] Improving schema evolution support in hudi (Owner:
Sivabalan)
- [HUDI-2120] [DOC] Update docs about schema in flink sql configuration
(Owner: Xianghu Wang)
- [HUDI-2182] Support Compaction Command For Spark Sql (Owner:
pengzhiwei)
Please respond to the thread if you think that I have missed capturing any
of the highlights or blockers for the Hudi 0.9.0 release. For the owners of
these release blockers, can you please provide a specific timeline you are
willing to commit to for finishing these so we can cut an RC?
Thanks,
Udit
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Sivabalan <n....@gmail.com>.
Status update: all release blockers have landed. We are good to go ahead
with the RC work.
On Fri, Aug 13, 2021 at 5:46 PM Udit Mehrotra <ud...@apache.org> wrote:
> Hi Community,
>
> Here is a quick update on 0.9.0 release status. Over the last 10 days we
> made significant progress on the release blockers previously mentioned in
> the thread, thanks to all the owners. Here are the remaining blockers that
> we are currently tracking:
>
> - [HUDI-2305] Add MARKERS.type and fix marker-based rollback
> - [HUDI-2268] Add upgrade and downgrade to and from 0.9.0
> release-blockers
> - [HUDI-2307] When using delete_partition with ds should not rely on the
> primary key
> - [HUDI-2151] Flipping defaults
> - [HUDI-1897] Deltastreamer source for AWS S3
> - [HUDI-2120] [DOC] Update docs about schema in flink sql configuration
> - [HUDI-2119] Ensure the rolled-back instance was previously synced to
> the Metadata Table when syncing a Rollback Instant.
>
> We plan to resolve these soon and cut an RC by *tomorrow (August 14th, 2021)
> end of day PST*. If you have any other blockers that you would like to
> surface for Hudi 0.9.0, feel free to reach out.
>
> Thanks,
> Udit
>
> On Fri, Aug 6, 2021 at 1:53 AM sagar sumit <sa...@gmail.com> wrote:
>
> > Hi Udit, Vinoth
> >
> > End of next week sounds good. Apart from the issues listed, there is one
> > more that we can take in this release:
> > [HUDI-1897] DeltaStreamer Source for AWS S3
> >
> > It's under review and should be closed by early next week.
> >
> > Regards,
> > Sagar
> >
> > On 2021/08/06 00:55:19, Raymond Xu <xu...@gmail.com> wrote:
> > > +1 End of next week
> > >
> > > On Thu, Aug 5, 2021 at 3:06 PM Sivabalan <n....@gmail.com> wrote:
> > >
> > > > Yeah, end of next week sounds good.
> > > >
> > > > Here are the status updates wrt patches I am involved.
> > > >
> > > > Plan to get these in by early next week.
> > > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner:
> pengzhiwei)
> > > > - [HUDI-2250] Bulk insert support for tables w/ primary key
> (Owner:
> > > > Sivabalan)
> > > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table
> (Owner:
> > > > pengzhiwei)
> > > > - [HUDI-1138] Re-implement marker files via timeline server
> (Owner:
> > > > Ethan Guo)
> > > > - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> > > > Sivabalan)
> > > >
> > > > Mid next week:
> > > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With
> > Hudi
> > > > (Owner: pengzhiwei)
> > > >
> > > > Waiting for reviews. Will try to get it in by early next week. If
> we
> > > > couldn't get this in, probably will skip this release.
> > > > - [HUDI-1763] Fixing honoring of Ordering val in
> > > > DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > > >
> > > > Removed from release blockers:
> > > > - [HUDI-1887] Setting default value to false for enabling schema
> > post
> > > > processor (Owner: Sivabalan)
> > > > - [HUDI-1850] Fixing read of a empty table but with failed write
> > (Owner:
> > > > Sivabalan)
> > > >
> > > >
> > > > On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <vi...@apache.org>
> > wrote:
> > > >
> > > > > Any other thoughts? Love to lock this date down sooner than later.
> > > > >
> > > > > Thanks
> > > > > Vinoth
> > > > >
> > > > > On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org>
> > wrote:
> > > > >
> > > > > > Agreed Vinoth. End of next week seems reasonable as a hard
> > deadline for
> > > > > > cutting the RC.
> > > > > >
> > > > > > If anyone thinks otherwise or needs more time, feel free to chime
> > in.
> > > > > >
> > > > > > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vinoth@apache.org
> >
> > > > wrote:
> > > > > >
> > > > > > > Thanks Udit! I propose we set end of next week as a hard
> > deadline for
> > > > > > > cutting the RC. Any thoughts?
> > > > > > >
> > > > > > > A good amount of progress is being made on these blockers, I
> > think.
> > > > > > >
> > > > > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > -Sivabalan
> > > >
> > >
> >
>
--
Regards,
-Sivabalan
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Udit Mehrotra <ud...@apache.org>.
Hi Community,
Here is a quick update on the 0.9.0 release status. Over the last 10 days we
made significant progress on the release blockers previously mentioned in
the thread, thanks to all the owners. Here are the remaining blockers that
we are currently tracking:
- [HUDI-2305] Add MARKERS.type and fix marker-based rollback
- [HUDI-2268] Add upgrade and downgrade to and from 0.9.0
release-blockers
- [HUDI-2307] When using delete_partition with ds should not rely on the
primary key
- [HUDI-2151] Flipping defaults
- [HUDI-1897] Deltastreamer source for AWS S3
- [HUDI-2120] [DOC] Update docs about schema in flink sql configuration
- [HUDI-2119] Ensure the rolled-back instance was previously synced to
the Metadata Table when syncing a Rollback Instant.
We plan to resolve these soon and cut an RC by *tomorrow (August 14th, 2021)
end of day PST*. If you have any other blockers that you would like to
surface for Hudi 0.9.0, feel free to reach out.
Thanks,
Udit
On Fri, Aug 6, 2021 at 1:53 AM sagar sumit <sa...@gmail.com> wrote:
> Hi Udit, Vinoth
>
> End of next week sounds good. Apart from the issues listed, there is one
> more that we can take in this release:
> [HUDI-1897] DeltaStreamer Source for AWS S3
>
> It's under review and should be closed by early next week.
>
> Regards,
> Sagar
>
> On 2021/08/06 00:55:19, Raymond Xu <xu...@gmail.com> wrote:
> > +1 End of next week
> >
> > On Thu, Aug 5, 2021 at 3:06 PM Sivabalan <n....@gmail.com> wrote:
> >
> > > Yeah, end of next week sounds good.
> > >
> > > Here are the status updates wrt patches I am involved.
> > >
> > > Plan to get these in by early next week.
> > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> > > - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner:
> > > Sivabalan)
> > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
> > > pengzhiwei)
> > > - [HUDI-1138] Re-implement marker files via timeline server (Owner:
> > > Ethan Guo)
> > > - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> > > Sivabalan)
> > >
> > > Mid next week:
> > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With
> Hudi
> > > (Owner: pengzhiwei)
> > >
> > > Waiting for reviews. Will try to get it in by early next week. If we
> > > couldn't get this in, probably will skip this release.
> > > - [HUDI-1763] Fixing honoring of Ordering val in
> > > DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > >
> > > Removed from release blockers:
> > > - [HUDI-1887] Setting default value to false for enabling schema
> post
> > > processor (Owner: Sivabalan)
> > > - [HUDI-1850] Fixing read of a empty table but with failed write
> (Owner:
> > > Sivabalan)
> > >
> > >
> > > On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <vi...@apache.org>
> wrote:
> > >
> > > > Any other thoughts? Love to lock this date down sooner than later.
> > > >
> > > > Thanks
> > > > Vinoth
> > > >
> > > > On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org>
> wrote:
> > > >
> > > > > Agreed Vinoth. End of next week seems reasonable as a hard
> deadline for
> > > > > cutting the RC.
> > > > >
> > > > > If anyone thinks otherwise or needs more time, feel free to chime
> in.
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org>
> > > wrote:
> > > > >
> > > > > > Thanks Udit! I propose we set end of next week as a hard
> deadline for
> > > > > > cutting the RC. Any thoughts?
> > > > > >
> > > > > > A good amount of progress is being made on these blockers, I
> think.
> > > > > >
> > > > > >
> > >
> > >
> > > --
> > > Regards,
> > > -Sivabalan
> > >
> >
>
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by sagar sumit <sa...@gmail.com>.
Hi Udit, Vinoth
End of next week sounds good. Apart from the issues listed, there is one more that we can take in this release:
[HUDI-1897] DeltaStreamer Source for AWS S3
It's under review and should be closed by early next week.
Regards,
Sagar
On 2021/08/06 00:55:19, Raymond Xu <xu...@gmail.com> wrote:
> +1 End of next week
>
> On Thu, Aug 5, 2021 at 3:06 PM Sivabalan <n....@gmail.com> wrote:
>
> > Yeah, end of next week sounds good.
> >
> > Here are the status updates wrt patches I am involved.
> >
> > Plan to get these in by early next week.
> > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> > - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner:
> > Sivabalan)
> > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
> > pengzhiwei)
> > - [HUDI-1138] Re-implement marker files via timeline server (Owner:
> > Ethan Guo)
> > - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> > Sivabalan)
> >
> > Mid next week:
> > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi
> > (Owner: pengzhiwei)
> >
> > Waiting for reviews. Will try to get it in by early next week. If we
> > couldn't get this in, probably will skip this release.
> > - [HUDI-1763] Fixing honoring of Ordering val in
> > DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> >
> > Removed from release blockers:
> > - [HUDI-1887] Setting default value to false for enabling schema post
> > processor (Owner: Sivabalan)
> > - [HUDI-1850] Fixing read of a empty table but with failed write (Owner:
> > Sivabalan)
> >
> >
> > On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <vi...@apache.org> wrote:
> >
> > > Any other thoughts? Love to lock this date down sooner than later.
> > >
> > > Thanks
> > > Vinoth
> > >
> > > On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org> wrote:
> > >
> > > > Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> > > > cutting the RC.
> > > >
> > > > If anyone thinks otherwise or needs more time, feel free to chime in.
> > > >
> > > > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org>
> > wrote:
> > > >
> > > > > Thanks Udit! I propose we set end of next week as a hard deadline for
> > > > > cutting the RC. Any thoughts?
> > > > >
> > > > > A good amount of progress is being made on these blockers, I think.
> > > > >
> > > > >
> >
> >
> > --
> > Regards,
> > -Sivabalan
> >
>
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Raymond Xu <xu...@gmail.com>.
+1 End of next week
On Thu, Aug 5, 2021 at 3:06 PM Sivabalan <n....@gmail.com> wrote:
> Yeah, end of next week sounds good.
>
> Here are the status updates for the patches I am involved in.
>
> Plan to get these in by early next week.
> - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner:
> Sivabalan)
> - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
> pengzhiwei)
> - [HUDI-1138] Re-implement marker files via timeline server (Owner:
> Ethan Guo)
> - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> Sivabalan)
>
> Mid next week:
> - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi
> (Owner: pengzhiwei)
>
> Waiting for reviews. Will try to get it in by early next week. If we
> can't get this in, it will probably skip this release.
> - [HUDI-1763] Fixing honoring of Ordering val in
> DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
>
> Removed from release blockers:
> - [HUDI-1887] Setting default value to false for enabling schema post
> processor (Owner: Sivabalan)
> - [HUDI-1850] Fixing read of a empty table but with failed write (Owner:
> Sivabalan)
>
>
> On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <vi...@apache.org> wrote:
>
> > Any other thoughts? I'd love to lock this date down sooner rather than later.
> >
> > Thanks
> > Vinoth
> >
> > On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org> wrote:
> >
> > > Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> > > cutting the RC.
> > >
> > > If anyone thinks otherwise or needs more time, feel free to chime in.
> > >
> > > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org>
> wrote:
> > >
> > > > Thanks Udit! I propose we set end of next week as a hard deadline for
> > > > cutting the RC. Any thoughts?
> > > >
> > > > A good amount of progress is being made on these blockers, I think.
> > > >
> > > >
> > > > On Tue, Aug 3, 2021 at 5:13 PM Udit Mehrotra <ud...@apache.org>
> > wrote:
> > > >
> > > > > Hi Community,
> > > > >
> > > > > As we draw close to doing Hudi 0.9.0 release, I am happy to share a
> > > > summary
> > > > > of the key features/improvements that would be going in the release
> > and
> > > > the
> > > > > current blockers for everyone's visibility.
> > > > >
> > > > > *Highlights*
> > > > >
> > > > > - [HUDI-1729] Asynchronous Hive sync and commits cleaning for
> > Flink
> > > > > writer
> > > > > - [HUDI-1738] Detect and emit deleted records for Flink MOR
> table
> > > > > streaming read
> > > > > - [HUDI-1867] Support streaming reads for Flink COW table
> > > > > - [HUDI-1908] Global index for flink writer
> > > > > - [HUDI-1788] Support Insert Overwrite with Flink Writer
> > > > > - [HUDI-2209] Bulk insert for flink writer
> > > > > - [HUDI-1591] Support querying using non-globbed paths for Hudi
> > > Spark
> > > > > DataSource queries
> > > > > - [HUDI-1591] Partition pruning support for read optimized
> queries
> > > via
> > > > > Hudi Spark DataSource
> > > > > - [HUDI-1415] Register Hudi Table as a Spark DataSource Table
> with
> > > > > metastore. Queries via Spark SQL will be routed through Hudi
> > > > DataSource
> > > > > (instead of InputFormat), thus making it more performant due to
> > > > Spark's
> > > > > native/optimized readers
> > > > > - [HUDI-1591] Partition pruning support for snapshot queries via
> > > Hudi
> > > > > Spark DataSource
> > > > > - [HUDI-1658] DML and DDL support via Spark SQL
> > > > > - [HUDI-1790] Add SqlSource for DeltaStreamer to support
> backfill
> > > use
> > > > > cases:
> > > > > - [HUDI-251] Add JDBC Source support for DeltaStreamer
> > > > > - [HUDI-1910] Support Kafka based checkpointing for
> > > > HoodieDeltaStreamer
> > > > > - [HUDI-1371] Support metadata based listing for Spark
> DataSource
> > > and
> > > > > Spark SQL
> > > > > - [HUDI-2013] [HUDI-1717] [HUDI-2089] [HUDI-2016] Improvements
> to
> > > > > Metadata based listing
> > > > > - [HUDI-89] Introduce a HoodieConfig/ConfigProperty framework to
> > > bring
> > > > > all configs under one roof
> > > > > - [HUDI-2124] Grafana dashboard for Hudi
> > > > > - [HUDI-1104] [HUDI-1105] [HUDI-2009] Improvements to Bulk
> Insert
> > > via
> > > > > row writing
> > > > > - [HUDI-1483] Async clustering for Delta Streamer
> > > > > - [HUDI-2235] Add virtual key support to Hudi
> > > > > - [HUDI-1848] Add support for Hive Metastore in Hive-sync-tool
> > > > > - In addition, there have been significant improvements and bug
> > > fixes
> > > > to
> > > > > improve the overall stability of Flink Hudi integration
> > > > >
> > > > > *Current Blockers*
> > > > >
> > > > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner:
> > pengzhiwei)
> > > > > - [HUDI-1256] Follow on improvements to HFile tables for
> metadata
> > > > based
> > > > > listing (Owner: None)
> > > > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration
> With
> > > > Hudi
> > > > > (Owner: pengzhiwei)
> > > > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table
> > (Owner:
> > > > > pengzhiwei)
> > > > > - [HUDI-1138] Re-implement marker files via timeline server
> > (Owner:
> > > > > Ethan Guo)
> > > > > - [HUDI-1985] Website redesign implementation (Owner: Vinoth
> > > > > Govindarajan)
> > > > > - [HUDI-2232] MERGE INTO fails with table having nested struct
> > > (Owner:
> > > > > pengzhiwei)
> > > > > - [HUDI-1468] incremental read support with clustering (Owner:
> > > Liwei)
> > > > > - [HUDI-2250] Bulk insert support for tables w/ primary key
> > (Owner:
> > > > > None)
> > > > > - [HUDI-2222] [SQL] Test catalog integration (Owner: Sagar
> Sumit)
> > > > > - [HUDI-2221] [SQL] Functionality testing with Spark 2 (Owner:
> > Sagar
> > > > > Sumit)
> > > > > - [HUDI-1887] Setting default value to false for enabling schema
> > > post
> > > > > processor (Owner: Sivabalan)
> > > > > - [HUDI-1850] Fixing read of a empty table but with failed write
> > > > (Owner:
> > > > > Sivabalan)
> > > > > - [HUDI-2151] Enable defaults for out of box performance (Owner:
> > > Udit
> > > > > Mehrotra)
> > > > > - [HUDI-2119] Ensure the rolled-back instance was previously
> > synced
> > > to
> > > > > the Metadata Table when syncing a Rollback Instant (Owner:
> > Prashant
> > > > > Wason)
> > > > > - [HUDI-1458] Support custom clustering strategies and preserve
> > > commit
> > > > > time to support incremental read (Owner: Satish Kotha)
> > > > > - [HUDI-1763] Fixing honoring of Ordering val in
> > > > > DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > > > > - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> > > > > Sivabalan)
> > > > > - [HUDI-2120] [DOC] Update docs about schema in flink sql
> > > > configuration
> > > > > (Owner: Xianghu Wang)
> > > > > - [HUDI-2182] Support Compaction Command For Spark Sql (Owner:
> > > > > pengzhiwei)
> > > > >
> > > > > Please respond to the thread if you think that I have missed
> > capturing
> > > > any
> > > > > of the highlights or blockers for Hudi 0.9.0 release. For the
> owners
> > of
> > > > > these release blockers, can you please provide a specific timeline
> > you
> > > > are
> > > > > willing to commit to for finishing these so we can cut an RC ?
> > > > >
> > > > > Thanks,
> > > > > Udit
> > > > >
> > > >
> > >
> >
>
>
> --
> Regards,
> -Sivabalan
>
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Sivabalan <n....@gmail.com>.
Yeah, end of next week sounds good.
Here are the status updates for the patches I am involved in.
Plan to get these in by early next week.
- [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
- [HUDI-2250] Bulk insert support for tables w/ primary key (Owner:
Sivabalan)
- [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
pengzhiwei)
- [HUDI-1138] Re-implement marker files via timeline server (Owner:
Ethan Guo)
- [HUDI-1129] Improving schema evolution support in hudi (Owner:
Sivabalan)
Mid next week:
- [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi
(Owner: pengzhiwei)
Waiting for reviews. Will try to get it in by early next week. If we
can't get this in, it will probably skip this release.
- [HUDI-1763] Fixing honoring of Ordering val in
DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
Removed from release blockers:
- [HUDI-1887] Setting default value to false for enabling schema post
processor (Owner: Sivabalan)
- [HUDI-1850] Fixing read of a empty table but with failed write (Owner:
Sivabalan)
On Thu, Aug 5, 2021 at 11:17 AM Vinoth Chandar <vi...@apache.org> wrote:
> Any other thoughts? I'd love to lock this date down sooner rather than later.
>
> Thanks
> Vinoth
>
> On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org> wrote:
>
> > Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> > cutting the RC.
> >
> > If anyone thinks otherwise or needs more time, feel free to chime in.
> >
> > On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org> wrote:
> >
> > > Thanks Udit! I propose we set end of next week as a hard deadline for
> > > cutting the RC. Any thoughts?
> > >
> > > A good amount of progress is being made on these blockers, I think.
> > >
> > >
> > > On Tue, Aug 3, 2021 at 5:13 PM Udit Mehrotra <ud...@apache.org>
> wrote:
> > >
> > > > Hi Community,
> > > >
> > > > As we draw close to doing Hudi 0.9.0 release, I am happy to share a
> > > summary
> > > > of the key features/improvements that would be going in the release
> and
> > > the
> > > > current blockers for everyone's visibility.
> > > >
> > > > *Highlights*
> > > >
> > > > - [HUDI-1729] Asynchronous Hive sync and commits cleaning for
> Flink
> > > > writer
> > > > - [HUDI-1738] Detect and emit deleted records for Flink MOR table
> > > > streaming read
> > > > - [HUDI-1867] Support streaming reads for Flink COW table
> > > > - [HUDI-1908] Global index for flink writer
> > > > - [HUDI-1788] Support Insert Overwrite with Flink Writer
> > > > - [HUDI-2209] Bulk insert for flink writer
> > > > - [HUDI-1591] Support querying using non-globbed paths for Hudi
> > Spark
> > > > DataSource queries
> > > > - [HUDI-1591] Partition pruning support for read optimized queries
> > via
> > > > Hudi Spark DataSource
> > > > - [HUDI-1415] Register Hudi Table as a Spark DataSource Table with
> > > > metastore. Queries via Spark SQL will be routed through Hudi
> > > DataSource
> > > > (instead of InputFormat), thus making it more performant due to
> > > Spark's
> > > > native/optimized readers
> > > > - [HUDI-1591] Partition pruning support for snapshot queries via
> > Hudi
> > > > Spark DataSource
> > > > - [HUDI-1658] DML and DDL support via Spark SQL
> > > > - [HUDI-1790] Add SqlSource for DeltaStreamer to support backfill
> > use
> > > > cases:
> > > > - [HUDI-251] Add JDBC Source support for DeltaStreamer
> > > > - [HUDI-1910] Support Kafka based checkpointing for
> > > HoodieDeltaStreamer
> > > > - [HUDI-1371] Support metadata based listing for Spark DataSource
> > and
> > > > Spark SQL
> > > > - [HUDI-2013] [HUDI-1717] [HUDI-2089] [HUDI-2016] Improvements to
> > > > Metadata based listing
> > > > - [HUDI-89] Introduce a HoodieConfig/ConfigProperty framework to
> > bring
> > > > all configs under one roof
> > > > - [HUDI-2124] Grafana dashboard for Hudi
> > > > - [HUDI-1104] [HUDI-1105] [HUDI-2009] Improvements to Bulk Insert
> > via
> > > > row writing
> > > > - [HUDI-1483] Async clustering for Delta Streamer
> > > > - [HUDI-2235] Add virtual key support to Hudi
> > > > - [HUDI-1848] Add support for Hive Metastore in Hive-sync-tool
> > > > - In addition, there have been significant improvements and bug
> > fixes
> > > to
> > > > improve the overall stability of Flink Hudi integration
> > > >
> > > > *Current Blockers*
> > > >
> > > > - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner:
> pengzhiwei)
> > > > - [HUDI-1256] Follow on improvements to HFile tables for metadata
> > > based
> > > > listing (Owner: None)
> > > > - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With
> > > Hudi
> > > > (Owner: pengzhiwei)
> > > > - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table
> (Owner:
> > > > pengzhiwei)
> > > > - [HUDI-1138] Re-implement marker files via timeline server
> (Owner:
> > > > Ethan Guo)
> > > > - [HUDI-1985] Website redesign implementation (Owner: Vinoth
> > > > Govindarajan)
> > > > - [HUDI-2232] MERGE INTO fails with table having nested struct
> > (Owner:
> > > > pengzhiwei)
> > > > - [HUDI-1468] incremental read support with clustering (Owner:
> > Liwei)
> > > > - [HUDI-2250] Bulk insert support for tables w/ primary key
> (Owner:
> > > > None)
> > > > - [HUDI-2222] [SQL] Test catalog integration (Owner: Sagar Sumit)
> > > > - [HUDI-2221] [SQL] Functionality testing with Spark 2 (Owner:
> Sagar
> > > > Sumit)
> > > > - [HUDI-1887] Setting default value to false for enabling schema
> > post
> > > > processor (Owner: Sivabalan)
> > > > - [HUDI-1850] Fixing read of a empty table but with failed write
> > > (Owner:
> > > > Sivabalan)
> > > > - [HUDI-2151] Enable defaults for out of box performance (Owner:
> > Udit
> > > > Mehrotra)
> > > > - [HUDI-2119] Ensure the rolled-back instance was previously
> synced
> > to
> > > > the Metadata Table when syncing a Rollback Instant (Owner:
> Prashant
> > > > Wason)
> > > > - [HUDI-1458] Support custom clustering strategies and preserve
> > commit
> > > > time to support incremental read (Owner: Satish Kotha)
> > > > - [HUDI-1763] Fixing honoring of Ordering val in
> > > > DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> > > > - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> > > > Sivabalan)
> > > > - [HUDI-2120] [DOC] Update docs about schema in flink sql
> > > configuration
> > > > (Owner: Xianghu Wang)
> > > > - [HUDI-2182] Support Compaction Command For Spark Sql (Owner:
> > > > pengzhiwei)
> > > >
> > > > Please respond to the thread if you think that I have missed
> capturing
> > > any
> > > > of the highlights or blockers for Hudi 0.9.0 release. For the owners
> of
> > > > these release blockers, can you please provide a specific timeline
> you
> > > are
> > > > willing to commit to for finishing these so we can cut an RC ?
> > > >
> > > > Thanks,
> > > > Udit
> > > >
> > >
> >
>
--
Regards,
-Sivabalan
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Vinoth Chandar <vi...@apache.org>.
Any other thoughts? I'd love to lock this date down sooner rather than later.
Thanks
Vinoth
On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org> wrote:
> Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> cutting the RC.
>
> If anyone thinks otherwise or needs more time, feel free to chime in.
>
> On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org> wrote:
>
> > Thanks Udit! I propose we set end of next week as a hard deadline for
> > cutting the RC. Any thoughts?
> >
> > A good amount of progress is being made on these blockers, I think.
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Vinoth Chandar <vi...@apache.org>.
Any other thoughts? Would love to lock this date down sooner rather than later.
Thanks
Vinoth
On Tue, Aug 3, 2021 at 11:35 PM Udit Mehrotra <ud...@apache.org> wrote:
> Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> cutting the RC.
>
> If anyone thinks otherwise or needs more time, feel free to chime in.
>
> On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org> wrote:
>
> > Thanks Udit! I propose we set end of next week as a hard deadline for
> > cutting the RC. Any thoughts?
> >
> > A good amount of progress is being made on these blockers, I think.
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Danny Chan <da...@apache.org>.
HUDI-2170 needs to be included: it fixes the problem that the preCombine
field is ignored when merging in the COW write and MOR read code paths.
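As a rough illustration of what ignoring the preCombine field means, here is
a minimal sketch. It is not Hudi code: the Record class, the ts ordering field,
and both merge functions are hypothetical names made up for this example, and
the tie-breaking rule (the incoming record wins on equal ordering values) is an
assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str    # record key (hypothetical)
    ts: int     # stands in for the preCombine (ordering) field (hypothetical)
    value: str


def merge_ignoring_precombine(existing: Record, incoming: Record) -> Record:
    # Behaviour being reported: the incoming record always wins,
    # even when its ordering value is older than the stored one.
    return incoming


def merge_with_precombine(existing: Record, incoming: Record) -> Record:
    # Intended behaviour: keep the record with the larger ordering value.
    # Assumption: on a tie the incoming record wins.
    return incoming if incoming.ts >= existing.ts else existing


stored = Record(key="id-1", ts=10, value="new")
late_arrival = Record(key="id-1", ts=5, value="stale")

print(merge_ignoring_precombine(stored, late_arrival).value)  # stale
print(merge_with_precombine(stored, late_arrival).value)      # new
```

With a late-arriving record, the first function overwrites newer data with
stale data, while the second keeps the newer value.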
HUDI-1771: we will try to get a rough version in so that we can gather more
feedback from users; this is also a strong request from Chinese users.
Best,
Danny Chan
On Wed, Aug 4, 2021 at 2:35 PM Udit Mehrotra <ud...@apache.org> wrote:
> Agreed Vinoth. End of next week seems reasonable as a hard deadline for
> cutting the RC.
>
> If anyone thinks otherwise or needs more time, feel free to chime in.
>
> On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org> wrote:
>
>> Thanks Udit! I propose we set end of next week as a hard deadline for
>> cutting the RC. Any thoughts?
>>
>> A good amount of progress is being made on these blockers, I think.
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Udit Mehrotra <ud...@apache.org>.
Agreed Vinoth. End of next week seems reasonable as a hard deadline for
cutting the RC.
If anyone thinks otherwise or needs more time, feel free to chime in.
On Tue, Aug 3, 2021 at 8:10 PM Vinoth Chandar <vi...@apache.org> wrote:
> Thanks Udit! I propose we set end of next week as a hard deadline for
> cutting the RC. Any thoughts?
>
> A good amount of progress is being made on these blockers, I think.
Re: [DISCUSS] Hudi 0.9.0 Release
Posted by Vinoth Chandar <vi...@apache.org>.
Thanks Udit! I propose we set end of next week as a hard deadline for
cutting the RC. Any thoughts?
A good amount of progress is being made on these blockers, I think.
On Tue, Aug 3, 2021 at 5:13 PM Udit Mehrotra <ud...@apache.org> wrote:
> Hi Community,
>
> As we draw close to doing Hudi 0.9.0 release, I am happy to share a summary
> of the key features/improvements that would be going in the release and the
> current blockers for everyone's visibility.
>
> *Highlights*
>
> - [HUDI-1729] Asynchronous Hive sync and commits cleaning for Flink
> writer
> - [HUDI-1738] Detect and emit deleted records for Flink MOR table
> streaming read
> - [HUDI-1867] Support streaming reads for Flink COW table
> - [HUDI-1908] Global index for flink writer
> - [HUDI-1788] Support Insert Overwrite with Flink Writer
> - [HUDI-2209] Bulk insert for flink writer
> - [HUDI-1591] Support querying using non-globbed paths for Hudi Spark
> DataSource queries
> - [HUDI-1591] Partition pruning support for read optimized queries via
> Hudi Spark DataSource
> - [HUDI-1415] Register Hudi Table as a Spark DataSource Table with
> metastore. Queries via Spark SQL will be routed through Hudi DataSource
> (instead of InputFormat), thus making it more performant due to Spark's
> native/optimized readers
> - [HUDI-1591] Partition pruning support for snapshot queries via Hudi
> Spark DataSource
> - [HUDI-1658] DML and DDL support via Spark SQL
> - [HUDI-1790] Add SqlSource for DeltaStreamer to support backfill use
> cases
> - [HUDI-251] Add JDBC Source support for DeltaStreamer
> - [HUDI-1910] Support Kafka based checkpointing for HoodieDeltaStreamer
> - [HUDI-1371] Support metadata based listing for Spark DataSource and
> Spark SQL
> - [HUDI-2013] [HUDI-1717] [HUDI-2089] [HUDI-2016] Improvements to
> Metadata based listing
> - [HUDI-89] Introduce a HoodieConfig/ConfigProperty framework to bring
> all configs under one roof
> - [HUDI-2124] Grafana dashboard for Hudi
> - [HUDI-1104] [HUDI-1105] [HUDI-2009] Improvements to Bulk Insert via
> row writing
> - [HUDI-1483] Async clustering for Delta Streamer
> - [HUDI-2235] Add virtual key support to Hudi
> - [HUDI-1848] Add support for Hive Metastore in Hive-sync-tool
> - In addition, there have been significant improvements and bug fixes to
> improve the overall stability of Flink Hudi integration
>
> *Current Blockers*
>
> - [HUDI-2208] Support Bulk Insert For Spark Sql (Owner: pengzhiwei)
> - [HUDI-1256] Follow on improvements to HFile tables for metadata based
> listing (Owner: None)
> - [HUDI-2063] Add Doc For Spark Sql (DML and DDL) integration With Hudi
> (Owner: pengzhiwei)
> - [HUDI-1842] Spark Sql Support For The Exists Hoodie Table (Owner:
> pengzhiwei)
> - [HUDI-1138] Re-implement marker files via timeline server (Owner:
> Ethan Guo)
> - [HUDI-1985] Website redesign implementation (Owner: Vinoth
> Govindarajan)
> - [HUDI-2232] MERGE INTO fails with table having nested struct (Owner:
> pengzhiwei)
> - [HUDI-1468] incremental read support with clustering (Owner: Liwei)
> - [HUDI-2250] Bulk insert support for tables w/ primary key (Owner:
> None)
> - [HUDI-2222] [SQL] Test catalog integration (Owner: Sagar Sumit)
> - [HUDI-2221] [SQL] Functionality testing with Spark 2 (Owner: Sagar
> Sumit)
> - [HUDI-1887] Setting default value to false for enabling schema post
> processor (Owner: Sivabalan)
> - [HUDI-1850] Fixing read of an empty table but with failed write (Owner:
> Sivabalan)
> - [HUDI-2151] Enable defaults for out of box performance (Owner: Udit
> Mehrotra)
> - [HUDI-2119] Ensure the rolled-back instance was previously synced to
> the Metadata Table when syncing a Rollback Instant (Owner: Prashant
> Wason)
> - [HUDI-1458] Support custom clustering strategies and preserve commit
> time to support incremental read (Owner: Satish Kotha)
> - [HUDI-1763] Fixing honoring of Ordering val in
> DefaultHoodieRecordPayload.preCombine (Owner: Sivabalan)
> - [HUDI-1129] Improving schema evolution support in hudi (Owner:
> Sivabalan)
> - [HUDI-2120] [DOC] Update docs about schema in flink sql configuration
> (Owner: Xianghu Wang)
> - [HUDI-2182] Support Compaction Command For Spark Sql (Owner:
> pengzhiwei)
>
> Please respond to the thread if you think that I have missed capturing any
> of the highlights or blockers for Hudi 0.9.0 release. For the owners of
> these release blockers, can you please provide a specific timeline you are
> willing to commit to for finishing these so we can cut an RC?
>
> Thanks,
> Udit
>