You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by Yufan Sheng <sy...@gmail.com> on 2023/04/13 14:22:36 UTC

[DISCUSS] Release of Pulsar Connector v4.0.0

Hello all,

I would like to discuss the release of Pulsar Connectors v4.0.0.

A lot of new features and bug fixes have been added to
flink-connector-pulsar 4.0.0 release. It also introduces some break
changes in this release.

Improvements

[FLINK-28083] PulsarSource works with object-reusing DeserializationSchema.
This commit will reuse the Message instance created by the Pulsar
client. No need to create duplicate objects which improves the source
performance.

[FLINK-29709] Bump the Pulsar to latest 2.10.2.
Bundle the latest Pulsar client which is more stable on transaction.

[FLINK-30413] Drop subscription support, remove unordered consumption.
We won’t support the Shared and Key_Shared subscription in this
release. It’s buggy and low performance. All the subscriptions created
by connector will be Exclusive by default.

[FLINK-28870] Improve the Pulsar source performance when meeting small
data rates.
The old connector will hang for 10 seconds when consuming the messages
at a small rate. We have removed this limitation and made it faster to
switch to another topic partition.

[FLINK-28351] Add dynamic sink topic support for Pulsar connector.
Writing to a non-existed topic is supported in Pulsar. We added this
support since this release.

[FLINK-30654][Connector/Pulsar] Force consumption from StartCursor
every time the application starts.
StartCursor is used only when the subscription does not exist in
Pulsar. We now support using the position from the StartCursor every
time the application starts.

New Features

[FLINK-26027] Expose Pulsar producer metrics and add FLIP-33 sink metrics.
You can monitor all the Pulsar client metrics now in this release. We
also support the FLIP-33 sink metrics by default which you can have a
clear view on how many messages have been written.

[FLINK-25686] Support schema evolution for Pulsar source.
Schema evolution has been supported in the Pulsar sink. We add a new
option for enabling the schema evolution support in source.

[FLINK-28082] Add end-to-end encryption support for Pulsar connector.
End-to-end encryption can encrypt the messages from sink to source in
the connector with an extra signature check. No one can see the real
content except the connector.

[FLINK-30689] Support sending message bytes with extra schema check.
Connector is sending the messages in bytes by default. We add this
feature for supporting validating the message bytes with the latest
schema on Pulsar.

[FLINK-30622] Support consuming messages with schema auto-detection from Pulsar.
Some topics may contain multiple types of messages. Use this feature
to convert these messages into a more general interface.

BUG Fixes

[FLINK-30552] Drop next message id calculation, use resetCursor api to
exclude the given message.
The old connector has a crucial bug on consuming messages which are
sent to Pulsar in batch. We have fixed the bug and supported all the
message types since this release.

Breaking Changes

Some classes and interfaces which are annotated with @PublicEvolving
have been changed in this release. And this may affect the end-users.

Pulsar Sink

PulsarMessage should be created by using the PulsarMessage.builder()
method. The PulsarMessageBuilder can’t be created directly like
before.

The route method in the TopicRouter interface should return a
TopicPartition instead of a topic name.

All the static methods in PulsarSerializationSchema have been removed.
You should set them (Schema, SerializationSchema) by directly using
PulsarSinkBuilder.setSerializationSchema().

Pulsar Source

The subscription type setting has been removed from the connector. All
the connector’s subscriptions will be created in Exclusive type. The
checkpoint for Shared and Key_Shared subscriptions couldn’t be used
since this release.

The setSubscriptionType method has been removed from the PulsarSourceBuilder.

The fromMessageTime method has been removed from the StartCursor.
Pulsar doesn’t support seeking from message time.

The RangeGenerator interface removed the deprecated open method and
the keyShareMode method. Because we don’t support the Key_Shared
subscription now. The RangeGenerator is only used in Exclusive
subscription to filter the desired keys.

TopicPartition only exposes two constructors with the @PublicEvolving
annotation now.

All the static methods in PulsarDeserializationSchema have been
removed. You should set them (Schema, SerializationSchema) by directly
using PulsarSourceBuilder.setDeserializationSchema().

The first argument of the open method in PulsarDeserializationSchema
has been changed from InitializationContext to a new
PulsarInitializationContext interface.

Weijie Guo will be the release manager.

Thanks,
Yufan

Re: [DISCUSS] Release of Pulsar Connector v4.0.0

Posted by Yufan Sheng <sy...@gmail.com>.
Aha, sorry. I'll follow the release in that thread.

On Thu, Apr 13, 2023 at 10:25 PM Martijn Visser
<ma...@apache.org> wrote:
>
> Hi Yufan,
>
> I think there's a slight miscommunication somewhere, because there's
> already a Pulsar 4.0.0 release candidate out. See
> https://lists.apache.org/thread/d6nd7xx11fd67z5h56sv0tlo5pvlm94o
>
> Do let me know if that's a good one, or if this should be cancelled.
>
> Best regards,
>
> Martijn
>
> On Thu, Apr 13, 2023 at 4:23 PM Yufan Sheng <sy...@gmail.com> wrote:
>
> > Hello all,
> >
> > I would like to discuss the release of Pulsar Connectors v4.0.0.
> >
> > A lot of new features and bug fixes have been added to
> > flink-connector-pulsar 4.0.0 release. It also introduces some break
> > changes in this release.
> >
> > Improvements
> >
> > [FLINK-28083] PulsarSource works with object-reusing DeserializationSchema.
> > This commit will reuse the Message instance created by the Pulsar
> > client. No need to create duplicate objects which improves the source
> > performance.
> >
> > [FLINK-29709] Bump the Pulsar to latest 2.10.2.
> > Bundle the latest Pulsar client which is more stable on transaction.
> >
> > [FLINK-30413] Drop subscription support, remove unordered consumption.
> > We won’t support the Shared and Key_Shared subscription in this
> > release. It’s buggy and low performance. All the subscriptions created
> > by connector will be Exclusive by default.
> >
> > [FLINK-28870] Improve the Pulsar source performance when meeting small
> > data rates.
> > The old connector will hang for 10 seconds when consuming the messages
> > at a small rate. We have removed this limitation and made it faster to
> > switch to another topic partition.
> >
> > [FLINK-28351] Add dynamic sink topic support for Pulsar connector.
> > Writing to a non-existed topic is supported in Pulsar. We added this
> > support since this release.
> >
> > [FLINK-30654][Connector/Pulsar] Force consumption from StartCursor
> > every time the application starts.
> > StartCursor is used only when the subscription does not exist in
> > Pulsar. We now support using the position from the StartCursor every
> > time the application starts.
> >
> > New Features
> >
> > [FLINK-26027] Expose Pulsar producer metrics and add FLIP-33 sink metrics.
> > You can monitor all the Pulsar client metrics now in this release. We
> > also support the FLIP-33 sink metrics by default which you can have a
> > clear view on how many messages have been written.
> >
> > [FLINK-25686] Support schema evolution for Pulsar source.
> > Schema evolution has been supported in the Pulsar sink. We add a new
> > option for enabling the schema evolution support in source.
> >
> > [FLINK-28082] Add end-to-end encryption support for Pulsar connector.
> > End-to-end encryption can encrypt the messages from sink to source in
> > the connector with an extra signature check. No one can see the real
> > content except the connector.
> >
> > [FLINK-30689] Support sending message bytes with extra schema check.
> > Connector is sending the messages in bytes by default. We add this
> > feature for supporting validating the message bytes with the latest
> > schema on Pulsar.
> >
> > [FLINK-30622] Support consuming messages with schema auto-detection from
> > Pulsar.
> > Some topics may contain multiple types of messages. Use this feature
> > to convert these messages into a more general interface.
> >
> > BUG Fixes
> >
> > [FLINK-30552] Drop next message id calculation, use resetCursor api to
> > exclude the given message.
> > The old connector has a crucial bug on consuming messages which are
> > sent to Pulsar in batch. We have fixed the bug and supported all the
> > message types since this release.
> >
> > Breaking Changes
> >
> > Some classes and interfaces which are annotated with @PublicEvolving
> > have been changed in this release. And this may affect the end-users.
> >
> > Pulsar Sink
> >
> > PulsarMessage should be created by using the PulsarMessage.builder()
> > method. The PulsarMessageBuilder can’t be created directly like
> > before.
> >
> > The route method in the TopicRouter interface should return a
> > TopicPartition instead of a topic name.
> >
> > All the static methods in PulsarSerializationSchema have been removed.
> > You should set them (Schema, SerializationSchema) by directly using
> > PulsarSinkBuilder.setSerializationSchema().
> >
> > Pulsar Source
> >
> > The subscription type setting has been removed from the connector. All
> > the connector’s subscriptions will be created in Exclusive type. The
> > checkpoint for Shared and Key_Shared subscriptions couldn’t be used
> > since this release.
> >
> > The setSubscriptionType method has been removed from the
> > PulsarSourceBuilder.
> >
> > The fromMessageTime method has been removed from the StartCursor.
> > Pulsar doesn’t support seeking from message time.
> >
> > The RangeGenerator interface removed the deprecated open method and
> > the keyShareMode method. Because we don’t support the Key_Shared
> > subscription now. The RangeGenerator is only used in Exclusive
> > subscription to filter the desired keys.
> >
> > TopicPartition only exposes two constructors with the @PublicEvolving
> > annotation now.
> >
> > All the static methods in PulsarDeserializationSchema have been
> > removed. You should set them (Schema, SerializationSchema) by directly
> > using PulsarSourceBuilder.setDeserializationSchema().
> >
> > The first argument of the open method in PulsarDeserializationSchema
> > has been changed from InitializationContext to a new
> > PulsarInitializationContext interface.
> >
> > Weijie Guo will be the release manager.
> >
> > Thanks,
> > Yufan
> >

Re: [DISCUSS] Release of Pulsar Connector v4.0.0

Posted by Martijn Visser <ma...@apache.org>.
Hi Yufan,

I think there's a slight miscommunication somewhere, because there's
already a Pulsar 4.0.0 release candidate out. See
https://lists.apache.org/thread/d6nd7xx11fd67z5h56sv0tlo5pvlm94o

Do let me know if that's a good one, or if this should be cancelled.

Best regards,

Martijn

On Thu, Apr 13, 2023 at 4:23 PM Yufan Sheng <sy...@gmail.com> wrote:

> Hello all,
>
> I would like to discuss the release of Pulsar Connectors v4.0.0.
>
> A lot of new features and bug fixes have been added to
> flink-connector-pulsar 4.0.0 release. It also introduces some break
> changes in this release.
>
> Improvements
>
> [FLINK-28083] PulsarSource works with object-reusing DeserializationSchema.
> This commit will reuse the Message instance created by the Pulsar
> client. No need to create duplicate objects which improves the source
> performance.
>
> [FLINK-29709] Bump the Pulsar to latest 2.10.2.
> Bundle the latest Pulsar client which is more stable on transaction.
>
> [FLINK-30413] Drop subscription support, remove unordered consumption.
> We won’t support the Shared and Key_Shared subscription in this
> release. It’s buggy and low performance. All the subscriptions created
> by connector will be Exclusive by default.
>
> [FLINK-28870] Improve the Pulsar source performance when meeting small
> data rates.
> The old connector will hang for 10 seconds when consuming the messages
> at a small rate. We have removed this limitation and made it faster to
> switch to another topic partition.
>
> [FLINK-28351] Add dynamic sink topic support for Pulsar connector.
> Writing to a non-existed topic is supported in Pulsar. We added this
> support since this release.
>
> [FLINK-30654][Connector/Pulsar] Force consumption from StartCursor
> every time the application starts.
> StartCursor is used only when the subscription does not exist in
> Pulsar. We now support using the position from the StartCursor every
> time the application starts.
>
> New Features
>
> [FLINK-26027] Expose Pulsar producer metrics and add FLIP-33 sink metrics.
> You can monitor all the Pulsar client metrics now in this release. We
> also support the FLIP-33 sink metrics by default which you can have a
> clear view on how many messages have been written.
>
> [FLINK-25686] Support schema evolution for Pulsar source.
> Schema evolution has been supported in the Pulsar sink. We add a new
> option for enabling the schema evolution support in source.
>
> [FLINK-28082] Add end-to-end encryption support for Pulsar connector.
> End-to-end encryption can encrypt the messages from sink to source in
> the connector with an extra signature check. No one can see the real
> content except the connector.
>
> [FLINK-30689] Support sending message bytes with extra schema check.
> Connector is sending the messages in bytes by default. We add this
> feature for supporting validating the message bytes with the latest
> schema on Pulsar.
>
> [FLINK-30622] Support consuming messages with schema auto-detection from
> Pulsar.
> Some topics may contain multiple types of messages. Use this feature
> to convert these messages into a more general interface.
>
> BUG Fixes
>
> [FLINK-30552] Drop next message id calculation, use resetCursor api to
> exclude the given message.
> The old connector has a crucial bug on consuming messages which are
> sent to Pulsar in batch. We have fixed the bug and supported all the
> message types since this release.
>
> Breaking Changes
>
> Some classes and interfaces which are annotated with @PublicEvolving
> have been changed in this release. And this may affect the end-users.
>
> Pulsar Sink
>
> PulsarMessage should be created by using the PulsarMessage.builder()
> method. The PulsarMessageBuilder can’t be created directly like
> before.
>
> The route method in the TopicRouter interface should return a
> TopicPartition instead of a topic name.
>
> All the static methods in PulsarSerializationSchema have been removed.
> You should set them (Schema, SerializationSchema) by directly using
> PulsarSinkBuilder.setSerializationSchema().
>
> Pulsar Source
>
> The subscription type setting has been removed from the connector. All
> the connector’s subscriptions will be created in Exclusive type. The
> checkpoint for Shared and Key_Shared subscriptions couldn’t be used
> since this release.
>
> The setSubscriptionType method has been removed from the
> PulsarSourceBuilder.
>
> The fromMessageTime method has been removed from the StartCursor.
> Pulsar doesn’t support seeking from message time.
>
> The RangeGenerator interface removed the deprecated open method and
> the keyShareMode method. Because we don’t support the Key_Shared
> subscription now. The RangeGenerator is only used in Exclusive
> subscription to filter the desired keys.
>
> TopicPartition only exposes two constructors with the @PublicEvolving
> annotation now.
>
> All the static methods in PulsarDeserializationSchema have been
> removed. You should set them (Schema, SerializationSchema) by directly
> using PulsarSourceBuilder.setDeserializationSchema().
>
> The first argument of the open method in PulsarDeserializationSchema
> has been changed from InitializationContext to a new
> PulsarInitializationContext interface.
>
> Weijie Guo will be the release manager.
>
> Thanks,
> Yufan
>