Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/01/24 19:04:50 UTC

[GitHub] [flink] fapaul edited a comment on pull request #18412: [FLINK-25696][datastream] Introduce MetadataPublisher interface to SinkWriter

fapaul edited a comment on pull request #18412:
URL: https://github.com/apache/flink/pull/18412#issuecomment-1020262180


   > I don't think we need to put it inside the mailbox; that would be very performance intensive since it is a per-record operation. A callback consumer with asynchronous processing seems reasonable to me.
   
   From the Kafka connector side, the subscriber is not updated on every record: it is only updated for a bulk of records when the KafkaProducer is flushed (either during a checkpoint or when the internal buffer size is reached).
   I would definitely prefer to handle the consumer in the mailbox rather than on a Kafka thread. Running it on Kafka threads could have surprising effects on overall pipeline stability, e.g. shutdown is blocked because the producer cannot be stopped while it is executing the metadata consumer.
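   The hand-off argued for above can be sketched with plain JDK classes (no Flink or Kafka dependencies; `mailbox` here is just a single-threaded executor standing in for Flink's MailboxExecutor, and the names are illustrative assumptions, not the PR's actual API):

   ```java
   import java.util.concurrent.*;

   public class MailboxHandoffSketch {
       /** Runs the hand-off and returns the offset the mailbox thread observed. */
       static long runSketch() throws Exception {
           // Stand-in for Flink's mailbox: all subscriber code runs on this one thread.
           ExecutorService mailbox = Executors.newSingleThreadExecutor();
           CompletableFuture<Long> observed = new CompletableFuture<>();

           // Stand-in for the Kafka producer's I/O thread finishing a flush:
           // it must not run the subscriber itself, only enqueue a "mail".
           Thread kafkaIoThread = new Thread(() -> {
               long flushedOffset = 42L;
               // The subscriber body executes later, on the mailbox thread,
               // so a slow subscriber cannot block the producer's shutdown.
               mailbox.execute(() -> observed.complete(flushedOffset));
           });
           kafkaIoThread.start();
           kafkaIoThread.join();

           long offset = observed.get(5, TimeUnit.SECONDS);
           mailbox.shutdown();
           return offset;
       }

       public static void main(String[] args) throws Exception {
           System.out.println("mailbox thread saw offset " + runSketch());
       }
   }
   ```

   The point of the hand-off is that the Kafka I/O thread only enqueues work; it never waits on, or executes, user code.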
   
   Regarding adding a method to the InitContext, I think that is okay. Do you think there will ever be multiple subscribers? Maybe it is safer to add a list right away instead of an optional.
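   A minimal sketch of the list-based shape suggested above (the method names and types here are assumptions for illustration, not the actual Flink API under discussion):

   ```java
   import java.util.*;
   import java.util.function.Consumer;

   public class MultiSubscriberSketch {
       // A list costs nothing with a single subscriber, but avoids an
       // API break if a second subscriber ever appears.
       private final List<Consumer<Long>> metadataSubscribers = new ArrayList<>();

       void registerMetadataSubscriber(Consumer<Long> subscriber) {
           metadataSubscribers.add(subscriber);
       }

       void publish(long offset) {
           // Notify every registered subscriber of the flushed offset.
           for (Consumer<Long> s : metadataSubscribers) {
               s.accept(offset);
           }
       }

       public static void main(String[] args) {
           MultiSubscriberSketch sketch = new MultiSubscriberSketch();
           List<Long> seen = new ArrayList<>();
           sketch.registerMetadataSubscriber(seen::add);
           sketch.registerMetadataSubscriber(seen::add);
           sketch.publish(7L);
           System.out.println(seen);
       }
   }
   ```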
   
   I am still a bit surprised that the `TableStoreSink` reads all metadata offsets regardless of whether they are committed in Kafka or not.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org