Posted to user@flink.apache.org by Salva Alcántara <sa...@gmail.com> on 2022/07/08 03:44:11 UTC

Configure a kafka source dynamically (???)

When using the Kafka connector, you need to set the topics in advance (by
giving a list of them or a regex pattern for the topic names). Imagine a
situation where the topics are not known in advance. Of course, you could
use an all-pass regex pattern to match all the topics in the broker, but
what I want to know is whether it's possible to read from new topics on
demand.
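
To make the regex option concrete, here is a minimal sketch in plain Java
(no Flink dependency) of how a pattern selects topics from whatever the
broker reports. In the connector, the two options correspond to
setTopics(...) and setTopicPattern(...) on the KafkaSource builder. The
class name, the method name, the topic names, and the assumption that the
pattern must match the whole topic name are all illustrative.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TopicPatternSketch {

    // Returns, in sorted order, the topics whose full name matches the pattern.
    static List<String> matchingTopics(List<String> brokerTopics, Pattern pattern) {
        return brokerTopics.stream()
                .filter(topic -> pattern.matcher(topic).matches())
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical topic names standing in for what the broker reports.
        List<String> brokerTopics = List.of("orders-eu", "orders-us", "payments");

        // All-pass pattern: every topic on the broker is selected.
        System.out.println(matchingTopics(brokerTopics, Pattern.compile(".*")));

        // Narrower pattern: only topics with the given prefix are selected.
        System.out.println(matchingTopics(brokerTopics, Pattern.compile("orders-.*")));
    }
}
```

Either way the selection rule is fixed when the job is built, which is the
limitation the question is about.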

E.g., initially the source starts without any topics to read from, so
nothing is read until it gets a control message (which could be pushed to a
control topic, for example) specifying the set of topics to subscribe to. I
guess this could somehow be implemented using custom subscribers once this
issue is merged/closed:

https://issues.apache.org/jira/browse/FLINK-24660

but would it be possible to achieve this objective without having to
periodically poll the broker, e.g., in a more reactive (push) way? I guess
if the Kafka source (or any other source, for that matter) were to have
a control signal like that, then it would be more of an operator than a
source, really...

Salva

PS: Does anyone know the current state of FLINK-24660? The changes seem to
have been ready to merge for a while.

Re: Configure a kafka source dynamically (???)

Posted by Salva Alcántara <sa...@gmail.com>.
Regarding the Kafka subscribers, I ended up copying the original Kafka
Source into my project to make the setter public. Thanks for pointing out
the reflection option (in reality, both are ugly).
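
For readers wondering what the reflection route looks like, here is a
generic sketch in plain Java. SketchBuilder is a stand-in for a builder
whose subscriber field has no public setter, and the field name is an
assumption for illustration; the real field name and type should be
checked against the connector version in use.

```java
import java.lang.reflect.Field;

final class ReflectionSketch {

    // Stand-in for a builder that keeps its subscriber in a private field.
    static final class SketchBuilder {
        private Object subscriber;

        Object subscriber() {
            return subscriber;
        }
    }

    // Writes a private field by name, bypassing the missing setter.
    static void setPrivateField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot set field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        SketchBuilder builder = new SketchBuilder();
        // The string is a placeholder for a custom subscriber instance.
        setPrivateField(builder, "subscriber", "my-custom-subscriber");
        System.out.println(builder.subscriber());
    }
}
```

This breaks silently if the field is renamed in a later release, which is
part of why both workarounds feel ugly.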

As for the main point, there is no real motivation other than curiosity. I
agree that one can always reduce the discovery period, but that does not
look ideal. I understand from your words that this is indeed a by-design
limitation of the current source framework (which is good to know).

Salva

On Fri, Jul 8, 2022, 09:40 Mason Chen <ma...@gmail.com> wrote:

> Hi Salva,
>
> I was the contributor on the ticket and have updated the PR. Sorry for the
> delay! Meanwhile, you can use reflection to set the KafkaSubscriber if you
> need to have an immediate solution.
>
> With respect to your control message idea, what is the motivation to use a
> push based mechanism vs poll based mechanism? If it is about latency in
> getting the changes, you could also reduce the discovery interval to be a
> low number. From the source framework, there isn't a notion of an
> entrypoint that could be exposed to external systems for control message
> pushing.
>
> Best,
> Mason

Re: Configure a kafka source dynamically (???)

Posted by Mason Chen <ma...@gmail.com>.
Hi Salva,

I was the contributor on the ticket and have updated the PR. Sorry for the
delay! Meanwhile, you can use reflection to set the KafkaSubscriber if you
need an immediate solution.

With respect to your control message idea, what is the motivation for using
a push-based mechanism vs. a poll-based mechanism? If it is about latency in
getting the changes, you could also reduce the discovery interval to a
low number. In the source framework, there is no notion of an
entrypoint that could be exposed to external systems for control message
pushing.
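
As a small illustration of lowering the discovery interval, the sketch
below builds the client properties in plain Java. The key
partition.discovery.interval.ms is the Kafka source's discovery setting;
the helper name and the 10-second value are arbitrary choices for the
example. The resulting properties could then be handed to the KafkaSource
builder via setProperties(...).

```java
import java.util.Properties;

final class DiscoveryIntervalSketch {

    // Builds source properties with the given discovery interval in milliseconds.
    static Properties sourceProperties(long discoveryIntervalMs) {
        Properties props = new Properties();
        props.setProperty(
                "partition.discovery.interval.ms",
                Long.toString(discoveryIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Check for new topics and partitions every 10 seconds (example value).
        Properties props = sourceProperties(10_000L);
        System.out.println(props.getProperty("partition.discovery.interval.ms"));
    }
}
```

A shorter interval reduces the delay before new topics are noticed, at the
cost of more frequent metadata requests to the broker.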

Best,
Mason
