You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Raghu Angadi <ra...@google.com.INVALID> on 2016/10/04 19:26:19 UTC

Re: Should UnboundedSource provide a split identifier ?

On Tue, Sep 13, 2016 at 1:49 AM, Amit Sela <am...@gmail.com> wrote:

> If I understand correctly this will break
> https://github.com/apache/incubator-beam/blob/master/
> sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/
> kafka/KafkaIO.java#L857
> in
> KafkaIO.
>
> So it's a KafkaIO limitation (for now ?) ?
>

KafkaIO can remove this check, it is not a hard requirement. Instead it can
just read the partitions listed in checkpoint. Let me know if that is what
you would like here.

The main disadvantage of removing these checks would be that we will not
fail hard when then the number of Kafka partitions changes across an update
(we could log a warning, but most users would not notice it). It may not be
so bad, as KafkaIO does not notice the change in number of partitions at
run-time any way (may be it should, and warn or error).