Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/05/02 01:22:19 UTC

[GitHub] [kafka] mjsax edited a comment on pull request #8504: KAFKA-9298: reuse mapped stream error in joins

mjsax edited a comment on pull request #8504:
URL: https://github.com/apache/kafka/pull/8504#issuecomment-618073161

> Ideally, the fix should be to generate a repartition topic name each time to avoid such issues. But IMHO that ship has already sailed because by introducing a new name generation will cause compatibility issues for existing topologies.

Why is that? Since such a topology would hit the bug, it could never have been deployed, so nobody can actually be running one, right? In fact, shouldn't we "burn" an index even if a name is provided (IIRC, we already do this in some cases)?
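
To make the "burn an index" idea concrete, here is a minimal toy sketch. It is not Kafka Streams' internal naming code: `NodeNamer`, `nameFor`, and the `burnIndexForNamedNodes` flag are hypothetical names made up for illustration, and the zero-padded suffix only imitates the look of generated names. It assumes a single counter shared by all nodes of a topology.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: NOT Kafka Streams' actual internals. All names are hypothetical.
final class NodeNamer {
    private final AtomicInteger index = new AtomicInteger(0);
    private final boolean burnIndexForNamedNodes;

    NodeNamer(boolean burnIndexForNamedNodes) {
        this.burnIndexForNamedNodes = burnIndexForNamedNodes;
    }

    // Returns the user-provided name if present, otherwise prefix + zero-padded index.
    String nameFor(String prefix, String userProvidedName) {
        if (userProvidedName != null) {
            if (burnIndexForNamedNodes) {
                index.getAndIncrement(); // "burn" the index the generated name would have used
            }
            return userProvidedName;
        }
        return prefix + String.format("%010d", index.getAndIncrement());
    }
}

final class BurnIndexDemo {
    public static void main(String[] args) {
        for (boolean burn : new boolean[] {false, true}) {
            NodeNamer namer = new NodeNamer(burn);
            String map = namer.nameFor("KSTREAM-MAP-", null);
            String join = namer.nameFor("KSTREAM-JOIN-", "my-join"); // user-named node
            String filter = namer.nameFor("KSTREAM-FILTER-", null);
            System.out.println("burn=" + burn + ": " + map + ", " + join + ", " + filter);
        }
    }
}
```

In this toy model, the node after the named join gets index 1 without burning and index 2 with burning. With burning, the generated names of the other nodes stay the same whether or not the user names the join, which is the compatibility property in question.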

I agree, though, that merging repartition topics (as proposed in (1)) should be done if possible. It's a historic artifact that we did not merge them in the past, and IMHO we should not make the same mistake again.

For (2), it's a tricky question because the different names are used for different stores and changelog topics (i.e., that is their main purpose?), so it seems like a "nasty side effect" if we ended up with two repartition topics in this case. Of course, given the new `repartition()` operator, a user can work around it by calling it after `map()` and before `join()`. Just brainstorming here about what the impact could be and which tradeoff we want to pick.
