Posted to dev@kafka.apache.org by Ismael Juma <is...@juma.me.uk> on 2023/03/18 15:15:38 UTC

[DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Hi all,

I would like to start a discussion regarding the removal of the scala
suffix from the kafka (aka core) module. Please take a look at the proposal
and provide feedback:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-897%3A+Publish+a+single+kafka+%28aka+core%29+Maven+artifact+in+Apache+Kafka+4.0

Ismael

Re: [DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Posted by Matthew Benedict de Detrich <ma...@aiven.io.INVALID>.
> If you have a Java project, you do not set a Scala version. Why would you?

There are 2 cases here. The first is a Java project that already has
existing Scala dependencies (either directly or indirectly), in which case
you have a Java project that defines a Scala version and the suffix we are
talking about points to that Scala version (the reason why the suffix
exists in the first place is to prevent mixing of binary incompatible Scala
versions). The other case is a pure Java library that happens to want to
include kafka core directly, in which case we are talking about the
difference between

org.apache.kafka:kafka:<VERSION>

vs

org.apache.kafka:kafka_2.13:<VERSION>

Is this change really worth it just to save going from kafka_2.13 to kafka?
As you said in the KIP we are just going to support a single Scala version
so the user doesn't even need to care/know about Scala 2.13. If we are
talking about the source release/docker/containers then the artifact name
is completely irrelevant either way.
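
For reference, the only thing that differs between the two coordinates above
is the Scala binary-version suffix on the artifact ID. A minimal Python sketch
of that naming convention follows. The helper names are hypothetical, it
assumes Scala 2.x versioning (where the first two components form the binary
version), and "X.Y.Z" is a placeholder rather than a real release number.

```python
from typing import Optional


def scala_binary_version(full_version: str) -> str:
    """Return the binary-compatibility series of a Scala 2.x version.

    Assumption: for Scala 2.x the first two components identify the
    binary version, e.g. "2.13.10" -> "2.13".
    """
    major, minor = full_version.split(".")[:2]
    return f"{major}.{minor}"


def maven_coordinate(group: str, artifact: str, version: str,
                     scala_version: Optional[str] = None) -> str:
    """Build a group:artifact:version string, adding the Scala suffix if asked."""
    if scala_version is not None:
        artifact = f"{artifact}_{scala_binary_version(scala_version)}"
    return f"{group}:{artifact}:{version}"


if __name__ == "__main__":
    # Suffixed form, as the artifact is named today.
    print(maven_coordinate("org.apache.kafka", "kafka", "3.4.0", "2.13.10"))
    # Unsuffixed form discussed in the KIP ("X.Y.Z" is a placeholder).
    print(maven_coordinate("org.apache.kafka", "kafka", "X.Y.Z"))
```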

> The streams module does not depend on the core module either, see:

> https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/3.4.0

So it appears that the query from the old https://mvnrepository.com
<https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/3.4.0> site
is bringing in all dependencies from all scopes (such as test) rather than
just the direct compile scope. However, this is still going to cause
resolution issues for a lot of upstream projects (for example, those which
use kafka core for testing).

If on the other hand you say that Scala is just an implementation detail of
kafka core and there isn't anything of value in the ecosystem having a
compile time dependency on kafka core, then which hypothetical users are we
talking about that are currently having such big issues with the Scala
suffixes in the artifact? If most of the users using kafka core are doing
so to run kafka as an application then they are going to be using the
source distribution (or some other method, e.g. docker/testcontainers etc.),
in which case the artifact names are completely irrelevant here.

Either there is a significant portion of users that (for some reason) have
a dependency on kafka core, in which case the problems I described before
apply, or we have a case where almost nothing depends on kafka core, which
means that the KIP's value is questionable while also creating a highly
unorthodox situation of an artifact depending on Scala but not having a
Scala suffix (which is an expectation for any artifact using Scala,
implementation detail or not).

On Sat, Mar 18, 2023 at 11:10 PM Ismael Juma <is...@juma.me.uk> wrote:

> Hi Matthew,
>
> Comments below.
>
> On Sat, Mar 18, 2023 at 2:27 PM Matthew Benedict de Detrich
> <ma...@aiven.io.invalid> wrote:
>
> > > 1. The kafka module does not expose a public api (unlike kafka-clients
> > > or streams-scala) and the usage of Scala (including the version) is an
> > > implementation detail.
> >
> > This is irrelevant, the core problem is bringing in scala-library.jar and
> > potentially mixing in different binary incompatible versions.
> >
>
> I disagree.
>
>
> > > 2. The vast majority of kafka users do not use Scala and the Scala
> > > suffix is a problem for them. For example, they may depend on kafka_2.12
> > > and then that stops working when support for Scala 2.12 goes away.
> >
> > I fail to see how this is a "problem". All of the common JVM build tools
> > already handle this transparently. You pick a Scala version, define it in
> > the build tool and then everything is handled for you. If you pick the
> > wrong Scala version then it will fail to resolve at which point you have
> > feedback that you picked the wrong version.
> >
>
> If you have a Java project, you do not set a Scala version. Why would you?
>
> > This suggestion would mean that if a user brings in Scala version X (from
> > some other dependency) that differs from Kafka's version they will get a
> > very hard to diagnose runtime error.
> >
> > > That's incorrect. kafka-clients is a pure Java library, does not
> > > include the Scala suffix and does not depend on the kafka (aka core) jar.
> >
> > I am talking about clients of kafka such as streams which depend on core.
> > In fact if you look at the direct dependencies of kafka core (see
> > https://mvnrepository.com/artifact/org.apache.kafka/kafka/usages) you
> > will see there are a lot of libraries/runtimes which depend on kafka core.
>
>
> The streams module does not depend on the core module either, see:
>
> https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/3.4.0
>
> Ismael
>


-- 

Matthew de Detrich

*Aiven Deutschland GmbH*

Immanuelkirchstraße 26, 10405 Berlin

Amtsgericht Charlottenburg, HRB 209739 B

Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen

*m:* +491603708037

*w:* aiven.io *e:* matthew.dedetrich@aiven.io

Re: [DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Matthew,

Comments below.

On Sat, Mar 18, 2023 at 2:27 PM Matthew Benedict de Detrich
<ma...@aiven.io.invalid> wrote:

> > 1. The kafka module does not expose a public api (unlike kafka-clients or
> > streams-scala) and the usage of Scala (including the version) is an
> > implementation detail.
>
> This is irrelevant, the core problem is bringing in scala-library.jar and
> potentially mixing in different binary incompatible versions.
>

I disagree.


> > 2. The vast majority of kafka users do not use Scala and the Scala suffix
> > is a problem for them. For example, they may depend on kafka_2.12 and then
> > that stops working when support for Scala 2.12 goes away.
>
> I fail to see how this is a "problem". All of the common JVM build tools
> already handle this transparently. You pick a Scala version, define it in
> the build tool and then everything is handled for you. If you pick the
> wrong Scala version then it will fail to resolve at which point you have
> feedback that you picked the wrong version.
>

If you have a Java project, you do not set a Scala version. Why would you?

> This suggestion would mean that if a user brings in Scala version X (from
> some other dependency) that differs from Kafka's version they will get a
> very hard to diagnose runtime error.
>
> > That's incorrect. kafka-clients is a pure Java library, does not include
> > the Scala suffix and does not depend on the kafka (aka core) jar.
>
> I am talking about clients of kafka such as streams which depend on core.
> In fact if you look at the direct dependencies of kafka core (see
> https://mvnrepository.com/artifact/org.apache.kafka/kafka/usages) you will
> see there are a lot of libraries/runtimes which depend on kafka core.


The streams module does not depend on the core module either, see:

https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/3.4.0

Ismael

Re: [DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Posted by Matthew Benedict de Detrich <ma...@aiven.io.INVALID>.
Hi Ismael

> 1. The kafka module does not expose a public api (unlike kafka-clients or
> streams-scala) and the usage of Scala (including the version) is an
> implementation detail.

This is irrelevant, the core problem is bringing in scala-library.jar and
potentially mixing in different binary incompatible versions.


> 2. The vast majority of kafka users do not use Scala and the Scala suffix
> is a problem for them. For example, they may depend on kafka_2.12 and then
> that stops working when support for Scala 2.12 goes away.

I fail to see how this is a "problem". All of the common JVM build tools
already handle this transparently. You pick a Scala version, define it in
the build tool and then everything is handled for you. If you pick the
wrong Scala version then it will fail to resolve at which point you have
feedback that you picked the wrong version.

This suggestion would mean that if a user brings in Scala version X (from
some other dependency) that differs from Kafka's version they will get a
very hard to diagnose runtime error.

> That's incorrect. kafka-clients is a pure Java library, does not include
> the Scala suffix and does not depend on the kafka (aka core) jar.

I am talking about clients of kafka such as streams which depend on core.
In fact if you look at the direct dependencies of kafka core (see
https://mvnrepository.com/artifact/org.apache.kafka/kafka/usages) you will
see there are a lot of libraries/runtimes which depend on kafka core.


Such a change would either break those dependency chains directly
or create transitive issues (which is even worse because, as mentioned
before, it could result in runtime errors).
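
To make the "mixing" concern concrete, here is a small Python sketch of a
check that looks only at artifact ID suffixes. It is an illustration of the
argument, not how any particular build tool is implemented; the function names
and the "some-lib" dependency are made up, and the regular expression assumes
Scala 2.x style suffixes such as _2.12 or _2.13.

```python
import re
from typing import Iterable, Set

# Assumption: a Scala 2.x cross-built artifact ID ends in "_2.<minor>".
_SUFFIX = re.compile(r"_(2\.\d+)$")


def scala_suffixes(artifact_ids: Iterable[str]) -> Set[str]:
    """Collect the Scala binary versions advertised by artifact ID suffixes."""
    found = set()
    for artifact_id in artifact_ids:
        match = _SUFFIX.search(artifact_id)
        if match:
            found.add(match.group(1))
    return found


def has_visible_conflict(artifact_ids: Iterable[str]) -> bool:
    """True when the suffixes alone reveal more than one Scala binary version."""
    return len(scala_suffixes(artifact_ids)) > 1


if __name__ == "__main__":
    # "some-lib" is a made-up dependency built for Scala 2.12.
    print(has_visible_conflict(["kafka_2.13", "some-lib_2.12"]))  # True
    print(has_visible_conflict(["kafka", "some-lib_2.12"]))       # False
```

In the second call the unsuffixed artifact carries no Scala information in its
name, so a suffix-based check has nothing to compare against.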

On Sat, 18 Mar 2023, 19:30 Ismael Juma, <is...@juma.me.uk> wrote:

> Hi Matthew,
>
> I understand all that, but the important points are:
>
> 1. The kafka module does not expose a public api (unlike kafka-clients or
> streams-scala) and the usage of Scala (including the version) is an
> implementation detail.
> 2. The vast majority of kafka users do not use Scala and the Scala suffix
> is a problem for them. For example, they may depend on kafka_2.12 and then
> that stops working when support for Scala 2.12 goes away.
>
> One additional comment below:
>
> > This problem also extends transitively, i.e. think of
> > libraries which wrap kafka clients which would transitively include
> > kafka-core.
>
>
> That's incorrect. kafka-clients is a pure Java library, does not include
> the Scala suffix and does not depend on the kafka (aka core) jar.
>
> Ismael
>
> On Sat, Mar 18, 2023 at 9:31 AM Matthew Benedict de Detrich
> <ma...@aiven.io.invalid> wrote:
>
> > While I understand the motivation, such a change would break currently
> > existing JVM build tools that work with Scala in subtle ways. More
> > concretely it is expected that if a library is shipping the scala runtime
> > scala-library.jar (i.e. it is a Scala library) it should use the suffix
> > (i.e. _2.12 or _2.13) because currently existing Maven/Ivy resolution
> > tools rely on that suffix existing when doing dependency resolution.
> >
> > A simple example where this would cause a problem is if someone has a
> > Scala project and is using Scala 2.12 and they include a future
> > hypothetical Kafka core that only supports Scala 2.13. In the case where
> > Kafka core preserves the 2.13 suffix, the build tool would complain that
> > it cannot find a Kafka core release with 2.12 and tell the user
> > immediately at resolution that Kafka has not been released with Scala 2.12
> > support (which is expected and the correct thing to do). On the other hand
> > if there is no suffix, depending on the build tool in question you would
> > essentially have to **pretend** that it's a pure Java library by ignoring
> > the suffix (which Kafka is not). This means that the Kafka 2.13 artifact
> > would get incorrectly resolved in a project that is using Scala 2.12,
> > causing a runtime error. This problem also extends transitively, i.e.
> > think of libraries which wrap kafka clients which would transitively
> > include kafka-core.
> >
> > Such a change is also incredibly misleading, I can't think of a single
> > project using Scala that does this. There is nothing wrong with supporting
> > only a single version of Scala and then removing Scala support later on,
> > however the suffix in the artifact name is an expectation. I am also
> > having trouble understanding what this is achieving when weighed against
> > the ramifications stated earlier, if we are talking about people including
> > specific Kafka libraries (such as kafka-clients) in their currently
> > existing projects this is not going to provide any real benefit since
> > current common build tools (maven/gradle/sbt) handle the suffix
> > automatically for you during resolution. Similarly if we are talking
> > about a distribution of Kafka that needs to be run, it's just a list of
> > jars that need to be loaded by a classloader.
> >
> >
> > On Sat, Mar 18, 2023 at 4:16 PM Ismael Juma <is...@juma.me.uk> wrote:
> >
> > > Hi all,
> > >
> > > I would like to start a discussion regarding the removal of the scala
> > > suffix from the kafka (aka core) module. Please take a look at the
> > > proposal and provide feedback:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-897%3A+Publish+a+single+kafka+%28aka+core%29+Maven+artifact+in+Apache+Kafka+4.0
> > >
> > > Ismael
> > >
>

Re: [DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Matthew,

I understand all that, but the important points are:

1. The kafka module does not expose a public api (unlike kafka-clients or
streams-scala) and the usage of Scala (including the version) is an
implementation detail.
2. The vast majority of kafka users do not use Scala and the Scala suffix
is a problem for them. For example, they may depend on kafka_2.12 and then
that stops working when support for Scala 2.12 goes away.

One additional comment below:

> This problem also extends transitively, i.e. think of
> libraries which wrap kafka clients which would transitively include
> kafka-core.


That's incorrect. kafka-clients is a pure Java library, does not include
the Scala suffix and does not depend on the kafka (aka core) jar.

Ismael

On Sat, Mar 18, 2023 at 9:31 AM Matthew Benedict de Detrich
<ma...@aiven.io.invalid> wrote:

> While I understand the motivation, such a change would break currently
> existing JVM build tools that work with Scala in subtle ways. More
> concretely it is expected that if a library is shipping the scala runtime
> scala-library.jar (i.e. it is a Scala library) it should use the suffix
> (i.e. _2.12 or _2.13) because currently existing Maven/Ivy resolution tools
> rely on that suffix existing when doing dependency resolution.
>
> A simple example where this would cause a problem is if someone has a Scala
> project and is using Scala 2.12 and they include a future hypothetical
> Kafka core that only supports Scala 2.13. In the case where Kafka core
> preserves the 2.13 suffix, the build tool would complain that it cannot
> find a Kafka core release with 2.12 and tell the user immediately at
> resolution that Kafka has not been released with Scala 2.12 support (which
> is expected and the correct thing to do). On the other hand if there is no
> suffix, depending on the build tool in question you would essentially
> have to **pretend** that it's a pure Java library by ignoring the suffix
> (which Kafka is not). This means that the Kafka 2.13 artifact would get
> incorrectly resolved in a project that is using Scala 2.12, causing a
> runtime error. This problem also extends transitively, i.e. think of
> libraries which wrap kafka clients which would transitively include
> kafka-core.
>
> Such a change is also incredibly misleading, I can't think of a single
> project using Scala that does this. There is nothing wrong with supporting
> only a single version of Scala and then removing Scala support later on,
> however the suffix in the artifact name is an expectation. I am also having
> trouble understanding what this is achieving when weighed against the
> ramifications stated earlier, if we are talking about people including
> specific Kafka libraries (such as kafka-clients) in their currently
> existing projects this is not going to provide any real benefit since
> current common build tools (maven/gradle/sbt) handle the suffix
> automatically for you during resolution. Similarly if we are talking about
> a distribution of Kafka that needs to be run, it's just a list of jars that
> need to be loaded by a classloader.
>
>
> On Sat, Mar 18, 2023 at 4:16 PM Ismael Juma <is...@juma.me.uk> wrote:
>
> > Hi all,
> >
> > I would like to start a discussion regarding the removal of the scala
> > suffix from the kafka (aka core) module. Please take a look at the
> > proposal and provide feedback:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-897%3A+Publish+a+single+kafka+%28aka+core%29+Maven+artifact+in+Apache+Kafka+4.0
> >
> > Ismael
> >

Re: [DISCUSS] KIP-897: Publish a single kafka (aka core) Maven artifact in Apache Kafka 4.0

Posted by Matthew Benedict de Detrich <ma...@aiven.io.INVALID>.
While I understand the motivation, such a change would break currently
existing JVM build tools that work with Scala in subtle ways. More
concretely it is expected that if a library is shipping the scala runtime
scala-library.jar (i.e. it is a Scala library) it should use the suffix
(i.e. _2.12 or _2.13) because currently existing Maven/Ivy resolution tools
rely on that suffix existing when doing dependency resolution.

A simple example where this would cause a problem is if someone has a Scala
project and is using Scala 2.12 and they include a future hypothetical
Kafka core that only supports Scala 2.13. In the case where Kafka core
preserves the 2.13 suffix, the build tool would complain that it cannot
find a Kafka core release with 2.12 and tell the user immediately at
resolution that Kafka has not been released with Scala 2.12 support (which
is expected and the correct thing to do). On the other hand if there is no
suffix, depending on the build tool in question you would essentially
have to **pretend** that it's a pure Java library by ignoring the suffix
(which Kafka is not). This means that the Kafka 2.13 artifact would get
incorrectly resolved in a project that is using Scala 2.12, causing a
runtime error. This problem also extends transitively, i.e. think of
libraries which wrap kafka clients which would transitively include
kafka-core.
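
The two outcomes in the example above can be sketched with a toy resolver in
Python. This is a simplified model under stated assumptions, not the behaviour
of Maven, Gradle or sbt: the resolver only checks whether an exact artifact ID
was published, and all names in the snippet are hypothetical.

```python
from typing import Set


class ResolutionError(LookupError):
    """Raised when the requested artifact ID has not been published."""


def resolve(artifact_id: str, published: Set[str]) -> str:
    """Toy resolver: succeed only if the exact artifact ID was published."""
    if artifact_id not in published:
        raise ResolutionError(f"cannot find {artifact_id}")
    return artifact_id


def requested_id(base: str, scala_binary: str, suffixed: bool) -> str:
    """Artifact ID a Scala project would ask for, with or without the suffix."""
    return f"{base}_{scala_binary}" if suffixed else base


if __name__ == "__main__":
    project_scala = "2.12"

    # Case 1: Kafka core keeps the suffix and is only published for 2.13,
    # so the mismatch surfaces immediately as a resolution failure.
    try:
        resolve(requested_id("kafka", project_scala, True), {"kafka_2.13"})
    except ResolutionError as err:
        print("fails at resolution:", err)

    # Case 2: no suffix. Resolution succeeds, so nothing flags that the
    # jar was built against Scala 2.13 while the project uses 2.12.
    print("resolves:", resolve(requested_id("kafka", project_scala, False), {"kafka"}))
```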

Such a change is also incredibly misleading, I can't think of a single
project using Scala that does this. There is nothing wrong with supporting
only a single version of Scala and then removing Scala support later on,
however the suffix in the artifact name is an expectation. I am also having
trouble understanding what this is achieving when weighed against the
ramifications stated earlier, if we are talking about people including
specific Kafka libraries (such as kafka-clients) in their currently
existing projects this is not going to provide any real benefit since
current common build tools (maven/gradle/sbt) handle the suffix
automatically for you during resolution. Similarly if we are talking about
a distribution of Kafka that needs to be run, it's just a list of jars that
need to be loaded by a classloader.


On Sat, Mar 18, 2023 at 4:16 PM Ismael Juma <is...@juma.me.uk> wrote:

> Hi all,
>
> I would like to start a discussion regarding the removal of the scala
> suffix from the kafka (aka core) module. Please take a look at the proposal
> and provide feedback:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-897%3A+Publish+a+single+kafka+%28aka+core%29+Maven+artifact+in+Apache+Kafka+4.0
>
> Ismael
>

