Posted to dev@pulsar.apache.org by Alexander Preuss <al...@streamnative.io.INVALID> on 2022/08/05 12:34:24 UTC

[DISCUSS] PIP 197: Add Schema hash and equals to public API

Hello Pulsar community,

I would like to start a discussion about PIP 197: Add Schema hash and
equals to public API.
You can find the proposal at https://github.com/apache/pulsar/issues/16959
as well as pasted below.

Looking forward to hearing your thoughts,
Alex

## Motivation

Currently, the `Schema` interface in the public client-api does not provide
access to a sensible hash function. The fallback to Java's identity-based
`Object.equals` and `Object.hashCode` makes it unfit for use in most
hash-based collections. For example, it prevents usage as a key in a cache.
Further, the lack of a reliable equals function means that there is no way
to determine whether two schemas are equivalent.
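
To make the problem concrete, here is a minimal sketch. `PlainSchema` is a
hypothetical stand-in class invented for this example, not Pulsar's `Schema`
interface. It only shows what happens to any class that inherits `equals`
and `hashCode` from `Object`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a schema implementation; NOT Pulsar's Schema.
final class PlainSchema {
    final String definition;

    PlainSchema(String definition) {
        this.definition = definition;
    }
    // No equals/hashCode overrides, so Object's identity-based versions apply.
}

final class IdentityFallbackDemo {
    // Inserts two schemas with the same definition and returns the map size.
    static int cacheSizeAfterTwoEquivalentKeys() {
        Map<PlainSchema, String> cache = new HashMap<>();
        cache.put(new PlainSchema("{\"type\":\"string\"}"), "first");
        cache.put(new PlainSchema("{\"type\":\"string\"}"), "second");
        return cache.size();
    }

    public static void main(String[] args) {
        // Prints 2: the two keys are distinct objects, so they never match.
        System.out.println(cacheSizeAfterTwoEquivalentKeys());
    }
}
```

The second `put` does not replace the first entry, so a cache keyed this way
would never produce a hit for an equivalent schema instance.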

## Goal

The goal of this proposal is to provide a sensible `hashCode` and `equals`
implementation for Schema as part of the public API.

Currently, pulsar-common contains `SchemaHash`, a wrapper class that exists
to solve the aforementioned problems. However, `SchemaHash` is not part of
the public API, so users should not depend on it.


## API Changes

No API change is required other than moving `SchemaHash` from
pulsar-common to the public API. The only further change could be to
re-think the class name, as the wrapper offers more than just a schema's
hash.

## Implementation

Move `SchemaHash` from the pulsar-common
`org.apache.pulsar.common.protocol.schema` package into the
pulsar-client-api `org.apache.pulsar.common.schema` package.
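
For readers unfamiliar with the wrapper approach, the following sketch shows
the general idea of a value-based key. `SchemaKey` and its fields are
hypothetical names chosen for this example. It is not the actual
`SchemaHash` implementation and assumes a schema can be reduced to a type
name and its definition bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper, for illustration only; it is NOT the real SchemaHash.
// It captures a type name and definition bytes and compares them by value.
final class SchemaKey {
    private final String type;
    private final byte[] definition;

    private SchemaKey(String type, byte[] definition) {
        this.type = type;
        this.definition = definition;
    }

    // Copies the bytes so later changes to the caller's array cannot alter the key.
    static SchemaKey of(String type, byte[] definition) {
        return new SchemaKey(type, definition.clone());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SchemaKey)) {
            return false;
        }
        SchemaKey other = (SchemaKey) o;
        return type.equals(other.type) && Arrays.equals(definition, other.definition);
    }

    @Override
    public int hashCode() {
        return 31 * type.hashCode() + Arrays.hashCode(definition);
    }
}

final class SchemaKeyDemo {
    // Inserts two keys built from equivalent inputs and returns the map size.
    static int cacheSizeAfterTwoEquivalentKeys() {
        byte[] first = "{\"type\":\"string\"}".getBytes(StandardCharsets.UTF_8);
        byte[] second = "{\"type\":\"string\"}".getBytes(StandardCharsets.UTF_8);
        Map<SchemaKey, String> cache = new HashMap<>();
        cache.put(SchemaKey.of("AVRO", first), "first");
        cache.put(SchemaKey.of("AVRO", second), "second");
        return cache.size();
    }

    public static void main(String[] args) {
        // Prints 1: the second put replaces the first entry.
        System.out.println(cacheSizeAfterTwoEquivalentKeys());
    }
}
```

Because the key is an immutable snapshot, it stays valid as a map key even
if the object it was derived from changes afterwards.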


## Rejected Alternatives

Providing default methods for `equals` and `hashCode` directly on the
`Schema` interface is not possible because Java does not allow an
interface's default methods to override methods of `Object`.
Another option would be to provide the `hashCode` and `equals`
functionality through similarly-named default methods that could be used by
any `Schema` implementation. The drawback of this idea is that it requires
developers to override `equals` and `hashCode` in every implementation to
delegate to the provided methods, and it may pollute the interface.
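
The second alternative can be sketched as follows. `SketchSchema`,
`schemaEquals`, and `schemaHashCode` are hypothetical names used only for
this illustration, and the interface is far simpler than the real `Schema`.
The point is that an implementation which forgets the overrides silently
keeps identity semantics.

```java
import java.util.Objects;

// Hypothetical simplified interface; Pulsar's Schema has many more methods.
interface SketchSchema {
    String definition();

    // Similarly-named helpers. Declaring "default boolean equals(Object)"
    // here instead would be rejected by the Java compiler.
    default boolean schemaEquals(Object other) {
        return other instanceof SketchSchema
                && Objects.equals(definition(), ((SketchSchema) other).definition());
    }

    default int schemaHashCode() {
        return Objects.hashCode(definition());
    }
}

// This implementation opts in by delegating equals/hashCode to the helpers.
final class OptInSchema implements SketchSchema {
    private final String definition;

    OptInSchema(String definition) {
        this.definition = definition;
    }

    @Override
    public String definition() {
        return definition;
    }

    @Override
    public boolean equals(Object o) {
        return schemaEquals(o);
    }

    @Override
    public int hashCode() {
        return schemaHashCode();
    }
}

// This implementation forgets the overrides and keeps identity semantics.
final class ForgetfulSchema implements SketchSchema {
    private final String definition;

    ForgetfulSchema(String definition) {
        this.definition = definition;
    }

    @Override
    public String definition() {
        return definition;
    }
}
```

With these classes, `new OptInSchema("a").equals(new OptInSchema("a"))` is
true, while the same comparison between two `ForgetfulSchema` instances is
false even though `schemaEquals` would report them as equal.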

Re: [DISCUSS] PIP 197: Add Schema hash and equals to public API

Posted by Alexander Preuss <al...@streamnative.io.INVALID>.
Hi Enrico,

Thank you for chiming in.
I understand your concerns regarding the state of a Schema instance.
However, what would be the reason for keeping SchemaHash as part of the
private API? Of course, a user could re-create its functionality in their
own code but then why should we not offer this for convenience in the first
place?

Best,
Alex

On Thu, Aug 11, 2022 at 9:49 AM Enrico Olivelli <eo...@gmail.com> wrote:

> A Schema instance is stateful (and contains mutable state) and in the end
> it contains a reference to the PulsarClient instance that is using it.
> Look at how the AutoConsume schema works, for instance.
>
> I don't think it is possible to support such an API in the general case.
>
> If you want to use it as a key in a map then I suggest you create a
> wrapper in your code.
>
>
> Enrico

Re: [DISCUSS] PIP 197: Add Schema hash and equals to public API

Posted by Enrico Olivelli <eo...@gmail.com>.
A Schema instance is stateful (and contains mutable state) and in the end
it contains a reference to the PulsarClient instance that is using it.
Look at how the AutoConsume schema works, for instance.

I don't think it is possible to support such an API in the general case.

If you want to use it as a key in a map then I suggest you create a
wrapper in your code.


Enrico
