You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Yunze Xu <xy...@apache.org> on 2024/03/20 10:03:48 UTC

[DISCUSS] Clear definition for the APIs we should not remove or change simply

Hi community,

I'd like to bring back the discussion again since a similar discussion
happened long ago [1].

Currently, there is no clear definition for the maintain strategy of
public APIs. To make it more clear, I mean "public interface methods"
and "public class methods" when I mentioned "public APIs".It's very
ambiguous if it's acceptable to modify the method signature. Here are
some examples:
- Public APIs in pulsar-broker or pulsar-broker-common could be used
in the protocol handlers
- Public APIs in pulsar-client could be used due to the lack of
control in pulsar-client-api like MessageId and PulsarClient

However, we should not encourage users to leverage such APIs, which
are mostly used internally. If authors use these APIs, they should be
responsible to modify the code if a new release changed such APIs. See
concrete examples here [2].

Therefore, I suggest adding the API signature guarantee somewhere (as
a PIP or on the website). Here are my proposals.

For any existing public method, no matter if it's from an interface or a class,
- If it's in one module of pulsar-client-api, pulsar-common,
pulsar-broker-common, should not be modified until the next LTS
release (e.g. 4.0.0). Before that, it can only be marked as
deprecated.
- Otherwise, it can be modified in the next feature release (e.g. 3.3.0)
- Specifically, for patch releases (e.g. 3.2.1), it needs public
discussion in the mail list to be cherry-picked

The strategy above is more relaxed than my previous proposal [1]. It
tells ecosystem developers:
- What are the public APIs
- Use public APIs whenever possible
- It's okay to use internal APIs if existing public APIs cannot
satisfy their demands
- Propose new public APIs if they think their demands are universals
and then they won't risk modifying the code for new releases

[1] https://lists.apache.org/thread/87n22qskcnpb0x44bfjy1chgdxzh8tf9
[2] https://github.com/apache/pulsar/pull/22182#discussion_r1531754956

Thanks,
Yunze

Re: [DISCUSS] Clear definition for the APIs we should not remove or change simply

Posted by Yunze Xu <xy...@apache.org>.
> I think that this is too broad. It would essentially prevent most refactorings and improvements to Pulsar.

It makes sense. I will take some time to learn the use of japicmp.

Thanks,
Yunze

On Wed, Mar 20, 2024 at 8:59 PM Lari Hotari <lh...@apache.org> wrote:
>
> On Wed, 20 Mar 2024 at 13:50, Yunze Xu <xy...@apache.org> wrote:
> > Currently, there is no clear definition for the maintain strategy of
> > public APIs. To make it more clear, I mean "public interface methods"
> > and "public class methods" when I mentioned "public APIs".It's very
> > ambiguous if it's acceptable to modify the method signature. Here are
> > some examples:
> > - Public APIs in pulsar-broker or pulsar-broker-common could be used
> > in the protocol handlers
> > - Public APIs in pulsar-client could be used due to the lack of
> > control in pulsar-client-api like MessageId and PulsarClient
>
> Yes, there must be a clear definition of what are considered as stable APIs.
>
> There's already a solution in use for this purpose. For example, in
> ManagedCursor[1] there are annotations:
>
> @InterfaceAudience.LimitedPrivate
> @InterfaceStability.Stable
> public interface ManagedCursor {
>
> InterfaceAudience and InterfaceStability annotations are defined in
> BookKeeper [2].
>
> We should use tooling in CI to enforce binary compatibility of the
> stable APIs that shouldn't break.
>
> There's japicmp [3] that can be used to do this. japicmp comes with a
> maven plugin [4] that needs to be configured with the rules to be
> used. Without enforcement, the annotation solution isn't very effective.
>
> Flink's maven build uses japicmp for enforcing the rules [5]. They
> have a solution for updating the baseline when the API changes [6].
> That could be a good reference if we decide to adopt a similar
> solution for Pulsar.
>
> > For any existing public method, no matter if it's from an interface or a class,
> > - If it's in one module of pulsar-client-api, pulsar-common,
> > pulsar-broker-common, should not be modified until the next LTS
> > release (e.g. 4.0.0). Before that, it can only be marked as
> > deprecated.
>
> I think that this is too broad. It would essentially prevent most
> refactorings and improvements to Pulsar. We could choose to do so, but
> it has consequences.
> Each class should have a reason why it's part of the stable API. It
> shouldn't be too hard to list what is used by custom plugins and why.
> Custom plugin authors should provide this information and this could
> be taken into account when the stable API annotations are added and
> reviewed.
>
> > - Specifically, for patch releases (e.g. 3.2.1), it needs public
> > discussion in the mail list to be cherry-picked
>
> When the list of stable interfaces is minimal, there's a chance that
> those are kept stable and there wouldn't be a frequent reason to break
> APIs.
> When there's a clear reason to break an API, yes, the mailing list
> should be informed. When using a tool like japicmp, the change would
> need to be explicitly approved as part of the baseline to get the CI
> to pass. I just hope that we keep the stable API minimal and
> reasonable so that we don't end up in this situation very often.
>
> -Lari
>
> 1 - https://github.com/apache/pulsar/blob/e81a20d667aef7c0f888e88dbcf972196012ebea/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedCursor.java#L45-L47
> 2 - https://github.com/apache/bookkeeper/tree/master/bookkeeper-common/src/main/java/org/apache/bookkeeper/common/annotation
> 3 - https://siom79.github.io/japicmp/
> 4 - https://siom79.github.io/japicmp/MavenPlugin.html
> 5 - https://github.com/apache/flink/blob/cb1d68e68a56b43720f631d4cc3636c97519bb65/pom.xml#L2329-L2400
> 6 - https://github.com/apache/flink/blob/master/tools/releasing/update_japicmp_configuration.sh
>
> On Wed, 20 Mar 2024 at 13:50, Yunze Xu <xy...@apache.org> wrote:
> >
> > Hi community,
> >
> > I'd like to bring back the discussion again since a similar discussion
> > happened long ago [1].
> >
> > Currently, there is no clear definition for the maintain strategy of
> > public APIs. To make it more clear, I mean "public interface methods"
> > and "public class methods" when I mentioned "public APIs".It's very
> > ambiguous if it's acceptable to modify the method signature. Here are
> > some examples:
> > - Public APIs in pulsar-broker or pulsar-broker-common could be used
> > in the protocol handlers
> > - Public APIs in pulsar-client could be used due to the lack of
> > control in pulsar-client-api like MessageId and PulsarClient
> >
> > However, we should not encourage users to leverage such APIs, which
> > are mostly used internally. If authors use these APIs, they should be
> > responsible to modify the code if a new release changed such APIs. See
> > concrete examples here [2].
> >
> > Therefore, I suggest adding the API signature guarantee somewhere (as
> > a PIP or on the website). Here are my proposals.
> >
> > For any existing public method, no matter if it's from an interface or a class,
> > - If it's in one module of pulsar-client-api, pulsar-common,
> > pulsar-broker-common, should not be modified until the next LTS
> > release (e.g. 4.0.0). Before that, it can only be marked as
> > deprecated.
> > - Otherwise, it can be modified in the next feature release (e.g. 3.3.0)
> > - Specifically, for patch releases (e.g. 3.2.1), it needs public
> > discussion in the mail list to be cherry-picked
> >
> > The strategy above is more relaxed than my previous proposal [1]. It
> > tells ecosystem developers:
> > - What are the public APIs
> > - Use public APIs whenever possible
> > - It's okay to use internal APIs if existing public APIs cannot
> > satisfy their demands
> > - Propose new public APIs if they think their demands are universals
> > and then they won't risk modifying the code for new releases
> >
> > [1] https://lists.apache.org/thread/87n22qskcnpb0x44bfjy1chgdxzh8tf9
> > [2] https://github.com/apache/pulsar/pull/22182#discussion_r1531754956
> >
> > Thanks,
> > Yunze

Re: [DISCUSS] Clear definition for the APIs we should not remove or change simply

Posted by Lari Hotari <lh...@apache.org>.
On Wed, 20 Mar 2024 at 13:50, Yunze Xu <xy...@apache.org> wrote:
> Currently, there is no clear definition for the maintain strategy of
> public APIs. To make it more clear, I mean "public interface methods"
> and "public class methods" when I mentioned "public APIs".It's very
> ambiguous if it's acceptable to modify the method signature. Here are
> some examples:
> - Public APIs in pulsar-broker or pulsar-broker-common could be used
> in the protocol handlers
> - Public APIs in pulsar-client could be used due to the lack of
> control in pulsar-client-api like MessageId and PulsarClient

Yes, there must be a clear definition of what are considered as stable APIs.

There's already a solution in use for this purpose. For example, in
ManagedCursor[1] there are annotations:

@InterfaceAudience.LimitedPrivate
@InterfaceStability.Stable
public interface ManagedCursor {

InterfaceAudience and InterfaceStability annotations are defined in
BookKeeper [2].

We should use tooling in CI to enforce binary compatibility of the
stable APIs that shouldn't break.

There's japicmp [3] that can be used to do this. japicmp comes with a
maven plugin [4] that needs to be configured with the rules to be
used. Without enforcement, the annotation solution isn't very effective.

Flink's maven build uses japicmp for enforcing the rules [5]. They
have a solution for updating the baseline when the API changes [6].
That could be a good reference if we decide to adopt a similar
solution for Pulsar.

> For any existing public method, no matter if it's from an interface or a class,
> - If it's in one module of pulsar-client-api, pulsar-common,
> pulsar-broker-common, should not be modified until the next LTS
> release (e.g. 4.0.0). Before that, it can only be marked as
> deprecated.

I think that this is too broad. It would essentially prevent most
refactorings and improvements to Pulsar. We could choose to do so, but
it has consequences.
Each class should have a reason why it's part of the stable API. It
shouldn't be too hard to list what is used by custom plugins and why.
Custom plugin authors should provide this information and this could
be taken into account when the stable API annotations are added and
reviewed.

> - Specifically, for patch releases (e.g. 3.2.1), it needs public
> discussion in the mail list to be cherry-picked

When the list of stable interfaces is minimal, there's a chance that
those are kept stable and there wouldn't be a frequent reason to break
APIs.
When there's a clear reason to break an API, yes, the mailing list
should be informed. When using a tool like japicmp, the change would
need to be explicitly approved as part of the baseline to get the CI
to pass. I just hope that we keep the stable API minimal and
reasonable so that we don't end up in this situation very often.

-Lari

1 - https://github.com/apache/pulsar/blob/e81a20d667aef7c0f888e88dbcf972196012ebea/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedCursor.java#L45-L47
2 - https://github.com/apache/bookkeeper/tree/master/bookkeeper-common/src/main/java/org/apache/bookkeeper/common/annotation
3 - https://siom79.github.io/japicmp/
4 - https://siom79.github.io/japicmp/MavenPlugin.html
5 - https://github.com/apache/flink/blob/cb1d68e68a56b43720f631d4cc3636c97519bb65/pom.xml#L2329-L2400
6 - https://github.com/apache/flink/blob/master/tools/releasing/update_japicmp_configuration.sh

On Wed, 20 Mar 2024 at 13:50, Yunze Xu <xy...@apache.org> wrote:
>
> Hi community,
>
> I'd like to bring back the discussion again since a similar discussion
> happened long ago [1].
>
> Currently, there is no clear definition for the maintain strategy of
> public APIs. To make it more clear, I mean "public interface methods"
> and "public class methods" when I mentioned "public APIs".It's very
> ambiguous if it's acceptable to modify the method signature. Here are
> some examples:
> - Public APIs in pulsar-broker or pulsar-broker-common could be used
> in the protocol handlers
> - Public APIs in pulsar-client could be used due to the lack of
> control in pulsar-client-api like MessageId and PulsarClient
>
> However, we should not encourage users to leverage such APIs, which
> are mostly used internally. If authors use these APIs, they should be
> responsible to modify the code if a new release changed such APIs. See
> concrete examples here [2].
>
> Therefore, I suggest adding the API signature guarantee somewhere (as
> a PIP or on the website). Here are my proposals.
>
> For any existing public method, no matter if it's from an interface or a class,
> - If it's in one module of pulsar-client-api, pulsar-common,
> pulsar-broker-common, should not be modified until the next LTS
> release (e.g. 4.0.0). Before that, it can only be marked as
> deprecated.
> - Otherwise, it can be modified in the next feature release (e.g. 3.3.0)
> - Specifically, for patch releases (e.g. 3.2.1), it needs public
> discussion in the mail list to be cherry-picked
>
> The strategy above is more relaxed than my previous proposal [1]. It
> tells ecosystem developers:
> - What are the public APIs
> - Use public APIs whenever possible
> - It's okay to use internal APIs if existing public APIs cannot
> satisfy their demands
> - Propose new public APIs if they think their demands are universals
> and then they won't risk modifying the code for new releases
>
> [1] https://lists.apache.org/thread/87n22qskcnpb0x44bfjy1chgdxzh8tf9
> [2] https://github.com/apache/pulsar/pull/22182#discussion_r1531754956
>
> Thanks,
> Yunze