Posted to dev@pulsar.apache.org by Matteo Merli <mm...@apache.org> on 2017/08/25 16:19:20 UTC

[DISCUSS] Log4J-2.x and defining acceptable breaking changes

Starting a discussion on the mailing list to hear feedback from a wider
audience.

A few of us have been discussing the possibility of some breaking changes
in a PR at https://github.com/apache/incubator-pulsar/pull/680

TL;DR version: upgrading to log4j-2.x brings new features (speed,
garbage-free logging, reloading the conf file at runtime, and scripting
for custom filtering rules) but uses a different configuration file format.
If a user has a custom log4j-1.x configuration file, they would have to
convert it to the log4j-2.x format.

My personal view is that this change is OK for the 1.20.0 release:
 * It doesn't break the core Pulsar APIs (client interfaces, data and
admin, or broker auth plugins)
 * We should make the log4j upgrade notice clear in the release notes.
While it may be true that not everyone reads the release notes, I
would expect everyone running Pulsar in production or similar to pay
attention to them. Ultimately, the worst-case scenario for someone with a
custom logger configuration is either falling back to the default
configuration or missing logs, but there is no risk to data integrity.
 * We have already made breaking changes (e.g., renaming com.yahoo.pulsar
to org.apache.pulsar) without changing the major version.

The bigger question here is defining what is acceptable to include
without breaking compatibility. That in turn leads to defining the
conditions for switching the version to 2.0.

In my opinion, there are 2 reasons that would justify changing the major
version:
 1. A significant amount of features or improvements included in the release
 2. Breaking user-facing API compatibility

For #1, I think that would require a longer release cycle. So far, we've
been working on a ~1-month release cycle in which new features are
incrementally introduced. This model is a bit different from long-train
releases carrying big changes that would justify a major version bump.

For #2, as a general rule, we shouldn't break the API, and I believe we
haven't done that so far (and I am talking even internally at Yahoo before
open-sourcing Pulsar: the earliest client-lib should still work against the
latest broker code, and vice versa, the new client should be able to use
the older brokers, minus new features).

That was the primary reason why we haven't bumped to 2.x release versions.

Having said that, we could compile a list of all the things that might be
good to change in the APIs if we were to do a 2.0 release.
I, for one, would vote to fix the MessageId in the C++ client lib to hide
implementation and allow for further non-breaking changes.

Please, everyone, share your thoughts on this,

Matteo
-- 
Matteo Merli
<mm...@apache.org>

Re: [DISCUSS] Log4J-2.x and defining acceptable breaking changes

Posted by Masakazu Kitajo <ma...@apache.org>.
In my opinion, we should bump the major version if we change configuration
formats and the migration can't be automated by a conversion script or
something similar, because I don't think the API is the only compatibility
we should care about. The configuration file format is a kind of interface
between users and software.

So I have no objection to reasons #1 and #2. I just think breaking
configuration compatibility should be #3.

> * We should make the log4j upgrade notice clear in the release notes.
> While it may be true that not everyone reads the release notes, I
> would expect everyone running Pulsar in production or similar to pay
> attention to them. Ultimately, the worst-case scenario for someone with a
> custom logger configuration is either falling back to the default
> configuration or missing logs, but there is no risk to data integrity.


This sounds to me like saying it's OK because it causes only logging
problems. I doubt that matches what users expect in general. I would expect
a minor version upgrade to work the same as previous minor versions without
any extra steps, as long as I don't use new or improved features.

There is probably no single right conclusion for this discussion. It
depends on what we prioritize. We could allow minor incompatibilities in
minor version changes in order to introduce great features quickly, and
some users would want that. However, I think strictly keeping compatibility
is more user-friendly.

Thanks,
Masakazu



On Sat, Aug 26, 2017 at 4:47 AM, Dave Fisher <da...@comcast.net> wrote:

> Hi -
>
> It is good to have a discussion around breaking changes and versioning.
> These are the types of policy decisions that are important. For example,
> Apache POI is about to go from version 3.17 to 4.0 and is breaking the API
> and changing some defaults. Apache Tika uses POI and there is an overlap in
> PMC membership. This means the two PMCs are coordinating.
>
> Are there known consumers of Pulsar that would be impacted? If so, then
> this is an opportunity to bring them into the discussion. Who knows, they
> could become new contributors.
>
> I had one idea about this particular case. Is a simple configuration
> conversion script feasible? Is this something that the Log4J project has?
>
> Regards,
> Dave

Re: [DISCUSS] Log4J-2.x and defining acceptable breaking changes

Posted by Dave Fisher <da...@comcast.net>.
Hi -

It is good to have a discussion around breaking changes and versioning. These are the types of policy decisions that are important. For example, Apache POI is about to go from version 3.17 to 4.0 and is breaking the API and changing some defaults. Apache Tika uses POI and there is an overlap in PMC membership. This means the two PMCs are coordinating.

Are there known consumers of Pulsar that would be impacted? If so, then this is an opportunity to bring them into the discussion. Who knows, they could become new contributors.

I had one idea about this particular case. Is a simple configuration conversion script feasible? Is this something that the Log4J project has?
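
To make the question concrete, here is a rough sketch of what such a script might look like for the simplest case only: a root logger, console appenders, and PatternLayout. The function name convert_log4j1_properties is hypothetical, and this is not an existing Log4J or Pulsar tool. The output key names follow my understanding of the Log4j 2 properties syntax and should be checked against the Log4j 2 documentation. Patterns are copied unchanged on the assumption that they are compatible, and anything the script does not recognize is reported instead of being guessed at.

```python
import re


def convert_log4j1_properties(text):
    """Convert a very small subset of a log4j 1.x properties file.

    Hypothetical helper for illustration. Returns a tuple of
    (converted_text, skipped_keys); skipped keys need manual migration.
    """
    out, skipped = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()

        # log4j.rootLogger=LEVEL, appender1, appender2
        if key == "log4j.rootLogger":
            level, *names = [part.strip() for part in value.split(",")]
            out.append(f"rootLogger.level = {level}")
            for name in names:
                out.append(f"rootLogger.appenderRef.{name}.ref = {name}")
            continue

        # log4j.appender.NAME=org.apache.log4j.ConsoleAppender
        m = re.fullmatch(r"log4j\.appender\.(\w+)", key)
        if m and value == "org.apache.log4j.ConsoleAppender":
            name = m.group(1)
            out.append(f"appender.{name}.type = Console")
            out.append(f"appender.{name}.name = {name}")
            continue

        # log4j.appender.NAME.layout=org.apache.log4j.PatternLayout
        m = re.fullmatch(r"log4j\.appender\.(\w+)\.layout", key)
        if m and value == "org.apache.log4j.PatternLayout":
            out.append(f"appender.{m.group(1)}.layout.type = PatternLayout")
            continue

        # log4j.appender.NAME.layout.ConversionPattern=<pattern>
        m = re.fullmatch(r"log4j\.appender\.(\w+)\.layout\.ConversionPattern", key)
        if m:
            out.append(f"appender.{m.group(1)}.layout.pattern = {value}")
            continue

        skipped.append(key)  # anything else is left for a human to migrate
    return "\n".join(out), skipped
```

Even a partial script like this could be useful if it lists the keys it skipped, since that tells the user exactly which parts of a custom configuration still need attention.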

Regards,
Dave
