Posted to dev@bookkeeper.apache.org by Ivan Kelly <iv...@apache.org> on 2020/11/10 10:09:15 UTC

Contributing Splunk changes back to OSS

Hi folks,

It's been about a year since Streamlio joined Splunk and since then
we've had a bit of forking with our BK branch.
It has gotten to a stage where it's starting to be a problem for us,
so we'd like to start to get things back in sync.

There are a couple of big chunks of work to come back.
We've added a data integrity checker that replaces a lot of the
functionality of autorecovery and allows us to run without a journal.
We refactored the bookie to allow dependency injection.
We've rewritten the entry logger to use direct I/O (allowing 2GBps
writes per bookie).
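
For readers unfamiliar with the term, the dependency injection refactoring mentioned above might look roughly like the sketch below. This is only an illustration under assumed names: EntryStore, RecordingEntryStore and SketchBookie are hypothetical and are not taken from the Splunk branch or from BookKeeper's real API. The point is only that the bookie receives its storage component through its constructor, so a test can pass in a stand-in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types are BookKeeper's real API.
class BookieInjectionSketch {

    /** Storage dependency the bookie needs; an interface so tests can swap it. */
    interface EntryStore {
        void append(long ledgerId, long entryId, byte[] payload);
    }

    /** Test double that only remembers which entries it was asked to store. */
    static final class RecordingEntryStore implements EntryStore {
        final List<String> appended = new ArrayList<>();

        @Override
        public void append(long ledgerId, long entryId, byte[] payload) {
            appended.add(ledgerId + ":" + entryId);
        }
    }

    /** The bookie receives its collaborators instead of constructing them. */
    static final class SketchBookie {
        private final EntryStore store;

        SketchBookie(EntryStore store) {
            this.store = store;
        }

        void addEntry(long ledgerId, long entryId, byte[] payload) {
            store.append(ledgerId, entryId, payload);
        }
    }

    public static void main(String[] args) {
        RecordingEntryStore store = new RecordingEntryStore();
        SketchBookie bookie = new SketchBookie(store);
        bookie.addEntry(1L, 0L, new byte[] {1, 2, 3});
        System.out.println(store.appended);
    }
}
```

Because the store is passed in, a unit test can exercise the bookie logic without touching disk.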

One other thing we've done is to change the build system to use Gradle.
The major driver for this was that Maven is just slow, even before you
start running tests.
"mvn clean package -DskipTests" takes 4m30 on my laptop. "./gradlew
clean jar" takes 40s.
Subsequent builds on Gradle are much, much faster, as it does
incremental building.
Incremental building exists in Maven, but it doesn't work.
Gradle also handles multimodule projects better. If I make a change in
bookkeeper-common, "./gradlew :bookkeeper-server:test" will pick up the
change without my having to explicitly "mvn install" bookkeeper-common.
In my opinion it's just a much nicer build system to work with. Even the
poms it generates are better, as they avoid dependency pollution.

What are people's opinions on moving BookKeeper to Gradle (assuming
I/Splunk do the legwork)?
If people are open to it, I'll submit a BP.

Another thing that BK (and the whole ecosystem) is missing is
structured logging.
We also plan to add structured logging to BK soon. This is a major
motivator for converging the branches, as it touches a lot of places.
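
As a rough illustration of what structured logging means here, the sketch below builds a log line from named fields rather than a free-form message. It is a minimal sketch using only the JDK; the Event helper, the event name and the field names are hypothetical and do not reflect the design planned for BK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper; not the logging API planned for BookKeeper.
class StructuredLogSketch {

    /** One log event held as ordered key/value pairs instead of a free-form string. */
    static final class Event {
        private final Map<String, Object> fields = new LinkedHashMap<>();

        Event(String name) {
            fields.put("event", name);
        }

        /** Adds a named field and returns this event so calls can be chained. */
        Event kv(String key, Object value) {
            fields.put(key, value);
            return this;
        }

        /** Renders the fields as space-separated key=value pairs in insertion order. */
        String render() {
            return fields.entrySet().stream()
                    .map(e -> e.getKey() + "=" + e.getValue())
                    .collect(Collectors.joining(" "));
        }
    }

    public static void main(String[] args) {
        String line = new Event("ENTRY_ADDED")
                .kv("ledgerId", 42L)
                .kv("entryId", 7L)
                .render();
        System.out.println(line);
    }
}
```

Since every call site that logs would need to supply fields like this, a change of this kind touches code all over the tree.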

Anyhow, any feedback appreciated.

-Ivan

Re: Contributing Splunk changes back to OSS

Posted by Ivan Kelly <iv...@apache.org>.
> > What are people's opinions on moving BookKeeper to Gradle (assuming
> > I/Splunk do the legwork)?
> > If people are open to it, I'll submit a BP.
> >
>
> +1. My only question is how you do an Apache release. I'd like to see the BP
> cover that question.

Yes, this will need a BP to cover all the CI + release things.
This is a good thing though. It'll give us a chance to clean house.

-Ivan

Re: Contributing Splunk changes back to OSS

Posted by Sijie Guo <gu...@gmail.com>.
On Tue, Nov 10, 2020 at 3:09 AM Ivan Kelly <iv...@apache.org> wrote:

> Hi folks,
>
> It's been about a year since Streamlio joined Splunk and since then
> we've had a bit of forking with our BK branch.
> It has gotten to a stage where it's starting to be a problem for us,
> so we'd like to start to get things back in sync.
>
> There are a couple of big chunks of work to come back.
> We've added a data integrity checker that replaces a lot of the
> functionality of autorecovery and allows us to run without a journal.
> We refactored the bookie to allow dependency injection.
> We've rewritten the entry logger to use direct I/O (allowing 2GBps
> writes per bookie).
>

+1 Looking forward to the changes.


>
> One other thing we've done is to change the build system to use Gradle.
> The major driver for this was that Maven is just slow, even before you
> start running tests.
> "mvn clean package -DskipTests" takes 4m30 on my laptop. "./gradlew
> clean jar" takes 40s.
> Subsequent builds on Gradle are much, much faster, as it does
> incremental building.
> Incremental building exists in Maven, but it doesn't work.
> Gradle also handles multimodule projects better. If I make a change in
> bookkeeper-common, "./gradlew :bookkeeper-server:test" will pick up the
> change without my having to explicitly "mvn install" bookkeeper-common.
> In my opinion it's just a much nicer build system to work with. Even the
> poms it generates are better, as they avoid dependency pollution.
>
> What are people's opinions on moving BookKeeper to Gradle (assuming
> I/Splunk do the legwork)?
> If people are open to it, I'll submit a BP.
>

+1. My only question is how you do an Apache release. I'd like to see the BP
cover that question.


>
> Another thing that BK (and the whole ecosystem) is missing is
> structured logging.
> We also plan to add structured logging to BK soon. This is a major
> motivator for converging the branches, as it touches a lot of places.
>
> Anyhow, any feedback appreciated.
>
> -Ivan
>

Re: Contributing Splunk changes back to OSS

Posted by Enrico Olivelli <eo...@gmail.com>.
Pulsar is also in the middle of a big repository refactor: Sijie is moving
the adapters to another repository.
Let's do it in BK first. In the meantime the Pulsar repository will be
stable, and it is probably better to do it after the 2.7.0 release.
I guess it will take time.

Enrico

On Wed, Nov 11, 2020 at 10:16 AM Ivan Kelly <iv...@apache.org> wrote:

> > Based on feedback on this, I'd also like to later start a similar
> > Gradle proposal for Pulsar builds too.
> Yes, but let's wait and see how it goes here first.
>
> I'll start putting together a BP today.
>
> -Ivan
>

Re: Contributing Splunk changes back to OSS

Posted by Ivan Kelly <iv...@apache.org>.
> Based on feedback on this, I'd also like to later start a similar
> Gradle proposal for Pulsar builds too.
Yes, but let's wait and see how it goes here first.

I'll start putting together a BP today.

-Ivan

Re: Contributing Splunk changes back to OSS

Posted by Matteo Merli <mm...@apache.org>.
+1 from me on Gradle.

The advantages of having very fast (and correct) incremental builds
are very nice.

I wouldn't expect major problems in setting up for Apache releases
since several other ASF projects are already using it.

Based on feedback on this, I'd also like to later start a similar
Gradle proposal for Pulsar builds too.

Matteo

On Tue, Nov 10, 2020 at 3:08 AM Enrico Olivelli <eo...@gmail.com> wrote:
>
> On Tue, Nov 10, 2020 at 11:09 AM Ivan Kelly <iv...@apache.org> wrote:
>
> > Hi folks,
> >
> > It's been about a year since Streamlio joined Splunk and since then
> > we've had a bit of forking with our BK branch.
> > It has gotten to a stage where it's starting to be a problem for us,
> > so we'd like to start to get things back in sync.
> >
> > There are a couple of big chunks of work to come back.
> > We've added a data integrity checker that replaces a lot of the
> > functionality of autorecovery and allows us to run without a journal.
> > We refactored the bookie to allow dependency injection.
> > We've rewritten the entry logger to use direct I/O (allowing 2GBps
> > writes per bookie).
> >
>
> Cool, eager to see those changes in ASF BK
>
>
> >
> > One other thing we've done is to change the build system to use Gradle.
> > The major driver for this was that Maven is just slow, even before you
> > start running tests.
> > "mvn clean package -DskipTests" takes 4m30 on my laptop. "./gradlew
> > clean jar" takes 40s.
> > Subsequent builds on Gradle are much, much faster, as it does
> > incremental building.
> > Incremental building exists in Maven, but it doesn't work.
> > Gradle also handles multimodule projects better. If I make a change in
> > bookkeeper-common, "./gradlew :bookkeeper-server:test" will pick up the
> > change without my having to explicitly "mvn install" bookkeeper-common.
> > In my opinion it's just a much nicer build system to work with. Even the
> > poms it generates are better, as they avoid dependency pollution.
> >
>
> I am not a big fan of Gradle, but I don't want to start a battle. There are
> pros and cons.
> To me it is a matter of taste; both worlds are pretty widespread.
> Personally, I have experienced the move of some projects from Maven to
> Gradle with a little bit of pain,
> but as I said, I am not against a change.
>
> Usually changing the build system is problematic for:
> - existing contributors/committers
> - forked repositories
>
> If you have time and resources to drive the change and to help the
> community understand how to work with Gradle, I am happy to accept it.
> It will also be a good chance to reduce some tech debt: when you rewrite the
> build system/configuration you can decide to chop useless stuff that is only
> kept today because it is better not to fix things that aren't broken.
> So +1; a BP is a good starting point, please.
>
>
> > What are people's opinions on moving BookKeeper to Gradle (assuming
> > I/Splunk do the legwork)?
> > If people are open to it, I'll submit a BP.
> >
> > Another thing that BK (and the whole ecosystem) is missing is
> > structured logging.
> > We also plan to add structured logging to BK soon. This is a major
> > motivator for converging the branches, as it touches a lot of places.
> >
>
> +1
>
>
> > Anyhow, any feedback appreciated.
> >
>
> I am happy that the community can start to work together again as a whole
>
> Enrico
>
>
> >
> > -Ivan
> >

Re: Contributing Splunk changes back to OSS

Posted by Enrico Olivelli <eo...@gmail.com>.
On Tue, Nov 10, 2020 at 11:09 AM Ivan Kelly <iv...@apache.org> wrote:

> Hi folks,
>
> It's been about a year since Streamlio joined Splunk and since then
> we've had a bit of forking with our BK branch.
> It has gotten to a stage where it's starting to be a problem for us,
> so we'd like to start to get things back in sync.
>
> There are a couple of big chunks of work to come back.
> We've added a data integrity checker that replaces a lot of the
> functionality of autorecovery and allows us to run without a journal.
> We refactored the bookie to allow dependency injection.
> We've rewritten the entry logger to use direct I/O (allowing 2GBps
> writes per bookie).
>

Cool, eager to see those changes in ASF BK


>
> One other thing we've done is to change the build system to use Gradle.
> The major driver for this was that Maven is just slow, even before you
> start running tests.
> "mvn clean package -DskipTests" takes 4m30 on my laptop. "./gradlew
> clean jar" takes 40s.
> Subsequent builds on Gradle are much, much faster, as it does
> incremental building.
> Incremental building exists in Maven, but it doesn't work.
> Gradle also handles multimodule projects better. If I make a change in
> bookkeeper-common, "./gradlew :bookkeeper-server:test" will pick up the
> change without my having to explicitly "mvn install" bookkeeper-common.
> In my opinion it's just a much nicer build system to work with. Even the
> poms it generates are better, as they avoid dependency pollution.
>

I am not a big fan of Gradle, but I don't want to start a battle. There are
pros and cons.
To me it is a matter of taste; both worlds are pretty widespread.
Personally, I have experienced the move of some projects from Maven to
Gradle with a little bit of pain,
but as I said, I am not against a change.

Usually changing the build system is problematic for:
- existing contributors/committers
- forked repositories

If you have time and resources to drive the change and to help the
community understand how to work with Gradle, I am happy to accept it.
It will also be a good chance to reduce some tech debt: when you rewrite the
build system/configuration you can decide to chop useless stuff that is only
kept today because it is better not to fix things that aren't broken.
So +1; a BP is a good starting point, please.


> What are people's opinions on moving BookKeeper to Gradle (assuming
> I/Splunk do the legwork)?
> If people are open to it, I'll submit a BP.
>
> Another thing that BK (and the whole ecosystem) is missing is
> structured logging.
> We also plan to add structured logging to BK soon. This is a major
> motivator for converging the branches, as it touches a lot of places.
>

+1


> Anyhow, any feedback appreciated.
>

I am happy that the community can start to work together again as a whole

Enrico


>
> -Ivan
>

Re: Contributing Splunk changes back to OSS

Posted by Enrico Olivelli <eo...@gmail.com>.
Ivan,
I see that we are making good progress on the StreamStorage service and
on porting Splunk fixes, with the help of Andrey.

Do you already have a plan regarding:
- Journal by-pass
- Moving to Gradle
- AckQuorum < WriteQuorum issues

Regards
Enrico

On Thu, Nov 12, 2020 at 1:50 AM Jia Zhai <zh...@gmail.com> wrote:

> Thanks for this information, Ivan.
> Looking forward to the changes.
> A BP and docs for Gradle are very much needed. +1
>
> Best Regards.
>
>
> Jia Zhai
>
> Beijing, China
>
> Mobile: +86 15810491983
>
>
>
>
> On Thu, Nov 12, 2020 at 4:26 AM Anup Ghatage <gh...@gmail.com> wrote:
>
> > You can find the proposal doc here:
> > https://docs.google.com/document/d/1skocaIYC-9ZUsG7w2VAiGDbUG39ybYivrfvw9o5r2VA/edit?usp=sharing
> > In general, it's about audit logging, similar to ZOOKEEPER-1260 which
> > brought in Audit Logging to ZK.
> >
> > Regards,
> > Anup
> >
> >
> > On Wed, Nov 11, 2020 at 12:18 PM Ivan Kelly <iv...@apache.org> wrote:
> >
> > > > There might be slight overlap with BP-40 which I have in the works for
> > > > Audit logging for BKShell and Bkctl.
> > >
> > > Where is BP-40? I don't see it on the dev list.
> > >
> > > -Ivan
> > >
> >
> >
> > --
> > Anup Ghatage
> > www.ghatage.com
> >
>

Re: Contributing Splunk changes back to OSS

Posted by Jia Zhai <zh...@gmail.com>.
Thanks for this information, Ivan.
Looking forward to the changes.
A BP and docs for Gradle are very much needed. +1

Best Regards.


Jia Zhai

Beijing, China

Mobile: +86 15810491983




On Thu, Nov 12, 2020 at 4:26 AM Anup Ghatage <gh...@gmail.com> wrote:

> You can find the proposal doc here:
>
> https://docs.google.com/document/d/1skocaIYC-9ZUsG7w2VAiGDbUG39ybYivrfvw9o5r2VA/edit?usp=sharing
> In general, it's about audit logging, similar to ZOOKEEPER-1260 which
> brought in Audit Logging to ZK.
>
> Regards,
> Anup
>
>
> On Wed, Nov 11, 2020 at 12:18 PM Ivan Kelly <iv...@apache.org> wrote:
>
> > > There might be slight overlap with BP-40 which I have in the works for
> > > Audit logging for BKShell and Bkctl.
> >
> > Where is BP-40? I don't see it on the dev list.
> >
> > -Ivan
> >
>
>
> --
> Anup Ghatage
> www.ghatage.com
>

Re: Contributing Splunk changes back to OSS

Posted by Anup Ghatage <gh...@gmail.com>.
You can find the proposal doc here:
https://docs.google.com/document/d/1skocaIYC-9ZUsG7w2VAiGDbUG39ybYivrfvw9o5r2VA/edit?usp=sharing
In general, it's about audit logging, similar to ZOOKEEPER-1260 which
brought in Audit Logging to ZK.

Regards,
Anup


On Wed, Nov 11, 2020 at 12:18 PM Ivan Kelly <iv...@apache.org> wrote:

> > There might be slight overlap with BP-40 which I have in the works for
> > Audit logging for BKShell and Bkctl.
>
> Where is BP-40? I don't see it on the dev list.
>
> -Ivan
>


-- 
Anup Ghatage
www.ghatage.com

Re: Contributing Splunk changes back to OSS

Posted by Ivan Kelly <iv...@apache.org>.
> There might be slight overlap with BP-40 which I have in the works for
> Audit logging for BKShell and Bkctl.

Where is BP-40? I don't see it on the dev list.

-Ivan

Re: Contributing Splunk changes back to OSS

Posted by Anup Ghatage <gh...@gmail.com>.
On Tue, Nov 10, 2020 at 2:09 AM Ivan Kelly <iv...@apache.org> wrote:

> Hi folks,
>
> It's been about a year since Streamlio joined Splunk and since then
> we've had a bit of forking with our BK branch.
> It has gotten to a stage where it's starting to be a problem for us,
> so we'd like to start to get things back in sync.
>
>
+1 Very cool! Looking forward to them!


> There are a couple of big chunks of work to come back.
> We've added a data integrity checker that replaces a lot of the
> functionality of autorecovery and allows us to run without a journal.
> We refactored the bookie to allow dependency injection.
> We've rewritten the entry logger to use direct I/O (allowing 2GBps
> writes per bookie).
>
> One other thing we've done is to change the build system to use Gradle.
> The major driver for this was that Maven is just slow, even before you
> start running tests.
> "mvn clean package -DskipTests" takes 4m30 on my laptop. "./gradlew
> clean jar" takes 40s.
> Subsequent builds on Gradle are much, much faster, as it does
> incremental building.
> Incremental building exists in Maven, but it doesn't work.
> Gradle also handles multimodule projects better. If I make a change in
> bookkeeper-common, "./gradlew :bookkeeper-server:test" will pick up the
> change without my having to explicitly "mvn install" bookkeeper-common.
> In my opinion it's just a much nicer build system to work with. Even the
> poms it generates are better, as they avoid dependency pollution.
>
> What are people's opinions on moving BookKeeper to Gradle (assuming
> I/Splunk do the legwork)?
> If people are open to it, I'll submit a BP.
>
> Another thing that BK (and the whole ecosystem) is missing is
> structured logging.
> We also plan to add structured logging to BK soon. This is a major
> motivator for converging the branches, as it touches a lot of places.
>

There might be slight overlap with BP-40 which I have in the works for
Audit logging for BKShell and Bkctl.
We also have some more work in this area that we're looking to contribute
back soon as well.
Would like to have a discussion around this when we come to it.
Nonetheless, I'm very interested in this.


> Anyhow, any feedback appreciated.
>
> -Ivan
>


-- 
Anup Ghatage
www.ghatage.com