Posted to dev@cassandra.apache.org by François Deliège <fr...@instagram.com> on 2017/03/16 17:32:23 UTC

Code quality, principles and rules

Hi Dev,

What principles do we have? How do we implement them?

Our team has been evaluating 3.0.x and 3.x for a large production deployment.  We have noticed broken tests and have been working on several patches.  However, large parts of the code base are wildly untested, which makes new contributions more delicate.

All of this ultimately reduces our confidence in the new releases and slows down our adoption of the 3.0 / 3.x and future 4.0 releases.

So, I'd like to have a constructive discussion around 2 questions:

1. What principles should the community have in place about code quality to ensure its long-term productivity?
2. What are good implementations (as in rules) of these principles?

To get this started, here is an initial proposal:

Principles:

1. Tests always pass.  This is the starting point. If we don't care about test failures, then we should stop writing tests. A recurring failing test carries no signal and is better deleted.
2. The code is tested.

Assuming we can align on these principles, here is a proposal for their implementation.

Rules:

1. Each new release passes all tests (no flakiness).
2. If a patch has a failing test (test touching the same code path), the code or test should be fixed prior to being accepted.
3. Bug fixes should have one test that fails prior to the fix and passes after the fix.
4. New code should have at least 90% test coverage.
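As an illustration of rule 3, here is a minimal sketch of a fix-plus-regression-test pair. The class and method names are invented for this example and are not from the Cassandra tree:

```java
// Hypothetical example: a bug fix together with the regression test that
// fails before the fix and passes after it (rule 3).
class FreeSpaceCalculator
{
    // Pre-fix version returned totalBytes - usedBytes directly, which could
    // go negative when usedBytes > totalBytes; the fix clamps at zero.
    static long freeBytes(long totalBytes, long usedBytes)
    {
        return Math.max(0L, totalBytes - usedBytes);
    }
}

public class FreeSpaceCalculatorTest
{
    public static void main(String[] args)
    {
        // Fails against the pre-fix code (it returned -50); passes after the fix.
        if (FreeSpaceCalculator.freeBytes(100, 150) != 0)
            throw new AssertionError("free space must never be negative");
        if (FreeSpaceCalculator.freeBytes(200, 50) != 150)
            throw new AssertionError("normal case must be unaffected by the fix");
        System.out.println("ok");
    }
}
```

Committing the test in the same patch as the fix documents the bug and guards against regressions.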

Looking forward to reading your feedback,

@fdeliege


Re: Code quality, principles and rules

Posted by Jeff Jirsa <jj...@apache.org>.

On 2017-03-16 14:51 (-0700), Qingcun Zhou <zh...@gmail.com> wrote: 
>
> When we talk about code coverage for new code, should we encourage people
> to contribute unit test cases for existing code?
> 

Unit tests for existing untested code seems like something we'd welcome and encourage.




Re: Code quality, principles and rules

Posted by Qingcun Zhou <zh...@gmail.com>.
Performance consideration is a valid concern.

When I say "difficult to write unit test cases", I mean sometimes you need
to make method/variable package private, or create a package private
constructor so that you can inject some internal states, etc. This is more
like "annoying" if it's not "difficult". But I agree that good coding
practice leads to easier unit testing.
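The package-private constructor pattern described above can be sketched as follows; the class and field names here are invented for illustration, not taken from the Cassandra code base:

```java
import java.util.function.LongSupplier;

// Illustration of injecting internal state through a package-private
// constructor, as described above. Names are invented for this sketch.
public class HintWindow
{
    private final long windowMillis;
    private final LongSupplier clock;

    // Production constructor uses the real wall clock.
    public HintWindow(long windowMillis)
    {
        this(windowMillis, System::currentTimeMillis);
    }

    // Package-private: a test in the same package can inject a fake clock
    // without the class exposing its internals publicly.
    HintWindow(long windowMillis, LongSupplier clock)
    {
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    public boolean expired(long createdAtMillis)
    {
        return clock.getAsLong() - createdAtMillis > windowMillis;
    }
}
```

A test can then pin time with `new HintWindow(1000L, () -> 5000L)` and assert expiry deterministically, which is the "annoying but workable" trade-off described above.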

When we talk about code coverage for new code, should we encourage people
to contribute unit test cases for existing code?

On Thu, Mar 16, 2017 at 2:27 PM, DuyHai Doan <do...@gmail.com> wrote:

> "Otherwise it'll be difficult to write unit test cases."
>
> Having to pull in dependency injection framework to make unit testing
> easier is generally a sign of code design issue.
>
> With constructor injection & other techniques, there is more than enough to
> unit test your code without having to pull external libs
>
> On Thu, Mar 16, 2017 at 10:18 PM, Jason Brown <ja...@gmail.com>
> wrote:
>
> > >> do we have plan to integrate with a dependency injection framework?
> >
> > No, we (the maintainers) have been pretty much against more frameworks
> due
> > to performance reasons, overhead, and dependency management problems.
> >
> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
> > wrote:
> >
> > > Since we're here, do we have plan to integrate with a dependency
> > injection
> > > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > > cases.
> > >
> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <
> edlinuxguru@gmail.com>
> > > wrote:
> > >
> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org>
> wrote:
> > > >
> > > > >
> > > > >
> > > > > On 2017-03-16 10:32 (-0700), François Deliège <
> > francois@instagram.com>
> > > > > wrote:
> > > > > >
> > > > > > To get this started, here is an initial proposal:
> > > > > >
> > > > > > Principles:
> > > > > >
> > > > > > 1. Tests always pass.  This is the starting point. If we don't
> care
> > > > > about test failures, then we should stop writing tests. A recurring
> > > > failing
> > > > > test carries no signal and is better deleted.
> > > > > > 2. The code is tested.
> > > > > >
> > > > > > Assuming we can align on these principles, here is a proposal for
> > > their
> > > > > implementation.
> > > > > >
> > > > > > Rules:
> > > > > >
> > > > > > 1. Each new release passes all tests (no flakiness).
> > > > > > 2. If a patch has a failing test (test touching the same code
> > path),
> > > > the
> > > > > code or test should be fixed prior to being accepted.
> > > > > > 3. Bug fixes should have one test that fails prior to the fix
> and
> > > > > passes after the fix.
> > > > > > 4. New code should have at least 90% test coverage.
> > > > > >
> > > > > I agree with all of these and hope they become codified and
> > followed. I
> > > > > don't know anyone who believes we should be committing code that
> > breaks
> > > > > tests - but we should be more strict with requiring green test
> runs,
> > > and
> > > > > perhaps more strict with reverting patches that break tests (or
> cause
> > > > them
> > > > > to be flakey).
> > > > >
> > > > > Ed also noted on the user list [0] that certain sections of the
> code
> > > > > itself are difficult to test because of singletons - I agree with
> the
> > > > > suggestion that it's time to revisit CASSANDRA-7837 and
> > CASSANDRA-10283
> > > > >
> > > > > Finally, we should also recall Jason's previous notes [1] that the
> > > actual
> > > > > test infrastructure available is limited - the system provided by
> > > > Datastax
> > > > > is not generally open to everyone (and not guaranteed to be
> > permanent),
> > > > and
> > > > > the infrastructure currently available to the ASF is somewhat
> limited
> > > > (much
> > > > > slower, at the very least). If we require tests passing (and I
> agree
> > > that
> > > > > we should), we need to define how we're going to be testing (or how
> > > we're
> > > > > going to be sharing test results), because the ASF hardware isn't
> > going
> > > > to
> > > > > be able to do dozens of dev branch dtest runs per day in its
> current
> > > > form.
> > > > >
> > > > > 0: https://lists.apache.org/thread.html/
> > f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > > 1: https://lists.apache.org/thread.html/
> > 5fb8f0446ab97644100e4ef987f36e
> > > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > > >
> > > > >
> > > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > itself
> > > > are difficult to test because of singletons - I agree with the
> > suggestion
> > > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > > >
> > > > Thanks for the shout out!
> > > >
> > > > I was just looking at a patch about compaction. The patch was to
> > > calculate
> > > > free space correctly in case X. Compaction is not something that
> > requires
> > > > multiple nodes to test. The logic on the surface seems simple: find
> > > tables
> > > > of similar size and select them and merge them. The reality is it
> turns
> > > out
> > > > > not to be that way. The coverage itself both branch and line may be
> > very
> > > > high, but what the code does not do is directly account for a wide
> > > variety
> > > > of scenarios. Without direct tests you end up with a mental
> > approximation
> > > > of what it does and that varies from person to person and accounts
> for
> > > the
> > > > cases that fit in your mind. For example, you personally are only
> > running
> > > > LevelDB inspired compaction.
> > > >
> > > > Being that this is not a multi-node problem you should be able
> to
> > re
> > > > factor this heavily. Pulling everything out to a static method where
> > all
> > > > the parameters are arguments, or inject a lot of mocks in the current
> > > code,
> > > > and develop some scenario based coverage.
> > > >
> > > > That is how I typically "rescue" code I take over. I look at the
> > > nightmare
> > > > and say, "damn I am really afraid to touch this". I construct 8
> > scenarios
> > > > that test green. Then I force some testing into it through careful re
> > > > factoring. Now, I probably know -something- about it. Now, you are
> > fairly
> > > free to do a wide ranging refactor, because you at least accounted for
> 8
> > > > scenarios and you put unit test traps so that some rules are
> enforced.
> > > (Or
> > > > the person changing the code has to actively REMOVE your tests
> > asserting
> > > it
> > > > was not or no longer is valid). Later on you (or someone else)
> > __STILL__
> > > > might screw the entire thing up, but at least you can now build
> > forward.
> > > >
> > > > Anyway that patch on compaction was great and I am sure it improved
> > > things.
> > > > That being said it did not add any tests :). So it can easily be
> undone
> > > by
> > > > the next person who does not understand the specific issue trying to
> be
> > > > addressed. Inline comments almost scream to me 'we need a test'; not
> > > > everyone believes that.
> > > >
> > >
> > >
> > >
> > > --
> > > Thank you & Best Regards,
> > > --Simon (Qingcun) Zhou
> > >
> >
>



-- 
Thank you & Best Regards,
--Simon (Qingcun) Zhou

Re: Code quality, principles and rules

Posted by DuyHai Doan <do...@gmail.com>.
"Otherwise it'll be difficult to write unit test cases."

Having to pull in dependency injection framework to make unit testing
easier is generally a sign of code design issue.

With constructor injection & other techniques, there is more than enough to
unit test your code without having to pull external libs
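DuyHai's point can be made concrete with a short sketch. With the dependency behind a small interface and passed through the constructor, a one-line lambda stands in for it in tests, with no framework involved. The interface and class names below are invented for this illustration:

```java
// Constructor injection with a hand-rolled stub, no DI framework required.
// Names are invented for this sketch, not from the Cassandra tree.
interface DiskMetrics
{
    long freeBytes();
}

class CompactionGate
{
    private final DiskMetrics disk;
    private final long requiredBytes;

    // The collaborator arrives through the constructor, so a test can pass
    // any implementation, including a lambda.
    CompactionGate(DiskMetrics disk, long requiredBytes)
    {
        this.disk = disk;
        this.requiredBytes = requiredBytes;
    }

    boolean canCompact()
    {
        return disk.freeBytes() >= requiredBytes;
    }
}
```

In a unit test, `new CompactionGate(() -> 10_000L, 5_000L)` supplies a fake disk in one line, which is the whole argument against pulling in an external injection library.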

On Thu, Mar 16, 2017 at 10:18 PM, Jason Brown <ja...@gmail.com> wrote:

> >> do we have plan to integrate with a dependency injection framework?
>
> No, we (the maintainers) have been pretty much against more frameworks due
> to performance reasons, overhead, and dependency management problems.
>
> On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
> wrote:
>
> > Since we're here, do we have plan to integrate with a dependency
> injection
> > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > cases.
> >
> > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <ed...@gmail.com>
> > wrote:
> >
> > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:
> > >
> > > >
> > > >
> > > > On 2017-03-16 10:32 (-0700), François Deliège <
> francois@instagram.com>
> > > > wrote:
> > > > >
> > > > > To get this started, here is an initial proposal:
> > > > >
> > > > > Principles:
> > > > >
> > > > > 1. Tests always pass.  This is the starting point. If we don't care
> > > > about test failures, then we should stop writing tests. A recurring
> > > failing
> > > > test carries no signal and is better deleted.
> > > > > 2. The code is tested.
> > > > >
> > > > > Assuming we can align on these principles, here is a proposal for
> > their
> > > > implementation.
> > > > >
> > > > > Rules:
> > > > >
> > > > > 1. Each new release passes all tests (no flakiness).
> > > > > 2. If a patch has a failing test (test touching the same code
> path),
> > > the
> > > > code or test should be fixed prior to being accepted.
> > > > > 3. Bug fixes should have one test that fails prior to the fix and
> > > > passes after the fix.
> > > > > 4. New code should have at least 90% test coverage.
> > > > >
> > > > I agree with all of these and hope they become codified and
> followed. I
> > > > don't know anyone who believes we should be committing code that
> breaks
> > > > tests - but we should be more strict with requiring green test runs,
> > and
> > > > perhaps more strict with reverting patches that break tests (or cause
> > > them
> > > > to be flakey).
> > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > > itself are difficult to test because of singletons - I agree with the
> > > > suggestion that it's time to revisit CASSANDRA-7837 and
> CASSANDRA-10283
> > > >
> > > > Finally, we should also recall Jason's previous notes [1] that the
> > actual
> > > > test infrastructure available is limited - the system provided by
> > > Datastax
> > > > is not generally open to everyone (and not guaranteed to be
> permanent),
> > > and
> > > > the infrastructure currently available to the ASF is somewhat limited
> > > (much
> > > > slower, at the very least). If we require tests passing (and I agree
> > that
> > > > we should), we need to define how we're going to be testing (or how
> > we're
> > > > going to be sharing test results), because the ASF hardware isn't
> going
> > > to
> > > > be able to do dozens of dev branch dtest runs per day in its current
> > > form.
> > > >
> > > > 0: https://lists.apache.org/thread.html/
> f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > 1: https://lists.apache.org/thread.html/
> 5fb8f0446ab97644100e4ef987f36e
> > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > >
> > > >
> > > >
> > > Ed also noted on the user list [0] that certain sections of the code
> > itself
> > > are difficult to test because of singletons - I agree with the
> suggestion
> > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > >
> > > Thanks for the shout out!
> > >
> > > I was just looking at a patch about compaction. The patch was to
> > calculate
> > > free space correctly in case X. Compaction is not something that
> requires
> > > multiple nodes to test. The logic on the surface seems simple: find
> > tables
> > > of similar size and select them and merge them. The reality is it turns
> > out
> > > not to be that way. The coverage itself both branch and line may be
> very
> > > high, but what the code does not do is directly account for a wide
> > variety
> > > of scenarios. Without direct tests you end up with a mental
> approximation
> > > of what it does and that varies from person to person and accounts for
> > the
> > > cases that fit in your mind. For example, you personally are only
> running
> > > LevelDB inspired compaction.
> > >
> > > Being that this is not a multi-node problem you should be able to
> re
> > > factor this heavily. Pulling everything out to a static method where
> all
> > > the parameters are arguments, or inject a lot of mocks in the current
> > code,
> > > and develop some scenario based coverage.
> > >
> > > That is how I typically "rescue" code I take over. I look at the
> > nightmare
> > > and say, "damn I am really afraid to touch this". I construct 8
> scenarios
> > > that test green. Then I force some testing into it through careful re
> > > factoring. Now, I probably know -something- about it. Now, you are
> fairly
> > > free to do a wide ranging refactor, because you at least accounted for 8
> > > scenarios and you put unit test traps so that some rules are enforced.
> > (Or
> > > the person changing the code has to actively REMOVE your tests
> asserting
> > it
> > > was not or no longer is valid). Later on you (or someone else)
> __STILL__
> > > might screw the entire thing up, but at least you can now build
> forward.
> > >
> > > Anyway that patch on compaction was great and I am sure it improved
> > things.
> > > That being said it did not add any tests :). So it can easily be undone
> > by
> > > the next person who does not understand the specific issue trying to be
> > > addressed. Inline comments almost scream to me 'we need a test'; not
> > > everyone believes that.
> > >
> >
> >
> >
> > --
> > Thank you & Best Regards,
> > --Simon (Qingcun) Zhou
> >
>

Re: Code quality, principles and rules

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 03/22/2017 12:41 PM, François Deliège wrote:
> A first actionable step is to increase the visibility of the test
> coverage.  Ideally this would be integrated in the Jenkins run on
> Apache.  Michael Shuler, is this something you can take a look at?
> Let me know if we can help.

We've been running the jacoco coverage ant target on cassci for a very
long time, but cassci will be going away in the future. I will not have
time to add coverage to ASF Jenkins in the foreseeable future, so help
would certainly be appreciated in all things testing for the Apache
Cassandra project.

Someone interested in including coverage runs could add to the Job DSL
groovy under the jenkins-dsl/ dir - this is where all the jobs are
configured on the ASF Jenkins instance. Dikang /msg'ed me on IRC earlier
about coverage, too.

Here's the cassandra-builds repo that drives everything on the project's
ASF Jenkins jobs:
https://git1-us-west.apache.org/repos/asf?p=cassandra-builds.git

Here's where all the current project jobs are on ASF Jenkins:
https://builds.apache.org/view/A-D/view/Cassandra/

Here's info about how the Job DSL works - all job configs are managed
with DSL, with the exception of a couple scratch utility jobs:
https://github.com/jenkinsci/job-dsl-plugin/wiki

Here's info about ASF Jenkins:
https://cwiki.apache.org/confluence/display/INFRA/Jenkins

Basically, my task over the last few months has been to migrate Apache
Cassandra testing jobs to run on the Apache Foundation's Jenkins and
sunset CassCI. 95% of that task is done.

-- 
Warm regards,
Michael

Re: Code quality, principles and rules

Posted by François Deliège <fr...@instagram.com>.
Thanks everybody for chiming in.  I have not heard any concerns about the rules, so I’d like to move forward with some concrete steps in that direction.

A first actionable step is to increase the visibility of the test coverage.  Ideally this would be integrated in the Jenkins run on Apache.  Michael Shuler, is this something you can take a look at?  Let me know if we can help.

A second step, imho, is to get agreement among committers that no patch that decreases test coverage will be accepted by any committer.   Could a PMC member put this question to a vote?

Finally, I used my mad data analytics skills to look at the ratio of non-committer contributors during the last 6 months.    Turns out that 60% of the authors are not committers.   So as mentioned in the testing thread Jason started, when we think about testing, I think it makes sense to consider opening it up to non-committers.   Today the testing time ranges from 2 hours for unit tests to 1 day for integration tests.  I’d like to explore if we can throw some hardware at the problem.  I’d appreciate a pointer to who I should talk to.


Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Saturday, March 18, 2017, Qingcun Zhou <zh...@gmail.com> wrote:

> I wanted to contribute some unit test cases. However the unit test approach
> in Cassandra seems weird to me after looking into some examples. Not sure
> if anyone else has the same feeling.
>
> Usually, at least for all Java projects I have seen, people use mock
> (mockito, powermock) for dependencies. And then in a particular test case
> you verify the behavior using junit.assert* or mockito.verify. However we
> don't use mockito in Cassandra. Is there any reason for this? Without
> these, how easy do people think about adding unit test cases?
>
>
> Besides that, we have lots of singletons and there are already a handful of
> tickets to eliminate them. Maybe I missed something but I'm not seeing much
> progress. Is anyone actively working on this?
>
> Maybe a related problem. Some unit test cases have method annotated with
> @BeforeClass to do initialization work. However, it not only initializes
> direct dependencies, but also indirect ones, including loading
> cassandra.yaml and initializing indirect dependencies. This seems to me
> more like functional/integration test but not unit test style.
>
>
> On Fri, Mar 17, 2017 at 2:56 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com
> <javascript:;>>
> wrote:
>
> > https://issues.apache.org/jira/browse/CASSANDRA-7837 may be some
> > interesting context regarding what's been worked on to get rid of
> > singletons and static initialization.
> >
> > > On Mar 17, 2017, at 4:47 PM, Jonathan Haddad <jon@jonhaddad.com
> <javascript:;>> wrote:
> > >
> > > I'd like to think that if someone refactors existing code, making it
> more
> > > testable (with tests, of course) it should be acceptable on its own
> > > merit.  In fact, in my opinion it sometimes makes more sense to do
> these
> > > types of refactorings for the sole purpose of improving stability and
> > > testability as opposed to mixing them with features.
> > >
> > > You referenced the issue I fixed in one of the early emails.  The fix
> > > itself was a couple lines of code.  Refactoring the codebase to make it
> > > testable would have been a huge effort.  I wish I had time to do it.  I
> > > created CASSANDRA-13007 as a follow up with the intent of working on
> > > compaction from a purely architectural standpoint.  I think this type
> of
> > > thing should be done throughout the codebase.
> > >
> > > Removing the singletons is a good first step, my vote is we just rip
> off
> > > the bandaid, do it, and move forward.
> > >
> > > On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo <edlinuxguru@gmail.com
> <javascript:;>>
> > > wrote:
> > >
> > >>> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <jasedbrown@gmail.com
> <javascript:;>>
> > wrote:
> > >>>
> > >>> To François's point about code coverage for new code, I think this
> > makes
> > >> a
> > >>> lot of sense wrt large features (like the current work on
> > >> 8457/12229/9754).
> > >>> It's much simpler to (mentally, at least) isolate those changed
> > sections
> > >>> and it'll show up better in a code coverage report. With small
> patches,
> > >>> that might be harder to achieve - however, as the patch should come
> > with
> > >>> *some* tests (unless it's a truly trivial patch), it might just work
> > >> itself
> > >>> out.
> > >>>
> > >>> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <jasedbrown@gmail.com
> <javascript:;>>
> > >>> wrote:
> > >>>
> > >>>> As someone who spent a lot of time looking at the singletons topic
> in
> > >> the
> > >>>> past, Blake brings a great perspective here. Figuring out and
> > >>> communicating
> > >>>> how best to test with the system we have (and of course
> incrementally
> > >>>> making that system easier to work with/test) seems like an
> achievable
> > >>> goal.
> > >>>>
> > >>>> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
> > >> edlinuxguru@gmail.com <javascript:;>
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
> > >> beggleston@apple.com <javascript:;>
> > >>>>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> I think we’re getting a little ahead of ourselves talking about DI
> > >>>>>> frameworks. Before that even becomes something worth talking
> about,
> > >>> we’d
> > >>>>>> need to have made serious progress on un-spaghettifying Cassandra
> in
> > >>> the
> > >>>>>> first place. It’s an extremely tall order. Adding a DI framework
> > >> right
> > >>>>> now
> > >>>>>> would be like throwing gasoline on a raging tire fire.
> > >>>>>>
> > >>>>>> Removing singletons seems to come up every 6-12 months, and
> usually
> > >>>>>> abandoned once people figure out how difficult they are to remove
> > >>>>> properly.
> > >>>>>> I do think removing them *should* be a long term goal, but we
> really
> > >>>>> need
> > >>>>>> something more immediately actionable. Otherwise, nothing’s going
> to
> > >>>>>> happen, and we’ll be having this discussion again in a year or so
> > >> when
> > >>>>>> everyone’s angry that Cassandra 5.0 still isn’t ready for
> > >> production,
> > >>> a
> > >>>>>> year after it’s release.
> > >>>>>>
> > >>>>>> That said, the reason singletons regularly get brought up is
> because
> > >>>>> doing
> > >>>>>> extensive testing of anything in Cassandra is pretty much
> > >> impossible,
> > >>>>> since
> > >>>>>> the code is basically this big web of interconnected global state.
> > >>>>> Testing
> > >>>>>> anything in isolation can’t be done, which, for a distributed
> > >>> database,
> > >>>>> is
> > >>>>>> crazy. It’s a chronic problem that handicaps our ability to
> release
> > >> a
> > >>>>>> stable database.
> > >>>>>>
> > >>>>>> At this point, I think a more pragmatic approach would be to draft
> > >> and
> > >>>>>> enforce some coding standards that can be applied in day to day
> > >>>>> development
> > >>>>>> that drive incremental improvement of the testing and testability
> of
> > >>> the
> > >>>>>> project. What should be tested, how it should be tested. How to
> > >> write
> > >>>>> new
> > >>>>>> code that talks to the rest of Cassandra and is testable. How to
> fix
> > >>>>> bugs
> > >>>>>> in old code in a way that’s testable. We should also have some
> > >>>>> guidelines
> > >>>>>> around refactoring the wildly untested sections, how to get
> started,
> > >>>>> what
> > >>>>>> to do, what not to do, etc.
> > >>>>>>
> > >>>>>> Thoughts?
> > >>>>>
> > >>>>>
> > >>>>> To make the conversation practical. There is one class I personally
> > >>> really
> > >>>>> want to refactor so it can be tested:
> > >>>>>
> > >>>>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> > >>>>> apache/cassandra/net/OutboundTcpConnection.java
> > >>>>>
> > >>>>> There is little coverage here. Questions like:
> > >>>>> what errors cause the connection to restart?
> > >>>>> when are undropable messages are dropped?
> > >>>>> what happens when the queue fills up?
> > >>>>> Infamous throw new AssertionError(ex); (which probably bubble up to
> > >>>>> nowhere)
> > >>>>> what does the COALESCED strategy do in case XYZ.
> > >>>>> A nifty label (wow a label you just never see those much!)
> > >>>>> outer:
> > >>>>> while (!isStopped)
> > >>>>>
> > >>>>> Comments to jira's that probably are not explicitly tested:
> > >>>>> // If we haven't retried this message yet, put it back on the queue
> > to
> > >>>>> retry after re-connecting.
> > >>>>> // See CASSANDRA-5393 and CASSANDRA-12192.
> > >>>>>
> > >>>>> If I were to undertake this cleanup, would there actually be
> support?
> > >> IE
> > >>>>> if
> > >>>>> this going to turn into an "it aint broken. don't fix it thing" or
> a
> > >> "we
> > >>>>> don't want to change stuff just to add tests" . Like will someone
> > >> pledge
> > >>>>> to
> > >>>>> agree its kinda wonky and merge the effort in < 1 years time?
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >> So ...:) If open a ticket to refactor OutboundTcpConnection.java to do
> > >> specific unit testing and possibly even pull things out to the point
> > that I
> > >> can actually open a socket and to an end to end test will you/anyone
> > >> support that? (it sounds like your saying I must/should make a large
> > >> feature to add a test)
> > >>
> >
>
>
>
> --
> Thank you & Best Regards,
> --Simon (Qingcun) Zhou
>

I am not a huge fan of mock testing personally, but I see its benefit.

Strangely, Cassandra does not use Mockito, yet there are classes inside /test
named mockxyz.

Would a test by another name cover as sweet?
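The hand-written mocks Edward alludes to play the same role as a Mockito mock: record the interactions so the test can verify them. A minimal sketch (the interface and class names are invented here, not taken from Cassandra's /test tree):

```java
import java.util.ArrayList;
import java.util.List;

// A hand-rolled mock: records calls so a test can verify behavior,
// which is the role mockito.verify would otherwise play.
interface MessageSink
{
    void send(String message);
}

class MockMessageSink implements MessageSink
{
    final List<String> sent = new ArrayList<>();

    public void send(String message)
    {
        sent.add(message);
    }
}

class Gossiper
{
    private final MessageSink sink;

    Gossiper(MessageSink sink)
    {
        this.sink = sink;
    }

    void announce(String state)
    {
        sink.send("GOSSIP:" + state);
    }
}
```

A test then asserts on `sink.sent` directly; functionally equivalent to `verify(sink).send("GOSSIP:UP")`, just written by hand.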


-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Sat, Mar 18, 2017 at 9:21 PM, Qingcun Zhou <zh...@gmail.com> wrote:

> I wanted to contribute some unit test cases. However the unit test approach
> in Cassandra seems weird to me after looking into some examples. Not sure
> if anyone else has the same feeling.
>
> Usually, at least for all Java projects I have seen, people use mock
> (mockito, powermock) for dependencies. And then in a particular test case
> you verify the behavior using junit.assert* or mockito.verify. However we
> don't use mockito in Cassandra. Is there any reason for this? Without
> these, how easy do people think about adding unit test cases?
>
>
> Besides that, we have lots of singletons and there are already a handful of
> tickets to eliminate them. Maybe I missed something but I'm not seeing much
> progress. Is anyone actively working on this?
>
> Maybe a related problem. Some unit test cases have method annotated with
> @BeforeClass to do initialization work. However, it not only initializes
> direct dependencies, but also indirect ones, including loading
> cassandra.yaml and initializing indirect dependencies. This seems to me
> more like functional/integration test but not unit test style.
>
>
> On Fri, Mar 17, 2017 at 2:56 PM, Jeremy Hanna <je...@gmail.com>
> wrote:
>
> > https://issues.apache.org/jira/browse/CASSANDRA-7837 may be some
> > interesting context regarding what's been worked on to get rid of
> > singletons and static initialization.
> >
> > > On Mar 17, 2017, at 4:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
> wrote:
> > >
> > > I'd like to think that if someone refactors existing code, making it
> more
> > > testable (with tests, of course) it should be acceptable on its own
> > > merit.  In fact, in my opinion it sometimes makes more sense to do
> these
> > > types of refactorings for the sole purpose of improving stability and
> > > testability as opposed to mixing them with features.
> > >
> > > You referenced the issue I fixed in one of the early emails.  The fix
> > > itself was a couple lines of code.  Refactoring the codebase to make it
> > > testable would have been a huge effort.  I wish I had time to do it.  I
> > > created CASSANDRA-13007 as a follow up with the intent of working on
> > > compaction from a purely architectural standpoint.  I think this type
> of
> > > thing should be done throughout the codebase.
> > >
> > > Removing the singletons is a good first step, my vote is we just rip
> off
> > > the bandaid, do it, and move forward.
> > >
> > > On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo <edlinuxguru@gmail.com
> >
> > > wrote:
> > >
> > >>> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <ja...@gmail.com>
> > wrote:
> > >>>
> > >>> To François's point about code coverage for new code, I think this
> > makes
> > >> a
> > >>> lot of sense wrt large features (like the current work on
> > >> 8457/12229/9754).
> > >>> It's much simpler to (mentally, at least) isolate those changed
> > sections
> > >>> and it'll show up better in a code coverage report. With small
> patches,
> > >>> that might be harder to achieve - however, as the patch should come
> > with
> > >>> *some* tests (unless it's a truly trivial patch), it might just work
> > >> itself
> > >>> out.
> > >>>
> > >>> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> As someone who spent a lot of time looking at the singletons topic
> in
> > >> the
> > >>>> past, Blake brings a great perspective here. Figuring out and
> > >>> communicating
> > >>>> how best to test with the system we have (and of course
> incrementally
> > >>>> making that system easier to work with/test) seems like an
> achievable
> > >>> goal.
> > >>>>
> > >>>> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
> > >> edlinuxguru@gmail.com
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
> > >> beggleston@apple.com
> > >>>>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> I think we’re getting a little ahead of ourselves talking about DI
> > >>>>>> frameworks. Before that even becomes something worth talking
> about,
> > >>> we’d
> > >>>>>> need to have made serious progress on un-spaghettifying Cassandra
> in
> > >>> the
> > >>>>>> first place. It’s an extremely tall order. Adding a DI framework
> > >> right
> > >>>>> now
> > >>>>>> would be like throwing gasoline on a raging tire fire.
> > >>>>>>
> > >>>>>> Removing singletons seems to come up every 6-12 months, and
> usually
> > >>>>>> abandoned once people figure out how difficult they are to remove
> > >>>>> properly.
> > >>>>>> I do think removing them *should* be a long term goal, but we
> really
> > >>>>> need
> > >>>>>> something more immediately actionable. Otherwise, nothing’s going
> to
> > >>>>>> happen, and we’ll be having this discussion again in a year or so
> > >> when
> > >>>>>> everyone’s angry that Cassandra 5.0 still isn’t ready for
> > >> production,
> > >>> a
> > >>>>>> year after it’s release.
> > >>>>>>
> > >>>>>> That said, the reason singletons regularly get brought up is
> because
> > >>>>> doing
> > >>>>>> extensive testing of anything in Cassandra is pretty much
> > >> impossible,
> > >>>>> since
> > >>>>>> the code is basically this big web of interconnected global state.
> > >>>>> Testing
> > >>>>>> anything in isolation can’t be done, which, for a distributed
> > >>> database,
> > >>>>> is
> > >>>>>> crazy. It’s a chronic problem that handicaps our ability to
> release
> > >> a
> > >>>>>> stable database.
> > >>>>>>
> > >>>>>> At this point, I think a more pragmatic approach would be to draft
> > >> and
> > >>>>>> enforce some coding standards that can be applied in day to day
> > >>>>> development
> > >>>>>> that drive incremental improvement of the testing and testability
> of
> > >>> the
> > >>>>>> project. What should be tested, how it should be tested. How to
> > >> write
> > >>>>> new
> > >>>>>> code that talks to the rest of Cassandra and is testable. How to
> fix
> > >>>>> bugs
> > >>>>>> in old code in a way that’s testable. We should also have some
> > >>>>> guidelines
> > >>>>>> around refactoring the wildly untested sections, how to get
> started,
> > >>>>> what
> > >>>>>> to do, what not to do, etc.
> > >>>>>>
> > >>>>>> Thoughts?
> > >>>>>
> > >>>>>
> > >>>>> To make the conversation practical. There is one class I personally
> > >>> really
> > >>>>> want to refactor so it can be tested:
> > >>>>>
> > >>>>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> > >>>>> apache/cassandra/net/OutboundTcpConnection.java
> > >>>>>
> > >>>>> There is little coverage here. Questions like:
> > >>>>> what errors cause the connection to restart?
> > >>>>> when are undropable messages are dropped?
> > >>>>> what happens when the queue fills up?
> > >>>>> Infamous throw new AssertionError(ex); (which probably bubble up to
> > >>>>> nowhere)
> > >>>>> what does the COALESCED strategy do in case XYZ.
> > >>>>> A nifty label (wow a label you just never see those much!)
> > >>>>> outer:
> > >>>>> while (!isStopped)
> > >>>>>
> > >>>>> Comments to jira's that probably are not explicitly tested:
> > >>>>> // If we haven't retried this message yet, put it back on the queue
> > to
> > >>>>> retry after re-connecting.
> > >>>>> // See CASSANDRA-5393 and CASSANDRA-12192.
> > >>>>>
> > >>>>> If I were to undertake this cleanup, would there actually be
> support?
> > >> IE
> > >>>>> if
> > >>>>> this going to turn into an "it aint broken. don't fix it thing" or
> a
> > >> "we
> > >>>>> don't want to change stuff just to add tests" . Like will someone
> > >> pledge
> > >>>>> to
> > >>>>> agree its kinda wonky and merge the effort in < 1 years time?
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >> So ...:) If open a ticket to refactor OutboundTcpConnection.java to do
> > >> specific unit testing and possibly even pull things out to the point
> > that I
> > >> can actually open a socket and to an end to end test will you/anyone
> > >> support that? (it sounds like your saying I must/should make a large
> > >> feature to add a test)
> > >>
> >
>
>
>
> --
> Thank you & Best Regards,
> --Simon (Qingcun) Zhou
>

Here is an example of a DIY mockito:

https://issues.apache.org/jira/secure/attachment/12817379/12016-trunk.patch

It reminds me of a saying I heard once:

When you were a child you had a birthday party and a clown scared you. Now
you are all grown up and you have a kid who is having a party. You do not
want to scare your kid, so you don't get a clown. Since you do not have a
clown, you run around telling jokes, blowing up balloons, sitting on
whoopee cushions, and blowing kazoos.

A balloon pops, a kid cries, and at one point it occurs to you: "Holy crap!
I am the f in clown!"
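
In the same spirit, a hand-rolled test double is just a class that records its interactions so a test can assert on them afterwards — roughly what mockito.verify automates. A minimal sketch of the idea (all class and method names below are invented for illustration, not taken from that patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MockDemo {
    interface MessageSink { void send(String msg); }

    // Hand-rolled test double: records every interaction for later
    // verification instead of touching a real socket.
    static class RecordingSink implements MessageSink {
        final List<String> sent = new ArrayList<>();
        public void send(String msg) { sent.add(msg); }
    }

    // Code under test receives its collaborator as a parameter
    // rather than reaching for a singleton.
    static void flush(List<String> queue, MessageSink sink) {
        for (String msg : queue)
            sink.send(msg);
        queue.clear();
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        List<String> queue = new ArrayList<>(Arrays.asList("a", "b"));
        flush(queue, sink);
        System.out.println(sink.sent); // the test asserts on the recorded calls
    }
}
```

The point is that once the collaborator is an interface, the "mock" is ten lines of plain Java — no framework required.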

Re: Code quality, principles and rules

Posted by Qingcun Zhou <zh...@gmail.com>.
I wanted to contribute some unit test cases. However, the unit test approach
in Cassandra seems weird to me after looking into some examples. Not sure
if anyone else has the same feeling.

Usually, at least in the Java projects I have seen, people use mocks
(Mockito, PowerMock) for dependencies, and then in a particular test case
verify the behavior using junit.assert* or mockito.verify. However, we
don't use Mockito in Cassandra. Is there any reason for this? Without these
tools, how easy is it to add unit test cases?


Besides that, we have lots of singletons, and there are already a handful of
tickets to eliminate them. Maybe I missed something, but I'm not seeing much
progress. Is anyone actively working on this?

Maybe a related problem: some unit test cases have a method annotated with
@BeforeClass to do initialization work. However, it initializes not only
direct dependencies but also indirect ones, including loading
cassandra.yaml. This seems to me more like a functional/integration test
than a unit test.
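
For what it's worth, the contrast can be sketched with a toy example. The names below are invented, with the Config interface merely standing in for DatabaseDescriptor-style global state; when the dependency arrives through the constructor, a test can pass a one-line stub instead of loading cassandra.yaml in @BeforeClass:

```java
public class InjectionDemo {
    // Hypothetical config interface; stands in for global, yaml-backed state.
    interface Config { long rpcTimeoutMillis(); }

    // The dependency is injected, so the class is a true unit:
    // no global initialization is needed to exercise its logic.
    static class TimeoutPolicy {
        private final Config config;
        TimeoutPolicy(Config config) { this.config = config; }

        boolean isTimedOut(long elapsedMillis) {
            return elapsedMillis > config.rpcTimeoutMillis();
        }
    }

    public static void main(String[] args) {
        // A lambda is a sufficient stub -- no yaml, no @BeforeClass.
        TimeoutPolicy policy = new TimeoutPolicy(() -> 1000L);
        System.out.println(policy.isTimedOut(1500));
    }
}
```

With the singleton style, the same test would first have to bootstrap whatever the global reads at class-load time, which is exactly the functional-test smell described above.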


On Fri, Mar 17, 2017 at 2:56 PM, Jeremy Hanna <je...@gmail.com>
wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-7837 may be some
> interesting context regarding what's been worked on to get rid of
> singletons and static initialization.
>
> > On Mar 17, 2017, at 4:47 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> >
> > I'd like to think that if someone refactors existing code, making it more
> > testable (with tests, of course) it should be acceptable on it's own
> > merit.  In fact, in my opinion it sometimes makes more sense to do these
> > types of refactorings for the sole purpose of improving stability and
> > testability as opposed to mixing them with features.
> >
> > You referenced the issue I fixed in one of the early emails.  The fix
> > itself was a couple lines of code.  Refactoring the codebase to make it
> > testable would have been a huge effort.  I wish I had time to do it.  I
> > created CASSANDRA-13007 as a follow up with the intent of working on
> > compaction from a purely architectural standpoint.  I think this type of
> > thing should be done throughout the codebase.
> >
> > Removing the singletons is a good first step, my vote is we just rip off
> > the bandaid, do it, and move forward.
> >
> > On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo <ed...@gmail.com>
> > wrote:
> >
> >>> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <ja...@gmail.com>
> wrote:
> >>>
> >>> To François's point about code coverage for new code, I think this
> makes
> >> a
> >>> lot of sense wrt large features (like the current work on
> >> 8457/12229/9754).
> >>> It's much simpler to (mentally, at least) isolate those changed
> sections
> >>> and it'll show up better in a code coverage report. With small patches,
> >>> that might be harder to achieve - however, as the patch should come
> with
> >>> *some* tests (unless it's a truly trivial patch), it might just work
> >> itself
> >>> out.
> >>>
> >>> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com>
> >>> wrote:
> >>>
> >>>> As someone who spent a lot of time looking at the singletons topic in
> >> the
> >>>> past, Blake brings a great perspective here. Figuring out and
> >>> communicating
> >>>> how best to test with the system we have (and of course incrementally
> >>>> making that system easier to work with/test) seems like an achievable
> >>> goal.
> >>>>
> >>>> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
> >> edlinuxguru@gmail.com
> >>>>
> >>>> wrote:
> >>>>
> >>>>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
> >> beggleston@apple.com
> >>>>
> >>>>> wrote:
> >>>>>
> >>>>>> I think we’re getting a little ahead of ourselves talking about DI
> >>>>>> frameworks. Before that even becomes something worth talking about,
> >>> we’d
> >>>>>> need to have made serious progress on un-spaghettifying Cassandra in
> >>> the
> >>>>>> first place. It’s an extremely tall order. Adding a DI framework
> >> right
> >>>>> now
> >>>>>> would be like throwing gasoline on a raging tire fire.
> >>>>>>
> >>>>>> Removing singletons seems to come up every 6-12 months, and usually
> >>>>>> abandoned once people figure out how difficult they are to remove
> >>>>> properly.
> >>>>>> I do think removing them *should* be a long term goal, but we really
> >>>>> need
> >>>>>> something more immediately actionable. Otherwise, nothing’s going to
> >>>>>> happen, and we’ll be having this discussion again in a year or so
> >> when
> >>>>>> everyone’s angry that Cassandra 5.0 still isn’t ready for
> >> production,
> >>> a
> >>>>>> year after it’s release.
> >>>>>>
> >>>>>> That said, the reason singletons regularly get brought up is because
> >>>>> doing
> >>>>>> extensive testing of anything in Cassandra is pretty much
> >> impossible,
> >>>>> since
> >>>>>> the code is basically this big web of interconnected global state.
> >>>>> Testing
> >>>>>> anything in isolation can’t be done, which, for a distributed
> >>> database,
> >>>>> is
> >>>>>> crazy. It’s a chronic problem that handicaps our ability to release
> >> a
> >>>>>> stable database.
> >>>>>>
> >>>>>> At this point, I think a more pragmatic approach would be to draft
> >> and
> >>>>>> enforce some coding standards that can be applied in day to day
> >>>>> development
> >>>>>> that drive incremental improvement of the testing and testability of
> >>> the
> >>>>>> project. What should be tested, how it should be tested. How to
> >> write
> >>>>> new
> >>>>>> code that talks to the rest of Cassandra and is testable. How to fix
> >>>>> bugs
> >>>>>> in old code in a way that’s testable. We should also have some
> >>>>> guidelines
> >>>>>> around refactoring the wildly untested sections, how to get started,
> >>>>> what
> >>>>>> to do, what not to do, etc.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>
> >>>>>
> >>>>> To make the conversation practical. There is one class I personally
> >>> really
> >>>>> want to refactor so it can be tested:
> >>>>>
> >>>>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> >>>>> apache/cassandra/net/OutboundTcpConnection.java
> >>>>>
> >>>>> There is little coverage here. Questions like:
> >>>>> what errors cause the connection to restart?
> >>>>> when are undropable messages are dropped?
> >>>>> what happens when the queue fills up?
> >>>>> Infamous throw new AssertionError(ex); (which probably bubble up to
> >>>>> nowhere)
> >>>>> what does the COALESCED strategy do in case XYZ.
> >>>>> A nifty label (wow a label you just never see those much!)
> >>>>> outer:
> >>>>> while (!isStopped)
> >>>>>
> >>>>> Comments to jira's that probably are not explicitly tested:
> >>>>> // If we haven't retried this message yet, put it back on the queue
> to
> >>>>> retry after re-connecting.
> >>>>> // See CASSANDRA-5393 and CASSANDRA-12192.
> >>>>>
> >>>>> If I were to undertake this cleanup, would there actually be support?
> >> IE
> >>>>> if
> >>>>> this going to turn into an "it aint broken. don't fix it thing" or a
> >> "we
> >>>>> don't want to change stuff just to add tests" . Like will someone
> >> pledge
> >>>>> to
> >>>>> agree its kinda wonky and merge the effort in < 1 years time?
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >> So ...:) If open a ticket to refactor OutboundTcpConnection.java to do
> >> specific unit testing and possibly even pull things out to the point
> that I
> >> can actually open a socket and to an end to end test will you/anyone
> >> support that? (it sounds like your saying I must/should make a large
> >> feature to add a test)
> >>
>



-- 
Thank you & Best Regards,
--Simon (Qingcun) Zhou

Re: Code quality, principles and rules

Posted by Jeremy Hanna <je...@gmail.com>.
https://issues.apache.org/jira/browse/CASSANDRA-7837 may be some interesting context regarding what's been worked on to get rid of singletons and static initialization.

> On Mar 17, 2017, at 4:47 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> 
> I'd like to think that if someone refactors existing code, making it more
> testable (with tests, of course) it should be acceptable on it's own
> merit.  In fact, in my opinion it sometimes makes more sense to do these
> types of refactorings for the sole purpose of improving stability and
> testability as opposed to mixing them with features.
> 
> You referenced the issue I fixed in one of the early emails.  The fix
> itself was a couple lines of code.  Refactoring the codebase to make it
> testable would have been a huge effort.  I wish I had time to do it.  I
> created CASSANDRA-13007 as a follow up with the intent of working on
> compaction from a purely architectural standpoint.  I think this type of
> thing should be done throughout the codebase.
> 
> Removing the singletons is a good first step, my vote is we just rip off
> the bandaid, do it, and move forward.
> 
> On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo <ed...@gmail.com>
> wrote:
> 
>>> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <ja...@gmail.com> wrote:
>>> 
>>> To François's point about code coverage for new code, I think this makes
>> a
>>> lot of sense wrt large features (like the current work on
>> 8457/12229/9754).
>>> It's much simpler to (mentally, at least) isolate those changed sections
>>> and it'll show up better in a code coverage report. With small patches,
>>> that might be harder to achieve - however, as the patch should come with
>>> *some* tests (unless it's a truly trivial patch), it might just work
>> itself
>>> out.
>>> 
>>> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com>
>>> wrote:
>>> 
>>>> As someone who spent a lot of time looking at the singletons topic in
>> the
>>>> past, Blake brings a great perspective here. Figuring out and
>>> communicating
>>>> how best to test with the system we have (and of course incrementally
>>>> making that system easier to work with/test) seems like an achievable
>>> goal.
>>>> 
>>>> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
>> edlinuxguru@gmail.com
>>>> 
>>>> wrote:
>>>> 
>>>>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
>> beggleston@apple.com
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> I think we’re getting a little ahead of ourselves talking about DI
>>>>>> frameworks. Before that even becomes something worth talking about,
>>> we’d
>>>>>> need to have made serious progress on un-spaghettifying Cassandra in
>>> the
>>>>>> first place. It’s an extremely tall order. Adding a DI framework
>> right
>>>>> now
>>>>>> would be like throwing gasoline on a raging tire fire.
>>>>>> 
>>>>>> Removing singletons seems to come up every 6-12 months, and usually
>>>>>> abandoned once people figure out how difficult they are to remove
>>>>> properly.
>>>>>> I do think removing them *should* be a long term goal, but we really
>>>>> need
>>>>>> something more immediately actionable. Otherwise, nothing’s going to
>>>>>> happen, and we’ll be having this discussion again in a year or so
>> when
>>>>>> everyone’s angry that Cassandra 5.0 still isn’t ready for
>> production,
>>> a
>>>>>> year after it’s release.
>>>>>> 
>>>>>> That said, the reason singletons regularly get brought up is because
>>>>> doing
>>>>>> extensive testing of anything in Cassandra is pretty much
>> impossible,
>>>>> since
>>>>>> the code is basically this big web of interconnected global state.
>>>>> Testing
>>>>>> anything in isolation can’t be done, which, for a distributed
>>> database,
>>>>> is
>>>>>> crazy. It’s a chronic problem that handicaps our ability to release
>> a
>>>>>> stable database.
>>>>>> 
>>>>>> At this point, I think a more pragmatic approach would be to draft
>> and
>>>>>> enforce some coding standards that can be applied in day to day
>>>>> development
>>>>>> that drive incremental improvement of the testing and testability of
>>> the
>>>>>> project. What should be tested, how it should be tested. How to
>> write
>>>>> new
>>>>>> code that talks to the rest of Cassandra and is testable. How to fix
>>>>> bugs
>>>>>> in old code in a way that’s testable. We should also have some
>>>>> guidelines
>>>>>> around refactoring the wildly untested sections, how to get started,
>>>>> what
>>>>>> to do, what not to do, etc.
>>>>>> 
>>>>>> Thoughts?
>>>>> 
>>>>> 
>>>>> To make the conversation practical. There is one class I personally
>>> really
>>>>> want to refactor so it can be tested:
>>>>> 
>>>>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
>>>>> apache/cassandra/net/OutboundTcpConnection.java
>>>>> 
>>>>> There is little coverage here. Questions like:
>>>>> what errors cause the connection to restart?
>>>>> when are undropable messages are dropped?
>>>>> what happens when the queue fills up?
>>>>> Infamous throw new AssertionError(ex); (which probably bubble up to
>>>>> nowhere)
>>>>> what does the COALESCED strategy do in case XYZ.
>>>>> A nifty label (wow a label you just never see those much!)
>>>>> outer:
>>>>> while (!isStopped)
>>>>> 
>>>>> Comments to jira's that probably are not explicitly tested:
>>>>> // If we haven't retried this message yet, put it back on the queue to
>>>>> retry after re-connecting.
>>>>> // See CASSANDRA-5393 and CASSANDRA-12192.
>>>>> 
>>>>> If I were to undertake this cleanup, would there actually be support?
>> IE
>>>>> if
>>>>> this going to turn into an "it aint broken. don't fix it thing" or a
>> "we
>>>>> don't want to change stuff just to add tests" . Like will someone
>> pledge
>>>>> to
>>>>> agree its kinda wonky and merge the effort in < 1 years time?
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> So ...:) If open a ticket to refactor OutboundTcpConnection.java to do
>> specific unit testing and possibly even pull things out to the point that I
>> can actually open a socket and to an end to end test will you/anyone
>> support that? (it sounds like your saying I must/should make a large
>> feature to add a test)
>> 

Re: Code quality, principles and rules

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
I'd like to think that if someone refactors existing code, making it more
testable (with tests, of course), it should be acceptable on its own
merit.  In fact, in my opinion it sometimes makes more sense to do these
types of refactorings for the sole purpose of improving stability and
testability as opposed to mixing them with features.

You referenced the issue I fixed in one of the early emails.  The fix
itself was a couple lines of code.  Refactoring the codebase to make it
testable would have been a huge effort.  I wish I had time to do it.  I
created CASSANDRA-13007 as a follow up with the intent of working on
compaction from a purely architectural standpoint.  I think this type of
thing should be done throughout the codebase.

Removing the singletons is a good first step, my vote is we just rip off
the bandaid, do it, and move forward.

On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo <ed...@gmail.com>
wrote:

> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <ja...@gmail.com> wrote:
>
> > To François's point about code coverage for new code, I think this makes
> a
> > lot of sense wrt large features (like the current work on
> 8457/12229/9754).
> > It's much simpler to (mentally, at least) isolate those changed sections
> > and it'll show up better in a code coverage report. With small patches,
> > that might be harder to achieve - however, as the patch should come with
> > *some* tests (unless it's a truly trivial patch), it might just work
> itself
> > out.
> >
> > On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com>
> > wrote:
> >
> > > As someone who spent a lot of time looking at the singletons topic in
> the
> > > past, Blake brings a great perspective here. Figuring out and
> > communicating
> > > how best to test with the system we have (and of course incrementally
> > > making that system easier to work with/test) seems like an achievable
> > goal.
> > >
> > > On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
> edlinuxguru@gmail.com
> > >
> > > wrote:
> > >
> > >> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
> beggleston@apple.com
> > >
> > >> wrote:
> > >>
> > >> > I think we’re getting a little ahead of ourselves talking about DI
> > >> > frameworks. Before that even becomes something worth talking about,
> > we’d
> > >> > need to have made serious progress on un-spaghettifying Cassandra in
> > the
> > >> > first place. It’s an extremely tall order. Adding a DI framework
> right
> > >> now
> > >> > would be like throwing gasoline on a raging tire fire.
> > >> >
> > >> > Removing singletons seems to come up every 6-12 months, and usually
> > >> > abandoned once people figure out how difficult they are to remove
> > >> properly.
> > >> > I do think removing them *should* be a long term goal, but we really
> > >> need
> > >> > something more immediately actionable. Otherwise, nothing’s going to
> > >> > happen, and we’ll be having this discussion again in a year or so
> when
> > >> > everyone’s angry that Cassandra 5.0 still isn’t ready for
> production,
> > a
> > >> > year after it’s release.
> > >> >
> > >> > That said, the reason singletons regularly get brought up is because
> > >> doing
> > >> > extensive testing of anything in Cassandra is pretty much
> impossible,
> > >> since
> > >> > the code is basically this big web of interconnected global state.
> > >> Testing
> > >> > anything in isolation can’t be done, which, for a distributed
> > database,
> > >> is
> > >> > crazy. It’s a chronic problem that handicaps our ability to release
> a
> > >> > stable database.
> > >> >
> > >> > At this point, I think a more pragmatic approach would be to draft
> and
> > >> > enforce some coding standards that can be applied in day to day
> > >> development
> > >> > that drive incremental improvement of the testing and testability of
> > the
> > >> > project. What should be tested, how it should be tested. How to
> write
> > >> new
> > >> > code that talks to the rest of Cassandra and is testable. How to fix
> > >> bugs
> > >> > in old code in a way that’s testable. We should also have some
> > >> guidelines
> > >> > around refactoring the wildly untested sections, how to get started,
> > >> what
> > >> > to do, what not to do, etc.
> > >> >
> > >> > Thoughts?
> > >>
> > >>
> > >> To make the conversation practical. There is one class I personally
> > really
> > >> want to refactor so it can be tested:
> > >>
> > >> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> > >> apache/cassandra/net/OutboundTcpConnection.java
> > >>
> > >> There is little coverage here. Questions like:
> > >> what errors cause the connection to restart?
> > >> when are undropable messages are dropped?
> > >> what happens when the queue fills up?
> > >> Infamous throw new AssertionError(ex); (which probably bubble up to
> > >> nowhere)
> > >> what does the COALESCED strategy do in case XYZ.
> > >> A nifty label (wow a label you just never see those much!)
> > >> outer:
> > >> while (!isStopped)
> > >>
> > >> Comments to jira's that probably are not explicitly tested:
> > >> // If we haven't retried this message yet, put it back on the queue to
> > >> retry after re-connecting.
> > >> // See CASSANDRA-5393 and CASSANDRA-12192.
> > >>
> > >> If I were to undertake this cleanup, would there actually be support?
> IE
> > >> if
> > >> this going to turn into an "it aint broken. don't fix it thing" or a
> "we
> > >> don't want to change stuff just to add tests" . Like will someone
> pledge
> > >> to
> > >> agree its kinda wonky and merge the effort in < 1 years time?
> > >>
> > >
> > >
> >
>
> So ...:) If open a ticket to refactor OutboundTcpConnection.java to do
> specific unit testing and possibly even pull things out to the point that I
> can actually open a socket and to an end to end test will you/anyone
> support that? (it sounds like your saying I must/should make a large
> feature to add a test)
>

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown <ja...@gmail.com> wrote:

> To François's point about code coverage for new code, I think this makes a
> lot of sense wrt large features (like the current work on 8457/12229/9754).
> It's much simpler to (mentally, at least) isolate those changed sections
> and it'll show up better in a code coverage report. With small patches,
> that might be harder to achieve - however, as the patch should come with
> *some* tests (unless it's a truly trivial patch), it might just work itself
> out.
>
> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com>
> wrote:
>
> > As someone who spent a lot of time looking at the singletons topic in the
> > past, Blake brings a great perspective here. Figuring out and
> communicating
> > how best to test with the system we have (and of course incrementally
> > making that system easier to work with/test) seems like an achievable
> goal.
> >
> > On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <edlinuxguru@gmail.com
> >
> > wrote:
> >
> >> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <beggleston@apple.com
> >
> >> wrote:
> >>
> >> > I think we’re getting a little ahead of ourselves talking about DI
> >> > frameworks. Before that even becomes something worth talking about,
> we’d
> >> > need to have made serious progress on un-spaghettifying Cassandra in
> the
> >> > first place. It’s an extremely tall order. Adding a DI framework right
> >> now
> >> > would be like throwing gasoline on a raging tire fire.
> >> >
> >> > Removing singletons seems to come up every 6-12 months, and usually
> >> > abandoned once people figure out how difficult they are to remove
> >> properly.
> >> > I do think removing them *should* be a long term goal, but we really
> >> need
> >> > something more immediately actionable. Otherwise, nothing’s going to
> >> > happen, and we’ll be having this discussion again in a year or so when
> >> > everyone’s angry that Cassandra 5.0 still isn’t ready for production,
> a
> >> > year after it’s release.
> >> >
> >> > That said, the reason singletons regularly get brought up is because
> >> doing
> >> > extensive testing of anything in Cassandra is pretty much impossible,
> >> since
> >> > the code is basically this big web of interconnected global state.
> >> Testing
> >> > anything in isolation can’t be done, which, for a distributed
> database,
> >> is
> >> > crazy. It’s a chronic problem that handicaps our ability to release a
> >> > stable database.
> >> >
> >> > At this point, I think a more pragmatic approach would be to draft and
> >> > enforce some coding standards that can be applied in day to day
> >> development
> >> > that drive incremental improvement of the testing and testability of
> the
> >> > project. What should be tested, how it should be tested. How to write
> >> new
> >> > code that talks to the rest of Cassandra and is testable. How to fix
> >> bugs
> >> > in old code in a way that’s testable. We should also have some
> >> guidelines
> >> > around refactoring the wildly untested sections, how to get started,
> >> what
> >> > to do, what not to do, etc.
> >> >
> >> > Thoughts?
> >>
> >>
> >> To make the conversation practical. There is one class I personally
> really
> >> want to refactor so it can be tested:
> >>
> >> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> >> apache/cassandra/net/OutboundTcpConnection.java
> >>
> >> There is little coverage here. Questions like:
> >> what errors cause the connection to restart?
> >> when are undropable messages are dropped?
> >> what happens when the queue fills up?
> >> Infamous throw new AssertionError(ex); (which probably bubble up to
> >> nowhere)
> >> what does the COALESCED strategy do in case XYZ.
> >> A nifty label (wow a label you just never see those much!)
> >> outer:
> >> while (!isStopped)
> >>
> >> Comments to jira's that probably are not explicitly tested:
> >> // If we haven't retried this message yet, put it back on the queue to
> >> retry after re-connecting.
> >> // See CASSANDRA-5393 and CASSANDRA-12192.
> >>
> >> If I were to undertake this cleanup, would there actually be support? IE
> >> if
> >> this going to turn into an "it aint broken. don't fix it thing" or a "we
> >> don't want to change stuff just to add tests" . Like will someone pledge
> >> to
> >> agree its kinda wonky and merge the effort in < 1 years time?
> >>
> >
> >
>

So ... :) If I open a ticket to refactor OutboundTcpConnection.java to do
specific unit testing, and possibly even pull things out to the point that I
can actually open a socket and do an end-to-end test, will you/anyone
support that? (It sounds like you're saying I must/should make a large
feature to add a test.)

Re: Code quality, principles and rules

Posted by Jason Brown <ja...@gmail.com>.
To François's point about code coverage for new code, I think this makes a
lot of sense wrt large features (like the current work on 8457/12229/9754).
It's much simpler to (mentally, at least) isolate those changed sections
and it'll show up better in a code coverage report. With small patches,
that might be harder to achieve - however, as the patch should come with
*some* tests (unless it's a truly trivial patch), it might just work itself
out.

On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown <ja...@gmail.com> wrote:

> As someone who spent a lot of time looking at the singletons topic in the
> past, Blake brings a great perspective here. Figuring out and communicating
> how best to test with the system we have (and of course incrementally
> making that system easier to work with/test) seems like an achievable goal.
>
> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <ed...@gmail.com>
> wrote:
>
>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <be...@apple.com>
>> wrote:
>>
>> > I think we’re getting a little ahead of ourselves talking about DI
>> > frameworks. Before that even becomes something worth talking about, we’d
>> > need to have made serious progress on un-spaghettifying Cassandra in the
>> > first place. It’s an extremely tall order. Adding a DI framework right
>> now
>> > would be like throwing gasoline on a raging tire fire.
>> >
>> > Removing singletons seems to come up every 6-12 months, and usually
>> > abandoned once people figure out how difficult they are to remove
>> properly.
>> > I do think removing them *should* be a long term goal, but we really
>> need
>> > something more immediately actionable. Otherwise, nothing’s going to
>> > happen, and we’ll be having this discussion again in a year or so when
>> > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
>> > year after it’s release.
>> >
>> > That said, the reason singletons regularly get brought up is because
>> doing
>> > extensive testing of anything in Cassandra is pretty much impossible,
>> since
>> > the code is basically this big web of interconnected global state.
>> Testing
>> > anything in isolation can’t be done, which, for a distributed database,
>> is
>> > crazy. It’s a chronic problem that handicaps our ability to release a
>> > stable database.
>> >
>> > At this point, I think a more pragmatic approach would be to draft and
>> > enforce some coding standards that can be applied in day to day
>> development
>> > that drive incremental improvement of the testing and testability of the
>> > project. What should be tested, how it should be tested. How to write
>> new
>> > code that talks to the rest of Cassandra and is testable. How to fix
>> bugs
>> > in old code in a way that’s testable. We should also have some
>> guidelines
>> > around refactoring the wildly untested sections, how to get started,
>> what
>> > to do, what not to do, etc.
>> >
>> > Thoughts?
>>
>>
>> To make the conversation practical. There is one class I personally really
>> want to refactor so it can be tested:
>>
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
>> apache/cassandra/net/OutboundTcpConnection.java
>>
>> There is little coverage here. Questions like:
>> what errors cause the connection to restart?
>> when are undropable messages are dropped?
>> what happens when the queue fills up?
>> Infamous throw new AssertionError(ex); (which probably bubble up to
>> nowhere)
>> what does the COALESCED strategy do in case XYZ.
>> A nifty label (wow a label you just never see those much!)
>> outer:
>> while (!isStopped)
>>
>> Comments to jira's that probably are not explicitly tested:
>> // If we haven't retried this message yet, put it back on the queue to
>> retry after re-connecting.
>> // See CASSANDRA-5393 and CASSANDRA-12192.
>>
>> If I were to undertake this cleanup, would there actually be support? IE
>> if
>> this going to turn into an "it aint broken. don't fix it thing" or a "we
>> don't want to change stuff just to add tests" . Like will someone pledge
>> to
>> agree its kinda wonky and merge the effort in < 1 years time?
>>
>
>

Re: Code quality, principles and rules

Posted by benjamin roth <br...@gmail.com>.
I think you can refactor any project with little risk and increase test
coverage.
What is needed:
Rules. Discipline. Perseverance. Small iterations. Small iterations. Small
iterations.

   - Refactor in the smallest possible unit
   - Split large classes into smaller ones. Remove god classes by pulling
   out single methods or aspects. Maybe factor out method by method.
   - Maintain compatibility. Build facades, adapters, proxy objects for
   compatibility during refactoring process. Do not break interfaces if not
   really necessary or risky.
   - Push states into corners. E.g. when refactoring a single method, pass
   global state as parameter. So this single method becomes testable.

If you iterate like this maybe 1000 times, you will most likely break far
fewer things than with a big-bang refactor. You make the code testable in
small steps.

Global state is the biggest disease the history of programming has ever seen.
Singletons are also not great to test, and static methods should be avoided
at all costs if they contain state.
Tested, idempotent static methods are not a problem.

From my experience, you don't need a bloated DI framework to make a class
testable that somehow depends on static methods or singletons.
You just have to push the bad guys into a corner where they do no harm and
can be killed without risk at the very end.
E.g. instead of spreading SomeClass.instance.doWhatEver() calls here and
there, they can be encapsulated in a single method like
TestableClass.doWhatever() { SomeClass.instance.doWhatEver(); }
Or the whole singleton is retrieved through TestableClass.getSomeClass().
So you can either mock the hell out of it or inject a non-singleton
instance of that class at test runtime.
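
To make the seam concrete, here is a minimal sketch of that pattern
(SomeClass and TestableClass are the hypothetical names from above, not real
Cassandra classes):

```java
// Hypothetical legacy singleton, standing in for SomeClass.instance
class SomeClass {
    static final SomeClass instance = new SomeClass();
    String doWhatEver() { return "real"; }
}

// The class under test reaches the singleton through one overridable seam,
// so a test can substitute a fake without any DI framework.
class TestableClass {
    String doWork() {
        return "work:" + getSomeClass().doWhatEver();
    }

    // All singleton access is funneled through this one method
    SomeClass getSomeClass() {
        return SomeClass.instance;
    }
}

public class SeamDemo {
    public static void main(String[] args) {
        // Production path: hits the real singleton
        System.out.println(new TestableClass().doWork());

        // Test path: override the seam to inject a stub
        TestableClass stubbed = new TestableClass() {
            @Override
            SomeClass getSomeClass() {
                return new SomeClass() {
                    @Override
                    String doWhatEver() { return "stub"; }
                };
            }
        };
        System.out.println(stubbed.doWork());
    }
}
```

In a test, overriding getSomeClass() is all it takes to substitute a stub;
production code keeps hitting the real singleton until it can eventually be
removed for good.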


2017-03-17 19:19 GMT+01:00 Jason Brown <ja...@gmail.com>:

> As someone who spent a lot of time looking at the singletons topic in the
> past, Blake brings a great perspective here. Figuring out and communicating
> how best to test with the system we have (and of course incrementally
> making that system easier to work with/test) seems like an achievable goal.
>
> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <ed...@gmail.com>
> wrote:
>
> > On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <be...@apple.com>
> > wrote:
> >
> > > I think we’re getting a little ahead of ourselves talking about DI
> > > frameworks. Before that even becomes something worth talking about,
> we’d
> > > need to have made serious progress on un-spaghettifying Cassandra in
> the
> > > first place. It’s an extremely tall order. Adding a DI framework right
> > now
> > > would be like throwing gasoline on a raging tire fire.
> > >
> > > Removing singletons seems to come up every 6-12 months, and usually
> > > abandoned once people figure out how difficult they are to remove
> > properly.
> > > I do think removing them *should* be a long term goal, but we really
> need
> > > something more immediately actionable. Otherwise, nothing’s going to
> > > happen, and we’ll be having this discussion again in a year or so when
> > > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
> > > year after it’s release.
> > >
> > > That said, the reason singletons regularly get brought up is because
> > doing
> > > extensive testing of anything in Cassandra is pretty much impossible,
> > since
> > > the code is basically this big web of interconnected global state.
> > Testing
> > > anything in isolation can’t be done, which, for a distributed database,
> > is
> > > crazy. It’s a chronic problem that handicaps our ability to release a
> > > stable database.
> > >
> > > At this point, I think a more pragmatic approach would be to draft and
> > > enforce some coding standards that can be applied in day to day
> > development
> > > that drive incremental improvement of the testing and testability of
> the
> > > project. What should be tested, how it should be tested. How to write
> new
> > > code that talks to the rest of Cassandra and is testable. How to fix
> bugs
> > > in old code in a way that’s testable. We should also have some
> guidelines
> > > around refactoring the wildly untested sections, how to get started,
> what
> > > to do, what not to do, etc.
> > >
> > > Thoughts?
> >
> >
> > To make the conversation practical. There is one class I personally
> really
> > want to refactor so it can be tested:
> >
> > https://github.com/apache/cassandra/blob/trunk/src/java/
> > org/apache/cassandra/net/OutboundTcpConnection.java
> >
> > There is little coverage here. Questions like:
> > what errors cause the connection to restart?
> > when are undropable messages are dropped?
> > what happens when the queue fills up?
> > Infamous throw new AssertionError(ex); (which probably bubble up to
> > nowhere)
> > what does the COALESCED strategy do in case XYZ.
> > A nifty label (wow a label you just never see those much!)
> > outer:
> > while (!isStopped)
> >
> > Comments to jira's that probably are not explicitly tested:
> > // If we haven't retried this message yet, put it back on the queue to
> > retry after re-connecting.
> > // See CASSANDRA-5393 and CASSANDRA-12192.
> >
> > If I were to undertake this cleanup, would there actually be support? IE
> if
> > this going to turn into an "it aint broken. don't fix it thing" or a "we
> > don't want to change stuff just to add tests" . Like will someone pledge
> to
> > agree its kinda wonky and merge the effort in < 1 years time?
> >
>

Re: Code quality, principles and rules

Posted by Jason Brown <ja...@gmail.com>.
As someone who spent a lot of time looking at the singletons topic in the
past, Blake brings a great perspective here. Figuring out and communicating
how best to test with the system we have (and of course incrementally
making that system easier to work with/test) seems like an achievable goal.

On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <ed...@gmail.com>
wrote:

> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <be...@apple.com>
> wrote:
>
> > I think we’re getting a little ahead of ourselves talking about DI
> > frameworks. Before that even becomes something worth talking about, we’d
> > need to have made serious progress on un-spaghettifying Cassandra in the
> > first place. It’s an extremely tall order. Adding a DI framework right
> now
> > would be like throwing gasoline on a raging tire fire.
> >
> > Removing singletons seems to come up every 6-12 months, and usually
> > abandoned once people figure out how difficult they are to remove
> properly.
> > I do think removing them *should* be a long term goal, but we really need
> > something more immediately actionable. Otherwise, nothing’s going to
> > happen, and we’ll be having this discussion again in a year or so when
> > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
> > year after it’s release.
> >
> > That said, the reason singletons regularly get brought up is because
> doing
> > extensive testing of anything in Cassandra is pretty much impossible,
> since
> > the code is basically this big web of interconnected global state.
> Testing
> > anything in isolation can’t be done, which, for a distributed database,
> is
> > crazy. It’s a chronic problem that handicaps our ability to release a
> > stable database.
> >
> > At this point, I think a more pragmatic approach would be to draft and
> > enforce some coding standards that can be applied in day to day
> development
> > that drive incremental improvement of the testing and testability of the
> > project. What should be tested, how it should be tested. How to write new
> > code that talks to the rest of Cassandra and is testable. How to fix bugs
> > in old code in a way that’s testable. We should also have some guidelines
> > around refactoring the wildly untested sections, how to get started, what
> > to do, what not to do, etc.
> >
> > Thoughts?
>
>
> To make the conversation practical. There is one class I personally really
> want to refactor so it can be tested:
>
> https://github.com/apache/cassandra/blob/trunk/src/java/
> org/apache/cassandra/net/OutboundTcpConnection.java
>
> There is little coverage here. Questions like:
> what errors cause the connection to restart?
> when are undropable messages are dropped?
> what happens when the queue fills up?
> Infamous throw new AssertionError(ex); (which probably bubble up to
> nowhere)
> what does the COALESCED strategy do in case XYZ.
> A nifty label (wow a label you just never see those much!)
> outer:
> while (!isStopped)
>
> Comments to jira's that probably are not explicitly tested:
> // If we haven't retried this message yet, put it back on the queue to
> retry after re-connecting.
> // See CASSANDRA-5393 and CASSANDRA-12192.
>
> If I were to undertake this cleanup, would there actually be support? IE if
> this going to turn into an "it aint broken. don't fix it thing" or a "we
> don't want to change stuff just to add tests" . Like will someone pledge to
> agree its kinda wonky and merge the effort in < 1 years time?
>

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <be...@apple.com>
wrote:

> I think we’re getting a little ahead of ourselves talking about DI
> frameworks. Before that even becomes something worth talking about, we’d
> need to have made serious progress on un-spaghettifying Cassandra in the
> first place. It’s an extremely tall order. Adding a DI framework right now
> would be like throwing gasoline on a raging tire fire.
>
> Removing singletons seems to come up every 6-12 months, and usually
> abandoned once people figure out how difficult they are to remove properly.
> I do think removing them *should* be a long term goal, but we really need
> something more immediately actionable. Otherwise, nothing’s going to
> happen, and we’ll be having this discussion again in a year or so when
> everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
> year after it’s release.
>
> That said, the reason singletons regularly get brought up is because doing
> extensive testing of anything in Cassandra is pretty much impossible, since
> the code is basically this big web of interconnected global state. Testing
> anything in isolation can’t be done, which, for a distributed database, is
> crazy. It’s a chronic problem that handicaps our ability to release a
> stable database.
>
> At this point, I think a more pragmatic approach would be to draft and
> enforce some coding standards that can be applied in day to day development
> that drive incremental improvement of the testing and testability of the
> project. What should be tested, how it should be tested. How to write new
> code that talks to the rest of Cassandra and is testable. How to fix bugs
> in old code in a way that’s testable. We should also have some guidelines
> around refactoring the wildly untested sections, how to get started, what
> to do, what not to do, etc.
>
> Thoughts?


To make the conversation practical. There is one class I personally really
want to refactor so it can be tested:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java

There is little coverage here. Questions like:
what errors cause the connection to restart?
when are undroppable messages dropped?
what happens when the queue fills up?
the infamous throw new AssertionError(ex); (which probably bubbles up to nowhere)
what does the COALESCED strategy do in case XYZ?
A nifty label (wow, a label -- you just never see those much!)
outer:
while (!isStopped)

Comments to jira's that probably are not explicitly tested:
// If we haven't retried this message yet, put it back on the queue to
retry after re-connecting.
// See CASSANDRA-5393 and CASSANDRA-12192.

If I were to undertake this cleanup, would there actually be support? I.e.,
is this going to turn into an "it ain't broken, don't fix it" thing or a "we
don't want to change stuff just to add tests" thing? Like, will someone
pledge to agree it's kinda wonky and merge the effort in < 1 year's time?
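
To illustrate the kind of refactor in question -- this is a toy sketch of a
backlog policy, not the actual OutboundTcpConnection logic -- the "what
happens when the queue fills up?" question becomes a one-line unit test once
the decision is extracted into a pure static method:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy backlog policy, extracted from a hypothetical connection loop so the
// "queue is full" behavior can be asserted directly, in isolation.
final class Backlog {
    // Pure function of its arguments: no global state, trivially testable.
    // Returns true if the message was accepted, false if it was dropped.
    static boolean tryEnqueue(Deque<String> queue, int capacity,
                              String message, boolean droppable) {
        if (queue.size() < capacity) {
            queue.addLast(message);
            return true;
        }
        if (droppable)
            return false; // full queue: droppable messages are shed
        // Undroppable messages displace the oldest queued entry instead
        queue.pollFirst();
        queue.addLast(message);
        return true;
    }
}

public class BacklogDemo {
    public static void main(String[] args) {
        Deque<String> q = new ArrayDeque<>();
        System.out.println(Backlog.tryEnqueue(q, 1, "a", true));  // room left
        System.out.println(Backlog.tryEnqueue(q, 1, "b", true));  // full, shed
        System.out.println(Backlog.tryEnqueue(q, 1, "c", false)); // displaces "a"
        System.out.println(q.peekFirst());
    }
}
```

Each scenario -- room left, full-and-droppable, full-and-undroppable -- is
now a direct assertion instead of a mental approximation of what the
connection loop does.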

Re: Code quality, principles and rules

Posted by Blake Eggleston <be...@apple.com>.
I think we’re getting a little ahead of ourselves talking about DI frameworks. Before that even becomes something worth talking about, we’d need to have made serious progress on un-spaghettifying Cassandra in the first place. It’s an extremely tall order. Adding a DI framework right now would be like throwing gasoline on a raging tire fire.

Removing singletons seems to come up every 6-12 months, and is usually abandoned once people figure out how difficult they are to remove properly. I do think removing them *should* be a long term goal, but we really need something more immediately actionable. Otherwise, nothing’s going to happen, and we’ll be having this discussion again in a year or so when everyone’s angry that Cassandra 5.0 still isn’t ready for production, a year after its release.

That said, the reason singletons regularly get brought up is because doing extensive testing of anything in Cassandra is pretty much impossible, since the code is basically this big web of interconnected global state. Testing anything in isolation can’t be done, which, for a distributed database, is crazy. It’s a chronic problem that handicaps our ability to release a stable database.

At this point, I think a more pragmatic approach would be to draft and enforce some coding standards that can be applied in day to day development that drive incremental improvement of the testing and testability of the project. What should be tested, how it should be tested. How to write new code that talks to the rest of Cassandra and is testable. How to fix bugs in old code in a way that’s testable. We should also have some guidelines around refactoring the wildly untested sections, how to get started, what to do, what not to do, etc.

Thoughts?

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Mar 17, 2017 at 9:46 AM, Edward Capriolo <ed...@gmail.com>
wrote:

>
>
> On Fri, Mar 17, 2017 at 6:41 AM, Ryan Svihla <rs...@foundev.pro> wrote:
>
>> Different DI frameworks have different initialization costs, even inside
>> of
>> spring even depending on how you wire up dependencies (did it use autowire
>> with reflection, parse a giant XML of explicit dependencies, etc).
>>
>> To back this assertion up for awhile in that community benching different
>> DI frameworks perf was a thing and you can find benchmarks galore with a
>> quick Google.
>>
>> The practical cost is also dependent on the lifecycles used (transient
>> versus Singleton style for example) and features used (Interceptors
>> depending on implementation can get expensive).
>>
>> So I think there should be some quantification of cost before a framework
>> is considered, something like dagger2 which uses codegen I wager is only a
>> cost at compile time (have not benched it, but looking at it's feature
>> set,
>> that's my guess) , Spring I know from experience even with the most
>> optimal
>> settings is slower on initialization time than doing by DI "by hand" at
>> minimum, and that can sometimes be substantial.
>>
>>
>> On Mar 17, 2017 12:29 AM, "Edward Capriolo" <ed...@gmail.com>
>> wrote:
>>
>> On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <ja...@gmail.com>
>> wrote:
>>
>> > >> do we have plan to integrate with a dependency injection framework?
>> >
>> > No, we (the maintainers) have been pretty much against more frameworks
>> due
>> > to performance reasons, overhead, and dependency management problems.
>> >
>> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
>> > wrote:
>> >
>> > > Since we're here, do we have plan to integrate with a dependency
>> > injection
>> > > framework like Dagger2? Otherwise it'll be difficult to write unit
>> test
>> > > cases.
>> > >
>> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <
>> edlinuxguru@gmail.com>
>> > > wrote:
>> > >
>> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org>
>> wrote:
>> > > >
>> > > > >
>> > > > >
>> > > > > On 2017-03-16 10:32 (-0700), François Deliège <
>> > francois@instagram.com>
>> > > > > wrote:
>> > > > > >
>> > > > > > To get this started, here is an initial proposal:
>> > > > > >
>> > > > > > Principles:
>> > > > > >
>> > > > > > 1. Tests always pass.  This is the starting point. If we don't
>> care
>> > > > > about test failures, then we should stop writing tests. A
>> recurring
>> > > > failing
>> > > > > test carries no signal and is better deleted.
>> > > > > > 2. The code is tested.
>> > > > > >
>> > > > > > Assuming we can align on these principles, here is a proposal
>> for
>> > > their
>> > > > > implementation.
>> > > > > >
>> > > > > > Rules:
>> > > > > >
>> > > > > > 1. Each new release passes all tests (no flakinesss).
>> > > > > > 2. If a patch has a failing test (test touching the same code
>> > path),
>> > > > the
>> > > > > code or test should be fixed prior to being accepted.
>> > > > > > 3. Bugs fixes should have one test that fails prior to the fix
>> and
>> > > > > passes after fix.
>> > > > > > 4. New code should have at least 90% test coverage.
>> > > > > >
>> > > > > First I was
>> > > > > I agree with all of these and hope they become codified and
>> > followed. I
>> > > > > don't know anyone who believes we should be committing code that
>> > breaks
>> > > > > tests - but we should be more strict with requiring green test
>> runs,
>> > > and
>> > > > > perhaps more strict with reverting patches that break tests (or
>> cause
>> > > > them
>> > > > > to be flakey).
>> > > > >
>> > > > > Ed also noted on the user list [0] that certain sections of the
>> code
>> > > > > itself are difficult to test because of singletons - I agree with
>> the
>> > > > > suggestion that it's time to revisit CASSANDRA-7837 and
>> > CASSANDRA-10283
>> > > > >
>> > > > > Finally, we should also recall Jason's previous notes [1] that the
>> > > actual
>> > > > > test infrastructure available is limited - the system provided by
>> > > > Datastax
>> > > > > is not generally open to everyone (and not guaranteed to be
>> > permanent),
>> > > > and
>> > > > > the infrastructure currently available to the ASF is somewhat
>> limited
>> > > > (much
>> > > > > slower, at the very least). If we require tests passing (and I
>> agree
>> > > that
>> > > > > we should), we need to define how we're going to be testing (or
>> how
>> > > we're
>> > > > > going to be sharing test results), because the ASF hardware isn't
>> > going
>> > > > to
>> > > > > be able to do dozens of dev branch dtest runs per day in its
>> current
>> > > > form.
>> > > > >
>> > > > > 0: https://lists.apache.org/thread.html/
>> > f6f3fc6d0ad1bd54a6185ce7bd7a2f
>> > > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
>> > > > > 1: https://lists.apache.org/thread.html/
>> > 5fb8f0446ab97644100e4ef987f36e
>> > > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
>> > > > >
>> > > > >
>> > > > >
>> > > > Ed also noted on the user list [0] that certain sections of the code
>> > > itself
>> > > > are difficult to test because of singletons - I agree with the
>> > suggestion
>> > > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
>> > > >
>> > > > Thanks for the shout out!
>> > > >
>> > > > I was just looking at a patch about compaction. The patch was to
>> > > calculate
>> > > > free space correctly in case X. Compaction is not something that
>> > requires
>> > > > multiple nodes to test. The logic on the surface seems simple: find
>> > > tables
>> > > > of similar size and select them and merge them. The reality is it
>> turns
>> > > out
>> > > > now to be that way. The coverage itself both branch and line may be
>> > very
>> > > > high, but what the code does not do is directly account for a wide
>> > > variety
>> > > > of scenarios. Without direct tests you end up with a mental
>> > approximation
>> > > > of what it does and that varies from person to person and accounts
>> for
>> > > the
>> > > > cases that fit in your mind. For example, you personally are only
>> > running
>> > > > LevelDB inspired compaction.
>> > > >
>> > > > Being that this this is not a multi-node problem you should be able
>> to
>> > re
>> > > > factor this heavily. Pulling everything out to a static method where
>> > all
>> > > > the parameters are arguments, or inject a lot of mocks in the
>> current
>> > > code,
>> > > > and develop some scenario based coverage.
>> > > >
>> > > > That is how i typically "rescue" code I take over. I look at the
>> > > nightmare
>> > > > and say, "damn i am really afraid to touch this". I construct 8
>> > scenarios
>> > > > that test green. Then I force some testing into it through careful
>> re
>> > > > factoring. Now, I probably know -something- about it. Now, you are
>> > fairly
>> > > > free to do a wide ranging refactor, because you at least counted
>> for 8
>> > > > scenarios and you put unit test traps so that some rules are
>> enforced.
>> > > (Or
>> > > > the person changing the code has to actively REMOVE your tests
>> > asserting
>> > > it
>> > > > was not or no longer is valid). Later on you (or someone else)
>> > __STILL__
>> > > > might screw the entire thing up, but at least you can now build
>> > forward.
>> > > >
>> > > > Anyway that patch on compaction was great and I am sure it improved
>> > > things.
>> > > > That being said it did not add any tests :). So it can easily be
>> undone
>> > > by
>> > > > the next person who does not understand the specific issue trying to
>> be
>> > > > addressed. Inline comments almost scream to me 'we need a test' not
>> > > > everyone believes that.
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Thank you & Best Regards,
>> > > --Simon (Qingcun) Zhou
>> > >
>> >
>>
>> I don't believe dependency injection frameworks cause "overhead". For
>> example, if you are using spring
>>
>> @Bean(initMethod=init, destroyMethod=destroy)
>> public Server getMeAServer( Component component) {}
>>
>> @Bean(initMethod=init, destroyMethod=destroy)
>> public Component getMeAComponent(){}
>>
>> What spring actually does is something like thisthis:
>>
>> startup code
>> Component c = new Component();
>> c.init()
>> Server s = new Server(c);
>> s.init()
>> The default spring prototype is "singleton" aka create one of these per
>> bean context.
>>
>> Spring also does the reverse on shutdown.
>>
>> The static singleton initialization thing creates a host of problems, with
>> things not closing cleanly.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-12844
>> https://issues.apache.org/jira/browse/CASSANDRA-12728
>>
>> One of them I even fixed:
>> https://github.com/stef1927/cassandra/commits/11984-2.2
>>
>> So beyond it not creating any performance problems it probably would help
>> in practice by forcing code into a pattern which is not singleton
>> initiation driven by classloader which is something very opaque and hard
>> to
>> troubleshoot. Try putting a break point for example on something loaded by
>> a static singleton and try to reason about the order of initialization.
>>
>
> Different DI frameworks have different initialization costs, even inside of
> spring even depending on how you wire up dependencies (did it use autowire
> with reflection, parse a giant XML of explicit dependencies, etc).
>
> Spring I know from experience even with the most optimal
> settings is slower on initialization time than doing by DI "by hand" at
> minimum, and that can sometimes be substantial.
>
> You are considering the cost of reflection (nano seconds) vs the cost of
> starting a database that has to read files from disk. (seconds to minutes)
>
> "parse a giant XML of explicit dependencies"
>
> Modern (spring) projects rarely do what you are describing. The big xml
> files are gone: https://projects.spring.io/spring-boot/. If you were not
> going to use spring (cause you are afraid of big xml files) there is
> https://github.com/google/guice.
>
> It is basically a straw man argument to say "we don't like DI because of
> big XML files" while papering over the fact that drain really does not shot
> down cassandra properly and it has been project for a long while.
>
>
To put this into perspective: I worked with a group that ran a large number
of C* servers. They had tried and failed to develop a clean shutdown
procedure that would not leave corrupt commit logs. The scripts were like
this:

nodetool disable-gossip
sleep 5
nodetool disable-thrift
sleep 5
nodetool drain
sleep 30
kill

We tried all permutations of shutdown techniques and could not build a
sequence that would cleanly shut down and restart Cassandra 100% of the
time. So there is room for improvement here. DI would help, IMHO.
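
For what an explicit lifecycle buys you, here is a hand-rolled,
framework-free sketch (the component names are hypothetical): components are
initialized in registration order and destroyed in reverse, instead of init
order being driven by classloading with no defined shutdown order at all.

```java
import java.util.ArrayDeque;
import java.util.Deque;

interface Component {
    void init();
    void destroy();
}

// Minimal lifecycle container: start components in dependency order,
// stop them deterministically in reverse order on shutdown.
final class Lifecycle {
    private final Deque<Component> started = new ArrayDeque<>();

    <T extends Component> T start(T c) {
        c.init();
        started.push(c);
        return c;
    }

    void shutdown() {
        while (!started.isEmpty())
            started.pop().destroy(); // reverse of start order
    }
}

public class LifecycleDemo {
    static final StringBuilder log = new StringBuilder();

    static Component named(String name) {
        return new Component() {
            public void init()    { log.append("+").append(name); }
            public void destroy() { log.append("-").append(name); }
        };
    }

    public static void main(String[] args) {
        Lifecycle lc = new Lifecycle();
        lc.start(named("commitlog")); // hypothetical component names
        lc.start(named("gossip"));
        lc.start(named("thrift"));
        lc.shutdown();
        System.out.println(log);
    }
}
```

A nodetool-driven sleep-and-kill script is guessing at exactly this ordering
from the outside; with an explicit lifecycle, the reverse-order teardown is
a property of the code itself and can be unit tested.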

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Mar 17, 2017 at 6:41 AM, Ryan Svihla <rs...@foundev.pro> wrote:

> Different DI frameworks have different initialization costs, even inside of
> spring even depending on how you wire up dependencies (did it use autowire
> with reflection, parse a giant XML of explicit dependencies, etc).
>
> To back this assertion up for awhile in that community benching different
> DI frameworks perf was a thing and you can find benchmarks galore with a
> quick Google.
>
> The practical cost is also dependent on the lifecycles used (transient
> versus Singleton style for example) and features used (Interceptors
> depending on implementation can get expensive).
>
> So I think there should be some quantification of cost before a framework
> is considered, something like dagger2 which uses codegen I wager is only a
> cost at compile time (have not benched it, but looking at it's feature set,
> that's my guess) , Spring I know from experience even with the most optimal
> settings is slower on initialization time than doing by DI "by hand" at
> minimum, and that can sometimes be substantial.
>
>
> On Mar 17, 2017 12:29 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
>
> On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <ja...@gmail.com> wrote:
>
> > >> do we have plan to integrate with a dependency injection framework?
> >
> > No, we (the maintainers) have been pretty much against more frameworks
> due
> > to performance reasons, overhead, and dependency management problems.
> >
> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
> > wrote:
> >
> > > Since we're here, do we have plan to integrate with a dependency
> > injection
> > > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > > cases.
> > >
> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <
> edlinuxguru@gmail.com>
> > > wrote:
> > >
> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org>
> wrote:
> > > >
> > > > >
> > > > >
> > > > > On 2017-03-16 10:32 (-0700), François Deliège <
> > francois@instagram.com>
> > > > > wrote:
> > > > > >
> > > > > > To get this started, here is an initial proposal:
> > > > > >
> > > > > > Principles:
> > > > > >
> > > > > > 1. Tests always pass.  This is the starting point. If we don't
> care
> > > > > about test failures, then we should stop writing tests. A recurring
> > > > failing
> > > > > test carries no signal and is better deleted.
> > > > > > 2. The code is tested.
> > > > > >
> > > > > > Assuming we can align on these principles, here is a proposal for
> > > their
> > > > > implementation.
> > > > > >
> > > > > > Rules:
> > > > > >
> > > > > > 1. Each new release passes all tests (no flakinesss).
> > > > > > 2. If a patch has a failing test (test touching the same code
> > path),
> > > > the
> > > > > code or test should be fixed prior to being accepted.
> > > > > > 3. Bugs fixes should have one test that fails prior to the fix
> and
> > > > > passes after fix.
> > > > > > 4. New code should have at least 90% test coverage.
> > > > > >
> > > > > First I was
> > > > > I agree with all of these and hope they become codified and
> > followed. I
> > > > > don't know anyone who believes we should be committing code that
> > breaks
> > > > > tests - but we should be more strict with requiring green test
> runs,
> > > and
> > > > > perhaps more strict with reverting patches that break tests (or
> cause
> > > > them
> > > > > to be flakey).
> > > > >
> > > > > Ed also noted on the user list [0] that certain sections of the
> code
> > > > > itself are difficult to test because of singletons - I agree with
> the
> > > > > suggestion that it's time to revisit CASSANDRA-7837 and
> > CASSANDRA-10283
> > > > >
> > > > > Finally, we should also recall Jason's previous notes [1] that the
> > > actual
> > > > > test infrastructure available is limited - the system provided by
> > > > Datastax
> > > > > is not generally open to everyone (and not guaranteed to be
> > permanent),
> > > > and
> > > > > the infrastructure currently available to the ASF is somewhat
> limited
> > > > (much
> > > > > slower, at the very least). If we require tests passing (and I
> agree
> > > that
> > > > > we should), we need to define how we're going to be testing (or how
> > > we're
> > > > > going to be sharing test results), because the ASF hardware isn't
> > going
> > > > to
> > > > > be able to do dozens of dev branch dtest runs per day in its
> current
> > > > form.
> > > > >
> > > > > 0: https://lists.apache.org/thread.html/
> > f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > > 1: https://lists.apache.org/thread.html/
> > 5fb8f0446ab97644100e4ef987f36e
> > > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > > >
> > > > >
> > > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > itself
> > > > are difficult to test because of singletons - I agree with the
> > suggestion
> > > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > > >
> > > > Thanks for the shout out!
> > > >
> > > > I was just looking at a patch about compaction. The patch was to
> > > calculate
> > > > free space correctly in case X. Compaction is not something that
> > requires
> > > > multiple nodes to test. The logic on the surface seems simple: find
> > > tables
> > > > of similar size and select them and merge them. The reality is it
> turns
> > > out
> > > > now to be that way. The coverage itself both branch and line may be
> > very
> > > > high, but what the code does not do is directly account for a wide
> > > variety
> > > > of scenarios. Without direct tests you end up with a mental
> > approximation
> > > > of what it does and that varies from person to person and accounts
> for
> > > the
> > > > cases that fit in your mind. For example, you personally are only
> > running
> > > > LevelDB inspired compaction.
> > > >
> > > > Being that this this is not a multi-node problem you should be able
> to
> > re
> > > > factor this heavily. Pulling everything out to a static method where
> > all
> > > > the parameters are arguments, or inject a lot of mocks in the current
> > > code,
> > > > and develop some scenario based coverage.
> > > >
> > > > That is how i typically "rescue" code I take over. I look at the
> > > nightmare
> > > > and say, "damn i am really afraid to touch this". I construct 8
> > scenarios
> > > > that test green. Then I force some testing into it through careful re
> > > > factoring. Now, I probably know -something- about it. Now, you are
> > fairly
> > > > free to do a wide ranging refactor, because you at least counted for
> 8
> > > > scenarios and you put unit test traps so that some rules are
> enforced.
> > > (Or
> > > > the person changing the code has to actively REMOVE your tests
> > asserting
> > > it
> > > > was not or no longer is valid). Later on you (or someone else)
> > __STILL__
> > > > might screw the entire thing up, but at least you can now build
> > forward.
> > > >
> > > > Anyway that patch on compaction was great and I am sure it improved
> > > things.
> > > > That being said it did not add any tests :). So it can easily be
> undone
> > > by
> > > > the next person who does not understand the specific issue trying to
> be
> > > > addressed. Inline comments almost scream to me 'we need a test' not
> > > > everyone believes that.
> > > >
> > >
> > >
> > >
> > > --
> > > Thank you & Best Regards,
> > > --Simon (Qingcun) Zhou
> > >
> >
>
> I don't believe dependency injection frameworks cause "overhead". For
> example, if you are using spring
>
> @Bean(initMethod=init, destroyMethod=destroy)
> public Server getMeAServer( Component component) {}
>
> @Bean(initMethod=init, destroyMethod=destroy)
> public Component getMeAComponent(){}
>
> What spring actually does is something like this:
>
> startup code
> Component c = new Component();
> c.init()
> Server s = new Server(c);
> s.init()
> The default spring prototype is "singleton" aka create one of these per
> bean context.
>
> Spring also does the reverse on shutdown.
>
> The static singleton initialization thing creates a host of problems, with
> things not closing cleanly.
>
> https://issues.apache.org/jira/browse/CASSANDRA-12844
> https://issues.apache.org/jira/browse/CASSANDRA-12728
>
> One of them I even fixed:
> https://github.com/stef1927/cassandra/commits/11984-2.2
>
> So beyond it not creating any performance problems it probably would help
> in practice by forcing code into a pattern which is not singleton
> initiation driven by classloader which is something very opaque and hard to
> troubleshoot. Try putting a break point for example on something loaded by
> a static singleton and try to reason about the order of initialization.
>

Different DI frameworks have different initialization costs, even inside of
spring even depending on how you wire up dependencies (did it use autowire
with reflection, parse a giant XML of explicit dependencies, etc).

Spring I know from experience even with the most optimal
settings is slower on initialization time than doing by DI "by hand" at
minimum, and that can sometimes be substantial.

You are considering the cost of reflection (nanoseconds) vs the cost of
starting a database that has to read files from disk (seconds to minutes).

"parse a giant XML of explicit dependencies"

Modern (spring) projects rarely do what you are describing. The big XML
files are gone: https://projects.spring.io/spring-boot/. If you were not
going to use Spring (because you are afraid of big XML files) there is
https://github.com/google/guice.

It is basically a straw man argument to say "we don't like DI because of
big XML files" while papering over the fact that drain really does not shut
down Cassandra properly and has been a problem for a long while.

Re: Code quality, principles and rules

Posted by Ryan Svihla <rs...@foundev.pro>.
Different DI frameworks have different initialization costs; even inside of
Spring it depends on how you wire up dependencies (did it use autowire with
reflection, parse a giant XML of explicit dependencies, etc).

To back this assertion up: for a while in that community, benchmarking
different DI frameworks' performance was a thing, and you can find
benchmarks galore with a quick Google search.

The practical cost is also dependent on the lifecycles used (transient
versus Singleton style for example) and features used (Interceptors
depending on implementation can get expensive).

So I think there should be some quantification of cost before a framework
is considered. Something like Dagger 2, which uses codegen, I wager is only
a cost at compile time (I have not benched it, but looking at its feature
set, that's my guess). Spring I know from experience, even with the most
optimal settings, is slower on initialization than doing DI "by hand", and
that can sometimes be substantial.
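To put rough numbers behind this, a minimal sketch (plain JDK, invented class names, no framework) comparing direct construction with the reflective construction an autowiring container pays per bean. Absolute numbers vary by JVM and warm-up, which is exactly why real benchmarks matter:

```java
import java.lang.reflect.Constructor;

public class DiCostSketch {
    public static class Component {
        public Component() {}
    }

    // Construct directly, n times.
    public static long direct(int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            new Component();
        }
        return System.nanoTime() - start;
    }

    // Construct reflectively, n times -- roughly what a container's
    // autowiring pays per bean instantiation.
    public static long reflective(int n) {
        try {
            Constructor<Component> ctor = Component.class.getConstructor();
            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                ctor.newInstance();
            }
            return System.nanoTime() - start;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // A container creates each singleton bean once at startup, so the
        // per-bean reflection cost is tiny next to a startup that reads
        // files from disk for seconds -- but transient scopes pay it per
        // instance, which is where the lifecycle choice starts to matter.
        System.out.println("direct, 10k ctors (ns):     " + direct(10_000));
        System.out.println("reflective, 10k ctors (ns): " + reflective(10_000));
    }
}
```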


On Mar 17, 2017 12:29 AM, "Edward Capriolo" <ed...@gmail.com> wrote:

On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <ja...@gmail.com> wrote:

> >> do we have plan to integrate with a dependency injection framework?
>
> No, we (the maintainers) have been pretty much against more frameworks due
> to performance reasons, overhead, and dependency management problems.
>
> On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
> wrote:
>
> > Since we're here, do we have plan to integrate with a dependency
> injection
> > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > cases.
> >
> > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <ed...@gmail.com>
> > wrote:
> >
> > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:
> > >
> > > >
> > > >
> > > > On 2017-03-16 10:32 (-0700), François Deliège <
> francois@instagram.com>
> > > > wrote:
> > > > >
> > > > > To get this started, here is an initial proposal:
> > > > >
> > > > > Principles:
> > > > >
> > > > > 1. Tests always pass.  This is the starting point. If we don't
care
> > > > about test failures, then we should stop writing tests. A recurring
> > > failing
> > > > test carries no signal and is better deleted.
> > > > > 2. The code is tested.
> > > > >
> > > > > Assuming we can align on these principles, here is a proposal for
> > their
> > > > implementation.
> > > > >
> > > > > Rules:
> > > > >
> > > > > 1. Each new release passes all tests (no flakinesss).
> > > > > 2. If a patch has a failing test (test touching the same code
> path),
> > > the
> > > > code or test should be fixed prior to being accepted.
> > > > > 3. Bugs fixes should have one test that fails prior to the fix and
> > > > passes after fix.
> > > > > 4. New code should have at least 90% test coverage.
> > > > >
> > > > First I was
> > > > I agree with all of these and hope they become codified and
> followed. I
> > > > don't know anyone who believes we should be committing code that
> breaks
> > > > tests - but we should be more strict with requiring green test runs,
> > and
> > > > perhaps more strict with reverting patches that break tests (or
cause
> > > them
> > > > to be flakey).
> > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > > itself are difficult to test because of singletons - I agree with
the
> > > > suggestion that it's time to revisit CASSANDRA-7837 and
> CASSANDRA-10283
> > > >
> > > > Finally, we should also recall Jason's previous notes [1] that the
> > actual
> > > > test infrastructure available is limited - the system provided by
> > > Datastax
> > > > is not generally open to everyone (and not guaranteed to be
> permanent),
> > > and
> > > > the infrastructure currently available to the ASF is somewhat
limited
> > > (much
> > > > slower, at the very least). If we require tests passing (and I agree
> > that
> > > > we should), we need to define how we're going to be testing (or how
> > we're
> > > > going to be sharing test results), because the ASF hardware isn't
> going
> > > to
> > > > be able to do dozens of dev branch dtest runs per day in its current
> > > form.
> > > >
> > > > 0: https://lists.apache.org/thread.html/
> f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > 1: https://lists.apache.org/thread.html/
> 5fb8f0446ab97644100e4ef987f36e
> > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > >
> > > >
> > > >
> > > Ed also noted on the user list [0] that certain sections of the code
> > itself
> > > are difficult to test because of singletons - I agree with the
> suggestion
> > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > >
> > > Thanks for the shout out!
> > >
> > > I was just looking at a patch about compaction. The patch was to
> > calculate
> > > free space correctly in case X. Compaction is not something that
> requires
> > > multiple nodes to test. The logic on the surface seems simple: find
> > tables
> > > of similar size and select them and merge them. The reality is it
turns
> > out
> > > now to be that way. The coverage itself both branch and line may be
> very
> > > high, but what the code does not do is directly account for a wide
> > variety
> > > of scenarios. Without direct tests you end up with a mental
> approximation
> > > of what it does and that varies from person to person and accounts for
> > the
> > > cases that fit in your mind. For example, you personally are only
> running
> > > LevelDB inspired compaction.
> > >
> > > Being that this this is not a multi-node problem you should be able to
> re
> > > factor this heavily. Pulling everything out to a static method where
> all
> > > the parameters are arguments, or inject a lot of mocks in the current
> > code,
> > > and develop some scenario based coverage.
> > >
> > > That is how i typically "rescue" code I take over. I look at the
> > nightmare
> > > and say, "damn i am really afraid to touch this". I construct 8
> scenarios
> > > that test green. Then I force some testing into it through careful re
> > > factoring. Now, I probably know -something- about it. Now, you are
> fairly
> > > free to do a wide ranging refactor, because you at least counted for 8
> > > scenarios and you put unit test traps so that some rules are enforced.
> > (Or
> > > the person changing the code has to actively REMOVE your tests
> asserting
> > it
> > > was not or no longer is valid). Later on you (or someone else)
> __STILL__
> > > might screw the entire thing up, but at least you can now build
> forward.
> > >
> > > Anyway that patch on compaction was great and I am sure it improved
> > things.
> > > That being said it did not add any tests :). So it can easily be
undone
> > by
> > > the next person who does not understand the specific issue trying to
be
> > > addressed. Inline comments almost scream to me 'we need a test' not
> > > everyone believes that.
> > >
> >
> >
> >
> > --
> > Thank you & Best Regards,
> > --Simon (Qingcun) Zhou
> >
>

I don't believe dependency injection frameworks cause "overhead". For
example, if you are using spring

@Bean(initMethod=init, destroyMethod=destroy)
public Server getMeAServer( Component component) {}

@Bean(initMethod=init, destroyMethod=destroy)
public Component getMeAComponent(){}

What spring actually does is something like this:

startup code
Component c = new Component();
c.init()
Server s = new Server(c);
s.init()
The default spring prototype is "singleton" aka create one of these per
bean context.

Spring also does the reverse on shutdown.

The static singleton initialization thing creates a host of problems, with
things not closing cleanly.

https://issues.apache.org/jira/browse/CASSANDRA-12844
https://issues.apache.org/jira/browse/CASSANDRA-12728

One of them I even fixed:
https://github.com/stef1927/cassandra/commits/11984-2.2

So beyond it not creating any performance problems it probably would help
in practice by forcing code into a pattern which is not singleton
initiation driven by classloader which is something very opaque and hard to
troubleshoot. Try putting a break point for example on something loaded by
a static singleton and try to reason about the order of initialization.

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <ja...@gmail.com> wrote:

> >> do we have plan to integrate with a dependency injection framework?
>
> No, we (the maintainers) have been pretty much against more frameworks due
> to performance reasons, overhead, and dependency management problems.
>
> On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com>
> wrote:
>
> > Since we're here, do we have plan to integrate with a dependency
> injection
> > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > cases.
> >
> > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <ed...@gmail.com>
> > wrote:
> >
> > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:
> > >
> > > >
> > > >
> > > > On 2017-03-16 10:32 (-0700), François Deliège <
> francois@instagram.com>
> > > > wrote:
> > > > >
> > > > > To get this started, here is an initial proposal:
> > > > >
> > > > > Principles:
> > > > >
> > > > > 1. Tests always pass.  This is the starting point. If we don't care
> > > > about test failures, then we should stop writing tests. A recurring
> > > failing
> > > > test carries no signal and is better deleted.
> > > > > 2. The code is tested.
> > > > >
> > > > > Assuming we can align on these principles, here is a proposal for
> > their
> > > > implementation.
> > > > >
> > > > > Rules:
> > > > >
> > > > > 1. Each new release passes all tests (no flakinesss).
> > > > > 2. If a patch has a failing test (test touching the same code
> path),
> > > the
> > > > code or test should be fixed prior to being accepted.
> > > > > 3. Bugs fixes should have one test that fails prior to the fix and
> > > > passes after fix.
> > > > > 4. New code should have at least 90% test coverage.
> > > > >
> > > > First I was
> > > > I agree with all of these and hope they become codified and
> followed. I
> > > > don't know anyone who believes we should be committing code that
> breaks
> > > > tests - but we should be more strict with requiring green test runs,
> > and
> > > > perhaps more strict with reverting patches that break tests (or cause
> > > them
> > > > to be flakey).
> > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > > itself are difficult to test because of singletons - I agree with the
> > > > suggestion that it's time to revisit CASSANDRA-7837 and
> CASSANDRA-10283
> > > >
> > > > Finally, we should also recall Jason's previous notes [1] that the
> > actual
> > > > test infrastructure available is limited - the system provided by
> > > Datastax
> > > > is not generally open to everyone (and not guaranteed to be
> permanent),
> > > and
> > > > the infrastructure currently available to the ASF is somewhat limited
> > > (much
> > > > slower, at the very least). If we require tests passing (and I agree
> > that
> > > > we should), we need to define how we're going to be testing (or how
> > we're
> > > > going to be sharing test results), because the ASF hardware isn't
> going
> > > to
> > > > be able to do dozens of dev branch dtest runs per day in its current
> > > form.
> > > >
> > > > 0: https://lists.apache.org/thread.html/
> f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > 1: https://lists.apache.org/thread.html/
> 5fb8f0446ab97644100e4ef987f36e
> > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > >
> > > >
> > > >
> > > Ed also noted on the user list [0] that certain sections of the code
> > itself
> > > are difficult to test because of singletons - I agree with the
> suggestion
> > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > >
> > > Thanks for the shout out!
> > >
> > > I was just looking at a patch about compaction. The patch was to
> > calculate
> > > free space correctly in case X. Compaction is not something that
> requires
> > > multiple nodes to test. The logic on the surface seems simple: find
> > tables
> > > of similar size and select them and merge them. The reality is it turns
> > out
> > > now to be that way. The coverage itself both branch and line may be
> very
> > > high, but what the code does not do is directly account for a wide
> > variety
> > > of scenarios. Without direct tests you end up with a mental
> approximation
> > > of what it does and that varies from person to person and accounts for
> > the
> > > cases that fit in your mind. For example, you personally are only
> running
> > > LevelDB inspired compaction.
> > >
> > > Being that this this is not a multi-node problem you should be able to
> re
> > > factor this heavily. Pulling everything out to a static method where
> all
> > > the parameters are arguments, or inject a lot of mocks in the current
> > code,
> > > and develop some scenario based coverage.
> > >
> > > That is how i typically "rescue" code I take over. I look at the
> > nightmare
> > > and say, "damn i am really afraid to touch this". I construct 8
> scenarios
> > > that test green. Then I force some testing into it through careful re
> > > factoring. Now, I probably know -something- about it. Now, you are
> fairly
> > > free to do a wide ranging refactor, because you at least counted for 8
> > > scenarios and you put unit test traps so that some rules are enforced.
> > (Or
> > > the person changing the code has to actively REMOVE your tests
> asserting
> > it
> > > was not or no longer is valid). Later on you (or someone else)
> __STILL__
> > > might screw the entire thing up, but at least you can now build
> forward.
> > >
> > > Anyway that patch on compaction was great and I am sure it improved
> > things.
> > > That being said it did not add any tests :). So it can easily be undone
> > by
> > > the next person who does not understand the specific issue trying to be
> > > addressed. Inline comments almost scream to me 'we need a test' not
> > > everyone believes that.
> > >
> >
> >
> >
> > --
> > Thank you & Best Regards,
> > --Simon (Qingcun) Zhou
> >
>

I don't believe dependency injection frameworks cause "overhead". For
example, if you are using spring

@Bean(initMethod=init, destroyMethod=destroy)
public Server getMeAServer( Component component) {}

@Bean(initMethod=init, destroyMethod=destroy)
public Component getMeAComponent(){}

What spring actually does is something like this:

startup code:
Component c = new Component();
c.init();
Server s = new Server(c);
s.init();

The default Spring bean scope is "singleton", i.e. create one of these per
bean context.

Spring also does the reverse on shutdown.

The static singleton initialization thing creates a host of problems, with
things not closing cleanly.

https://issues.apache.org/jira/browse/CASSANDRA-12844
https://issues.apache.org/jira/browse/CASSANDRA-12728

One of them I even fixed:
https://github.com/stef1927/cassandra/commits/11984-2.2

So beyond not creating any performance problems, it probably would help in
practice by forcing code out of a pattern of singleton initialization driven
by the classloader, which is very opaque and hard to troubleshoot. Try
putting a breakpoint, for example, on something loaded by a static singleton
and try to reason about the order of initialization.
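To make that opacity concrete, a small self-contained sketch (class names invented): a classloader-driven singleton's constructor runs once, at whichever call site happens to touch the class first, so a breakpoint in it fires at a place you did not choose.

```java
public class InitOrderSketch {
    static final StringBuilder LOG = new StringBuilder();

    // Classloader-driven singleton: the constructor runs when the class is
    // first touched -- from wherever in the codebase that happens to be.
    static class HiddenSingleton {
        static final HiddenSingleton INSTANCE = new HiddenSingleton();
        private HiddenSingleton() { LOG.append("hidden;"); }
    }

    // Explicitly wired component: construction happens exactly where written.
    static class ExplicitComponent {
        ExplicitComponent() { LOG.append("explicit;"); }
    }

    public static String run(boolean touchSingletonFirst) {
        LOG.setLength(0);
        if (touchSingletonFirst) {
            HiddenSingleton unused = HiddenSingleton.INSTANCE; // some unrelated code path
        }
        new ExplicitComponent();
        HiddenSingleton s = HiddenSingleton.INSTANCE;
        return LOG.toString();
    }

    public static void main(String[] args) {
        // "hidden;" is appended only on the very first touch of the class in
        // the whole JVM; later calls never see the constructor run again.
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```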

Re: Code quality, principles and rules

Posted by Jason Brown <ja...@gmail.com>.
>> do we have plan to integrate with a dependency injection framework?

No, we (the maintainers) have been pretty much against more frameworks due
to performance reasons, overhead, and dependency management problems.

On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zh...@gmail.com> wrote:

> Since we're here, do we have plan to integrate with a dependency injection
> framework like Dagger2? Otherwise it'll be difficult to write unit test
> cases.
>
> On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <ed...@gmail.com>
> wrote:
>
> > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:
> >
> > >
> > >
> > > On 2017-03-16 10:32 (-0700), François Deliège <fr...@instagram.com>
> > > wrote:
> > > >
> > > > To get this started, here is an initial proposal:
> > > >
> > > > Principles:
> > > >
> > > > 1. Tests always pass.  This is the starting point. If we don't care
> > > about test failures, then we should stop writing tests. A recurring
> > failing
> > > test carries no signal and is better deleted.
> > > > 2. The code is tested.
> > > >
> > > > Assuming we can align on these principles, here is a proposal for
> their
> > > implementation.
> > > >
> > > > Rules:
> > > >
> > > > 1. Each new release passes all tests (no flakinesss).
> > > > 2. If a patch has a failing test (test touching the same code path),
> > the
> > > code or test should be fixed prior to being accepted.
> > > > 3. Bugs fixes should have one test that fails prior to the fix and
> > > passes after fix.
> > > > 4. New code should have at least 90% test coverage.
> > > >
> > > First I was
> > > I agree with all of these and hope they become codified and followed. I
> > > don't know anyone who believes we should be committing code that breaks
> > > tests - but we should be more strict with requiring green test runs,
> and
> > > perhaps more strict with reverting patches that break tests (or cause
> > them
> > > to be flakey).
> > >
> > > Ed also noted on the user list [0] that certain sections of the code
> > > itself are difficult to test because of singletons - I agree with the
> > > suggestion that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > >
> > > Finally, we should also recall Jason's previous notes [1] that the
> actual
> > > test infrastructure available is limited - the system provided by
> > Datastax
> > > is not generally open to everyone (and not guaranteed to be permanent),
> > and
> > > the infrastructure currently available to the ASF is somewhat limited
> > (much
> > > slower, at the very least). If we require tests passing (and I agree
> that
> > > we should), we need to define how we're going to be testing (or how
> we're
> > > going to be sharing test results), because the ASF hardware isn't going
> > to
> > > be able to do dozens of dev branch dtest runs per day in its current
> > form.
> > >
> > > 0: https://lists.apache.org/thread.html/f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > 1: https://lists.apache.org/thread.html/5fb8f0446ab97644100e4ef987f36e
> > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > >
> > >
> > >
> > Ed also noted on the user list [0] that certain sections of the code
> itself
> > are difficult to test because of singletons - I agree with the suggestion
> > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> >
> > Thanks for the shout out!
> >
> > I was just looking at a patch about compaction. The patch was to
> calculate
> > free space correctly in case X. Compaction is not something that requires
> > multiple nodes to test. The logic on the surface seems simple: find
> tables
> > of similar size and select them and merge them. The reality is it turns
> out
> > now to be that way. The coverage itself both branch and line may be very
> > high, but what the code does not do is directly account for a wide
> variety
> > of scenarios. Without direct tests you end up with a mental approximation
> > of what it does and that varies from person to person and accounts for
> the
> > cases that fit in your mind. For example, you personally are only running
> > LevelDB inspired compaction.
> >
> > Being that this this is not a multi-node problem you should be able to re
> > factor this heavily. Pulling everything out to a static method where all
> > the parameters are arguments, or inject a lot of mocks in the current
> code,
> > and develop some scenario based coverage.
> >
> > That is how i typically "rescue" code I take over. I look at the
> nightmare
> > and say, "damn i am really afraid to touch this". I construct 8 scenarios
> > that test green. Then I force some testing into it through careful re
> > factoring. Now, I probably know -something- about it. Now, you are fairly
> > free to do a wide ranging refactor, because you at least counted for 8
> > scenarios and you put unit test traps so that some rules are enforced.
> (Or
> > the person changing the code has to actively REMOVE your tests asserting
> it
> > was not or no longer is valid). Later on you (or someone else)  __STILL__
> > might screw the entire thing up, but at least you can now build forward.
> >
> > Anyway that patch on compaction was great and I am sure it improved
> things.
> > That being said it did not add any tests :). So it can easily be undone
> by
> > the next person who does not understand the specific issue trying to be
> > addressed. Inline comments almost scream to me 'we need a test' not
> > everyone believes that.
> >
>
>
>
> --
> Thank you & Best Regards,
> --Simon (Qingcun) Zhou
>

Re: Code quality, principles and rules

Posted by Qingcun Zhou <zh...@gmail.com>.
Since we're here, do we have plan to integrate with a dependency injection
framework like Dagger2? Otherwise it'll be difficult to write unit test
cases.
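For what it's worth, constructor injection by hand already buys the testability, framework or not. A hypothetical sketch (names invented, not actual Cassandra APIs):

```java
public class InjectionSketch {
    // Depend on a small interface rather than a concrete static singleton.
    public interface Disk {
        long freeBytes();
    }

    // The collaborator arrives through the constructor, so a test can pass
    // a fake instead of whatever a static Disk.instance would have been.
    public static class CompactionPlanner {
        private final Disk disk;

        public CompactionPlanner(Disk disk) {
            this.disk = disk;
        }

        public boolean canCompact(long estimatedBytes) {
            return disk.freeBytes() >= estimatedBytes;
        }
    }

    public static void main(String[] args) {
        // Unit test style: a lambda stands in for the disk -- no node, no filesystem.
        CompactionPlanner planner = new CompactionPlanner(() -> 1000L);
        System.out.println(planner.canCompact(500));  // true
        System.out.println(planner.canCompact(5000)); // false
    }
}
```

A framework like Dagger 2 automates the wiring, but the pattern itself is what makes unit tests possible.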

On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <ed...@gmail.com>
wrote:

> On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:
>
> >
> >
> > On 2017-03-16 10:32 (-0700), François Deliège <fr...@instagram.com>
> > wrote:
> > >
> > > To get this started, here is an initial proposal:
> > >
> > > Principles:
> > >
> > > 1. Tests always pass.  This is the starting point. If we don't care
> > about test failures, then we should stop writing tests. A recurring
> failing
> > test carries no signal and is better deleted.
> > > 2. The code is tested.
> > >
> > > Assuming we can align on these principles, here is a proposal for their
> > implementation.
> > >
> > > Rules:
> > >
> > > 1. Each new release passes all tests (no flakinesss).
> > > 2. If a patch has a failing test (test touching the same code path),
> the
> > code or test should be fixed prior to being accepted.
> > > 3. Bugs fixes should have one test that fails prior to the fix and
> > passes after fix.
> > > 4. New code should have at least 90% test coverage.
> > >
> > First I was
> > I agree with all of these and hope they become codified and followed. I
> > don't know anyone who believes we should be committing code that breaks
> > tests - but we should be more strict with requiring green test runs, and
> > perhaps more strict with reverting patches that break tests (or cause
> them
> > to be flakey).
> >
> > Ed also noted on the user list [0] that certain sections of the code
> > itself are difficult to test because of singletons - I agree with the
> > suggestion that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> >
> > Finally, we should also recall Jason's previous notes [1] that the actual
> > test infrastructure available is limited - the system provided by
> Datastax
> > is not generally open to everyone (and not guaranteed to be permanent),
> and
> > the infrastructure currently available to the ASF is somewhat limited
> (much
> > slower, at the very least). If we require tests passing (and I agree that
> > we should), we need to define how we're going to be testing (or how we're
> > going to be sharing test results), because the ASF hardware isn't going
> to
> > be able to do dozens of dev branch dtest runs per day in its current
> form.
> >
> > 0: https://lists.apache.org/thread.html/f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > 1: https://lists.apache.org/thread.html/5fb8f0446ab97644100e4ef987f36e
> > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> >
> >
> >
> Ed also noted on the user list [0] that certain sections of the code itself
> are difficult to test because of singletons - I agree with the suggestion
> that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
>
> Thanks for the shout out!
>
> I was just looking at a patch about compaction. The patch was to calculate
> free space correctly in case X. Compaction is not something that requires
> multiple nodes to test. The logic on the surface seems simple: find tables
> of similar size and select them and merge them. The reality is it turns out
> now to be that way. The coverage itself both branch and line may be very
> high, but what the code does not do is directly account for a wide variety
> of scenarios. Without direct tests you end up with a mental approximation
> of what it does and that varies from person to person and accounts for the
> cases that fit in your mind. For example, you personally are only running
> LevelDB inspired compaction.
>
> Being that this this is not a multi-node problem you should be able to re
> factor this heavily. Pulling everything out to a static method where all
> the parameters are arguments, or inject a lot of mocks in the current code,
> and develop some scenario based coverage.
>
> That is how i typically "rescue" code I take over. I look at the nightmare
> and say, "damn i am really afraid to touch this". I construct 8 scenarios
> that test green. Then I force some testing into it through careful re
> factoring. Now, I probably know -something- about it. Now, you are fairly
> free to do a wide ranging refactor, because you at least counted for 8
> scenarios and you put unit test traps so that some rules are enforced. (Or
> the person changing the code has to actively REMOVE your tests asserting it
> was not or no longer is valid). Later on you (or someone else)  __STILL__
> might screw the entire thing up, but at least you can now build forward.
>
> Anyway that patch on compaction was great and I am sure it improved things.
> That being said it did not add any tests :). So it can easily be undone by
> the next person who does not understand the specific issue trying to be
> addressed. Inline comments almost scream to me 'we need a test' not
> everyone believes that.
>



-- 
Thank you & Best Regards,
--Simon (Qingcun) Zhou

Re: Code quality, principles and rules

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jj...@apache.org> wrote:

>
>
> On 2017-03-16 10:32 (-0700), François Deliège <fr...@instagram.com>
> wrote:
> >
> > To get this started, here is an initial proposal:
> >
> > Principles:
> >
> > 1. Tests always pass.  This is the starting point. If we don't care
> about test failures, then we should stop writing tests. A recurring failing
> test carries no signal and is better deleted.
> > 2. The code is tested.
> >
> > Assuming we can align on these principles, here is a proposal for their
> implementation.
> >
> > Rules:
> >
> > 1. Each new release passes all tests (no flakinesss).
> > 2. If a patch has a failing test (test touching the same code path), the
> code or test should be fixed prior to being accepted.
> > 3. Bugs fixes should have one test that fails prior to the fix and
> passes after fix.
> > 4. New code should have at least 90% test coverage.
> >
> First I was
> I agree with all of these and hope they become codified and followed. I
> don't know anyone who believes we should be committing code that breaks
> tests - but we should be more strict with requiring green test runs, and
> perhaps more strict with reverting patches that break tests (or cause them
> to be flakey).
>
> Ed also noted on the user list [0] that certain sections of the code
> itself are difficult to test because of singletons - I agree with the
> suggestion that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
>
> Finally, we should also recall Jason's previous notes [1] that the actual
> test infrastructure available is limited - the system provided by Datastax
> is not generally open to everyone (and not guaranteed to be permanent), and
> the infrastructure currently available to the ASF is somewhat limited (much
> slower, at the very least). If we require tests passing (and I agree that
> we should), we need to define how we're going to be testing (or how we're
> going to be sharing test results), because the ASF hardware isn't going to
> be able to do dozens of dev branch dtest runs per day in its current form.
>
> 0: https://lists.apache.org/thread.html/f6f3fc6d0ad1bd54a6185ce7bd7a2f6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> 1: https://lists.apache.org/thread.html/5fb8f0446ab97644100e4ef987f36e07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
>
>
>
Ed also noted on the user list [0] that certain sections of the code itself
are difficult to test because of singletons - I agree with the suggestion
that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283

Thanks for the shout out!

I was just looking at a patch about compaction. The patch was to calculate
free space correctly in case X. Compaction is not something that requires
multiple nodes to test. The logic on the surface seems simple: find tables
of similar size, select them, and merge them. The reality turns out not to
be that way. The coverage itself, both branch and line, may be very high,
but what the code does not do is directly account for a wide variety of
scenarios. Without direct tests you end up with a mental approximation of
what the code does, and that approximation varies from person to person and
covers only the cases that fit in your mind. For example, you personally
might only be running LevelDB-inspired compaction.

Since this is not a multi-node problem, you should be able to refactor it
heavily: pull everything out into a static method where all the parameters
are arguments, or inject mocks into the current code, and develop some
scenario-based coverage.

That is how I typically "rescue" code I take over. I look at the nightmare
and say, "Damn, I am really afraid to touch this." I construct 8 scenarios
that test green. Then I force some testing into it through careful
refactoring. Now I probably know -something- about it, and I am fairly free
to do a wide-ranging refactor, because I have at least accounted for 8
scenarios and put unit test traps in place so that some rules are enforced
(or the person changing the code has to actively REMOVE a test, asserting
it is no longer valid). Later on, you (or someone else) __STILL__ might
screw the entire thing up, but at least you can now build forward.
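To make that concrete, here is a rough sketch of the kind of extraction I
mean. The class, method names, and bucketing logic below are hypothetical
simplifications for illustration, not Cassandra's actual compaction code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for compaction candidate selection: pure static
// methods whose inputs are all arguments, so each scenario can be asserted
// directly without spinning up node state or reaching through singletons.
public class CompactionMath {

    /** True if two SSTable sizes are "similar" within the given ratio. */
    static boolean similarSize(long a, long b, double bucketRatio) {
        if (a <= 0 || b <= 0)
            return false;
        long lo = Math.min(a, b), hi = Math.max(a, b);
        return (double) hi / lo <= bucketRatio;
    }

    /** Greedily groups sizes into the first bucket with a similar-sized head. */
    static List<List<Long>> bucket(List<Long> sizes, double bucketRatio) {
        List<List<Long>> buckets = new ArrayList<>();
        for (long s : sizes) {
            boolean placed = false;
            for (List<Long> b : buckets) {
                if (similarSize(b.get(0), s, bucketRatio)) {
                    b.add(s);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                List<Long> fresh = new ArrayList<>();
                fresh.add(s);
                buckets.add(fresh);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Scenario 1: near-equal sizes land in a single bucket.
        if (bucket(List.of(100L, 110L, 95L), 1.5).size() != 1)
            throw new AssertionError("similar sizes should share one bucket");

        // Scenario 2: an order-of-magnitude outlier gets its own bucket.
        if (bucket(List.of(100L, 110L, 5000L), 1.5).size() != 2)
            throw new AssertionError("outlier should form a second bucket");

        // Scenario 3: degenerate input (zero-sized table) never pairs up.
        if (similarSize(0L, 100L, 1.5))
            throw new AssertionError("zero size must not bucket");

        System.out.println("all scenarios pass");
    }
}
```

Once the logic is a pure function like this, each "scenario" in the rescue
is one assertion, and removing a scenario is a visible act in the diff.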

Anyway, that patch on compaction was great and I am sure it improved
things. That being said, it did not add any tests :). So it can easily be
undone by the next person who does not understand the specific issue being
addressed. Inline comments almost scream "we need a test" to me; not
everyone believes that.

Re: Code quality, principles and rules

Posted by Jeff Jirsa <jj...@apache.org>.

On 2017-03-16 10:32 (-0700), François Deliège <fr...@instagram.com> wrote: 
> 
> To get this started, here is an initial proposal:
> 
> Principles:
> 
> 1. Tests always pass.  This is the starting point. If we don't care about test failures, then we should stop writing tests. A recurring failing test carries no signal and is better deleted.
> 2. The code is tested.
> 
> Assuming we can align on these principles, here is a proposal for their implementation.
> 
> Rules:
> 
> 1. Each new release passes all tests (no flakiness).
> 2. If a patch has a failing test (test touching the same code path), the code or test should be fixed prior to being accepted.
> 3. Bug fixes should have one test that fails prior to the fix and passes after the fix.
> 4. New code should have at least 90% test coverage.
> 

I agree with all of these and hope they become codified and followed. I don't know anyone who believes we should be committing code that breaks tests - but we should be more strict with requiring green test runs, and perhaps more strict with reverting patches that break tests (or cause them to be flaky).

Ed also noted on the user list [0] that certain sections of the code itself are difficult to test because of singletons - I agree with the suggestion that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283

Finally, we should also recall Jason's previous notes [1] that the actual test infrastructure available is limited - the system provided by Datastax is not generally open to everyone (and not guaranteed to be permanent), and the infrastructure currently available to the ASF is somewhat limited (much slower, at the very least). If we require tests passing (and I agree that we should), we need to define how we're going to be testing (or how we're going to be sharing test results), because the ASF hardware isn't going to be able to do dozens of dev branch dtest runs per day in its current form.  

0: https://lists.apache.org/thread.html/f6f3fc6d0ad1bd54a6185ce7bd7a2f6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
1: https://lists.apache.org/thread.html/5fb8f0446ab97644100e4ef987f36e07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E