Posted to dev@cassandra.apache.org by Alex Petrov <al...@coffeenco.de> on 2022/04/01 15:02:13 UTC

Re: [DISCUSS] Should we deprecate / freeze python dtests

(bunching up references here)
> They use a separate implementation of instance initialization and thus they test the test server rather than the real node.

> What is the real gap between the in-JVM tests server instance and a server as run by python DTests? 

> I also have this concern.

> I think we can get rid of this by extending CassandraDaemon, just need to add a few hooks to mock out gossip/internode/client
(/end bunching up references)

Wanted to mention that implementing "regular" instance initialisation should not be complicated: I know it because I've tried it once. I never got to submit a patch, but mostly because it was not fully fleshed out in a state that would meet our quality standards. I'll see if I can finish it up quickly. 

That said, the vast majority of dtests will not require instance initialisation. I checked the dtest suite back when we were working on the in-jvm dtest API, and only a few tests (such as bootstrap, some streaming tests, etc.) would have benefitted from this feature.

> So, if you do Cluster.build(num).withConfig(c -> c.with(Feature.values())).start(), this will run the full Cassandra daemon, so the main difference will be with: startup (Instance vs CassandraDaemon), and JMX (direct method call rather than JMX).

Right, I think the main difference is more or less resolved in `BootstrapTest` by stubbing out Gossip, but I can see some people being opposed to doing it this way, so we should just mimic what CassandraDaemon#activate does and go through the "normal" startup sequence that starts bootstrap/streaming/etc.
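
To make the quoted description of the feature toggles a bit more concrete, here is a toy, stdlib-only Java model. It deliberately does not use the real in-jvm dtest API: the enum below is a hypothetical stand-in that lists only the three values named in this thread (the real enum may have more), and the returned strings merely paraphrase David's description.

```java
import java.util.EnumSet;
import java.util.Set;

class FeatureToggleDemo {
    // Hypothetical stand-in for the dtest Feature enum; only the values named in this thread.
    enum Feature { GOSSIP, NETWORK, NATIVE_PROTOCOL }

    // Describes how a subsystem behaves for a given set of enabled features,
    // paraphrasing the description quoted in this thread.
    static String describe(Feature feature, Set<Feature> enabled) {
        if (enabled.contains(feature))
            return "real";
        switch (feature) {
            case GOSSIP:
                return "static state supplied by the cluster";
            case NETWORK:
                return "messages serialized but delivered in-process";
            default:
                return "disabled";
        }
    }

    public static void main(String[] args) {
        Set<Feature> none = EnumSet.noneOf(Feature.class);
        Set<Feature> all = EnumSet.allOf(Feature.class);
        for (Feature f : Feature.values())
            System.out.println(f + ": default=" + describe(f, none) + ", all features=" + describe(f, all));
    }
}
```

The point of the sketch is only that "no features" and "all features" are two ends of one switch, which is why startup is the remaining difference once every feature is enabled.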

> Given each started instance uses a dedicated class loader there is some amount of trash left and when there are a couple of multi-node test cases in a single test class, it sometimes happens that the tests fail with an out-of-memory error in metaspace.

We have also had several discussions about this, and the consensus was to just use a JVM per method in the test suite. There will still be tests where we have to be more considerate, but these would overlap with the ones where the dtest suite would also hit the limits of the machine, so this is fine.
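
For anyone wondering why a dedicated class loader per instance costs metaspace, a minimal stdlib-only sketch follows. It is not how the in-jvm dtest framework loads classes; as an assumption for illustration, it uses java.lang.reflect.Proxy simply as a convenient way to define a class inside a given loader. The class and method names are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;

class LoaderIsolationDemo {
    // Defines (or reuses) a Runnable proxy class inside the given loader and returns that class.
    static Class<?> defineIn(ClassLoader loader) {
        InvocationHandler noop = (proxy, method, arguments) -> null;
        return Proxy.newProxyInstance(loader, new Class<?>[] { Runnable.class }, noop).getClass();
    }

    // Stand-in for "one dedicated class loader per started instance".
    static ClassLoader newInstanceLoader() {
        return new URLClassLoader(new URL[0], LoaderIsolationDemo.class.getClassLoader());
    }

    public static void main(String[] args) {
        ClassLoader node1 = newInstanceLoader();
        ClassLoader node2 = newInstanceLoader();
        // Same loader: the class definition is reused.
        System.out.println("same loader, same class: " + (defineIn(node1) == defineIn(node1)));
        // Different loaders: each holds its own copy of the class metadata.
        System.out.println("different loaders, same class: " + (defineIn(node1) == defineIn(node2)));
    }
}
```

Each loader keeps its own class definitions alive until the loader itself becomes unreachable, so several multi-node clusters in one JVM multiply that metadata. A fresh JVM per test method sidesteps the problem rather than fixing the leak.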

> I support deprecating python dtests, as long as in-jvm dtests have feature parity with python dtests,

We can start by deprecating those dtests where there is feature parity. I'm sure we all agree we don't need full feature parity between frameworks _before_ we start porting _any_ tests.


I have to mention that migrating to in-jvm dtests will also be hugely beneficial because we'll be able to use Harry more easily, and many tests that currently use stress and have no validation will be "simply" migrated to use Harry and have validation more or less by default. Besides, we'll be able to avoid using any `for-loop` data generation and generate data using Harry instead.
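
To illustrate what I mean by validation "more or less by default", here is a toy sketch of the general idea: values are derived deterministically from a seed, so they can be recomputed and checked later without being stored. This is not Harry's API; all names are hypothetical, and a HashMap stands in for the cluster.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SplittableRandom;

class SeededDataDemo {
    // Recomputes the value for a row purely from (seed, row); nothing has to be remembered to validate.
    static long valueFor(long seed, long row) {
        return new SplittableRandom(seed ^ (row * 0x9E3779B97F4A7C15L)).nextLong();
    }

    // Stand-in for writing rows to a cluster: fills a map keyed by row index.
    static Map<Long, Long> write(long seed, int rows) {
        Map<Long, Long> table = new HashMap<>();
        for (long row = 0; row < rows; row++)
            table.put(row, valueFor(seed, row));
        return table;
    }

    // Returns the number of rows whose stored value differs from the recomputed one.
    static int validate(long seed, Map<Long, Long> table) {
        int mismatches = 0;
        for (Map.Entry<Long, Long> e : table.entrySet())
            if (e.getValue() != valueFor(seed, e.getKey()))
                mismatches++;
        return mismatches;
    }

    public static void main(String[] args) {
        Map<Long, Long> table = write(42L, 100);
        System.out.println("mismatches after write: " + validate(42L, table));
        table.put(3L, table.get(3L) + 1); // simulate a corrupted row
        System.out.println("mismatches after corruption: " + validate(42L, table));
    }
}
```

A failing run can be reproduced from the seed alone, which is the property that hand-written loops and unvalidated stress runs lack.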


On Wed, Mar 30, 2022, at 6:22 PM, David Capwell wrote:
> 
>> Outside of this area is there some other difference in the coverage of the tests. Is serialization fully covered?
>> I would like to be sure that we will not miss anything by using in-jvm dtests instead of python dtests.
> 
> So, if you do Cluster.build(num).withConfig(c -> c.with(Feature.values())).start(), this will run the full Cassandra daemon, so the main difference will be with: startup (Instance vs CassandraDaemon), and JMX (direct method call rather than JMX).
> 
> Features let you enable different subsystems, so adding every feature will get you mostly in sync.
> 
> Now, if you don’t define any Features, then the following are mocked out or disabled: Gossip, Internode, Client (disabled)
> 
> GOSSIP: the gossiper isn’t running and the gossip state is defined by the Cluster; the state does not change for the life of the test (there are utility methods to mutate gossip state). Since most tests do not care about gossip state changes, this is fine for the majority of tests; when it is not, just enable gossip with Feature.GOSSIP
> NETWORK: messages do not go over the network by default, but will be serialized; this logic does not match networking, though the serializer is the same. If you are actually testing internode messaging, enable it with Feature.NETWORK
> NATIVE_PROTOCOL: by default the CQL/thrift protocols are disabled; to enable them, use Feature.NATIVE_PROTOCOL (by default it acts as if -Dcassandra.start_native_transport=false)
> 
> 
> 
>> On Mar 30, 2022, at 1:51 AM, Benjamin Lerer <b....@gmail.com> wrote:
>> 
>>> 
>>> 
>>> I think we can get rid of this by extending CassandraDaemon, just need to add a few hooks to mock out gossip/internode/client (for cases where the mocks are desired), and when mocks are not desired just run the real logic.
>>> 
>>> Too many times I have had to make the 2 more in-line, and this is hard to maintain… we should fix this and feel this is 100% fixable
>> 
>> Thanks for the explanation David. Outside of this area is there some other difference in the coverage of the tests. Is serialization fully covered?
>> I would like to be sure that we will not miss anything by using in-jvm dtests instead of python dtests.
>> 
>> 
>> On Wed, 30 Mar 2022 at 02:15, benedict@apache.org <be...@apache.org> wrote:
>>> > a well-defined path to reduce/eliminate code duplication and basic documentation for newcomers to get up to speed with writing in-jvm dtests and extending the framework
>>>
>>> Are python tests much better here? If not, I do not see why these should be blockers for their deprecation.
>>>
>>> Perfect feature parity also seems unnecessary - unless a missing feature is an active impediment. But as far as I know every missing feature is actively under development and can be expected very soon.
>>>
>>> Let’s get this decision over and done with.
>>>
>>> From: Paulo Motta <pa...@gmail.com>
>>> Date: Wednesday, 30 March 2022 at 00:46
>>> To: Cassandra DEV <de...@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>
>>> I support deprecating python dtests, as long as in-jvm dtests have feature parity with python dtests, a well-defined path to reduce/eliminate code duplication and basic documentation for newcomers to get up to speed with writing in-jvm dtests and extending the framework.
>>>
>>> On Tue, 29 Mar 2022 at 20:09, benedict@apache.org <be...@apache.org> wrote:
>>>> It often does not work. I can attest to many wasted weeks, on some environments never getting them to work.
>>>>
>>>> They happen to work right now for me, though.
>>>>
>>>> I think the learning curve thing is a bit of a distraction, personally. I have always found python dtests hard to work with, both developing against and running, so their learning curve for me is going on 10 years. Some folk may be more comfortable with python dtests due to their familiarity with python, ccm or other tooling, but that is a different matter.
>>>>
>>>> Looking at git, most contributors to python dtests are contributors to in-jvm dtests, and the latter have received 20x as many net code contributions over the past year.
>>>>
>>>> I think it’s quite justified to just say in-jvm dtests are simply better to work with, and already better and more widely used despite their youth, whatever their remaining teething problems.
>>>>
>>>> I vote we immediately discontinue python dtest development, and discontinue running python dtests pre-commit, retaining them for releases only. This will provide the necessary impetus to polish off any last remaining gaps, without reducing coverage.
>>>>
>>>> From: Brandon Williams <dr...@gmail.com>
>>>> Date: Tuesday, 29 March 2022 at 23:42
>>>> To: dev <de...@cassandra.apache.org>
>>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>> > In fact there is a high learning curve to set up the cassandra-dtest environment
>>>> 
>>>> I think this is fairly well documented:
>>>> https://github.com/apache/cassandra-dtest/blob/trunk/README.md
>>>> 
>>>> On Tue, Mar 29, 2022 at 5:27 PM Paulo Motta <pa...@gmail.com> wrote:
>>>> >
>>>> > > I am curious about this comment.  When I first joined I learned jvm-dtest within an hour and started walking Repair code in a debugger (and this was way before the improvements that let us do things like nodetool)… python dtest took weeks to get working correctly (still having issues with the MBean library we use… so have to comment out error handling to get some tests to pass)….
>>>> >
>>>> > Thanks for sharing your perspective. In fact there is a high learning curve to set up the cassandra-dtest environment, but once it's working it's pretty straightforward to test any existing or new functionality.
>>>> >
>>>> > I think with in-jvm dtests you don't have the hassle of setting up a different environment and this is a great motivator to standardize on this solution. The main difficulty I had was testing features not supported by the framework, which require you to extend the framework. I don't recall having to extend ccm/cassandra-dtest many times when working on new features.
>>>> >
>>>> > Perhaps this has improved recently and we no longer need to worry about extending the framework or duplicating code when testing new functionality.
>>>> >
>>>> > On Tue, 29 Mar 2022 at 15:12, Ekaterina Dimitrova <e....@gmail.com> wrote:
>>>> >>
>>>> >> One thing we can add to the docs is how people can update the in-jvm framework and test their patches before asking for an in-jvm API release. I assume not many such updates will be needed, but it is good to have this documented.
>>>> >>
>>>> >> On Tue, 29 Mar 2022 at 13:51, David Capwell <dc...@apple.com> wrote:
>>>> >>>
>>>> >>> They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>> >>>
>>>> >>>
>>>> >>> I think we can get rid of this by extending CassandraDaemon, just need to add a few hooks to mock out gossip/internode/client (for cases where the mocks are desired), and when mocks are not desired just run the real logic.
>>>> >>>
>>>> >>> Too many times I have had to make the 2 more in-line, and this is hard to maintain… we should fix this and feel this is 100% fixable
>>>> >>>
>>>> >>> we shouldn't neglect that there is a significant learning curve associated with it for new contributors which IMO is much lower for python dtests
>>>> >>>
>>>> >>>
>>>> >>> I am curious about this comment.  When I first joined I learned jvm-dtest within an hour and started walking Repair code in a debugger (and this was way before the improvements that let us do things like nodetool)… python dtest took weeks to get working correctly (still having issues with the MBean library we use… so have to comment out error handling to get some tests to pass)….
>>>> >>>
>>>> >>> Maybe we could have some example docs showing how to do the same in both tools?  Honestly Cluster.build(3).withConfig(c -> c.with(Feature.values())).start() matches 95% of python dtest tests (the withConfig logic is a bit cryptic), so don’t think the docs would be too much work
>>>> >>>
>>>> >>>
>>>> >>> On Mar 29, 2022, at 5:48 AM, Josh McKenzie <jm...@apache.org> wrote:
>>>> >>>
>>>> >>> we should at least write extensive documentation on how to use/modify in-jvm dtest framework before deprecating python dtests.
>>>> >>>
>>>> >>> We should have this for all our testing frameworks period, in-jvm dtest, python dtest, and ccm. They're woefully under-documented IMO.
>>>> >>>
>>>> >>> On Tue, Mar 29, 2022, at 6:11 AM, Paulo Motta wrote:
>>>> >>>
>>>> >>> To elaborate a bit on the steep learning curve point, when mentoring new contributors on a couple of occasions I told them to "just write a python dtest" because we had no idea how to test that functionality in in-jvm tests, while the python dtest was fairly straightforward to implement (I can't recall exactly what feature it was, but I can dig if necessary).
>>>> >>>
>>>> >>> While we might be already familiar with the in-jvm dtest framework due to our exposure to it, we shouldn't neglect that there is a significant learning curve associated with it for new contributors which IMO is much lower for python dtests. So we should at least write extensive documentation on how to use/modify in-jvm dtest framework before deprecating python dtests.
>>>> >>>
>>>> >>> On Tue, 29 Mar 2022 at 06:58, Paulo Motta <pa...@gmail.com> wrote:
>>>> >>>
>>>> >>> > They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>> >>>
>>>> >>> I also have this concern. When adding a new service on CASSANDRA-16789 we had to explicitly modify the in-jvm dtest server to match the behavior from the actual server [1] (this is just a minor example but I remember having to do something similar on other tickets).
>>>> >>>
>>>> >>> Besides having a steep learning curve since users need to be familiar with the in-jvm dtest framework in order to add new functionality not supported by it, this is potentially unsafe, since the implementations can diverge without being caught by tests.
>>>> >>>
>>>> >>> Is there any way we could avoid duplicating functionality on the test server and use the same initialization code on in-jvm dtests?
>>>> >>>
>>>> >>> [1] - https://github.com/apache/cassandra/commit/ad249424814836bd00f47931258ad58bfefb24fd#diff-321b52220c5bd0aaadf275a845143eb208c889c2696ba0d48a5fc880551131d8R735
>>>> >>>
>>>> >>> On Tue, 29 Mar 2022 at 04:22, Benjamin Lerer <bl...@apache.org> wrote:
>>>> >>>
>>>> >>> They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>> >>>
>>>> >>>
>>>> >>> This is actually my main concern. What is the real gap between the in-JVM tests server instance and a server as run by python DTests?
>>>> >>>
>>>> >>> On Tue, 29 Mar 2022 at 00:08, benedict@apache.org <be...@apache.org> wrote:
>>>> >>>
>>>> >>> > Other than that, it can be problematic to test upgrades when the starting version must run with a different Java version than the end release
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> python upgrade tests seem to be particularly limited (from a quick skim, primarily testing major upgrade points that are now long in the past), so I’m not sure how much of a penalty this is today in practice - but it might well become a problem.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> There are several questions to answer, namely how many versions we want to:
>>>> >>>
>>>> >>> - test upgrades across
>>>> >>> - maintain backwards compatibility of the in-jvm dtest api across
>>>> >>> - support a given JVM for
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> However, if we need to, we can probably use RMI to transparently support multiple JVMs for tests that require it. Since we already use serialization to cross the ClassLoader boundary it might not even be very difficult.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> From: Jacek Lewandowski <le...@gmail.com>
>>>> >>> Date: Monday, 28 March 2022 at 22:30
>>>> >>> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
>>>> >>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>> >>>
>>>> >>> Although I like in-jvm DTests for many scenarios, I can see that they do not test the production code as it is. They use a separate implementation of instance initialization and thus they test the test server rather than the real node. Other than that, it can be problematic to test upgrades when the starting version must run with a different Java version than the end release. One more thing I've been observing sometimes is high consumption of metaspace, which does not seem to be cleaned after individual test cases. Given each started instance uses a dedicated class loader there is some amount of trash left and when there are a couple of multi-node test cases in a single test class, it sometimes happens that the tests fail with an out-of-memory error in metaspace.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Thanks,
>>>> >>>
>>>> >>> Jacek
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Mar 28, 2022 at 10:06 PM David Capwell <dc...@apple.com> wrote:
>>>> >>>
>>>> >>> I am back and the work for trunk to support vnode is at the last stage of review; I had not planned to backport the changes to other branches (aka, older branches would only support single token), so if someone would like to pick up this work it is rather LHF after 17332 goes in (see trunk patch GH PR: trunk).
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I am in favor of deprecating python dtests, and agree we should figure out what the gaps are (once vnode support is merged) so we can either shrink them or special case to unfreeze (such as startup changes being allowed).
>>>> >>>
>>>> >>>
>>>> >>> On Mar 14, 2022, at 6:13 AM, Josh McKenzie <jm...@apache.org> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> vnode support for in-jvm dtests is in flight and fairly straightforward:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> https://issues.apache.org/jira/browse/CASSANDRA-17332
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> David's OOO right now but I suspect we can get this in in April some time.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Mar 14, 2022, at 8:36 AM, benedict@apache.org wrote:
>>>> >>>
>>>> >>> This is the limitation I mentioned. I think this is solely a question of supplying an initial config that uses vnodes, i.e. that specifies multiple tokens for each node. It is not really a limitation – I believe a dtest could be written today using vnodes, by overriding the config’s tokens. It does look like the token handling has been refactored since the initial implementation to make this a little uglier than should be necessary.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> We should make this trivial, anyway, and perhaps offer a way to run all of the dtests with vnodes (and suitably annotating those that cannot be run with vnodes). This should be quite easy.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> From: Andrés de la Peña <ad...@apache.org>
>>>> >>> Date: Monday, 14 March 2022 at 12:28
>>>> >>> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
>>>> >>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>> >>>
>>>> >>> Last time I checked there wasn't support for vnodes on in-jvm dtests, which seems an important limitation.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, 14 Mar 2022 at 12:24, benedict@apache.org <be...@apache.org> wrote:
>>>> >>>
>>>> >>> I am strongly in favour of deprecating python dtests in all cases where they are currently superseded by in-jvm dtests. They are environmentally more challenging to work with, causing many problems on local and remote machines. They are harder to debug, slower, flakier, and mostly less sophisticated.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> > all focus on getting the in-jvm framework robust enough to cover edge-cases
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Would be great to collect gaps. I think it’s just vnodes, which is by no means a fundamental limitation? There may also be some stuff to do startup/shutdown and environmental scripts, that may be a niche we retain something like python dtests for.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> > people aren’t familiar
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I would be interested to hear from these folk to understand their concerns or problems using in-jvm dtests, if there is a cohort holding off for this reason
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> > This is going to require documentation work from some of the original authors
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I think a collection of template-like tests we can point people to would be a cheap initial effort. Cutting and pasting an existing test with the required functionality, then editing to suit, should get most people who aren’t familiar off to a quick start.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> > Labor and process around revving new releases of the in-jvm dtest API
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I think we need to revisit how we do this, as it is currently broken. We should consider either using ASF snapshots until we cut new releases of C* itself, or else using git subprojects. This will also become a problem for Accord’s integration over time, and perhaps other subprojects in future, so it is worth better solving this.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I think this has been made worse than necessary by moving too many implementation details to the shared API project – some should be retained within the C* tree, with the API primarily serving as the shared API itself to ensure cross-version compatibility. However, this is far from a complete explanation of (or solution to) the problem.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> From: Josh McKenzie <jm...@apache.org>
>>>> >>> Date: Monday, 14 March 2022 at 12:11
>>>> >>> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
>>>> >>> Subject: [DISCUSS] Should we deprecate / freeze python dtests
>>>> >>>
>>>> >>> I've been wrestling with the python dtests recently and that led to some discussions with other contributors about whether we as a project should be writing new tests in the python dtest framework or the in-jvm framework. This discussion has come up tangentially on some other topics, including the lack of documentation / expertise on the in-jvm framework dis-incentivizing some folks from authoring new tests there vs. the difficulty debugging and maintaining timer-based, sleep-based non-deterministic python dtests, etc.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> I don't know of a place where we've formally discussed this and made a project-wide call on where we expect new distributed tests to be written; if I've missed an email about this someone please link on the thread here (and stop reading! ;))
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> At this time we don't specify a preference for where you write new multi-node distributed tests on our "development/testing" portion of the site and documentation: https://cassandra.apache.org/_/development/testing.html
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> The primary tradeoffs as I understand them for moving from python-based multi-node testing to jdk-based are:
>>>> >>>
>>>> >>> Pros:
>>>> >>>
>>>> >>> - Better debugging functionality (breakpoints, IDE integration, etc)
>>>> >>> - Integration with simulator
>>>> >>> - More deterministic runtime (anecdotally; python dtests _should_ be deterministic but in practice they prove to be very prone to environmental disruption)
>>>> >>> - Test time visibility to internals of cassandra
>>>> >>>
>>>> >>> Cons:
>>>> >>>
>>>> >>> - The framework is not as mature as the python dtest framework (some functionality missing)
>>>> >>> - Labor and process around revving new releases of the in-jvm dtest API
>>>> >>> - People aren't familiar with it yet and there's a learning curve
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> So my bid here: I personally think we as a project should freeze writing new tests in the python dtest framework and all focus on getting the in-jvm framework robust enough to cover edge-cases that might still be causing new tests to be written in the python framework. This is going to require documentation work from some of the original authors of the in-jvm framework as well as folks currently familiar with it and effort from those of us not yet intimately familiar with the API to get to know it, however I believe the long-term benefits to the project will be well worth it.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> We could institute a pre-commit check that warns on a commit increasing our raw count of python dtests to help provide process-based visibility to this change in direction for the project's testing.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> So: what do we think?
>>>> >>>