Posted to dev@cassandra.apache.org by Josh McKenzie <jm...@apache.org> on 2022/04/05 18:40:28 UTC

Cassandra project biweekly status update 2022-04-05

A day late; thankfully nothing too earth-shattering happened yesterday that we missed by sending the newsletter out today. :)

[New contributor Getting Started]
Welcome to Cassandra! We have a couple of good places for new contributors to get started: failing tests and starter tickets we label "lhf" (low hanging fruit). Either category is a great place to start learning the codebase, our project-specific processes, and the general open-source and Apache Way.

Here are a couple queries to reference the two categories:

- Unassigned failing tests (69 currently unassigned): https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=496&quickFilter=2252

- Unassigned starter tickets (25 unassigned): https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2162&quickFilter=2160. The 4.0.x swim lane is for our latest stable GA release and the 4.x swim lane covers the upcoming 4.1 release we'll be freezing in May.


[Dev list conversations]
https://lists.apache.org/list?dev@cassandra.apache.org:lte=2w:

We're looking for volunteers to take on the Build Lead role: https://lists.apache.org/thread/fwl1dl8j5n42gphc7tmgtzovymfo1rbc

As Mick pointed out on another email thread, we haven't yet achieved our goal of green ASF infra CI on trunk. The build lead role has been instrumental in mapping our current CI situation and in helping build and maintain momentum on cleaning up CI. However, it seems the load of changes going into trunk has accelerated along with the increased effort on cleaning up CI.

See the build lead confluence wiki for details on the role: https://cwiki.apache.org/confluence/display/CASSANDRA/Build+Lead. You don't need to be a committer or PMC member to help out with this!

It looks like we agree on "fork 4.1 and freeze the branch May 1" (https://lists.apache.org/thread/28slxtw4vn4zxwqndmy8bpb86q3oo8jm). As a reminder, we have agreed to block releases on green CI; today on trunk that means 22 test failures we'll need to clean up before releasing (https://butler.cassandra.apache.org/#/).

Some conversation is going on about the oldest version of Python 3.x to support with our cqlsh and other scripts: https://lists.apache.org/thread/omoo3cvjcrwh5wvqb7ndjqzzhyp17klx. With different platforms supporting different versions of the interpreter, perspectives on this are quite welcome.

Erick reached out this morning with a set of ideas around a guide on how to ask good questions: https://lists.apache.org/thread/fnlzos2v78xmgxhz37xsskpdc30dl95l. The gdoc draft of this guide can be found here: https://docs.google.com/document/d/1-ZYpl9tif9OAMdAxFLxA1mTPNp0zkNW4kzXvfvRBHUc/edit?usp=sharing


[CI Trends]
Butler dashboard: https://butler.cassandra.apache.org/#/
We have about 3 weeks' worth of historical data on our front line dashboard. Here are our trends for CI failures:

trunk:  7  -> 22
4.0:    9  -> 7
3.11:   21 -> 33
3.0:    20 -> 17
Sum:    57 -> 79

While we can expect to see that number of failures drop when we hit our freeze for 4.1 in early May, it's clear we're not yet at a point of "always releasable trunk". Mick brought this up a bit in the freeze email chain. I think it'd be valuable for us to discuss this further once we freeze and are burning down test failures in pursuit of 4.1 GA.
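
For anyone who wants to re-check the trend arithmetic, here is a minimal sketch in Python. The counts are copied by hand from the table above; the `failures` dict and the `summarize` helper are hypothetical names for illustration only and are not part of Butler or any project tooling.

```python
# Failure counts copied by hand from the table above: branch -> (earlier, current).
# This is an illustrative tally, not output from Butler.
failures = {
    "trunk": (7, 22),
    "4.0": (9, 7),
    "3.11": (21, 33),
    "3.0": (20, 17),
}


def summarize(counts):
    """Return per-branch deltas plus the (earlier, current) totals."""
    deltas = {branch: now - before for branch, (before, now) in counts.items()}
    total_before = sum(before for before, _ in counts.values())
    total_now = sum(now for _, now in counts.values())
    return deltas, (total_before, total_now)


deltas, totals = summarize(failures)
for branch, delta in deltas.items():
    print(f"{branch:<6} {delta:+d}")
print(f"Sum    {totals[0]} -> {totals[1]} ({totals[1] - totals[0]:+d})")
```

The totals match the Sum row (57 -> 79), and the per-branch deltas show the increase coming from trunk and 3.11, while 4.0 and 3.0 improved slightly.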


[Release progress]
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2175

4.0.4:
6 issues closed out in the past 2 weeks. One big highlight here is CASSANDRA-17467 with a fix for a 3.11 -> 4.0 regression to timestamp formatting and parsing. Another big one is CASSANDRA-17466, where failed repair SyncTasks, through a chain of castings and exception handling, can end up causing FSReadErrors and node-down situations. Both are among the larger-impact bugfixes we've seen since 4.0.0 was released.

4.1.0:
We've closed out 18 issues in 4.1.0 over the past two weeks. CASSANDRA-15399 added the ability to track repair state via virtual tables. In CASSANDRA-17017, nodetool verify was guarded behind a newly required -f / --force option, something we've talked about off and on over the years. CASSANDRA-17150 takes care of a regression introduced in CASSANDRA-16927 (isolated to the unreleased 4.1 / trunk branch) where streaming sessions that took longer than 3 minutes (i.e. didn't have messages passed on control channels for 3 minutes) would time out and terminate - it's great to see things like this caught before release! There are too many closed issues to list them all here; see the link above for details.

4 weeks to go until freeze; let's see if we can keep our trunk CI failures to <= the current snapshot of 22 failures so the lag between branch and release, at least from a CI perspective, is contained.

Thanks for the hard work everyone!

~Josh