Posted to user@cassandra.apache.org by Jonathan Ellis <jb...@gmail.com> on 2015/06/10 16:29:47 UTC

Cassandra 2.2, 3.0, and beyond

As you know, we've split our post-2.1 release into two pieces, with 2.2 to
be released in July (rc1 out Monday
<http://cassandra.apache.org/download/>) and 3.0 in September.

2.2 will include Windows support, commitlog compression
<https://issues.apache.org/jira/browse/CASSANDRA-6809>, JSON support
<https://issues.apache.org/jira/browse/CASSANDRA-7970>, role-based
authorization
<http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra>,
bootstrap-aware leveled compaction
<https://issues.apache.org/jira/browse/CASSANDRA-7460>, and user-defined
functions
<http://christopher-batey.blogspot.com/2015/05/cassandra-aggregates-min-max-avg-group.html>.

3.0 will include a major storage engine rewrite
<https://issues.apache.org/jira/browse/CASSANDRA-8099> and materialized
views <https://issues.apache.org/jira/browse/CASSANDRA-6477>.

We're splitting things up this way because we don't want to block the
features that are already complete while waiting for 8099 (the new storage
engine).  Releasing them now as 2.2 reduces the risk for users (2.2 has a
lot in common with 2.1) and allows us to stabilize that independently of
the upheaval from 8099.

After 3.0, we'll take this even further: we will release 3.x versions
monthly.  Even releases will include both bugfixes and new features; odd
releases will be bugfix-only.  You may have heard this referred to as
"tick-tock" releases, after Intel's policy of changing process and
architecture independently
<http://www.intel.com/content/www/us/en/silicon-innovations/intel-tick-tock-model-general.html>.

The primary goal is to improve release quality.  Our current major "dot
zero" releases require another five or six months to make them stable
enough for production.  This is directly related to how we pile features in
for 9 to 12 months and release all at once.  The interactions between the
new features are complex and not always obvious.  2.1 was no exception,
despite DataStax hiring a full time test engineering team specifically for
Apache Cassandra.

We need to try something different.  Tick-tock releases will dramatically
reduce the number of features in each version, which will necessarily
improve our ability to quickly track down any regressions.  And "pausing"
every other month to focus on bug fixes will help ensure that we don't
accumulate issues faster than we can fix them.

Tick-tock will also prevent situations like the one we are in now with 8099
delaying everything else.  Users will get to test new features almost
immediately.

To get there, we are investing significant effort in making trunk "always
releasable," with the goal that each release, or at least each odd-numbered
bugfix release, should be usable in production.  We’ve extended our
continuous integration server to make it easy for contributors to run tests
against feature branches
<http://www.datastax.com/dev/blog/cassandra-testing-improvements-for-developer-convenience-and-confidence>
before merging to trunk and we’re working on more test infrastructure
<https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0>
and procedures
<https://docs.google.com/document/d/1ptr47UQ56N80jqL_O6AlE67b0STyn_cVp2k5DTv-OMc>
to improve release quality.  You can see how this is coming along in our
May retrospective
<https://docs.google.com/document/d/1GtuYRocdr9luNdwmm8wE84uC5Wr6TvewFbQtqoAFVeU/edit>.

We are also extending our backwards compatibility policy to cover all 3.x
releases: you will be able to upgrade seamlessly from 3.1 to 3.7, for
instance, including cross-version repair.  We will not introduce any extra
upgrade requirements or remove deprecated features until 4.0, no sooner
than a year after 3.0.

Under normal conditions, we will not release 3.x.y stability releases for
x > 0.  That is, we will have a traditional 3.0.y stability series, but the
odd-numbered bugfix-only releases will fill that role for the tick-tock
series -- recognizing that occasionally we will need to be flexible enough
to release an emergency fix in the case of a critical bug or security
vulnerability.

We do recognize that it will take some time for tick-tock releases to
deliver production-level stability, which is why we will continue to
deliver 2.2.y and 3.0.y bugfix releases.  (But if we do demonstrate that
tick-tock can deliver the stability we want, there will be no need for a
4.0.y bugfix series, only 4.x tick-tock.)

After 2.2.0 is released, 2.0 will reach end-of-life as planned.  After
3.0.0 is released, 2.1 will also reach end of life.  This is earlier than
expected, but 2.2 will be very close to as stable as 2.1 and users will be
well served by upgrading.  We will maintain the 2.2 stability series until
4.0 is released, and 3.0 for six months after that.

Thanks for reading this far, and I look forward to hearing how 2.2rc1 works
for you!
-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced
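
As an illustration only, the 3.x numbering rule in the announcement (even
minor versions carry new features plus bugfixes, odd minor versions are
bugfix-only, and only 3.0 gets a traditional 3.0.y stability series) could
be sketched as follows.  The function names release_kind and
has_stability_series are hypothetical helpers invented for this sketch;
they are not part of Cassandra or any of its tooling.

```python
def _major_minor(version: str) -> tuple[int, int]:
    """Split a version string such as '3.7' or '3.0.4' into (major, minor)."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return int(parts[0]), int(parts[1])


def release_kind(version: str) -> str:
    """Classify a 3.x release under the tick-tock rule from the announcement.

    Hypothetical helper: even minor versions include new features and
    bugfixes, odd minor versions are bugfix-only.
    """
    major, minor = _major_minor(version)
    if major != 3:
        raise ValueError("this sketch only covers the 3.x series")
    return "feature" if minor % 2 == 0 else "bugfix-only"


def has_stability_series(version: str) -> bool:
    """True only for 3.0, the one 3.x release with a planned 3.0.y series."""
    major, minor = _major_minor(version)
    return major == 3 and minor == 0


if __name__ == "__main__":
    for v in ("3.0", "3.1", "3.2", "3.7"):
        print(v, release_kind(v), has_stability_series(v))
```

Under this reading, 3.1 and 3.7 are bugfix-only releases, 3.2 is a feature
release, and emergency 3.x.y fixes for x > 0 remain the exception rather
than something the sketch models.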

Re: Cassandra 2.2, 3.0, and beyond

Posted by Tyler Hobbs <ty...@datastax.com>.
On Wed, Jun 10, 2015 at 1:43 PM, <SE...@homedepot.com> wrote:

> With 3.0, what happens to existing Thrift-based tables (with dynamic
> column names, etc.)?


Just like in Cassandra 2.x, they will show up as COMPACT STORAGE tables in
a format that CQL can work with.  We're making a few adjustments to how the
schema is presented in CQL, mostly to better deal with a mixture of defined
and undefined column names (mixed static and dynamic).  That mostly
involves treating defined columns as "static".

However, the storage format for COMPACT STORAGE tables will not be
(significantly) different from normal tables any more.  You can read a few
details about the new storage format here:
https://github.com/pcmanus/cassandra/blob/8099_engine_refactor/guide_8099.md#storage-format-on-disk-and-on-wire


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

RE: Cassandra 2.2, 3.0, and beyond

Posted by SE...@homedepot.com.
With 3.0, what happens to existing Thrift-based tables (with dynamic column names, etc.)?

Sean Durity


