Posted to dev@cassandra.apache.org by John Sanda <jo...@gmail.com> on 2020/09/10 03:10:02 UTC

[DISCUSS] Next steps for Kubernetes operator SIG

Hey everyone,

A while back I started https://github.com/jsanda/cassandra-operator in an
effort to move things forward. One of my primary goals was to get some
people contributing. That did not happen, which is understandable. I am
going to throw out some questions and would love to get feedback,
particularly from people who have been participating in the SIG and/or are
involved with relevant projects.

* Should we continue down the path of trying to build a common operator
project? If yes, how should we proceed?

* Should we broaden the focus to using and running Cassandra in Kubernetes
in general? CEP 2 Kubernetes Operator
<https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-2+Kubernetes+Operator>
says
that its motivation is to make it easy to run Cassandra on Kubernetes.
Having an operator is definitely a big part of that, but it is not the only
part. There are important areas like application development, data
migration, multi-region / multi-cloud clusters, and tooling integration to
name a few. I think that the community benefits from collaboration
regardless of whether or not there is a common operator. That is not to
suggest that a common operator would not be good. I do not necessarily see
it as a zero sum game where it has to be a common operator or nothing.

Thanks

- John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Patrick McFadin <pm...@gmail.com>.

No problem Franck! I will postpone this week's meeting to next week and we
can continue the discussion on the ML.

Patrick

On Thu, Sep 24, 2020 at 9:23 AM <fr...@orange.com> wrote:

> I can share Orange’s view of the situation, sorry it is a long story!
>
> We started CassKop at the end of 2018 after betting on K8S, which was not
> so simple as far as C* was concerned:
> lack of support for local storage, IPs that change all the time, different
> network plugins to try to implement a non-standard K8s way of having nodes
> see each other from different DCs…
> We hesitated between K8S and Mesos but could not have both, and K8S was
> already gaining so much traction that you could not avoid choosing it.
>
> Anyway, we looked around and did not see anyone with such requirements so
> we said: why not try it ourselves but on github so that we may give it back
> to the community.
> We have used C* for quite a few years with great success on production
> with massive load and perfect availability. We love C* @ Orange :) Thanks!
>
> So we started writing support for mono-dc cluster (CassKop) and added the
> multi dc support with MultiCassKop which is another operator included in
> the CassKop repo.
> For more details we tried to document our designs as much as possible
> here:
> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
>
> In the middle of last year we had some talks with Datastax about working
> together around their new management sidecar. Their position on open source
> was not clear at that time so we said please come back when you have
> decided to go open source with it.
> Which they did in the beginning of this year. But at that time I guess
> work had started on cass-operator so we kept our separate ways.
>
> Since the beginning of the year, we have been working with our OPS team
> to have it in production. It is not simple as the team has to learn K8S and
> trust a newborn operator.
> This takes time especially as our internal cluster has been tweaked for
> multi-tenancy with obscure options being set by our K8s team…
>
> We also developed the Backup & Restore functionality with Instaclustr (we
> have new CRDs (Custom Resource Definitions) for backup and restore and a
> reconcile loop that calls out to the Instaclustr sidecar for these operations).
> We now support multiple backups in parallel and can write to S3, Google or
> Azure (but Stefan could give more details here if needed).
>
> During the SIG calls we mentioned our desire to donate CassKop once it
> satisfies our basic requirements (v1 coming just now but I said it too
> many times already).
> I am actually not sure Datastax mentioned their desire to donate
> cass-operator but we decided to compare the designs and the functionalities
> based on respective CRDs.
> The CRD is the interface with the user as it is where you describe the
> cluster that you want to have.
> These talks were very interesting and we found out that the CassKop team
> had made good choices most of the time but was maybe too open.
> Indeed our intention was to give our OPS team all the possibilities they
> need to work.
> This includes:
> - very open topology definition using any configuration of labels to map
> dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> / rows and server racks so we can map C* racks to storage or network arrays
> internally)
> - possibility to have multiple C* nodes on a single K8S host (because
> internal clouds are not really clouds, they have limited resources)
> - custom C* image selection,
> - custom bootstrap script that lets you configure C* as you want using
> ConfigMaps,
> - the ability to mount different volumes wherever needed,
> - the possibility to run any number of sidecars alongside C* (for custom
> probes in our case)
>
> This makes CassKop quite powerful and flexible.
> We made sure that all those options are not enabled by default so one can
> just pop a simple 3 node cluster quickly
>
> On the other hand cass-operator had an interesting way of configuring C*
> just inside the CRD using cass-config. This is simple and elegant so we are
> implementing it as well for the support of C* 4
>
> Now for the future, there are 3 choices in my opinion:
> - start from scratch (or John’s repo) by cherry picking bits from all
> operators. This is possible but will take some time / effort to have
> something usable. And then it will be compared to cass-operator and CassKop.
> I don’t see Orange contributing too much here as we believe CassKop to be
> a much better starting point
> - choose cass-operator: it is not on offer right now so let’s see if it
> is offered. I think Orange could contribute some bits inherited from CassKop if
> it is agreed by the community. Not sure it would be enough for us to use it.
> - choose CassKop: we would be delighted to donate it and contribute with
> some committers (including the original author who now works for AWS). It
> would then become the community operator but there would be cass-operator
> alongside probably.
> But cass-operator is made to make it easier for Datastax to manage
> customer clusters by imposing some configuration. It makes sense for their
> needs, so maybe 2 operators. We don’t know how backup/restore will be
> handled here with Medusa being adapted to K8s.
>
> Sorry again for being long but 2 years of work deserve some lines of text
> :)
>
> I just saw your message Patrick but this was written already so we gain a
> week.
>
> Franck
>
> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com> wrote:
>
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus?  It
> doesn't seem like it can hurt at this point, anyway.
>
>
> +1
>
>
>
> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.org>
> wrote:
>
> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus?  It
> doesn't seem like it can hurt at this point, anyway.
>
>
> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
>
>    I want to point out that pretty much everything being discussed in this
>    thread has been discussed at length during the SIG meetings. I think it is
>    worth noting because we are pretty much still having the same conversation.
>
>    On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <benedict@apache.org>
>    wrote:
>
> I don't think there's anything about a code drop that's not "The Apache
> Way".
>
> If there's a consensus (or even strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" <stefan.miklosovic@instaclustr.com>
> wrote:
>
>    I think that from Instaclustr it was stated quite clearly multiple
>    times that we are "fine to throw it away" if there is something better
>    and more widespread. Indeed, we have invested a lot of time in the
>    operator but it was not useless at all, we gained a lot of quite
>    unique knowledge of how to put all the pieces together. However, I think
>    that this space is going to be quite fragmented and "balkanized", which is
>    not always a bad thing, but in an area as narrow as a Kubernetes
>    operator, I just do not see how 4 operators are going to be
>    beneficial for ordinary people (the "official" one from the community,
>    ours, the Datastax one and CassKop, in no particular order). Sure,
>    innovation and healthy competition are important, but to what extent ...
>    There are only so many different ways to start a Cassandra cluster on
>    Kubernetes, and nobody really likes vendor lock-in. People wanting
>    to run a cluster on K8S realise that there are three operators, each
>    backed by a private business entity, and the community operator is not
>    there ... Huh, interesting ... One may even start to question what is
>    wrong with these folks that it takes three companies to build their
>    own solution.
>
>    Having said that, to my perception, the Cassandra community just does not
>    have enough engineers or contributors to keep 4 operators alive at
>    the same time (I wish I was wrong), so the idea of selecting the best
>    one or merging obvious things and approaches together is
>    understandable, even if it meant we eventually sunset ours. In
>    addition, nobody from the big players is going to contribute to the code
>    base of the other one, for obvious reasons, so channeling and
>    directing this effort into something common for the community seems to
>    be the only reasonable way of cooperating.
>
>    It is quite hard to bootstrap this if the donation of the code in big
>    chunks / a whole repo is out of the question as it is not the "Apache way"
>    (there was a thread here about this in more depth a while
>    ago) and we basically need to start from scratch, which is quite
>    demotivating: we are just reinventing the wheel and nobody is up to it.
>    It is like people are waiting for that to happen so they can jump in
>    "once it is the thing", but it will never materialise, or at least the
>    hurdle to kick it off is unnecessarily high. Nobody is going to invest
>    in this heavily if there is already a working operator from the companies
>    mentioned above. As I understood it, one reason for not choosing the
>    way of donating it all is that "the learning and community building
>    should happen in an organic manner and we just can not accept the
>    donation", but isn't it true that it is easier to build a community
>    around something which is already there rather than trying to build it
>    around an idea which is quite hard to dedicate oneself to?
>
>    On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org>
>    wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach,
>
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
>
>
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach, earlier than later. This is an opportunity to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I think it
> would be a real shame to waste this opportunity. No doubt it will be hard,
> as organisations have certain built-in investments in their own approaches.
>
> I haven't participated in these calls as I do not consider myself to have
> the relevant experience and expertise, and have other focuses on the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible. Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com> wrote:
>
> I think there is certainly an appetite to donate and standardise on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as for a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found here if you are
> interested in some primary sources:
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>
> From what I understand second-hand from talking to people on the SIG
> calls, there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking the best of breed from the
> various operators and combining them. Seems to leave us in the "let a
> thousand flowers bloom" stage of letting operators grow in the ecosystem and
> seeing which ones meet the needs of end users before talking about adopting
> one into the foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>
>
> --
>
>    - John
>
>
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and
> delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been
> modified, changed or falsified.
> Thank you.
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Patrick McFadin <pm...@gmail.com>.

Hi everyone,

Just a reminder. Our meeting at the top of the hour!

https://datastax.zoom.us/j/390839037

Patrick

On Fri, Oct 16, 2020 at 8:46 AM Joshua McKenzie <jm...@apache.org>
wrote:

> I'm a big fan of github milestones for exactly this kind of work. Just as
> much process as strictly required to logically group things and align on
> scope and nothing more.
>
>
> On Fri, Oct 16, 2020 at 9:24 AM, <fr...@orange.com> wrote:
>
> > Hi all, sorry for the delay we were busy releasing V1 of CassKop which
> > happened this week: https://github.com/Orange-OpenSource/casskop
> >
> > Now we would like to open the discussions about feature merging and I
> > would like to follow Joshua’s proposition: open issues on cass-operator.
> >
> > As I said, we discuss first and if we find a way forward then we do it!
> >
> > We will open the issues next week and move forward.
> >
> > We can discuss them during the upcoming SIG calls, starting Thursday.
> >
> > Franck
> >
> > On 6 Oct 2020, at 08:02, Ben Bromhead <ben@instaclustr.com> wrote:
> >
> > Thanks Franck and Christopher.
> >
> > Sounds like we have a good path to consolidate around!
> >
> > On Sat, Oct 3, 2020 at 11:12 AM Joshua McKenzie <jmckenzie@apache.org>
> > wrote:
> >
> > how to best merge Casskop's features in Cass-operator.
> >
> > What if we create issues on the gh repo here
> > https://github.com/datastax/cass-operator/issues, create a milestone out of
> > that, and have engineers rally on it to get things merged? We have a few
> > engineers focused on k8s ecosystem for Cassandra from the DataStax side
> > who'd be happy to collaborate with you folks to get these things in.
> >
> > On Fri, Oct 02, 2020 at 11:34 AM, <franck.dehay@orange.com> wrote:
> >
> > An update on Orange's point of view following the recent emails:
> >
> > If we were a newly interested party in running C* in K8s, we would use
> > Cass-operator as it comes from Datastax.
> >
> > The logic would then be that the community embraces it and thanks Datastax
> > for offering it!
> >
> > So, on Orange side, we propose to discuss with Datastax how to best merge
> > Casskop's features in Cass-operator. These features are:
> > - node labelling to map any internal architecture (including network
> > specific labels for multi-dc setups)
> > - volumes & sidecars management (possibly linked to PodTemplateSpec)
> > - backup & restore (we ruled out velero and can share why we went with
> > Instaclustr but Medusa could work too)
> > - kubectl plugin integration (quite useful on the ops side without an
> > admin UI)
> > - multiCassKop evolution to drive multiple cass-operators instead of
> > multiple casskops (this could remain Orange internal if too specific)
> >
> > We could decide at the end of these discussions the best way forward.
> > Orange could make PRs on cass-operator, but only if we agree we want the
> > functionalities :)
> >
> > If we can sort it out we could end up with a pretty neat operator.
> >
> > We share a common architecture (operator-sdk) and are starting to know each
> > other through all these meetings, so it should be possible if we want to!
> >
> > Would that be ok for the community and Datastax?
> >
> > On 2 Oct 2020, at 14:52, Joshua McKenzie <jmckenzie@apache.org> wrote:
> >
> > What are next steps here?
> >
> > Maybe we collectively put a table together w/the 2 operators and a list of
> > features to compare and contrast? Enumerate the frameworks / dependencies
> > they have to help form a point of view about the strengths and weaknesses
> > of each option?
> >
> > On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford <bradfordcp@gmail.com>
> > wrote:
> >
> > Hello Dev list,
> >
> > I'm Chris Bradford, a Product Manager at DataStax working with the
> > cass-operator team. For background, we started down the path of developing
> > an operator internally to power our C*aaS platform, Astra. Care was taken
> > from day 1 to keep anything specific to this product at a layer above
> > cass-operator so it could solely focus on the task of operating Cassandra
> > clusters. With that being said, every single cluster on Astra is
> > provisioned and operated by cass-operator. The value of an advanced
> > operator to Cassandra users is tremendous so we decided to open source the
> > project (and associated components) with the goal of building a community.
> > It absolutely makes sense to offer this project and codebase up for
> > donation as a standard / baseline for running C* on Kubernetes.
> >
> > Below you will find a collection of cass-operator features,
> > differentiators, and roadmap / inflight initiatives.
> >
> > Table-stakes
> >
> > Must-have functionality for a C* operator:
> >
> > - Datacenter provisioning
> >   - Schedule all pods
> >   - Bootstrap nodes in the appropriate order
> >     - Seeds
> >     - Across racks
> >     - etc.
> >   - Uniform configuration
> > - Scale-up
> >   - Add new nodes in a balanced manner across racks
> > - Scale-down
> >   - Remove nodes one at a time across racks
> > - Node recovery
> >   - Restart process
> >   - Reschedule instance (i.e. replace node)
> >   - Replace instance
> >   - Specific workflows for seed node replacements
> > - Multi-DC / Multi-Rack
> > - Multi-Region / Multi-K8s Cluster
> >   - Note this requires support at a networking layer for pod to pod IP
> >     connectivity. This may be accomplished within the cluster with CNIs like
> >     Cilium or externally via traditional networking tools.
> > Differentiators
> >
> > - OSS Ecosystem / Components
> >   - Cass Config Builder - OSS project extracted from DataStax OpsCenter Life
> >     Cycle Manager to provide automated configuration file rendering
> >   - Cass Config Definitions - definitions files for cass-config-builder,
> >     defines all configuration files, their parameters, and templates
> >   - Management API for Apache Cassandra (MAAC)
> >   - Metrics Collector for Apache Cassandra (MCAC)
> >   - Reference Prometheus Operator CRDs
> >     - ServiceMonitor
> >     - Instance
> >   - Reference Grafana Operator CRDs
> >     - Instance
> >     - Dashboards
> >     - Datasource
> > - PodTemplateSpec
> >   - Customization of existing pods including support for adding containers,
> >     volumes, etc
> > - Advanced Networking
> >   - Node Port
> >   - Host Network
> > - Simple security
> >   - Management API mTLS support
> >   - Automated generation of keystore and truststore for internode and client
> >     to node TLS
> >   - Automated superuser account configuration
> >     - The default superuser (cassandra/cassandra) is disabled and never
> >       available to clients
> >     - Cluster administration account may be generated automatically (or
> >       provided), with values stored in a k8s secret
> > - Automatic application of NetworkTopologyStrategy with appropriate RF for
> >   system keyspaces
> > - Validating webhook
> >   - Invalid changes are rejected with a helpful message
> > - Rolling cluster updates
> >   - Change in binary (C* upgrade)
> >   - Change in configuration
> >   - Canary deployments - single rack application of changes for validation
> >     before broader deployment
> >   - Rolling restart
> > - Platform Integration / Testing / Certification
> >   - Red Hat Openshift compatible and certified
> >     - Secure, Universal Base Image (UBI) foundation images with security
> >       scanning performed by Red Hat
> >       - cass-operator
> >       - cass-config-builder
> >       - apache-cassandra w/ MCAC and MAAC
> >     - Integration with Red Hat certification pipeline / marketplace
> >     - Presence in Red Hat Operator Hub built into OpenShift interface
> >   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
> >     - Security scanning for images performed by VMware
> >   - Amazon EKS
> >   - Google GKE
> >   - Azure AKS
> > - Documentation / Reference Implementations
> >   - Cloud storage classes
> >   - Ingress solutions
> >   - Sample connection validation application with reference implementations
> >     of Java Driver client connection parameters
> > - Cluster-level Stop / Resume - stop all running instances while keeping
> >   persistent storage. Allows for scaling compute down to zero. Bringing the
> >   cluster back up follows expected startup procedures
> >
> > Road Map / Inflight
> >
> > 1. Repair
> >    1. Reaper integration
> > 2. Backups
> >    1. Velero integration
> >    2. Medusa integration
> > 3. Advanced Networking via sidecar
> >    1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
> >       addresses despite Kubernetes' best efforts to shuffle them.
> > 4. Single pod canary deployments
> > 5. Platform Certification
> >    1. VMware Project Pacific
> >    2. Rancher Kubernetes Engine (K3s)
> > 6. Documentation
> >    1. Multi-region
> >    2. Multi-cloud
> >    3. Additional ingress providers
> >       1. Voyager
> >       2. HAProxy
> >       3. Gloo
> >       4. Ambassador
> >       5. Envoy
> >       6. NGINX Ingress Controller
> >    4. Additional storage class references
> >       1. OpenEBS
> > 7. Cassandra Enhancements
> >    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
> >       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
> >
> > If there are further questions about the project, codebase, architecture,
> > etc. the team would be happy to dive in to the details and discuss more.
> >
> > Cheers,
> > ~Chris
> >
> > Christopher Bradford
> >
> > On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pmcfadin@gmail.com>
> > wrote:
> >
> > I can agree with that Ben. Franck did a good job of outlining CassKop.
> > Somebody from the cass-operator will be posting something similar and we
> > can keep it on the mailing list.
> >
> > Patrick
> >
> > On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <ben@instaclustr.com> wrote:
> >
> > Thanks Franck and Stefan.
> >
> > @Patrick great suggestion and worthwhile getting everything on the table.
> >
> > One minor change I would advocate for. The SIG has been great to iterate
> > and interact on the details, but I really think this conversation, given the
> > nature of the content, needs to be on the mailing list. The mailing list is
> > really our system of record and the most accessible.
> >
> > It gives folk time to think and digest, it's asynchronous, easily
> > searchable and let's be honest, the majority of stakeholders in this are
> > not US based, so the timing issue then goes away and makes it easier for
> > people to participate in. I feel like we've made a lot more progress by
> > simply having this discussion here.
> >
> > So instead of a presentation, maybe just an email to the ML addressing the
> > headings that Patrick identified?
> >
> > On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic
> > <stefan.miklosovic@instaclustr.com> wrote:
> >
> > Hi,
> >
> > Patrick's suggestion seems good to me.
> >
> > I won't go into specifics here as I need to genuinely prepare for this. It
> > is quite hard to dig deep into the solutions of others and bring some
> > constructive criticism because it takes a lot of time to study them, and
> > everybody has some "why's" behind their choices.
> >
> > To summarize my goals and concerns:
> >
> > 1) We should be as "Kubernetes operator idiomatic" as possible: industry
> > standards, no custom brain-child of this or that group because they think
> > it is just cool or they just didn't know any better. I do NOT say it is
> > like that right now, I just want to be as ruthless as possible when it
> > comes to functionality and why it is done like that. It is awesome that we
> > already have something recent (thanks to John) that adheres to the latest
> > releases. I personally had a hard time keeping up with all the releases:
> > once I finished something and aligned it, a week or two later there was
> > already another one where things were different. It is a very fast-moving
> > space and I hope that by the time we develop something it will not be
> > obsolete.
> >
> > 2) It may be easier said than done, but it is guaranteed that people get
> > emotional, it's their precious etc., so please let's go into this with good
> > intentions, not trying to push one solution over the other just because
> > they would like to see it there ... I will have an equally hard time
> > complying with this point. My plan is to explain what is _wrong_ with our
> > solution: where we made mistakes and what should be done differently but it
> > is "too late" etc. It is quite hard to describe your work and all that
> > effort in this light, but without saying what is wrong we cannot decide
> > what is good, imho.
> >
> > 3) We should put something together fast enough that we can call it a
> > release. We can always iterate on it for eternity, but the foundations need
> > to be there. Here I want to say that I especially like what John did. I
> > looked through these specs and it was obvious they had been written with
> > care and attention. It looked _solid_. I am not sure how hard it is to put
> > all the other things on top of that, I truly am not, and here I think we
> > would have to reinvent the wheel if we want to proceed, because I cannot
> > imagine what it would take to retrofit e.g. CassKop on top of John's specs.
> > It is just like putting round pegs into square holes; maybe some chunks
> > would be reused easily, but otherwise I worry we will just be back at
> > square one.
> >
> > One specific feeling I have as I read this is that even if there is the
> > will to create the fourth operator, the respective parties will not be able
> > to drop their own repository. The whole point behind this effort, to me, is
> > to have a solid, community-driven, stable, modern and feature-complete
> > operator people are truly using. I can see that once this is real, we will
> > _really_ sunset our operator, redirecting people to the new operator in the
> > main readme doc etc. We truly mean it. Sure, if somebody comes and a bug
> > fix is needed, we will fix it, but the whole point of doing this is to stop
> > using what we have currently, over time; otherwise we are just splitting
> > this space even more. If CassKop is not sure whether they will use it
> > because they do not know if that operator will be "enough" for them, aren't
> > we just doing it wrong? If I exaggerate, they should be fine with deleting
> > the whole repository and using just this Cassandra one we are going to
> > make; otherwise I don't see the point of working on this ...
> >
> > On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jmckenzie@apache.org>
> > wrote:
> >
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does
> >
> > We should all talk a lot more, but this is 100% a mistake - I take the
> > blame for that. The intention has long been to offer cass-operator for
> > donation but it slipped through the cracks and your email yesterday made me
> > double-take.
> >
> > We have since resolved this misalignment. DataStax would be happy to donate
> > any and all of cass-operator to the ASF and C* project if it's what we all
> > agree best serves our collective Cassandra users. I'm also cognizant that
> > an immense amount of effort has gone into CassKop and we seem to have
> > something of an embarrassment of riches.
> >
> > I'm given to understand (haven't dug in personally) that the two operators
> > express pretty different opinions when it comes to frameworks, designs,
> > supported versions, etc. I think a discrete enumeration of the feature set
> > and "identities" of both could really help navigate this conversation going
> > forward.
> >
> > Also - thanks for that context Franck. It's always helpful to know where
> > other people are coming from when we're all working together towards a
> > common goal.
> >
> > On Thu, Sep 24, 2020 at 12:23 PM, <franck.dehay@orange.com> wrote:
> >
> > I can share Orange’s view of the situation, sorry it is a long story!
> >
> > We started CassKop at the end of 2018 after betting on K8S, which was not
> > so simple as far as C* was concerned: lack of support for local storage,
> > IPs that change all the time, different network plugins to try to implement
> > a non-standard K8s way of having nodes see each other from different DCs…
> >
> > We hesitated with Mesos but could not have both, and K8S was already
> > gaining so much traction that you could not not choose it.
> >
> > Anyway, we looked around and did not see anyone with such requirements, so
> > we said: why not try it ourselves, but on GitHub, so that we may give it
> > back to the community. We have used C* for quite a few years with great
> > success in production, with massive load and perfect availability. We love
> > C* @ Orange :) Thanks!
> >
> > So we started writing support for mono-DC clusters (CassKop) and added
> > multi-DC support with MultiCassKop, which is another operator included in
> > the CassKop repo. For more details, we tried to document our designs as
> > much as possible here:
> >
> > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> >
> > In the middle of last year we had some talks with Datastax about working
> > together around their new management sidecar. Their position on open
> > source was not clear at that time, so we said please come back when you
> > have decided to go open source with it. Which they did at the beginning of
> > this year. But at that time I guess work had started on cass-operator, so
> > we kept going our separate ways.
> >
> > Since the beginning of the year, we have been working with our OPS team
> > to have it in production. It is not simple, as the team has to learn K8S
> > and trust a newborn operator. This takes time, especially as our internal
> > cluster has been tweaked for multi-tenancy with obscure options being set
> > by our K8s team…
> >
> > We also developed the Backup & Restore functionality with Instaclustr (we
> > have new CRDs (Custom Resource Definitions) for backup and restore and a
> > reconcile loop that calls out to the Instaclustr sidecar for these
> > operations). We now support multiple backups in parallel and can write to
> > S3, Google or Azure (but Stefan could give more details here if needed).
> >
> > During the SIG calls we mentioned our desire to donate CassKop once it
> > satisfies our basic requirements (v1 coming just now, but I said it too
> > many times already). I am actually not sure Datastax mentioned their desire
> > to donate cass-operator, but we decided to compare the designs and the
> > functionalities based on the respective CRDs. The CRD is the interface with
> > the user, as it is where you describe the cluster that you want to have.
> > These talks were very interesting and we found out that the CassKop team
> > had made good choices most of the time but was maybe too open. Indeed, our
> > intention was to give our OPS team all the possibilities they needed to
> > work. This includes:
> >
> > - very open topology definition using any configuration of labels to map
> > dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> > / rows and server racks so we can map C* racks to storage or network
> > arrays internally)
> > - possibility to have multiple C* nodes on a single K8S host (because
> > internal clouds are not really clouds, they have limited resources)
> > - custom C* image selection,
> > - custom bootstrap script that lets you configure C* as you want using
> > ConfigMaps,
> > - the ability to mount different volumes wherever they wanted,
> > - the possibility to run any number of sidecars alongside C* for custom
> > probes in our case
> >
> > This makes CassKop quite powerful and flexible.
> > We made sure that none of those options is enabled by default, so one can
> > quickly pop up a simple 3-node cluster.
> >
> > On the other hand, cass-operator had an interesting way of configuring C*
> > just inside the CRD using cass-config. This is simple and elegant, so we
> > are implementing it as well for the support of C* 4.
> >
> > Now for the future, there are 3 choices in my opinion:
> > - start from scratch (or John’s repo) by cherry picking bits from all
> > operators. This is possible but will take some time / effort to have
> > something usable. And then it will be compared to cass-operator and
> > CassKop. I don’t see Orange contributing too much here as we believe
> > CassKop to be a much better starting point
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does. I think Orange could contribute some bits inherited from CassKop if
> > it is agreed by the community. Not sure it would be enough for us to use
> > it.
> > - choose CassKop: we would be delighted to donate it and contribute with
> > some committers (including the original author who now works for AWS). It
> > would then become the community operator but there would probably be
> > cass-operator alongside. But cass-operator is made to make it easier for
> > Datastax to manage customer clusters by imposing some configuration. It
> > makes sense for their needs, so maybe 2 operators. We don’t know how
> > backup/restore will be handled here with Medusa being adapted to K8s
> >
> > Sorry again for being long but 2 years of work deserve some lines of text
> > :)
> >
> > I just saw your message Patrick but this was written already so we gain a
> > week.
> >
> > Franck
> >
> > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com>
> > wrote:
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > +1
> >
> > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > Perhaps it helps to widen the field of discussion to the dev list?
> >
> > It might help if each of the stakeholder organisations state their view on
> > the situation, including why they would or would not support a given
> > approach/operator, and what (preferably specific) circumstances might lead
> > them to change their mind?
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
> >
> > I want to point out that pretty much everything being discussed in this
> > thread has been discussed at length during the SIG meetings. I think it is
> > worth noting because we are pretty much still having the same conversation.
> >
> > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > I don't think there's anything about a code drop that's not "The Apache
> > Way". If there's a consensus (or even strong majority) amongst invested
> > parties, I don't see why we could not adopt an operator directly into the
> > project.
> >
> > It's possible a green field approach might lead to fewer hard feelings, as
> > everyone is in the same boat. Perhaps all operators are also suboptimal and
> > could be improved with a rewrite? But I think coordinating a lot of
> > different entities around an empty codebase is particularly challenging. I
> > actually think it could be better for cohesion and collaboration to have a
> > suboptimal but substantive starting point.
> >
> > On 23/09/2020, 16:11, "Stefan Miklosovic"
> > <stefan.miklosovic@instaclustr.com> wrote:
> >
> > I think that from Instaclustr it was stated quite clearly multiple times
> > that we are "fine to throw it away" if there is something better and more
> > wide-spread. Indeed, we have invested a lot of time in the operator, but it
> > was not useless at all; we gained a lot of quite unique knowledge about how
> > to put all the pieces together. However, I think that this space is going
> > to be quite fragmented and "balkanized", which is not always a bad thing,
> > but in an area as narrow as Kubernetes operators, I just do not see how 4
> > operators are going to be beneficial for ordinary people ("official" from
> > the community, ours, the Datastax one and CassKop, in no particular
> > order). Sure, innovation and healthy competition are important, but to
> > what extent ...
> > One can start a Cassandra cluster on Kubernetes in only so many different
> > ways, and nobody really likes vendor lock-in. People wanting to run a
> > cluster on K8S realise that there are three operators, each backed by a
> > private business entity, and the community operator is not there ... Huh,
> > interesting ... One may even start to question what is wrong with these
> > folks that it takes three companies to each build their own solution.
> >
> > Having said that, to my perception, the Cassandra community just does not
> > have enough engineers or contributors to keep 4 operators alive at the
> > same time (I wish I was wrong), so the idea of selecting the best one, or
> > of merging the obvious things and approaches together, is understandable,
> > even if it meant we eventually sunset ours. In addition, nobody from the
> > big players is going to contribute to the code base of the other one, for
> > obvious reasons, so channeling and directing this effort into something
> > common for the community seems to be the only reasonable way of
> > cooperating.
> >
> > It is quite hard to bootstrap this if the donation of the code in big
> > chunks / as a whole repo is out of the question because it is not the
> > "Apache way" (there was a thread here about this in more depth a while
> > ago) and we basically need to start from scratch, which is quite
> > demotivating; we are just reinventing the wheel and nobody is up for it.
> > It is like people are waiting for that to happen so they can jump in "once
> > it is the thing", but it will never materialise, or at least the hurdle to
> > kick it off is unnecessarily high. Nobody is going to invest in this
> > heavily if there is already a working operator from the companies
> > mentioned above. As I understood it, one reason for not choosing the way
> > of donating it all is that "the learning and community building should
> > happen in an organic manner and we just cannot accept the donation", but
> > isn't it true that it is easier to build a community around something
> > which is already there rather than trying to build it around an idea which
> > is quite hard to dedicate oneself to?
> >
> > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org>
> > wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach,
> >
> > I agree. Unfortunately in this case, the parties with a vested interest and
> > written operators came to the table and couldn't agree to coalesce on a
> > single approach. John Sanda attempted to start an initiative to write a
> > best-of-breed combining choice parts of each operator, but that effort did
> > not gain traction.
> >
> > Which is where my hypothesis comes from that if there were a clear "better
> > fit" operator to start from we wouldn't be in a deadlock; the correct
> > choice would be obvious. Reasonably so, every engineer that's written
> > something is going to want that something to be used and not thrown away in
> > favor of another something without strong evidence as to why that's the
> > better choice.
> >
> > As far as I know, nobody has made a clear case as to a more compelling
> > place to start in terms of an operator donation the project then
> > collaborates on. There's no mass adoption evidence nor feature enumeration
> > that I know of for any of the approaches anyone's taken, so the discussions
> > remain stalled.
> >
> > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach, earlier than later. This is an opportunity to expand
> > the number of active organisations involved directly in the Apache
> > Cassandra project, as well as to more quickly expand the project's
> > functionality into an area we consider urgent and important. I think it
> > would be a real shame to waste this opportunity. No doubt it will be hard,
> > as organisations have certain built-in investments in their own approaches.
> >
> > I haven't participated in these calls as I do not consider myself to have
> > the relevant experience and expertise, and have other focuses on the
> > project. I just wanted to voice a vote in favour of trying to bring the
> > different organisations together on a single approach if possible. Is there
> > anything the project can do to help this happen?
> >
> > On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com> wrote:
> >
> > I think there is certainly an appetite to donate and standardise on a
> > given operator (as mentioned in this thread).
> >
> > I personally found the SIG hard to participate in due to time zones and
> > the synchronous nature of it.
> >
> > So while it was a great forum to dive into certain details for a subset of
> > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> > reflection of community intent.
> >
> > I don't think that any participants want to continue down the path of "let
> > a thousand flowers bloom". That's why we are looking towards CassKop (as
> > well as for a number of technical reasons).
> >
> > Some of the recorded meetings and outputs can also be found if you are
> > interested in some primary sources:
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> >
> > From what I understand second-hand from talking to people on the SIG calls,
> > there was a general inability to agree on an existing operator as a
> > starting point and not much engagement on taking best of breed from the
> > various to combine them. Seems to leave us in the "let a thousand flowers
> > bloom" stage of letting operators grow in the ecosystem and seeing which
> > ones meet the needs of end users before talking about adopting one into the
> > foundation.
> >
> > Great to hear that you folks are joining forces though! Bodes well for C*
> > users that are wanting to run things on k8s.
> >
> > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com> wrote:
> >
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > For additional commands, e-mail: dev-help@cassandra.apache.org
> >
> > --
> >
> > - John
> >
> >
> > _________________________________________________________________________________________________________________________
> >
> > This message and its attachments may contain confidential or privileged
> > information that may be protected by law; they should not be distributed,
> > used or copied without authorisation. If you have received this email in
> > error, please notify the sender and delete this message and its
> > attachments. As emails may be altered, Orange is not liable for messages
> > that have been modified, changed or falsified. Thank you.
> >
> >
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

I'm a big fan of github milestones for exactly this kind of work. Just as
much process as strictly required to logically group things and align on
scope and nothing more.


On Fri, Oct 16, 2020 at 9:24 AM, <fr...@orange.com> wrote:

> Hi all, sorry for the delay we were busy releasing V1 of CassKop which
> happened this week: https://github.com/Orange-OpenSource/casskop
>
> Now we would like to open the discussions about feature merging and I
> would like to follow Joshua’s proposition: open issues on cass-operator.
>
> As I said, we discuss first and if we find a way forward then we do it!
>
> We will open the issues next week and move forward.
>
> We can discuss them during the next SIG calls, from Thursday onwards.
>
> Franck
>
> On 6 Oct 2020, at 08:02, Ben Bromhead <ben@instaclustr.com> wrote:
>
> Thanks Franck and Christopher.
>
> Sounds like we have a good path to consolidate around!
>
> On Sat, Oct 3, 2020 at 11:12 AM Joshua McKenzie <jmckenzie@apache.org>
> wrote:
>
> how to best merge Casskop's features in Cass-operator.
>
> What if we create issues on the gh repo here
> https://github.com/datastax/cass-operator/issues, create a milestone out of
> that, and have engineers rally on it to get things merged? We have a few
> engineers focused on k8s ecosystem for Cassandra from the DataStax side
> who'd be happy to collaborate with you folks to get these things in.
>
> On Fri, Oct 02, 2020 at 11:34 AM, <franck.dehay@orange.com> wrote:
>
> An update on Orange's point of view following the recent emails:
>
> If we were a newly interested party in running C* in K8s, we would use
> Cass-operator as it comes from Datastax.
>
> The logic would then be that the community embraces it and thanks Datastax
> for offering it!
>
> So, on Orange side, we propose to discuss with Datastax how to best merge
> Casskop's features in Cass-operator. These features are:
> - node labelling to map any internal architecture (including network
> specific labels for multi-DC setups)
> - volumes & sidecars management (possibly linked to PodTemplateSpec)
> - backup & restore (we ruled out velero and can share why we went with
> Instaclustr but Medusa could work too)
> - kubectl plugin integration (quite useful on the ops side without an
> admin UI)
> - multiCassKop evolution to drive multiple cass-operators instead of
> multiple casskops (this could remain Orange internal if too specific)
>
> We could decide at the end of these discussions the best way forward.
> Orange could make PRs on cass-operator, but only if we agree we want the
> functionalities :)
>
> If we can sort it out we could end up with a pretty neat operator.
>
> We share a common architecture (operator-sdk), start to know each other
> with all these meetings so it should be possible if we want to!
>
> Would that be ok for the community and Datastax?
>
> On 2 Oct 2020, at 14:52, Joshua McKenzie <jmckenzie@apache.org<mailto:
> jmckenzie@apache.org>> wrote:
>
> What are next steps here?
>
> Maybe we collectively put a table together w/the 2 operators and a list of
> features to compare and contrast? Enumerate the frameworks / dependencies
> they have to help form a point of view about the strengths and weaknesses
> of each option?
>
> On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford
> <bradfordcp@gmail.com> wrote:
>
> Hello Dev list,
>
> I'm Chris Bradford, a Product Manager at DataStax working with the
> cass-operator team. For background, we started down the path of developing
> an operator internally to power our C*aaS platform, Astra. Care was taken
> from day 1 to keep anything specific to this product at a layer above
> cass-operator so it could solely focus on the task of operating Cassandra
> clusters. With that being said, every single cluster on Astra is
> provisioned and operated by cass-operator. The value of an advanced
> operator to Cassandra users is tremendous so we decided to open source the
> project (and associated components) with the goal of building a community.
> It absolutely makes sense to offer this project and codebase up for
> donation as a standard / baseline for running C* on Kubernetes.
>
> Below you will find a collection of cass-operator features,
> differentiators, and roadmap / inflight initiatives.
>
> Table-stakes
>
> Must-have functionality for a C* operator
>
> - Datacenter provisioning
>   - Schedule all pods
>   - Bootstrap nodes in the appropriate order
>     - Seeds
>     - Across racks
>     - etc.
> - Uniform configuration
> - Scale-up
>   - Add new nodes in a balanced manner across rack
> - Scale-down
>   - Remove nodes one at a time across racks
> - Node recovery
>   - Restart process
>   - Reschedule instance (IE replace node)
>   - Replace instance
>     - Specific workflows for seed node replacements
> - Multi-DC / Multi-Rack
> - Multi-Region / Multi-K8s Cluster
>   - Note this requires support at a networking layer for pod to pod IP
>     connectivity. This may be accomplished within the cluster with CNIs like
>     Cilium or externally via traditional networking tools.
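
As an illustration of the scale-up and scale-down items above, here is a minimal Python sketch of rack-balanced placement. It is not taken from cass-operator: the function names, the rack-size dictionary, and the tie-breaking by rack name are all assumptions made for the example. It only shows the idea of always growing the smallest rack and always shrinking the largest one, a single node at a time.

```python
from typing import Dict, List


def assign_new_nodes(rack_sizes: Dict[str, int], new_nodes: int) -> List[str]:
    """Pick a rack for each new node, always growing the smallest rack first."""
    sizes = dict(rack_sizes)  # work on a copy so the caller's dict is untouched
    placements = []
    for _ in range(new_nodes):
        # Ties are broken by rack name so the result is deterministic.
        rack = min(sizes, key=lambda r: (sizes[r], r))
        sizes[rack] += 1
        placements.append(rack)
    return placements


def removal_order(rack_sizes: Dict[str, int], nodes_to_remove: int) -> List[str]:
    """Pick the rack to shrink for each removal, one node at a time."""
    sizes = dict(rack_sizes)
    order = []
    for _ in range(nodes_to_remove):
        # Always shrink the currently largest rack to keep racks balanced.
        rack = max(sizes, key=lambda r: (sizes[r], r))
        if sizes[rack] == 0:
            raise ValueError("cannot remove more nodes than exist")
        sizes[rack] -= 1
        order.append(rack)
    return order
```

A real operator would additionally wait for each node to finish bootstrapping or decommissioning before moving on to the next one; that waiting is deliberately left out here.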
>
> Differentiators
>
> - OSS Ecosystem / Components
>   - Cass Config Builder - OSS project extracted from DataStax OpsCenter Life
>     Cycle Manager to provide automated configuration file rendering
>   - Cass Config Definitions - definitions files for cass-config-builder,
>     defines all configuration files, their parameters, and templates
>   - Management API for Apache Cassandra (MAAC)
>   - Metrics Collector for Apache Cassandra (MCAC)
>     - Reference Prometheus Operator CRDs
>       - ServiceMonitor
>       - Instance
>     - Reference Grafana Operator CRDs
>       - Instance
>       - Dashboards
>       - Datasource
> - PodTemplateSpec
>   - Customization of existing pods including support for adding containers,
>     volumes, etc
> - Advanced Networking
>   - Node Port
>   - Host Network
> - Simple security
>   - Management API mTLS support
>   - Automated generation of keystore and truststore for internode and client
>     to node TLS
>   - Automated superuser account configuration
>     - The default superuser (cassandra/cassandra) is disabled and never
>       available to clients
>     - Cluster administration account may be generated automatically (or
>       provided) with values stored in a k8s secret
> - Automatic application of NetworkTopologyStrategy with appropriate RF for
>   system keyspaces
> - Validating webhook
>   - Invalid changes are rejected with a helpful message
> - Rolling cluster updates
>   - Change in binary (C* upgrade)
>   - Change in configuration
>   - Canary deployments - single rack application of changes for validation
>     before broader deployment
> - Rolling restart
> - Platform Integration / Testing / Certification
>   - Red Hat Openshift compatible and certified
>     - Secure, Universal Base Image (UBI) foundation images with security
>       scanning performed by Red Hat
>       - cass-operator
>       - cass-config-builder
>       - apache-cassandra w/ MCAC and MAAC
>     - Integration with Red Hat certification pipeline / marketplace
>     - Presence in Red Hat Operator Hub built into OpenShift interface
>   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
>     - Security scanning for images performed by VMware
>   - Amazon EKS
>   - Google GKE
>   - Azure AKS
> - Documentation / Reference Implementations
>   - Cloud storage classes
>   - Ingress solutions
>   - Sample connection validation application with reference implementations
>     of Java Driver client connection parameters
> - Cluster-level Stop / Resume - stop all running instances while keeping
>   persistent storage. Allows for scaling compute down to zero. Bringing the
>   cluster back up follows expected startup procedures
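
To make the "NetworkTopologyStrategy with appropriate RF for system keyspaces" item more concrete, here is a small Python sketch that renders the CQL an operator might apply. The rule used here (replication factor = number of nodes in the datacenter, capped at 3) and the list of keyspaces are assumptions for illustration; the message above does not say how cass-operator actually picks the values. The helper names are hypothetical.

```python
from typing import Dict, Iterable, List

# Keyspaces whose replication is commonly adjusted (assumed list for this sketch).
DEFAULT_KEYSPACES = ("system_auth", "system_distributed", "system_traces")


def render_replication(replication: Dict[str, object]) -> str:
    """Render a dict as a CQL map literal, quoting strings and leaving ints bare."""
    parts = []
    for key, value in replication.items():
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        parts.append(f"'{key}': {rendered}")
    return "{" + ", ".join(parts) + "}"


def system_keyspace_statements(
    nodes_per_dc: Dict[str, int],
    max_rf: int = 3,
    keyspaces: Iterable[str] = DEFAULT_KEYSPACES,
) -> List[str]:
    """Build one ALTER KEYSPACE statement per keyspace.

    Assumed rule: RF per datacenter is min(node count, max_rf).
    """
    replication: Dict[str, object] = {"class": "NetworkTopologyStrategy"}
    for dc, nodes in sorted(nodes_per_dc.items()):
        replication[dc] = min(nodes, max_rf)
    rendered = render_replication(replication)
    return [f"ALTER KEYSPACE {ks} WITH replication = {rendered};" for ks in keyspaces]
```

Calling `system_keyspace_statements({"dc1": 5, "dc2": 2})` yields one statement per keyspace with `'dc1': 3, 'dc2': 2`; a repair of the altered keyspaces would normally follow, which this sketch does not cover.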
>
> Road Map / Inflight
>
> 1. Repair
>    1. Reaper integration
> 2. Backups
>    1. Velero integration
>    2. Medusa integration
> 3. Advanced Networking via sidecar
>    1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
>       addresses despite Kubernetes' best efforts to shuffle them.
> 4. Single pod canary deployments
> 5. Platform Certification
>    1. VMware Project Pacific
>    2. Rancher Kubernetes Engine (K3s)
> 6. Documentation
>    1. Multi-region
>    2. Multi-cloud
>    3. Additional ingress providers
>       1. Voyager
>       2. HAProxy
>       3. Gloo
>       4. Ambassador
>       5. Envoy
>       6. NGINX Ingress Controller
>    4. Additional storage class references
>       1. OpenEBS
> 7. Cassandra Enhancements
>    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
>       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
>
> If there are further questions about the project, codebase, architecture,
> etc. the team would be happy to dive in to the details and discuss more.
>
> Cheers,
> ~Chris
>
> Christopher Bradford
>
> On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pmcfadin@gmail.com
> <ma...@gmail.com>> wrote:
>
> I can agree with that Ben. Franck did a good job of outlining CassKop.
> Somebody from the cass-operator will be posting something similar and we
> can keep it on the mailing list.
>
> Patrick
>
> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <ben@instaclustr.com<mailto:
> ben@instaclustr.com>> wrote:
>
> Thanks Frank and Stefan.
>
> @Patrick great suggestion and worthwhile getting everything on the table.
>
> One minor change I would advocate for. The SIG has been great to iterate
> and interact on the details, but I really think this conversation given the
> nature of the content needs to be on the mailing list. The mailing list is
> really our system of record and the most accessible.
>
> It gives folk time to think and digest, it's asynchronous, easily
> searchable and let's be honest, the majority of stakeholders in this are
> not US based, so the timing issue then goes away and makes it easier for
> people to participate in. I feel like we've made a lot more progress by
> simply having this discussion here.
>
> So instead of a presentation, maybe just an email to the ML addressing the
> headings that Patrick identified?
>
> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic < stefan.miklosovic@
> instaclustr.com<http://instaclustr.com>> wrote:
>
> Hi,
>
> Patrick's suggestion seems good to me.
>
> I won't go into specifics here as I need to genuinely prepare for this. It
> is quite hard to dig deep into the solutions of others and bring some
> constructive criticism because it takes a lot of time to study it and
> everybody has some "why's" behind it.
>
> To summarize my goals and concerns:
>
> 1) We should be as much "Kubernetes operator idiomatic" as possible.
> Industry standards, no custom brain-child of this or that group because
> they think it is just cool or they just didn't know any better. I do NOT
> say it is like that right now, I just want to be ruthless here as much as
> possible when it comes to functionality and why it is done like that. It is
> awesome that we have already something latest (thanks to John) and it
> adheres to the latest releases. I personally had a hard time to keep up
> with all the releases, once I finished something and I aligned it, after a
> week or two there was already another one where things were different, it
> is a very fast-moving space and I hope that by time we develop something it
> will not be obsolete.
>
> 2) It may be easier said than done but it is guaranteed that people get
> emotional, it's their precious etc, so please let's go into this with good
> intentions, not trying to push one solution over the other just because
> they would like to see it there ... I will have an equally hard time to
> comply with this point. My plan is to explain what is _wrong_ with our
> solution. Where we made mistakes and what should be done differently but it
> is "too late" etc. It is quite hard to describe your work and all effort in
> this light but without telling what is wrong we can not decide what is good
> imho.
>
> 3) We should put something together fast enough so we can call it a
> release. We can always iterate on it for eternity. But the foundations need
> to be there. Here I want to say that I especially like what John did. I
> looked through these specs and it was obvious it has been written with care
> and attention. It looked _solid_. I am not sure how hard it is to put all
> other things on top of that, I truly do not, and here I think we would have
> to reinvent that wheel if we want to proceed because I can not imagine what
> it would be to retrofit e.g. CassKop on top of John specs, it is just like
> putting round pegs into the square holes, maybe some chunks would be reused
> easily but otherwise I worry we will be just on square one.
>
> One specific feeling I have as I read this is that even if there is the
> will to create the fourth operator, the respective parties will not be able
> to drop their own repository. The whole point behind this effort, to me, is
> to have a solid, community driven, stable, modern and feature complete
> operator people are truly using. I can see that once this is real, we will
> _really_ sunset our operator, redirecting people to the new operator on
> main readme doc etc, we truly mean it. Sure, if somebody comes and bug fix
> will be needed, we will fix it, but the whole point of doing this is to
> stop using what we have currently, over time, otherwise we are just
> splitting this space even more. If CassKop is not sure if they will use it
> because they do not know if that operator will be "enough" for them, aren't
> we just doing it wrong? If I exaggerate, they should be fine with deleting
> the whole repository and using just this Cassandra one we are going to make
> otherwise I don't see the point to work on this ...
>
> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jmckenzie@apache.org
> <ma...@apache.org>> wrote:
>
> - choose cass-operator: it is not on offer right now so let’s see if it
> does
>
> We should all talk a lot more, but this is 100% a mistake - I take the
> blame for that. The intention has long been to offer cass-operator for
> donation but it slipped through the cracks and your email yesterday made me
> double-take.
>
> We have since resolved this misalignment. DataStax would be happy to donate
> any and all of cass-operator to the ASF and C* project if it's what we all
> agree best serves our collective Cassandra users. I'm also cognizant that
> an immense amount of effort has gone into CassKop and we seem to have
> something of an embarrassment of riches.
>
> I'm given to understand (haven't dug in personally) that the two operators
> express pretty different opinions when it comes to frameworks, designs,
> supported versions, etc. I think a discrete enumeration of the feature set
> and "identities" of both could really help navigate this conversation going
> forward.
>
> Also - thanks for that context Franck. It's always helpful to know where
> other people are coming from when we're all working together towards a
> common goal.
>
> On Thu, Sep 24, 2020 at 12:23 PM, <franck.dehay@orange.com<mailto:franck.
> dehay@orange.com>> wrote:
>
> I can share Orange’s view of the situation, sorry it is a long story!
>
> We started CassKop at the end of 2018 after betting on K8S which was not
> so simple as far as C* was concerned. Lack of support for local storage,
> IPs that change all the time, different network plugins to try to implement
> a non standard K8s way of having nodes see each other from different dcs…
>
> We hesitated with Mesos but could not have both and K8S was already
> gaining so much traction that you could not not choose it.
>
> Anyway, we looked around and did not see anyone with such requirements so
> we said: why not try it ourselves but on github so that we may give it back
> to the community. We have used C* for quite a few years with great success
> on production with massive load and perfect availability. We love C* @
> Orange :) Thanks!
>
> So we started writing support for mono-dc cluster (CassKop) and added the
> multi dc support with MultiCassKop which is another operator included in
> the CassKop repo. For more details we tried to document our designs as much
> as possible here:
>
> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
>
> In the middle of last year we had some talks with Datastax about working
> together around their new management sidecar. Their position on open source
> was not clear at that time so we said please come back when you have
> decided to go open source with it. Which they did in the beginning of this
> year. But at that time I guess work had started on cass-operator so we kept
> our separate ways.
>
> Since the beginning of the year, we have been working with our OPS team
> to have it in production. It is not simple as the team has to learn K8S and
> trust a newborn operator. This takes time especially as our internal
> cluster has been tweaked for multi-tenancy with obscure options being set
> by our K8s team…
>
> We also developed with Instaclustr the Backup & Restore functionality (we
> have new CRDs (Custom Resource Definition) for backup and restore and a
> reconcile loop that calls out Instaclustr sidecar for these operations). We
> now support multiple backups in parallel and can write to s3 / google or
> azure (but Stefan could give more details here if needed)
>
> During the SIG calls we mentioned our desire to donate CassKop once it
> satisfies our basic requirements (v1 coming just now but I said it too
> many times already). I am actually not sure Datastax mentioned their desire
> to donate cass-operator but we decided to compare the designs and the
> functionalities based on respective CRDs. The CRD is the interface with the
> user as it is where you describe the cluster that you want to have. These
> talks were very interesting and we found out that the CassKop team had made
> good choices most of the time but was maybe too open. Indeed our intention
> was to give all the possibilities for our OPS team to work. This includes:
>
> - very open topology definition using any configuration of labels to map
> dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> / rows and server racks so we can map C* racks to storage or network arrays
> internally)
> - possibility to have multiple C* nodes on a single K8S host (because
> internal clouds are not really clouds, they have limited resources)
> - custom C* image selection,
> - custom bootstrap script that lets you configure C* as you want using
> ConfigMaps,
> - the ability to mount different volumes wherever they wanted,
> - the possibility to run any number of sidecars alongside C* for custom
> probes in our case
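
The label-based topology mapping in the first bullet can be pictured with a short Python sketch. This is not CassKop code and does not use the CassKop CRD schema: the label keys (`room`, `row`), the node names, and the function names are hypothetical, chosen only to show how a DC / rack definition expressed as label selectors could be resolved against labelled Kubernetes hosts.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def nodes_for_selector(nodes: Dict[str, Labels], selector: Labels) -> List[str]:
    """Return the names of nodes whose labels contain every selector pair."""
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


def resolve_topology(
    nodes: Dict[str, Labels],
    topology: Dict[str, Dict[str, Labels]],
) -> Dict[Tuple[str, str], List[str]]:
    """Map each (dc, rack) pair to the hosts matching that rack's selector."""
    return {
        (dc, rack): nodes_for_selector(nodes, selector)
        for dc, racks in topology.items()
        for rack, selector in racks.items()
    }


# Hypothetical hosts labelled by physical room and row.
EXAMPLE_NODES = {
    "node-a": {"room": "r1", "row": "1"},
    "node-b": {"room": "r1", "row": "2"},
    "node-c": {"room": "r2", "row": "1"},
}

# Hypothetical topology: each C* rack is tied to a set of labels.
EXAMPLE_TOPOLOGY = {
    "dc1": {"rack1": {"room": "r1", "row": "1"}, "rack2": {"room": "r1", "row": "2"}},
    "dc2": {"rack1": {"room": "r2"}},
}
```

With the example data, `resolve_topology(EXAMPLE_NODES, EXAMPLE_TOPOLOGY)` places `node-a` in dc1/rack1, `node-b` in dc1/rack2 and `node-c` in dc2/rack1. In Kubernetes itself this matching would normally be delegated to node selectors or affinity rules rather than computed by hand.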
>
> This makes CassKop quite powerful and flexible.
> We made sure that all those options are not enabled by default so one can
> just pop a simple 3 node cluster quickly
>
> On the other hand cass-operator had an interesting way of configuring C*
> just inside the CRD using cass-config. This is simple and elegant so we are
> implementing it as well for the support of C* 4
>
> Now for the future, there are 3 choices in my opinion:
> - start from scratch (or John’s repo) by cherry picking bits from all
> operators. This is possible but will take some time / effort to have
> something usable. And then it will be compared to cass-operator and
> CassKop. I don’t see Orange contributing too much here as we believe
> CassKop to be a much better starting point
> - choose cass-operator: it is not on offer right now so let’s see if it
> does. I think Orange could contribute some bits inherited from CassKop if
> it is agreed by the community. Not sure it would be enough for us to use
> it.
> - choose CassKop: we would be delighted to donate it and contribute with
> some committers (including the original author who now works for AWS). It
> would then become the community operator but there would be cass-operator
> alongside probably. But Cass-operator is made to make it easier for
> Datastax to manage customer clusters by imposing some configuration. It
> makes sense for their needs, so maybe 2 operators. We don’t know how
> backup/restore will be handled here with medusa being adapted to K8s
>
> Sorry again for being long but 2 years of work deserve some lines of text
> :)
>
> I just saw your message Patrick but this was written already so we gain a
> week.
>
> Franck
>
> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> <ma...@datastax.com>> wrote:
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> +1
>
> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> <benedict@apache.org<ma...@apache.org>> wrote:
>
> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com<mailto:
> john.sanda@gmail.com>> wrote:
>
> I want to point out that pretty much everything being discussed in this
> thread has been discussed at length during the SIG meetings. I think it is
> worth noting because we are pretty much still having the same conversation.
>
> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> <benedict@apache.org<ma...@apache.org>> wrote:
>
> I don't think there's anything about a code drop that's not "The Apache
> Way". If there's a consensus (or even strong majority) amongst invested
> parties, I don't see why we could not adopt an operator directly into the
> project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> instaclustr.com<ma...@instaclustr.com>> wrote:
>
> I think that from Instaclustr it was stated quite clearly multiple times
> that we are "fine to throw it away" if there is something better and more
> wide-spread. Indeed, we have invested a lot of time in the operator but it
> was not useless at all, we gained a lot of quite unique knowledge how to
> put all pieces together. However, I think that this space is going to be
> quite fragmented and "balkanized", which is not always a bad thing, but in
> a quite narrow area as Kubernetes operator is, I just do not see how 4
> operators are going to be beneficial for ordinary people ("official" from
> community, ours, Datastax one and CassKop (without any significant order)).
> Sure, innovation and healthy competition is important but to what extent ...
> One can start a Cassandra cluster on Kubernetes just so many times
> differently and nobody really likes a vendor lock-in. People wanting to run
> a cluster on K8S realise that there are three operators, each backed by a
> private business entity, and the community operator is not there ... Huh,
> interesting ... One may even start to question what is wrong with these
> folks that it takes three companies to build their own solution.
>
> Having said that, to my perception, Cassandra community just does not have
> enough engineers nor contributors to keep 4 operators alive at the same
> time (I wish I was wrong) so the idea of selecting the best one or to merge
> obvious things and approaches together is understandable, even if it meant
> we eventually sunset ours. In addition, nobody from big players is going to
> contribute to the code base of the other one, for obvious reasons, so
> channeling and directing this effort into something common for a community
> seems to be the only reasonable way of cooperation.
>
> It is quite hard to bootstrap this if the donation of the code in big
> chunks / whole repo is out of question as it is not the "Apache way" (there
> was some thread running here about this in more depth a while ago) and we
> basically need to start from scratch which is quite demotivating, we are
> just inventing the wheel and nobody is up to it. It is like people are
> waiting for that to happen so they can jump in "once it is the thing" but
> it will never materialise or at least the hurdle to kick it off is
> unnecessarily high. Nobody is going to invest in this heavily if there is
> already a working operator from companies mentioned above. As I understood
> it, one reason of not choosing the way of donating it all is that "the
> learning and community building should happen in organic manner and we just
> can not accept the donation", but is not it true that it is easier to build
> a community around something which is already there rather than trying to
> build it around an idea which is quite hard to dedicate to?
>
> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie
> <jmckenzie@apache.org<ma...@apache.org> <ma...@apache.org>> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach,
>
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> <benedict@apache.org<ma...@apache.org> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach, earlier than later. This is an opportunity to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I think it
> would be a real shame to waste this opportunity. No doubt it will be hard,
> as organisations have certain built-in investments in their own approaches.
>
> I haven't participated in these calls as I do not consider myself to have
> the relevant experience and expertise, and have other focuses on the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible. Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:ben@instaclustr.com>
> <mailto:ben@instaclustr.com<http://instaclustr.com>>> wrote:
>
> I think there is certainly an appetite to donate and standardise on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found if you are
> interested in some primary sources
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> .
>
> From what I understand second-hand from talking to people on the SIG calls,
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead
> <ben@instaclustr.com<ma...@instaclustr.com> <ma...@instaclustr.com>
> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
> --
>
> - John
>
> _________________________________________________________________________________________________________________________
>
> Ce message et ses pieces jointes peuvent contenir des informations
> confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> exploites ou copies sans autorisation. Si vous avez recu ce message par
> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
> pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou
> falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law; they should not be distributed,
> used or copied without authorisation. If you have received this email in
> error, please notify the sender and delete this message and its
> attachments. As emails may be altered, Orange is not liable for messages
> that have been modified, changed or falsified. Thank you.
>
> _________________________________________________________________________________________________________________________
>
>
> Ce message et ses pieces jointes peuvent contenir des informations
> confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> exploites ou copies sans autorisation. Si vous avez recu ce message par
> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
> pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou
> falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law; they should not be distributed,
> used or copied without authorisation. If you have received this email in
> error, please notify the sender and delete this message and its
> attachments. As emails may be altered, Orange is not liable for messages
> that have been modified, changed or falsified. Thank you.
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

Hi all, sorry for the delay; we were busy releasing V1 of CassKop, which happened this week:
https://github.com/Orange-OpenSource/casskop

Now we would like to open the discussion about feature merging, and I would like to follow Joshua’s proposal: open issues on cass-operator.

As I said, we discuss first and if we find a way forward then we do it!

We will open the issues next week and move forward.

We can discuss them during the next SIG calls, starting Thursday.

Franck

On 6 Oct 2020, at 08:02, Ben Bromhead <be...@instaclustr.com> wrote:

Thanks Frank and Christopher.

Sounds like we have a good path to consolidate around!

On Sat, Oct 3, 2020 at 11:12 AM Joshua McKenzie <jm...@apache.org> wrote:

how to best merge Casskop's features in Cass-operator.

What if we create issues on the gh repo here
https://github.com/datastax/cass-operator/issues, create a milestone out of
that, and have engineers rally on it to get things merged? We have a few
engineers focused on k8s ecosystem for Cassandra from the DataStax side
who'd be happy to collaborate with you folks to get these things in.

On Fri, Oct 02, 2020 at 11:34 AM, <fr...@orange.com> wrote:

An update on Orange's point of view following the recent emails:

If we were a newly interested party in running C* in K8s, we would use
Cass-operator as it comes from Datastax.

The logic would then be that the community embraces it and thanks Datastax
for offering it!

So, on the Orange side, we propose to discuss with Datastax how to best merge
Casskop's features in Cass-operator. These features are:
- nodes labelling to map any internal architecture (including network
specific labels to multi-dc setup)
- volumes & sidecars management (possibly linked to PodTemplateSpec)
- backup & restore (we ruled out velero and can share why we went with
Instaclustr but Medusa could work too)
- kubectl plugin integration (quite useful on the ops side without an
admin UI)
- multiCassKop evolution to drive multiple cass-operators instead of
multiple casskops (this could remain Orange internal if too specific)

We could decide at the end of these discussions the best way forward.
Orange could make PRs on cass-operator, but only if we agree we want the
functionalities :)

If we can sort it out we could end up with a pretty neat operator.

We share a common architecture (operator-sdk) and are starting to know each
other through all these meetings, so it should be possible if we want to!

Would that be ok for the community and Datastax?

On 2 Oct 2020, at 14:52, Joshua McKenzie <jm...@apache.org> wrote:

What are next steps here?

Maybe we collectively put a table together w/the 2 operators and a list of
features to compare and contrast? Enumerate the frameworks / dependencies
they have to help form a point of view about the strengths and weaknesses
of each option?

On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford <bradfordcp@gmail.com> wrote:

Hello Dev list,

I'm Chris Bradford, a Product Manager at DataStax working with the
cass-operator team. For background, we started down the path of developing
an operator internally to power our C*aaS platform, Astra. Care was taken
from day 1 to keep anything specific to this product at a layer above
cass-operator so it could solely focus on the task of operating Cassandra
clusters. With that being said, every single cluster on Astra is
provisioned and operated by cass-operator. The value of an advanced
operator to Cassandra users is tremendous so we decided to open source the
project (and associated components) with the goal of building a community.
It absolutely makes sense to offer this project and codebase up for
donation as a standard / baseline for running C* on Kubernetes.

Below you will find a collection of cass-operator features,
differentiators, and roadmap / inflight initiatives.

Table-stakes

Must-have functionality for a C* operator

- Datacenter provisioning
  - Schedule all pods
  - Bootstrap nodes in the appropriate order
    - Seeds
    - Across racks
    - etc.
  - Uniform configuration
- Scale-up
  - Add new nodes in a balanced manner across racks
- Scale-down
  - Remove nodes one at a time across racks
- Node recovery
  - Restart process
  - Reschedule instance (i.e. replace node)
  - Replace instance
  - Specific workflows for seed node replacements
- Multi-DC / Multi-Rack
- Multi-Region / Multi-K8s Cluster
  - Note this requires support at a networking layer for pod to pod IP
    connectivity. This may be accomplished within the cluster with CNIs like
    Cilium or externally via traditional networking tools.
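
To make the "balanced across racks" items above concrete, here is a minimal sketch in Python. It is not taken from cass-operator (or any other operator); the function names `place_new_nodes` and `removal_order` and the rack names are hypothetical, and a real operator would also have to account for bootstrap, decommission and seed handling. It only illustrates the balancing idea: grow the smallest rack first, and shrink the largest rack first, one node at a time.

```python
def place_new_nodes(rack_sizes, new_nodes):
    """Return per-rack node counts after adding new_nodes.

    Each new node goes to the rack that currently has the fewest nodes
    (ties broken by rack name so the result is deterministic).
    """
    sizes = dict(rack_sizes)
    for _ in range(new_nodes):
        target = min(sizes, key=lambda rack: (sizes[rack], rack))
        sizes[target] += 1
    return sizes


def removal_order(rack_sizes, nodes_to_remove):
    """Return the racks to remove a node from, one node per step.

    Each step takes a node from the rack that currently has the most
    nodes, so racks stay as even as possible while scaling down.
    """
    sizes = dict(rack_sizes)
    order = []
    for _ in range(nodes_to_remove):
        target = max(sizes, key=lambda rack: (sizes[rack], rack))
        if sizes[target] == 0:
            raise ValueError("no nodes left to remove")
        sizes[target] -= 1
        order.append(target)
    return order
```

For example, adding two nodes to three single-node racks grows two different racks rather than one, and a later scale-down by two takes one node from each of the two larger racks.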

Differentiators

- OSS Ecosystem / Components
  - Cass Config Builder - OSS project extracted from DataStax OpsCenter Life
    Cycle Manager to provide automated configuration file rendering
  - Cass Config Definitions - definitions files for cass-config-builder,
    defines all configuration files, their parameters, and templates
  - Management API for Apache Cassandra (MAAC)
  - Metrics Collector for Apache Cassandra (MCAC)
    - Reference Prometheus Operator CRDs
      - ServiceMonitor
      - Instance
    - Reference Grafana Operator CRDs
      - Instance
      - Dashboards
      - Datasource
- PodTemplateSpec
  - Customization of existing pods including support for adding containers,
    volumes, etc
- Advanced Networking
  - Node Port
  - Host Network
- Simple security
  - Management API mTLS support
  - Automated generation of keystore and truststore for internode and client
    to node TLS
  - Automated superuser account configuration
    - The default superuser (cassandra/cassandra) is disabled and never
      available to clients
    - Cluster administration account may be automatically generated (or
      provided) with values stored in a k8s secret
- Automatic application of NetworkTopologyStrategy with appropriate RF for
  system keyspaces
- Validating webhook
  - Invalid changes are rejected with a helpful message
- Rolling cluster updates
  - Change in binary (C* upgrade)
  - Change in configuration
  - Canary deployments - single rack application of changes for validation
    before broader deployment
  - Rolling restart
- Platform Integration / Testing / Certification
  - Red Hat OpenShift compatible and certified
    - Secure, Universal Base Image (UBI) foundation images with security
      scanning performed by Red Hat
      - cass-operator
      - cass-config-builder
      - apache-cassandra w/ MCAC and MAAC
    - Integration with Red Hat certification pipeline / marketplace
    - Presence in Red Hat Operator Hub built into OpenShift interface
  - VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
    - Security scanning for images performed by VMware
  - Amazon EKS
  - Google GKE
  - Azure AKS
- Documentation / Reference Implementations
  - Cloud storage classes
  - Ingress solutions
  - Sample connection validation application with reference implementations
    of Java Driver client connection parameters
- Cluster-level Stop / Resume - stop all running instances while keeping
  persistent storage. Allows for scaling compute down to zero. Bringing the
  cluster back up follows expected startup procedures
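
As an illustration of the canary item in the list above (apply a change to a single rack, validate it, then roll it out more broadly), the following Python sketch may help. It is an assumption-laden outline, not cass-operator's implementation: `canary_rollout`, `apply_change` and `validate` are hypothetical names, and the two callbacks stand in for whatever mechanism actually updates and health-checks a rack.

```python
def canary_rollout(racks, apply_change, validate):
    """Apply a change to the first rack, validate, then roll out to the rest.

    racks: ordered list of rack names; the first one acts as the canary.
    apply_change(rack): hypothetical callback that updates one rack.
    validate(rack): hypothetical callback returning True if the rack is healthy.

    Returns the list of racks the change was applied to. If the canary
    fails validation, the rollout stops and no other rack is touched.
    """
    if not racks:
        return []
    canary, rest = racks[0], racks[1:]
    applied = [canary]
    apply_change(canary)
    if not validate(canary):
        return applied  # stop here: only the canary rack was changed
    for rack in rest:
        apply_change(rack)
        applied.append(rack)
    return applied
```

With three racks and a failing validation, only the first rack is ever passed to `apply_change`; with a passing validation, the racks are updated in the order given.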

Road Map / Inflight

1. Repair
   1. Reaper integration
2. Backups
   1. Velero integration
   2. Medusa integration
3. Advanced Networking via sidecar
   1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
      addresses despite Kubernetes' best efforts to shuffle them.
4. Single pod canary deployments
5. Platform Certification
   1. VMware Project Pacific
   2. Rancher Kubernetes Engine (K3s)
6. Documentation
   1. Multi-region
   2. Multi-cloud
   3. Additional ingress providers
      1. Voyager
      2. HAProxy
      3. Gloo
      4. Ambassador
      5. Envoy
      6. NGINX Ingress Controller
   4. Additional storage class references
      1. OpenEBS
7. Cassandra Enhancements
   1. [#CASSANDRA-15823] Support for networking via identity instead of IP
      - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>

If there are further questions about the project, codebase, architecture,
etc. the team would be happy to dive in to the details and discuss more.

Cheers,
~Chris

Christopher Bradford

On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com> wrote:

I can agree with that Ben. Franck did a good job of outlining CassKop.
Somebody from the cass-operator will be posting something similar and we
can keep it on the mailing list.

Patrick

On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:

Thanks Frank and Stefan.

@Patrick great suggestion and worthwhile getting everything on the table.

One minor change I would advocate for. The SIG has been great to iterate
and interact on the details, but I really think this conversation, given the
nature of the content, needs to be on the mailing list. The mailing list is
really our system of record and the most accessible.

It gives folk time to think and digest, it's asynchronous, easily
searchable and let's be honest, the majority of stakeholders in this are
not US based, so the timing issue then goes away and makes it easier for
people to participate in. I feel like we've made a lot more progress by
simply having this discussion here.

So instead of a presentation, maybe just an email to the ML addressing the
headings that Patrick identified?

On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic <stefan.miklosovic@instaclustr.com> wrote:

Hi,

Patrick's suggestion seems good to me.

I won't go into specifics here as I need to genuinely prepare for this. It
is quite hard to dig deep into the solutions of others and bring some
constructive criticism because it takes a lot of time to study it and
everybody has some "why's" behind it.

To summarize my goals and concerns:

1) We should be as much "Kubernetes operator idiomatic" as possible.
Industry standards, no custom brain-child of this or that group because
they think it is just cool or they just didn't know any better. I do NOT
say it is like that right now, I just want to be ruthless here as much as
possible when it comes to functionality and why it is done like that. It is
awesome that we have already something latest (thanks to John) and it
adheres to the latest releases. I personally had a hard time keeping up
with all the releases: once I finished something and aligned it, after a
week or two there was already another one where things were different. It
is a very fast-moving space and I hope that by the time we develop
something it will not be obsolete.

2) It may be easier said than done but it is guaranteed that people get
emotional, it's their precious etc, so please let's go into this with good
intentions, not trying to push one solution over the other just because
they would like to see it there ... I will have an equally hard time to
comply with this point. My plan is to explain what is _wrong_ with our
solution. Where we made mistakes and what should be done differently but it
is "too late" etc. It is quite hard to describe your work and all effort in
this light but without telling what is wrong we can not decide what is good
imho.

3) We should put something together fast enough so we can call it a
release. We can always iterate on it for eternity. But the foundations need
to be there. Here I want to say that I especially like what John did. I
looked through these specs and it was obvious it has been written with care
and attention. It looked _solid_. I am not sure how hard it is to put all
other things on top of that, I truly do not, and here I think we would have
to reinvent that wheel if we want to proceed because I can not imagine what
it would be to retrofit e.g. CassKop on top of John's specs, it is just like
putting round pegs into square holes, maybe some chunks would be reused
easily but otherwise I worry we will be just on square one.

One specific feeling I have as I read this is that even if there is the
will to create the fourth operator, the respective parties will not be able
to drop their own repository. The whole point behind this effort, to me, is
to have a solid, community driven, stable, modern and feature complete
operator people are truly using. I can see that once this is real, we will
_really_ sunset our operator, redirecting people to the new operator on
main readme doc etc, we truly mean it. Sure, if somebody comes and bug fix
will be needed, we will fix it, but the whole point of doing this is to
stop using what we have currently, over time, otherwise we are just
splitting this space even more. If CassKop is not sure if they will use it
because they do not know if that operator will be "enough" for them, aren't
we just doing it wrong? If I exaggerate, they should be fine with deleting
the whole repository and using just this Cassandra one we are going to make
otherwise I don't see the point to work on this ...

On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org> wrote:

- choose cass-operator: it is not on offer right now so let’s see if it does

We should all talk a lot more, but this is 100% a mistake - I take the
blame for that. The intention has long been to offer cass-operator for
donation but it slipped through the cracks and your email yesterday made me
double-take.

We have since resolved this misalignment. DataStax would be happy to donate
any and all of cass-operator to the ASF and C* project if it's what all
agree best serves our collective Cassandra users. I'm also cognizant that
an immense amount of effort has gone into CassKop and we seem to have
something of an embarrassment of riches.

I'm given to understand (haven't dug in personally) that the two operators
express pretty different opinions when it comes to frameworks, designs,
supported versions, etc. I think a discrete enumeration of the feature set
and "identities" of both could really help navigate this conversation going
forward.

Also - thanks for that context Franck. It's always helpful to know where
other people are coming from when we're all working together towards a
common goal.

On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:

I can share Orange’s view of the situation, sorry it is a long story!

We started CassKop at the end of 2018 after betting on K8S which was not
so simple as far as C* was concerned. Lack of support for local storage,
IPs that change all the time, different network plugins to try to implement
a non standard K8s way of having nodes see each other from different dcs…
We hesitated with Mesos but could not have both and K8S already had
so much traction that you could not not choose it.

Anyway, we looked around and did not see anyone with such requirements so
we said: why not try it ourselves but on github so that we may give back
to the community. We have used C* for quite a few years with great success
on production with massive load and perfect availability. We love

Orange :) Thanks!

So we started writing support for mono-dc cluster (CassKop) and added the
multi dc support with MultiCassKop which is another operator included in
the CassKop repo. For more details we tried to document our designs as much
as possible here:

https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management

In the middle of last year we had some talks with Datastax about working
together around their new management sidecar. Their position on open source
was not clear at that time so we said please come back when you have
decided to go open source with it. Which they did in the beginning of this
year. But at that time I guess work had started on cass-operator so we kept
our separate ways.

Since the beginning of the year, we have been working with our OPS team
to have it in production. It is not simple as the team has to learn K8S and
trust a newborn operator. This takes time especially as our internal
cluster has been tweaked for multi-tenancy with obscure options being set
by our K8s team…

We also developed with Instaclustr the Backup & Restore functionality (we
have new CRDs (Custom Resource Definition) for backup and restore and a
reconcile loop that calls out Instaclustr sidecar for these operations). We
now support multiple backups in parallel and can write to s3 / google /
azure (but Stefan could give more details here if needed)

During the SIG calls we mentioned our desire to donate CassKop once it
satisfies our basic requirements (v1 coming just now but I said it too
many times already). I am actually not sure Datastax mentioned their desire
to donate cass-operator but we decided to compare the designs and the
functionalities based on respective CRDs. The CRD is the interface with the
user as it is where you describe the cluster that you want to have. These
talks were very interesting and we found out that the CassKop team had made
good choices most of the time but was maybe too open. Indeed our intention
was to give all the possibilities for our OPS team to work. This includes:
- very open topology definition using any configuration of labels to map
dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
/ rows and server racks so we can map C* racks to storage or network arrays
internally)
- possibility to have multiple C* nodes on a single K8S host (because
internal clouds are not really clouds, they have limited resources)
- custom C* image selection,
- custom bootstrap script that lets you configure C* as you want using
ConfigMaps,
- the ability to mount different volumes wherever they wanted,
- the possibility to run any number of sidecars alongside C* for custom
probes in our case

This makes CassKop quite powerful and flexible.
We made sure that all those options are not enabled by default so one can
just pop a simple 3 node cluster quickly

On the other hand cass-operator had an interesting way of configuring C*
just inside the CRD using cass-config. This is simple and elegant so we are
implementing it as well for the support of C* 4

Now for the future, there are 3 choices in my opinion:
- start from scratch (or John’s repo) by cherry picking bits from all
operators. This is possible but will take some time / effort to have
something usable. And then it will be compared to cass-operator and
CassKop. I don’t see Orange contributing too much here as we believe
CassKop to be a much better starting point
- choose cass-operator: it is not on offer right now so let’s see if it
does. I think Orange could contribute some bits inherited from CassKop if
it is agreed by the community. Not sure it would be enough for us to use
it.
- choose CassKop: we would be delighted to donate it and contribute with
some committers (including the original author who now works for AWS). It
would then become the community operator but there would be cass-operator
alongside probably. But Cass-operator is made to make it easier for
Datastax to manage customer clusters by imposing some configuration.

make sense for their needs, so maybe 2 operators. We don’t know how
backup/restore will be handled here with medusa being adapted to K8s

Sorry again for being long but 2 years of work deserve some lines of text

I just saw your message Patrick but this was written already so we gain a
week.

Franck

On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com> wrote:

I realise there are meeting logs, but getting a wider discourse with
non-stakeholder input might help to build a community consensus? It doesn't
seem like it can hurt at this point, anyway.

On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.org> wrote:

Perhaps it helps to widen the field of discussion to the dev list?

It might help if each of the stakeholder organisations state their view on
the situation, including why they would or would not support a given
approach/operator, and what (preferably specific) circumstances might lead
them to change their mind?

I realise there are meeting logs, but getting a wider discourse with
non-stakeholder input might help to build a community consensus? It doesn't
seem like it can hurt at this point, anyway.

On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:

I want to point out that pretty much everything being discussed in this
thread has been discussed at length during the SIG meetings. I think it is
worth noting because we are pretty much still having the same conversation.

On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <benedict@apache.org> wrote:

I don't think there's anything about a code drop that's not "The Apache
Way".

If there's a consensus (or even strong majority) amongst invested parties,
I don't see why we could not adopt an operator directly into the project.

It's possible a green field approach might lead to fewer hard feelings, as
everyone is in the same boat. Perhaps all operators are also suboptimal and
could be improved with a rewrite? But I think coordinating a lot of
different entities around an empty codebase is particularly challenging. I
actually think it could be better for cohesion and collaboration to have a
suboptimal but substantive starting point.

On 23/09/2020, 16:11, "Stefan Miklosovic" <stefan.miklosovic@instaclustr.com> wrote:

I think that from Instaclustr it was stated quite clearly multiple times
that we are "fine to throw it away" if there is something better and more
widespread. Indeed, we have invested a lot of time in the operator but it
was not useless at all, we gained a lot of quite unique knowledge how to
put all pieces together. However, I think that this space is going to be
quite fragmented and "balkanized", which is not always a bad thing, but in
a quite narrow area as Kubernetes operator is, I just do not see how 4
operators are going to be beneficial for ordinary people ("official" from
community, ours, Datastax one and CassKop (without any significant order)).
Sure, innovation and healthy competition is important but to what
extent ...
One can start a Cassandra cluster on Kubernetes just so many times
differently and nobody really likes a vendor lock-in. People wanting to run
a cluster on K8S realise that there are three operators, each backed by a
private business entity, and the community operator is not there ... Huh,
interesting ... One may even start to question what is wrong with these
folks that it takes three companies to build their own solution.

Having said that, to my perception, Cassandra community just does not have
enough engineers nor contributors to keep 4 operators alive at the same
time (I wish I was wrong) so the idea of selecting the best one or to merge
obvious things and approaches together is understandable, even if it meant
we eventually sunset ours. In addition, nobody from big players is going to
contribute to the code base of the other one, for obvious reasons, so
channeling and directing this effort into something common for a community
seems to be the only reasonable way of cooperation.

It is quite hard to bootstrap this if the donation of the code in big
chunks / whole repo is out of question as it is not the "Apache way" (there
was some thread running here about this in more depth a while ago) and we
basically need to start from scratch which is quite demotivating, we are
just inventing the wheel and nobody is up to it. It is like people are
waiting for that to happen so they can jump in "once it is the thing" but
it will never materialise or at least the hurdle to kick it off is
unnecessarily high. Nobody is going to invest in this heavily if there is
already a working operator from companies mentioned above. As I understood
it, one reason of not choosing the way of donating it all is that "the
learning and community building should happen in organic manner and we just
can not accept the donation", but isn't it true that it is easier to build
a community around something which is already there rather than trying to
build around an idea which is quite hard to dedicate to?

On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org> wrote:

I think there's significant value to the community in trying to coalesce
on a single approach,

I agree. Unfortunately in this case, the parties with a vested interest and
written operators came to the table and couldn't agree to coalesce on a
single approach. John Sanda attempted to start an initiative to write a
best-of-breed combining choice parts of each operator, but that effort did
not gain traction.

Which is where my hypothesis comes from that if there were a clear "better
fit" operator to start from we wouldn't be in a deadlock; the correct
choice would be obvious. Reasonably so, every engineer that's written
something is going to want that something to be used and not thrown away in
favor of another something without strong evidence as to why that's the
better choice.

As far as I know, nobody has made a clear case as to a more compelling
place to start in terms of an operator donation the project then
collaborates on. There's no mass adoption evidence nor feature enumeration
that I know of for any of the approaches anyone's taken, so the discussions
remain stalled.

On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org> wrote:

I think there's significant value to the community in trying to coalesce
on a single approach, earlier than later. This is an opportunity to expand
the number of active organisations involved directly in the Apache
Cassandra project, as well as to more quickly expand the project's
functionality into an area we consider urgent and important. I think it
would be a real shame to waste this opportunity. No doubt it will be hard,
as organisations have certain built-in investments in their own approaches.

the different organisations together on a single approach if possible. Is
there anything the project can do to help this happen?

On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:

I think there is certainly an appetite to donate and standardise on a
given operator (as mentioned in this thread).

I personally found the SIG hard to participate in due to time zones and
the synchronous nature of it.

So while it was a great forum to dive into certain details for a subset of
participants and a worthwhile endeavour, I wouldn't paint it as an accurate
reflection of community intent.

I don't think that any participants want to continue down the path of "let
a thousand flowers bloom". That's why we are looking towards CassKop (as
well as a number of technical reasons).

Some of the recorded meetings and outputs can also be found if you are
interested in some primary sources:
https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG

From what I understand second-hand from talking to people on the SIG calls,
there was a general inability to agree on an existing operator as a
starting point and not much engagement on taking best of breed from the
various to combine them. Seems to leave us in the "let a thousand flowers
bloom" stage of letting operators grow in the ecosystem and seeing which
ones meet the needs of end users before talking about adopting one into the
foundation.

Great to hear that you folks are joining forces though! Bodes well for C*
users that are wanting to run things on k8s.

On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com> wrote:

For what it's worth, a quick update from me:

CassKop now has at least two organisations working on it substantially
(Orange and Instaclustr) as well as the numerous other contributors.

Internally we will also start pointing others towards CassKop once a few
things get merged. While we are not sunsetting our operator yet, it is
certainly looking that way.

I'd love to see the community adopt it as a starting point for working
towards whatever level of functionality is desired.

Cheers

Ben

On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:

On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:

There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
operators in the ecosystem. Has one of them hit a clear supermajority
adoption that makes it the de facto default and makes sense to pull it into
the project?

We as a project community were pretty slow to move on building a PoV around
kubernetes so we find ourselves in a situation with a bunch of contenders
for inclusion in the project. It's not clear to me what heuristics we'd use
to gauge which one would be the best fit for inclusion outside letting
community adoption speak.

---
Josh McKenzie

We actually talked a good bit on the SIG call earlier today about
heuristics. We need to document what functionality an operator should
include at level 0, level 1, etc. We did discuss this a good bit during
some of the initial SIG meetings, but I guess it wasn't really a focal
point at the time. I think we should also provide references to existing
operator projects and possibly other related projects. This would benefit
both community users as well as people working on these projects.

- John

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

_________________________________________________________________________________________________________________________

This message and its attachments may contain confidential or privileged
information that may be protected by law; they should not be distributed,
used or copied without authorisation. If you have received this email in
error, please notify the sender and delete this message and its
attachments. As emails may be altered, Orange is not liable for messages
that have been modified, changed or falsified. Thank you.


Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Ben Bromhead <be...@instaclustr.com>.

Thanks Frank and Christopher.

Sounds like we have a good path to consolidate around!

On Sat, Oct 3, 2020 at 11:12 AM Joshua McKenzie <jm...@apache.org>
wrote:

> how to best merge Casskop's features in Cass-operator.
>
> What if we create issues on the gh repo here
> https://github.com/datastax/cass-operator/issues, create a milestone out
> of that, and have engineers rally on it to get things merged? We have a
> few engineers focused on k8s ecosystem for Cassandra from the DataStax
> side who'd be happy to collaborate with you folks to get these things in.
>
>
> On Fri, Oct 02, 2020 at 11:34 AM, <fr...@orange.com> wrote:
>
> > An update on Orange's point of view following the recent emails:
> >
> > If we were a newly interested party in running C* in K8s, we would use
> > Cass-operator as it comes from Datastax.
> >
> > The logic would then be that the community embraces it and thanks
> > Datastax for offering it!
> >
> > So, on Orange side, we propose to discuss with Datastax how to best merge
> > Casskop's features in Cass-operator. These features are:
> > - node labelling to map any internal architecture (including network
> > specific labels for multi-dc setup)
> > - volumes & sidecars management (possibly linked to PodTemplateSpec)
> > - backup & restore (we ruled out velero and can share why we went with
> > Instaclustr but Medusa could work too)
> > - kubectl plugin integration (quite useful on the ops side without an
> > admin UI)
> > - multiCassKop evolution to drive multiple cass-operators instead of
> > multiple casskops (this could remain Orange internal if too specific)
> >
> > We could decide at the end of these discussions the best way forward.
> > Orange could make PRs on cass-operator, but only if we agree we want the
> > functionalities :)
> >
> > If we can sort it out we could end up with a pretty neat operator.
> >
> > We share a common architecture (operator-sdk) and are starting to know
> > each other through all these meetings, so it should be possible if we
> > want to!
> >
> > Would that be ok for the community and Datastax?
> >
> > On 2 Oct 2020, at 14:52, Joshua McKenzie <jm...@apache.org> wrote:
> >
> > What are next steps here?
> >
> > Maybe we collectively put a table together w/the 2 operators and a list
> > of features to compare and contrast? Enumerate the frameworks /
> > dependencies they have to help form a point of view about the strengths
> > and weaknesses of each option?
> >
> > On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford
> > <bradfordcp@gmail.com> wrote:
> >
> > Hello Dev list,
> >
> > I'm Chris Bradford, a Product Manager at DataStax working with the
> > cass-operator team. For background, we started down the path of
> > developing an operator internally to power our C*aaS platform, Astra.
> > Care was taken from day 1 to keep anything specific to this product at a
> > layer above cass-operator so it could solely focus on the task of
> > operating Cassandra clusters. With that being said, every single cluster
> > on Astra is provisioned and operated by cass-operator. The value of an
> > advanced operator to Cassandra users is tremendous so we decided to open
> > source the project (and associated components) with the goal of building
> > a community. It absolutely makes sense to offer this project and
> > codebase up for donation as a standard / baseline for running C* on
> > Kubernetes.
> >
> > Below you will find a collection of cass-operator features,
> > differentiators, and roadmap / inflight initiatives.
> >
> > Table-stakes
> >
> > Must-have functionality for a C* operator
> >
> > - Datacenter provisioning
> >   - Schedule all pods
> >   - Bootstrap nodes in the appropriate order
> >     - Seeds
> >     - Across racks
> >     - etc.
> >   - Uniform configuration
> > - Scale-up
> >   - Add new nodes in a balanced manner across racks
> > - Scale-down
> >   - Remove nodes one at a time across racks
> > - Node recovery
> >   - Restart process
> >   - Reschedule instance (IE replace node)
> >   - Replace instance
> >     - Specific workflows for seed node replacements
> > - Multi-DC / Multi-Rack
> > - Multi-Region / Multi-K8s Cluster
> >   - Note this requires support at a networking layer for pod to pod IP
> >     connectivity. This may be accomplished within the cluster with CNIs
> >     like Cilium or externally via traditional networking tools.
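
The scale-up and scale-down items in the list above (add nodes in a balanced manner across racks, remove them one at a time across racks) can be illustrated with a small sketch. This is not cass-operator's code: the function names, the rack names and the tie-breaking rule (first rack by name) are assumptions made only to show the idea of changing one node at a time while keeping racks even.

```python
def next_rack_for_scale_up(rack_sizes):
    """Pick the rack with the fewest nodes; ties go to the first rack by name."""
    return min(sorted(rack_sizes), key=lambda rack: rack_sizes[rack])


def next_rack_for_scale_down(rack_sizes):
    """Pick the rack with the most nodes; ties go to the first rack by name."""
    return max(sorted(rack_sizes), key=lambda rack: rack_sizes[rack])


def scale(rack_sizes, target_total):
    """Return the new rack sizes and the ordered single-node steps to get there."""
    if target_total < 0:
        raise ValueError("target_total must not be negative")
    sizes = dict(rack_sizes)  # work on a copy, leave the input untouched
    steps = []
    # Scale-up: always grow the currently smallest rack.
    while sum(sizes.values()) < target_total:
        rack = next_rack_for_scale_up(sizes)
        sizes[rack] += 1
        steps.append(("add", rack))
    # Scale-down: always shrink the currently largest rack, one node per step.
    while sum(sizes.values()) > target_total:
        rack = next_rack_for_scale_down(sizes)
        sizes[rack] -= 1
        steps.append(("remove", rack))
    return sizes, steps
```

Because every step touches exactly one node, a controller following this plan could wait for the node to finish joining or leaving the ring before moving on to the next step.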
> >
> > Differentiators
> >
> > - OSS Ecosystem / Components
> >   - Cass Config Builder - OSS project extracted from DataStax OpsCenter
> >     Life Cycle Manager to provide automated configuration file rendering
> >   - Cass Config Definitions - definitions files for cass-config-builder,
> >     defines all configuration files, their parameters, and templates
> >   - Management API for Apache Cassandra (MAAC)
> >   - Metrics Collector for Apache Cassandra (MCAC)
> >   - Reference Prometheus Operator CRDs
> >     - ServiceMonitor
> >     - Instance
> >   - Reference Grafana Operator CRDs
> >     - Instance
> >     - Dashboards
> >     - Datasource
> > - PodTemplateSpec
> >   - Customization of existing pods including support for adding
> >     containers, volumes, etc
> > - Advanced Networking
> >   - Node Port
> >   - Host Network
> > - Simple security
> >   - Management API mTLS support
> >   - Automated generation of keystore and truststore for internode and
> >     client to node TLS
> >   - Automated superuser account configuration
> >     - The default superuser (cassandra/cassandra) is disabled and never
> >       available to clients
> >     - Cluster administration account may be generated automatically (or
> >       provided) with values stored in a k8s secret
> > - Automatic application of NetworkTopologyStrategy with appropriate RF
> >   for system keyspaces
> > - Validating webhook
> >   - Invalid changes are rejected with a helpful message
> > - Rolling cluster updates
> >   - Change in binary (C* upgrade)
> >   - Change in configuration
> >   - Canary deployments - single rack application of changes for
> >     validation before broader deployment
> > - Rolling restart
> > - Platform Integration / Testing / Certification
> >   - Red Hat Openshift compatible and certified
> >     - Secure, Universal Base Image (UBI) foundation images with security
> >       scanning performed by Red Hat
> >       - cass-operator
> >       - cass-config-builder
> >       - apache-cassandra w/ MCAC and MAAC
> >     - Integration with Red Hat certification pipeline / marketplace
> >     - Presence in Red Hat Operator Hub built into OpenShift interface
> >   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and
> >     certified
> >     - Security scanning for images performed by VMware
> >   - Amazon EKS
> >   - Google GKE
> >   - Azure AKS
> > - Documentation / Reference Implementations
> >   - Cloud storage classes
> >   - Ingress solutions
> >   - Sample connection validation application with reference
> >     implementations of Java Driver client connection parameters
> > - Cluster-level Stop / Resume - stop all running instances while keeping
> >   persistent storage. Allows for scaling compute down to zero. Bringing
> >   the cluster back up follows expected startup procedures
> >
> > Road Map / Inflight
> >
> > 1. Repair
> >    1. Reaper integration
> > 2. Backups
> >    1. Velero integration
> >    2. Medusa integration
> > 3. Advanced Networking via sidecar
> >    1. Combination of proxy sidecars (a la Envoy) to allow for persistent
> >       IP addresses despite Kubernetes' best efforts to shuffle them.
> > 4. Single pod canary deployments
> > 5. Platform Certification
> >    1. VMware Project Pacific
> >    2. Rancher Kubernetes Engine (K3s)
> > 6. Documentation
> >    1. Multi-region
> >    2. Multi-cloud
> >    3. Additional ingress providers
> >       1. Voyager
> >       2. HAProxy
> >       3. Gloo
> >       4. Ambassador
> >       5. Envoy
> >       6. NGINX Ingress Controller
> >    4. Additional storage class references
> >       1. OpenEBS
> > 7. Cassandra Enhancements
> >    1. [#CASSANDRA-15823] Support for networking via identity instead of
> >       IP - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
> >
> > If there are further questions about the project, codebase, architecture,
> > etc. the team would be happy to dive in to the details and discuss more.
> >
> > Cheers,
> > ~Chris
> >
> > Christopher Bradford
> >
> > On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com>
> > wrote:
> >
> > I can agree with that Ben. Franck did a good job of outlining CassKop.
> > Somebody from the cass-operator will be posting something similar and we
> > can keep it on the mailing list.
> >
> > Patrick
> >
> > On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com>
> wrote:
> >
> > Thanks Frank and Stefan.
> >
> > @Patrick great suggestion and worthwhile getting everything on the table.
> >
> > One minor change I would advocate for. The SIG has been great to iterate
> > and interact on the details, but I really think this conversation given
> > the nature of the content needs to be on the mailing list. The mailing
> > list is really our system of record and the most accessible.
> >
> > It gives folk time to think and digest, it's asynchronous, easily
> > searchable and let's be honest, the majority of stakeholders in this are
> > not US based, so the timing issue then goes away and makes it easier for
> > people to participate in. I feel like we've made a lot more progress by
> > simply having this discussion here.
> >
> > So instead of a presentation, maybe just an email to the ML addressing
> > the headings that Patrick identified?
> >
> > On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic
> > <stefan.miklosovic@instaclustr.com> wrote:
> >
> > Hi,
> >
> > Patrick's suggestion seems good to me.
> >
> > I won't go into specifics here as I need to genuinely prepare for this.
> > It is quite hard to dig deep into the solutions of others and bring some
> > constructive criticism because it takes a lot of time to study it and
> > everybody has some "why's" behind it.
> >
> > To summarize my goals and concerns:
> >
> > 1) We should be as "Kubernetes operator idiomatic" as possible.
> > Industry standards, no custom brain-child of this or that group because
> > they think it is just cool or they just didn't know any better. I do NOT
> > say it is like that right now, I just want to be as ruthless as possible
> > here when it comes to functionality and why it is done like that. It is
> > awesome that we already have something up to date (thanks to John) and
> > it adheres to the latest releases. I personally had a hard time keeping
> > up with all the releases: once I finished something and aligned it,
> > after a week or two there was already another one where things were
> > different. It is a very fast-moving space and I hope that by the time we
> > develop something it will not be obsolete.
> >
> > 2) It may be easier said than done but it is guaranteed that people get
> > emotional, it's their precious etc, so please let's go into this with
> > good intentions, not trying to push one solution over the other just
> > because they would like to see it there ... I will have an equally hard
> > time complying with this point. My plan is to explain what is _wrong_
> > with our solution: where we made mistakes and what should be done
> > differently but it is "too late" etc. It is quite hard to describe your
> > work and all effort in this light but without telling what is wrong we
> > can not decide what is good imho.
> >
> > 3) We should put something together fast enough so we can call it a
> > release. We can always iterate on it for eternity. But the foundations
> > need to be there. Here I want to say that I especially like what John
> > did. I looked through these specs and it was obvious it has been written
> > with care and attention. It looked _solid_. I am not sure how hard it is
> > to put all other things on top of that, I truly do not, and here I think
> > we would have to reinvent that wheel if we want to proceed because I can
> > not imagine what it would be to retrofit e.g. CassKop on top of John's
> > specs. It is just like putting round pegs into square holes; maybe some
> > chunks would be reused easily but otherwise I worry we will be back at
> > square one.
> >
> > One specific feeling I have as I read this is that even if there is the
> > will to create the fourth operator, the respective parties will not be
> > able to drop their own repository. The whole point behind this effort,
> > to me, is to have a solid, community driven, stable, modern and feature
> > complete operator people are truly using. I can see that once this is
> > real, we will _really_ sunset our operator, redirecting people to the
> > new operator on main readme doc etc, we truly mean it. Sure, if somebody
> > comes and a bug fix is needed, we will fix it, but the whole point of
> > doing this is to stop using what we have currently, over time, otherwise
> > we are just splitting this space even more. If CassKop is not sure if
> > they will use it because they do not know if that operator will be
> > "enough" for them, aren't we just doing it wrong? If I exaggerate, they
> > should be fine with deleting the whole repository and using just this
> > Cassandra one we are going to make, otherwise I don't see the point to
> > work on this ...
> >
> > On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> > wrote:
> >
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does
> >
> > We should all talk a lot more, but this is 100% a mistake - I take the
> > blame for that. The intention has long been to offer cass-operator for
> > donation but it slipped through the cracks and your email yesterday made
> > me double-take.
> >
> > We have since resolved this misalignment. DataStax would be happy to
> > donate any and all of cass-operator to the ASF and C* project if it's
> > what we all agree best serves our collective Cassandra users. I'm also
> > cognizant that an immense amount of effort has gone into CassKop and we
> > seem to have something of an embarrassment of riches.
> >
> > I'm given to understand (haven't dug in personally) that the two
> > operators express pretty different opinions when it comes to frameworks,
> > designs, supported versions, etc. I think a discrete enumeration of the
> > feature set and "identities" of both could really help navigate this
> > conversation going forward.
> >
> > Also - thanks for that context Franck. It's always helpful to know where
> > other people are coming from when we're all working together towards a
> > common goal.
> >
> > On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
> >
> > I can share Orange’s view of the situation, sorry it is a long story!
> >
> > We started CassKop at the end of 2018 after betting on K8S, which was
> > not so simple as far as C* was concerned. Lack of support for local
> > storage, IPs that change all the time, different network plugins to try
> > to implement a non standard K8s way of having nodes see each other from
> > different dcs…
> >
> > We hesitated with Mesos but could not have both and K8S was already
> > gaining so much traction you could not not choose it.
> >
> > Anyway, we looked around and did not see anyone with such requirements
> > so we said: why not try it ourselves but on github so that we may give
> > it back to the community. We have used C* for quite a few years with
> > great success on production with massive load and perfect availability.
> > We love C* @ Orange :) Thanks!
> >
> > So we started writing support for mono-dc cluster (CassKop) and added
> > the multi dc support with MultiCassKop which is another operator
> > included in the CassKop repo. For more details we tried to document our
> > designs as much as possible here:
> >
> > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> >
> > In the middle of last year we had some talks with Datastax about working
> > together around their new management sidecar. Their position on open
> > source was not clear at that time so we said please come back when you
> > have decided to go open source with it. Which they did in the beginning
> > of this year. But at that time I guess work had started on cass-operator
> > so we kept our separate ways.
> >
> > Since the beginning of the year, we have been working with our OPS team
> > to have it in production. It is not simple as the team has to learn K8S
> > and trust a newborn operator. This takes time especially as our internal
> > cluster has been tweaked for multi-tenancy with obscure options being
> > set by our K8s team…
> >
> > We also developed with Instaclustr the Backup & Restore functionality
> > (we have new CRDs (Custom Resource Definition) for backup and restore
> > and a reconcile loop that calls out to the Instaclustr sidecar for these
> > operations). We now support multiple backups in parallel and can write
> > to s3 / google or azure (but Stefan could give more details here if
> > needed)
> >
> > During the SIG calls we mentioned our desire to donate CassKop once it
> > satisfies our basic requirements (v1 coming just now but I said it too
> > many times already). I am actually not sure Datastax mentioned their
> > desire to donate cass-operator but we decided to compare the designs and
> > the functionalities based on respective CRDs. The CRD is the interface
> > with the user as it is where you describe the cluster that you want to
> > have. These talks were very interesting and we found out that the
> > CassKop team had made good choices most of the time but was maybe too
> > open. Indeed our intention was to give all the possibilities for our OPS
> > team to work. This includes:
> >
> > - very open topology definition using any configuration of labels to map
> > dcs / racks and nodes to labels on clusters (we have labels on dcs /
> > rooms / rows and server racks so we can map C* racks to storage or
> > network arrays internally)
> > - possibility to have multiple C* nodes on a single K8S host (because
> > internal clouds are not really clouds, they have limited resources)
> > - custom C* image selection,
> > - custom bootstrap script that lets you configure C* as you want using
> > ConfigMaps,
> > - the ability to mount different volumes wherever they wanted,
> > - the possibility to run any number of sidecars alongside C* for custom
> > probes in our case
> >
> > This makes CassKop quite powerful and flexible.
> > We made sure that all those options are not enabled by default so one
> > can just pop a simple 3 node cluster quickly
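
The label-based topology idea in the list above (map C* dcs and racks onto arbitrary labels carried by Kubernetes nodes) can be sketched roughly as follows. This is only an illustration and not CassKop's CRD schema or code: the label keys under `example.com/`, the node names and the `nodes_for_rack` helper are all hypothetical.

```python
def nodes_for_rack(rack_labels, k8s_nodes):
    """Return the sorted names of nodes whose labels include every rack label."""
    return sorted(
        name
        for name, labels in k8s_nodes.items()
        if all(labels.get(key) == value for key, value in rack_labels.items())
    )


# Hypothetical Kubernetes nodes and their labels (room / row of the server).
K8S_NODES = {
    "node-a": {"example.com/room": "r1", "example.com/row": "1"},
    "node-b": {"example.com/room": "r1", "example.com/row": "2"},
    "node-c": {"example.com/room": "r2", "example.com/row": "1"},
}

# Hypothetical topology: each (dc, rack) pair is described by the labels a
# node must carry to host C* pods of that rack.
TOPOLOGY = {
    ("dc1", "rack1"): {"example.com/room": "r1", "example.com/row": "1"},
    ("dc1", "rack2"): {"example.com/room": "r1", "example.com/row": "2"},
    ("dc2", "rack1"): {"example.com/room": "r2"},
}

# Eligible nodes per (dc, rack), derived purely from label matching.
PLACEMENT = {
    rack: nodes_for_rack(labels, K8S_NODES) for rack, labels in TOPOLOGY.items()
}
```

In a real cluster this kind of matching is normally left to the Kubernetes scheduler through node selectors or affinity rules; the sketch only shows why free-form labels make the topology definition so open.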
> >
> > On the other hand cass-operator had an interesting way of configuring C*
> > just inside the CRD using cass-config. This is simple and elegant so we
> > are implementing it as well for the support of C* 4
> >
> > Now for the future, there are 3 choices in my opinion:
> > - start from scratch (or John’s repo) by cherry picking bits from all
> > operators. This is possible but will take some time / effort to have
> > something usable. And then it will be compared to cass-operator and
> > CassKop. I don’t see Orange contributing too much here as we believe
> > CassKop to be a much better starting point
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does. I think Orange could contribute some bits inherited from CassKop
> > if it is agreed by the community. Not sure it would be enough for us to
> > use it.
> > - choose CassKop: we would be delighted to donate it and contribute with
> > some committers (including the original author who now works for AWS).
> > It would then become the community operator but there would be
> > cass-operator alongside probably. But Cass-operator is made to make it
> > easier for Datastax to manage customer clusters by imposing some
> > configuration. It makes sense for their needs, so maybe 2 operators. We
> > don’t know how backup/restore will be handled here with medusa being
> > adapted to K8s
> >
> > Sorry again for being long but 2 years of work deserve some lines of
> > text :)
> >
> > I just saw your message Patrick but this was written already so we gain
> > a week.
> >
> > Franck
> >
> > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com>
> > wrote:
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It
> > doesn't seem like it can hurt at this point, anyway.
> >
> > +1
> >
> > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > Perhaps it helps to widen the field of discussion to the dev list?
> >
> > It might help if each of the stakeholder organisations state their view
> > on the situation, including why they would or would not support a given
> > approach/operator, and what (preferably specific) circumstances might
> > lead them to change their mind?
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It
> > doesn't seem like it can hurt at this point, anyway.
> >
> > On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
> >
> > I want to point out that pretty much everything being discussed in this
> > thread has been discussed at length during the SIG meetings. I think it
> > is worth noting because we are pretty much still having the same
> > conversation.
> >
> > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > I don't think there's anything about a code drop that's not "The Apache
> > Way"
> >
> > If there's a consensus (or even strong majority) amongst invested
> > parties, I don't see why we could not adopt an operator directly into
> > the project.
> >
> > It's possible a green field approach might lead to fewer hard feelings,
> > as everyone is in the same boat. Perhaps all operators are also
> > suboptimal and could be improved with a rewrite? But I think
> > coordinating a lot of different entities around an empty codebase is
> > particularly challenging. I actually think it could be better for
> > cohesion and collaboration to have a suboptimal but substantive starting
> > point.
> >
> > On 23/09/2020, 16:11, "Stefan Miklosovic"
> > <stefan.miklosovic@instaclustr.com> wrote:
> >
> > I think that from Instaclustr it was stated quite clearly multiple times
> > that we are "fine to throw it away" if there is something better and
> > more wide-spread. Indeed, we have invested a lot of time in the operator
> > but it was not useless at all, we gained a lot of quite unique knowledge
> > how to put all pieces together. However, I think that this space is
> > going to be quite fragmented and "balkanized", which is not always a bad
> > thing, but in a quite narrow area as Kubernetes operator is, I just do
> > not see how 4 operators are going to be beneficial for ordinary people
> > ("official" from community, ours, Datastax one and CassKop (without any
> > significant order)). Sure, innovation and healthy competition is
> > important but to what extent ...
> > One can start a Cassandra cluster on Kubernetes just so many times
> > differently and nobody really likes a vendor lock-in. People wanting to
> > run a cluster on K8S realise that there are three operators, each backed
> > by a private business entity, and the community operator is not there
> > ... Huh, interesting ... One may even start to question what is wrong
> > with these folks that it takes three companies to build their own
> > solution.
> >
> > Having said that, to my perception, Cassandra community just does not
> > have enough engineers nor contributors to keep 4 operators alive at the
> > same time (I wish I was wrong) so the idea of selecting the best one or
> > to merge obvious things and approaches together is understandable, even
> > if it meant we eventually sunset ours. In addition, nobody from big
> > players is going to contribute to the code base of the other one, for
> > obvious reasons, so channeling and directing this effort into something
> > common for a community seems to be the only reasonable way of
> > cooperation.
> >
> > It is quite hard to bootstrap this if the donation of the code in big
> > chunks / whole repo is out of question as it is not the "Apache way"
> > (there was some thread running here about this in more depth a while
> > ago) and we basically need to start from scratch which is quite
> > demotivating, we are just reinventing the wheel and nobody is up to it.
> > It is like people are waiting for that to happen so they can jump in
> > "once it is the thing" but it will never materialise or at least the
> > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > in this heavily if there is already a working operator from companies
> > mentioned above. As I understood it, one reason for not choosing the way
> > of donating it all is that "the learning and community building should
> > happen in organic manner and we just can not accept the donation", but
> > is it not true that it is easier to build a community around something
> > which is already there rather than trying to build it around an idea
> > which is quite hard to dedicate to?
> >
> > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org>
> > wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach,
> >
> > I agree. Unfortunately in this case, the parties with a vested interest
> > and written operators came to the table and couldn't agree to coalesce on
> > a single approach. John Sanda attempted to start an initiative to write a
> > best-of-breed combining choice parts of each operator, but that effort
> > did not gain traction.
> >
> > Which is where my hypothesis comes from that if there were a clear
> > "better fit" operator to start from we wouldn't be in a deadlock; the
> > correct choice would be obvious. Reasonably so, every engineer that's
> > written something is going to want that something to be used and not
> > thrown away in favor of another something without strong evidence as to
> > why that's the better choice.
> >
> > As far as I know, nobody has made a clear case as to a more compelling
> > place to start in terms of an operator donation the project then
> > collaborates on. There's no mass adoption evidence nor feature
> > enumeration that I know of for any of the approaches anyone's taken, so
> > the discussions remain stalled.
> >
> > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > <benedict@apache.org> wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach, earlier than later. This is an opportunity to
> > expand the number of active organisations involved directly in the Apache
> > Cassandra project, as well as to more quickly expand the project's
> > functionality into an area we consider urgent and important. I think it
> > would be a real shame to waste this opportunity. No doubt it will be
> > hard, as organisations have certain built-in investments in their own
> > approaches.
> >
> > I haven't participated in these calls as I do not consider myself to have
> > the relevant experience and expertise, and have other focuses on the
> > project. I just wanted to voice a vote in favour of trying to bring the
> > different organisations together on a single approach if possible. Is
> > there anything the project can do to help this happen?
> >
> > On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com> wrote:
> >
> > I think there is certainly an appetite to donate and standardise on a
> > given operator (as mentioned in this thread).
> >
> > I personally found the SIG hard to participate in due to time zones and
> > the synchronous nature of it.
> >
> > So while it was a great forum to dive into certain details for a subset
> > of participants and a worthwhile endeavour, I wouldn't paint it as an
> > accurate reflection of community intent.
> >
> > I don't think that any participants want to continue down the path of
> > "let a thousand flowers bloom". That's why we are looking towards CassKop
> > (as well as a number of technical reasons).
> >
> > Some of the recorded meetings and outputs can also be found if you are
> > interested in some primary sources:
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> >
> > From what I understand second-hand from talking to people on the SIG
> > calls, there was a general inability to agree on an existing operator as
> > a starting point and not much engagement on taking best of breed from the
> > various to combine them. Seems to leave us in the "let a thousand flowers
> > bloom" stage of letting operators grow in the ecosystem and seeing which
> > ones meet the needs of end users before talking about adopting one into
> > the foundation.
> >
> > Great to hear that you folks are joining forces though! Bodes well for C*
> > users that are wanting to run things on k8s.
> >
> > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com>
> > wrote:
> >
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> > more operators in the ecosystem. Has one of them hit a clear
> > supermajority of adoption that makes it the de facto default and makes
> > sense to pull it into the project?
> >
> > We as a project community were pretty slow to move on building a PoV
> > around kubernetes so we find ourselves in a situation with a bunch of
> > contenders for inclusion in the project. It's not clear to me what
> > heuristics we'd use to gauge which one would be the best fit for
> > inclusion outside letting community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > For additional commands, e-mail: dev-help@cassandra.apache.org
> >
> > --
> >
> > - John
> >
> > _________________________________________________________________________
> >
> > This message and its attachments may contain confidential or privileged
> > information that may be protected by law; they should not be distributed,
> > used or copied without authorisation. If you have received this email in
> > error, please notify the sender and delete this message and its
> > attachments. As emails may be altered, Orange is not liable for messages
> > that have been modified, changed or falsified. Thank you.
> >
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

how to best merge Casskop's features in Cass-operator.

What if we create issues on the gh repo here
https://github.com/datastax/cass-operator/issues, create a milestone out of
that, and have engineers rally on it to get things merged? We have a few
engineers focused on k8s ecosystem for Cassandra from the DataStax side
who'd be happy to collaborate with you folks to get these things in.


On Fri, Oct 02, 2020 at 11:34 AM, <fr...@orange.com> wrote:

> An update on Orange's point of view following the recent emails:
>
> If we were a newly interested party in running C* in K8s, we would use
> Cass-operator as it comes from Datastax.
>
> The logic would then be that the community embraces it and thanks Datastax
> for offering it!
>
> So, on Orange side, we propose to discuss with Datastax how to best merge
> Casskop's features in Cass-operator. These features are:
> - node labelling to map any internal architecture (including
> network-specific labels for multi-DC setups)
> - volumes & sidecars management (possibly linked to PodTemplateSpec)
> - backup & restore (we ruled out velero and can share why we went with
> Instaclustr but Medusa could work too)
> - kubectl plugin integration (quite useful on the ops side without an
> admin UI)
> - multiCassKop evolution to drive multiple cass-operators instead of
> multiple casskops (this could remain Orange internal if too specific)
>
> We could decide at the end of these discussions the best way forward.
> Orange could make PRs on cass-operator, but only if we agree we want the
> functionalities :)
>
> If we can sort it out we could end up with a pretty neat operator.
>
> We share a common architecture (operator-sdk) and are starting to know each
> other through all these meetings, so it should be possible if we want it to!
>
> Would that be ok for the community and Datastax?
>
> On 2 Oct 2020, at 14:52, Joshua McKenzie <jm...@apache.org> wrote:
>
> What are next steps here?
>
> Maybe we collectively put a table together w/the 2 operators and a list of
> features to compare and contrast? Enumerate the frameworks / dependencies
> they have to help form a point of view about the strengths and weaknesses
> of each option?
>
> On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford
> <bradfordcp@gmail.com> wrote:
>
> Hello Dev list,
>
> I'm Chris Bradford, a Product Manager at DataStax working with the
> cass-operator team. For background, we started down the path of developing
> an operator internally to power our C*aaS platform, Astra. Care was taken
> from day 1 to keep anything specific to this product at a layer above
> cass-operator so it could solely focus on the task of operating Cassandra
> clusters. With that being said, every single cluster on Astra is
> provisioned and operated by cass-operator. The value of an advanced
> operator to Cassandra users is tremendous so we decided to open source the
> project (and associated components) with the goal of building a community.
> It absolutely makes sense to offer this project and codebase up for
> donation as a standard / baseline for running C* on Kubernetes.
>
> Below you will find a collection of cass-operator features,
> differentiators, and roadmap / inflight initiatives.
>
> Table-stakes
>
> Must-have functionality for a C* operator
>
> - Datacenter provisioning
>   - Schedule all pods
>   - Bootstrap nodes in the appropriate order
>     - Seeds
>     - Across racks
>     - etc.
>   - Uniform configuration
> - Scale-up
>   - Add new nodes in a balanced manner across racks
> - Scale-down
>   - Remove nodes one at a time across racks
> - Node recovery
>   - Restart process
>   - Reschedule instance (i.e. replace node)
>   - Replace instance
>   - Specific workflows for seed node replacements
> - Multi-DC / Multi-Rack
> - Multi-Region / Multi-K8s Cluster
>   - Note this requires support at a networking layer for pod to pod IP
>     connectivity. This may be accomplished within the cluster with CNIs
>     like Cilium or externally via traditional networking tools.
>
> Differentiators
>
> - OSS Ecosystem / Components
>   - Cass Config Builder - OSS project extracted from DataStax OpsCenter
>     Life Cycle Manager to provide automated configuration file rendering
>   - Cass Config Definitions - definitions files for cass-config-builder,
>     defines all configuration files, their parameters, and templates
>   - Management API for Apache Cassandra (MAAC)
>   - Metrics Collector for Apache Cassandra (MCAC)
>   - Reference Prometheus Operator CRDs
>     - ServiceMonitor
>     - Instance
>   - Reference Grafana Operator CRDs
>     - Instance
>     - Dashboards
>     - Datasource
> - PodTemplateSpec
>   - Customization of existing pods including support for adding
>     containers, volumes, etc
> - Advanced Networking
>   - Node Port
>   - Host Network
> - Simple security
>   - Management API mTLS support
>   - Automated generation of keystore and truststore for internode and
>     client to node TLS
>   - Automated superuser account configuration
>     - The default superuser (cassandra/cassandra) is disabled and never
>       available to clients
>     - Cluster administration account may be automatically generated (or
>       provided) with values stored in a k8s secret
> - Automatic application of NetworkTopologyStrategy with appropriate RF for
>   system keyspaces
> - Validating webhook
>   - Invalid changes are rejected with a helpful message
> - Rolling cluster updates
>   - Change in binary (C* upgrade)
>   - Change in configuration
>   - Canary deployments - single rack application of changes for validation
>     before broader deployment
>   - Rolling restart
> - Platform Integration / Testing / Certification
>   - Red Hat Openshift compatible and certified
>     - Secure, Universal Base Image (UBI) foundation images with security
>       scanning performed by Red Hat
>       - cass-operator
>       - cass-config-builder
>       - apache-cassandra w/ MCAC and MAAC
>     - Integration with Red Hat certification pipeline / marketplace
>     - Presence in Red Hat Operator Hub built into OpenShift interface
>   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and
>     certified
>     - Security scanning for images performed by VMware
>   - Amazon EKS
>   - Google GKE
>   - Azure AKS
> - Documentation / Reference Implementations
>   - Cloud storage classes
>   - Ingress solutions
>   - Sample connection validation application with reference
>     implementations of Java Driver client connection parameters
> - Cluster-level Stop / Resume - stop all running instances while keeping
>   persistent storage. Allows for scaling compute down to zero. Bringing the
>   cluster back up follows expected startup procedures
>
> Road Map / Inflight
>
> 1. Repair
>    1. Reaper integration
> 2. Backups
>    1. Velero integration
>    2. Medusa integration
> 3. Advanced Networking via sidecar
>    1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
>       addresses despite Kubernetes' best efforts to shuffle them.
> 4. Single pod canary deployments
> 5. Platform Certification
>    1. VMware Project Pacific
>    2. Rancher Kubernetes Engine (K3s)
> 6. Documentation
>    1. Multi-region
>    2. Multi-cloud
>    3. Additional ingress providers
>       1. Voyager
>       2. HAProxy
>       3. Gloo
>       4. Ambassador
>       5. Envoy
>       6. NGINX Ingress Controller
>    4. Additional storage class references
>       1. OpenEBS
> 7. Cassandra Enhancements
>    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
>       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
>
> If there are further questions about the project, codebase, architecture,
> etc. the team would be happy to dive in to the details and discuss more.
>
> Cheers,
> ~Chris
>
> Christopher Bradford
>
> On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com>
> wrote:
>
> I can agree with that Ben. Franck did a good job of outlining CassKop.
> Somebody from the cass-operator will be posting something similar and we
> can keep it on the mailing list.
>
> Patrick
>
> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:
>
> Thanks Franck and Stefan.
>
> @Patrick great suggestion and worthwhile getting everything on the table.
>
> One minor change I would advocate for. The SIG has been great to iterate
> and interact on the details, but I really think this conversation, given
> the nature of the content, needs to be on the mailing list. The mailing
> list is really our system of record and the most accessible.
>
> It gives folk time to think and digest, it's asynchronous, easily
> searchable and let's be honest, the majority of stakeholders in this are
> not US based, so the timing issue then goes away and makes it easier for
> people to participate in. I feel like we've made a lot more progress by
> simply having this discussion here.
>
> So instead of a presentation, maybe just an email to the ML addressing the
> headings that Patrick identified?
>
> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic
> <stefan.miklosovic@instaclustr.com> wrote:
>
> Hi,
>
> Patrick's suggestion seems good to me.
>
> I won't go into specifics here as I need to genuinely prepare for this. It
> is quite hard to dig deep into the solutions of others and bring some
> constructive criticism because it takes a lot of time to study it and
> everybody has some "why's" behind it.
>
> To summarize my goals and concerns:
>
> 1) We should be as much "Kubernetes operator idiomatic" as possible.
> Industry standards, no custom brain-child of this or that group because
> they think it is just cool or they just didn't know any better. I do NOT
> say it is like that right now, I just want to be ruthless here as much as
> possible when it comes to functionality and why it is done like that. It is
> awesome that we have already something latest (thanks to John) and it
> adheres to the latest releases. I personally had a hard time to keep up
> with all the releases, once I finished something and I aligned it, after a
> week or two there was already another one where things were different, it
> is a very fast-moving space and I hope that by time we develop something it
> will not be obsolete.
>
> 2) It may be easier said than done but it is guaranteed that people get
> emotional, it's their precious etc, so please let's go into this with good
> intentions, not trying to push one solution over the other just because
> they would like to see it there ... I will have an equally hard time to
> comply with this point. My plan is to explain what is _wrong_ with our
> solution. Where we made mistakes and what should be done differently but it
> is "too late" etc. It is quite hard to describe your work and all effort in
> this light but without telling what is wrong we can not decide what is good
> imho.
>
> 3) We should put something together fast enough so we can call it a
> release. We can always iterate on it for eternity. But the foundations need
> to be there. Here I want to say that I especially like what John did. I
> looked through these specs and it was obvious it has been written with care
> and attention. It looked _solid_. I am not sure how hard it is to put all
> the other things on top of that, I truly am not, and here I think we would
> have to reinvent the wheel if we want to proceed, because I cannot imagine
> what it would take to retrofit e.g. CassKop on top of John's specs. It is
> like putting round pegs into square holes: maybe some chunks would be
> reused easily, but otherwise I worry we will be back at square one.
>
> One specific feeling I have as I read this is that even if there is the
> will to create the fourth operator, the respective parties will not be able
> to drop their own repository. The whole point behind this effort, to me, is
> to have a solid, community driven, stable, modern and feature complete
> operator people are truly using. I can see that once this is real, we will
> _really_ sunset our operator, redirecting people to the new operator on
> main readme doc etc, we truly mean it. Sure, if somebody comes and bug fix
> will be needed, we will fix it, but the whole point of doing this is to
> stop using what we have currently, over time, otherwise we are just
> splitting this space even more. If CassKop is not sure if they will use it
> because they do not know if that operator will be "enough" for them, aren't
> we just doing it wrong? If I exaggerate, they should be fine with deleting
> the whole repository and using just this Cassandra one we are going to make
> otherwise I don't see the point to work on this ...
>
> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> wrote:
>
> - choose cass-operator: it is not on offer right now so let’s see if it
> does
>
> We should all talk a lot more, but this is 100% a mistake - I take the
> blame for that. The intention has long been to offer cass-operator for
> donation but it slipped through the cracks and your email yesterday made
> me double-take.
>
> We have since resolved this misalignment. DataStax would be happy to
> donate any and all of cass-operator to the ASF and C* project if it's what
> we all agree best serves our collective Cassandra users. I'm also
> cognizant that an immense amount of effort has gone into CassKop and we
> seem to have something of an embarrassment of riches.
>
> I'm given to understand (haven't dug in personally) that the two operators
> express pretty different opinions when it comes to frameworks, designs,
> supported versions, etc. I think a discrete enumeration of the feature set
> and "identities" of both could really help navigate this conversation
> going forward.
>
> Also - thanks for that context Franck. It's always helpful to know where
> other people are coming from when we're all working together towards a
> common goal.
>
> On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
>
> I can share Orange’s view of the situation, sorry it is a long story!
>
> We started CassKop at the end of 2018 after betting on K8S, which was not
> so simple as far as C* was concerned. Lack of support for local storage,
> IPs that change all the time, different network plugins to try to
> implement a non-standard K8s way of having nodes see each other from
> different DCs… We hesitated with Mesos but could not have both, and K8S
> was already gaining so much traction that you could not not choose it.
>
> Anyway, we looked around and did not see anyone with such requirements, so
> we said: why not try it ourselves, but on GitHub so that we may give it
> back to the community. We have used C* for quite a few years with great
> success in production with massive load and perfect availability. We love
> C* @ Orange :) Thanks!
>
> So we started writing support for mono-dc cluster (CassKop) and added the
> multi dc support with MultiCassKop which is another operator included in
> the CassKop repo. For more details we tried to document our designs as
> much as possible here:
>
> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
>
> In the middle of last year we had some talks with Datastax about working
> together around their new management sidecar. Their position on open
> source was not clear at that time, so we said please come back when you
> have decided to go open source with it. Which they did at the beginning of
> this year. But at that time I guess work had started on cass-operator, so
> we kept our separate ways.
>
> Since the beginning of the year, we have been working with our OPS team
> to have it in production. It is not simple as the team has to learn K8S
> and trust a newborn operator. This takes time, especially as our internal
> cluster has been tweaked for multi-tenancy with obscure options being set
> by our K8s team…
>
> We also developed the Backup & Restore functionality with Instaclustr (we
> have new CRDs (Custom Resource Definitions) for backup and restore and a
> reconcile loop that calls out to the Instaclustr sidecar for these
> operations). We now support multiple backups in parallel and can write to
> S3, Google or Azure (but Stefan could give more details here if needed).
>
> During the SIG calls we mentioned our desire to donate CassKop once it
> satisfies our basic requirements (v1 coming just now, but I said it too
> many times already). I am actually not sure Datastax mentioned their
> desire to donate cass-operator, but we decided to compare the designs and
> the functionalities based on respective CRDs. The CRD is the interface
> with the user as it is where you describe the cluster that you want to
> have. These talks were very interesting and we found out that the CassKop
> team had made good choices most of the time but was maybe too open. Indeed
> our intention was to give all the possibilities for our OPS team to work.
> This includes:
>
> - very open topology definition using any configuration of labels to map
> dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> / rows and server racks so we can map C* racks to storage or network
> arrays internally)
> - possibility to have multiple C* nodes on a single K8S host (because
> internal clouds are not really clouds, they have limited resources)
> - custom C* image selection,
> - custom bootstrap script that lets you configure C* as you want using
> ConfigMaps,
> - the ability to mount different volumes wherever they wanted,
> - the possibility to run any number of sidecars alongside C* for custom
> probes in our case
>
> This makes CassKop quite powerful and flexible.
> We made sure that all those options are not enabled by default so one can
> just pop a simple 3 node cluster quickly.
>
> On the other hand cass-operator had an interesting way of configuring C*
> just inside the CRD using cass-config. This is simple and elegant so we
> are implementing it as well for the support of C* 4
>
> Now for the future, there are 3 choices in my opinion:
> - start from scratch (or John’s repo) by cherry picking bits from all
> operators. This is possible but will take some time / effort to have
> something usable. And then it will be compared to cass-operator and
> CassKop. I don’t see Orange contributing too much here as we believe
> CassKop to be a much better starting point
> - choose cass-operator: it is not on offer right now so let’s see if it
> does. I think Orange could contribute some bits inherited from CassKop if
> it is agreed by the community. Not sure it would be enough for us to use
> it.
> - choose CassKop: we would be delighted to donate it and contribute with
> some committers (including the original author who now works for AWS). It
> would then become the community operator but there would probably be
> cass-operator alongside. But cass-operator is made to make it easier for
> Datastax to manage customer clusters by imposing some configuration. It
> makes sense for their needs, so maybe 2 operators. We don’t know how
> backup/restore will be handled here with medusa being adapted to K8s
>
> Sorry again for being long but 2 years of work deserve some lines of text
> :)
>
> I just saw your message Patrick but this was written already so we gain a
> week.
>
> Franck
>
> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com>
> wrote:
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It
> doesn't seem like it can hurt at this point, anyway.
>
> +1
>
> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> <benedict@apache.org> wrote:
>
> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It
> doesn't seem like it can hurt at this point, anyway.
>
> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
>
> I want to point out that pretty much everything being discussed in this
> thread has been discussed at length during the SIG meetings. I think it is
> worth noting because we are pretty much still having the same
> conversation.
>
> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> <benedict@apache.org> wrote:
>
> I don't think there's anything about a code drop that's not "The Apache
> Way"
>
> If there's a consensus (or even strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal
> and could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> instaclustr.com<ma...@instaclustr.com>> wrote:
>
> I think that from Instaclustr it was stated quite clearly multiple times
> that we are "fine to throw it away" if there is something better and more
> wide-spread. Indeed, we have invested a lot of time in the operator but it
> was not useless at all, we gained a lot of quite unique knowledge how to
> put all pieces together. However, I think that this space is going to be
> quite fragmented and "balkanized", which is not always a bad thing, but in
> a quite narrow area as Kubernetes operator is, I just do not see how 4
> operators are going to be beneficial for ordinary people ("official" from
> community, ours, Datastax one and CassKop (without any significant order)).
> Sure, innovation and healthy competition is important but to what extent ...
> One can start a Cassandra cluster on Kubernetes just so many times
> differently and nobody really likes a vendor lock-in. People wanting to run
> a cluster on K8S realise that there are three operators, each backed by a
> private business entity, and the community operator is not there ... Huh,
> interesting ... One may even start to question what is wrong with these
> folks that it takes three companies to build their own solution.
>
> Having said that, to my perception, Cassandra community just does not have
> enough engineers nor contributors to keep 4 operators alive at the same
> time (I wish I was wrong) so the idea of selecting the best one or to merge
> obvious things and approaches together is understandable, even if it meant
> we eventually sunset ours. In addition, nobody from big players is going to
> contribute to the code base of the other one, for obvious reasons, so
> channeling and directing this effort into something common for a community
> seems to be the only reasonable way of cooperation.
>
> It is quite hard to bootstrap this if the donation of the code in big
> chunks / whole repo is out of question as it is not the "Apache way" (there
> was some thread running here about this in more depth a while ago) and we
> basically need to start from scratch which is quite demotivating, we are
> just reinventing the wheel and nobody is up to it. It is like people are
> waiting for that to happen so they can jump in "once it is the thing" but
> it will never materialise or at least the hurdle to kick it off is
> unnecessarily high. Nobody is going to invest in this heavily if there is
> already a working operator from companies mentioned above. As I understood
> it, one reason of not choosing the way of donating it all is that "the
> learning and community building should happen in organic manner and we just
> can not accept the donation", but is not it true that it is easier to build
> a community around something which is already there rather than trying to
> build it around an idea which is quite hard to dedicate to?
>
> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org
> <ma...@apache.org>> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach,
>
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <
> benedict@apache.org<ma...@apache.org> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach, earlier than later. This is an opportunity to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I think it
> would be a real shame to waste this opportunity. No doubt it will be hard,
> as organisations have certain built-in investments in their own approaches.
>
> I haven't participated in these calls as I do not consider myself to have
> the relevant experience and expertise, and have other focuses on the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible. Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:
> ben@instaclustr.com>> wrote:
>
> I think there is certainly an appetite to donate and standardise on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found if you are
> interested in some primary sources:
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>
> From what I understand second-hand from talking to people on the SIG calls,
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com
> <ma...@instaclustr.com> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
> --
>
> - John
>
> _________________________________________________________________________________________________________________________
>
>
> Ce message et ses pieces jointes peuvent contenir des informations
> confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> exploites ou copies sans autorisation. Si vous avez recu ce message par
> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
> pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou
> falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law; they should not be distributed,
> used or copied without authorisation. If you have received this email in
> error, please notify the sender and delete this message and its
> attachments. As emails may be altered, Orange is not liable for messages
> that have been modified, changed or falsified. Thank you.
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Christopher Bradford <br...@gmail.com>.

Hello Franck,

This sounds like a great plan. We would love to expand the group of
contributors and work towards getting the combined efforts pulled into the
Apache Cassandra project proper. The items listed here are all wins
in our book (and we've even said how much we enjoyed the CassKop labelling
interface).

Cheers,
~Chris

Christopher Bradford



On Fri, Oct 2, 2020 at 11:35 AM <fr...@orange.com> wrote:

> An update on Orange's point of view following the recent emails:
>
> If we were a newly interested party in running C* in K8s, we would use
> Cass-operator as it comes from Datastax.
>
> The logic would then be that the community embraces it and thanks Datastax
> for offering it!
>
> So, on Orange's side, we propose to discuss with Datastax how best to merge
> Casskop's features into Cass-operator.
> These features are:
> - node labelling to map any internal architecture (including network
> specific labels for multi-dc setup)
> - volumes & sidecars management (possibly linked to PodTemplateSpec)
> - backup & restore (we ruled out velero and can share why we went with
> Instaclustr but Medusa could work too)
> - kubectl plugin integration (quite useful on the ops side without an
> admin UI)
> - multiCassKop evolution to drive multiple cass-operators instead of
> multiple casskops (this could remain Orange internal if too specific)
>
> We could decide at the end of these discussions the best way forward.
> Orange could make PRs on cass-operator, but only if we agree we want the
> functionalities :)
>
> If we can sort it out we could end up with a pretty neat operator.
>
> We share a common architecture (operator-sdk) and are starting to know each
> other through all these meetings, so it should be possible if we want to!
>
> Would that be ok for the community and Datastax?
>
>
>
> > On 2 Oct 2020, at 14:52, Joshua McKenzie <jm...@apache.org> wrote:
> >
> > What are next steps here?
> >
> > Maybe we collectively put a table together w/the 2 operators and a list of
> > features to compare and contrast? Enumerate the frameworks / dependencies
> > they have to help form a point of view about the strengths and weaknesses
> > of each option?
> >
> >
> > On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford
> > <bradfordcp@gmail.com> wrote:
> >
> >> Hello Dev list,
> >>
> >> I'm Chris Bradford, a Product Manager at DataStax working with the
> >> cass-operator team. For background, we started down the path of developing
> >> an operator internally to power our C*aaS platform, Astra. Care was taken
> >> from day 1 to keep anything specific to this product at a layer above
> >> cass-operator so it could solely focus on the task of operating Cassandra
> >> clusters. With that being said, every single cluster on Astra is
> >> provisioned and operated by cass-operator. The value of an advanced
> >> operator to Cassandra users is tremendous so we decided to open source the
> >> project (and associated components) with the goal of building a community.
> >> It absolutely makes sense to offer this project and codebase up for
> >> donation as a standard / baseline for running C* on Kubernetes.
> >>
> >> Below you will find a collection of cass-operator features,
> >> differentiators, and roadmap / inflight initiatives.
> >>
> >> Table-stakes
> >>
> >> Must-have functionality for a C* operator
> >>
> >> - Datacenter provisioning
> >>   - Schedule all pods
> >>   - Bootstrap nodes in the appropriate order
> >>     - Seeds
> >>     - Across racks
> >>     - etc.
> >>   - Uniform configuration
> >> - Scale-up
> >>   - Add new nodes in a balanced manner across racks
> >> - Scale-down
> >>   - Remove nodes one at a time across racks
> >> - Node recovery
> >>   - Restart process
> >>   - Reschedule instance (i.e. replace node)
> >>   - Replace instance
> >>   - Specific workflows for seed node replacements
> >> - Multi-DC / Multi-Rack
> >> - Multi-Region / Multi-K8s Cluster
> >>   - Note this requires support at a networking layer for pod to pod IP
> >>     connectivity. This may be accomplished within the cluster with CNIs
> >>     like Cilium or externally via traditional networking tools.
> >>
> >> Differentiators
> >>
> >> - OSS Ecosystem / Components
> >>   - Cass Config Builder - OSS project extracted from DataStax OpsCenter
> >>     Life Cycle Manager to provide automated configuration file rendering
> >>   - Cass Config Definitions - definitions files for cass-config-builder,
> >>     defines all configuration files, their parameters, and templates
> >>   - Management API for Apache Cassandra (MAAC)
> >>   - Metrics Collector for Apache Cassandra (MCAC)
> >>   - Reference Prometheus Operator CRDs
> >>     - ServiceMonitor
> >>     - Instance
> >>   - Reference Grafana Operator CRDs
> >>     - Instance
> >>     - Dashboards
> >>     - Datasource
> >> - PodTemplateSpec
> >>   - Customization of existing pods including support for adding
> >>     containers, volumes, etc
> >> - Advanced Networking
> >>   - Node Port
> >>   - Host Network
> >> - Simple security
> >>   - Management API mTLS support
> >>   - Automated generation of keystore and truststore for internode and
> >>     client to node TLS
> >>   - Automated superuser account configuration
> >>     - The default superuser (cassandra/cassandra) is disabled and never
> >>       available to clients
> >>     - Cluster administration account may be generated automatically (or
> >>       provided) with values stored in a k8s secret
> >> - Automatic application of NetworkTopologyStrategy with appropriate RF for
> >>   system keyspaces
> >> - Validating webhook
> >>   - Invalid changes are rejected with a helpful message
> >> - Rolling cluster updates
> >>   - Change in binary (C* upgrade)
> >>   - Change in configuration
> >>   - Canary deployments - single rack application of changes for validation
> >>     before broader deployment
> >> - Rolling restart
> >> - Platform Integration / Testing / Certification
> >>   - Red Hat Openshift compatible and certified
> >>     - Secure, Universal Base Image (UBI) foundation images with security
> >>       scanning performed by Red Hat
> >>       - cass-operator
> >>       - cass-config-builder
> >>       - apache-cassandra w/ MCAC and MAAC
> >>     - Integration with Red Hat certification pipeline / marketplace
> >>     - Presence in Red Hat Operator Hub built into OpenShift interface
> >>   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and
> >>     certified
> >>     - Security scanning for images performed by VMware
> >>   - Amazon EKS
> >>   - Google GKE
> >>   - Azure AKS
> >> - Documentation / Reference Implementations
> >>   - Cloud storage classes
> >>   - Ingress solutions
> >>   - Sample connection validation application with reference
> >>     implementations of Java Driver client connection parameters
> >> - Cluster-level Stop / Resume - stop all running instances while keeping
> >>   persistent storage. Allows for scaling compute down to zero. Bringing the
> >>   cluster back up follows expected startup procedures
> >>
> >> Road Map / Inflight
> >>
> >> 1. Repair
> >>    1. Reaper integration
> >> 2. Backups
> >>    1. Velero integration
> >>    2. Medusa integration
> >> 3. Advanced Networking via sidecar
> >>    1. Combination of proxy sidecars (a la Envoy) to allow for persistent
> >>       IP addresses despite Kubernetes' best efforts to shuffle them.
> >> 4. Single pod canary deployments
> >> 5. Platform Certification
> >>    1. VMware Project Pacific
> >>    2. Rancher Kubernetes Engine (K3s)
> >> 6. Documentation
> >>    1. Multi-region
> >>    2. Multi-cloud
> >>    3. Additional ingress providers
> >>       1. Voyager
> >>       2. HAProxy
> >>       3. Gloo
> >>       4. Ambassador
> >>       5. Envoy
> >>       6. NGINX Ingress Controller
> >>    4. Additional storage class references
> >>       1. OpenEBS
> >> 7. Cassandra Enhancements
> >>    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
> >>       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
> >>
> >> If there are further questions about the project, codebase, architecture,
> >> etc. the team would be happy to dive in to the details and discuss more.
> >>
> >> Cheers,
> >> ~Chris
> >>
> >> Christopher Bradford
> >>
> >> On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com>
> >> wrote:
> >>
> >> I can agree with that Ben. Franck did a good job of outlining CassKop.
> >> Somebody from the cass-operator will be posting something similar and we
> >> can keep it on the mailing list.
> >>
> >> Patrick
> >>
> >> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com>
> >> wrote:
> >>
> >> Thanks Franck and Stefan.
> >>
> >> @Patrick great suggestion and worthwhile getting everything on the table.
> >>
> >> One minor change I would advocate for. The SIG has been great to iterate
> >> and interact on the details, but I really think this conversation given
> >> the nature of the content needs to be on the mailing list. The mailing
> >> list is really our system of record and the most accessible.
> >>
> >> It gives folk time to think and digest, it's asynchronous, easily
> >> searchable and let's be honest, the majority of stakeholders in this are
> >> not US based, so the timing issue then goes away and makes it easier for
> >> people to participate in. I feel like we've made a lot more progress by
> >> simply having this discussion here.
> >>
> >> So instead of a presentation, maybe just an email to the ML addressing
> >> the headings that Patrick identified?
> >>
> >> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic < stefan.miklosovic@
> >> instaclustr.com> wrote:
> >>
> >> Hi,
> >>
> >> Patrick's suggestion seems good to me.
> >>
> >> I won't go into specifics here as I need to genuinely prepare for this.
> >> It is quite hard to dig deep into the solutions of others and bring some
> >> constructive criticism because it takes a lot of time to study it and
> >> everybody has some "why's" behind it.
> >>
> >> To summarize my goals and concerns:
> >>
> >> 1) We should be as much "Kubernetes operator idiomatic" as possible.
> >> Industry standards, no custom brain-child of this or that group because
> >> they think it is just cool or they just didn't know any better. I do NOT
> >> say it is like that right now, I just want to be ruthless here as much as
> >> possible when it comes to functionality and why it is done like that. It
> >> is awesome that we have already something latest (thanks to John) and it
> >> adheres to the latest releases. I personally had a hard time to keep up
> >> with all the releases, once I finished something and I aligned it, after
> >> a week or two there was already another one where things were different,
> >> it is a very fast-moving space and I hope that by the time we develop
> >> something it will not be obsolete.
> >>
> >> 2) It may be easier said than done but it is guaranteed that people get
> >> emotional, it's their precious etc, so please let's go into this with
> >> good intentions, not trying to push one solution over the other just
> >> because they would like to see it there ... I will have an equally hard
> >> time to comply with this point. My plan is to explain what is _wrong_
> >> with our solution. Where we made mistakes and what should be done
> >> differently but it is "too late" etc. It is quite hard to describe your
> >> work and all effort in this light but without telling what is wrong we
> >> can not decide what is good imho.
> >>
> >> 3) We should put something together fast enough so we can call it a
> >> release. We can always iterate on it for eternity. But the foundations
> >> need to be there. Here I want to say that I especially like what John
> >> did. I looked through these specs and it was obvious it has been written
> >> with care and attention. It looked _solid_. I am not sure how hard it is
> >> to put all other things on top of that, I truly do not, and here I think
> >> we would have to reinvent that wheel if we want to proceed because I can
> >> not imagine what it would be to retrofit e.g. CassKop on top of John
> >> specs, it is just like putting round pegs into the square holes, maybe
> >> some chunks would be reused easily but otherwise I worry we will be just
> >> on square one.
> >>
> >> One specific feeling I have as I read this is that even if there is the
> >> will to create the fourth operator, the respective parties will not be
> >> able to drop their own repository. The whole point behind this effort, to
> >> me, is to have a solid, community driven, stable, modern and feature
> >> complete operator people are truly using. I can see that once this is
> >> real, we will _really_ sunset our operator, redirecting people to the new
> >> operator on main readme doc etc, we truly mean it. Sure, if somebody
> >> comes and bug fix will be needed, we will fix it, but the whole point of
> >> doing this is to stop using what we have currently, over time, otherwise
> >> we are just splitting this space even more. If CassKop is not sure if
> >> they will use it because they do not know if that operator will be
> >> "enough" for them, aren't we just doing it wrong? If I exaggerate, they
> >> should be fine with deleting the whole repository and using just this
> >> Cassandra one we are going to make otherwise I don't see the point to
> >> work on this ...
> >>
> >> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> >> wrote:
> >>
> >> - choose cass-operator: it is not on offer right now so let’s see if it
> >> does
> >>
> >> We should all talk a lot more, but this is 100% a mistake - I take the
> >> blame for that. The intention has long been to offer cass-operator for
> >> donation but it slipped through the cracks and your email yesterday made
> >> me double-take.
> >>
> >> We have since resolved this misalignment. DataStax would be happy to
> >> donate any and all of cass-operator to the ASF and C* project if it's
> >> what we all agree best serves our collective Cassandra users. I'm also
> >> cognizant that an immense amount of effort has gone into CassKop and we
> >> seem to have something of an embarrassment of riches.
> >>
> >> I'm given to understand (haven't dug in personally) that the two
> >> operators express pretty different opinions when it comes to frameworks,
> >> designs, supported versions, etc. I think a discrete enumeration of the
> >> feature set and "identities" of both could really help navigate this
> >> conversation going forward.
> >>
> >> Also - thanks for that context Franck. It's always helpful to know where
> >> other people are coming from when we're all working together towards a
> >> common goal.
> >>
> >> On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
> >>
> >> I can share Orange’s view of the situation, sorry it is a long story!
> >>
> >> We started CassKop at the end of 2018 after betting on K8S which was not
> >> so simple as far as C* was concerned. Lack of support for local storage,
> >> IPs that change all the time, different network plugins to try to
> >> implement a non standard K8s way of having nodes see each other from
> >> different dcs…
> >>
> >> We hesitated with Mesos but could not have both and K8S was already
> >> gaining so much traction that you could not not choose it.
> >>
> >> Anyway, we looked around and did not see anyone with such requirements so
> >> we said: why not try it ourselves but on github so that we may give it
> >> back to the community. We have used C* for quite a few years with great
> >> success on production with massive load and perfect availability. We love
> >> C* @ Orange :) Thanks!
> >>
> >> So we started writing support for mono-dc cluster (CassKop) and added the
> >> multi dc support with MultiCassKop which is another operator included in
> >> the CassKop repo. For more details we tried to document our designs as
> >> much as possible here:
> >>
> >> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> >>
> >> In the middle of last year we had some talks with Datastax about working
> >> together around their new management sidecar. Their position on open
> >> source was not clear at that time so we said please come back when you
> >> have decided to go open source with it. Which they did in the beginning
> >> of this year. But at that time I guess work had started on cass-operator
> >> so we kept our separate ways.
> >>
> >> Since the beginning of the year, we have been working with our OPS team
> >> to have it in production. It is not simple as the team has to learn K8S
> >> and trust a newborn operator. This takes time especially as our internal
> >> cluster has been tweaked for multi-tenancy with obscure options being set
> >> by our K8s team…
> >>
> >> We also developed with Instaclustr the Backup & Restore functionality (we
> >> have new CRDs (Custom Resource Definition) for backup and restore and a
> >> reconcile loop that calls out Instaclustr sidecar for these operations).
> >> We now support multiple backups in parallel and can write to s3/google or
> >> azure (but Stefan could give more details here if needed)
> >>
> >> During the SIG calls we mentioned our desire to donate CassKop once it
> >> satisfies our basic requirements (v1 is coming just now, but I have said
> >> that too many times already). I am actually not sure Datastax mentioned
> >> their desire to donate cass-operator, but we decided to compare the designs
> >> and the functionalities based on the respective CRDs. The CRD is the
> >> interface with the user, as it is where you describe the cluster that you
> >> want to have. These talks were very interesting, and we found out that the
> >> CassKop team had made good choices most of the time but was maybe too open.
> >> Indeed, our intention was to give our OPS team all the possibilities they
> >> needed to work. This includes:
> >>
> >> - very open topology definition using any configuration of labels to map
> >> dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> >> / rows and server racks so we can map C* racks to storage or network arrays
> >> internally)
> >> - possibility to have multiple C* nodes on a single K8S host (because
> >> internal clouds are not really clouds, they have limited resources)
> >> - custom C* image selection,
> >> - custom bootstrap script that lets you configure C* as you want using
> >> ConfigMaps,
> >> - the ability to mount different volumes wherever they wanted,
> >> - the possibility to run any number of sidecars alongside C* for custom
> >> probes in our case
> >>
> >> This makes CassKop quite powerful and flexible.
> >> We made sure that none of those options is enabled by default, so one can
> >> just pop a simple 3 node cluster quickly.
> >>
> >> On the other hand, cass-operator had an interesting way of configuring C*
> >> just inside the CRD using cass-config. This is simple and elegant, so we are
> >> implementing it as well for the support of C* 4.
> >>
> >> Now for the future, there are 3 choices in my opinion:
> >> - start from scratch (or John’s repo) by cherry picking bits from all
> >> operators. This is possible but will take some time / effort to have
> >> something usable. And then it will be compared to cass-operator and
> >> CassKop. I don’t see Orange contributing too much here as we believe
> >> CassKop to be a much better starting point
> >> - choose cass-operator: it is not on offer right now so let’s see if it
> >> does. I think Orange could contribute some bits inherited from CassKop if
> >> it is agreed by the community. Not sure it would be enough for us to use
> >> it.
> >> - choose CassKop: we would be delighted to donate it and contribute with
> >> some committers (including the original author who now works for AWS). It
> >> would then become the community operator, but there would probably be
> >> cass-operator alongside. Cass-operator is made to make it easier for
> >> Datastax to manage customer clusters by imposing some configuration. It
> >> makes sense for their needs, so maybe 2 operators. We don’t know how
> >> backup/restore will be handled here with medusa being adapted to K8s
> >>
> >> Sorry again for being long but 2 years of work deserve some lines of text
> >> :)
> >>
> >> I just saw your message Patrick but this was written already so we gain a
> >> week.
> >>
> >> Franck
> >>
> >> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> >> <ma...@datastax.com>> wrote:
> >>
> >> I realise there are meeting logs, but getting a wider discourse with
> >> non-stakeholder input might help to build a community consensus? It doesn't
> >> seem like it can hurt at this point, anyway.
> >>
> >> +1
> >>
> >> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> >> <benedict@apache.org<ma...@apache.org>> wrote:
> >>
> >> Perhaps it helps to widen the field of discussion to the dev list?
> >>
> >> It might help if each of the stakeholder organisations state their view on
> >> the situation, including why they would or would not support a given
> >> approach/operator, and what (preferably specific) circumstances might lead
> >> them to change their mind?
> >>
> >> I realise there are meeting logs, but getting a wider discourse with
> >> non-stakeholder input might help to build a community consensus? It doesn't
> >> seem like it can hurt at this point, anyway.
> >>
> >> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com<mailto:
> >> john.sanda@gmail.com>> wrote:
> >>
> >> I want to point out that pretty much everything being discussed in this
> >> thread has been discussed at length during the SIG meetings. I think it is
> >> worth noting because we are pretty much still having the same conversation.
> >>
> >> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <
> >> benedict@apache.org<ma...@apache.org>> wrote:
> >>
> >> I don't think there's anything about a code drop that's not "The Apache
> >> Way"
> >>
> >> If there's a consensus (or even strong majority) amongst invested parties,
> >> I don't see why we could not adopt an operator directly into the project.
> >>
> >> It's possible a green field approach might lead to fewer hard feelings, as
> >> everyone is in the same boat. Perhaps all operators are also suboptimal and
> >> could be improved with a rewrite? But I think coordinating a lot of
> >> different entities around an empty codebase is particularly challenging. I
> >> actually think it could be better for cohesion and collaboration to have a
> >> suboptimal but substantive starting point.
> >>
> >> On 23/09/2020, 16:11, "Stefan Miklosovic" <stefan.miklosovic@
> >> instaclustr.com<ma...@instaclustr.com>> wrote:
> >>
> >> I think that from Instaclustr it was stated quite clearly multiple times
> >> that we are "fine to throw it away" if there is something better and more
> >> widespread. Indeed, we have invested a lot of time in the operator, but it
> >> was not useless at all; we gained a lot of quite unique knowledge of how to
> >> put all the pieces together. However, I think that this space is going to
> >> be quite fragmented and "balkanized", which is not always a bad thing, but
> >> in an area as narrow as Kubernetes operators, I just do not see how 4
> >> operators are going to be beneficial for ordinary people ("official" from
> >> the community, ours, the Datastax one and CassKop (in no particular
> >> order)). Sure, innovation and healthy competition are important, but to
> >> what extent ...
> >> One can start a Cassandra cluster on Kubernetes in only so many different
> >> ways, and nobody really likes vendor lock-in. People wanting to run a
> >> cluster on K8S realise that there are three operators, each backed by a
> >> private business entity, and the community operator is not there ... Huh,
> >> interesting ... One may even start to question what is wrong with these
> >> folks that it takes three companies to build their own solution.
> >>
> >> Having said that, to my perception, the Cassandra community just does not
> >> have enough engineers or contributors to keep 4 operators alive at the
> >> same time (I wish I was wrong), so the idea of selecting the best one or
> >> merging obvious things and approaches together is understandable, even if
> >> it meant we eventually sunset ours. In addition, nobody from the big
> >> players is going to contribute to the code base of the other one, for
> >> obvious reasons, so channeling and directing this effort into something
> >> common for the community seems to be the only reasonable way of
> >> cooperation.
> >>
> >> It is quite hard to bootstrap this if the donation of the code in big
> >> chunks / whole repo is out of the question as it is not the "Apache way"
> >> (there was some thread running here about this in more depth a while ago)
> >> and we basically need to start from scratch, which is quite demotivating;
> >> we are just reinventing the wheel and nobody is up to it. It is like
> >> people are waiting for that to happen so they can jump in "once it is the
> >> thing", but it will never materialise, or at least the hurdle to kick it
> >> off is unnecessarily high. Nobody is going to invest in this heavily if
> >> there is already a working operator from the companies mentioned above. As
> >> I understood it, one reason for not choosing the way of donating it all is
> >> that "the learning and community building should happen in organic manner
> >> and we just can not accept the donation", but isn't it true that it is
> >> easier to build a community around something which is already there rather
> >> than trying to build it around an idea which is quite hard to dedicate
> >> oneself to?
> >>
> >> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org
> >> <ma...@apache.org>> wrote:
> >>
> >> I think there's significant value to the community in trying to coalesce
> >> on a single approach,
> >>
> >> I agree. Unfortunately in this case, the parties with a vested interest and
> >> written operators came to the table and couldn't agree to coalesce on a
> >> single approach. John Sanda attempted to start an initiative to write a
> >> best-of-breed combining choice parts of each operator, but that effort did
> >> not gain traction.
> >>
> >> Which is where my hypothesis comes from that if there were a clear "better
> >> fit" operator to start from we wouldn't be in a deadlock; the correct
> >> choice would be obvious. Reasonably so, every engineer that's written
> >> something is going to want that something to be used and not thrown away in
> >> favor of another something without strong evidence as to why that's the
> >> better choice.
> >>
> >> As far as I know, nobody has made a clear case as to a more compelling
> >> place to start in terms of an operator donation the project then
> >> collaborates on. There's no mass adoption evidence nor feature enumeration
> >> that I know of for any of the approaches anyone's taken, so the discussions
> >> remain stalled.
> >>
> >> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <
> >> benedict@apache.org<ma...@apache.org>> wrote:
> >>
> >> I think there's significant value to the community in trying to coalesce
> >> on a single approach, earlier than later. This is an opportunity to expand
> >> the number of active organisations involved directly in the Apache
> >> Cassandra project, as well as to more quickly expand the project's
> >> functionality into an area we consider urgent and important. I think it
> >> would be a real shame to waste this opportunity. No doubt it will be hard,
> >> as organisations have certain built-in investments in their own
> >> approaches.
> >>
> >> I haven't participated in these calls as I do not consider myself to have
> >> the relevant experience and expertise, and have other focuses on the
> >> project. I just wanted to voice a vote in favour of trying to bring the
> >> different organisations together on a single approach if possible. Is there
> >> anything the project can do to help this happen?
> >>
> >> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:
> >> ben@instaclustr.com>> wrote:
> >>
> >> I think there is certainly an appetite to donate and standardise on a
> >> given operator (as mentioned in this thread).
> >>
> >> I personally found the SIG hard to participate in due to time zones and
> >> the synchronous nature of it.
> >>
> >> So while it was a great forum to dive into certain details for a subset of
> >> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> >> reflection of community intent.
> >>
> >> I don't think that any participants want to continue down the path of "let
> >> a thousand flowers bloom". That's why we are looking towards CassKop (as
> >> well as a number of technical reasons).
> >>
> >> Some of the recorded meetings and outputs can also be found if you are
> >> interested in some primary sources:
> >> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> >>
> >> From what I understand second-hand from talking to people on the SIG calls,
> >> there was a general inability to agree on an existing operator as a
> >> starting point and not much engagement on taking best of breed from the
> >> various to combine them. Seems to leave us in the "let a thousand flowers
> >> bloom" stage of letting operators grow in the ecosystem and seeing which
> >> ones meet the needs of end users before talking about adopting one into the
> >> foundation.
> >>
> >> Great to hear that you folks are joining forces though! Bodes well for C*
> >> users that are wanting to run things on k8s.
> >>
> >> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com
> >> <ma...@instaclustr.com>> wrote:
> >>
> >> For what it's worth, a quick update from me:
> >>
> >> CassKop now has at least two organisations working on it substantially
> >> (Orange and Instaclustr) as well as the numerous other contributors.
> >>
> >> Internally we will also start pointing others towards CassKop once a few
> >> things get merged. While we are not sunsetting our operator yet, it is
> >> certainly looking that way.
> >>
> >> I'd love to see the community adopt it as a starting point for working
> >> towards whatever level of functionality is desired.
> >>
> >> Cheers
> >>
> >> Ben
> >>
> >> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
> >>
> >> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:
> >>
> >> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> >> operators in the ecosystem. Has one of them hit a clear supermajority of
> >> adoption that makes it the de facto default and makes sense to pull it into
> >> the project?
> >>
> >> We as a project community were pretty slow to move on building a PoV around
> >> kubernetes so we find ourselves in a situation with a bunch of contenders
> >> for inclusion in the project. It's not clear to me what heuristics we'd use
> >> to gauge which one would be the best fit for inclusion outside letting
> >> community adoption speak.
> >>
> >> ---
> >> Josh McKenzie
> >>
> >> We actually talked a good bit on the SIG call earlier today about
> >> heuristics. We need to document what functionality an operator should
> >> include at level 0, level 1, etc. We did discuss this a good bit during
> >> some of the initial SIG meetings, but I guess it wasn't really a focal
> >> point at the time. I think we should also provide references to existing
> >> operator projects and possibly other related projects. This would benefit
> >> both community users as well as people working on these projects.
> >>
> >> - John
> >>
> >> --
> >>
> >> Ben Bromhead
> >>
> >> Instaclustr | www.instaclustr.com | @instaclustr
> >> <http://twitter.com/instaclustr> | (650) 284 9692
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> >> For additional commands, e-mail: dev-help@cassandra.apache.org
> >>
> >> --
> >>
> >> - John
> >>
> >>
> _________________________________________________________________________________________________________________________
> >>
> >>
> >> Ce message et ses pieces jointes peuvent contenir des informations
> >> confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> >> exploites ou copies sans autorisation. Si vous avez recu ce message par
> >> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
> >> pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> >> Orange decline toute responsabilite si ce message a ete altere, deforme ou
> >> falsifie. Merci.
> >>
> >> This message and its attachments may contain confidential or privileged
> >> information that may be protected by law; they should not be distributed,
> >> used or copied without authorisation. If you have received this email in
> >> error, please notify the sender and delete this message and its
> >> attachments. As emails may be altered, Orange is not liable for messages
> >> that have been modified, changed or falsified. Thank you.
> >>
> >>
>
>
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

An update on Orange's point of view following the recent emails:

If we were a newly interested party in running C* in K8s, we would use Cass-operator as it comes from Datastax.

The logic would then be that the community embraces it and thanks Datastax for offering it!

So, on Orange's side, we propose to discuss with Datastax how best to merge CassKop's features into cass-operator.
These features are:
- node labelling to map any internal architecture (including network-specific labels for multi-DC setups)
- volumes & sidecars management (possibly linked to PodTemplateSpec)
- backup & restore (we ruled out velero and can share why we went with Instaclustr but Medusa could work too)
- kubectl plugin integration (quite useful on the ops side without an admin UI)
- multiCassKop evolution to drive multiple cass-operators instead of multiple casskops (this could remain Orange internal if too specific)
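
To make the node labelling point concrete, here is a minimal sketch of how DC-level and rack-level labels could be combined into one node selector per C* rack. It is an illustration only and not CassKop's actual code: the `topology` layout, the `node_selectors` helper and the `example.com/...` label keys are all hypothetical.

```python
from typing import Dict


def node_selectors(topology: Dict) -> Dict[str, Dict[str, str]]:
    """Return one node selector (a dict of labels) per "dc/rack" pair.

    Rack labels are merged over DC labels, so a rack can refine or
    override what its DC declares.
    """
    selectors = {}
    for dc in topology["dcs"]:
        for rack in dc["racks"]:
            merged = {**dc.get("labels", {}), **rack.get("labels", {})}
            selectors[f'{dc["name"]}/{rack["name"]}'] = merged
    return selectors


# Hypothetical label keys; any labels present on the K8s nodes would do.
topology = {
    "dcs": [
        {
            "name": "dc1",
            "labels": {"example.com/site": "paris"},
            "racks": [
                {"name": "rack1", "labels": {"example.com/room": "a"}},
                {"name": "rack2", "labels": {"example.com/room": "b"}},
            ],
        }
    ]
}

selectors = node_selectors(topology)
```

Each resulting dict could then be used as the node selector of the pods belonging to that rack, which is how arbitrary site / room / row labels end up mapped to C* racks.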

We could decide at the end of these discussions the best way forward.
Orange could make PRs on cass-operator, but only if we agree we want the functionalities :)

If we can sort it out we could end up with a pretty neat operator.

We share a common architecture (operator-sdk) and are starting to know each other after all these meetings, so it should be possible if we want to!

Would that be ok for the community and Datastax?



> On 2 Oct 2020, at 14:52, Joshua McKenzie <jm...@apache.org> wrote:
> 
> What are next steps here?
> 
> Maybe we collectively put a table together w/the 2 operators and a list of
> features to compare and contrast? Enumerate the frameworks / dependencies
> they have to help form a point of view about the strengths and weaknesses
> of each option?
> 
> 
> On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford <bradfordcp@gmail.com> wrote:
> 
>> Hello Dev list,
>> 
>> I'm Chris Bradford a Product Manager at DataStax working with the
>> cass-operator team. For background, we started down the path of developing
>> an operator internally to power our C*aaS platform, Astra. Care was taken
>> from day 1 to keep anything specific to this product at a layer above
>> cass-operator so it could solely focus on the task of operating Cassandra
>> clusters. With that being said, every single cluster on Astra is
>> provisioned and operated by cass-operator. The value of an advanced
>> operator to Cassandra users is tremendous so we decided to open source the
>> project (and associated components) with the goal of building a community.
>> It absolutely makes sense to offer this project and codebase up for
>> donation as a standard / baseline for running C* on Kubernetes.
>> 
>> Below you will find a collection of cass-operator features,
>> differentiators, and roadmap / inflight initiatives.
>>
>> Table-stakes
>>
>> Must-have functionality for a C* operator
>>
>> - Datacenter provisioning
>>   - Schedule all pods
>>   - Bootstrap nodes in the appropriate order
>>     - Seeds
>>     - Across racks
>>     - etc.
>>   - Uniform configuration
>> - Scale-up
>>   - Add new nodes in a balanced manner across racks
>> - Scale-down
>>   - Remove nodes one at a time across racks
>> - Node recovery
>>   - Restart process
>>   - Reschedule instance (IE replace node)
>>   - Replace instance
>>   - Specific workflows for seed node replacements
>> - Multi-DC / Multi-Rack
>> - Multi-Region / Multi-K8s Cluster
>>   - Note this requires support at a networking layer for pod to pod IP
>>     connectivity. This may be accomplished within the cluster with CNIs like
>>     Cilium or externally via traditional networking tools.
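
As an illustration of the scale-up and scale-down items above, the following sketch picks a rack for each added or removed node so that rack sizes stay within one node of each other. It is an assumption about what "balanced" could mean, not cass-operator's actual algorithm; `scale_up`, `scale_down` and the rack names are hypothetical.

```python
from typing import Dict, List


def scale_up(rack_sizes: Dict[str, int], new_nodes: int) -> List[str]:
    """Return the rack chosen for each new node, smallest rack first.

    Ties are broken by rack name so the result is deterministic.
    """
    sizes = dict(rack_sizes)
    placement = []
    for _ in range(new_nodes):
        rack = min(sizes, key=lambda r: (sizes[r], r))
        sizes[rack] += 1
        placement.append(rack)
    return placement


def scale_down(rack_sizes: Dict[str, int], removed_nodes: int) -> List[str]:
    """Return the rack to shrink for each removal, largest rack first.

    Nodes are removed one at a time, as the list above requires.
    """
    sizes = dict(rack_sizes)
    removals = []
    for _ in range(removed_nodes):
        rack = max(sizes, key=lambda r: (sizes[r], r))
        if sizes[rack] == 0:
            raise ValueError("no nodes left to remove")
        sizes[rack] -= 1
        removals.append(rack)
    return removals


added = scale_up({"r1": 1, "r2": 1, "r3": 0}, 4)
removed = scale_down({"r1": 2, "r2": 2, "r3": 1}, 2)
```

A real operator would additionally wait for each node to finish bootstrapping or decommissioning before moving on to the next one.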
>> 
>> Differentiators
>> 
>> - OSS Ecosystem / Components
>>   - Cass Config Builder - OSS project extracted from DataStax OpsCenter Life
>>     Cycle Manager to provide automated configuration file rendering
>>   - Cass Config Definitions - definitions files for cass-config-builder,
>>     defines all configuration files, their parameters, and templates
>>   - Management API for Apache Cassandra (MAAC)
>>   - Metrics Collector for Apache Cassandra (MCAC)
>>   - Reference Prometheus Operator CRDs
>>     - ServiceMonitor
>>     - Instance
>>   - Reference Grafana Operator CRDs
>>     - Instance
>>     - Dashboards
>>     - Datasource
>> - PodTemplateSpec
>>   - Customization of existing pods including support for adding containers,
>>     volumes, etc
>> - Advanced Networking
>>   - Node Port
>>   - Host Network
>> - Simple security
>>   - Management API mTLS support
>>   - Automated generation of keystore and truststore for internode and client
>>     to node TLS
>>   - Automated superuser account configuration
>>     - The default superuser (cassandra/cassandra) is disabled and never
>>       available to clients
>>     - Cluster administration account may be automatically generated (or
>>       provided) with values stored in a k8s secret
>>   - Automatic application of NetworkTopologyStrategy with appropriate RF for
>>     system keyspaces
>> - Validating webhook
>>   - Invalid changes are rejected with a helpful message
>> - Rolling cluster updates
>>   - Change in binary (C* upgrade)
>>   - Change in configuration
>>   - Canary deployments - single rack application of changes for validation
>>     before broader deployment
>>   - Rolling restart
>> - Platform Integration / Testing / Certification
>>   - Red Hat Openshift compatible and certified
>>     - Secure, Universal Base Image (UBI) foundation images with security
>>       scanning performed by Red Hat
>>       - cass-operator
>>       - cass-config-builder
>>       - apache-cassandra w/ MCAC and MAAC
>>     - Integration with Red Hat certification pipeline / marketplace
>>     - Presence in Red Hat Operator Hub built into OpenShift interface
>>   - VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
>>     - Security scanning for images performed by VMware
>>   - Amazon EKS
>>   - Google GKE
>>   - Azure AKS
>> - Documentation / Reference Implementations
>>   - Cloud storage classes
>>   - Ingress solutions
>>   - Sample connection validation application with reference implementations of
>>     Java Driver client connection parameters
>> - Cluster-level Stop / Resume - stop all running instances while keeping
>>   persistent storage. Allows for scaling compute down to zero. Bringing the
>>   cluster back up follows expected startup procedures
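
For the "appropriate RF for system keyspaces" item, one plausible reading is that each datacenter gets a replication factor capped at both its node count and a small maximum. The sketch below assumes that rule (cap of 3) and simply renders the corresponding CQL; the helper name `system_keyspace_cql` and the cap are assumptions, not something stated in this thread.

```python
from typing import Dict


def system_keyspace_cql(keyspace: str, dc_sizes: Dict[str, int], max_rf: int = 3) -> str:
    """Render an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    Assumed rule: RF per datacenter is min(node count, max_rf).
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, size in sorted(dc_sizes.items()):
        parts.append("'%s': %d" % (dc, min(size, max_rf)))
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, ", ".join(parts))


statement = system_keyspace_cql("system_auth", {"dc1": 5, "dc2": 2})
```

With a five-node dc1 and a two-node dc2, the rendered statement requests three replicas in dc1 and two in dc2.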
>> 
>> Road Map / Inflight
>> 
>> 1. Repair
>>    1. Reaper integration
>> 2. Backups
>>    1. Velero integration
>>    2. Medusa integration
>> 3. Advanced Networking via sidecar
>>    1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
>>       addresses despite Kubernetes' best efforts to shuffle them.
>> 4. Single pod canary deployments
>> 5. Platform Certification
>>    1. VMware Project Pacific
>>    2. Rancher Kubernetes Engine (K3s)
>> 6. Documentation
>>    1. Multi-region
>>    2. Multi-cloud
>>    3. Additional ingress providers
>>       1. Voyager
>>       2. HAProxy
>>       3. Gloo
>>       4. Ambassador
>>       5. Envoy
>>       6. NGINX Ingress Controller
>>    4. Additional storage class references
>>       1. OpenEBS
>> 7. Cassandra Enhancements
>>    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
>>       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
>> 
>> If there are further questions about the project, codebase, architecture,
>> etc. the team would be happy to dive in to the details and discuss more.
>> 
>> Cheers,
>> ~Chris
>> 
>> Christopher Bradford
>> 
>> On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com>
>> wrote:
>> 
>> I can agree with that Ben. Franck did a good job of outlining CassKop.
>> Somebody from the cass-operator will be posting something similar and we
>> can keep it on the mailing list.
>> 
>> Patrick
>> 
>> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:
>> 
>> Thanks Frank and Stefan.
>> 
>> @Patrick great suggestion and worthwhile getting everything on the table.
>> 
>> One minor change I would advocate for. The SIG has been great to iterate
>> and interact on the details, but I really think this conversation given the
>> nature of the content needs to be on the mailing list. The mailing list is
>> really our system of record and the most accessible.
>> 
>> It gives folk time to think and digest, it's asynchronous, easily
>> searchable and let's be honest, the majority of stakeholders in this are
>> not US based, so the timing issue then goes away and makes it easier for
>> people to participate in. I feel like we've made a lot more progress by
>> simply having this discussion here.
>> 
>> So instead of a presentation, maybe just an email to the ML addressing the
>> headings that Patrick identified?
>> 
>> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic
>> <stefan.miklosovic@instaclustr.com> wrote:
>> 
>> Hi,
>> 
>> Patrick's suggestion seems good to me.
>> 
>> I won't go into specifics here as I need to genuinely prepare for this. It
>> is quite hard to dig deep into the solutions of others and bring some
>> constructive criticism because it takes a lot of time to study it and
>> everybody has some "why's" behind it.
>> 
>> To summarize my goals and concerns:
>> 
>> 1) We should be as much "Kubernetes operator idiomatic" as possible.
>> Industry standards, no custom brain-child of this or that group because
>> they think it is just cool or they just didn't know any better. I do NOT
>> say it is like that right now, I just want to be ruthless here as much as
>> possible when it comes to functionality and why it is done like that. It is
>> awesome that we have already something latest (thanks to John) and it
>> adheres to the latest releases. I personally had a hard time to keep up
>> with all the releases, once I finished something and I aligned it, after a
>> week or two there was already another one where things were different, it
>> is a very fast-moving space and I hope that by time we develop something it
>> will not be obsolete.
>> 
>> 2) It may be easier said than done but it is guaranteed that people get
>> emotional, it's their precious etc, so please let's go into this with good
>> intentions, not trying to push one solution over the other just because
>> they would like to see it there ... I will have an equally hard time to
>> comply with this point. My plan is to explain what is _wrong_ with our
>> solution. Where we made mistakes and what should be done differently but it
>> is "too late" etc. It is quite hard to describe your work and all effort in
>> this light but without telling what is wrong we can not decide what is good
>> imho.
>> 
>> 3) We should put something together fast enough so we can call it a
>> release. We can always iterate on it for eternity. But the foundations need
>> to be there. Here I want to say that I especially like what John did. I
>> looked through these specs and it was obvious it has been written with care
>> and attention. It looked _solid_. I am not sure how hard it is to put all
>> other things on top of that, I truly do not, and here I think we would have
>> to reinvent that wheel if we want to proceed because I can not imagine what
>> it would be to retrofit e.g. CassKop on top of John specs, it is just like
>> putting round pegs into the square holes, maybe some chunks would be reused
>> easily but otherwise I worry we will be just on square one.
>> 
>> One specific feeling I have as I read this is that even if there is the
>> will to create the fourth operator, the respective parties will not be able
>> to drop their own repository. The whole point behind this effort, to me, is
>> to have a solid, community driven, stable, modern and feature complete
>> operator people are truly using. I can see that once this is real, we will
>> _really_ sunset our operator, redirecting people to the new operator on
>> main readme doc etc, we truly mean it. Sure, if somebody comes and bug fix
>> will be needed, we will fix it, but the whole point of doing this is to
>> stop using what we have currently, over time, otherwise we are just
>> splitting this space even more. If CassKop is not sure if they will use it
>> because they do not know if that operator will be "enough" for them, aren't
>> we just doing it wrong? If I exaggerate, they should be fine with deleting
>> the whole repository and using just this Cassandra one we are going to make
>> otherwise I don't see the point to work on this ...
>> 
>> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
>> wrote:
>> 
>> - choose cass-operator: it is not on offer right now so let’s see if it
>> does
>> 
>> We should all talk a lot more, but this is 100% a mistake - I take the
>> blame for that. The intention has long been to offer cass-operator for
>> donation but it slipped through the cracks and your email yesterday made me
>> double-take.
>> 
>> We have since resolved this misalignment. DataStax would be happy to donate
>> any and all of cass-operator to the ASF and C* project if it's what we all
>> agree best serves our collective Cassandra users. I'm also cognizant that
>> an immense amount of effort has gone into CassKop and we seem to have
>> something of an embarrassment of riches.
>> 
>> I'm given to understand (haven't dug in personally) that the two operators
>> express pretty different opinions when it comes to frameworks, designs,
>> supported versions, etc. I think a discrete enumeration of the feature set
>> and "identities" of both could really help navigate this conversation going
>> forward.
>> 
>> Also - thanks for that context Franck. It's always helpful to know where
>> other people are coming from when we're all working together towards a
>> common goal.
>> 
>> On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
>> 
>> I can share Orange’s view of the situation, sorry it is a long story!
>> 
>> We started CassKop at the end of 2018 after betting on K8S, which was not
>> so simple as far as C* was concerned: lack of support for local storage,
>> IPs that change all the time, different network plugins to try to implement
>> a non-standard K8s way of having nodes see each other from different DCs…
>> 
>> We hesitated with Mesos but could not have both, and K8S already had so
>> much traction that you could not not choose it.
>> 
>> Anyway, we looked around and did not see anyone with such requirements, so
>> we said: why not try it ourselves, but on GitHub so that we may give it back
>> to the community. We have used C* for quite a few years with great success
>> in production with massive load and perfect availability. We love C* @
>> Orange :) Thanks!
>> 
>> So we started writing support for mono-dc cluster (CassKop) and added the
>> multi dc support with MultiCassKop which is another operator included in
>> the CassKop repo. For more details we tried to document our designs as much
>> as possible here:
>> 
>> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
>> 
>> In the middle of last year we had some talks with Datastax about working
>> together around their new management sidecar. Their position on open source
>> was not clear at that time so we said please come back when you have
>> decided to go open source with it. Which they did in the beginning of this
>> year. But at that time I guess work had started on cass-operator so we kept
>> our separate ways.
>> 
>> Since the beginning of the year, we have been working with our OPS team
>> to have it in production. It is not simple as the team has to learn K8S and
>> trust a newborn operator. This takes time, especially as our internal
>> cluster has been tweaked for multi-tenancy with obscure options being set
>> by our K8s team…
>> 
>> We also developed the Backup & Restore functionality with Instaclustr (we
>> have new CRDs (Custom Resource Definitions) for backup and restore and a
>> reconcile loop that calls out to the Instaclustr sidecar for these
>> operations). We now support multiple backups in parallel and can write to
>> S3, Google or Azure (but Stefan could give more details here if needed)
>> 
>> During the SIG calls we mentioned our desire to donate CassKop once it
>> satisfies our basic requirements (v1 is coming just now, but I have said
>> that too many times already). I am actually not sure Datastax mentioned
>> their desire to donate cass-operator, but we decided to compare the designs
>> and the functionalities based on the respective CRDs. The CRD is the
>> interface with the user, as it is where you describe the cluster that you
>> want to have. These talks were very interesting and we found out that the
>> CassKop team had made good choices most of the time but was maybe too open.
>> Indeed, our intention was to give our OPS team all the possibilities they
>> needed to work. This includes:
>> 
>> - very open topology definition using any configuration of labels to map
>>   dcs / racks and nodes to labels on clusters (we have labels on dcs /
>>   rooms / rows and server racks so we can map C* racks to storage or
>>   network arrays internally)
>> - possibility to have multiple C* nodes on a single K8S host (because
>>   internal clouds are not really clouds, they have limited resources)
>> - custom C* image selection,
>> - custom bootstrap script that lets you configure C* as you want using
>>   ConfigMaps,
>> - the ability to mount different volumes wherever they wanted,
>> - the possibility to run any number of sidecars alongside C* (for custom
>>   probes in our case)
>> 
>> This makes CassKop quite powerful and flexible.
>> We made sure that all those options are not enabled by default so one can
>> just pop a simple 3 node cluster quickly
>> 
>> On the other hand cass-operator had an interesting way of configuring C*
>> just inside the CRD using cass-config. This is simple and elegant so we are
>> implementing it as well for the support of C* 4
>> 
>> Now for the future, there are 3 choices in my opinion:
>> 
>> - start from scratch (or John’s repo) by cherry picking bits from all
>>   operators. This is possible but will take some time / effort to have
>>   something usable. And then it will be compared to cass-operator and
>>   CassKop. I don’t see Orange contributing too much here as we believe
>>   CassKop to be a much better starting point
>> - choose cass-operator: it is not on offer right now so let’s see if it
>>   does. I think Orange could contribute some bits inherited from CassKop if
>>   it is agreed by the community. Not sure it would be enough for us to use
>>   it.
>> - choose CassKop: we would be delighted to donate it and contribute with
>>   some committers (including the original author who now works for AWS). It
>>   would then become the community operator but there would probably be
>>   cass-operator alongside. But cass-operator is made to make it easier for
>>   Datastax to manage customer clusters by imposing some configuration. It
>>   makes sense for their needs, so maybe 2 operators. We don’t know how
>>   backup/restore will be handled here with Medusa being adapted to K8s
>> 
>> Sorry again for being long but 2 years of work deserve some lines of text
>> :)
>> 
>> I just saw your message Patrick but this was written already so we gain a
>> week.
>> 
>> Franck
>> 
>> On 24 Sep 2020, at 10:08, Benjamin Lerer
>> <benjamin.lerer@datastax.com<ma...@datastax.com>> wrote:
>> 
>> I realise there are meeting logs, but getting a wider discourse with
>> non-stakeholder input might help to build a community consensus? It doesn't
>> seem like it can hurt at this point, anyway.
>> 
>> +1
>> 
>> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
>> <benedict@apache.org<ma...@apache.org>> wrote:
>> 
>> Perhaps it helps to widen the field of discussion to the dev list?
>> 
>> It might help if each of the stakeholder organisations state their view on
>> the situation, including why they would or would not support a given
>> approach/operator, and what (preferably specific) circumstances might lead
>> them to change their mind?
>> 
>> I realise there are meeting logs, but getting a wider discourse with
>> non-stakeholder input might help to build a community consensus? It doesn't
>> seem like it can hurt at this point, anyway.
>> 
>> On 23/09/2020, 17:13, "John Sanda"
>> <john.sanda@gmail.com<mailto:john.sanda@gmail.com>> wrote:
>> 
>> I want to point out that pretty much everything being discussed in this
>> thread has been discussed at length during the SIG meetings. I think it is
>> worth noting because we are pretty much still having the same conversation.
>> 
>> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
>> <benedict@apache.org<ma...@apache.org>> wrote:
>> 
>> I don't think there's anything about a code drop that's not "The Apache
>> Way". If there's a consensus (or even strong majority) amongst invested
>> parties, I don't see why we could not adopt an operator directly into the
>> project.
>> 
>> It's possible a green field approach might lead to fewer hard feelings, as
>> everyone is in the same boat. Perhaps all operators are also suboptimal and
>> could be improved with a rewrite? But I think coordinating a lot of
>> different entities around an empty codebase is particularly challenging. I
>> actually think it could be better for cohesion and collaboration to have a
>> suboptimal but substantive starting point.
>> 
>> On 23/09/2020, 16:11, "Stefan Miklosovic"
>> <stefan.miklosovic@instaclustr.com<ma...@instaclustr.com>> wrote:
>> 
>> I think that from Instaclustr it was stated quite clearly multiple times
>> that we are "fine to throw it away" if there is something better and more
>> wide-spread. Indeed, we have invested a lot of time in the operator but it
>> was not useless at all, we gained a lot of quite unique knowledge of how to
>> put all the pieces together. However, I think that this space is going to
>> be quite fragmented and "balkanized", which is not always a bad thing, but
>> in an area as narrow as a Kubernetes operator, I just do not see how 4
>> operators are going to be beneficial for ordinary people ("official" from
>> the community, ours, the Datastax one and CassKop (in no particular
>> order)). Sure, innovation and healthy competition are important but to what
>> extent ...
>> 
>> There are only so many different ways to start a Cassandra cluster on
>> Kubernetes, and nobody really likes vendor lock-in. People wanting to run a
>> cluster on K8S realise that there are three operators, each backed by a
>> private business entity, and the community operator is not there ... Huh,
>> interesting ... One may even start to question what is wrong with these
>> folks that it takes three companies to build their own solution.
>> 
>> Having said that, to my perception, the Cassandra community just does not
>> have enough engineers or contributors to keep 4 operators alive at the
>> same time (I wish I was wrong) so the idea of selecting the best one or
>> merging obvious things and approaches together is understandable, even if
>> it meant we eventually sunset ours. In addition, none of the big players is
>> going to contribute to the code base of another one, for obvious reasons,
>> so channeling and directing this effort into something common for the
>> community seems to be the only reasonable way of cooperation.
>> 
>> It is quite hard to bootstrap this if the donation of the code in big
>> chunks / whole repo is out of the question as it is not the "Apache way"
>> (there was some thread running here about this in more depth a while ago)
>> and we basically need to start from scratch, which is quite demotivating:
>> we are just reinventing the wheel and nobody is up to it. It is like people
>> are waiting for that to happen so they can jump in "once it is the thing"
>> but it will never materialise, or at least the hurdle to kick it off is
>> unnecessarily high. Nobody is going to invest in this heavily if there is
>> already a working operator from the companies mentioned above. As I
>> understood it, one reason for not choosing the way of donating it all is
>> that "the learning and community building should happen in organic manner
>> and we just can not accept the donation", but isn't it true that it is
>> easier to build a community around something which is already there rather
>> than trying to build it around an idea which is quite hard to dedicate
>> yourself to?
>> 
>> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie
>> <jmckenzie@apache.org<ma...@apache.org>> wrote:
>> 
>> I think there's significant value to the community in trying to coalesce
>> on a single approach,
>> 
>> I agree. Unfortunately in this case, the parties with a vested interest and
>> written operators came to the table and couldn't agree to coalesce on a
>> single approach. John Sanda attempted to start an initiative to write a
>> best-of-breed combining choice parts of each operator, but that effort did
>> not gain traction.
>> 
>> Which is where my hypothesis comes from that if there were a clear "better
>> fit" operator to start from we wouldn't be in a deadlock; the correct
>> choice would be obvious. Reasonably so, every engineer that's written
>> something is going to want that something to be used and not thrown away in
>> favor of another something without strong evidence as to why that's the
>> better choice.
>> 
>> As far as I know, nobody has made a clear case as to a more compelling
>> place to start in terms of an operator donation the project then
>> collaborates on. There's no mass adoption evidence nor feature enumeration
>> that I know of for any of the approaches anyone's taken, so the discussions
>> remain stalled.
>> 
>> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
>> <benedict@apache.org<ma...@apache.org> wrote:
>> 
>> I think there's significant value to the community in trying to coalesce
>> on a single approach, earlier than later. This is an opportunity to expand
>> the number of active organisations involved directly in the Apache
>> Cassandra project, as well as to more quickly expand the project's
>> functionality into an area we consider urgent and important. I think it
>> would be a real shame to waste this opportunity. No doubt it will be hard,
>> as organisations have certain built-in investments in their own
>> approaches.
>> 
>> I haven't participated in these calls as I do not consider myself to have
>> the relevant experience and expertise, and have other focuses on the
>> project. I just wanted to voice a vote in favour of trying to bring the
>> different organisations together on a single approach if possible. Is there
>> anything the project can do to help this happen?
>> 
>> On 23/09/2020, 03:04, "Ben Bromhead"
>> <ben@instaclustr.com<mailto:ben@instaclustr.com>> wrote:
>> 
>> I think there is certainly an appetite to donate and standardise on a
>> given operator (as mentioned in this thread).
>> 
>> I personally found the SIG hard to participate in due to time zones and
>> the synchronous nature of it.
>> 
>> So while it was a great forum to dive into certain details for a subset of
>> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
>> reflection of community intent.
>> 
>> I don't think that any participants want to continue down the path of "let
>> a thousand flowers bloom". That's why we are looking towards CassKop (as
>> well as a number of technical reasons).
>> 
>> Some of the recorded meetings and outputs can also be found if you are
>> interested in some primary sources:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>> 
>> From what I understand second-hand from talking to people on the SIG calls,
>> there was a general inability to agree on an existing operator as a
>> starting point and not much engagement on taking best of breed from the
>> various to combine them. Seems to leave us in the "let a thousand flowers
>> bloom" stage of letting operators grow in the ecosystem and seeing which
>> ones meet the needs of end users before talking about adopting one into the
>> foundation.
>> 
>> Great to hear that you folks are joining forces though! Bodes well for C*
>> users that are wanting to run things on k8s.
>> 
>> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead
>> <ben@instaclustr.com<ma...@instaclustr.com> wrote:
>> 
>> For what it's worth, a quick update from me:
>> 
>> CassKop now has at least two organisations working on it substantially
>> (Orange and Instaclustr) as well as the numerous other contributors.
>> 
>> Internally we will also start pointing others towards CassKop once a few
>> things get merged. While we are not sunsetting our operator yet, it is
>> certainly looking that way.
>> 
>> I'd love to see the community adopt it as a starting point for working
>> towards whatever level of functionality is desired.
>> 
>> Cheers
>> 
>> Ben
>> 
>> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
>> 
>> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
>> wrote:
>> 
>> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
>> operators in the ecosystem. Has one of them hit a clear supermajority of
>> adoption that makes it the de facto default and makes sense to pull it into
>> the project?
>> 
>> We as a project community were pretty slow to move on building a PoV around
>> kubernetes so we find ourselves in a situation with a bunch of contenders
>> for inclusion in the project. It's not clear to me what heuristics we'd use
>> to gauge which one would be the best fit for inclusion outside letting
>> community adoption speak.
>> 
>> ---
>> Josh McKenzie
>> 
>> We actually talked a good bit on the SIG call earlier today about
>> heuristics. We need to document what functionality an operator should
>> include at level 0, level 1, etc. We did discuss this a good bit during
>> some of the initial SIG meetings, but I guess it wasn't really a focal
>> point at the time. I think we should also provide references to existing
>> operator projects and possibly other related projects. This would benefit
>> both community users as well as people working on these projects.
>> 
>> - John
>> 


_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

What are next steps here?

Maybe we collectively put a table together w/the 2 operators and a list of
features to compare and contrast? Enumerate the frameworks / dependencies
they have to help form a point of view about the strengths and weaknesses
of each option?


On Tue, Sep 29, 2020 at 10:22 PM, Christopher Bradford
<bradfordcp@gmail.com> wrote:

> Hello Dev list,
>
> I'm Chris Bradford a Product Manager at DataStax working with the
> cass-operator team. For background, we started down the path of developing
> an operator internally to power our C*aaS platform, Astra. Care was taken
> from day 1 to keep anything specific to this product at a layer above
> cass-operator so it could solely focus on the task of operating Cassandra
> clusters. With that being said, every single cluster on Astra is
> provisioned and operated by cass-operator. The value of an advanced
> operator to Cassandra users is tremendous so we decided to open source the
> project (and associated components) with the goal of building a community.
> It absolutely makes sense to offer this project and codebase up for
> donation as a standard / baseline for running C* on Kubernetes.
>
> Below you will find a collection of cass-operator features,
> differentiators, and roadmap / inflight initiatives.
>
> Table-stakes
>
> Must-have functionality for a C* operator
>
> - Datacenter provisioning
> - Schedule all pods
> - Bootstrap nodes in the appropriate order
> - Seeds
> - Across racks
> - etc.
> - Uniform configuration
> - Scale-up
> - Add new nodes in a balanced manner across rack
> - Scale-down
> - Remove nodes one at a time across racks
> - Node recovery
> - Restart process
> - Reschedule instance (IE replace node)
> - Replace instance
> - Specific workflows for seed node replacements
> - Multi-DC / Multi-Rack
> - Multi-Region / Multi-K8s Cluster
> - Note this requires support at a networking layer for pod to pod IP
>   connectivity. This may be accomplished within the cluster with CNIs like
>   Cilium or externally via traditional networking tools.
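
As an illustration of the scale-up and scale-down items above, here is a minimal Python sketch of rack-balanced placement. It is not taken from cass-operator or any other operator in this thread; the function names (`scale_up`, `scale_down`) and the dict of per-rack node counts are assumptions made for this example, and a real operator would also have to handle seeds, bootstrap ordering and decommissioning.

```python
def scale_up(rack_sizes, new_nodes):
    """Add nodes one at a time, always to the rack that currently has the
    fewest nodes. Ties go to the rack listed first in the dict."""
    sizes = dict(rack_sizes)  # work on a copy, leave the input untouched
    placements = []
    for _ in range(new_nodes):
        rack = min(sizes, key=sizes.get)
        sizes[rack] += 1
        placements.append(rack)
    return sizes, placements


def scale_down(rack_sizes, nodes_to_remove):
    """Remove nodes one at a time, always from the rack that currently has
    the most nodes. Ties go to the rack listed first in the dict."""
    sizes = dict(rack_sizes)
    removals = []
    for _ in range(nodes_to_remove):
        rack = max(sizes, key=sizes.get)
        if sizes[rack] == 0:
            raise ValueError("no nodes left to remove")
        sizes[rack] -= 1
        removals.append(rack)
    return sizes, removals


if __name__ == "__main__":
    # Hypothetical three-rack datacenter with an uneven starting layout.
    sizes, placed = scale_up({"rack1": 1, "rack2": 1, "rack3": 0}, 4)
    print(sizes, placed)
    sizes, removed = scale_down(sizes, 2)
    print(sizes, removed)
```

Picking the smallest rack on every step keeps rack sizes within one node of each other once the layout is even, which is one simple reading of "balanced"; handling one node per step mirrors the "one at a time" wording of the scale-down item.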
>
> Differentiators
>
> - OSS Ecosystem / Components
> - Cass Config Builder - OSS project extracted from DataStax OpsCenter Life
>   Cycle Manager to provide automated configuration file rendering
> - Cass Config Definitions - definitions files for cass-config-builder,
>   defines all configuration files, their parameters, and templates
> - Management API for Apache Cassandra (MAAC)
> - Metrics Collector for Apache Cassandra (MCAC)
> - Reference Prometheus Operator CRDs
> - ServiceMonitor
> - Instance
> - Reference Grafana Operator CRDs
> - Instance
> - Dashboards
> - Datasource
> - PodTemplateSpec
> - Customization of existing pods including support for adding containers,
>   volumes, etc
> - Advanced Networking
> - Node Port
> - Host Network
> - Simple security
> - Management API mTLS support
> - Automated generation of keystore and truststore for internode and client
>   to node TLS
> - Automated superuser account configuration
> - The default superuser (cassandra/cassandra) is disabled and never
>   available to clients
> - Cluster administration account may be automatically generated (or
>   provided), with values stored in a k8s secret
> - Automatic application of NetworkTopologyStrategy with appropriate RF for
>   system keyspaces
> - Validating webhook
> - Invalid changes are rejected with a helpful message
> - Rolling cluster updates
> - Change in binary (C* upgrade)
> - Change in configuration
> - Canary deployments - single rack application of changes for validation
>   before broader deployment
> - Rolling restart
> - Platform Integration / Testing / Certification
> - Red Hat OpenShift compatible and certified
> - Secure, Universal Base Image (UBI) foundation images with security
>   scanning performed by Red Hat
> - cass-operator
> - cass-config-builder
> - apache-cassandra w/ MCAC and MAAC
> - Integration with Red Hat certification pipeline / marketplace
> - Presence in Red Hat Operator Hub built into OpenShift interface
> - VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
> - Security scanning for images performed by VMware
> - Amazon EKS
> - Google GKE
> - Azure AKS
> - Documentation / Reference Implementations
> - Cloud storage classes
> - Ingress solutions
> - Sample connection validation application with reference implementations
>   of Java Driver client connection parameters
> - Cluster-level Stop / Resume - stop all running instances while keeping
>   persistent storage. Allows for scaling compute down to zero. Bringing the
>   cluster back up follows expected startup procedures
>
> Road Map / Inflight
>
> 1. Repair
>    1. Reaper integration
> 2. Backups
>    1. Velero integration
>    2. Medusa integration
> 3. Advanced Networking via sidecar
>    1. Combination of proxy sidecars (a la Envoy) to allow for persistent IP
>       addresses despite Kubernetes' best efforts to shuffle them.
> 4. Single pod canary deployments
> 5. Platform Certification
>    1. VMware Project Pacific
>    2. Rancher Kubernetes Engine (K3s)
> 6. Documentation
>    1. Multi-region
>    2. Multi-cloud
>    3. Additional ingress providers
>       1. Voyager
>       2. HAProxy
>       3. Gloo
>       4. Ambassador
>       5. Envoy
>       6. NGINX Ingress Controller
>    4. Additional storage class references
>       1. OpenEBS
> 7. Cassandra Enhancements
>    1. [#CASSANDRA-15823] Support for networking via identity instead of IP
>       - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>
>
> If there are further questions about the project, codebase, architecture,
> etc. the team would be happy to dive in to the details and discuss more.
>
> Cheers,
> ~Chris
>
> Christopher Bradford
>
> On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com>
> wrote:
>
> I can agree with that Ben. Franck did a good job of outlining CassKop.
> Somebody from the cass-operator will be posting something similar and we
> can keep it on the mailing list.
>
> Patrick
>
> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:
>
> Thanks Frank and Stefan.
>
> @Patrick great suggestion and worthwhile getting everything on the table.
>
> One minor change I would advocate for. The SIG has been great to iterate
> and interact on the details, but I really think this conversation given the
> nature of the content needs to be on the mailing list. The mailing list is
> really our system of record and the most accessible.
>
> It gives folk time to think and digest, it's asynchronous, easily
> searchable and let's be honest, the majority of stakeholders in this are
> not US based, so the timing issue then goes away and makes it easier for
> people to participate in. I feel like we've made a lot more progress by
> simply having this discussion here.
>
> So instead of a presentation, maybe just an email to the ML addressing the
> headings that Patrick identified?
>
> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic
> <stefan.miklosovic@instaclustr.com> wrote:
>
> Hi,
>
> Patrick's suggestion seems good to me.
>
> I won't go into specifics here as I need to genuinely prepare for this. It
> is quite hard to dig deep into the solutions of others and bring some
> constructive criticism because it takes a lot of time to study it and
> everybody has some "why's" behind it.
>
> To summarize my goals and concerns:
>
> 1) We should be as much "Kubernetes operator idiomatic" as possible.
> Industry standards, no custom brain-child of this or that group because
> they think it is just cool or they just didn't know any better. I do NOT
> say it is like that right now, I just want to be ruthless here as much as
> possible when it comes to functionality and why it is done like that. It is
> awesome that we have already something latest (thanks to John) and it
> adheres to the latest releases. I personally had a hard time to keep up
> with all the releases, once I finished something and I aligned it, after a
> week or two there was already another one where things were different, it
> is a very fast-moving space and I hope that by time we develop something it
> will not be obsolete.
>
> 2) It may be easier said than done but it is guaranteed that people get
> emotional, it's their precious etc, so please let's go into this with good
> intentions, not trying to push one solution over the other just because
> they would like to see it there ... I will have an equally hard time to
> comply with this point. My plan is to explain what is _wrong_ with our
> solution. Where we made mistakes and what should be done differently but it
> is "too late" etc. It is quite hard to describe your work and all effort in
> this light but without telling what is wrong we can not decide what is good
> imho.
>
> 3) We should put something together fast enough so we can call it a
> release. We can always iterate on it for eternity. But the foundations need
> to be there. Here I want to say that I especially like what John did. I
> looked through these specs and it was obvious it has been written with care
> and attention. It looked _solid_. I am not sure how hard it is to put all
> other things on top of that, I truly do not, and here I think we would have
> to reinvent that wheel if we want to proceed because I can not imagine what
> it would be to retrofit e.g. CassKop on top of John specs, it is just like
> putting round pegs into the square holes, maybe some chunks would be reused
> easily but otherwise I worry we will be just on square one.
>
> One specific feeling I have as I read this is that even if there is the
> will to create the fourth operator, the respective parties will not be able
> to drop their own repository. The whole point behind this effort, to me, is
> to have a solid, community driven, stable, modern and feature complete
> operator people are truly using. I can see that once this is real, we will
> _really_ sunset our operator, redirecting people to the new operator on
> main readme doc etc, we truly mean it. Sure, if somebody comes and bug fix
> will be needed, we will fix it, but the whole point of doing this is to
> stop using what we have currently, over time, otherwise we are just
> splitting this space even more. If CassKop is not sure if they will use it
> because they do not know if that operator will be "enough" for them, aren't
> we just doing it wrong? If I exaggerate, they should be fine with deleting
> the whole repository and using just this Cassandra one we are going to make
> otherwise I don't see the point to work on this ...
>
> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> wrote:
>
> - choose cass-operator: it is not on offer right now so let’s see if it
> does
>
> We should all talk a lot more, but this is 100% a mistake - I take the
> blame for that. The intention has long been to offer cass-operator for
> donation but it slipped through the cracks and your email yesterday made
> me double-take.
>
> We have since resolved this misalignment. DataStax would be happy to
> donate any and all of cass-operator to the ASF and C* project if it's what
> we all agree best serves our collective Cassandra users. I'm also cognizant
> that an immense amount of effort has gone into CassKop and we seem to have
> something of an embarrassment of riches.
>
> I'm given to understand (haven't dug in personally) that the two operators
> express pretty different opinions when it comes to frameworks, designs,
> supported versions, etc. I think a discrete enumeration of the feature set
> and "identities" of both could really help navigate this conversation
> going forward.
>
> Also - thanks for that context Franck. It's always helpful to know where
> other people are coming from when we're all working together towards a
> common goal.
>
> On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
>
> I can share Orange’s view of the situation, sorry it is a long story!
>
> We started CassKop at the end of 2018 after betting on K8S which was not
> so simple as far as C* was concerned. Lack of support for local storage,
> IPs that change all the time, different network plugins to try to implement
> a non standard K8s way of having nodes see each other from different dcs…
>
> We hesitated with Mesos but could not have both and K8S already had so
> much traction that you could not not choose it.
>
> Anyway, we looked around and did not see anyone with such requirements so
> we said: why not try it ourselves but on github so that we may give it back
> to the community. We have used C* for quite a few years with great success
> on production with massive load and perfect availability. We love C* @
> Orange :) Thanks!
>
> So we started writing support for mono-dc cluster (CassKop) and added the
> multi dc support with MultiCassKop which is another operator included in
> the CassKop repo. For more details we tried to document our designs as much
> as possible here:
>
> https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
>
> In the middle of last year we had some talks with Datastax about working
> together around their new management sidecar. Their position on open source
> was not clear at that time so we said please come back when you have
> decided to go open source with it. Which they did in the beginning of this
> year. But at that time I guess work had started on cass-operator so we kept
> our separate ways.
>
> Since the beginning of the year, we have been working with our OPS team
> to have it in production. It is not simple as the team has to learn K8S and
> trust a newborn operator. This takes time especially as our internal
> cluster has been tweaked for multi-tenancy with obscure options being set
> by our K8s team…
>
> We also developed with Instaclustr the Backup & Restore functionality (we
> have new CRDs (Custom Resource Definition) for backup and restore and a
> reconcile loop that calls out to the Instaclustr sidecar for these
> operations). We now support multiple backups in parallel and can write to
> S3, Google or Azure (but Stefan could give more details here if needed)
>
> During the SIG calls we mentioned our desire to donate CassKop once it
> satisfies our basic requirements (v1 coming just now but I said it too
> many times already). I am actually not sure Datastax mentioned their desire
> to donate cass-operator but we decided to compare the designs and the
> functionalities based on respective CRDs. The CRD is the interface with the
> user as it is where you describe the cluster that you want to have. These
> talks were very interesting and we found out that the CassKop team had made
> good choices most of the time but was maybe too open. Indeed our intention
> was to give all the possibilities for our OPS team to work. This includes:
>
> - very open topology definition using any configuration of labels to map
>   dcs / racks and nodes to labels on clusters (we have labels on dcs /
>   rooms / rows and server racks so we can map C* racks to storage or
>   network arrays internally)
> - possibility to have multiple C* nodes on a single K8S host (because
>   internal clouds are not really clouds, they have limited resources)
> - custom C* image selection,
> - custom bootstrap script that lets you configure C* as you want using
>   ConfigMaps,
> - the ability to mount different volumes wherever they wanted,
> - the possibility to run any number of sidecars alongside C* for custom
>   probes in our case
>
> This makes CassKop quite powerful and flexible.
> We made sure that all those options are not enabled by default so one can
> just pop a simple 3 node cluster quickly
>
> On the other hand cass-operator had an interesting way of configuring C*
> just inside the CRD using cass-config. This is simple and elegant so we are
> implementing it as well for the support of C* 4
>
> Now for the future, there are 3 choices in my opinion:
> - start from scratch (or John’s repo) by cherry picking bits from all
>   operators. This is possible but will take some time / effort to have
>   something usable. And then it will be compared to cass-operator and
>   CassKop. I don’t see Orange contributing too much here as we believe
>   CassKop to be a much better starting point
> - choose cass-operator: it is not on offer right now so let’s see if it
>   does. I think Orange could contribute some bits inherited from CassKop if
>   it is agreed by the community. Not sure it would be enough for us to use
>   it.
> - choose CassKop: we would be delighted to donate it and contribute with
>   some committers (including the original author who now works for AWS). It
>   would then become the community operator but there would be cass-operator
>   alongside probably. But Cass-operator is made to make it easier for
>   Datastax to manage customer clusters by imposing some configuration. It
>   makes sense for their needs, so maybe 2 operators. We don’t know how
>   backup/restore will be handled here with medusa being adapted to K8s
>
> Sorry again for being long but 2 years of work deserve some lines of text
> :)
>
> I just saw your message Patrick but this was written already so we gain a
> week.
>
> Franck
>
> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com> wrote:
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> +1
>
> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.org> wrote:
>
> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
>
> I want to point out that pretty much everything being discussed in this
> thread has been discussed at length during the SIG meetings. I think it is
> worth noting because we are pretty much still having the same conversation.
>
> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <benedict@apache.org> wrote:
>
> I don't think there's anything about a code drop that's not "The Apache
> Way"
>
> If there's a consensus (or even strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" <stefan.miklosovic@instaclustr.com> wrote:
>
> I think that from Instaclustr it was stated quite clearly multiple times
> that we are "fine to throw it away" if there is something better and more
> wide-spread. Indeed, we have invested a lot of time in the operator but it
> was not useless at all, we gained a lot of quite unique knowledge how to
> put all pieces together. However, I think that this space is going to be
> quite fragmented and "balkanized", which is not always a bad thing, but in
> a quite narrow area as Kubernetes operator is, I just do not see how 4
> operators are going to be beneficial for ordinary people ("official" from
> community, ours, Datastax one and CassKop (without any significant order)).
> Sure, innovation and healthy competition is important but to what extent ...
> One can start a Cassandra cluster on Kubernetes just so many times
> differently and nobody really likes a vendor lock-in. People wanting to run
> a cluster on K8S realise that there are three operators, each backed by a
> private business entity, and the community operator is not there ... Huh,
> interesting ... One may even start to question what is wrong with these
> folks that it takes three companies to build their own solution.
>
> Having said that, to my perception, Cassandra community just does not have
> enough engineers nor contributors to keep 4 operators alive at the same
> time (I wish I was wrong) so the idea of selecting the best one or to merge
> obvious things and approaches together is understandable, even if it meant
> we eventually sunset ours. In addition, nobody from big players is going to
> contribute to the code base of the other one, for obvious reasons, so
> channeling and directing this effort into something common for a community
> seems to be the only reasonable way of cooperation.
>
> It is quite hard to bootstrap this if the donation of the code in big
> chunks / whole repo is out of question as it is not the "Apache way" (there
> was some thread running here about this in more depth a while ago) and we
> basically need to start from scratch which is quite demotivating, we are
> just inventing the wheel and nobody is up to it. It is like people are
> waiting for that to happen so they can jump in "once it is the thing" but
> it will never materialise or at least the hurdle to kick it off is
> unnecessarily high. Nobody is going to invest in this heavily if there is
> already a working operator from companies mentioned above. As I understood
> it, one reason of not choosing the way of donating it all is that "the
> learning and community building should happen in organic manner and we just
> can not accept the donation", but is not it true that it is easier to build
> a community around something which is already there rather than trying to
> build it around an idea which is quite hard to dedicate to?
>
> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach,
>
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org> wrote:
>
> I think there's significant value to the community in trying to coalesce
> on a single approach, earlier than later. This is an opportunity to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I think it
> would be a real shame to waste this opportunity. No doubt it will be hard,
> as organisations have certain built-in investments in their own
> approaches.
>
> I haven't participated in these calls as I do not consider myself to have
> the relevant experience and expertise, and have other focuses on the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible. Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com> wrote:
>
> I think there is certainly an appetite to donate and standardise on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found if you are
> interested in some primary sources
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>
> From what I understand second-hand from talking to people on the SIG calls,
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>
> --
>
> - John
>
>
> _________________________________________________________________________________________________________________________
>
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law; they should not be distributed,
> used or copied without authorisation. If you have received this email in
> error, please notify the sender and delete this message and its
> attachments. As emails may be altered, Orange is not liable for messages
> that have been modified, changed or falsified. Thank you.
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Christopher Bradford <br...@gmail.com>.

Hello Dev list,

I'm Chris Bradford, a Product Manager at DataStax working with the
cass-operator team. For background, we started down the path of developing
an operator internally to power our C*aaS platform, Astra. Care was taken
from day 1 to keep anything specific to this product at a layer above
cass-operator so it could solely focus on the task of operating Cassandra
clusters. With that being said, every single cluster on Astra is
provisioned and operated by cass-operator. The value of an advanced
operator to Cassandra users is tremendous so we decided to open source the
project (and associated components) with the goal of building a community.
It absolutely makes sense to offer this project and codebase up for
donation as a standard / baseline for running C* on Kubernetes.

Below you will find a collection of cass-operator features,
differentiators, and roadmap / inflight initiatives.

Table-stakes

Must-have functionality for a C* operator

   - Datacenter provisioning
      - Schedule all pods
      - Bootstrap nodes in the appropriate order
         - Seeds
         - Across racks
         - etc.
      - Uniform configuration
   - Scale-up
      - Add new nodes in a balanced manner across racks
   - Scale-down
      - Remove nodes one at a time across racks
   - Node recovery
      - Restart process
      - Reschedule instance (i.e. replace node)
      - Replace instance
      - Specific workflows for seed node replacements
   - Multi-DC / Multi-Rack
   - Multi-Region / Multi-K8s Cluster
      - Note this requires support at a networking layer for pod to pod IP
        connectivity. This may be accomplished within the cluster with CNIs
        like Cilium or externally via traditional networking tools.
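
To make the "balanced" scale-up and one-at-a-time scale-down items above a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not cass-operator's actual logic: the function names, the rack-to-node-count dict, and the tie-breaking rules are all hypothetical.

```python
def scale_up(rack_sizes, new_nodes):
    """Add new_nodes one at a time, each to the currently smallest rack.

    rack_sizes is a hypothetical {rack_name: node_count} mapping.
    Ties go to the alphabetically first rack. Returns a new mapping.
    """
    sizes = dict(rack_sizes)  # do not mutate the caller's mapping
    for _ in range(new_nodes):
        target = min(sorted(sizes), key=lambda rack: sizes[rack])
        sizes[target] += 1
    return sizes


def scale_down_order(rack_sizes, remove_nodes):
    """Return the racks to remove one node from, in order.

    Each step takes a node from the currently largest rack; ties go to
    the alphabetically last rack, mirroring scale_up.
    """
    sizes = dict(rack_sizes)
    order = []
    for _ in range(remove_nodes):
        target = max(sorted(sizes, reverse=True), key=lambda rack: sizes[rack])
        if sizes[target] == 0:
            raise ValueError("no nodes left to remove")
        sizes[target] -= 1
        order.append(target)
    return order


if __name__ == "__main__":
    racks = {"rack1": 1, "rack2": 1, "rack3": 1}
    grown = scale_up(racks, 4)
    print(grown)
    print(scale_down_order(grown, 2))
```

Either way the racks never differ by more than one node, which is the property the list items are after.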

Differentiators

   - OSS Ecosystem / Components
      - Cass Config Builder - OSS project extracted from DataStax OpsCenter
        Life Cycle Manager to provide automated configuration file rendering
      - Cass Config Definitions - definitions files for cass-config-builder,
        defines all configuration files, their parameters, and templates
      - Management API for Apache Cassandra (MAAC)
      - Metrics Collector for Apache Cassandra (MCAC)
         - Reference Prometheus Operator CRDs
            - ServiceMonitor
            - Instance
         - Reference Grafana Operator CRDs
            - Instance
            - Dashboards
            - Datasource
   - PodTemplateSpec
      - Customization of existing pods including support for adding
        containers, volumes, etc
   - Advanced Networking
      - Node Port
      - Host Network
   - Simple security
      - Management API mTLS support
      - Automated generation of keystore and truststore for internode and
        client to node TLS
   - Automated superuser account configuration
      - The default superuser (cassandra/cassandra) is disabled and never
        available to clients
      - Cluster administration account may be automatically generated (or
        provided) with values stored in a k8s secret
   - Automatic application of NetworkTopologyStrategy with appropriate RF for
     system keyspaces
   - Validating webhook
      - Invalid changes are rejected with a helpful message
   - Rolling cluster updates
      - Change in binary (C* upgrade)
      - Change in configuration
      - Canary deployments - single rack application of changes for
        validation before broader deployment
   - Rolling restart
   - Platform Integration / Testing / Certification
      - Red Hat OpenShift compatible and certified
         - Secure, Universal Base Image (UBI) foundation images with security
           scanning performed by Red Hat
            - cass-operator
            - cass-config-builder
            - apache-cassandra w/ MCAC and MAAC
         - Integration with Red Hat certification pipeline / marketplace
         - Presence in Red Hat Operator Hub built into OpenShift interface
      - VMware Tanzu Kubernetes Grid Integrated Edition compatible and
        certified
         - Security scanning for images performed by VMware
      - Amazon EKS
      - Google GKE
      - Azure AKS
      - Documentation / Reference Implementations
      - Cloud storage classes
      - Ingress solutions
         - Sample connection validation application with reference
           implementations of Java Driver client connection parameters
   - Cluster-level Stop / Resume - stop all running instances while keeping
     persistent storage. Allows for scaling compute down to zero. Bringing the
     cluster back up follows expected startup procedures
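
For the "NetworkTopologyStrategy with appropriate RF for system keyspaces" item above, a rough Python sketch of what such an automation might generate follows. The rule used here (per-datacenter RF equal to the datacenter size, capped at 3) and the helper names are assumptions made for illustration; the email does not say how cass-operator picks the RF. The keyspace names and the ALTER KEYSPACE syntax are standard Cassandra.

```python
# Keyspaces that are commonly switched to NetworkTopologyStrategy.
SYSTEM_KEYSPACES = ("system_auth", "system_distributed", "system_traces")


def replication_map(dc_sizes, max_rf=3):
    """Pick a replication factor per datacenter.

    Assumed rule (not taken from the email): RF = min(nodes in DC, max_rf).
    dc_sizes is a hypothetical {datacenter_name: node_count} mapping.
    """
    if not dc_sizes:
        raise ValueError("at least one datacenter is required")
    return {dc: min(size, max_rf) for dc, size in dc_sizes.items()}


def alter_keyspace_statements(dc_sizes, max_rf=3):
    """Render one ALTER KEYSPACE statement per system keyspace."""
    rf = replication_map(dc_sizes, max_rf)
    options = ", ".join(f"'{dc}': {n}" for dc, n in sorted(rf.items()))
    return [
        f"ALTER KEYSPACE {ks} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
        for ks in SYSTEM_KEYSPACES
    ]


if __name__ == "__main__":
    for statement in alter_keyspace_statements({"dc1": 5, "dc2": 2}):
        print(statement)
```

After changing replication on a live cluster, a repair of the affected keyspaces would normally be needed before the new replicas hold the data.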

Road Map / Inflight

   1. Repair
      1. Reaper integration
   2. Backups
      1. Velero integration
      2. Medusa integration
   3. Advanced Networking via sidecar
      1. Combination of proxy sidecars (a la Envoy) to allow for persistent
         IP addresses despite Kubernetes' best efforts to shuffle them.
   4. Single pod canary deployments
   5. Platform Certification
      1. VMware Project Pacific
      2. Rancher Kubernetes Engine (K3s)
   6. Documentation
      1. Multi-region
      2. Multi-cloud
      3. Additional ingress providers
         1. Voyager
         2. HAProxy
         3. Gloo
         4. Ambassador
         5. Envoy
         6. NGINX Ingress Controller
      4. Additional storage class references
         1. OpenEBS
   7. Cassandra Enhancements
      1. [#CASSANDRA-15823] Support for networking via identity instead of IP
         - ASF JIRA <https://issues.apache.org/jira/browse/CASSANDRA-15823>

If there are further questions about the project, codebase, architecture,
etc., the team would be happy to dive into the details and discuss more.

Cheers,
~Chris

Christopher Bradford



On Mon, Sep 28, 2020 at 12:19 PM Patrick McFadin <pm...@gmail.com> wrote:

> I can agree with that Ben. Franck did a good job of outlining CassKop.
> Somebody from the cass-operator will be posting something similar and we
> can keep it on the mailing list.
>
> Patrick
>
> On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:
>
> > Thanks Frank and Stefan.
> >
> > @Patrick great suggestion and worthwhile getting everything on the table.
> >
> > One minor change I would advocate for. The SIG has been great to iterate
> > and interact on the details, but I really think this conversation given
> > the nature of the content needs to be on the mailing list. The mailing
> > list is really our system of record and the most accessible.
> >
> > It gives folk time to think and digest, it's asynchronous, easily
> > searchable and let's be honest, the majority of stakeholders in this are
> > not US based, so the timing issue then goes away and makes it easier for
> > people to participate in. I feel like we've made a lot more progress by
> > simply having this discussion here.
> >
> > So instead of a presentation, maybe just an email to the ML addressing
> > the headings that Patrick identified?
> >
> >
> > On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic <
> > stefan.miklosovic@instaclustr.com> wrote:
> >
> > > Hi,
> > >
> > > Patrick's suggestion seems good to me.
> > >
> > > I won't go into specifics here as I need to genuinely prepare for
> > > this. It is quite hard to dig deep into the solutions of others and
> > > bring some constructive criticism because it takes a lot of time to
> > > study it and everybody has some "why's" behind it.
> > >
> > > To summarize my goals and concerns:
> > >
> > > 1) We should be as much "Kubernetes operator idiomatic" as possible.
> > > Industry standards, no custom brain-child of this or that group
> > > because they think it is just cool or they just didn't know any
> > > better. I do NOT say it is like that right now, I just want to be
> > > ruthless here as much as possible when it comes to functionality and
> > > why it is done like that. It is awesome that we have already something
> > > latest (thanks to John) and it adheres to the latest releases. I
> > > personally had a hard time to keep up with all the releases, once I
> > > finished something and I aligned it, after a week or two there was
> > > already another one where things were different, it is a very
> > > fast-moving space and I hope that by time we develop something it will
> > > not be obsolete.
> > >
> > > 2) It may be easier said than done but it is guaranteed that people
> > > get emotional, it's their precious etc, so please let's go into this
> > > with good intentions, not trying to push one solution over the other
> > > just because they would like to see it there ... I will have an
> > > equally hard time to comply with this point. My plan is to explain
> > > what is _wrong_ with our solution. Where we made mistakes and what
> > > should be done differently but it is "too late" etc. It is quite hard
> > > to describe your work and all effort in this light but without telling
> > > what is wrong we can not decide what is good imho.
> > >
> > > 3) We should put something together fast enough so we can call it a
> > > release. We can always iterate on it for eternity. But the foundations
> > > need to be there. Here I want to say that I especially like what John
> > > did. I looked through these specs and it was obvious it has been
> > > written with care and attention. It looked _solid_. I am not sure how
> > > hard it is to put all other things on top of that, I truly do not, and
> > > here I think we would have to reinvent that wheel if we want to
> > > proceed because I can not imagine what it would be to retrofit e.g.
> > > CassKop on top of John specs, it is just like putting round pegs into
> > > the square holes, maybe some chunks would be reused easily but
> > > otherwise I worry we will be just on square one.
> > >
> > > One specific feeling I have as I read this is that even if there is
> > > the will to create the fourth operator, the respective parties will
> > > not be able to drop their own repository. The whole point behind this
> > > effort, to me, is to have a solid, community driven, stable, modern
> > > and feature complete operator people are truly using. I can see that
> > > once this is real, we will _really_ sunset our operator, redirecting
> > > people to the new operator on main readme doc etc, we truly mean it.
> > > Sure, if somebody comes and bug fix will be needed, we will fix it,
> > > but the whole point of doing this is to stop using what we have
> > > currently, over time, otherwise we are just splitting this space even
> > > more. If CassKop is not sure if they will use it because they do not
> > > know if that operator will be "enough" for them, aren't we just doing
> > > it wrong? If I exaggerate, they should be fine with deleting the whole
> > > repository and using just this Cassandra one we are going to make
> > > otherwise I don't see the point to work on this ...
> > >
> > > On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> > > wrote:
> > > >
> > > > - choose cass-operator: it is not on offer right now so let’s see if
> > > > it does
> > > >
> > > >
> > > > We should all talk a lot more, but this is 100% a mistake - I take the
> > > > blame for that. The intention has long been to offer cass-operator for
> > > > donation but it slipped through the cracks and your email yesterday
> > > > made me double-take.
> > > >
> > > > We have since resolved this misalignment. DataStax would be happy to
> > > > donate any and all of cass-operator to the ASF and C* project if it's
> > > > what we all agree best serves our collective Cassandra users. I'm also
> > > > cognizant that an immense amount of effort has gone into CassKop and we
> > > > seem to have something of an embarrassment of riches.
> > > >
> > > > I'm given to understand (haven't dug in personally) that the two
> > > > operators express pretty different opinions when it comes to frameworks,
> > > > designs, supported versions, etc. I think a discrete enumeration of the
> > > > feature set and "identities" of both could really help navigate this
> > > > conversation going forward.
> > > >
> > > > Also - thanks for that context Franck. It's always helpful to know
> > > > where other people are coming from when we're all working together
> > > > towards a common goal.
> > > >
> > > >
> > > > On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
> > > >
> > > > > I can share Orange’s view of the situation, sorry it is a long story!
> > > > >
> > > > > We started CassKop at the end of 2018 after betting on K8S which was
> > > > > not so simple as far as C* was concerned. Lack of support for local
> > > > > storage, IPs that change all the time, different network plugins to try
> > > > > to implement a non standard K8s way of having nodes see each other from
> > > > > different dcs… We hesitated with Mesos but could not have both and K8S
> > > > > was already gaining so much traction you could not not choose it.
> > > > >
> > > > > Anyway, we looked around and did not see anyone with such requirements
> > > > > so we said: why not try it ourselves but on github so that we may give
> > > > > it back to the community. We have used C* for quite a few years with
> > > > > great success on production with massive load and perfect availability.
> > > > > We love C* @ Orange :) Thanks!
> > > > >
> > > > > So we started writing support for mono-dc cluster (CassKop) and added
> > > > > the multi dc support with MultiCassKop which is another operator
> > > > > included in the CassKop repo. For more details we tried to document our
> > > > > designs as much as possible here:
> > > > > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> > > > >
> > > > > In the middle of last year we had some talks with Datastax about
> > > > > working together around their new management sidecar. Their position on
> > > > > open source was not clear at that time so we said please come back when
> > > > > you have decided to go open source with it. Which they did in the
> > > > > beginning of this year. But at that time I guess work had started on
> > > > > cass-operator so we kept our separate ways.
> > > > >
> > > > > Since the beginning of the year, we have been working with our OPS
> > > > > team to have it in production. It is not simple as the team has to learn
> > > > > K8S and trust a newborn operator. This takes time especially as our
> > > > > internal cluster has been tweaked for multi-tenancy with obscure options
> > > > > being set by our K8s team…
> > > > >
> > > > > We also developed with Instaclustr the Backup & Restore functionality
> > > > > (we have new CRDs (Custom Resource Definition) for backup and restore
> > > > > and a reconcile loop that calls out Instaclustr sidecar for these
> > > > > operations). We now support multiple backups in parallel and can write
> > > > > to s3/google or azure (but Stefan could give more details here if needed)
> > > > >
> > > > > During the SIG calls we mentioned our desire to donate CassKop once it
> > > > > satisfies our basic requirements (v1 coming just now but I said it too
> > > > > many times already). I am actually not sure Datastax mentioned their
> > > > > desire to donate cass-operator but we decided to compare the designs and
> > > > > the functionalities based on respective CRDs. The CRD is the interface
> > > > > with the user as it is where you describe the cluster that you want to
> > > > > have. These talks were very interesting and we found out that the
> > > > > CassKop team had made good choices most of the time but was maybe too
> > > > > open. Indeed our intention was to give all the possibilities for our OPS
> > > > > team to work. This includes:
> > > > > - very open topology definition using any configuration of labels to
> > > > > map dcs / racks and nodes to labels on clusters (we have labels on dcs /
> > > > > rooms / rows and server racks so we can map C* racks to storage or
> > > > > network arrays internally)
> > > > > - possibility to have multiple C* nodes on a single K8S host (because
> > > > > internal clouds are not really clouds, they have limited resources)
> > > > > - custom C* image selection,
> > > > > - custom bootstrap script that lets you configure C* as you want using
> > > > > ConfigMaps,
> > > > > - the ability to mount different volumes wherever they wanted,
> > > > > - the possibility to run any number of sidecars alongside C* for custom
> > > > > probes in our case
> > > > >
> > > > > This makes CassKop quite powerful and flexible.
> > > > > We made sure that all those options are not enabled by default so one
> > > > > can just pop a simple 3 node cluster quickly
> > > > >
> > > > > On the other hand cass-operator had an interesting way of configuring
> > > > > C* just inside the CRD using cass-config. This is simple and elegant so
> > > > > we are implementing it as well for the support of C* 4
> > > > >
> > > > > Now for the future, there are 3 choices in my opinion:
> > > > > - start from scratch (or John’s repo) by cherry picking bits from all
> > > > > operators. This is possible but will take some time / effort to have
> > > > > something usable. And then it will be compared to cass-operator and
> > > > > CassKop. I don’t see Orange contributing too much here as we believe
> > > > > CassKop to be a much better starting point
> > > > > - choose cass-operator: it is not on offer right now so let’s see if it
> > > > > does. I think Orange could contribute some bits inherited from CassKop
> > > > > if it is agreed by the community. Not sure it would be enough for us to
> > > > > use it.
> > > > > - choose CassKop: we would be delighted to donate it and contribute with
> > > > > some committers (including the original author who now works for AWS).
> > > > > It would then become the community operator but there would be
> > > > > cass-operator alongside probably. But Cass-operator is made to make it
> > > > > easier for Datastax to manage customer clusters by imposing some
> > > > > configuration. It makes sense for their needs, so maybe 2 operators. We
> > > > > don’t know how backup/restore will be handled here with medusa being
> > > > > adapted to K8s
> > > > >
> > > > > Sorry again for being long but 2 years of work deserve some lines of
> > > > > text :)
> > > > >
> > > > > I just saw your message Patrick but this was written already so we
> > > > > gain a week.
> > > > >
> > > > > Franck
> > > > >
> > > > > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> > > > > <ma...@datastax.com>> wrote:
> > > > >
> > > > > I realise there are meeting logs, but getting a wider discourse with
> > > > > non-stakeholder input might help to build a community consensus? It
> > > > > doesn't seem like it can hurt at this point, anyway.
> > > > >
> > > > > +1
> > > > >
> > > > > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > > > > <benedict@apache.org<ma...@apache.org>> wrote:
> > > > >
> > > > > Perhaps it helps to widen the field of discussion to the dev list?
> > > > >
> > > > > It might help if each of the stakeholder organisations state their view
> > > > > on the situation, including why they would or would not support a given
> > > > > approach/operator, and what (preferably specific) circumstances might
> > > > > lead them to change their mind?
> > > > >
> > > > > I realise there are meeting logs, but getting a wider discourse with
> > > > > non-stakeholder input might help to build a community consensus? It
> > > > > doesn't seem like it can hurt at this point, anyway.
> > > > >
> > > > > On 23/09/2020, 17:13, "John Sanda"
> > > > > <john.sanda@gmail.com<mailto:john.sanda@gmail.com>> wrote:
> > > > >
> > > > > I want to point out that pretty much everything being discussed in this
> > > > > thread has been discussed at length during the SIG meetings. I think it
> > > > > is worth noting because we are pretty much still having the same
> > > > > conversation.
> > > > >
> > > > > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > > > > <benedict@apache.org<ma...@apache.org>> wrote:
> > > > >
> > > > > I don't think there's anything about a code drop that's not "The Apache
> > > > > Way"
> > > > >
> > > > > If there's a consensus (or even strong majority) amongst invested
> > > > > parties, I don't see why we could not adopt an operator directly into
> > > > > the project.
> > > > >
> > > > > It's possible a green field approach might lead to fewer hard feelings,
> > > > > as everyone is in the same boat. Perhaps all operators are also
> > > > > suboptimal and could be improved with a rewrite? But I think
> > > > > coordinating a lot of different entities around an empty codebase is
> > > > > particularly challenging. I actually think it could be better for
> > > > > cohesion and collaboration to have a suboptimal but substantive starting
> > > > > point.
> > > > >
> > > > > On 23/09/2020, 16:11, "Stefan Miklosovic"
> > > > > <stefan.miklosovic@instaclustr.com<ma...@instaclustr.com>> wrote:
> > > > >
> > > > > I think that from Instaclustr it was stated quite clearly multiple
> > > > > times that we are "fine to throw it away" if there is something better
> > > > > and more wide-spread. Indeed, we have invested a lot of time in the
> > > > > operator but it was not useless at all, we gained a lot of quite unique
> > > > > knowledge how to put all pieces together. However, I think that this
> > > > > space is going to be quite fragmented and "balkanized", which is not
> > > > > always a bad thing, but in a quite narrow area as Kubernetes operator
> > > > > is, I just do not see how 4 operators are going to be beneficial for
> > > > > ordinary people ("official" from community, ours, Datastax one and
> > > > > CassKop (without any significant order)). Sure, innovation and healthy
> > > > > competition is important but to what extent ...
> > > > > One can start a Cassandra cluster on Kubernetes just so many times
> > > > > differently and nobody really likes a vendor lock-in. People wanting
> > > > > to run a cluster on K8S realise that there are three operators, each
> > > > > backed by a private business entity, and the community operator is not
> > > > > there ... Huh, interesting ... One may even start to question what is
> > > > > wrong with these folks that it takes three companies to build their
> > > > > own solution.
> > > > >
> > > > > Having said that, to my perception, Cassandra community just does not
> > > > > have enough engineers nor contributors to keep 4 operators alive at
> > > > > the same time (I wish I was wrong) so the idea of selecting the best
> > > > > one or to merge obvious things and approaches together is
> > > > > understandable, even if it meant we eventually sunset ours. In
> > > > > addition, nobody from big players is going to contribute to the code
> > > > > base of the other one, for obvious reasons, so channeling and directing
> > > > > this effort into something common for a community seems to be the only
> > > > > reasonable way of cooperation.
> > > > >
> > > > > It is quite hard to bootstrap this if the donation of the code in big
> > > > > chunks / whole repo is out of question as it is not the "Apache way"
> > > > > (there was some thread running here about this in more depth a while
> > > > > ago) and we basically need to start from scratch which is quite
> > > > > demotivating, we are just inventing the wheel and nobody is up to it.
> > > > > It is like people are waiting for that to happen so they can jump in
> > > > > "once it is the thing" but it will never materialise or at least the
> > > > > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > > > > in this heavily if there is already a working operator from companies
> > > > > mentioned above. As I understood it, one reason of not choosing the
> > > > > way of donating it all is that "the learning and community building
> > > > > should happen in organic manner and we just can not accept the
> > > > > donation", but is not it true that it is easier to build a community
> > > > > around something which is already there rather than trying to build it
> > > > > around an idea which is quite hard to dedicate to?
> > > > >
> > > > > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org
> > > > > <ma...@apache.org>> wrote:
> > > > >
> > > > > I think there's significant value to the community in trying to
> > > > > coalesce on a single approach,
> > > > > I agree. Unfortunately in this case, the parties with a vested interest
> > > > > and written operators came to the table and couldn't agree to coalesce
> > > > > on a single approach. John Sanda attempted to start an initiative to
> > > > > write a best-of-breed combining choice parts of each operator, but that
> > > > > effort did not gain traction.
> > > > >
> > > > > Which is where my hypothesis comes from that if there were a clear
> > > > > "better fit" operator to start from we wouldn't be in a deadlock; the
> > > > > correct choice would be obvious. Reasonably so, every engineer that's
> > > > > written something is going to want that something to be used and not
> > > > > thrown away in favor of another something without strong evidence as to
> > > > > why that's the better choice.
> > > > >
> > > > > As far as I know, nobody has made a clear case as to a more compelling
> > > > > place to start in terms of an operator donation the project then
> > > > > collaborates on. There's no mass adoption evidence nor feature
> > > > > enumeration that I know of for any of the approaches anyone's taken, so
> > > > > the discussions remain stalled.
> > > > >
> > > > > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > > > > <benedict@apache.org<ma...@apache.org>> wrote:
> > > > >
> > > > > I think there's significant value to the community in trying to
> > > > > coalesce on a single approach, earlier than later. This is an
> > > > > opportunity to expand the number of active organisations involved
> > > > > directly in the Apache Cassandra project, as well as to more quickly
> > > > > expand the project's functionality into an area we consider urgent and
> > > > > important. I think it would be a real shame to waste this opportunity.
> > > > > No doubt it will be hard, as organisations have certain built-in
> > > > > investments in their own approaches.
> > > > >
> > > > > I haven't participated in these calls as I do not consider myself to
> > > > > have the relevant experience and expertise, and have other focuses on
> > > > > the project. I just wanted to voice a vote in favour of trying to bring
> > > > > the different organisations together on a single approach if possible.
> > > > > Is there anything the project can do to help this happen?
> > > > >
> > > > > On 23/09/2020, 03:04, "Ben Bromhead"
> > > > > <ben@instaclustr.com<mailto:ben@instaclustr.com>> wrote:
> > > > >
> > > > > I think there is certainly an appetite to donate and standardise on a
> > > > > given operator (as mentioned in this thread).
> > > > >
> > > > > I personally found the SIG hard to participate in due to time zones and
> > > > > the synchronous nature of it.
> > > > >
> > > > > So while it was a great forum to dive into certain details for a subset
> > > > > of participants and a worthwhile endeavour, I wouldn't paint it as an
> > > > > accurate reflection of community intent.
> > > > >
> > > > > I don't think that any participants want to continue down the path of
> > > > > "let a thousand flowers bloom". That's why we are looking towards
> > > > > CassKop (as well as a number of technical reasons).
> > > > >
> > > > > Some of the recorded meetings and outputs can also be found if you are
> > > > > interested in some primary sources:
> > > > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> > > > >
> > > > > From what I understand second-hand from talking to people on the SIG
> > > > > calls, there was a general inability to agree on an existing operator
> > > > > as a starting point and not much engagement on taking best of breed
> > > > > from the various to combine them. Seems to leave us in the "let a
> > > > > thousand flowers bloom" stage of letting operators grow in the
> > > > > ecosystem and seeing which ones meet the needs of end users before
> > > > > talking about adopting one into the foundation.
> > > > >
> > > > > Great to hear that you folks are joining forces though! Bodes well for
> > > > > C* users that are wanting to run things on k8s.
> > > > >
> > > > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com
> > > > > <ma...@instaclustr.com>> wrote:
> > > > >
> > > > > For what it's worth, a quick update from me:
> > > > >
> > > > > CassKop now has at least two organisations working on it substantially
> > > > > (Orange and Instaclustr) as well as the numerous other contributors.
> > > > >
> > > > > Internally we will also start pointing others towards CassKop once a
> > > > > few things get merged. While we are not sunsetting our operator yet, it
> > > > > is certainly looking that way.
> > > > >
> > > > > I'd love to see the community adopt it as a starting point for working
> > > > > towards whatever level of functionality is desired.
> > > > >
> > > > > Cheers
> > > > >
> > > > > Ben
> > > > >
> > > > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:
> > > > >
> > > > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
> > > > > wrote:
> > > > >
> > > > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> > > > > more operators in the ecosystem. Has one of them hit a clear
> > > > > supermajority of adoption that makes it the de facto default and makes
> > > > > sense to pull it into the project?
> > > > >
> > > > > We as a project community were pretty slow to move on building a PoV
> > > > > around kubernetes so we find ourselves in a situation with a bunch of
> > > > > contenders for inclusion in the project. It's not clear to me what
> > > > > heuristics we'd use to gauge which one would be the best fit for
> > > > > inclusion outside letting community adoption speak.
> > > > >
> > > > > ---
> > > > > Josh McKenzie
> > > > >
> > > > > We actually talked a good bit on the SIG call earlier today about
> > > > > heuristics. We need to document what functionality an operator should
> > > > > include at level 0, level 1, etc. We did discuss this a good bit during
> > > > > some of the initial SIG meetings, but I guess it wasn't really a focal
> > > > > point at the time. I think we should also provide references to
> > > > > existing operator projects and possibly other related projects. This
> > > > > would benefit both community users as well as people working on these
> > > > > projects.
> > > > >
> > > > > - John
> > > > >
> > > > > --
> > > > >
> > > > > Ben Bromhead
> > > > >
> > > > > Instaclustr | www.instaclustr.com | @instaclustr
> > > > > <http://twitter.com/instaclustr> | (650) 284 9692
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > > > > For additional commands, e-mail: dev-help@cassandra.apache.org
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > - John
> > > > >
> > > > >
> > > > >
> > > > > ___________________________________________________________________
> > > > >
> > > > >
> > > > > This message and its attachments may contain confidential or privileged
> > > > > information that may be protected by law; they should not be
> > > > > distributed, used or copied without authorisation. If you have received
> > > > > this email in error, please notify the sender and delete this message
> > > > > and its attachments. As emails may be altered, Orange is not liable for
> > > > > messages that have been modified, changed or falsified. Thank you.
> > > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > > For additional commands, e-mail: dev-help@cassandra.apache.org
> > >
> > >
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Patrick McFadin <pm...@gmail.com>.

I can agree with that Ben. Franck did a good job of outlining CassKop.
Somebody from the cass-operator will be posting something similar and we
can keep it on the mailing list.

Patrick

On Sun, Sep 27, 2020 at 2:16 PM Ben Bromhead <be...@instaclustr.com> wrote:

> Thanks Franck and Stefan.
>
> @Patrick great suggestion and worthwhile getting everything on the table.
>
> One minor change I would advocate for. The SIG has been great to iterate
> and interact on the details, but I really think this conversation given the
> nature of the content needs to be on the mailing list. The mailing list is
> really our system of record and the most accessible.
>
> It gives folk time to think and digest, it's asynchronous, easily
> searchable and let's be honest, the majority of stakeholders in this are
> not US based, so the timing issue then goes away and makes it easier for
> people to participate in. I feel like we've made a lot more progress by
> simply having this discussion here.
>
> So instead of a presentation, maybe just an email to the ML addressing the
> headings that Patrick identified?
>
>
> On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic <
> stefan.miklosovic@instaclustr.com> wrote:
>
> > Hi,
> >
> > Patrick's suggestion seems good to me.
> >
> > I won't go into specifics here as I need to genuinely prepare for
> > this. It is quite hard to dig deep into the solutions of others and
> > bring some constructive criticism because it takes a lot of time to
> > study it and everybody has some "why's" behind it.
> >
> > To summarize my goals and concerns:
> >
> > 1) We should be as much "Kubernetes operator idiomatic" as possible.
> > Industry standards, no custom brain-child of this or that group
> > because they think it is just cool or they just didn't know any
> > better. I do NOT say it is like that right now, I just want to be
> > ruthless here as much as possible when it comes to functionality and
> > why it is done like that. It is awesome that we have already something
> > latest (thanks to John) and it adheres to the latest releases. I
> > personally had a hard time to keep up with all the releases, once I
> > finished something and I aligned it, after a week or two there was
> > already another one where things were different, it is a very
> > fast-moving space and I hope that by time we develop something it will
> > not be obsolete.
> >
> > 2) It may be easier said than done but it is guaranteed that people
> > get emotional, it's their precious etc, so please let's go into this
> > with good intentions, not trying to push one solution over the other
> > just because they would like to see it there ... I will have an
> > equally hard time to comply with this point. My plan is to explain
> > what is _wrong_ with our solution. Where we made mistakes and what
> > should be done differently but it is "too late" etc. It is quite hard
> > to describe your work and all effort in this light but without telling
> > what is wrong we can not decide what is good imho.
> >
> > 3) We should put something together fast enough so we can call it a
> > release. We can always iterate on it for eternity. But the foundations
> > need to be there. Here I want to say that I especially like what John
> > did. I looked through these specs and it was obvious it has been
> > written with care and attention. It looked _solid_. I am not sure how
> > hard it is to put all other things on top of that, I truly do not, and
> > here I think we would have to reinvent that wheel if we want to
> > proceed because I can not imagine what it would be to retrofit e.g.
> > CassKop on top of John specs, it is just like putting round pegs into
> > the square holes, maybe some chunks would be reused easily but
> > otherwise I worry we will be just on square one.
> >
> > One specific feeling I have as I read this is that even if there is
> > the will to create the fourth operator, the respective parties will
> > not be able to drop their own repository. The whole point behind this
> > effort, to me, is to have a solid, community driven, stable, modern
> > and feature complete operator people are truly using. I can see that
> > once this is real, we will _really_ sunset our operator, redirecting
> > people to the new operator on main readme doc etc, we truly mean it.
> > Sure, if somebody comes and bug fix will be needed, we will fix it,
> > but the whole point of doing this is to stop using what we have
> > currently, over time, otherwise we are just splitting this space even
> > more. If CassKop is not sure if they will use it because they do not
> > know if that operator will be "enough" for them, aren't we just doing
> > it wrong? If I exaggerate, they should be fine with deleting the whole
> > repository and using just this Cassandra one we are going to make
> > otherwise I don't see the point to work on this ...
> >
> > On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> > wrote:
> > >
> > > - choose cass-operator: it is not on offer right now so let’s see if
> > > it does
> > >
> > >
> > > We should all talk a lot more, but this is 100% a mistake - I take the
> > > blame for that. The intention has long been to offer cass-operator for
> > > donation but it slipped through the cracks and your email yesterday
> > > made me double-take.
> > >
> > > We have since resolved this misalignment. DataStax would be happy to
> > > donate any and all of cass-operator to the ASF and C* project if it's
> > > what we all agree best serves our collective Cassandra users. I'm also
> > > cognizant that an immense amount of effort has gone into CassKop and we
> > > seem to have something of an embarrassment of riches.
> > >
> > > I'm given to understand (haven't dug in personally) that the two
> > > operators express pretty different opinions when it comes to
> > > frameworks, designs, supported versions, etc. I think a discrete
> > > enumeration of the feature set and "identities" of both could really
> > > help navigate this conversation going forward.
> > >
> > > Also - thanks for that context Franck. It's always helpful to know
> > > where other people are coming from when we're all working together
> > > towards a common goal.
> > >
> > >
> > > On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
> > >
> > > > I can share Orange’s view of the situation, sorry it is a long story!
> > > >
> > > > We started CassKop at the end of 2018 after betting on K8S which was
> > > > not so simple as far as C* was concerned. Lack of support for local
> > > > storage, IPs that change all the time, different network plugins to
> > > > try to implement a non standard K8s way of having nodes see each other
> > > > from different dcs… We hesitated with Mesos but could not have both
> > > > and K8S was already gaining so much traction that you could not not
> > > > choose it.
> > > >
> > > > Anyway, we looked around and did not see anyone with such
> > > > requirements so we said: why not try it ourselves but on github so
> > > > that we may give it back to the community. We have used C* for quite
> > > > a few years with great success in production with massive load and
> > > > perfect availability. We love C* @ Orange :) Thanks!
> > > >
> > > > So we started writing support for mono-dc cluster (CassKop) and added
> > > > the multi dc support with MultiCassKop which is another operator
> > > > included in the CassKop repo. For more details we tried to document
> > > > our designs as much as possible here:
> > > > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> > > >
> > > > In the middle of last year we had some talks with DataStax about
> > > > working together around their new management sidecar. Their position
> > > > on open source was not clear at that time so we said please come back
> > > > when you have decided to go open source with it. Which they did in
> > > > the beginning of this year. But at that time I guess work had started
> > > > on cass-operator so we kept our separate ways.
> > > >
> > > > Since the beginning of the year, we have been working with our OPS
> > > > team to have it in production. It is not simple as the team has to
> > > > learn K8S and trust a newborn operator. This takes time especially as
> > > > our internal cluster has been tweaked for multi-tenancy with obscure
> > > > options being set by our K8s team…
> > > >
> > > > We also developed the Backup & Restore functionality with Instaclustr
> > > > (we have new CRDs (Custom Resource Definitions) for backup and restore
> > > > and a reconcile loop that calls out to the Instaclustr sidecar for
> > > > these operations). We now support multiple backups in parallel and
> > > > can write to S3, Google or Azure (but Stefan could give more details
> > > > here if needed)
> > > >
> > > > During the SIG calls we mentioned our desire to donate CassKop once it
> > > > satisfies our basic requirements (v1 coming just now but I said it too
> > > > many times already). I am actually not sure DataStax mentioned their
> > > > desire to donate cass-operator but we decided to compare the designs
> > > > and the functionalities based on respective CRDs. The CRD is the
> > > > interface with the user as it is where you describe the cluster that
> > > > you want to have. These talks were very interesting and we found out
> > > > that the CassKop team had made good choices most of the time but was
> > > > maybe too open. Indeed our intention was to give all the
> > > > possibilities for our OPS team to work. This includes:
> > > > - very open topology definition using any configuration of labels to
> > > > map dcs / racks and nodes to labels on clusters (we have labels on
> > > > dcs / rooms / rows and server racks so we can map C* racks to storage
> > > > or network arrays internally)
> > > > - possibility to have multiple C* nodes on a single K8S host (because
> > > > internal clouds are not really clouds, they have limited resources)
> > > > - custom C* image selection,
> > > > - custom bootstrap script that lets you configure C* as you want
> > > > using ConfigMaps,
> > > > - the ability to mount different volumes wherever they wanted,
> > > > - the possibility to run any number of sidecars alongside C* for
> > > > custom probes in our case
> > > >
> > > > This makes CassKop quite powerful and flexible.
> > > > We made sure that all those options are not enabled by default so one
> > > > can just pop a simple 3 node cluster quickly
> > > >
> > > > On the other hand cass-operator had an interesting way of configuring
> > > > C* just inside the CRD using cass-config. This is simple and elegant
> > > > so we are implementing it as well for the support of C* 4
> > > >
> > > > Now for the future, there are 3 choices in my opinion:
> > > > - start from scratch (or John’s repo) by cherry picking bits from all
> > > > operators. This is possible but will take some time / effort to have
> > > > something usable. And then it will be compared to cass-operator and
> > > > CassKop. I don’t see Orange contributing too much here as we believe
> > > > CassKop to be a much better starting point
> > > > - choose cass-operator: it is not on offer right now so let’s see if
> > > > it does. I think Orange could contribute some bits inherited from
> > > > CassKop if it is agreed by the community. Not sure it would be enough
> > > > for us to use it.
> > > > - choose CassKop: we would be delighted to donate it and contribute
> > > > with some committers (including the original author who now works for
> > > > AWS). It would then become the community operator but there would be
> > > > cass-operator alongside probably. But cass-operator is made to make
> > > > it easier for DataStax to manage customer clusters by imposing some
> > > > configuration. It makes sense for their needs, so maybe 2 operators.
> > > > We don’t know how backup/restore will be handled here with medusa
> > > > being adapted to K8s
> > > >
> > > > Sorry again for being long but 2 years of work deserve some lines of
> > > > text :)
> > > >
> > > > I just saw your message Patrick but this was written already so we
> > > > gain a week.
> > > >
> > > > Franck
> > > >
> > > > On 24 Sep 2020, at 10:08, Benjamin Lerer
> > > > <benjamin.lerer@datastax.com> wrote:
> > > >
> > > > I realise there are meeting logs, but getting a wider discourse with
> > > > non-stakeholder input might help to build a community consensus? It
> > > > doesn't seem like it can hurt at this point, anyway.
> > > >
> > > > +1
> > > >
> > > > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > > > <benedict@apache.org> wrote:
> > > >
> > > > Perhaps it helps to widen the field of discussion to the dev list?
> > > >
> > > > It might help if each of the stakeholder organisations state their
> > > > view on the situation, including why they would or would not support
> > > > a given approach/operator, and what (preferably specific)
> > > > circumstances might lead them to change their mind?
> > > >
> > > > I realise there are meeting logs, but getting a wider discourse with
> > > > non-stakeholder input might help to build a community consensus? It
> > > > doesn't seem like it can hurt at this point, anyway.
> > > >
> > > > On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com> wrote:
> > > >
> > > > I want to point out that pretty much everything being discussed in
> > > > this thread has been discussed at length during the SIG meetings. I
> > > > think it is worth noting because we are pretty much still having the
> > > > same conversation.
> > > >
> > > > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > > > <benedict@apache.org> wrote:
> > > >
> > > > I don't think there's anything about a code drop that's not "The
> > > > Apache Way"
> > > >
> > > > If there's a consensus (or even strong majority) amongst invested
> > > > parties, I don't see why we could not adopt an operator directly into
> > > > the project.
> > > >
> > > > It's possible a green field approach might lead to fewer hard
> > > > feelings, as everyone is in the same boat. Perhaps all operators are
> > > > also suboptimal and could be improved with a rewrite? But I think
> > > > coordinating a lot of different entities around an empty codebase is
> > > > particularly challenging. I actually think it could be better for
> > > > cohesion and collaboration to have a suboptimal but substantive
> > > > starting point.
> > > >
> > > > On 23/09/2020, 16:11, "Stefan Miklosovic"
> > > > <stefan.miklosovic@instaclustr.com> wrote:
> > > >
> > > > I think that from Instaclustr it was stated quite clearly multiple
> > > > times that we are "fine to throw it away" if there is something
> > > > better and more widespread. Indeed, we have invested a lot of time in
> > > > the operator but it was not useless at all, we gained a lot of quite
> > > > unique knowledge how to put all pieces together. However, I think
> > > > that this space is going to be quite fragmented and "balkanized",
> > > > which is not always a bad thing, but in a quite narrow area as
> > > > Kubernetes operator is, I just do not see how 4 operators are going
> > > > to be beneficial for ordinary people ("official" from community,
> > > > ours, DataStax one and CassKop (without any significant order)).
> > > > Sure, innovation and healthy competition is important but to what
> > > > extent ...
> > > > One can start a Cassandra cluster on Kubernetes just so many times
> > > > differently and nobody really likes a vendor lock-in. People wanting
> > > > to run a cluster on K8S realise that there are three operators, each
> > > > backed by a private business entity, and the community operator is
> > > > not there ... Huh, interesting ... One may even start to question
> > > > what is wrong with these folks that it takes three companies to build
> > > > their own solution.
> > > >
> > > > Having said that, to my perception, Cassandra community just does not
> > > > have enough engineers nor contributors to keep 4 operators alive at
> > > > the same time (I wish I was wrong) so the idea of selecting the best
> > > > one or to merge obvious things and approaches together is
> > > > understandable, even if it meant we eventually sunset ours. In
> > > > addition, nobody from big players is going to contribute to the code
> > > > base of the other one, for obvious reasons, so channeling and
> > > > directing this effort into something common for a community seems to
> > > > be the only reasonable way of cooperation.
> > > >
> > > > It is quite hard to bootstrap this if the donation of the code in big
> > > > chunks / whole repo is out of question as it is not the "Apache way"
> > > > (there was some thread running here about this in more depth a while
> > > > ago) and we basically need to start from scratch which is quite
> > > > demotivating, we are just reinventing the wheel and nobody is up to
> > > > it. It is like people are waiting for that to happen so they can jump
> > > > in "once it is the thing" but it will never materialise or at least
> > > > the hurdle to kick it off is unnecessarily high. Nobody is going to
> > > > invest in this heavily if there is already a working operator from
> > > > companies mentioned above. As I understood it, one reason of not
> > > > choosing the way of donating it all is that "the learning and
> > > > community building should happen in organic manner and we just can
> > > > not accept the donation", but is it not true that it is easier to
> > > > build a community around something which is already there rather than
> > > > trying to build it around an idea which is quite hard to dedicate to?
> > > >
> > > > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org>
> > > > wrote:
> > > >
> > > > I think there's significant value to the community in trying to
> > > > coalesce on a single approach,
> > > >
> > > > I agree. Unfortunately in this case, the parties with a vested
> > > > interest and written operators came to the table and couldn't agree
> > > > to coalesce on a single approach. John Sanda attempted to start an
> > > > initiative to write a best-of-breed combining choice parts of each
> > > > operator, but that effort did not gain traction.
> > > >
> > > > Which is where my hypothesis comes from that if there were a clear
> > > > "better fit" operator to start from we wouldn't be in a deadlock; the
> > > > correct choice would be obvious. Reasonably so, every engineer that's
> > > > written something is going to want that something to be used and not
> > > > thrown away in favor of another something without strong evidence as
> > > > to why that's the better choice.
> > > >
> > > > As far as I know, nobody has made a clear case as to a more
> > > > compelling place to start in terms of an operator donation the
> > > > project then collaborates on. There's no mass adoption evidence nor
> > > > feature enumeration that I know of for any of the approaches anyone's
> > > > taken, so the discussions remain stalled.
> > > >
> > > > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > > > <benedict@apache.org> wrote:
> > > >
> > > > I think there's significant value to the community in trying to
> > > > coalesce on a single approach, earlier than later. This is an
> > > > opportunity to expand the number of active organisations involved
> > > > directly in the Apache Cassandra project, as well as to more quickly
> > > > expand the project's functionality into an area we consider urgent
> > > > and important. I think it would be a real shame to waste this
> > > > opportunity. No doubt it will be hard, as organisations have certain
> > > > built-in investments in their own approaches.
> > > >
> > > > I haven't participated in these calls as I do not consider myself to
> > > > have the relevant experience and expertise, and have other focuses on
> > > > the project. I just wanted to voice a vote in favour of trying to
> > > > bring the different organisations together on a single approach if
> > > > possible. Is there anything the project can do to help this happen?
> > > >
> > > > On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com> wrote:
> > > >
> > > > I think there is certainly an appetite to donate and standardise on a
> > > > given operator (as mentioned in this thread).
> > > >
> > > > I personally found the SIG hard to participate in due to time zones
> > > > and the synchronous nature of it.
> > > >
> > > > So while it was a great forum to dive into certain details for a
> > > > subset of participants and a worthwhile endeavour, I wouldn't paint
> > > > it as an accurate reflection of community intent.
> > > >
> > > > I don't think that any participants want to continue down the path of
> > > > "let a thousand flowers bloom". That's why we are looking towards
> > > > CassKop (as well as a number of technical reasons).
> > > >
> > > > Some of the recorded meetings and outputs can also be found if you
> > > > are interested in some primary sources:
> > > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> > > >
> > > > From what I understand second-hand from talking to people on the SIG
> > > > calls, there was a general inability to agree on an existing operator
> > > > as a starting point and not much engagement on taking best of breed
> > > > from the various to combine them. Seems to leave us in the "let a
> > > > thousand flowers bloom" stage of letting operators grow in the
> > > > ecosystem and seeing which ones meet the needs of end users before
> > > > talking about adopting one into the foundation.
> > > >
> > > > Great to hear that you folks are joining forces though! Bodes well
> > > > for C* users that are wanting to run things on k8s.
> > > >
> > > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com>
> > > > wrote:
> > > >
> > > > For what it's worth, a quick update from me:
> > > >
> > > > CassKop now has at least two organisations working on it
> > > > substantially (Orange and Instaclustr) as well as the numerous other
> > > > contributors.
> > > >
> > > > Internally we will also start pointing others towards CassKop once a
> > > > few things get merged. While we are not sunsetting our operator yet,
> > > > it is certainly looking that way.
> > > >
> > > > I'd love to see the community adopt it as a starting point for
> > > > working towards whatever level of functionality is desired.
> > > >
> > > > Cheers
> > > >
> > > > Ben
> > > >
> > > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com>
> > > > wrote:
> > > >
> > > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
> > > > wrote:
> > > >
> > > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> > > > more operators in the ecosystem. Has one of them hit a clear
> > > > supermajority of adoption that makes it the de facto default and
> > > > makes sense to pull it into the project?
> > > >
> > > > We as a project community were pretty slow to move on building a PoV
> > > > around kubernetes so we find ourselves in a situation with a bunch of
> > > > contenders for inclusion in the project. It's not clear to me what
> > > > heuristics we'd use to gauge which one would be the best fit for
> > > > inclusion outside letting community adoption speak.
> > > >
> > > > ---
> > > > Josh McKenzie
> > > >
> > > > We actually talked a good bit on the SIG call earlier today about
> > > > heuristics. We need to document what functionality an operator should
> > > > include at level 0, level 1, etc. We did discuss this a good bit
> > > > during some of the initial SIG meetings, but I guess it wasn't really
> > > > a focal point at the time. I think we should also provide references
> > > > to existing operator projects and possibly other related projects.
> > > > This would benefit both community users as well as people working on
> > > > these projects.
> > > >
> > > > - John
> > > >
> > > > --
> > > >
> > > > Ben Bromhead
> > > >
> > > > Instaclustr | www.instaclustr.com | @instaclustr
> > > > <http://twitter.com/instaclustr> | (650) 284 9692
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > > > For additional commands, e-mail: dev-help@cassandra.apache.org
> > > >
> > > > --
> > > >
> > > > - John
> > > >
> > > >
> > > > This message and its attachments may contain confidential or
> > > > privileged information that may be protected by law; they should not
> > > > be distributed, used or copied without authorisation. If you have
> > > > received this email in error, please notify the sender and delete
> > > > this message and its attachments. As emails may be altered, Orange is
> > > > not liable for messages that have been modified, changed or
> > > > falsified. Thank you.
> > > >
> >
> >
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Ben Bromhead <be...@instaclustr.com>.

Thanks Franck and Stefan.

@Patrick great suggestion and worthwhile getting everything on the table.

One minor change I would advocate for: the SIG has been great for iterating
and interacting on the details, but I really think this conversation, given
the nature of the content, needs to be on the mailing list. The mailing list
is really our system of record and the most accessible.

It gives folk time to think and digest, it's asynchronous, easily
searchable and let's be honest, the majority of stakeholders in this are
not US based, so the timing issue then goes away and makes it easier for
people to participate in. I feel like we've made a lot more progress by
simply having this discussion here.

So instead of a presentation, maybe just an email to the ML addressing the
headings that Patrick identified?


On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic <
stefan.miklosovic@instaclustr.com> wrote:

> Hi,
>
> Patrick's suggestion seems good to me.
>
> I won't go into specifics here as I need to genuinely prepare for
> this. It is quite hard to dig deep into the solutions of others and
> bring some constructive criticism because it takes a lot of time to
> study it and everybody has some "why's" behind it.
>
> To summarize my goals and concerns:
>
> 1) We should be as much "Kubernetes operator idiomatic" as possible.
> Industry standards, no custom brain-child of this or that group
> because they think it is just cool or they just didn't know any
> better. I do NOT say it is like that right now, I just want to be
> ruthless here as much as possible when it comes to functionality and
> why it is done like that. It is awesome that we have already something
> latest (thanks to John) and it adheres to the latest releases. I
> personally had a hard time to keep up with all the releases, once I
> finished something and I aligned it, after a week or two there was
> already another one where things were different, it is a very
> fast-moving space and I hope that by time we develop something it will
> not be obsolete.
>
> 2) It may be easier said than done but it is guaranteed that people
> get emotional, it's their precious etc, so please let's go into this
> with good intentions, not trying to push one solution over the other
> just because they would like to see it there ... I will have an
> equally hard time to comply with this point. My plan is to explain
> what is _wrong_ with our solution. Where we made mistakes and what
> should be done differently but it is "too late" etc. It is quite hard
> to describe your work and all effort in this light but without telling
> what is wrong we can not decide what is good imho.
>
> 3) We should put something together fast enough so we can call it a
> release. We can always iterate on it for eternity. But the foundations
> need to be there. Here I want to say that I especially like what John
> did. I looked through these specs and it was obvious it has been
> written with care and attention. It looked _solid_. I am not sure how
> hard it is to put all other things on top of that, I truly do not, and
> here I think we would have to reinvent that wheel if we want to
> proceed because I can not imagine what it would be to retrofit e.g.
> CassKop on top of John specs, it is just like putting round pegs into
> the square holes, maybe some chunks would be reused easily but
> otherwise I worry we will be just on square one.
>
> One specific feeling I have as I read this is that even if there is
> the will to create the fourth operator, the respective parties will
> not be able to drop their own repository. The whole point behind this
> effort, to me, is to have a solid, community driven, stable, modern
> and feature complete operator people are truly using. I can see that
> once this is real, we will _really_ sunset our operator, redirecting
> people to the new operator on main readme doc etc, we truly mean it.
> Sure, if somebody comes and bug fix will be needed, we will fix it,
> but the whole point of doing this is to stop using what we have
> currently, over time, otherwise we are just splitting this space even
> more. If CassKop is not sure if they will use it because they do not
> know if that operator will be "enough" for them, aren't we just doing
> it wrong? If I exaggerate, they should be fine with deleting the whole
> repository and using just this Cassandra one we are going to make
> otherwise I don't see the point to work on this ...
>
> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org>
> wrote:
> >
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does
> >
> >
> > We should all talk a lot more, but this is 100% a mistake - I take the
> > blame for that. The intention has long been to offer cass-operator for
> > donation but it slipped through the cracks and your email yesterday made
> > me double-take.
> >
> > We have since resolved this misalignment. DataStax would be happy to
> > donate any and all of cass-operator to the ASF and C* project if it's
> > what we all agree best serves our collective Cassandra users. I'm also
> > cognizant that an immense amount of effort has gone into CassKop and we
> > seem to have something of an embarrassment of riches.
> >
> > I'm given to understand (haven't dug in personally) that the two
> > operators express pretty different opinions when it comes to frameworks,
> > designs, supported versions, etc. I think a discrete enumeration of the
> > feature set and "identities" of both could really help navigate this
> > conversation going forward.
> >
> > Also - thanks for that context Franck. It's always helpful to know where
> > other people are coming from when we're all working together towards a
> > common goal.
> >
> >
> > On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
> >
> > > I can share Orange’s view of the situation, sorry it is a long story!
> > >
> > > We started CassKop at the end of 2018 after betting on K8S which was
> > > not so simple as far as C* was concerned. Lack of support for local
> > > storage, IPs that change all the time, different network plugins to try
> > > to implement a non standard K8s way of having nodes see each other from
> > > different dcs… We hesitated with Mesos but could not have both and K8S
> > > was already gaining so much traction that you could not not choose it.
> > >
> > > Anyway, we looked around and did not see anyone with such requirements
> > > so we said: why not try it ourselves but on github so that we may give
> > > it back to the community. We have used C* for quite a few years with
> > > great success in production with massive load and perfect availability.
> > > We love C* @ Orange :) Thanks!
> > >
> > > So we started writing support for mono-dc cluster (CassKop) and added
> > > the multi dc support with MultiCassKop which is another operator
> > > included in the CassKop repo. For more details we tried to document our
> > > designs as much as possible here:
> > > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> > >
> > > In the middle of last year we had some talks with DataStax about
> > > working together around their new management sidecar. Their position on
> > > open source was not clear at that time so we said please come back when
> > > you have decided to go open source with it. Which they did in the
> > > beginning of this year. But at that time I guess work had started on
> > > cass-operator so we kept our separate ways.
> > >
> > > Since the beginning of the year, we have been working with our OPS team
> > > to have it in production. It is not simple as the team has to learn K8S
> > > and trust a newborn operator. This takes time especially as our
> > > internal cluster has been tweaked for multi-tenancy with obscure
> > > options being set by our K8s team…
> > >
> > > We also developed the Backup & Restore functionality with Instaclustr
> > > (we have new CRDs (Custom Resource Definitions) for backup and restore
> > > and a reconcile loop that calls out to the Instaclustr sidecar for
> > > these operations). We now support multiple backups in parallel and can
> > > write to S3, Google or Azure (but Stefan could give more details here
> > > if needed)
> > >
> > > During the SIG calls we mentioned our desire to donate CassKop once it
> > > satisfies our basic requirements (v1 is coming just now, but I have said
> > > that too many times already). I am actually not sure DataStax mentioned
> > > their desire to donate cass-operator, but we decided to compare the designs
> > > and the functionalities based on the respective CRDs. The CRD is the
> > > interface with the user, as it is where you describe the cluster that you
> > > want to have. These talks were very interesting and we found out that the
> > > CassKop team had made good choices most of the time but was maybe too open.
> > > Indeed, our intention was to give our OPS team all the possibilities they
> > > needed to work. This includes:
> > > - a very open topology definition using any configuration of labels to map
> > > dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> > > / rows and server racks so we can map C* racks to storage or network arrays
> > > internally)
> > > - the possibility to have multiple C* nodes on a single K8S host (because
> > > internal clouds are not really clouds, they have limited resources)
> > > - custom C* image selection,
> > > - a custom bootstrap script that lets you configure C* as you want using
> > > ConfigMaps,
> > > - the ability to mount different volumes wherever they want,
> > > - the possibility to run any number of sidecars alongside C* (for custom
> > > probes in our case)
> > >
> > > This makes CassKop quite powerful and flexible.
> > > We made sure that none of those options is enabled by default, so one can
> > > just pop up a simple 3-node cluster quickly.
> > >
> > > On the other hand, cass-operator had an interesting way of configuring C*
> > > just inside the CRD using cass-config. This is simple and elegant, so we are
> > > implementing it as well for the support of C* 4.
> > >
> > > Now for the future, there are 3 choices in my opinion:
> > > - start from scratch (or John’s repo) by cherry-picking bits from all
> > > operators. This is possible but will take some time / effort to have
> > > something usable. And then it will be compared to cass-operator and
> > > CassKop. I don’t see Orange contributing too much here as we believe
> > > CassKop to be a much better starting point.
> > > - choose cass-operator: it is not on offer right now so let’s see if it
> > > does. I think Orange could contribute some bits inherited from CassKop if
> > > the community agrees. Not sure it would be enough for us to use it.
> > > - choose CassKop: we would be delighted to donate it and contribute
> > > some committers (including the original author, who now works for AWS). It
> > > would then become the community operator, but there would probably be
> > > cass-operator alongside. Cass-operator is made to make it easier for
> > > DataStax to manage customer clusters by imposing some configuration. It
> > > makes sense for their needs, so maybe 2 operators. We don’t know how
> > > backup/restore will be handled here with Medusa being adapted to K8s.
> > >
> > > Sorry again for being long, but 2 years of work deserve some lines of text
> > > :)
> > >
> > > I just saw your message, Patrick, but this was already written, so we gain a
> > > week.
> > >
> > > Franck
> > >
> > > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> > > <ma...@datastax.com>> wrote:
> > >
> > > I realise there are meeting logs, but getting a wider discourse with
> > > non-stakeholder input might help to build a community consensus? It doesn't
> > > seem like it can hurt at this point, anyway.
> > >
> > > +1
> > >
> > > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > > <benedict@apache.org<ma...@apache.org>> wrote:
> > >
> > > Perhaps it helps to widen the field of discussion to the dev list?
> > >
> > > It might help if each of the stakeholder organisations state their view on
> > > the situation, including why they would or would not support a given
> > > approach/operator, and what (preferably specific) circumstances might lead
> > > them to change their mind?
> > >
> > > I realise there are meeting logs, but getting a wider discourse with
> > > non-stakeholder input might help to build a community consensus? It doesn't
> > > seem like it can hurt at this point, anyway.
> > >
> > > On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com<mailto:john.
> > > sanda@gmail.com>> wrote:
> > >
> > > I want to point out that pretty much everything being discussed in this
> > > thread has been discussed at length during the SIG meetings. I think it is
> > > worth noting because we are pretty much still having the same conversation.
> > >
> > > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > > <benedict@apache.org<ma...@apache.org>> wrote:
> > >
> > > I don't think there's anything about a code drop that's not "The Apache
> > > Way".
> > >
> > > If there's a consensus (or even strong majority) amongst invested parties,
> > > I don't see why we could not adopt an operator directly into the project.
> > >
> > > It's possible a green field approach might lead to fewer hard feelings, as
> > > everyone is in the same boat. Perhaps all operators are also suboptimal and
> > > could be improved with a rewrite? But I think coordinating a lot of
> > > different entities around an empty codebase is particularly challenging. I
> > > actually think it could be better for cohesion and collaboration to have a
> > > suboptimal but substantive starting point.
> > >
> > > On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> > > instaclustr.com<ma...@instaclustr.com>> wrote:
> > >
> > > I think that from Instaclustr it was stated quite clearly multiple
> > > times that we are "fine to throw it away" if there is something better
> > > and more widespread. Indeed, we have invested a lot of time in the
> > > operator, but it was not useless at all; we gained a lot of quite unique
> > > knowledge about how to put all the pieces together. However, I think that
> > > this space is going to be quite fragmented and "balkanized", which is
> > > not always a bad thing, but in an area as narrow as a Kubernetes operator
> > > I just do not see how 4 operators are going to be beneficial for
> > > ordinary people (the "official" one from the community, ours, the DataStax
> > > one and CassKop, in no particular order). Sure, innovation and healthy
> > > competition are important, but to what extent ...
> > > One can start a Cassandra cluster on Kubernetes in only so many different
> > > ways, and nobody really likes vendor lock-in. People wanting
> > > to run a cluster on K8S realise that there are three operators, each
> > > backed by a private business entity, and the community operator is not
> > > there ... Huh, interesting ... One may even start to question what is
> > > wrong with these folks that it takes three companies to build their
> > > own solution.
> > >
> > > Having said that, to my perception, the Cassandra community just does not
> > > have enough engineers or contributors to keep 4 operators alive at
> > > the same time (I wish I were wrong), so the idea of selecting the best
> > > one or merging the obvious things and approaches together is
> > > understandable, even if it means we eventually sunset ours. In addition,
> > > nobody from the big players is going to contribute to the code base of
> > > another one, for obvious reasons, so channeling and directing this effort
> > > into something common for the community seems to be the only reasonable
> > > way of cooperating.
> > >
> > > It is quite hard to bootstrap this if donating the code in big
> > > chunks / as a whole repo is out of the question because it is not the
> > > "Apache way" (there was a thread here about this in more depth a while
> > > ago) and we basically need to start from scratch, which is quite
> > > demotivating: we are just reinventing the wheel and nobody is up for it.
> > > It is like people are waiting for that to happen so they can jump in
> > > "once it is the thing", but it will never materialise, or at least the
> > > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > > in this heavily if there is already a working operator from the companies
> > > mentioned above. As I understood it, one reason for not choosing the
> > > way of donating it all is that "the learning and community building
> > > should happen in an organic manner and we just cannot accept the
> > > donation", but isn't it true that it is easier to build a community
> > > around something which is already there than to build it
> > > around an idea which is quite hard to commit to?
> > >
> > > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie < jmckenzie@apache.org
> > > <ma...@apache.org>> wrote:
> > >
> > > I think there's significant value to the community in trying to coalesce
> > > on a single approach,
> > >
> > > I agree. Unfortunately in this case, the parties with a vested interest and
> > > written operators came to the table and couldn't agree to coalesce on a
> > > single approach. John Sanda attempted to start an initiative to write a
> > > best-of-breed combining choice parts of each operator, but that effort did
> > > not gain traction.
> > >
> > > Which is where my hypothesis comes from that if there were a clear "better
> > > fit" operator to start from we wouldn't be in a deadlock; the correct
> > > choice would be obvious. Reasonably so, every engineer that's written
> > > something is going to want that something to be used and not thrown away in
> > > favor of another something without strong evidence as to why that's the
> > > better choice.
> > >
> > > As far as I know, nobody has made a clear case as to a more compelling
> > > place to start in terms of an operator donation the project then
> > > collaborates on. There's no mass adoption evidence nor feature enumeration
> > > that I know of for any of the approaches anyone's taken, so the discussions
> > > remain stalled.
> > >
> > > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > > <benedict@apache.org<ma...@apache.org> wrote:
> > >
> > > I think there's significant value to the community in trying to coalesce
> > > on a single approach, earlier than later. This is an opportunity to expand
> > > the number of active organisations involved directly in the Apache
> > > Cassandra project, as well as to more quickly expand the project's
> > > functionality into an area we consider urgent and important. I think it
> > > would be a real shame to waste this opportunity. No doubt it will be hard,
> > > as organisations have certain built-in investments in their own approaches.
> > >
> > > I haven't participated in these calls as I do not consider myself to have
> > > the relevant experience and expertise, and have other focuses on the
> > > project. I just wanted to voice a vote in favour of trying to bring the
> > > different organisations together on a single approach if possible. Is there
> > > anything the project can do to help this happen?
> > >
> > > On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:ben@
> > > instaclustr.com>> wrote:
> > >
> > > I think there is certainly an appetite to donate and standardise on a
> > > given operator (as mentioned in this thread).
> > >
> > > I personally found the SIG hard to participate in due to time zones and
> > > the synchronous nature of it.
> > >
> > > So while it was a great forum to dive into certain details for a subset of
> > > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> > > reflection of community intent.
> > >
> > > I don't think that any participants want to continue down the path of "let
> > > a thousand flowers bloom". That's why we are looking towards CassKop (as
> > > well as for a number of technical reasons).
> > >
> > > Some of the recorded meetings and outputs can also be found if you are
> > > interested in some primary sources:
> > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> > >
> > > From what I understand second-hand from talking to people on the SIG
> > > calls, there was a general inability to agree on an existing operator as a
> > > starting point and not much engagement on taking best of breed from the
> > > various to combine them. Seems to leave us in the "let a thousand flowers
> > > bloom" stage of letting operators grow in the ecosystem and seeing which
> > > ones meet the needs of end users before talking about adopting one into the
> > > foundation.
> > >
> > > Great to hear that you folks are joining forces though! Bodes well for C*
> > > users that are wanting to run things on k8s.
> > >
> > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com
> > > <ma...@instaclustr.com> wrote:
> > >
> > > For what it's worth, a quick update from me:
> > >
> > > CassKop now has at least two organisations working on it substantially
> > > (Orange and Instaclustr) as well as the numerous other contributors.
> > >
> > > Internally we will also start pointing others towards CassKop once a few
> > > things get merged. While we are not sunsetting our operator yet, it is
> > > certainly looking that way.
> > >
> > > I'd love to see the community adopt it as a starting point for working
> > > towards whatever level of functionality is desired.
> > >
> > > Cheers
> > >
> > > Ben
> > >
> > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> > > john.sanda@gmail.com>
> > > wrote:
> > >
> > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie < jmckenzie@apache.org>
> > > wrote:
> > >
> > > There's basically 1 Java driver in the C* ecosystem. We have 3? 4? or more
> > > operators in the ecosystem. Has one of them hit a clear supermajority of
> > > adoption that makes it the de facto default and makes sense to pull it into
> > > the project?
> > >
> > > We as a project community were pretty slow to move on building a PoV around
> > > Kubernetes so we find ourselves in a situation with a bunch of contenders
> > > for inclusion in the project. It's not clear to me what heuristics we'd use
> > > to gauge which one would be the best fit for inclusion outside letting
> > > community adoption speak.
> > >
> > > ---
> > > Josh McKenzie
> > >
> > > We actually talked a good bit on the SIG call earlier today about
> > > heuristics. We need to document what functionality an operator should
> > > include at level 0, level 1, etc. We did discuss this a good bit during
> > > some of the initial SIG meetings, but I guess it wasn't really a focal
> > > point at the time. I think we should also provide references to existing
> > > operator projects and possibly other related projects. This would benefit
> > > both community users as well as people working on these projects.
> > >
> > > - John
> > >
> > > --
> > >
> > > Ben Bromhead
> > >
> > > Instaclustr | www.instaclustr.com | @instaclustr
> > > <http://twitter.com/instaclustr> | (650) 284 9692
> > >
> > > --
> > >
> > > Ben Bromhead
> > >
> > > Instaclustr | www.instaclustr.com | @instaclustr
> > > <http://twitter.com/instaclustr> | (650) 284 9692
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> > > For additional commands, e-mail: dev-help@cassandra.apache.org
> > >
> > > --
> > >
> > > - John
> > >
> > >
> _________________________________________________________________________________________________________________________
> > >
> > >
> > > This message and its attachments may contain confidential or privileged
> > > information that may be protected by law; they should not be distributed,
> > > used or copied without authorisation. If you have received this email in
> > > error, please notify the sender and delete this message and its
> > > attachments. As emails may be altered, Orange is not liable for messages
> > > that have been modified, changed or falsified. Thank you.
> > >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Stefan Miklosovic <st...@instaclustr.com>.

Hi,

Patrick's suggestion seems good to me.

I won't go into specifics here, as I need to genuinely prepare for
this. It is quite hard to dig deep into the solutions of others and
offer constructive criticism, because it takes a lot of time to
study them and everybody has their "whys" behind them.

To summarize my goals and concerns:

1) We should be as "Kubernetes operator idiomatic" as possible:
industry standards, not the custom brainchild of this or that group,
built that way because they thought it was cool or did not know any
better. I do NOT say it is like that right now; I just want to be
as ruthless as possible when it comes to functionality and
why it is done the way it is. It is awesome that we already have
something recent (thanks to John) that adheres to the latest releases. I
personally had a hard time keeping up with all the releases: once I
finished something and aligned it, a week or two later there was
already another release where things were different. It is a very
fast-moving space, and I hope that by the time we develop something it
will not be obsolete.

2) It may be easier said than done, and it is guaranteed that people
will get emotional (it's their precious, etc.), so please let's go into
this with good intentions, not trying to push one solution over another
just because we would like to see it there ... I will have an
equally hard time complying with this point. My plan is to explain
what is _wrong_ with our solution: where we made mistakes and what
should have been done differently but is now "too late" to change, etc.
It is quite hard to describe your own work and effort in this light, but
without saying what is wrong we cannot decide what is good, imho.

3) We should put something together fast enough that we can call it a
release. We can always iterate on it for eternity, but the foundations
need to be there. Here I want to say that I especially like what John
did. I looked through these specs and it was obvious that they had been
written with care and attention. It looked _solid_. I am not sure how
hard it is to put all the other things on top of that, I truly am not,
and here I think we would have to reinvent the wheel if we want to
proceed, because I cannot imagine what it would take to retrofit e.g.
CassKop on top of John's specs. It is like putting round pegs into
square holes; maybe some chunks could be reused easily, but
otherwise I worry we would be back at square one.

One specific feeling I have as I read this is that even if there is
the will to create a fourth operator, the respective parties will
not be able to drop their own repositories. The whole point behind this
effort, to me, is to have a solid, community-driven, stable, modern
and feature-complete operator that people are truly using. I can see
that once this is real, we will _really_ sunset our operator, redirecting
people to the new operator in the main README, etc.; we truly mean it.
Sure, if somebody comes along and a bug fix is needed, we will fix it,
but the whole point of doing this is to stop using what we currently
have, over time; otherwise we are just splitting this space even
more. If the CassKop team is not sure whether they will use it because
they do not know if that operator will be "enough" for them, aren't we
just doing it wrong? To exaggerate: they should be fine with deleting
their whole repository and using just this Cassandra one we are going to
make; otherwise I don't see the point of working on this ...

On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jm...@apache.org> wrote:
>
> - choose cass-operator: it is not on offer right now so let’s see if it does
>
>
> We should all talk a lot more, but this is 100% a mistake - I take the
> blame for that. The intention has long been to offer cass-operator for
> donation but it slipped through the cracks and your email yesterday made me
> double-take.
>
> We have since resolved this misalignment. DataStax would be happy to donate
> any and all of cass-operator to the ASF and C* project if it's what we all
> agree best serves our collective Cassandra users. I'm also cognizant that
> an immense amount of effort has gone into CassKop and we seem to have
> something of an embarrassment of riches.
>
> I'm given to understand (haven't dug in personally) that the two operators
> express pretty different opinions when it comes to frameworks, designs,
> supported versions, etc. I think a discrete enumeration of the feature set
> and "identities" of both could really help navigate this conversation going
> forward.
>
> Also - thanks for that context Franck. It's always helpful to know where
> other people are coming from when we're all working together towards a
> common goal.
>
>
> On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:
>
> > I can share Orange’s view of the situation, sorry it is a long story!
> >
> > We started CassKop at the end of 2018 after betting on K8S, which was not
> > so simple as far as C* was concerned: lack of support for local storage,
> > IPs that change all the time, different network plugins to try to implement
> > a non-standard K8s way of having nodes see each other from different dcs…
> > We hesitated over Mesos but could not have both, and K8S was already
> > gaining so much traction that you could not not choose it.
> >
> > Anyway, we looked around and did not see anyone with such requirements, so
> > we said: why not try it ourselves, but on GitHub so that we may give it back
> > to the community. We have used C* for quite a few years with great success
> > in production, with massive load and perfect availability. We love C* @
> > Orange :) Thanks!
> >
> > So we started writing support for mono-DC clusters (CassKop) and added
> > multi-DC support with MultiCassKop, which is another operator included in
> > the CassKop repo. For more details, we tried to document our designs as much
> > as possible here:
> > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> >
> > In the middle of last year we had some talks with DataStax about working
> > together around their new management sidecar. Their position on open source
> > was not clear at that time, so we asked them to come back once they had
> > decided to go open source with it, which they did at the beginning of this
> > year. But by that time I guess work had started on cass-operator, so we kept
> > our separate ways.
> >
> > Since the beginning of the year, we have been working with our OPS team
> > to get it into production. It is not simple, as the team has to learn K8S and
> > trust a newborn operator. This takes time, especially as our internal
> > cluster has been tweaked for multi-tenancy with obscure options being set
> > by our K8s team…
> >
> > We also developed the Backup & Restore functionality with Instaclustr (we
> > have new CRDs (Custom Resource Definitions) for backup and restore and a
> > reconcile loop that calls out to the Instaclustr sidecar for these
> > operations). We now support multiple backups in parallel and can write to
> > S3, Google or Azure (but Stefan could give more details here if needed).
> >
> > During the SIG calls we mentioned our desire to donate CassKop once it
> > satisfies our basic requirements (v1 is coming just now, but I have said
> > that too many times already). I am actually not sure DataStax mentioned
> > their desire to donate cass-operator, but we decided to compare the designs
> > and the functionalities based on the respective CRDs. The CRD is the
> > interface with the user, as it is where you describe the cluster that you
> > want to have. These talks were very interesting and we found out that the
> > CassKop team had made good choices most of the time but was maybe too open.
> > Indeed, our intention was to give our OPS team all the possibilities they
> > needed to work. This includes:
> > - a very open topology definition using any configuration of labels to map
> > dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> > / rows and server racks so we can map C* racks to storage or network arrays
> > internally)
> > - the possibility to have multiple C* nodes on a single K8S host (because
> > internal clouds are not really clouds, they have limited resources)
> > - custom C* image selection,
> > - a custom bootstrap script that lets you configure C* as you want using
> > ConfigMaps,
> > - the ability to mount different volumes wherever they want,
> > - the possibility to run any number of sidecars alongside C* (for custom
> > probes in our case)
> >
> > This makes CassKop quite powerful and flexible.
> > We made sure that none of those options is enabled by default, so one can
> > just pop up a simple 3-node cluster quickly.
> >
> > On the other hand cass-operator had an interesting way of configuring C*
> > just inside the CRD using cass-config. This is simple and elegant so we are
> > implementing it as well for the support of C* 4
> >
> > Now for the future, there are 3 choices in my opinion:
> > - start from scratch (or John’s repo) by cherry-picking bits from all
> > operators. This is possible but will take some time / effort to have
> > something usable. And then it will be compared to cass-operator and
> > CassKop. I don’t see Orange contributing too much here as we believe
> > CassKop to be a much better starting point.
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does. I think Orange could contribute some bits inherited from CassKop if
> > the community agrees. Not sure it would be enough for us to use it.
> > - choose CassKop: we would be delighted to donate it and contribute
> > some committers (including the original author, who now works for AWS). It
> > would then become the community operator, but there would probably be
> > cass-operator alongside. Cass-operator is made to make it easier for
> > DataStax to manage customer clusters by imposing some configuration. It
> > makes sense for their needs, so maybe 2 operators. We don’t know how
> > backup/restore will be handled here with Medusa being adapted to K8s.
> >
> > Sorry again for being long but 2 years of work deserve some lines of text
> > :)
> >
> > I just saw your message Patrick but this was written already so we gain a
> > week.
> >
> > Franck
> >
> > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> > <ma...@datastax.com>> wrote:
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > +1
> >
> > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.
> > org<ma...@apache.org>> wrote:
> >
> > Perhaps it helps to widen the field of discussion to the dev list?
> >
> > It might help if each of the stakeholder organisations state their view on
> > the situation, including why they would or would not support a given
> > approach/operator, and what (preferably specific) circumstances might lead
> > them to change their mind?
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com<mailto:john.
> > sanda@gmail.com>> wrote:
> >
> > I want to point out that pretty much everything being discussed in this
> > thread has been discussed at length during the SIG meetings. I think it is
> > worth noting because we are pretty much still having the same conversation.
> >
> > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith < benedict@apache.
> > org<ma...@apache.org>> wrote:
> >
> > I don't think there's anything about a code drop that's not "The Apache
> > Way".
> >
> > If there's a consensus (or even strong majority) amongst invested parties,
> > I don't see why we could not adopt an operator directly into the project.
> >
> > It's possible a green field approach might lead to fewer hard feelings, as
> > everyone is in the same boat. Perhaps all operators are also suboptimal and
> > could be improved with a rewrite? But I think coordinating a lot of
> > different entities around an empty codebase is particularly challenging. I
> > actually think it could be better for cohesion and collaboration to have a
> > suboptimal but substantive starting point.
> >
> > On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> > instaclustr.com<ma...@instaclustr.com>> wrote:
> >
> > I think that from Instaclustr it was stated quite clearly multiple
> > times that we are "fine to throw it away" if there is something better
> > and more widespread. Indeed, we have invested a lot of time in the
> > operator, but it was not useless at all; we gained a lot of quite unique
> > knowledge about how to put all the pieces together. However, I think that
> > this space is going to be quite fragmented and "balkanized", which is
> > not always a bad thing, but in an area as narrow as a Kubernetes operator
> > I just do not see how 4 operators are going to be beneficial for
> > ordinary people (the "official" one from the community, ours, the DataStax
> > one and CassKop, in no particular order). Sure, innovation and healthy
> > competition are important, but to what extent ...
> > One can start a Cassandra cluster on Kubernetes in only so many different
> > ways, and nobody really likes vendor lock-in. People wanting
> > to run a cluster on K8S realise that there are three operators, each
> > backed by a private business entity, and the community operator is not
> > there ... Huh, interesting ... One may even start to question what is
> > wrong with these folks that it takes three companies to build their
> > own solution.
> >
> > Having said that, to my perception, the Cassandra community just does not
> > have enough engineers or contributors to keep 4 operators alive at
> > the same time (I wish I were wrong), so the idea of selecting the best
> > one or merging the obvious things and approaches together is understandable,
> > even if it means we eventually sunset ours. In addition, nobody from the big
> > players is going to contribute to the code base of another one, for
> > obvious reasons, so channeling and directing this effort into something
> > common for the community seems to be the only reasonable way of cooperating.
> >
> > It is quite hard to bootstrap this if the donation of the code in big
> > chunks / whole repo is out of question as it is not the "Apache way"
> > (there was some thread running here about this in more depth a while
> > ago) and we basically need to start from scratch which is quite
> > demotivating; we are just reinventing the wheel and nobody is up to it.
> > It is like people are waiting for that to happen so they can jump in
> > "once it is the thing" but it will never materialise or at least the
> > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > in this heavily if there is already a working operator from companies
> > mentioned above. As I understood it, one reason of not choosing the
> > way of donating it all is that "the learning and community building
> > should happen in organic manner and we just can not accept the donation",
> > but is it not true that it is easier to build a community
> > around something which is already there rather than trying to build it
> > around an idea which is quite hard to dedicate to?
> >
> > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie < jmckenzie@apache.org
> > <ma...@apache.org>> wrote:
> >
> > I think there's significant value to the community in trying to
> > coalesce
> > on a single approach,
> > I agree. Unfortunately in this case, the parties with a vested interest
> > and
> > written operators came to the table and couldn't agree to coalesce
> > on a
> > single approach. John Sanda attempted to start an initiative to write a
> > best-of-breed combining choice parts of each operator, but that effort did
> > not gain traction.
> >
> > Which is where my hypothesis comes from that if there were a clear
> > "better
> > fit" operator to start from we wouldn't be in a deadlock; the correct
> > choice would be obvious. Reasonably so, every engineer that's written
> > something is going to want that something to be used and not thrown
> > away in
> > favor of another something without strong evidence as to why that's
> > the
> > better choice.
> >
> > As far as I know, nobody has made a clear case as to a more compelling
> > place to start in terms of an operator donation the project then
> > collaborates on. There's no mass adoption evidence nor feature enumeration
> > that I know of for any of the approaches anyone's taken, so the
> > discussions
> > remain stalled.
> >
> > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith < benedict@apache.
> > org<ma...@apache.org> wrote:
> >
> > I think there's significant value to the community in trying to
> > coalesce
> > on a single approach, earlier than later. This is an opportunity
> > to expand
> > the number of active organisations involved directly in the Apache
> > Cassandra project, as well as to more quickly expand the project's
> > functionality into an area we consider urgent and important. I
> > think it
> > would be a real shame to waste this opportunity. No doubt it will
> > be hard,
> > as organisations have certain built-in investments in their own
> > approaches.
> >
> > I haven't participated in these calls as I do not consider myself
> > to have
> > the relevant experience and expertise, and have other focuses on
> > the
> > project. I just wanted to voice a vote in favour of trying to bring the
> > different organisations together on a single approach if possible.
> > Is there
> > anything the project can do to help this happen?
> >
> > On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:ben@
> > instaclustr.com>> wrote:
> >
> > I think there is certainly an appetite to donate and standardise
> > on a
> > given operator (as mentioned in this thread).
> >
> > I personally found the SIG hard to participate in due to time zones and
> > the synchronous nature of it.
> >
> > So while it was a great forum to dive into certain details for a
> > subset of
> > participants and a worthwhile endeavour, I wouldn't paint it as an
> > accurate
> > reflection of community intent.
> >
> > I don't think that any participants want to continue down the path
> > of "let
> > a thousand flowers bloom". That's why we are looking towards CassKop (as
> > well as a number of technical reasons).
> >
> > Some of the recorded meetings and outputs can also be found if you
> > are
> > interested in some primary sources
> > https://cwiki.apache.org/confluence/display/CASSANDRA/
> > Cassandra+Kubernetes+Operator+SIG
> > .
> >
> > From what I understand second-hand from talking to people on the
> > SIG
> > calls,
> >
> > there was a general inability to agree on an existing operator as a
> > starting point and not much engagement on taking best of breed
> > from the
> > various to combine them. Seems to leave us in the "let a thousand
> > flowers
> > bloom" stage of letting operators grow in the ecosystem and seeing
> > which
> > ones meet the needs of end users before talking about adopting one
> > into the
> > foundation.
> >
> > Great to hear that you folks are joining forces though! Bodes well
> > for C*
> > users that are wanting to run things on k8s.
> >
> > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead < ben@instaclustr.com
> > <ma...@instaclustr.com>
> >
> > wrote:
> >
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for
> > working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> > john.sanda@gmail.com>
> > wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie < jmckenzie@apache.org>
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4?
> > or
> >
> > more
> >
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to
> > pull it
> >
> > into
> >
> > the project?
> >
> > We as a project community were pretty slow to move on building a
> > PoV
> >
> > around
> >
> > kubernetes so we find ourselves in a situation with a bunch of
> > contenders
> > for inclusion in the project. It's not clear to me what heuristics
> > we'd
> >
> > use
> >
> > to gauge which one would be the best fit for inclusion outside
> > letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator
> > should
> > include at level 0, level 1, etc. We did discuss this a good bit
> > during
> > some of the initial SIG meetings, but I guess it wasn't really a
> > focal
> > point at the time. I think we should also provide references to
> > existing
> > operator projects and possibly other related projects. This would
> > benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > --
> >
> > - John
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

> - choose cass-operator: it is not on offer right now so let’s see if it does


We should all talk a lot more, but this is 100% a mistake - I take the
blame for that. The intention has long been to offer cass-operator for
donation but it slipped through the cracks and your email yesterday made me
double-take.

We have since resolved this misalignment. DataStax would be happy to donate
any and all of cass-operator to the ASF and C* project if it's what we all
agree best serves our collective Cassandra users. I'm also cognizant that
an immense amount of effort has gone into CassKop and we seem to have
something of an embarrassment of riches.

I'm given to understand (haven't dug in personally) that the two operators
express pretty different opinions when it comes to frameworks, designs,
supported versions, etc. I think a discrete enumeration of the feature set
and "identities" of both could really help navigate this conversation going
forward.

Also - thanks for that context Franck. It's always helpful to know where
other people are coming from when we're all working together towards a
common goal.


On Thu, Sep 24, 2020 at 12:23 PM, <fr...@orange.com> wrote:

> I can share Orange’s view of the situation, sorry it is a long story!
>
> We started CassKop at the end of 2018 after betting on K8S which was not
> so simple as far as C* was concerned. Lack of support for local storage,
> IPs that change all the time, different network plugins to try to implement
> a non standard K8s way of having nodes see each other from different dcs…
> We hesitated with Mesos but could not have both and K8S was already
> gaining so much traction that you could not not choose it.
>
> Anyway, we looked around and did not see anyone with such requirements so
> we said: why not try it ourselves but on github so that we may give it back
> to the community. We have used C* for quite a few years with great success
> on production with massive load and perfect availability. We love C* @
> Orange :) Thanks!
>
> So we started writing support for mono-dc cluster (CassKop) and added the
> multi dc support with MultiCassKop which is another operator included in
> the CassKop repo. For more details we tried to document our designs as much
> as possible here: https://orange-opensource.github.io/casskop/docs/
> 1_concepts/3_design_principes#multi-site-management
>
> In the middle of last year we had some talks with Datastax about working
> together around their new management sidecar. Their position on open source
> was not clear at that time so we said please come back when you have
> decided to go open source with it. Which they did in the beginning of this
> year. But at that time I guess work had started on cass-operator so we kept
> our separate ways.
>
> Since the beginning of the year, we have been working with our OPS team
> to have it in production. It is not simple as the team has to learn K8S and
> trust a newborn operator. This takes time especially as our internal
> cluster has been tweaked for multi-tenancy with obscure options being set
> by our K8s team…
>
> We also developed with Instaclustr the Backup & Restore functionality (we
> have new CRDs (Custom Resource Definitions) for backup and restore and a
> reconcile loop that calls out to the Instaclustr sidecar for these
> operations). We now support multiple backups in parallel and can write to
> S3, Google or Azure (but Stefan could give more details here if needed)
>
> During the SIG calls we mentioned our desire to donate CassKop once it
> satisfies our basic requirements (v1 coming just now but I said it too
> many times already) I am actually not sure Datastax mentioned their desire
> to donate cass-operator but we decided to compare the designs and the
> functionalities based on respective CRDs. The CRD is the interface with the
> user as it is where you describe the cluster that you want to have. These
> talks were very interesting and we found out that the CassKop team had made
> good choices most of the time but was maybe too open. Indeed our intention
> was to give all the possibilities for our OPS team to work. This includes :
> - very open topology definition using any configuration of labels to map
> dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> / rows and server racks so we can map C* racks to storage or network arrays
> internally)
> - possibility to have multiple C* nodes on a single K8S host (because
> internal clouds are not really clouds, they have limited resources)
> - custom C* image selection,
> - custom bootstrap script that lets you configure C* as you want using
> ConfigMaps,
> - the ability to mount different volumes wherever they wanted,
> - the possibility to run any number of sidecars alongside C* for custom
> probes in our case
>
> This makes CassKop quite powerful and flexible.
> We made sure that all those options are not enabled by default so one can
> just pop a simple 3 node cluster quickly
>
> On the other hand cass-operator had an interesting way of configuring C*
> just inside the CRD using cass-config. This is simple and elegant so we are
> implementing it as well for the support of C* 4
>
> Now for the future, there are 3 choices in my opinion:
> - start from scratch (or John’s repo) by cherry picking bits from all
> operators. This is possible but will take some time / effort to have
> something usable. And then it will be compared to cass-operator and
> CassKop. I don’t see Orange contributing too much here as we believe
> CassKop to be a much better starting point
> - choose cass-operator: it is not on offer right now so let’s see if it
> does. I think Orange could contribute some bits inherited from CassKop if
> it is agreed by the community. Not sure it would be enough for us to use
> it.
> - choose CassKop: we would be delighted to donate it and contribute with
> some committers (including the original author who now works for AWS). It
> would then become the community operator but there would be cass-operator
> alongside probably. But Cass-operator is made to make it easier for
> Datastax to manage customer clusters by imposing some configuration. It
> makes sense for their needs, so maybe 2 operators. We don’t know how
> backup/restore will be handled here with medusa being adapted to K8s
>
> Sorry again for being long but 2 years of work deserve some lines of text
> :)
>
> I just saw your message Patrick but this was written already so we gain a
> week.
>
> Franck
>
> On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.lerer@datastax.com
> <ma...@datastax.com>> wrote:
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> +1
>
> On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.
> org<ma...@apache.org>> wrote:
>
> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus? It doesn't
> seem like it can hurt at this point, anyway.
>
> On 23/09/2020, 17:13, "John Sanda" <john.sanda@gmail.com<mailto:john.
> sanda@gmail.com>> wrote:
>
> I want to point out that pretty much everything being discussed in this
> thread has been discussed at length during the SIG meetings. I think it is
> worth noting because we are pretty much still having the same conversation.
>
> On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith < benedict@apache.
> org<ma...@apache.org>> wrote:
>
> I don't think there's anything about a code drop that's not "The Apache
> Way"
>
> If there's a consensus (or even strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal
> and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> instaclustr.com<ma...@instaclustr.com>> wrote:
>
> I think that from Instaclustr it was stated quite clearly multiple
> times that we are "fine to throw it away" if there is something better
> and more wide-spread. Indeed, we have invested a lot of time in the
> operator but it was not useless at all, we gained a lot of quite unique
> knowledge how to put all pieces together. However, I think that
> this space is going to be quite fragmented and "balkanized", which is
> not always a bad thing, but in a quite narrow area as Kubernetes operator
> is, I just do not see how 4 operators are going to be beneficial for
> ordinary people ("official" from community, ours, Datastax one and CassKop
> (without any significant order)). Sure, innovation and healthy competition
> is important but to what extent ...
> One can start a Cassandra cluster on Kubernetes just so many times
> differently and nobody really likes a vendor lock-in. People wanting
> to run a cluster on K8S realise that there are three operators, each
> backed by a private business entity, and the community operator is not
> there ... Huh, interesting ... One may even start to question what is
> wrong with these folks that it takes three companies to build their
> own solution.
>
> Having said that, to my perception, Cassandra community just does not
> have enough engineers nor contributors to keep 4 operators alive at
> the same time (I wish I was wrong) so the idea of selecting the best
> one or to merge obvious things and approaches together is understandable,
> even if it meant we eventually sunset ours. In addition, nobody from big
> players is going to contribute to the code
> base of the other one, for obvious reasons, so channeling and directing
> this effort into something common for a community seems to
> be the only reasonable way of cooperation.
>
> It is quite hard to bootstrap this if the donation of the code in big
> chunks / whole repo is out of question as it is not the "Apache way"
> (there was some thread running here about this in more depth a while
> ago) and we basically need to start from scratch which is quite
> demotivating; we are just reinventing the wheel and nobody is up to it.
> It is like people are waiting for that to happen so they can jump in
> "once it is the thing" but it will never materialise or at least the
> hurdle to kick it off is unnecessarily high. Nobody is going to invest
> in this heavily if there is already a working operator from companies
> mentioned above. As I understood it, one reason of not choosing the
> way of donating it all is that "the learning and community building
> should happen in organic manner and we just can not accept the donation",
> but is it not true that it is easier to build a community
> around something which is already there rather than trying to build it
> around an idea which is quite hard to dedicate to?
>
> On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie < jmckenzie@apache.org
> <ma...@apache.org>> wrote:
>
> I think there's significant value to the community in trying to
> coalesce
> on a single approach,
> I agree. Unfortunately in this case, the parties with a vested interest
> and
> written operators came to the table and couldn't agree to coalesce
> on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear
> "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown
> away in
> favor of another something without strong evidence as to why that's
> the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the
> discussions
> remain stalled.
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith < benedict@apache.
> org<ma...@apache.org> wrote:
>
> I think there's significant value to the community in trying to
> coalesce
> on a single approach, earlier than later. This is an opportunity
> to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I
> think it
> would be a real shame to waste this opportunity. No doubt it will
> be hard,
> as organisations have certain built-in investments in their own
> approaches.
>
> I haven't participated in these calls as I do not consider myself
> to have
> the relevant experience and expertise, and have other focuses on
> the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible.
> Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <ben@instaclustr.com<mailto:ben@
> instaclustr.com>> wrote:
>
> I think there is certainly an appetite to donate and standardise
> on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a
> subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an
> accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path
> of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found if you
> are
> interested in some primary sources
> https://cwiki.apache.org/confluence/display/CASSANDRA/
> Cassandra+Kubernetes+Operator+SIG
> .
>
> From what I understand second-hand from talking to people on the
> SIG
> calls,
>
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed
> from the
> various to combine them. Seems to leave us in the "let a thousand
> flowers
> bloom" stage of letting operators grow in the ecosystem and seeing
> which
> ones meet the needs of end users before talking about adopting one
> into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well
> for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead < ben@instaclustr.com
> <ma...@instaclustr.com>
>
> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for
> working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> john.sanda@gmail.com>
> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie < jmckenzie@apache.org>
> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4?
> or
>
> more
>
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to
> pull it
>
> into
>
> the project?
>
> We as a project community were pretty slow to move on building a
> PoV
>
> around
>
> kubernetes so we find ourselves in a situation with a bunch of
> contenders
> for inclusion in the project. It's not clear to me what heuristics
> we'd
>
> use
>
> to gauge which one would be the best fit for inclusion outside
> letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator
> should
> include at level 0, level 1, etc. We did discuss this a good bit
> during
> some of the initial SIG meetings, but I guess it wasn't really a
> focal
> point at the time. I think we should also provide references to
> existing
> operator projects and possibly other related projects. This would
> benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> --
>
> - John
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

I can share Orange’s view of the situation, sorry it is a long story!

We started CassKop at the end of 2018 after betting on K8S which was not so simple as far as C* was concerned.
Lack of support for local storage, IPs that change all the time, different network plugins to try to implement a non standard K8s way of having nodes see each other from different dcs…
We hesitated with Mesos but could not have both and K8S was already gaining so much traction that you could not not choose it.

Anyway, we looked around and did not see anyone with such requirements so we said: why not try it ourselves but on github so that we may give it back to the community.
We have used C* for quite a few years with great success on production with massive load and perfect availability. We love C* @ Orange :) Thanks!

So we started writing support for mono-dc cluster (CassKop) and added the multi dc support with MultiCassKop which is another operator included in the CassKop repo.
For more details we tried to document our designs as much as possible here: https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management

In the middle of last year we had some talks with Datastax about working together around their new management sidecar. Their position on open source was not clear at that time so we said please come back when you have decided to go open source with it.
Which they did in the beginning of this year. But at that time I guess work had started on cass-operator so we kept our separate ways.

Since the beginning of the year, we have been working with our OPS team to have it in production. It is not simple as the team has to learn K8S and trust a newborn operator.
This takes time especially as our internal cluster has been tweaked for multi-tenancy with obscure options being set by our K8s team…

We also developed with Instaclustr the Backup & Restore functionality (we have new CRDs (Custom Resource Definitions) for backup and restore and a reconcile loop that calls out to the Instaclustr sidecar for these operations).
We now support multiple backups in parallel and can write to S3, Google or Azure (but Stefan could give more details here if needed)
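
To make the reconcile-loop idea concrete, here is a minimal Python sketch. It is not CassKop code: BackupRequest, its fields and the trigger_backup callback are hypothetical stand-ins for a backup custom resource and for the call to a backup sidecar. It only illustrates the general pattern of repeatedly comparing declared requests with their status and acting on the ones still pending.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BackupRequest:
    """Hypothetical stand-in for a backup custom resource."""
    name: str
    cluster: str
    destination: str          # e.g. "s3://bucket/path"; the format is an assumption
    status: str = "Pending"   # Pending -> Running or Failed


def reconcile(requests: List[BackupRequest],
              trigger_backup: Callable[[BackupRequest], bool]) -> List[str]:
    """Run one reconcile pass and return the names of the requests acted on.

    trigger_backup stands in for the call to a backup sidecar and returns
    True when the job was accepted.
    """
    handled = []
    for request in requests:
        if request.status != "Pending":
            continue  # already handled in an earlier pass
        request.status = "Running" if trigger_backup(request) else "Failed"
        handled.append(request.name)
    return handled
```

Because a pass only touches requests that are still pending, running it again on the same list does nothing, which is the property that makes such a loop safe to call repeatedly.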

During the SIG calls we mentioned our desire to donate CassKop once it satisfies our basic requirements (v1 coming just now but I said it too many times already)
I am actually not sure Datastax mentioned their desire to donate cass-operator but we decided to compare the designs and the functionalities based on respective CRDs.
The CRD is the interface with the user as it is where you describe the cluster that you want to have.
These talks were very interesting and we found out that the CassKop team had made good choices most of the time but was maybe too open.
Indeed our intention was to give all the possibilities for our OPS team to work.
This includes :
- very open topology definition using any configuration of labels to map dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms / rows and server racks so we can map C* racks to storage or network arrays internally)
- possibility to have multiple C* nodes on a single K8S host (because internal clouds are not really clouds, they have limited resources)
- custom C* image selection,
- custom bootstrap script that lets you configure C* as you want using ConfigMaps,
- the ability to mount different volumes wherever they wanted,
- the possibility to run any number of sidecars alongside C* for custom probes in our case
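
As a rough illustration of the label-based topology idea in the first bullet, the sketch below matches Kubernetes-style node labels against per-dc and per-rack label selectors. The label keys (`site`, `room`), the node names and the shape of the `topology` structure are assumptions made up for this example; they are not the CassKop CRD schema.

```python
def matches(node_labels, selector):
    """True if every key/value pair in the selector is present on the node."""
    return all(node_labels.get(k) == v for k, v in selector.items())


def nodes_per_rack(topology, nodes):
    """Map each (dc, rack) pair to the sorted names of nodes whose labels match."""
    result = {}
    for dc in topology:
        for rack in dc["racks"]:
            # A rack inherits the dc-level labels and adds its own.
            selector = {**dc.get("labels", {}), **rack.get("labels", {})}
            result[(dc["name"], rack["name"])] = sorted(
                name for name, labels in nodes.items() if matches(labels, selector)
            )
    return result


# Hypothetical label keys and values, chosen only for illustration.
topology = [
    {
        "name": "dc1",
        "labels": {"site": "paris"},
        "racks": [
            {"name": "rack1", "labels": {"room": "a"}},
            {"name": "rack2", "labels": {"room": "b"}},
        ],
    }
]
nodes = {
    "node-1": {"site": "paris", "room": "a"},
    "node-2": {"site": "paris", "room": "b"},
    "node-3": {"site": "lyon", "room": "a"},
}
```

With these inputs, `nodes_per_rack(topology, nodes)` places `node-1` in `rack1` and `node-2` in `rack2`, and leaves `node-3` out because its `site` label does not match.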

This makes CassKop quite powerful and flexible.
We made sure that all those options are not enabled by default so one can just pop a simple 3 node cluster quickly

On the other hand cass-operator had an interesting way of configuring C* just inside the CRD using cass-config. This is simple and elegant so we are implementing it as well for the support of C* 4

Now for the future, there are 3 choices in my opinion:
- start from scratch (or John’s repo) by cherry picking bits from all operators. This is possible but will take some time / effort to have something usable. And then it will be compared to cass-operator and CassKop.
I don’t see Orange contributing too much here as we believe CassKop to be a much better starting point
- choose cass-operator: it is not on offer right now, so let’s see if that changes. I think Orange could contribute some bits inherited from CassKop if it is agreed by the community. Not sure it would be enough for us to use it.
- choose CassKop: we would be delighted to donate it and contribute with some committers (including the original author who now works for AWS). It would then become the community operator but there would be cass-operator alongside probably.
But cass-operator is made to make it easier for Datastax to manage customer clusters by imposing some configuration. It makes sense for their needs, so maybe 2 operators. We don’t know how backup/restore will be handled here with medusa being adapted to K8s.

Sorry again for being long but 2 years of work deserve some lines of text :)

I just saw your message Patrick but this was written already so we gain a week.

Franck

On 24 Sep 2020, at 10:08, Benjamin Lerer <be...@datastax.com> wrote:

I realise there are meeting logs, but getting a wider discourse with
non-stakeholder input might help to build a community consensus? It
doesn't seem like it can hurt at this point, anyway.

On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <be...@apache.org> wrote:

Perhaps it helps to widen the field of discussion to the dev list?

It might help if each of the stakeholder organisations state their view on
the situation, including why they would or would not support a given
approach/operator, and what (preferably specific) circumstances might lead
them to change their mind?

I realise there are meeting logs, but getting a wider discourse with
non-stakeholder input might help to build a community consensus? It
doesn't seem like it can hurt at this point, anyway.

On 23/09/2020, 17:13, "John Sanda" <jo...@gmail.com> wrote:

I want to point out that pretty much everything being discussed in this
thread has been discussed at length during the SIG meetings. I think it is
worth noting because we are pretty much still having the same conversation.

On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <benedict@apache.org> wrote:

I don't think there's anything about a code drop that's not "The Apache Way"

If there's a consensus (or even strong majority) amongst invested parties,
I don't see why we could not adopt an operator directly into the project.

It's possible a green field approach might lead to fewer hard feelings, as
everyone is in the same boat. Perhaps all operators are also suboptimal and
could be improved with a rewrite? But I think coordinating a lot of
different entities around an empty codebase is particularly challenging. I
actually think it could be better for cohesion and collaboration to have a
suboptimal but substantive starting point.

On 23/09/2020, 16:11, "Stefan Miklosovic" <stefan.miklosovic@instaclustr.com> wrote:

I think that from Instaclustr it was stated quite clearly multiple
times that we are "fine to throw it away" if there is something better
and more widespread. Indeed, we have invested a lot of time in the
operator but it was not useless at all, we gained a lot of quite
unique knowledge how to put all pieces together. However, I think that
this space is going to be quite fragmented and "balkanized", which is
not always a bad thing, but in a quite narrow area as Kubernetes
operator is, I just do not see how 4 operators are going to be
beneficial for ordinary people ("official" from community, ours,
Datastax one and CassKop (without any significant order)). Sure,
innovation and healthy competition is important but to what extent ...
One can start a Cassandra cluster on Kubernetes just so many times
differently and nobody really likes a vendor lock-in. People wanting
to run a cluster on K8S realise that there are three operators, each
backed by a private business entity, and the community operator is not
there ... Huh, interesting ... One may even start to question what is
wrong with these folks that it takes three companies to build their
own solution.

Having said that, to my perception, Cassandra community just does not
have enough engineers nor contributors to keep 4 operators alive at
the same time (I wish I was wrong) so the idea of selecting the best
one or to merge obvious things and approaches together is
understandable, even if it meant we eventually sunset ours. In
addition, nobody from big players is going to contribute to the code
base of the other one, for obvious reasons, so channeling and
directing this effort into something common for a community seems to
be the only reasonable way of cooperation.

It is quite hard to bootstrap this if the donation of the code in big
chunks / whole repo is out of question as it is not the "Apache way"
(there was some thread running here about this in more depth a while
ago) and we basically need to start from scratch which is quite
demotivating, we are just inventing the wheel and nobody is up to it.
It is like people are waiting for that to happen so they can jump in
"once it is the thing" but it will never materialise or at least the
hurdle to kick it off is unnecessarily high. Nobody is going to invest
in this heavily if there is already a working operator from companies
mentioned above. As I understood it, one reason of not choosing the
way of donating it all is that "the learning and community building
should happen in organic manner and we just can not accept the
donation", but is not it true that it is easier to build a community
around something which is already there rather than trying to build it
around an idea which is quite hard to dedicate to?

On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmckenzie@apache.org> wrote:

I think there's significant value to the community in trying to coalesce
on a single approach,
I agree. Unfortunately in this case, the parties with a vested interest and
written operators came to the table and couldn't agree to coalesce on a
single approach. John Sanda attempted to start an initiative to write a
best-of-breed combining choice parts of each operator, but that effort did
not gain traction.

Which is where my hypothesis comes from that if there were a clear "better
fit" operator to start from we wouldn't be in a deadlock; the correct
choice would be obvious. Reasonably so, every engineer that's written
something is going to want that something to be used and not thrown away in
favor of another something without strong evidence as to why that's the
better choice.

As far as I know, nobody has made a clear case as to a more compelling
place to start in terms of an operator donation the project then
collaborates on. There's no mass adoption evidence nor feature enumeration
that I know of for any of the approaches anyone's taken, so the discussions
remain stalled.

On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org> wrote:

I think there's significant value to the community in trying to coalesce
on a single approach, earlier than later. This is an opportunity to expand
the number of active organisations involved directly in the Apache
Cassandra project, as well as to more quickly expand the project's
functionality into an area we consider urgent and important. I think it
would be a real shame to waste this opportunity. No doubt it will be hard,
as organisations have certain built-in investments in their own approaches.

I haven't participated in these calls as I do not consider myself to have
the relevant experience and expertise, and have other focuses on the
project. I just wanted to voice a vote in favour of trying to bring the
different organisations together on a single approach if possible. Is there
anything the project can do to help this happen?

On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:

I think there is certainly an appetite to donate and standardise on a
given operator (as mentioned in this thread).

I personally found the SIG hard to participate in due to time zones and
the synchronous nature of it.

So while it was a great forum to dive into certain details for a subset of
participants and a worthwhile endeavour, I wouldn't paint it as an accurate
reflection of community intent.

I don't think that any participants want to continue down the path of "let
a thousand flowers bloom". That's why we are looking towards CassKop (as
well as a number of technical reasons).

Some of the recorded meetings and outputs can also be found if you are
interested in some primary sources
https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG

From what I understand second-hand from talking to people on the SIG calls,
there was a general inability to agree on an existing operator as a
starting point and not much engagement on taking best of breed from the
various to combine them. Seems to leave us in the "let a thousand flowers
bloom" stage of letting operators grow in the ecosystem and seeing which
ones meet the needs of end users before talking about adopting one into the
foundation.

Great to hear that you folks are joining forces though! Bodes well for C*
users that are wanting to run things on k8s.

On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com> wrote:

For what it's worth, a quick update from me:

CassKop now has at least two organisations working on it substantially
(Orange and Instaclustr) as well as the numerous other contributors.

Internally we will also start pointing others towards CassKop once a few
things get merged. While we are not sunsetting our operator yet, it is
certainly looking that way.

I'd love to see the community adopt it as a starting point for working
towards whatever level of functionality is desired.

Cheers

Ben

On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sanda@gmail.com> wrote:

On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org> wrote:

There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
operators in the ecosystem. Has one of them hit a clear supermajority of
adoption that makes it the de facto default and makes sense to pull it into
the project?

We as a project community were pretty slow to move on building a PoV around
kubernetes so we find ourselves in a situation with a bunch of contenders
for inclusion in the project. It's not clear to me what heuristics we'd use
to gauge which one would be the best fit for inclusion outside letting
community adoption speak.

---
Josh McKenzie

We actually talked a good bit on the SIG call earlier today about
heuristics. We need to document what functionality an operator should
include at level 0, level 1, etc. We did discuss this a good bit during
some of the initial SIG meetings, but I guess it wasn't really a focal
point at the time. I think we should also provide references to existing
operator projects and possibly other related projects. This would benefit
both community users as well as people working on these projects.

- John

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

- John

_________________________________________________________________________________________________________________________

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Benjamin Lerer <be...@datastax.com>.

>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus?  It
> doesn't seem like it can hurt at this point, anyway.
>

+1



On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <be...@apache.org>
wrote:

> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus?  It
> doesn't seem like it can hurt at this point, anyway.
>
>
> On 23/09/2020, 17:13, "John Sanda" <jo...@gmail.com> wrote:
>
>     I want to point out that pretty much everything being discussed in this
>     thread has been discussed at length during the SIG meetings. I think it is
>     worth noting because we are pretty much still having the same conversation.
>
>     On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <
> benedict@apache.org>
>     wrote:
>
>     > I don't think there's anything about a code drop that's not "The
> Apache
>     > Way"
>     >
>     > If there's a consensus (or even strong majority) amongst invested
> parties,
>     > I don't see why we could not adopt an operator directly into the
> project.
>     >
>     > It's possible a green field approach might lead to fewer hard
> feelings, as
>     > everyone is in the same boat. Perhaps all operators are also
> suboptimal and
>     > could be improved with a rewrite? But I think coordinating a lot of
>     > different entities around an empty codebase is particularly
> challenging.  I
>     > actually think it could be better for cohesion and collaboration to
> have a
>     > suboptimal but substantive starting point.
>     >
>     >
>     > On 23/09/2020, 16:11, "Stefan Miklosovic" <
>     > stefan.miklosovic@instaclustr.com> wrote:
>     >
>     >     I think that from Instaclustr it was stated quite clearly
> multiple
>     >     times that we are "fine to throw it away" if there is something
> better
>     >     and more widespread. Indeed, we have invested a lot of time in
> the
>     >     operator but it was not useless at all, we gained a lot of quite
>     >     unique knowledge how to put all pieces together. However, I
> think that
>     >     this space is going to be quite fragmented and "balkanized",
> which is
>     >     not always a bad thing, but in a quite narrow area as Kubernetes
>     >     operator is, I just do not see how 4 operators are going to be
>     >     beneficial for ordinary people ("official" from community, ours,
>     >     Datastax one and CassKop (without any significant order)). Sure,
>     >     innovation and healthy competition is important but to what
> extent ...
>     >     One can start a Cassandra cluster on Kubernetes just so many
> times
>     >     differently and nobody really likes a vendor lock-in. People
> wanting
>     >     to run a cluster on K8S realise that there are three operators,
> each
>     >     backed by a private business entity, and the community operator
> is not
>     >     there ... Huh, interesting ... One may even start to question
> what is
>     >     wrong with these folks that it takes three companies to build
> their
>     >     own solution.
>     >
>     >     Having said that, to my perception, Cassandra community just
> does not
>     >     have enough engineers nor contributors to keep 4 operators alive
> at
>     >     the same time (I wish I was wrong) so the idea of selecting the
> best
>     >     one or to merge obvious things and approaches together is
>     >     understandable, even if it meant we eventually sunset ours. In
>     >     addition, nobody from big players is going to contribute to the
> code
>     >     base of the other one, for obvious reasons, so channeling and
>     >     directing this effort into something common for a community
> seems to
>     >     be the only reasonable way of cooperation.
>     >
>     >     It is quite hard to bootstrap this if the donation of the code
> in big
>     >     chunks / whole repo is out of question as it is not the "Apache
> way"
>     >     (there was some thread running here about this in more depth a
> while
>     >     ago) and we basically need to start from scratch which is quite
>     >     demotivating, we are just inventing the wheel and nobody is up
> to it.
>     >     It is like people are waiting for that to happen so they can
> jump in
>     >     "once it is the thing" but it will never materialise or at least
> the
>     >     hurdle to kick it off is unnecessarily high. Nobody is going to
> invest
>     >     in this heavily if there is already a working operator from
> companies
>     >     mentioned above. As I understood it, one reason of not choosing
> the
>     >     way of donating it all is that "the learning and community
> building
>     >     should happen in organic manner and we just can not accept the
>     >     donation", but is not it true that it is easier to build a
> community
>     >     around something which is already there rather than trying to
> build it
>     >     around an idea which is quite hard to dedicate to?
>     >
>     >     On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <
> jmckenzie@apache.org>
>     > wrote:
>     >     >
>     >     > > I think there's significant value to the community in trying
> to
>     > coalesce
>     >     > on a single approach,
>     >     > I agree. Unfortunately in this case, the parties with a vested
>     > interest and
>     >     > written operators came to the table and couldn't agree to
> coalesce
>     > on a
>     >     > single approach. John Sanda attempted to start an initiative to
>     > write a
>     >     > best-of-breed combining choice parts of each operator, but that
>     > effort did
>     >     > not gain traction.
>     >     >
>     >     > Which is where my hypothesis comes from that if there were a
> clear
>     > "better
>     >     > fit" operator to start from we wouldn't be in a deadlock; the
> correct
>     >     > choice would be obvious. Reasonably so, every engineer that's
> written
>     >     > something is going to want that something to be used and not
> thrown
>     > away in
>     >     > favor of another something without strong evidence as to why
> that's
>     > the
>     >     > better choice.
>     >     >
>     >     > As far as I know, nobody has made a clear case as to a more
>     > compelling
>     >     > place to start in terms of an operator donation the project
> then
>     >     > collaborates on. There's no mass adoption evidence nor feature
>     > enumeration
>     >     > that I know of for any of the approaches anyone's taken, so the
>     > discussions
>     >     > remain stalled.
>     >     >
>     >     >
>     >     >
>     >     > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <
>     > benedict@apache.org
>     >     > > wrote:
>     >     >
>     >     > > I think there's significant value to the community in trying
> to
>     > coalesce
>     >     > > on a single approach, earlier than later. This is an
> opportunity
>     > to expand
>     >     > > the number of active organisations involved directly in the
> Apache
>     >     > > Cassandra project, as well as to more quickly expand the
> project's
>     >     > > functionality into an area we consider urgent and important.
> I
>     > think it
>     >     > > would be a real shame to waste this opportunity. No doubt it
> will
>     > be hard,
>     >     > > as organisations have certain built-in investments in their
> own
>     > approaches.
>     >     > >
>     >     > > I haven't participated in these calls as I do not consider
> myself
>     > to have
>     >     > > the relevant experience and expertise, and have other
> focuses on
>     > the
>     >     > > project. I just wanted to voice a vote in favour of trying to
>     > bring the
>     >     > > different organisations together on a single approach if
> possible.
>     > Is there
>     >     > > anything the project can do to help this happen?
>     >     > >
>     >     > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com>
> wrote:
>     >     > >
>     >     > > I think there is certainly an appetite to donate and
> standardise
>     > on a
>     >     > > given operator (as mentioned in this thread).
>     >     > >
>     >     > > I personally found the SIG hard to participate in due to time
>     > zones and
>     >     > > the synchronous nature of it.
>     >     > >
>     >     > > So while it was a great forum to dive into certain details
> for a
>     > subset of
>     >     > > participants and a worthwhile endeavour, I wouldn't paint it
> as an
>     > accurate
>     >     > > reflection of community intent.
>     >     > >
>     >     > > I don't think that any participants want to continue down
> the path
>     > of "let
>     >     > > a thousand flowers bloom". That's why we are looking towards
>     > CassKop (as
>     >     > > well as a number of technical reasons).
>     >     > >
>     >     > > Some of the recorded meetings and outputs can also be found
> if you
>     > are
>     >     > > interested in some primary sources
>     >     > > https://cwiki.apache.org/confluence/display/CASSANDRA/
>     >     > > Cassandra+Kubernetes+Operator+SIG
>     >     > > .
>     >     > >
>     >     > > From what I understand second-hand from talking to people on
> the
>     > SIG
>     >     > > calls,
>     >     > >
>     >     > > there was a general inability to agree on an existing
> operator as a
>     >     > > starting point and not much engagement on taking best of
> breed
>     > from the
>     >     > > various to combine them. Seems to leave us in the "let a
> thousand
>     > flowers
>     >     > > bloom" stage of letting operators grow in the ecosystem and
> seeing
>     > which
>     >     > > ones meet the needs of end users before talking about
> adopting one
>     > into the
>     >     > > foundation.
>     >     > >
>     >     > > Great to hear that you folks are joining forces though!
> Bodes well
>     > for C*
>     >     > > users that are wanting to run things on k8s.
>     >     > >
>     >     > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <
> ben@instaclustr.com
>     > >
>     >     > > wrote:
>     >     > >
>     >     > > For what it's worth, a quick update from me:
>     >     > >
>     >     > > CassKop now has at least two organisations working on it
>     > substantially
>     >     > > (Orange and Instaclustr) as well as the numerous other
>     > contributors.
>     >     > >
>     >     > > Internally we will also start pointing others towards CassKop
> once
>     > a few
>     >     > > things get merged. While we are not sunsetting our
> operator
>     > yet, it
>     >     > >
>     >     > > is
>     >     > >
>     >     > > certainly looking that way.
>     >     > >
>     >     > > I'd love to see the community adopt it as a starting point
> for
>     > working
>     >     > > towards whatever level of functionality is desired.
>     >     > >
>     >     > > Cheers
>     >     > >
>     >     > > Ben
>     >     > >
>     >     > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> john.sanda@gmail.com>
>     > wrote:
>     >     > >
>     >     > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <
>     > jmckenzie@apache.org>
>     >     > > wrote:
>     >     > >
>     >     > > There's basically 1 java driver in the C* ecosystem. We have
> 3? 4?
>     > or
>     >     > >
>     >     > > more
>     >     > >
>     >     > > operators in the ecosystem. Has one of them hit a clear
>     > supermajority of
>     >     > > adoption that makes it the de facto default and makes sense
> to
>     > pull it
>     >     > >
>     >     > > into
>     >     > >
>     >     > > the project?
>     >     > >
>     >     > > We as a project community were pretty slow to move on
> building a
>     > PoV
>     >     > >
>     >     > > around
>     >     > >
>     >     > > kubernetes so we find ourselves in a situation with a bunch
> of
>     > contenders
>     >     > > for inclusion in the project. It's not clear to me what
> heuristics
>     > we'd
>     >     > >
>     >     > > use
>     >     > >
>     >     > > to gauge which one would be the best fit for inclusion
> outside
>     > letting
>     >     > > community adoption speak.
>     >     > >
>     >     > > ---
>     >     > > Josh McKenzie
>     >     > >
>     >     > > We actually talked a good bit on the SIG call earlier today
> about
>     >     > > heuristics. We need to document what functionality an
> operator
>     > should
>     >     > > include at level 0, level 1, etc. We did discuss this a good
> bit
>     > during
>     >     > > some of the initial SIG meetings, but I guess it wasn't
> really a
>     > focal
>     >     > > point at the time. I think we should also provide references
> to
>     > existing
>     >     > > operator projects and possibly other related projects. This
> would
>     > benefit
>     >     > > both community users as well as people working on these
> projects.
>     >     > >
>     >     > > - John
>     >     > >
>     >     > > --
>     >     > >
>     >     > > Ben Bromhead
>     >     > >
>     >     > > Instaclustr | www.instaclustr.com | @instaclustr
>     >     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     >     > >
>     >     > > --
>     >     > >
>     >     > > Ben Bromhead
>     >     > >
>     >     > > Instaclustr | www.instaclustr.com | @instaclustr
>     >     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     >     > >
>     >     > >
>     >
> --------------------------------------------------------------------- To
>     >     > > unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For
>     > additional
>     >     > > commands, e-mail: dev-help@cassandra.apache.org
>     >     > >
>     >
>     >
>  ---------------------------------------------------------------------
>     >     To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     >     For additional commands, e-mail: dev-help@cassandra.apache.org
>     >
>     >
>     >
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     > For additional commands, e-mail: dev-help@cassandra.apache.org
>     >
>     > --
>
>     - John
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Patrick McFadin <pm...@gmail.com>.

I would like to propose a hybrid of what Benedict mentioned.

Let's postpone today's (Sept 24) SIG to the next week, Oct 1. Same time.

I'll keep the same zoom with some modifications. Each group, CassKop and
cass-operator can have time to present the following:

 - State your view of the situation
 - Why you would or would not support a given approach/operator. Be
technically specific.
 - What technical circumstances might lead you to change your mind
 - Your view of a path forward

Each group will get 20 minutes with 10 minutes of q&a. I can moderate.
After the meeting I'll post the video as well as a full transcript (Thank
you Otter.ai!) If there were presentation decks, then post a PDF of those.
I'll kick off the discussion in the dev ML and we can debate here.

This is my proposal for moving things forward if possible. +1, -1 or more
debating?

Patrick

On Wed, Sep 23, 2020 at 12:21 PM Benedict Elliott Smith <be...@apache.org>
wrote:

> Perhaps it helps to widen the field of discussion to the dev list?
>
> It might help if each of the stakeholder organisations state their view on
> the situation, including why they would or would not support a given
> approach/operator, and what (preferably specific) circumstances might lead
> them to change their mind?
>
> I realise there are meeting logs, but getting a wider discourse with
> non-stakeholder input might help to build a community consensus?  It
> doesn't seem like it can hurt at this point, anyway.
>
>
> On 23/09/2020, 17:13, "John Sanda" <jo...@gmail.com> wrote:
>
>     I want to point out that pretty much everything being discussed in this
>     thread has been discussed at length during the SIG meetings. I think it is
>     worth noting because we are pretty much still having the same conversation.
>
>     On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <
> benedict@apache.org>
>     wrote:
>
>     > I don't think there's anything about a code drop that's not "The
> Apache
>     > Way"
>     >
>     > If there's a consensus (or even strong majority) amongst invested
> parties,
>     > I don't see why we could not adopt an operator directly into the
> project.
>     >
>     > It's possible a green field approach might lead to fewer hard
> feelings, as
>     > everyone is in the same boat. Perhaps all operators are also
> suboptimal and
>     > could be improved with a rewrite? But I think coordinating a lot of
>     > different entities around an empty codebase is particularly
> challenging.  I
>     > actually think it could be better for cohesion and collaboration to
> have a
>     > suboptimal but substantive starting point.
>     >
>     >
>     > On 23/09/2020, 16:11, "Stefan Miklosovic" <
>     > stefan.miklosovic@instaclustr.com> wrote:
>     >
>     >     I think that from Instaclustr it was stated quite clearly
> multiple
>     >     times that we are "fine to throw it away" if there is something
> better
>     >     and more wide-spread.Indeed, we have invested a lot of time in
> the
>     >     operator but it was not useless at all, we gained a lot of quite
>     >     unique knowledge how to put all pieces together. However, I
> think that
>     >     this space is going to be quite fragmented and "balkanized",
> which is
>     >     not always a bad thing, but in a quite narrow area as Kubernetes
>     >     operator is, I just do not see how 4 operators are going to be
>     >     beneficial for ordinary people ("official" from community, ours,
>     >     Datastax one and CassKop (without any significant order)). Sure,
>     >     innovation and healthy competition is important but to what
> extent ...
>     >     One can start a Cassandra cluster on Kubernetes just so many
> times
>     >     differently and nobody really likes a vendor lock-in. People
> wanting
>     >     to run a cluster on K8S realise that there are three operators,
> each
>     >     backed by a private business entity, and the community operator
> is not
>     >     there ... Huh, interesting ... One may even start to question
> what is
>     >     wrong with these folks that it takes three companies to build
> their
>     >     own solution.
>     >
>     >     Having said that, to my perception, Cassandra community just
> does not
>     >     have enough engineers nor contributors to keep 4 operators alive
> at
>     >     the same time (I wish I was wrong) so the idea of selecting the
> best
>     >     one or to merge obvious things and approaches together is
>     >     understandable, even if it meant we eventually sunset ours. In
>     >     addition, nobody from big players is going to contribute to the
> code
>     >     base of the other one, for obvious reasons, so channeling and
>     >     directing this effort into something common for a community
> seems to
>     >     be the only reasonable way of cooperation.
>     >
>     >     It is quite hard to bootstrap this if the donation of the code
> in big
>     >     chunks / whole repo is out of question as it is not the "Apache
> way"
>     >     (there was some thread running here about this in more depth a
> while
>     >     ago) and we basically need to start from scratch which is quite
>     >     demotivating, we are just inventing the wheel and nobody is up
> to it.
>     >     It is like people are waiting for that to happen so they can
> jump in
>     >     "once it is the thing" but it will never materialise or at least
> the
>     >     hurdle to kick it off is unnecessarily high. Nobody is going to
> invest
>     >     in this heavily if there is already a working operator from
> companies
>     >     mentioned above. As I understood it, one reason of not choosing
> the
>     >     way of donating it all is that "the learning and community
> building
>     >     should happen in organic manner and we just can not accept the
>     >     donation", but is not it true that it is easier to build a
> community
>     >     around something which is already there rather than trying to
> build it
>     >     around an idea which is quite hard to dedicate to?
>     >
>     >     On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <
> jmckenzie@apache.org>
>     > wrote:
>     >     >
>     >     > > I think there's significant value to the community in trying
> to
>     > coalesce
>     >     > on a single approach,
>     >     > I agree. Unfortunately in this case, the parties with a vested
>     > interest and
>     >     > written operators came to the table and couldn't agree to
> coalesce
>     > on a
>     >     > single approach. John Sanda attempted to start an initiative to
>     > write a
>     >     > best-of-breed combining choice parts of each operator, but that
>     > effort did
>     >     > not gain traction.
>     >     >
>     >     > Which is where my hypothesis comes from that if there were a
> clear
>     > "better
>     >     > fit" operator to start from we wouldn't be in a deadlock; the
> correct
>     >     > choice would be obvious. Reasonably so, every engineer that's
> written
>     >     > something is going to want that something to be used and not
> thrown
>     > away in
>     >     > favor of another something without strong evidence as to why
> that's
>     > the
>     >     > better choice.
>     >     >
>     >     > As far as I know, nobody has made a clear case as to a more
>     > compelling
>     >     > place to start in terms of an operator donation the project
> then
>     >     > collaborates on. There's no mass adoption evidence nor feature
>     > enumeration
>     >     > that I know of for any of the approaches anyone's taken, so the
>     > discussions
>     >     > remain stalled.
>     >     >
>     >     >
>     >     >
>     >     > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <
>     > benedict@apache.org
>     >     > > wrote:
>     >     >
>     >     > > I think there's significant value to the community in trying
> to
>     > coalesce
>     >     > > on a single approach, earlier than later. This is an
> opportunity
>     > to expand
>     >     > > the number of active organisations involved directly in the
> Apache
>     >     > > Cassandra project, as well as to more quickly expand the
> project's
>     >     > > functionality into an area we consider urgent and important.
> I
>     > think it
>     >     > > would be a real shame to waste this opportunity. No doubt it
> will
>     > be hard,
>     >     > > as organisations have certain built-in investments in their
> own
>     > approaches.
>     >     > >
>     >     > > I haven't participated in these calls as I do not consider
> myself
>     > to have
>     >     > > the relevant experience and expertise, and have other
> focuses on
>     > the
>     >     > > project. I just wanted to voice a vote in favour of trying to
>     > bring the
>     >     > > different organisations together on a single approach if
> possible.
>     > Is there
>     >     > > anything the project can do to help this happen?
>     >     > >
>     >     > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com>
> wrote:
>     >     > >
>     >     > > I think there is certainly an appetite to donate and
> standardise
>     > on a
>     >     > > given operator (as mentioned in this thread).
>     >     > >
>     >     > > I personally found the SIG hard to participate in due to time
>     > zones and
>     >     > > the synchronous nature of it.
>     >     > >
>     >     > > So while it was a great forum to dive into certain details
> for a
>     > subset of
>     >     > > participants and a worthwhile endeavour, I wouldn't paint it
> as an
>     > accurate
>     >     > > reflection of community intent.
>     >     > >
>     >     > > I don't think that any participants want to continue down
> the path
>     > of "let
>     >     > > a thousand flowers bloom". That's why we are looking towards
>     > CasKop (as
>     >     > > well as a number of technical reasons).
>     >     > >
>     >     > > Some of the recorded meetings and outputs can also be found
> if you
>     > are
>     >     > > interested in some primary sources
>     >     > > https://cwiki.apache.org/confluence/display/CASSANDRA/
>     >     > > Cassandra+Kubernetes+Operator+SIG
>     >     > > .
>     >     > >
>     >     > > From what I understand second-hand from talking to people on
> the
>     > SIG
>     >     > > calls,
>     >     > >
>     >     > > there was a general inability to agree on an existing
> operator as a
>     >     > > starting point and not much engagement on taking best of
> breed
>     > from the
>     >     > > various to combine them. Seems to leave us in the "let a
> thousand
>     > flowers
>     >     > > bloom" stage of letting operators grow in the ecosystem and
> seeing
>     > which
>     >     > > ones meet the needs of end users before talking about
> adopting one
>     > into the
>     >     > > foundation.
>     >     > >
>     >     > > Great to hear that you folks are joining forces though!
> Bodes well
>     > for C*
>     >     > > users that are wanting to run things on k8s.
>     >     > >
>     >     > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <
> ben@instaclustr.com
>     > >
>     >     > > wrote:
>     >     > >
>     >     > > For what it's worth, a quick update from me:
>     >     > >
>     >     > > CassKop now has at least two organisations working on it
>     > substantially
>     >     > > (Orange and Instaclustr) as well as the numerous other
>     > contributors.
>     >     > >
>     >     > > Internally we will also start pointing others towards CasKop
> once
>     > a few
>     >     > > things get merged. While we are not yet sunsetting our
> operator
>     > yet, it
>     >     > >
>     >     > > is
>     >     > >
>     >     > > certainly looking that way.
>     >     > >
>     >     > > I'd love to see the community adopt it as a starting point
> for
>     > working
>     >     > > towards whatever level of functionality is desired.
>     >     > >
>     >     > > Cheers
>     >     > >
>     >     > > Ben
>     >     > >
>     >     > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> john.sanda@gmail.com>
>     > wrote:
>     >     > >
>     >     > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <
>     > jmckenzie@apache.org>
>     >     > > wrote:
>     >     > >
>     >     > > There's basically 1 java driver in the C* ecosystem. We have
> 3? 4?
>     > or
>     >     > >
>     >     > > more
>     >     > >
>     >     > > operators in the ecosystem. Has one of them hit a clear
>     > supermajority of
>     >     > > adoption that makes it the de facto default and makes sense
> to
>     > pull it
>     >     > >
>     >     > > into
>     >     > >
>     >     > > the project?
>     >     > >
>     >     > > We as a project community were pretty slow to move on
> building a
>     > PoV
>     >     > >
>     >     > > around
>     >     > >
>     >     > > kubernetes so we find ourselves in a situation with a bunch
> of
>     > contenders
>     >     > > for inclusion in the project. It's not clear to me what
> heuristics
>     > we'd
>     >     > >
>     >     > > use
>     >     > >
>     >     > > to gauge which one would be the best fit for inclusion
> outside
>     > letting
>     >     > > community adoption speak.
>     >     > >
>     >     > > ---
>     >     > > Josh McKenzie
>     >     > >
>     >     > > We actually talked a good bit on the SIG call earlier today
> about
>     >     > > heuristics. We need to document what functionality an
> operator
>     > should
>     >     > > include at level 0, level 1, etc. We did discuss this a good
> bit
>     > during
>     >     > > some of the initial SIG meetings, but I guess it wasn't
> really a
>     > focal
>     >     > > point at the time. I think we should also provide references
> to
>     > existing
>     >     > > operator projects and possibly other related projects. This
> would
>     > benefit
>     >     > > both community users as well as people working on these
> projects.
>     >     > >
>     >     > > - John
>     >     > >
>     >     > > --
>     >     > >
>     >     > > Ben Bromhead
>     >     > >
>     >     > > Instaclustr | www.instaclustr.com | @instaclustr
>     >     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     >     > >
>     >     > > --
>     >     > >
>     >     > > Ben Bromhead
>     >     > >
>     >     > > Instaclustr | www.instaclustr.com | @instaclustr
>     >     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     >     > >
>     >     > >
>     >
> --------------------------------------------------------------------- To
>     >     > > unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For
>     > additional
>     >     > > commands, e-mail: dev-help@cassandra.apache.org
>     >     > >
>     >
>     >
>  ---------------------------------------------------------------------
>     >     To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     >     For additional commands, e-mail: dev-help@cassandra.apache.org
>     >
>     >
>     >
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     > For additional commands, e-mail: dev-help@cassandra.apache.org
>     >
>     > --
>
>     - John
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Benedict Elliott Smith <be...@apache.org>.

Perhaps it helps to widen the field of discussion to the dev list?

It might help if each of the stakeholder organisations stated their view on the situation, including why they would or would not support a given approach/operator, and what (preferably specific) circumstances might lead them to change their mind?

I realise there are meeting logs, but getting a wider discourse with non-stakeholder input might help to build a community consensus? It doesn't seem like it can hurt at this point, anyway.


On 23/09/2020, 17:13, "John Sanda" <jo...@gmail.com> wrote:

    I want to point out that pretty much everything being discussed in this
    thread has been discussed at length during the SIG meetings. I think it is
    worth noting because we are pretty much still having the same conversation.

    On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <be...@apache.org>
    wrote:

    > I don't think there's anything about a code drop that's not "The Apache
    > Way"
    >
    > If there's a consensus (or even strong majority) amongst invested parties,
    > I don't see why we could not adopt an operator directly into the project.
    >
    > It's possible a green field approach might lead to fewer hard feelings, as
    > everyone is in the same boat. Perhaps all operators are also suboptimal and
    > could be improved with a rewrite? But I think coordinating a lot of
    > different entities around an empty codebase is particularly challenging.  I
    > actually think it could be better for cohesion and collaboration to have a
    > suboptimal but substantive starting point.
    >
    >
    > On 23/09/2020, 16:11, "Stefan Miklosovic" <
    > stefan.miklosovic@instaclustr.com> wrote:
    >
    >     I think Instaclustr has stated quite clearly, multiple times, that
    >     we are "fine to throw it away" if there is something better and more
    >     widespread. Indeed, we have invested a lot of time in the operator,
    >     but it was not useless at all; we gained a lot of quite unique
    >     knowledge about how to put all the pieces together. However, I think
    >     this space is going to be quite fragmented and "balkanized", which is
    >     not always a bad thing, but in an area as narrow as a Kubernetes
    >     operator, I just do not see how 4 operators are going to be
    >     beneficial for ordinary people (the "official" one from the
    >     community, ours, the DataStax one and CassKop, in no particular
    >     order). Sure, innovation and healthy competition are important, but
    >     to what extent ... There are only so many different ways to start a
    >     Cassandra cluster on Kubernetes, and nobody really likes vendor
    >     lock-in. People wanting to run a cluster on K8S realise that there
    >     are three operators, each backed by a private business entity, and
    >     the community operator is not there ... Huh, interesting ... One may
    >     even start to question what is wrong with these folks that it takes
    >     three companies to each build their own solution.
    >
    >     Having said that, in my view the Cassandra community just does not
    >     have enough engineers or contributors to keep 4 operators alive at
    >     the same time (I wish I were wrong), so the idea of selecting the
    >     best one, or of merging the obvious things and approaches together,
    >     is understandable, even if it means we eventually sunset ours. In
    >     addition, nobody from the big players is going to contribute to the
    >     code base of another one, for obvious reasons, so channeling and
    >     directing this effort into something common for the community seems
    >     to be the only reasonable way of cooperating.
    >
    >     It is quite hard to bootstrap this if donating the code in big
    >     chunks or as a whole repo is out of the question because it is not
    >     the "Apache way" (there was a thread here about this in more depth a
    >     while ago) and we basically need to start from scratch, which is
    >     quite demotivating: we are just reinventing the wheel and nobody is
    >     up for it. It is as if people are waiting for that to happen so they
    >     can jump in "once it is the thing", but it will never materialise, or
    >     at least the hurdle to kick it off is unnecessarily high. Nobody is
    >     going to invest heavily in this if there is already a working
    >     operator from the companies mentioned above. As I understood it, one
    >     reason for not donating it all is that "the learning and community
    >     building should happen in organic manner and we just can not accept
    >     the donation", but is it not true that it is easier to build a
    >     community around something which is already there than around an
    >     idea which is quite hard to commit to?
    >
    >     On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jm...@apache.org>
    >     wrote:
    >     >
    >     > > I think there's significant value to the community in trying to
    >     > > coalesce on a single approach,
    >     > I agree. Unfortunately in this case, the parties with a vested
    >     > interest and written operators came to the table and couldn't agree
    >     > to coalesce on a single approach. John Sanda attempted to start an
    >     > initiative to write a best-of-breed combining choice parts of each
    >     > operator, but that effort did not gain traction.
    >     >
    >     > Which is where my hypothesis comes from that if there were a clear
    >     > "better fit" operator to start from we wouldn't be in a deadlock; the
    >     > correct choice would be obvious. Reasonably so, every engineer that's
    >     > written something is going to want that something to be used and not
    >     > thrown away in favor of another something without strong evidence as
    >     > to why that's the better choice.
    >     >
    >     > As far as I know, nobody has made a clear case as to a more
    >     > compelling place to start in terms of an operator donation the
    >     > project then collaborates on. There's no mass adoption evidence nor
    >     > feature enumeration that I know of for any of the approaches anyone's
    >     > taken, so the discussions remain stalled.
    >     >
    >     >
    >     >
    >     > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
    >     > <benedict@apache.org> wrote:
    >     >
    >     > > I think there's significant value to the community in trying to
    >     > > coalesce on a single approach, earlier rather than later. This is an
    >     > > opportunity to expand the number of active organisations involved
    >     > > directly in the Apache Cassandra project, as well as to more quickly
    >     > > expand the project's functionality into an area we consider urgent
    >     > > and important. I think it would be a real shame to waste this
    >     > > opportunity. No doubt it will be hard, as organisations have certain
    >     > > built-in investments in their own approaches.
    >     > >
    >     > > I haven't participated in these calls as I do not consider myself to
    >     > > have the relevant experience and expertise, and have other focuses on
    >     > > the project. I just wanted to voice a vote in favour of trying to
    >     > > bring the different organisations together on a single approach if
    >     > > possible. Is there anything the project can do to help this happen?
    >     > >
    >     > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
    >     > >
    >     > > I think there is certainly an appetite to donate and standardise on
    >     > > a given operator (as mentioned in this thread).
    >     > >
    >     > > I personally found the SIG hard to participate in due to time zones
    >     > > and the synchronous nature of it.
    >     > >
    >     > > So while it was a great forum to dive into certain details for a
    >     > > subset of participants and a worthwhile endeavour, I wouldn't paint
    >     > > it as an accurate reflection of community intent.
    >     > >
    >     > > I don't think that any participants want to continue down the path
    >     > > of "let a thousand flowers bloom". That's why we are looking towards
    >     > > CassKop (as well as a number of technical reasons).
    >     > >
    >     > > Some of the recorded meetings and outputs can also be found if you
    >     > > are interested in some primary sources:
    >     > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
    >     > >
    >     > > From what I understand second-hand from talking to people on the
    >     > > SIG calls, there was a general inability to agree on an existing
    >     > > operator as a starting point and not much engagement on taking best
    >     > > of breed from the various to combine them. Seems to leave us in the
    >     > > "let a thousand flowers bloom" stage of letting operators grow in
    >     > > the ecosystem and seeing which ones meet the needs of end users
    >     > > before talking about adopting one into the foundation.
    >     > >
    >     > > Great to hear that you folks are joining forces though! Bodes well
    >     > > for C* users that are wanting to run things on k8s.
    >     > >
    >     > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com>
    >     > > wrote:
    >     > >
    >     > > For what it's worth, a quick update from me:
    >     > >
    >     > > CassKop now has at least two organisations working on it
    >     > > substantially (Orange and Instaclustr) as well as numerous other
    >     > > contributors.
    >     > >
    >     > > Internally we will also start pointing others towards CassKop once a
    >     > > few things get merged. While we are not sunsetting our operator yet,
    >     > > it is certainly looking that way.
    >     > >
    >     > > I'd love to see the community adopt it as a starting point for
    >     > > working towards whatever level of functionality is desired.
    >     > >
    >     > > Cheers
    >     > >
    >     > > Ben
    >     > >
    >     > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com>
    >     > > wrote:
    >     > >
    >     > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
    >     > > wrote:
    >     > >
    >     > > There's basically 1 java driver in the C* ecosystem. We have 3? 4?
    >     > > or more operators in the ecosystem. Has one of them hit a clear
    >     > > supermajority of adoption that makes it the de facto default and
    >     > > makes sense to pull it into the project?
    >     > >
    >     > > We as a project community were pretty slow to move on building a PoV
    >     > > around kubernetes so we find ourselves in a situation with a bunch of
    >     > > contenders for inclusion in the project. It's not clear to me what
    >     > > heuristics we'd use to gauge which one would be the best fit for
    >     > > inclusion outside letting community adoption speak.
    >     > >
    >     > > ---
    >     > > Josh McKenzie
    >     > >
    >     > > We actually talked a good bit on the SIG call earlier today about
    >     > > heuristics. We need to document what functionality an operator
    >     > > should include at level 0, level 1, etc. We did discuss this a good
    >     > > bit during some of the initial SIG meetings, but I guess it wasn't
    >     > > really a focal point at the time. I think we should also provide
    >     > > references to existing operator projects and possibly other related
    >     > > projects. This would benefit both community users as well as people
    >     > > working on these projects.
    >     > >
    >     > > - John
    >     > >
    >     > > --
    >     > >
    >     > > Ben Bromhead
    >     > >
    >     > > Instaclustr | www.instaclustr.com | @instaclustr
    >     > > <http://twitter.com/instaclustr> | (650) 284 9692
    >     > >
    > --

    - John



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by John Sanda <jo...@gmail.com>.

I want to point out that pretty much everything being discussed in this
thread has been discussed at length during the SIG meetings. I think it is
worth noting because we are pretty much still having the same conversation.

On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <be...@apache.org>
wrote:

> I don't think there's anything about a code drop that's not "The Apache
> Way"
>
> If there's a consensus (or even strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a green field approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging.  I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" <
> stefan.miklosovic@instaclustr.com> wrote:
>
>     I think Instaclustr has stated quite clearly, multiple times, that
>     we are "fine to throw it away" if there is something better and more
>     widespread. Indeed, we have invested a lot of time in the operator,
>     but it was not useless at all; we gained a lot of quite unique
>     knowledge about how to put all the pieces together. However, I think
>     this space is going to be quite fragmented and "balkanized", which is
>     not always a bad thing, but in an area as narrow as a Kubernetes
>     operator, I just do not see how 4 operators are going to be
>     beneficial for ordinary people (the "official" one from the
>     community, ours, the DataStax one and CassKop, in no particular
>     order). Sure, innovation and healthy competition are important, but
>     to what extent ... There are only so many different ways to start a
>     Cassandra cluster on Kubernetes, and nobody really likes vendor
>     lock-in. People wanting to run a cluster on K8S realise that there
>     are three operators, each backed by a private business entity, and
>     the community operator is not there ... Huh, interesting ... One may
>     even start to question what is wrong with these folks that it takes
>     three companies to each build their own solution.
>
>     Having said that, in my view the Cassandra community just does not
>     have enough engineers or contributors to keep 4 operators alive at
>     the same time (I wish I were wrong), so the idea of selecting the
>     best one, or of merging the obvious things and approaches together,
>     is understandable, even if it means we eventually sunset ours. In
>     addition, nobody from the big players is going to contribute to the
>     code base of another one, for obvious reasons, so channeling and
>     directing this effort into something common for the community seems
>     to be the only reasonable way of cooperating.
>
>     It is quite hard to bootstrap this if donating the code in big
>     chunks or as a whole repo is out of the question because it is not
>     the "Apache way" (there was a thread here about this in more depth a
>     while ago) and we basically need to start from scratch, which is
>     quite demotivating: we are just reinventing the wheel and nobody is
>     up for it. It is as if people are waiting for that to happen so they
>     can jump in "once it is the thing", but it will never materialise, or
>     at least the hurdle to kick it off is unnecessarily high. Nobody is
>     going to invest heavily in this if there is already a working
>     operator from the companies mentioned above. As I understood it, one
>     reason for not donating it all is that "the learning and community
>     building should happen in organic manner and we just can not accept
>     the donation", but is it not true that it is easier to build a
>     community around something which is already there than around an
>     idea which is quite hard to commit to?
>
>     On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jm...@apache.org>
>     wrote:
>     >
>     > > I think there's significant value to the community in trying to
>     > > coalesce on a single approach,
>     > I agree. Unfortunately in this case, the parties with a vested
>     > interest and written operators came to the table and couldn't agree
>     > to coalesce on a single approach. John Sanda attempted to start an
>     > initiative to write a best-of-breed combining choice parts of each
>     > operator, but that effort did not gain traction.
>     >
>     > Which is where my hypothesis comes from that if there were a clear
>     > "better fit" operator to start from we wouldn't be in a deadlock; the
>     > correct choice would be obvious. Reasonably so, every engineer that's
>     > written something is going to want that something to be used and not
>     > thrown away in favor of another something without strong evidence as
>     > to why that's the better choice.
>     >
>     > As far as I know, nobody has made a clear case as to a more
>     > compelling place to start in terms of an operator donation the
>     > project then collaborates on. There's no mass adoption evidence nor
>     > feature enumeration that I know of for any of the approaches anyone's
>     > taken, so the discussions remain stalled.
>     >
>     >
>     >
>     > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <
> benedict@apache.org
>     > > wrote:
>     >
>     > > I think there's significant value to the community in trying to
> coalesce
>     > > on a single approach, earlier than later. This is an opportunity
> to expand
>     > > the number of active organisations involved directly in the Apache
>     > > Cassandra project, as well as to more quickly expand the project's
>     > > functionality into an area we consider urgent and important. I
> think it
>     > > would be a real shame to waste this opportunity. No doubt it will
> be hard,
>     > > as organisations have certain built-in investments in their own
> approaches.
>     > >
>     > > I haven't participated in these calls as I do not consider myself
> to have
>     > > the relevant experience and expertise, and have other focuses on
> the
>     > > project. I just wanted to voice a vote in favour of trying to
> bring the
>     > > different organisations together on a single approach if possible.
> Is there
>     > > anything the project can do to help this happen?
>     > >
>     > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
>     > >
>     > > I think there is certainly an appetite to donate and standardise
> on a
>     > > given operator (as mentioned in this thread).
>     > >
>     > > I personally found the SIG hard to participate in due to time
> zones and
>     > > the synchronous nature of it.
>     > >
>     > > So while it was a great forum to dive into certain details for a
> subset of
>     > > participants and a worthwhile endeavour, I wouldn't paint it as an
> accurate
>     > > reflection of community intent.
>     > >
>     > > I don't think that any participants want to continue down the path
> of "let
>     > > a thousand flowers bloom". That's why we are looking towards
> CasKop (as
>     > > well as a number of technical reasons).
>     > >
>     > > Some of the recorded meetings and outputs can also be found if you
> are
>     > > interested in some primary sources
>     > > https://cwiki.apache.org/confluence/display/CASSANDRA/
>     > > Cassandra+Kubernetes+Operator+SIG
>     > > .
>     > >
>     > > From what I understand second-hand from talking to people on the
> SIG
>     > > calls,
>     > >
>     > > there was a general inability to agree on an existing operator as a
>     > > starting point and not much engagement on taking best of breed
> from the
>     > > various to combine them. Seems to leave us in the "let a thousand
> flowers
>     > > bloom" stage of letting operators grow in the ecosystem and seeing
> which
>     > > ones meet the needs of end users before talking about adopting one
> into the
>     > > foundation.
>     > >
>     > > Great to hear that you folks are joining forces though! Bodes well
> for C*
>     > > users that are wanting to run things on k8s.
>     > >
>     > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com
> >
>     > > wrote:
>     > >
>     > > For what it's worth, a quick update from me:
>     > >
>     > > CassKop now has at least two organisations working on it
> substantially
>     > > (Orange and Instaclustr) as well as the numerous other
> contributors.
>     > >
>     > > Internally we will also start pointing others towards CasKop once
> a few
>     > > things get merged. While we are not yet sunsetting our operator
> yet, it
>     > >
>     > > is
>     > >
>     > > certainly looking that way.
>     > >
>     > > I'd love to see the community adopt it as a starting point for
> working
>     > > towards whatever level of functionality is desired.
>     > >
>     > > Cheers
>     > >
>     > > Ben
>     > >
>     > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com>
> wrote:
>     > >
>     > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <
> jmckenzie@apache.org>
>     > > wrote:
>     > >
>     > > There's basically 1 java driver in the C* ecosystem. We have 3? 4?
> or
>     > >
>     > > more
>     > >
>     > > operators in the ecosystem. Has one of them hit a clear
> supermajority of
>     > > adoption that makes it the de facto default and makes sense to
> pull it
>     > >
>     > > into
>     > >
>     > > the project?
>     > >
>     > > We as a project community were pretty slow to move on building a
> PoV
>     > >
>     > > around
>     > >
>     > > kubernetes so we find ourselves in a situation with a bunch of
> contenders
>     > > for inclusion in the project. It's not clear to me what heuristics
> we'd
>     > >
>     > > use
>     > >
>     > > to gauge which one would be the best fit for inclusion outside
> letting
>     > > community adoption speak.
>     > >
>     > > ---
>     > > Josh McKenzie
>     > >
>     > > We actually talked a good bit on the SIG call earlier today about
>     > > heuristics. We need to document what functionality an operator
> should
>     > > include at level 0, level 1, etc. We did discuss this a good bit
> during
>     > > some of the initial SIG meetings, but I guess it wasn't really a
> focal
>     > > point at the time. I think we should also provide references to
> existing
>     > > operator projects and possibly other related projects. This would
> benefit
>     > > both community users as well as people working on these projects.
>     > >
>     > > - John
>     > >
>     > > --
>     > >
>     > > Ben Bromhead
>     > >
>     > > Instaclustr | www.instaclustr.com | @instaclustr
>     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     > >
>     > > --
>     > >
>     > > Ben Bromhead
>     > >
>     > > Instaclustr | www.instaclustr.com | @instaclustr
>     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     > >
>     > >
>     > >
>
>
> --

- John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by John Sanda <jo...@gmail.com>.

I W

On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith <be...@apache.org>
wrote:

> I don't think there's anything about a code drop that's not "The Apache
> Way".
>
> If there's a consensus (or even a strong majority) amongst invested parties,
> I don't see why we could not adopt an operator directly into the project.
>
> It's possible a greenfield approach might lead to fewer hard feelings, as
> everyone is in the same boat. Perhaps all operators are also suboptimal and
> could be improved with a rewrite? But I think coordinating a lot of
> different entities around an empty codebase is particularly challenging. I
> actually think it could be better for cohesion and collaboration to have a
> suboptimal but substantive starting point.
>
>
> On 23/09/2020, 16:11, "Stefan Miklosovic" <
> stefan.miklosovic@instaclustr.com> wrote:
>
>     I think that from Instaclustr it was stated quite clearly multiple
>     times that we are "fine to throw it away" if there is something better
>     and more widespread. Indeed, we have invested a lot of time in the
>     operator, but it was not useless at all; we gained a lot of quite
>     unique knowledge of how to put all the pieces together. However, I think
>     that this space is going to be quite fragmented and "balkanized", which
>     is not always a bad thing, but in an area as narrow as a Kubernetes
>     operator, I just do not see how 4 operators are going to be
>     beneficial for ordinary people (the "official" one from the community,
>     ours, the DataStax one and CassKop, in no particular order). Sure,
>     innovation and healthy competition are important, but to what extent ...
>     There are only so many different ways to start a Cassandra cluster on
>     Kubernetes, and nobody really likes vendor lock-in. People wanting
>     to run a cluster on K8S realise that there are three operators, each
>     backed by a private business entity, and the community operator is not
>     there ... Huh, interesting ... One may even start to question what is
>     wrong with these folks that it takes three companies to each build their
>     own solution.
>
>     Having said that, as I see it, the Cassandra community just does not
>     have enough engineers or contributors to keep 4 operators alive at
>     the same time (I wish I was wrong), so the idea of selecting the best
>     one or merging the obvious things and approaches together is
>     understandable, even if it meant we eventually sunset ours. In
>     addition, nobody from the big players is going to contribute to the code
>     base of another one, for obvious reasons, so channeling and
>     directing this effort into something common for the community seems to
>     be the only reasonable way of cooperating.
>
>     It is quite hard to bootstrap this if the donation of the code in big
>     chunks / whole repo is out of the question as it is not the "Apache way"
>     (there was some thread running here about this in more depth a while
>     ago) and we basically need to start from scratch, which is quite
>     demotivating; we are just reinventing the wheel and nobody is up to it.
>     It is like people are waiting for that to happen so they can jump in
>     "once it is the thing", but it will never materialise, or at least the
>     hurdle to kick it off is unnecessarily high. Nobody is going to invest
>     in this heavily if there is already a working operator from the companies
>     mentioned above. As I understood it, one reason for not choosing the
>     way of donating it all is that "the learning and community building
>     should happen in an organic manner and we just cannot accept the
>     donation", but isn't it true that it is easier to build a community
>     around something which is already there rather than trying to build it
>     around an idea which is quite hard to commit to?
>
>     On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jm...@apache.org> wrote:
>     >
>     > > I think there's significant value to the community in trying to coalesce
>     > > on a single approach,
>     > I agree. Unfortunately in this case, the parties with a vested interest and
>     > written operators came to the table and couldn't agree to coalesce on a
>     > single approach. John Sanda attempted to start an initiative to write a
>     > best-of-breed combining choice parts of each operator, but that effort did
>     > not gain traction.
>     >
>     > Which is where my hypothesis comes from that if there were a clear "better
>     > fit" operator to start from we wouldn't be in a deadlock; the correct
>     > choice would be obvious. Reasonably so, every engineer that's written
>     > something is going to want that something to be used and not thrown away in
>     > favor of another something without strong evidence as to why that's the
>     > better choice.
>     >
>     > As far as I know, nobody has made a clear case as to a more compelling
>     > place to start in terms of an operator donation the project then
>     > collaborates on. There's no mass adoption evidence nor feature enumeration
>     > that I know of for any of the approaches anyone's taken, so the discussions
>     > remain stalled.
>     >
>     >
>     >
>     > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
>     > wrote:
>     >
>     > > I think there's significant value to the community in trying to coalesce
>     > > on a single approach, earlier than later. This is an opportunity to expand
>     > > the number of active organisations involved directly in the Apache
>     > > Cassandra project, as well as to more quickly expand the project's
>     > > functionality into an area we consider urgent and important. I think it
>     > > would be a real shame to waste this opportunity. No doubt it will be hard,
>     > > as organisations have certain built-in investments in their own approaches.
>     > >
>     > > I haven't participated in these calls as I do not consider myself to have
>     > > the relevant experience and expertise, and have other focuses on the
>     > > project. I just wanted to voice a vote in favour of trying to bring the
>     > > different organisations together on a single approach if possible. Is there
>     > > anything the project can do to help this happen?
>     > >
>     > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
>     > >
>     > > I think there is certainly an appetite to donate and standardise on a
>     > > given operator (as mentioned in this thread).
>     > >
>     > > I personally found the SIG hard to participate in due to time zones and
>     > > the synchronous nature of it.
>     > >
>     > > So while it was a great forum to dive into certain details for a subset of
>     > > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
>     > > reflection of community intent.
>     > >
>     > > I don't think that any participants want to continue down the path of "let
>     > > a thousand flowers bloom". That's why we are looking towards CassKop (as
>     > > well as a number of technical reasons).
>     > >
>     > > Some of the recorded meetings and outputs can also be found if you are
>     > > interested in some primary sources:
>     > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>     > >
>     > > From what I understand second-hand from talking to people on the SIG
>     > > calls, there was a general inability to agree on an existing operator
>     > > as a starting point and not much engagement on taking best of breed
>     > > from the various to combine them. Seems to leave us in the "let a
>     > > thousand flowers bloom" stage of letting operators grow in the
>     > > ecosystem and seeing which ones meet the needs of end users before
>     > > talking about adopting one into the foundation.
>     > >
>     > > Great to hear that you folks are joining forces though! Bodes well for C*
>     > > users that are wanting to run things on k8s.
>     > >
>     > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <ben@instaclustr.com>
>     > > wrote:
>     > >
>     > > For what it's worth, a quick update from me:
>     > >
>     > > CassKop now has at least two organisations working on it substantially
>     > > (Orange and Instaclustr) as well as the numerous other contributors.
>     > >
>     > > Internally we will also start pointing others towards CassKop once a few
>     > > things get merged. While we are not sunsetting our operator yet, it is
>     > > certainly looking that way.
>     > >
>     > > I'd love to see the community adopt it as a starting point for working
>     > > towards whatever level of functionality is desired.
>     > >
>     > > Cheers
>     > >
>     > > Ben
>     > >
>     > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
>     > >
>     > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmckenzie@apache.org>
>     > > wrote:
>     > >
>     > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
>     > > more operators in the ecosystem. Has one of them hit a clear
>     > > supermajority of adoption that makes it the de facto default and makes
>     > > sense to pull it into the project?
>     > >
>     > > We as a project community were pretty slow to move on building a PoV
>     > > around kubernetes so we find ourselves in a situation with a bunch of
>     > > contenders for inclusion in the project. It's not clear to me what
>     > > heuristics we'd use to gauge which one would be the best fit for
>     > > inclusion outside letting community adoption speak.
>     > >
>     > > ---
>     > > Josh McKenzie
>     > >
>     > > We actually talked a good bit on the SIG call earlier today about
>     > > heuristics. We need to document what functionality an operator should
>     > > include at level 0, level 1, etc. We did discuss this a good bit during
>     > > some of the initial SIG meetings, but I guess it wasn't really a focal
>     > > point at the time. I think we should also provide references to existing
>     > > operator projects and possibly other related projects. This would benefit
>     > > both community users as well as people working on these projects.
>     > >
>     > > - John
>     > >
>     > > --
>     > >
>     > > Ben Bromhead
>     > >
>     > > Instaclustr | www.instaclustr.com | @instaclustr
>     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     > >
>     > > --
>     > >
>     > > Ben Bromhead
>     > >
>     > > Instaclustr | www.instaclustr.com | @instaclustr
>     > > <http://twitter.com/instaclustr> | (650) 284 9692
>     > >
>     > >
>     > >
>
>
> --

- John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Benedict Elliott Smith <be...@apache.org>.

I don't think there's anything about a code drop that's not "The Apache Way".

If there's a consensus (or even a strong majority) amongst invested parties, I don't see why we could not adopt an operator directly into the project.

It's possible a greenfield approach might lead to fewer hard feelings, as everyone is in the same boat. Perhaps all operators are also suboptimal and could be improved with a rewrite? But I think coordinating a lot of different entities around an empty codebase is particularly challenging. I actually think it could be better for cohesion and collaboration to have a suboptimal but substantive starting point.


On 23/09/2020, 16:11, "Stefan Miklosovic" <st...@instaclustr.com> wrote:

    I think that from Instaclustr it was stated quite clearly multiple
    times that we are "fine to throw it away" if there is something better
    and more widespread. Indeed, we have invested a lot of time in the
    operator, but it was not useless at all; we gained a lot of quite
    unique knowledge of how to put all the pieces together. However, I think
    that this space is going to be quite fragmented and "balkanized", which
    is not always a bad thing, but in an area as narrow as a Kubernetes
    operator, I just do not see how 4 operators are going to be
    beneficial for ordinary people (the "official" one from the community,
    ours, the DataStax one and CassKop, in no particular order). Sure,
    innovation and healthy competition are important, but to what extent ...
    There are only so many different ways to start a Cassandra cluster on
    Kubernetes, and nobody really likes vendor lock-in. People wanting
    to run a cluster on K8S realise that there are three operators, each
    backed by a private business entity, and the community operator is not
    there ... Huh, interesting ... One may even start to question what is
    wrong with these folks that it takes three companies to each build their
    own solution.

    Having said that, as I see it, the Cassandra community just does not
    have enough engineers or contributors to keep 4 operators alive at
    the same time (I wish I was wrong), so the idea of selecting the best
    one or merging the obvious things and approaches together is
    understandable, even if it meant we eventually sunset ours. In
    addition, nobody from the big players is going to contribute to the code
    base of another one, for obvious reasons, so channeling and
    directing this effort into something common for the community seems to
    be the only reasonable way of cooperating.

    It is quite hard to bootstrap this if the donation of the code in big
    chunks / whole repo is out of the question as it is not the "Apache way"
    (there was some thread running here about this in more depth a while
    ago) and we basically need to start from scratch, which is quite
    demotivating; we are just reinventing the wheel and nobody is up to it.
    It is like people are waiting for that to happen so they can jump in
    "once it is the thing", but it will never materialise, or at least the
    hurdle to kick it off is unnecessarily high. Nobody is going to invest
    in this heavily if there is already a working operator from the companies
    mentioned above. As I understood it, one reason for not choosing the
    way of donating it all is that "the learning and community building
    should happen in an organic manner and we just cannot accept the
    donation", but isn't it true that it is easier to build a community
    around something which is already there rather than trying to build it
    around an idea which is quite hard to commit to?

    On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jm...@apache.org> wrote:
    >
    > > I think there's significant value to the community in trying to coalesce
    > > on a single approach,
    > I agree. Unfortunately in this case, the parties with a vested interest and
    > written operators came to the table and couldn't agree to coalesce on a
    > single approach. John Sanda attempted to start an initiative to write a
    > best-of-breed combining choice parts of each operator, but that effort did
    > not gain traction.
    >
    > Which is where my hypothesis comes from that if there were a clear "better
    > fit" operator to start from we wouldn't be in a deadlock; the correct
    > choice would be obvious. Reasonably so, every engineer that's written
    > something is going to want that something to be used and not thrown away in
    > favor of another something without strong evidence as to why that's the
    > better choice.
    >
    > As far as I know, nobody has made a clear case as to a more compelling
    > place to start in terms of an operator donation the project then
    > collaborates on. There's no mass adoption evidence nor feature enumeration
    > that I know of for any of the approaches anyone's taken, so the discussions
    > remain stalled.
    >
    >
    >
    > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
    > wrote:
    >
    > > I think there's significant value to the community in trying to coalesce
    > > on a single approach, earlier than later. This is an opportunity to expand
    > > the number of active organisations involved directly in the Apache
    > > Cassandra project, as well as to more quickly expand the project's
    > > functionality into an area we consider urgent and important. I think it
    > > would be a real shame to waste this opportunity. No doubt it will be hard,
    > > as organisations have certain built-in investments in their own approaches.
    > >
    > > I haven't participated in these calls as I do not consider myself to have
    > > the relevant experience and expertise, and have other focuses on the
    > > project. I just wanted to voice a vote in favour of trying to bring the
    > > different organisations together on a single approach if possible. Is there
    > > anything the project can do to help this happen?
    > >
    > > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
    > >
    > > I think there is certainly an appetite to donate and standardise on a
    > > given operator (as mentioned in this thread).
    > >
    > > I personally found the SIG hard to participate in due to time zones and
    > > the synchronous nature of it.
    > >
    > > So while it was a great forum to dive into certain details for a subset of
    > > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
    > > reflection of community intent.
    > >
    > > I don't think that any participants want to continue down the path of "let
    > > a thousand flowers bloom". That's why we are looking towards CassKop (as
    > > well as a number of technical reasons).
    > >
    > > Some of the recorded meetings and outputs can also be found if you are
    > > interested in some primary sources:
    > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
    > >
    > > From what I understand second-hand from talking to people on the SIG
    > > calls, there was a general inability to agree on an existing operator
    > > as a starting point and not much engagement on taking best of breed
    > > from the various to combine them. Seems to leave us in the "let a
    > > thousand flowers bloom" stage of letting operators grow in the
    > > ecosystem and seeing which ones meet the needs of end users before
    > > talking about adopting one into the foundation.
    > >
    > > Great to hear that you folks are joining forces though! Bodes well for C*
    > > users that are wanting to run things on k8s.
    > >
    > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com>
    > > wrote:
    > >
    > > For what it's worth, a quick update from me:
    > >
    > > CassKop now has at least two organisations working on it substantially
    > > (Orange and Instaclustr) as well as the numerous other contributors.
    > >
    > > Internally we will also start pointing others towards CassKop once a few
    > > things get merged. While we are not sunsetting our operator yet, it is
    > > certainly looking that way.
    > >
    > > I'd love to see the community adopt it as a starting point for working
    > > towards whatever level of functionality is desired.
    > >
    > > Cheers
    > >
    > > Ben
    > >
    > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
    > >
    > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
    > > wrote:
    > >
    > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
    > > more operators in the ecosystem. Has one of them hit a clear
    > > supermajority of adoption that makes it the de facto default and makes
    > > sense to pull it into the project?
    > >
    > > We as a project community were pretty slow to move on building a PoV
    > > around kubernetes so we find ourselves in a situation with a bunch of
    > > contenders for inclusion in the project. It's not clear to me what
    > > heuristics we'd use to gauge which one would be the best fit for
    > > inclusion outside letting community adoption speak.
    > >
    > > ---
    > > Josh McKenzie
    > >
    > > We actually talked a good bit on the SIG call earlier today about
    > > heuristics. We need to document what functionality an operator should
    > > include at level 0, level 1, etc. We did discuss this a good bit during
    > > some of the initial SIG meetings, but I guess it wasn't really a focal
    > > point at the time. I think we should also provide references to existing
    > > operator projects and possibly other related projects. This would benefit
    > > both community users as well as people working on these projects.
    > >
    > > - John
    > >
    > > --
    > >
    > > Ben Bromhead
    > >
    > > Instaclustr | www.instaclustr.com | @instaclustr
    > > <http://twitter.com/instaclustr> | (650) 284 9692
    > >
    > > --
    > >
    > > Ben Bromhead
    > >
    > > Instaclustr | www.instaclustr.com | @instaclustr
    > > <http://twitter.com/instaclustr> | (650) 284 9692
    > >
    > >


Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Stefan Miklosovic <st...@instaclustr.com>.

I think that from Instaclustr it was stated quite clearly multiple
times that we are "fine to throw it away" if there is something better
and more widespread. Indeed, we have invested a lot of time in the
operator, but it was not useless at all; we gained a lot of quite
unique knowledge of how to put all the pieces together. However, I think
that this space is going to be quite fragmented and "balkanized", which
is not always a bad thing, but in an area as narrow as a Kubernetes
operator, I just do not see how 4 operators are going to be
beneficial for ordinary people (the "official" one from the community,
ours, the DataStax one and CassKop, in no particular order). Sure,
innovation and healthy competition are important, but to what extent ...
There are only so many different ways to start a Cassandra cluster on
Kubernetes, and nobody really likes vendor lock-in. People wanting
to run a cluster on K8S realise that there are three operators, each
backed by a private business entity, and the community operator is not
there ... Huh, interesting ... One may even start to question what is
wrong with these folks that it takes three companies to each build their
own solution.

Having said that, as I see it, the Cassandra community just does not
have enough engineers or contributors to keep 4 operators alive at
the same time (I wish I was wrong), so the idea of selecting the best
one or merging the obvious things and approaches together is
understandable, even if it meant we eventually sunset ours. In
addition, nobody from the big players is going to contribute to the code
base of another one, for obvious reasons, so channeling and
directing this effort into something common for the community seems to
be the only reasonable way of cooperating.

It is quite hard to bootstrap this if the donation of the code in big
chunks / whole repo is out of the question as it is not the "Apache way"
(there was some thread running here about this in more depth a while
ago) and we basically need to start from scratch, which is quite
demotivating; we are just reinventing the wheel and nobody is up to it.
It is like people are waiting for that to happen so they can jump in
"once it is the thing", but it will never materialise, or at least the
hurdle to kick it off is unnecessarily high. Nobody is going to invest
in this heavily if there is already a working operator from the companies
mentioned above. As I understood it, one reason for not choosing the
way of donating it all is that "the learning and community building
should happen in an organic manner and we just cannot accept the
donation", but isn't it true that it is easier to build a community
around something which is already there rather than trying to build it
around an idea which is quite hard to commit to?

On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jm...@apache.org> wrote:
>
> > I think there's significant value to the community in trying to coalesce
> > on a single approach,
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
>
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
>
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
>
>
>
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
> wrote:
>
> > I think there's significant value to the community in trying to coalesce
> > on a single approach, earlier than later. This is an opportunity to expand
> > the number of active organisations involved directly in the Apache
> > Cassandra project, as well as to more quickly expand the project's
> > functionality into an area we consider urgent and important. I think it
> > would be a real shame to waste this opportunity. No doubt it will be hard,
> > as organisations have certain built-in investments in their own approaches.
> >
> > I haven't participated in these calls as I do not consider myself to have
> > the relevant experience and expertise, and have other focuses on the
> > project. I just wanted to voice a vote in favour of trying to bring the
> > different organisations together on a single approach if possible. Is there
> > anything the project can do to help this happen?
> >
> > On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
> >
> > I think there is certainly an appetite to donate and standardise on a
> > given operator (as mentioned in this thread).
> >
> > I personally found the SIG hard to participate in due to time zones and
> > the synchronous nature of it.
> >
> > So while it was a great forum to dive into certain details for a subset of
> > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> > reflection of community intent.
> >
> > I don't think that any participants want to continue down the path of "let
> > a thousand flowers bloom". That's why we are looking towards CassKop (as
> > well as a number of technical reasons).
> >
> > Some of the recorded meetings and outputs can also be found if you are
> > interested in some primary sources
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> > .
> >
> > From what I understand second-hand from talking to people on the SIG
> > calls,
> >
> > there was a general inability to agree on an existing operator as a
> > starting point and not much engagement on taking best of breed from the
> > various to combine them. Seems to leave us in the "let a thousand flowers
> > bloom" stage of letting operators grow in the ecosystem and seeing which
> > ones meet the needs of end users before talking about adopting one into the
> > foundation.
> >
> > Great to hear that you folks are joining forces though! Bodes well for C*
> > users that are wanting to run things on k8s.
> >
> > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com>
> > wrote:
> >
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

I can explain quite a bit of the history of why we are in this situation today if you want, but the important question is:
Who is willing to donate their operator, and control over its future, to the community?
- Orange is, with CassKop, as soon as we release v1, which should be quite soon.
- Who else, and when?

Then we can compare the features :)


> On 23 Sep 2020, at 15:27, Joshua McKenzie <jm...@apache.org> wrote:
> 
>> I think there's significant value to the community in trying to coalesce
>> on a single approach,
> I agree. Unfortunately in this case, the parties with a vested interest and
> written operators came to the table and couldn't agree to coalesce on a
> single approach. John Sanda attempted to start an initiative to write a
> best-of-breed combining choice parts of each operator, but that effort did
> not gain traction.
> 
> Which is where my hypothesis comes from that if there were a clear "better
> fit" operator to start from we wouldn't be in a deadlock; the correct
> choice would be obvious. Reasonably so, every engineer that's written
> something is going to want that something to be used and not thrown away in
> favor of another something without strong evidence as to why that's the
> better choice.
> 
> As far as I know, nobody has made a clear case as to a more compelling
> place to start in terms of an operator donation the project then
> collaborates on. There's no mass adoption evidence nor feature enumeration
> that I know of for any of the approaches anyone's taken, so the discussions
> remain stalled.
> 
> 
> 
> On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
> wrote:
> 
>> I think there's significant value to the community in trying to coalesce
>> on a single approach, earlier than later. This is an opportunity to expand
>> the number of active organisations involved directly in the Apache
>> Cassandra project, as well as to more quickly expand the project's
>> functionality into an area we consider urgent and important. I think it
>> would be a real shame to waste this opportunity. No doubt it will be hard,
>> as organisations have certain built-in investments in their own approaches.
>> 
>> I haven't participated in these calls as I do not consider myself to have
>> the relevant experience and expertise, and have other focuses on the
>> project. I just wanted to voice a vote in favour of trying to bring the
>> different organisations together on a single approach if possible. Is there
>> anything the project can do to help this happen?
>> 
>> On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
>> 
>> I think there is certainly an appetite to donate and standardise on a
>> given operator (as mentioned in this thread).
>> 
>> I personally found the SIG hard to participate in due to time zones and
>> the synchronous nature of it.
>> 
>> So while it was a great forum to dive into certain details for a subset of
>> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
>> reflection of community intent.
>> 
>> I don't think that any participants want to continue down the path of "let
>> a thousand flowers bloom". That's why we are looking towards CassKop (as
>> well as a number of technical reasons).
>> 
>> Some of the recorded meetings and outputs can also be found if you are
>> interested in some primary sources
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
>> .
>> 
>> From what I understand second-hand from talking to people on the SIG
>> calls,
>> 
>> there was a general inability to agree on an existing operator as a
>> starting point and not much engagement on taking best of breed from the
>> various to combine them. Seems to leave us in the "let a thousand flowers
>> bloom" stage of letting operators grow in the ecosystem and seeing which
>> ones meet the needs of end users before talking about adopting one into the
>> foundation.
>> 
>> Great to hear that you folks are joining forces though! Bodes well for C*
>> users that are wanting to run things on k8s.
>> 
>> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com>
>> wrote:
>> 
>> For what it's worth, a quick update from me:
>> 
>> CassKop now has at least two organisations working on it substantially
>> (Orange and Instaclustr) as well as the numerous other contributors.
>> 
>> Internally we will also start pointing others towards CassKop once a few
>> things get merged. While we are not sunsetting our operator yet, it is
>> certainly looking that way.
>> 
>> I'd love to see the community adopt it as a starting point for working
>> towards whatever level of functionality is desired.
>> 
>> Cheers
>> 
>> Ben
>> 
>> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
>> 
>> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
>> wrote:
>> 
>> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
>> operators in the ecosystem. Has one of them hit a clear supermajority of
>> adoption that makes it the de facto default and makes sense to pull it into
>> the project?
>> 
>> We as a project community were pretty slow to move on building a PoV around
>> kubernetes so we find ourselves in a situation with a bunch of contenders
>> for inclusion in the project. It's not clear to me what heuristics we'd use
>> to gauge which one would be the best fit for inclusion outside letting
>> community adoption speak.
>> 
>> ---
>> Josh McKenzie
>> 
>> We actually talked a good bit on the SIG call earlier today about
>> heuristics. We need to document what functionality an operator should
>> include at level 0, level 1, etc. We did discuss this a good bit during
>> some of the initial SIG meetings, but I guess it wasn't really a focal
>> point at the time. I think we should also provide references to existing
>> operator projects and possibly other related projects. This would benefit
>> both community users as well as people working on these projects.
>> 
>> - John
>> 
>> --
>> 
>> Ben Bromhead
>> 
>> Instaclustr | www.instaclustr.com | @instaclustr
>> <http://twitter.com/instaclustr> | (650) 284 9692
>> 
>> --
>> 
>> Ben Bromhead
>> 
>> Instaclustr | www.instaclustr.com | @instaclustr
>> <http://twitter.com/instaclustr> | (650) 284 9692
>> 
>> 





Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

> I think there's significant value to the community in trying to coalesce
> on a single approach,
I agree. Unfortunately in this case, the parties with a vested interest and
written operators came to the table and couldn't agree to coalesce on a
single approach. John Sanda attempted to start an initiative to write a
best-of-breed combining choice parts of each operator, but that effort did
not gain traction.

Which is where my hypothesis comes from that if there were a clear "better
fit" operator to start from we wouldn't be in a deadlock; the correct
choice would be obvious. Reasonably so, every engineer that's written
something is going to want that something to be used and not thrown away in
favor of another something without strong evidence as to why that's the
better choice.

As far as I know, nobody has made a clear case as to a more compelling
place to start in terms of an operator donation the project then
collaborates on. There's no mass adoption evidence nor feature enumeration
that I know of for any of the approaches anyone's taken, so the discussions
remain stalled.



On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith <benedict@apache.org>
wrote:

> I think there's significant value to the community in trying to coalesce
> on a single approach, earlier than later. This is an opportunity to expand
> the number of active organisations involved directly in the Apache
> Cassandra project, as well as to more quickly expand the project's
> functionality into an area we consider urgent and important. I think it
> would be a real shame to waste this opportunity. No doubt it will be hard,
> as organisations have certain built-in investments in their own approaches.
>
> I haven't participated in these calls as I do not consider myself to have
> the relevant experience and expertise, and have other focuses on the
> project. I just wanted to voice a vote in favour of trying to bring the
> different organisations together on a single approach if possible. Is there
> anything the project can do to help this happen?
>
> On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:
>
> I think there is certainly an appetite to donate and standardise on a
> given operator (as mentioned in this thread).
>
> I personally found the SIG hard to participate in due to time zones and
> the synchronous nature of it.
>
> So while it was a great forum to dive into certain details for a subset of
> participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> reflection of community intent.
>
> I don't think that any participants want to continue down the path of "let
> a thousand flowers bloom". That's why we are looking towards CassKop (as
> well as a number of technical reasons).
>
> Some of the recorded meetings and outputs can also be found if you are
> interested in some primary sources
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> .
>
> From what I understand second-hand from talking to people on the SIG
> calls,
>
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com>
> wrote:
>
> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Benedict Elliott Smith <be...@apache.org>.

I think there's significant value to the community in trying to coalesce on a single approach, earlier than later.  This is an opportunity to expand the number of active organisations involved directly in the Apache Cassandra project, as well as to more quickly expand the project's functionality into an area we consider urgent and important.  I think it would be a real shame to waste this opportunity.  No doubt it will be hard, as organisations have certain built-in investments in their own approaches.

I haven't participated in these calls as I do not consider myself to have the relevant experience and expertise, and have other focuses on the project.  I just wanted to voice a vote in favour of trying to bring the different organisations together on a single approach if possible.  Is there anything the project can do to help this happen?
 

On 23/09/2020, 03:04, "Ben Bromhead" <be...@instaclustr.com> wrote:

    I think there is certainly an appetite to donate and standardise on a given
    operator (as mentioned in this thread).

    I personally found the SIG hard to participate in due to time zones and the
    synchronous nature of it.

    So while it was a great forum to dive into certain details for a subset of
    participants and a worthwhile endeavour, I wouldn't paint it as an accurate
    reflection of community intent.

    I don't think that any participants want to continue down the path of "let
    a thousand flowers bloom". That's why we are looking towards CassKop (as
    well as a number of technical reasons).

    Some of the recorded meetings and outputs can also be found if you are
    interested in some primary sources
    https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
    .

    From what I understand second-hand from talking to people on the SIG calls,
    > there was a general inability to agree on an existing operator as a
    > starting point and not much engagement on taking best of breed from the
    > various to combine them. Seems to leave us in the "let a thousand flowers
    > bloom" stage of letting operators grow in the ecosystem and seeing which
    > ones meet the needs of end users before talking about adopting one into the
    > foundation.
    >
    > Great to hear that you folks are joining forces though! Bodes well for C*
    > users that are wanting to run things on k8s.
    >
    >
    >
    > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com> wrote:
    >
    > > For what it's worth, a quick update from me:
    > >
    > > CassKop now has at least two organisations working on it substantially
    > > (Orange and Instaclustr) as well as the numerous other contributors.
    > >
    > > Internally we will also start pointing others towards CassKop once a few
    > > things get merged. While we are not sunsetting our operator yet, it is
    > > certainly looking that way.
    > >
    > > I'd love to see the community adopt it as a starting point for working
    > > towards whatever level of functionality is desired.
    > >
    > > Cheers
    > >
    > > Ben
    > >
    > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
    > >
    > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
    > > wrote:
    > >
    > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
    > > operators in the ecosystem. Has one of them hit a clear supermajority of
    > > adoption that makes it the de facto default and makes sense to pull it into
    > > the project?
    > >
    > > We as a project community were pretty slow to move on building a PoV around
    > > kubernetes so we find ourselves in a situation with a bunch of contenders
    > > for inclusion in the project. It's not clear to me what heuristics we'd use
    > > to gauge which one would be the best fit for inclusion outside letting
    > > community adoption speak.
    > >
    > > ---
    > > Josh McKenzie
    > >
    > > We actually talked a good bit on the SIG call earlier today about
    > > heuristics. We need to document what functionality an operator should
    > > include at level 0, level 1, etc. We did discuss this a good bit during
    > > some of the initial SIG meetings, but I guess it wasn't really a focal
    > > point at the time. I think we should also provide references to existing
    > > operator projects and possibly other related projects. This would benefit
    > > both community users as well as people working on these projects.
    > >
    > > - John
    > >
    > > --
    > >
    > > Ben Bromhead
    > >
    > > Instaclustr | www.instaclustr.com | @instaclustr
    > > <http://twitter.com/instaclustr> | (650) 284 9692
    > >
    >


    -- 

    Ben Bromhead

    Instaclustr | www.instaclustr.com | @instaclustr
    <http://twitter.com/instaclustr> | (650) 284 9692




Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Ben Bromhead <be...@instaclustr.com>.

I think there is certainly an appetite to donate and standardise on a given
operator (as mentioned in this thread).

I personally found the SIG hard to participate in due to time zones and the
synchronous nature of it.

So while it was a great forum to dive into certain details for a subset of
participants and a worthwhile endeavour, I wouldn't paint it as an accurate
reflection of community intent.

I don't think that any participants want to continue down the path of "let
a thousand flowers bloom". That's why we are looking towards CassKop (as
well as a number of technical reasons).

Some of the recorded meetings and outputs can also be found if you are
interested in some primary sources
https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
.

From what I understand second-hand from talking to people on the SIG calls,
> there was a general inability to agree on an existing operator as a
> starting point and not much engagement on taking best of breed from the
> various to combine them. Seems to leave us in the "let a thousand flowers
> bloom" stage of letting operators grow in the ecosystem and seeing which
> ones meet the needs of end users before talking about adopting one into the
> foundation.
>
> Great to hear that you folks are joining forces though! Bodes well for C*
> users that are wanting to run things on k8s.
>
>
>
> On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com> wrote:
>
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Joshua McKenzie <jm...@apache.org>.

> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.

From what I understand second-hand from talking to people on the SIG calls,
there was a general inability to agree on an existing operator as a
starting point and not much engagement on taking best of breed from the
various to combine them. Seems to leave us in the "let a thousand flowers
bloom" stage of letting operators grow in the ecosystem and seeing which
ones meet the needs of end users before talking about adopting one into the
foundation.

Great to hear that you folks are joining forces though! Bodes well for C*
users that are wanting to run things on k8s.



On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <be...@instaclustr.com> wrote:

> For what it's worth, a quick update from me:
>
> CassKop now has at least two organisations working on it substantially
> (Orange and Instaclustr) as well as the numerous other contributors.
>
> Internally we will also start pointing others towards CassKop once a few
> things get merged. While we are not sunsetting our operator yet, it is
> certainly looking that way.
>
> I'd love to see the community adopt it as a starting point for working
> towards whatever level of functionality is desired.
>
> Cheers
>
> Ben
>
> On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
> wrote:
>
> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Ben Bromhead <be...@instaclustr.com>.

For what it's worth, a quick update from me:

CassKop now has at least two organisations working on it substantially
(Orange and Instaclustr) as well as the numerous other contributors.

Internally we will also start pointing others towards CassKop once a few
things get merged. While we are not sunsetting our operator yet, it is
certainly looking that way.

I'd love to see the community adopt it as a starting point for working
towards whatever level of functionality is desired.

Cheers

Ben



On Fri, Sep 11, 2020 at 2:37 PM John Sanda <jo...@gmail.com> wrote:

> On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org>
> wrote:
>
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> >
> >
> We actually talked a good bit on the SIG call earlier today about
> heuristics. We need to document what functionality an operator should
> include at level 0, level 1, etc. We did discuss this a good bit during
> some of the initial SIG meetings, but I guess it wasn't really a focal
> point at the time. I think we should also provide references to existing
> operator projects and possibly other related projects. This would benefit
> both community users as well as people working on these projects.
>
> - John
>


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by John Sanda <jo...@gmail.com>.

On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jm...@apache.org> wrote:

> There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> operators in the ecosystem. Has one of them hit a clear supermajority of
> adoption that makes it the de facto default and makes sense to pull it into
> the project?
>
> We as a project community were pretty slow to move on building a PoV around
> kubernetes so we find ourselves in a situation with a bunch of contenders
> for inclusion in the project. It's not clear to me what heuristics we'd use
> to gauge which one would be the best fit for inclusion outside letting
> community adoption speak.
>
> ---
> Josh McKenzie
>
>
>
We actually talked a good bit on the SIG call earlier today about
heuristics. We need to document what functionality an operator should
include at level 0, level 1, etc. We did discuss this a good bit during
some of the initial SIG meetings, but I guess it wasn't really a focal
point at the time. I think we should also provide references to existing
operator projects and possibly other related projects. This would benefit
both community users as well as people working on these projects.

- John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by Josh McKenzie <jm...@apache.org>.

There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
operators in the ecosystem. Has one of them hit a clear supermajority of
adoption that makes it the de facto default and makes sense to pull it into
the project?

We as a project community were pretty slow to move on building a PoV around
kubernetes so we find ourselves in a situation with a bunch of contenders
for inclusion in the project. It's not clear to me what heuristics we'd use
to gauge which one would be the best fit for inclusion outside letting
community adoption speak.

---
Josh McKenzie



On Thu, Sep 10, 2020 at 10:58 AM, <fr...@orange.com> wrote:

> Hi,
>
> Thanks John for your efforts in setting up the repo and in the SIG
> meetings in general :)
>
> As the team already in charge of CassKop, we did not participate in the
> code in your repo for different reasons:
> - we never said we would. We discussed the CRD in the SIG meetings and our
> objective was to check whether CassKop’s choices were good or not, and they
> were good most of the time.
> - we were finishing V1 of CassKop with backup and restore functionality.
> This is almost done, we are merging it as I am writing this.
> - we want to offer CassKop to the community when V1 is released as working
> code if it wants it. I mentioned this a few times in the videos so this is
> not news.
>
> An operator is a lot of work, don’t underestimate it!
>
> We believe not starting from scratch is better but this is only our
> opinion. Should I go formal on this offer and have a dedicated thread as
> was done for the drivers?
>
> Sincerely hope this helps
> Franck
> Product Owner of CassKop @ Orange
> https://github.com/Orange-OpenSource/casskop
> (Yes we’ll fix the vulnerability once the big merge is done :) )
>
> On 10 Sep 2020, at 05:10, John Sanda <jo...@gmail.com> wrote:
>
> Hey everyone,
>
> A while back I started https://github.com/jsanda/cassandra-operator in an
> effort to move things forward. One of my primary goals was to get some
> people contributing. That did not happen, which is understandable. I am
> going to throw out some questions and would love to get feedback,
> particularly from people who have been participating in the SIG and/or are
> involved with relevant projects.
>
> * Should we continue down the path of trying to build a common operator
> project? If yes, how should we proceed?
>
> * Should we broaden the focus to using and running Cassandra in Kubernetes
> in general? CEP 2 Kubernetes Operator
> <https://cwiki.apache.org/confluence/display/CASSANDRA/
> CEP-2+Kubernetes+Operator> says
> that its motivation is to make it easy to run Cassandra on Kubernetes.
> Having an operator is definitely a big part of that, but it is not the only
> part. There are important areas like application development, data
> migration, multi-region / multi-cloud clusters, and tooling integration to
> name a few. I think that the community benefits from collaboration
> regardless of whether or not there is a common operator. That is not to
> suggest that a common operator would not be good. I do not necessarily see
> it as a zero sum game where it has to be a common operator or nothing.
>
> Thanks
>
> - John
>
> _________________________________________________________________________________________________________________________
>
>
> Ce message et ses pieces jointes peuvent contenir des informations
> confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> exploites ou copies sans autorisation. Si vous avez recu ce message par
> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
> pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou
> falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law; they should not be distributed,
> used or copied without authorisation. If you have received this email in
> error, please notify the sender and delete this message and its
> attachments. As emails may be altered, Orange is not liable for messages
> that have been modified, changed or falsified. Thank you.
>

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by John Sanda <jo...@gmail.com>.

On Thu, Sep 10, 2020 at 10:58 AM <fr...@orange.com> wrote:

> Hi,
>
> Thanks John for your efforts in setting up the repo and in the SIG
> meetings in general :)
>
> As the team already in charge of CassKop, we did not participate in the
> code in your repo for different reasons:
> - we never said we would. We discussed the CRD in the SIG meetings and our
> objective was to check whether CassKop’s choices were good or not, and they
> were good most of the time.
> - we were finishing V1 of CassKop with backup and restore functionality.
> This is almost done, we are merging it as I am writing this.
> - we want to offer CassKop to the community when V1 is released as working
> code if it wants it. I mentioned this a few times in the videos so this is
> not news.
>

Thanks Franck, and I totally understand about the lack of participation in
my prototype~ish repo. I figured it was a long shot at best to get any
involvement. I felt like I needed to try something to move things forward.
If nothing else, I learned more about kustomize which I have put to good
use in some other work ;-)

- John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

Hi,

Thanks John for your efforts in setting up the repo and in the SIG meetings in general :)

As the team already in charge of CassKop, we did not participate in the code in your repo for different reasons:
- we never said we would. We discussed the CRD in the SIG meetings and our objective was to check whether CassKop’s choices were good or not, and they were good most of the time.
- we were finishing V1 of CassKop with backup and restore functionality. This is almost done, we are merging it as I am writing this.
- we want to offer CassKop to the community when V1 is released as working code if it wants it. I mentioned this a few times in the videos so this is not news.

An operator is a lot of work, don’t underestimate it!

We believe not starting from scratch is better but this is only our opinion.
Should I go formal on this offer and have a dedicated thread as was done for the drivers?

Sincerely hope this helps
Franck
Product Owner of CassKop @ Orange
https://github.com/Orange-OpenSource/casskop
(Yes we’ll fix the vulnerability once the big merge is done :) )

> On 10 Sep 2020, at 05:10, John Sanda <jo...@gmail.com> wrote:
> 
> Hey everyone,
> 
> A while back I started https://github.com/jsanda/cassandra-operator in an
> effort to move things forward. One of my primary goals was to get some
> people contributing. That did not happen, which is understandable. I am
> going to throw out some questions and would love to get feedback,
> particularly from people who have been participating in the SIG and/or are
> involved with relevant projects.
> 
> * Should we continue down the path of trying to build a common operator
> project? If yes, how should we proceed?
> 
> * Should we broaden the focus to using and running Cassandra in Kubernetes
> in general? CEP 2 Kubernetes Operator
> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-2+Kubernetes+Operator>
> says
> that its motivation is to make it easy to run Cassandra on Kubernetes.
> Having an operator is definitely a big part of that, but it is not the only
> part. There are important areas like application development, data
> migration, multi-region / multi-cloud clusters, and tooling integration to
> name a few. I think that the community benefits from collaboration
> regardless of whether or not there is a common operator. That is not to
> suggest that a common operator would not be good. I do not necessarily see
> it as a zero sum game where it has to be a common operator or nothing.
> 
> Thanks
> 
> - John

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by "Tolbert, Andy" <x...@andrewtolbert.com>.

Hi John,

Thank you for your efforts on getting this bootstrapped!  I have been
meaning to try getting involved for months and appreciate that the SIG
has been recording sessions and taking down notes.

> * Should we continue down the path of trying to build a common operator
> project? If yes, how should we proceed?

I definitely think there is a lot of value having a common operator
project. The path to contributing to an operator will be much clearer
for some if it's an Apache project.

I see in your email from Aug 5 that you mentioned a goal of:

> Ramp up a project that can eventually be considered for adoption within ASF, presumably as a subproject of Cassandra

Does there need to be much code established before it gets fully
proposed as a subproject and brought under the apache organization?
From what I recall cassandra-sidecar [1] was established as an apache
project before there was a lot of code written.

I'll be looking to provide feedback to the repository you've set up
soon, and hope I can contribute to getting this accepted as a
subproject in any way that I can.

> * Should we broaden the focus to using and running Cassandra in Kubernetes
> in general?

Not everyone running Cassandra in K8S is using an operator, so that
could be a good idea.  I'm curious whether that would increase
participation, but it does seem like there is a large enough
class of issues/improvements specific to running Cassandra on
Kubernetes that they'd be worth discussing in the SIG.

[1]: https://github.com/apache/cassandra-sidecar

Thanks,
Andy

Re: [DISCUSS] Next steps for Kubernetes operator SIG

Posted by fr...@orange.com.

Sorry, I forgot to mention that we finished the backup/restore with the help of Instaclustr! Sorry guys!

> On 10 Sep 2020, at 16:58, DEHAY Franck DTSI/DSI <fr...@orange.com> wrote:
> 
> Hi,
> 
> Thanks John for your efforts in setting up the repo and in the SIG meetings in general :)
> 
> As the team already in charge of CassKop, we did not participate in the code in your repo for different reasons:
> - we never said we would. We discussed the CRD in the SIG meetings and our objective was to check whether CassKop’s choices were good or not, and they were good most of the time.
> - we were finishing V1 of CassKop with backup and restore functionality. This is almost done, we are merging it as I am writing this.
> - we want to offer CassKop to the community when V1 is released as working code if it wants it. I mentioned this a few times in the videos so this is not news.
> 
> An operator is a lot of work, don’t underestimate it!
> 
> We believe not starting from scratch is better but this is only our opinion.
> Should I go formal on this offer and have a dedicated thread as was done for the drivers?
> 
> Sincerely hope this helps
> Franck
> Product Owner of CassKop @ Orange
> https://github.com/Orange-OpenSource/casskop
> (Yes we’ll fix the vulnerability once the big merge is done :) )
> 
>> On 10 Sep 2020, at 05:10, John Sanda <jo...@gmail.com> wrote:
>> 
>> Hey everyone,
>> 
>> A while back I started https://github.com/jsanda/cassandra-operator in an
>> effort to move things forward. One of my primary goals was to get some
>> people contributing. That did not happen, which is understandable. I am
>> going to throw out some questions and would love to get feedback,
>> particularly from people who have been participating in the SIG and/or are
>> involved with relevant projects.
>> 
>> * Should we continue down the path of trying to build a common operator
>> project? If yes, how should we proceed?
>> 
>> * Should we broaden the focus to using and running Cassandra in Kubernetes
>> in general? CEP 2 Kubernetes Operator
>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-2+Kubernetes+Operator>
>> says
>> that its motivation is to make it easy to run Cassandra on Kubernetes.
>> Having an operator is definitely a big part of that, but it is not the only
>> part. There are important areas like application development, data
>> migration, multi-region / multi-cloud clusters, and tooling integration to
>> name a few. I think that the community benefits from collaboration
>> regardless of whether or not there is a common operator. That is not to
>> suggest that a common operator would not be good. I do not necessarily see
>> it as a zero sum game where it has to be a common operator or nothing.
>> 
>> Thanks
>> 
>> - John
> 

