Posted to user@cassandra.apache.org by Kenneth Brotman <ke...@yahoo.com.INVALID> on 2019/03/28 08:55:23 UTC

Five Questions for Cassandra Users

I'm looking to get a better feel for how people use Cassandra in practice.
I thought others would benefit as well, so may I ask you the following five
questions:

1. Do the same people where you work operate the cluster and write the
code to develop the application?

2. Do you have a metrics stack that allows you to see graphs of various
metrics with all the nodes displayed together?

3. Do you have a log stack that allows you to see the logs for all the
nodes together?

4. Do you regularly repair your clusters - such as by using Reaper?

5. Do you use artificial intelligence to help manage your clusters?

Thank you for taking your time to share this information!

Kenneth Brotman


Re: Five Questions for Cassandra Users

Posted by Jonathan Koppenhofer <jo...@koppedomain.com>.
1. PaaS model. We have a team responsible for the deployment and
self-service tooling, as well as SME for both development and Cassandra
operations. End users consume the service, and are responsible for app
development and operations. Larger apps have separate teams for this,
smaller apps have a single team for both.

2. Homegrown with custom agent piping stats to a Cassandra cluster. Grafana
with custom http reader to read metrics from homegrown API. If it would
have existed when we first did this, we probably would have worked in
Prometheus.

3. Yes. ELK and/or Splunk

4. Used homegrown repair mechanism before 2.2. Now use reaper. PaaS
consumers responsible for configuring repairs.

5. No. Need to get better here, but "real" AI seems to be a new trend we
have seen talked about on this list.



Re: Five Questions for Cassandra Users

Posted by Elliott Sims <el...@backblaze.com>.
1.       Do the same people where you work operate the cluster and write
the code to develop the application?

Mostly.  Ops vs dev, although there's some overlap

2.       Do you have a metrics stack that allows you to see graphs of
various metrics with all the nodes displayed together?

 Yes, Prometheus+Grafana (currently custom script reporting to Prometheus,
but that needs revisiting)

3.       Do you have a log stack that allows you to see the logs for all
the nodes together?

 Yep, graylog.

4.       Do you regularly repair your clusters - such as by using Reaper?

 Yes, with reaper.  Every day or two, more or less.  It would be
almost-constant if Reaper could work off queues with blacklisted time
windows instead of a schedule

5.       Do you use artificial intelligence to help manage your clusters?

No.


Re: Five Questions for Cassandra Users

Posted by Tom van der Woerdt <to...@booking.com.INVALID>.
1.       Do the same people where you work operate the cluster and write
the code to develop the application?

No, we have a small infrastructure team, and many people developing
applications using Cassandra

2.       Do you have a metrics stack that allows you to see graphs of
various metrics with all the nodes displayed together?

Yes, we use a re-implementation of Graphite, which we open-sourced and now
lives at https://github.com/go-graphite

3.       Do you have a log stack that allows you to see the logs for all
the nodes together?

Yes, although in practice we don't use it much for Cassandra

4.       Do you regularly repair your clusters - such as by using Reaper?

Yes, we have built our own tools for this

5.       Do you use artificial intelligence to help manage your clusters?

It's not "artificial intelligence" the way most people would describe it,
but we certainly don't run our clusters manually



Tom van der Woerdt
Site Reliability Engineer

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)



RE: Five Questions for Cassandra Users

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
Yes, absolutely!

 

From: Jonathan Koppenhofer [mailto:jon@koppedomain.com] 
Sent: Thursday, March 28, 2019 1:18 PM
To: user@cassandra.apache.org
Subject: Re: Five Questions for Cassandra Users

 

I think it would also be interesting to hear how people are handling automation (which to me is different than AI) and config management.

 

For us it is a combo of custom Java workflows and Saltstack.

 



Re: Five Questions for Cassandra Users

Posted by Jonathan Koppenhofer <jo...@koppedomain.com>.
I think it would also be interesting to hear how people are handling
automation (which to me is different than AI) and config management.

For us it is a combo of custom Java workflows and Saltstack.


Re: Five Questions for Cassandra Users

Posted by Jürgen Albersdorfer <ja...@gmail.com>.
1.       Do the same people where you work operate the cluster and write
the code to develop the application?

yes

2.       Do you have a metrics stack that allows you to see graphs of
various metrics with all the nodes displayed together?

 no

3.       Do you have a log stack that allows you to see the logs for all
the nodes together?

 no

4.       Do you regularly repair your clusters - such as by using Reaper?

 no, never did - try to stay away from ever doing.

5.       Do you use artificial intelligence to help manage your clusters?

no


Re: Five Questions for Cassandra Users

Posted by Abhishek Singh <ab...@gmail.com>.
1.       Do the same people where you work operate the cluster and write
the code to develop the application?

Different teams. Infra separate, Dev separate.

2.       Do you have a metrics stack that allows you to see graphs of
various metrics with all the nodes displayed together?

We use a third-party APM tool to monitor the cluster.

3.       Do you have a log stack that allows you to see the logs for all
the nodes together?

No. Would like to.

4.       Do you regularly repair your clusters - such as by using Reaper?

Yes

5.       Do you use artificial intelligence to help manage your clusters?

No


Re: Five Questions for Cassandra Users

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hello,

I'm no longer operating "my" own cluster, but now doing consulting for TLP.
Here is what I would say from my own experience:

> 1.       Do the same people where you work operate the cluster and write
> the code to develop the application?
>

Operating a cluster does not require the same set of skills as using the
driver, writing queries and developing the needed code.
Some people have all the required skills, yet the amount of work one can
do in a day is limited anyway.

Some thoughts:
- The design/model should always be done with all the people involved in
the feature/project.
--- Operators know the best practices and ultimately bear the
responsibility for keeping it working. They are the fire
extinguishers and should help build the house, because they know what
will burn and what is reliable.
--- Devs are the most qualified to build the code, interact with the
drivers and write the code (and tests) needed to query
Cassandra in the 'right way', as defined together with operators.
--- Lawyers/Legal teams can help answer questions around the TTL
(Time To Live) to use. Not much data needs to live forever, and setting a
TTL is a good way to keep the data size under control.
--- If the same people take care of both DEV and OPS (start-ups, small
teams generally), it's good for this team to have at least 2 people. One
person alone cannot exchange ideas, nor be up 24/7...

- A team of operators that knows just the basics can do level 1 support if
procedures are well documented and the proper tooling is in place. There
is a fair amount of repetitive work, where the 'protocol' to react to X or
Y is often the same. Ultimately, they can escalate to the people
who are responsible for the cluster.
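
To make the TTL point above concrete, here is a minimal sketch in Python. It
only converts an agreed retention period into the whole-second value that
CQL's USING TTL clause takes and builds a parameterised INSERT statement. The
table name, column names and the 90-day retention are hypothetical, chosen for
illustration only.

```python
from datetime import timedelta


def ttl_seconds(retention: timedelta) -> int:
    """Convert a retention period into whole seconds for CQL's USING TTL."""
    seconds = int(retention.total_seconds())
    if seconds <= 0:
        raise ValueError("retention must be positive")
    return seconds


def insert_with_ttl(table: str, columns: list[str], retention: timedelta) -> str:
    """Build a parameterised CQL INSERT whose rows expire after `retention`."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return (
        f"INSERT INTO {table} ({cols}) VALUES ({marks}) "
        f"USING TTL {ttl_seconds(retention)}"
    )


# Hypothetical table and a hypothetical 90-day retention agreed with legal.
print(insert_with_ttl("app.events", ["id", "payload"], timedelta(days=90)))
```

The resulting string would typically be prepared once with whichever driver
you use and executed with bound values. A table-level default TTL is an
alternative when every row shares the same retention.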


> 2.       Do you have a metrics stack that allows you to see graphs of
> various metrics with all the nodes displayed together?
>

I definitely recommend and advocate for this. It is the best way to get a
feeling for the health of your cluster at first sight, to understand the
patterns and the bottlenecks, to see the impact of optimisations and to
diagnose issues efficiently.

We built Datadog default dashboards to help people using Datadog to monitor
their Cassandra clusters. The release post is here:
http://thelastpickle.com/blog/2017/12/05/datadog-tlp-dashboards.html
Also if you prefer videos, here is what I think about why, what and how to
monitor: https://www.youtube.com/watch?v=Q9AAR4UQzMk

If you're not using Datadog, there are Grafana dashboards available and
Prometheus metric exporters as well.
- "grafana cassandra dashboards
<https://www.google.com/search?client=safari&rls=en&q=grafana+cassandra+dashboards&ie=UTF-8&oe=UTF-8>"
on any search engine should give you a few options
- https://github.com/instaclustr/cassandra-exporter

> 3.       Do you have a log stack that allows you to see the logs for all
> the nodes together?
>

I would say it's a 'should-have', as opposed to a good monitoring system,
for example, which is a 'must-have'. I never had one or really used one,
despite the fact that as a consultant I worked on multiple clusters.

If you have it in place for other services, then maybe just plug in the C*
nodes as well. It will help you if a machine becomes completely unreachable,
or to easily aggregate and compute statistics for the whole cluster. It can
be really nice. Just be aware of the amount of logs that Cassandra
generates and the debug level you want to have, and think about the
appropriate retention policy.

But it's definitely not the first thing I would care about, as tools allow
you to query all nodes through ssh to get information about each node.
Or you can always jump on a faulty node.
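
As an illustration of querying every node over ssh instead of running a log
stack, here is a rough Python sketch. It assumes passwordless ssh access to
each node and that logs live at /var/log/cassandra/system.log (the usual
location for package installs, but an assumption here). The function names
are made up for this example.

```python
import shlex
import subprocess

# Assumption: package-install default log location; adjust for your layout.
LOG_PATH = "/var/log/cassandra/system.log"


def ssh_grep_command(host, pattern, log_path=LOG_PATH):
    """Build an ssh command that counts log lines matching `pattern` on `host`."""
    remote = f"grep -c -- {shlex.quote(pattern)} {shlex.quote(log_path)}"
    return ["ssh", host, remote]


def count_matches(hosts, pattern):
    """Run the grep on every host; return {host: count, or None on failure}."""
    counts = {}
    for host in hosts:
        try:
            result = subprocess.run(
                ssh_grep_command(host, pattern),
                capture_output=True, text=True, timeout=30,
            )
        except (OSError, subprocess.TimeoutExpired):
            counts[host] = None
            continue
        out = result.stdout.strip()
        # grep -c prints a number even when it exits 1 (zero matches).
        counts[host] = int(out) if out.isdigit() else None
    return counts
```

Calling count_matches(["node1", "node2"], "ERROR") with your own host names
gives a quick per-node error count. It is no substitute for a real log stack,
but it covers the "which node is unhappy" question.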


> 4.       Do you regularly repair your clusters - such as by using Reaper?
>

Most people do, I believe, one way or another: with cron, in-house
tools, Reaper, or OSS scripts that handle "range repairs".
It is not mandatory as long as you do not delete data. It's maybe not
needed if you use strong consistency. I always like to do it regularly.
I like to think that my nodes have the same data and that entropy is as
low as possible. It always worked well for me, making me more confident
when operating the cluster (moving token ranges, forcefully removing a
node, etc.), and I did not lose data in 6 years (apart from counters, but
they were already known to be not 'accurate', not to say 'broken', by then),
despite the fact that I started with C* 0.8 (and the fresh first counters
implementation, yay!).
I would keep routine repairs as a good practice when they are useful
(deletes, reads that are not strongly consistent) but also when in theory
they are not needed, to help keep the data where it belongs, even though
Cassandra is now pretty resilient.

Yet some people are doing perfectly fine... until they run a first repair!
Be sure to read about it beforehand. With the default number of vnodes and
default repair options in older versions of Cassandra, you could really harm
your cluster.
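
For readers unfamiliar with the "range repairs" mentioned above, the idea is
to repair many small token sub-ranges one at a time instead of a node's whole
range at once. The Python sketch below only shows the splitting arithmetic and
prints nodetool commands using the -st / -et (start/end token) options. The
keyspace name and the chosen range are hypothetical. In practice the ranges to
split are the ones each node actually owns, and tools such as Reaper take care
of this for you.

```python
# Token bounds of the Murmur3 partitioner.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous sub-ranges."""
    if parts < 1 or start >= end:
        raise ValueError("need parts >= 1 and start < end")
    width = end - start
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, start, end, parts):
    """Build one `nodetool repair` command per sub-range."""
    return [
        f"nodetool repair -st {s} -et {e} {keyspace}"
        for s, e in split_range(start, end, parts)
    ]


# Hypothetical keyspace and range, for illustration only.
for cmd in repair_commands("my_keyspace", 0, 1000, 4):
    print(cmd)
```

Smaller sub-ranges keep each repair session short, which is the main reason
these scripts exist. How many parts to use depends on data size per node and
is left as a tuning choice.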


> 5.       Do you use artificial intelligence to help manage your clusters?



So far I have only used "human intelligence" (mine, the collective one from
this mailing list and my colleagues', really often) to manage my own and
other people's clusters ;-).

But there's a small part of what I do that I could trust a machine to do for
me, and better than I would. There are a lot of tools out there that do
"things" for us (Reaper, OpsCenter, in-house shared/OSS tools, the many tools
Netflix has open-sourced over the years, but also dashboards that you just
have to plug in, etc.) and that bring us some intelligence from other people.
It's not real AI per se, as it won't learn by itself, maybe. Also, there is
ongoing work to make operating Cassandra easier; search for "management
tool", off the top of my head...

I never used any AI to help manage my clusters. The closest thing I had
installed at some point was the OpsCenter adviser, multiple years ago.
But I knew more than 'it' (the AI) about Cassandra by then ;-). I
never had the chance to see a really great AI that would actually help me
with cluster management. That said, I do see the value of some tools in
helping people manage their clusters.

Other alternatives are fully managed Cassandra cluster services, if that's
of interest to you, and using the mailing list (as you did). Working with
consultants is another option (but I could be a bit biased in recommending
that you work with consultants ;-)).

Hope some of it helps (I always write too much ¯\_(ツ)_/¯),

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com





Re: Five Questions for Cassandra Users

Posted by Rahul Singh <ra...@gmail.com>.
Answers inline.


1.       Do the same people where you work operate the cluster and write
the code to develop the application?


No, but the operators need to know development, data modeling, and
generally how to "code" the application. (Coding is a low-level task of
assigning a code to a concept, so I don't think that's the proper verb in
these scenarios; engineering, software development, or even programming
is a better term.) It's because the developers are hired a dime a dozen at
the B / C level and then replaced by D / E / F level developers as things go
on, so the Data team eventually ends up being the expert on the
application and the data platform, and a "Center of Excellence" for the
development / architects to work with on a collaborative basis.



2.       Do you have a metrics stack that allows you to see graphs of
various metrics with all the nodes displayed together?



Yes. OpsCenter, ELK, Grafana, custom node data visualizers in Excel
(because lines and charts don't tell you everything)


3.       Do you have a log stack that allows you to see the logs for all
the nodes together?

ELK. CloudWatch


4.       Do you regularly repair your clusters - such as by using Reaper?

 Depends. Cron, Reaper, OpsCenter Repair, and now NodeSync


5.       Do you use artificial intelligence to help manage your clusters?


Yes, I actually have made an artificial general intelligence called
Gravitron. It learns by ingesting all the news articles I aggregate about
Cassandra and the links I curate on cassandra.link into a Solr/Lucene index
and then uses clustering to find the most popular and most connected
content. Once it does that, the content is summarized into
human-readable text as well as interpreted bash code that gets pushed
into a "Recipe Book." As the master operator identifies scenarios in
English and then runs the bash commands, the machine slowly but
surely "wakes up" and starts to manage itself. It can also play Go, the
game, and beat IBM's AlphaGo at Go, and Donald Trump at golf while he was
cheating!



rahul.xavier.singh@gmail.com

http://cassandra.link


































Happy April Fools' Day.




