Posted to user@cassandra.apache.org by John Lammers <jo...@karoshealth.com> on 2016/01/22 21:24:38 UTC

Production with Single Node

After deploying a number of production systems with up to 10 Cassandra
nodes each, we are looking at deploying a small, all-in-one-server system
with only a single, local node (Cassandra 2.1.11).

What are the risks of such a configuration?

The virtual disk would be running RAID 5 and the disk controller would have
a flash backed write-behind cache.

What's the best way to configure Cassandra and/or respecify the hardware
for an all-in-one-box solution?

Thanks-in-advance!

--John

Fwd: Production with Single Node

Posted by John Lammers <jo...@karoshealth.com>.
Thanks for your reply Jonathan.

We chose Cassandra for its incredible performance and robustness for large
sites.  Our application is designed from the ground up to take full
advantage of its column oriented data store (giving up the ability to also
run with a relational database backend).

The challenge now is a new market consisting of many small sites that
reportedly can't afford a multi-server solution.  These would be permanent,
one node systems.

--John

---------- Forwarded message ----------
From: Jonathan Haddad <jo...@jonhaddad.com>
Date: Fri, Jan 22, 2016 at 3:34 PM
Subject: Re: Production with Single Node
To: user@cassandra.apache.org


My opinion:
http://rustyrazorblade.com/2013/09/cassandra-faq-can-i-start-with-a-single-node/

TL;DR: the only reason to run 1 node in prod is if you're super broke but
know you'll need to scale up almost immediately after going to prod (maybe
after getting some funding).

If you're planning on doing it as a more permanent solution, you've chosen
the wrong database.

Re: Production with Single Node

Posted by Dan Kinder <dk...@turnitin.com>.
I could see this being desirable if you are deploying the exact same
application as you deploy in other places with many nodes, and you know the
load will be low. It may be a rare situation but in such a case you save
big effort by not having to change your application logic.

Not that I necessarily recommend it, but to answer John's question: my
understanding is that if you want to keep it snappy and low-latency, you
should watch out for GC pauses and consider your GC tuning carefully. With a
single node, a pause will cause the whole show to stop. Presumably your load
won't be very high.

Also if you are concerned with durability you may want to consider changing
commitlog_sync
<https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__commitlog_sync>
to batch. I believe this is the only way to guarantee write durability with
one node. Again with the performance caveat; under high load it could cause
problems.
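
For reference, a minimal sketch of the cassandra.yaml change described above,
assuming the 2.1-era option names. The batch window value is an illustrative
assumption, not a recommendation; check the cassandra.yaml shipped with your
release for its own defaults.

```yaml
# cassandra.yaml (sketch): acknowledge a write only after the commit log
# has been fsynced to disk.
commitlog_sync: batch
# How long Cassandra may wait to group other writes into the same fsync.
# Illustrative value only; tune it and re-run your benchmarks.
commitlog_sync_batch_window_in_ms: 2

# The periodic settings should be commented out when batch mode is enabled:
# commitlog_sync: periodic
# commitlog_sync_period_in_ms: 10000
```

A restart of the node is needed for the change to take effect, and write
latency will likely rise, so performance tests should be repeated afterwards.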

On Fri, Jan 22, 2016 at 12:34 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> My opinion:
> http://rustyrazorblade.com/2013/09/cassandra-faq-can-i-start-with-a-single-node/
>
> TL;DR: the only reason to run 1 node in prod is if you're super broke but
> know you'll need to scale up almost immediately after going to prod (maybe
> after getting some funding).
>
> If you're planning on doing it as a more permanent solution, you've chosen
> the wrong database.
>
> On Fri, Jan 22, 2016 at 12:30 PM Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> The risks would be about the same as with a single-node Postgres or MySQL
>> database, except that you wouldn't have the benefit of full SQL.
>>
>> How much data (rows, columns), what kind of load pattern (heavy write,
>> heavy update, heavy query), and what types of queries (primary key-only,
>> slices, filtering, secondary indexes, etc.)?
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
>> john.lammers@karoshealth.com> wrote:
>>
>>> After deploying a number of production systems with up to 10 Cassandra
>>> nodes each, we are looking at deploying a small, all-in-one-server system
>>> with only a single, local node (Cassandra 2.1.11).
>>>
>>> What are the risks of such a configuration?
>>>
>>> The virtual disk would be running RAID 5 and the disk controller would
>>> have a flash backed write-behind cache.
>>>
>>> What's the best way to configure Cassandra and/or respecify the hardware
>>> for an all-in-one-box solution?
>>>
>>> Thanks-in-advance!
>>>
>>> --John
>>>
>>>
>>


-- 
Dan Kinder
Principal Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com

Re: Production with Single Node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
My opinion:
http://rustyrazorblade.com/2013/09/cassandra-faq-can-i-start-with-a-single-node/

TL;DR: the only reason to run 1 node in prod is if you're super broke but
know you'll need to scale up almost immediately after going to prod (maybe
after getting some funding).

If you're planning on doing it as a more permanent solution, you've chosen
the wrong database.

On Fri, Jan 22, 2016 at 12:30 PM Jack Krupansky <ja...@gmail.com>
wrote:

> The risks would be about the same as with a single-node Postgres or MySQL
> database, except that you wouldn't have the benefit of full SQL.
>
> How much data (rows, columns), what kind of load pattern (heavy write,
> heavy update, heavy query), and what types of queries (primary key-only,
> slices, filtering, secondary indexes, etc.)?
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
> john.lammers@karoshealth.com> wrote:
>
>> After deploying a number of production systems with up to 10 Cassandra
>> nodes each, we are looking at deploying a small, all-in-one-server system
>> with only a single, local node (Cassandra 2.1.11).
>>
>> What are the risks of such a configuration?
>>
>> The virtual disk would be running RAID 5 and the disk controller would
>> have a flash backed write-behind cache.
>>
>> What's the best way to configure Cassandra and/or respecify the hardware
>> for an all-in-one-box solution?
>>
>> Thanks-in-advance!
>>
>> --John
>>
>>
>

Re: Production with Single Node

Posted by John Lammers <jo...@karoshealth.com>.
Thanks for your reply Jonathan.

We usually deploy clusters of application nodes running on a Cassandra
database cluster, often with two data centers.  Our application is married
to / designed for Cassandra and we can't support any relational database
without rearchitecting and rewriting a lot of code.

For these small sites, we need to scale *down*, not up.

Like it says in Sebastián's email signature "predictably scalable to any
size," only the size this time is smaller, not larger.

--
John Lammers | karoshealth

+1 519 594 0940 x225 | Skype: johnatkaros
7 Father David Bauer Drive
Waterloo, ON, N2L 0A2, Canada
www.karoshealth.com

On Fri, Jan 22, 2016 at 4:32 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> If you're going to go with a bunch of smaller, single node servers, use
> Postgres.  It's going to be more flexible with a smaller memory footprint.
> You could even use sqlite.
>
> Would you run a single node zookeeper cluster?   Single node map reduce?
> Single node HDFS?  I hope not.
>
> Cassandra's strengths are high availability and linear scalability.  If
> you're not planning on taking advantage of either of those you're using the
> wrong tool for the job.
>
> On Fri, Jan 22, 2016 at 1:25 PM Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> The value of cassandra is in its replication – as a single node solution,
>> it’s slower and less flexible than alternatives
>>
>> From: John Lammers
>> Reply-To: "user@cassandra.apache.org"
>> Date: Friday, January 22, 2016 at 12:57 PM
>> To: Cassandra Mailing List
>>
>> Subject: Fwd: Production with Single Node
>>
>> Thanks for your reply Sebastian.
>>
>> They are specialized data storage & retrieval systems.  The Cassandra
>> database is mainly used to store meta-data for searching.
>>
>> Jonathan, I had seen your article.  But what are some of the technical
>> reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
>> convince others.  Or failing that, what can be done to make this
>> configuration as safe & robust as possible?
>>
>> Thanks!
>>
>> --John
>>
>> ---------- Forwarded message ----------
>> From: Sebastian Estevez <se...@datastax.com>
>> Date: Fri, Jan 22, 2016 at 3:41 PM
>> Subject: Fwd: Production with Single Node
>> To: john.lammers@karoshealth.com
>>
>>
>> Hi John,
>>
>> Can you share a bit more about your use case? What's the purpose of these
>> little clusters? Jon has good points but I'm cautious to dismiss your idea
>> without hearing specifics about your plans.
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>>
>> Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com
>>
>> DataStax is the fastest, most scalable distributed database technology,
>> delivering Apache Cassandra to the world’s most innovative enterprises.
>> Datastax is built to be agile, always-on, and predictably scalable to any
>> size. With more than 500 customers in 45 countries, DataStax is the
>> database technology and transactional backbone of choice for the worlds
>> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>>
>> ---------- Forwarded message ----------
>> From: Jack Krupansky <ja...@gmail.com>
>> Date: Fri, Jan 22, 2016 at 3:30 PM
>> Subject: Re: Production with Single Node
>> To: user@cassandra.apache.org
>>
>>
>> The risks would be about the same as with a single-node Postgres or MySQL
>> database, except that you wouldn't have the benefit of full SQL.
>>
>> How much data (rows, columns), what kind of load pattern (heavy write,
>> heavy update, heavy query), and what types of queries (primary key-only,
>> slices, filtering, secondary indexes, etc.)?
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
>> john.lammers@karoshealth.com> wrote:
>>
>>> After deploying a number of production systems with up to 10 Cassandra
>>> nodes each, we are looking at deploying a small, all-in-one-server system
>>> with only a single, local node (Cassandra 2.1.11).
>>>
>>> What are the risks of such a configuration?
>>>
>>> The virtual disk would be running RAID 5 and the disk controller would
>>> have a flash backed write-behind cache.
>>>
>>> What's the best way to configure Cassandra and/or respecify the hardware
>>> for an all-in-one-box solution?
>>>
>>> Thanks-in-advance!
>>>
>>> --John
>>>
>>>
>>
>>
>>

Re: Production with Single Node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
If you're going to go with a bunch of smaller, single node servers, use
Postgres.  It's going to be more flexible with a smaller memory footprint.
You could even use sqlite.

Would you run a single node zookeeper cluster?   Single node map reduce?
Single node HDFS?  I hope not.

Cassandra's strengths are high availability and linear scalability.  If
you're not planning on taking advantage of either of those you're using the
wrong tool for the job.

On Fri, Jan 22, 2016 at 1:25 PM Jeff Jirsa <je...@crowdstrike.com>
wrote:

> The value of cassandra is in its replication – as a single node solution,
> it’s slower and less flexible than alternatives
>
> From: John Lammers
> Reply-To: "user@cassandra.apache.org"
> Date: Friday, January 22, 2016 at 12:57 PM
> To: Cassandra Mailing List
>
> Subject: Fwd: Production with Single Node
>
> Thanks for your reply Sebastian.
>
> They are specialized data storage & retrieval systems.  The Cassandra
> database is mainly used to store meta-data for searching.
>
> Jonathan, I had seen your article.  But what are some of the technical
> reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
> convince others.  Or failing that, what can be done to make this
> configuration as safe & robust as possible?
>
> Thanks!
>
> --John
>
> ---------- Forwarded message ----------
> From: Sebastian Estevez <se...@datastax.com>
> Date: Fri, Jan 22, 2016 at 3:41 PM
> Subject: Fwd: Production with Single Node
> To: john.lammers@karoshealth.com
>
>
> Hi John,
>
> Can you share a bit more about your use case? What's the purpose of these
> little clusters? Jon has good points but I'm cautious to dismiss your idea
> without hearing specifics about your plans.
>
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com
>
> DataStax is the fastest, most scalable distributed database technology,
> delivering Apache Cassandra to the world’s most innovative enterprises.
> Datastax is built to be agile, always-on, and predictably scalable to any
> size. With more than 500 customers in 45 countries, DataStax is the
> database technology and transactional backbone of choice for the worlds
> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>
> ---------- Forwarded message ----------
> From: Jack Krupansky <ja...@gmail.com>
> Date: Fri, Jan 22, 2016 at 3:30 PM
> Subject: Re: Production with Single Node
> To: user@cassandra.apache.org
>
>
> The risks would be about the same as with a single-node Postgres or MySQL
> database, except that you wouldn't have the benefit of full SQL.
>
> How much data (rows, columns), what kind of load pattern (heavy write,
> heavy update, heavy query), and what types of queries (primary key-only,
> slices, filtering, secondary indexes, etc.)?
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
> john.lammers@karoshealth.com> wrote:
>
>> After deploying a number of production systems with up to 10 Cassandra
>> nodes each, we are looking at deploying a small, all-in-one-server system
>> with only a single, local node (Cassandra 2.1.11).
>>
>> What are the risks of such a configuration?
>>
>> The virtual disk would be running RAID 5 and the disk controller would
>> have a flash backed write-behind cache.
>>
>> What's the best way to configure Cassandra and/or respecify the hardware
>> for an all-in-one-box solution?
>>
>> Thanks-in-advance!
>>
>> --John
>>
>>
>
>
>

Re: Production with Single Node

Posted by Nikolay Mihaylov <nm...@nmmm.nu>.
Hi,

We have 2-3 installations with single-node Cassandra. They are working fine,
with no problems there, except that if Cassandra stops, everything stops.
Even on one node, we are usually "rolling" 500-600 GB of data, sometimes even
2-3 TB. We mostly use the standard configuration with almost no changes.

Here are some considerations for bloom filter config, but they are for an
old Cassandra version:
http://nmmm.nu/bloomfilter.htm

https://whoisrequest.com/ - this uses single-node Cassandra with about 600
GB of data.

We found that it works much better and faster than MySQL. We did test
Postgres, but it was terribly slow. We were in a big hurry, so we did not
analyze why Postgres was so slow.

Another lesson we learned: when you run a single node, put only Cassandra on
that server. Keep the webserver / client on a different server.

In our latest project we used TokuDB. It is something like a MySQL
"plugin". We have known Toku for 5-6 years, but until recently it was paid
software with a free demo. TokuDB is currently GPL.

Here is what we researched 5 years ago:

http://www.novini.net/2010/12/mysql-storage-engines-comparison.html

We also tested MongoDB. It is quite fast, but it ate through our HDD
very fast.

So, a little recap of what we have:

- Cassandra single nodes - 600-700 GB data
- MySQL with MyISAM - 30-40 GB data
- TokuDB - 100 GB data (this is equivalent to 500 GB in MyISAM / InnoDB).

Feel free to contact me if you have non-Cassandra-related questions.


On Sat, Jan 23, 2016 at 7:10 AM, Anuj Wadehra <an...@yahoo.co.in>
wrote:

> And I think in a 3 node cluster, RAID 0 would do the job instead of RAID 5.
> So you will need less storage to get the same disk space. But you will still
> get protection against disk failures and, in fact, entire node failure.
>
> Anuj
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>
> On Sat, 23 Jan, 2016 at 10:30 am, Anuj Wadehra
> <an...@yahoo.co.in> wrote:
> I think Jonathan said it earlier. You may be happy with the performance
> for now as you are using the same commitlog settings that you use in large
> clusters. Test the new setting recommended so that you know the real
> picture. Or be prepared to lose some data in case of failure.
>
> Other than durability, your single-node cluster would be a Single Point of
> Failure for your site. RAID 5 will only protect you against a disk failure.
> But a server may be down for other reasons too. The question is: are you OK
> with the site going down?
>
> I would suggest using hardware with a smaller configuration to save on
> cost for smaller sites and going ahead with a 3-node minimum. That way you
> will provide all the good features of your design irrespective of the site.
> Cassandra is known to work on commodity servers too.
>
>
>
> Thanks
> Anuj
>
>
>
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>
> On Sat, 23 Jan, 2016 at 4:23 am, Jack Krupansky
> <ja...@gmail.com> wrote:
> You do of course have the simple technical matters, most of which need to
> be addressed with a proof of concept implementation, related to memory,
> storage, latency, and throughput. I mean, with a scaled cluster you can
> always add nodes to increase capacity and throughput, and reduce latency,
> but with a single node you have limited flexibility.
>
> Just to be clear, Cassandra is still not recommended for "fat nodes" -
> even if you can fit tons of data on the node, you may not have the computes
> to satisfy throughput and latency requirements. And if you don't have
> enough system memory the amount of storage is irrelevant.
>
> Back to my original question:
> How much data (rows, columns), what kind of load pattern (heavy write,
> heavy update, heavy query), and what types of queries (primary key-only,
> slices, filtering, secondary indexes, etc.)?
>
> I do recall a customer who ran into problems because they had SSD but only
> a very limited amount so they were running out of storage. Having enough
> system memory for file system caching and offheap data is important as well.
>
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 5:07 PM, John Lammers <
> john.lammers@karoshealth.com> wrote:
>
>> Thanks for your response Jack.
>>
>> We are already sold on distributed databases, HA and scaling.  We just
>> have some small deployments coming up where there's no money for servers to
>> run multiple Cassandra nodes.
>>
>> So, aside from the lack of HA, I'm asking if a single Cassandra node
>> would be viable in a production environment.  (There would be RAID 5 and
>> the RAID controller cache is backed by flash memory).
>>
>> I'm asking because I'm concerned about using Cassandra in a way that it's
>> not designed for.  That to me is the unsettling aspect.
>>
>> If this is a bad idea, give me the ammo I need to shoot it down.  I need
>> specific technical reasons.
>>
>> Thanks!
>>
>> --John
>>
>> On Fri, Jan 22, 2016 at 4:47 PM, Jack Krupansky <jack.krupansky@gmail.com
>> > wrote:
>>
>>> If single-node Cassandra has the performance (and capacity) you need and
>>> the NoSQL data model and API are sufficient for your app, and your dev and
>>> ops and support teams are already familiar with and committed to Cassandra,
>>> and you don't need HA or scaling, then it sounds like you are set.
>>>
>>> You asked about risks, and normally lack of HA and scaling are
>>> unacceptable risks when people are looking at distributed databases.
>>>
>>> Most people on this list are dedicated to and passionate about
>>> distributed databases, HA, and scaling, so it is distinctly unsettling when
>>> somebody comes along who isn't interested in and committed to those same
>>> three qualities. But if single-node happens to work for you, then that's
>>> great.
>>>
>>> -- Jack Krupansky
>>>
>>
>>
>

Re: Production with Single Node

Posted by Anuj Wadehra <an...@yahoo.co.in>.
And I think in a 3 node cluster, RAID 0 would do the job instead of RAID 5.
So you will need less storage to get the same disk space. But you will still
get protection against disk failures and, in fact, entire node failure.

Anuj

Sent from Yahoo Mail on Android

On Sat, 23 Jan, 2016 at 10:30 am, Anuj Wadehra <an...@yahoo.co.in> wrote:

> I think Jonathan said it earlier. You may be happy with the performance
> for now as you are using the same commitlog settings that you use in large
> clusters. Test the new setting recommended so that you know the real
> picture. Or be prepared to lose some data in case of failure.
>
> Other than durability, your single-node cluster would be a Single Point of
> Failure for your site. RAID 5 will only protect you against a disk failure.
> But a server may be down for other reasons too. The question is: are you OK
> with the site going down?
>
> I would suggest using hardware with a smaller configuration to save on
> cost for smaller sites and going ahead with a 3-node minimum. That way you
> will provide all the good features of your design irrespective of the site.
> Cassandra is known to work on commodity servers too.
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
>
> On Sat, 23 Jan, 2016 at 4:23 am, Jack Krupansky <ja...@gmail.com> wrote:
>
>> You do of course have the simple technical matters, most of which need to
>> be addressed with a proof of concept implementation, related to memory,
>> storage, latency, and throughput. I mean, with a scaled cluster you can
>> always add nodes to increase capacity and throughput, and reduce latency,
>> but with a single node you have limited flexibility.
>>
>> Just to be clear, Cassandra is still not recommended for "fat nodes" -
>> even if you can fit tons of data on the node, you may not have the computes
>> to satisfy throughput and latency requirements. And if you don't have
>> enough system memory the amount of storage is irrelevant.
>>
>> Back to my original question:
>> How much data (rows, columns), what kind of load pattern (heavy write,
>> heavy update, heavy query), and what types of queries (primary key-only,
>> slices, filtering, secondary indexes, etc.)?
>>
>> I do recall a customer who ran into problems because they had SSD but only
>> a very limited amount so they were running out of storage. Having enough
>> system memory for file system caching and offheap data is important as well.
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 5:07 PM, John Lammers <jo...@karoshealth.com> wrote:
>>
>>> Thanks for your response Jack.
>>>
>>> We are already sold on distributed databases, HA and scaling.  We just
>>> have some small deployments coming up where there's no money for servers to
>>> run multiple Cassandra nodes.
>>>
>>> So, aside from the lack of HA, I'm asking if a single Cassandra node
>>> would be viable in a production environment.  (There would be RAID 5 and
>>> the RAID controller cache is backed by flash memory).
>>>
>>> I'm asking because I'm concerned about using Cassandra in a way that it's
>>> not designed for.  That to me is the unsettling aspect.
>>>
>>> If this is a bad idea, give me the ammo I need to shoot it down.  I need
>>> specific technical reasons.
>>>
>>> Thanks!
>>>
>>> --John
>>>
>>> On Fri, Jan 22, 2016 at 4:47 PM, Jack Krupansky <ja...@gmail.com> wrote:
>>>
>>>> If single-node Cassandra has the performance (and capacity) you need and
>>>> the NoSQL data model and API are sufficient for your app, and your dev and
>>>> ops and support teams are already familiar with and committed to Cassandra,
>>>> and you don't need HA or scaling, then it sounds like you are set.
>>>>
>>>> You asked about risks, and normally lack of HA and scaling are
>>>> unacceptable risks when people are looking at distributed databases.
>>>>
>>>> Most people on this list are dedicated to and passionate about
>>>> distributed databases, HA, and scaling, so it is distinctly unsettling when
>>>> somebody comes along who isn't interested in and committed to those same
>>>> three qualities. But if single-node happens to work for you, then that's
>>>> great.
>>>>
>>>> -- Jack Krupansky

Re: Production with Single Node

Posted by Anuj Wadehra <an...@yahoo.co.in>.
I think Jonathan said it earlier. You may be happy with the performance
for now as you are using the same commitlog settings that you use in large
clusters. Test the new setting recommended so that you know the real
picture. Or be prepared to lose some data in case of failure.

Other than durability, your single-node cluster would be a Single Point of
Failure for your site. RAID 5 will only protect you against a disk failure.
But a server may be down for other reasons too. The question is: are you OK
with the site going down?

I would suggest using hardware with a smaller configuration to save on
cost for smaller sites and going ahead with a 3-node minimum. That way you
will provide all the good features of your design irrespective of the site.
Cassandra is known to work on commodity servers too.

Thanks
Anuj



Sent from Yahoo Mail on Android

On Sat, 23 Jan, 2016 at 4:23 am, Jack Krupansky <ja...@gmail.com> wrote:

> You do of course have the simple technical matters, most of which need to
> be addressed with a proof of concept implementation, related to memory,
> storage, latency, and throughput. I mean, with a scaled cluster you can
> always add nodes to increase capacity and throughput, and reduce latency,
> but with a single node you have limited flexibility.
>
> Just to be clear, Cassandra is still not recommended for "fat nodes" -
> even if you can fit tons of data on the node, you may not have the computes
> to satisfy throughput and latency requirements. And if you don't have
> enough system memory the amount of storage is irrelevant.
>
> Back to my original question:
> How much data (rows, columns), what kind of load pattern (heavy write,
> heavy update, heavy query), and what types of queries (primary key-only,
> slices, filtering, secondary indexes, etc.)?
>
> I do recall a customer who ran into problems because they had SSD but only
> a very limited amount so they were running out of storage. Having enough
> system memory for file system caching and offheap data is important as well.
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 5:07 PM, John Lammers <jo...@karoshealth.com> wrote:
>
>> Thanks for your response Jack.
>>
>> We are already sold on distributed databases, HA and scaling.  We just
>> have some small deployments coming up where there's no money for servers to
>> run multiple Cassandra nodes.
>>
>> So, aside from the lack of HA, I'm asking if a single Cassandra node
>> would be viable in a production environment.  (There would be RAID 5 and
>> the RAID controller cache is backed by flash memory).
>>
>> I'm asking because I'm concerned about using Cassandra in a way that it's
>> not designed for.  That to me is the unsettling aspect.
>>
>> If this is a bad idea, give me the ammo I need to shoot it down.  I need
>> specific technical reasons.
>>
>> Thanks!
>>
>> --John
>>
>> On Fri, Jan 22, 2016 at 4:47 PM, Jack Krupansky <ja...@gmail.com> wrote:
>>
>>> If single-node Cassandra has the performance (and capacity) you need and
>>> the NoSQL data model and API are sufficient for your app, and your dev and
>>> ops and support teams are already familiar with and committed to Cassandra,
>>> and you don't need HA or scaling, then it sounds like you are set.
>>>
>>> You asked about risks, and normally lack of HA and scaling are
>>> unacceptable risks when people are looking at distributed databases.
>>>
>>> Most people on this list are dedicated to and passionate about
>>> distributed databases, HA, and scaling, so it is distinctly unsettling when
>>> somebody comes along who isn't interested in and committed to those same
>>> three qualities. But if single-node happens to work for you, then that's
>>> great.
>>>
>>> -- Jack Krupansky

Re: Production with Single Node

Posted by Jack Krupansky <ja...@gmail.com>.
You do of course have the simple technical matters, most of which need to
be addressed with a proof of concept implementation, related to memory,
storage, latency, and throughput. I mean, with a scaled cluster you can
always add nodes to increase capacity and throughput, and reduce latency,
but with a single node you have limited flexibility.

Just to be clear, Cassandra is still not recommended for "fat nodes" - even
if you can fit tons of data on the node, you may not have the computes to
satisfy throughput and latency requirements. And if you don't have enough
system memory the amount of storage is irrelevant.

Back to my original question:
How much data (rows, columns), what kind of load pattern (heavy write,
heavy update, heavy query), and what types of queries (primary key-only,
slices, filtering, secondary indexes, etc.)?

I do recall a customer who ran into problems because they had SSD but only
a very limited amount so they were running out of storage. Having enough
system memory for file system caching and offheap data is important as well.


-- Jack Krupansky

On Fri, Jan 22, 2016 at 5:07 PM, John Lammers <jo...@karoshealth.com>
wrote:

> Thanks for your response Jack.
>
> We are already sold on distributed databases, HA and scaling.  We just
> have some small deployments coming up where there's no money for servers to
> run multiple Cassandra nodes.
>
> So, aside from the lack of HA, I'm asking if a single Cassandra node would
> be viable in a production environment.  (There would be RAID 5 and the RAID
> controller cache is backed by flash memory).
>
> I'm asking because I'm concerned about using Cassandra in a way that it's
> not designed for.  That to me is the unsettling aspect.
>
> If this is a bad idea, give me the ammo I need to shoot it down.  I need
> specific technical reasons.
>
> Thanks!
>
> --John
>
> On Fri, Jan 22, 2016 at 4:47 PM, Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> If single-node Cassandra has the performance (and capacity) you need and
>> the NoSQL data model and API are sufficient for your app, and your dev and
>> ops and support teams are already familiar with and committed to Cassandra,
>> and you don't need HA or scaling, then it sounds like you are set.
>>
>> You asked about risks, and normally lack of HA and scaling are
>> unacceptable risks when people are looking at distributed databases.
>>
>> Most people on this list are dedicated to and passionate about
>> distributed databases, HA, and scaling, so it is distinctly unsettling when
>> somebody comes along who isn't interested in and committed to those same
>> three qualities. But if single-node happens to work for you, then that's
>> great.
>>
>> -- Jack Krupansky
>>
>
>

Re: Production with Single Node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Have you considered running smaller clusters with 1 customer per keyspace?

If you're going to run 1 node (and you want to benchmark it properly) then
you probably want to switch commitlog_sync to 'batch' and redo your
performance tests.  Without it, you're risking data loss and you aren't
comparing apples to apples.  Something like Postgres gives you durable
writes by default.  Cassandra doesn't do that by default because in a
multi-node cluster you've got redundant commit logs on the replicas.
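
To put a rough number on the data-loss risk mentioned above, here is a small
back-of-the-envelope sketch in Python. It assumes periodic commit log sync
with the stock commitlog_sync_period_in_ms of 10000 and a made-up write rate;
the function name and the load figure are hypothetical, and the result is
only an upper bound for a power loss or kernel crash, not a measurement.

```python
def writes_at_risk(writes_per_second: float, sync_period_ms: float) -> float:
    """Rough upper bound on acknowledged writes that have not yet been
    fsynced to the commit log at the moment of a crash (periodic mode)."""
    return writes_per_second * sync_period_ms / 1000.0


if __name__ == "__main__":
    # 10000 ms is the stock commitlog_sync_period_in_ms; 200 writes/s is a
    # made-up load for a small site.
    print(writes_at_risk(200, 10_000))  # 2000.0
```

With replicas, another node would normally still hold those writes; on a
single node nothing else does, which is why batch mode is suggested here.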

On Fri, Jan 22, 2016 at 1:48 PM Jack Krupansky <ja...@gmail.com>
wrote:

> If single-node Cassandra has the performance (and capacity) you need and
> the NoSQL data model and API are sufficient for your app, and your dev and
> ops and support teams are already familiar with and committed to Cassandra,
> and you don't need HA or scaling, then it sounds like you are set.
>
> You asked about risks, and normally lack of HA and scaling are
> unacceptable risks when people are looking at distributed databases.
>
> Most people on this list are dedicated to and passionate about distributed
> databases, HA, and scaling, so it is distinctly unsettling when somebody
> comes along who isn't interested in and committed to those same three
> qualities. But if single-node happens to work for you, then that's great.
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 4:32 PM, John Lammers <
> john.lammers@karoshealth.com> wrote:
>
>> Jeff, that may be true for many ... but for our application, the
>> performance of a single Cassandra node blows the doors off Oracle and
>> PostgreSQL.
>>
>>
>> On Fri, Jan 22, 2016 at 4:24 PM, Jeff Jirsa <je...@crowdstrike.com>
>> wrote:
>>
>>> The value of cassandra is in its replication – as a single node
>>> solution, it’s slower and less flexible than alternatives
>>>
>>
>>
>>>
>>> From: John Lammers
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Friday, January 22, 2016 at 12:57 PM
>>> To: Cassandra Mailing List
>>>
>>> Subject: Fwd: Production with Single Node
>>>
>>> Thanks for your reply Sebastian.
>>>
>>> They are specialized data storage & retrieval systems.  The Cassandra
>>> database is mainly used to store meta-data for searching.
>>>
>>> Jonathan, I had seen your article.  But what are some of the technical
>>> reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
>>> convince others.  Or failing that, what can be done to make this
>>> configuration as safe & robust as possible?
>>>
>>> Thanks!
>>>
>>> --John
>>>
>>> ---------- Forwarded message ----------
>>> From: Sebastian Estevez <se...@datastax.com>
>>> Date: Fri, Jan 22, 2016 at 3:41 PM
>>> Subject: Fwd: Production with Single Node
>>> To: john.lammers@karoshealth.com
>>>
>>>
>>> Hi John,
>>>
>>> Can you share a bit more about your use case? What's the purpose of
>>> these little clusters? Jon has good points but I'm cautious to dismiss your
>>> idea without hearing specifics about your plans.
>>>
>>>
>>> All the best,
>>>
>>>
>>> Sebastián Estévez
>>>
>>> Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com
>>>
>>> DataStax is the fastest, most scalable distributed database technology,
>>> delivering Apache Cassandra to the world’s most innovative enterprises.
>>> DataStax is built to be agile, always-on, and predictably scalable to any
>>> size. With more than 500 customers in 45 countries, DataStax is the
>>> database technology and transactional backbone of choice for the world's
>>> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>>>
>>> ---------- Forwarded message ----------
>>> From: Jack Krupansky <ja...@gmail.com>
>>> Date: Fri, Jan 22, 2016 at 3:30 PM
>>> Subject: Re: Production with Single Node
>>> To: user@cassandra.apache.org
>>>
>>>
>>> The risks would be about the same as with a single-node Postgres or
>>> MySQL database, except that you wouldn't have the benefit of full SQL.
>>>
>>> How much data (rows, columns), what kind of load pattern (heavy write,
>>> heavy update, heavy query), and what types of queries (primary key-only,
>>> slices, filtering, secondary indexes, etc.)?
>>>
>>> -- Jack Krupansky
>>>
>>> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
>>> john.lammers@karoshealth.com> wrote:
>>>
>>>> After deploying a number of production systems with up to 10 Cassandra
>>>> nodes each, we are looking at deploying a small, all-in-one-server system
>>>> with only a single, local node (Cassandra 2.1.11).
>>>>
>>>> What are the risks of such a configuration?
>>>>
>>>> The virtual disk would be running RAID 5 and the disk controller would
>>>> have a flash backed write-behind cache.
>>>>
>>>> What's the best way to configure Cassandra and/or respecify the
>>>> hardware for an all-in-one-box solution?
>>>>
>>>> Thanks-in-advance!
>>>>
>>>> --John
>>>>
>>>>
>>>
>>>
>>>
>>
>

Re: Production with Single Node

Posted by John Lammers <jo...@karoshealth.com>.
Thanks for your response Jack.

We are already sold on distributed databases, HA and scaling.  We just have
some small deployments coming up where there's no money for servers to run
multiple Cassandra nodes.

So, aside from the lack of HA, I'm asking if a single Cassandra node would
be viable in a production environment.  (There would be RAID 5 and the RAID
controller cache is backed by flash memory).

I'm asking because I'm concerned about using Cassandra in a way that it's
not designed for.  That to me is the unsettling aspect.

If this is a bad idea, give me the ammo I need to shoot it down.  I need
specific technical reasons.

Thanks!

--John

On Fri, Jan 22, 2016 at 4:47 PM, Jack Krupansky <ja...@gmail.com>
wrote:

> If single-node Cassandra has the performance (and capacity) you need and
> the NoSQL data model and API are sufficient for your app, and your dev and
> ops and support teams are already familiar with and committed to Cassandra,
> and you don't need HA or scaling, then it sounds like you are set.
>
> You asked about risks, and normally lack of HA and scaling are
> unacceptable risks when people are looking at distributed databases.
>
> Most people on this list are dedicated to and passionate about distributed
> databases, HA, and scaling, so it is distinctly unsettling when somebody
> comes along who isn't interested in and committed to those same three
> qualities. But if single-node happens to work for you, then that's great.
>
> -- Jack Krupansky
>

Re: Production with Single Node

Posted by Jack Krupansky <ja...@gmail.com>.
If single-node Cassandra has the performance (and capacity) you need and
the NoSQL data model and API are sufficient for your app, and your dev and
ops and support teams are already familiar with and committed to Cassandra,
and you don't need HA or scaling, then it sounds like you are set.

You asked about risks, and normally lack of HA and scaling are unacceptable
risks when people are looking at distributed databases.

Most people on this list are dedicated to and passionate about distributed
databases, HA, and scaling, so it is distinctly unsettling when somebody
comes along who isn't interested in and committed to those same three
qualities. But if single-node happens to work for you, then that's great.

-- Jack Krupansky

On Fri, Jan 22, 2016 at 4:32 PM, John Lammers <jo...@karoshealth.com>
wrote:

> Jeff, that may be true for many ... but for our application, the
> performance of a single Cassandra node blows the doors off Oracle and
> PostgreSQL.
>
>
> On Fri, Jan 22, 2016 at 4:24 PM, Jeff Jirsa <je...@crowdstrike.com>
> wrote:
>
>> The value of cassandra is in its replication – as a single node solution,
>> it’s slower and less flexible than alternatives
>>
>
>
>>
>> From: John Lammers
>> Reply-To: "user@cassandra.apache.org"
>> Date: Friday, January 22, 2016 at 12:57 PM
>> To: Cassandra Mailing List
>>
>> Subject: Fwd: Production with Single Node
>>
>> Thanks for your reply Sebastian.
>>
>> They are specialized data storage & retrieval systems.  The Cassandra
>> database is mainly used to store meta-data for searching.
>>
>> Jonathan, I had seen your article.  But what are some of the technical
>> reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
>> convince others.  Or failing that, what can be done to make this
>> configuration as safe & robust as possible?
>>
>> Thanks!
>>
>> --John
>>
>> ---------- Forwarded message ----------
>> From: Sebastian Estevez <se...@datastax.com>
>> Date: Fri, Jan 22, 2016 at 3:41 PM
>> Subject: Fwd: Production with Single Node
>> To: john.lammers@karoshealth.com
>>
>>
>> Hi John,
>>
>> Can you share a bit more about your use case? What's the purpose of these
>> little clusters? Jon has good points but I'm cautious to dismiss your idea
>> without hearing specifics about your plans.
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>>
>> Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com
>>
>> DataStax is the fastest, most scalable distributed database technology,
>> delivering Apache Cassandra to the world’s most innovative enterprises.
>> DataStax is built to be agile, always-on, and predictably scalable to any
>> size. With more than 500 customers in 45 countries, DataStax is the
>> database technology and transactional backbone of choice for the world's
>> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>>
>> ---------- Forwarded message ----------
>> From: Jack Krupansky <ja...@gmail.com>
>> Date: Fri, Jan 22, 2016 at 3:30 PM
>> Subject: Re: Production with Single Node
>> To: user@cassandra.apache.org
>>
>>
>> The risks would be about the same as with a single-node Postgres or MySQL
>> database, except that you wouldn't have the benefit of full SQL.
>>
>> How much data (rows, columns), what kind of load pattern (heavy write,
>> heavy update, heavy query), and what types of queries (primary key-only,
>> slices, filtering, secondary indexes, etc.)?
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
>> john.lammers@karoshealth.com> wrote:
>>
>>> After deploying a number of production systems with up to 10 Cassandra
>>> nodes each, we are looking at deploying a small, all-in-one-server system
>>> with only a single, local node (Cassandra 2.1.11).
>>>
>>> What are the risks of such a configuration?
>>>
>>> The virtual disk would be running RAID 5 and the disk controller would
>>> have a flash backed write-behind cache.
>>>
>>> What's the best way to configure Cassandra and/or respecify the hardware
>>> for an all-in-one-box solution?
>>>
>>> Thanks-in-advance!
>>>
>>> --John
>>>
>>>
>>
>>
>>
>

Re: Production with Single Node

Posted by John Lammers <jo...@karoshealth.com>.
Jeff, that may be true for many ... but for our application, the
performance of a single Cassandra node blows the doors off Oracle and
PostgreSQL.


On Fri, Jan 22, 2016 at 4:24 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> The value of cassandra is in its replication – as a single node solution,
> it’s slower and less flexible than alternatives
>


>
> From: John Lammers
> Reply-To: "user@cassandra.apache.org"
> Date: Friday, January 22, 2016 at 12:57 PM
> To: Cassandra Mailing List
>
> Subject: Fwd: Production with Single Node
>
> Thanks for your reply Sebastian.
>
> They are specialized data storage & retrieval systems.  The Cassandra
> database is mainly used to store meta-data for searching.
>
> Jonathan, I had seen your article.  But what are some of the technical
> reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
> convince others.  Or failing that, what can be done to make this
> configuration as safe & robust as possible?
>
> Thanks!
>
> --John
>
> ---------- Forwarded message ----------
> From: Sebastian Estevez <se...@datastax.com>
> Date: Fri, Jan 22, 2016 at 3:41 PM
> Subject: Fwd: Production with Single Node
> To: john.lammers@karoshealth.com
>
>
> Hi John,
>
> Can you share a bit more about your use case? What's the purpose of these
> little clusters? Jon has good points but I'm cautious to dismiss your idea
> without hearing specifics about your plans.
>
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com
>
> DataStax is the fastest, most scalable distributed database technology,
> delivering Apache Cassandra to the world’s most innovative enterprises.
> DataStax is built to be agile, always-on, and predictably scalable to any
> size. With more than 500 customers in 45 countries, DataStax is the
> database technology and transactional backbone of choice for the world's
> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>
> ---------- Forwarded message ----------
> From: Jack Krupansky <ja...@gmail.com>
> Date: Fri, Jan 22, 2016 at 3:30 PM
> Subject: Re: Production with Single Node
> To: user@cassandra.apache.org
>
>
> The risks would be about the same as with a single-node Postgres or MySQL
> database, except that you wouldn't have the benefit of full SQL.
>
> How much data (rows, columns), what kind of load pattern (heavy write,
> heavy update, heavy query), and what types of queries (primary key-only,
> slices, filtering, secondary indexes, etc.)?
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <
> john.lammers@karoshealth.com> wrote:
>
>> After deploying a number of production systems with up to 10 Cassandra
>> nodes each, we are looking at deploying a small, all-in-one-server system
>> with only a single, local node (Cassandra 2.1.11).
>>
>> What are the risks of such a configuration?
>>
>> The virtual disk would be running RAID 5 and the disk controller would
>> have a flash backed write-behind cache.
>>
>> What's the best way to configure Cassandra and/or respecify the hardware
>> for an all-in-one-box solution?
>>
>> Thanks-in-advance!
>>
>> --John
>>
>>
>
>
>

Re: Production with Single Node

Posted by Jeff Jirsa <je...@crowdstrike.com>.
The value of Cassandra is in its replication. As a single-node solution, it’s slower and less flexible than the alternatives.

From:  John Lammers
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, January 22, 2016 at 12:57 PM
To:  Cassandra Mailing List
Subject:  Fwd: Production with Single Node

Thanks for your reply Sebastian.

They are specialized data storage & retrieval systems.  The Cassandra database is mainly used to store meta-data for searching.

Jonathan, I had seen your article.  But what are some of the technical reasons why a one node Cassandra cluster is a bad idea?  I need ammo to convince others.  Or failing that, what can be done to make this configuration as safe & robust as possible?

Thanks!

--John

---------- Forwarded message ----------
From: Sebastian Estevez <se...@datastax.com>
Date: Fri, Jan 22, 2016 at 3:41 PM
Subject: Fwd: Production with Single Node
To: john.lammers@karoshealth.com


Hi John, 

Can you share a bit more about your use case? What's the purpose of these little clusters? Jon has good points but I'm cautious to dismiss your idea without hearing specifics about your plans.


All the best,



Sebastián Estévez

Solutions Architect |954 905 8615 | sebastian.estevez@datastax.com






DataStax is the fastest, most scalable distributed database technology, delivering Apache Cassandra to the world’s most innovative enterprises. DataStax is built to be agile, always-on, and predictably scalable to any size. With more than 500 customers in 45 countries, DataStax is the database technology and transactional backbone of choice for the world's most innovative companies such as Netflix, Adobe, Intuit, and eBay.

---------- Forwarded message ----------
From: Jack Krupansky <ja...@gmail.com>
Date: Fri, Jan 22, 2016 at 3:30 PM
Subject: Re: Production with Single Node
To: user@cassandra.apache.org


The risks would be about the same as with a single-node Postgres or MySQL database, except that you wouldn't have the benefit of full SQL. 

How much data (rows, columns), what kind of load pattern (heavy write, heavy update, heavy query), and what types of queries (primary key-only, slices, filtering, secondary indexes, etc.)?

-- Jack Krupansky

On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <jo...@karoshealth.com> wrote:
After deploying a number of production systems with up to 10 Cassandra nodes each, we are looking at deploying a small, all-in-one-server system with only a single, local node (Cassandra 2.1.11).

What are the risks of such a configuration?

The virtual disk would be running RAID 5 and the disk controller would have a flash backed write-behind cache.

What's the best way to configure Cassandra and/or respecify the hardware for an all-in-one-box solution?

Thanks-in-advance!

--John






Fwd: Production with Single Node

Posted by John Lammers <jo...@karoshealth.com>.
Thanks for your reply Sebastian.

They are specialized data storage & retrieval systems.  The Cassandra
database is mainly used to store meta-data for searching.

Jonathan, I had seen your article.  But what are some of the technical
reasons why a one node Cassandra cluster is a bad idea?  I need ammo to
convince others.  Or failing that, what can be done to make this
configuration as safe & robust as possible?

Thanks!

--John

---------- Forwarded message ----------
From: Sebastian Estevez <se...@datastax.com>
Date: Fri, Jan 22, 2016 at 3:41 PM
Subject: Fwd: Production with Single Node
To: john.lammers@karoshealth.com


Hi John,

Can you share a bit more about your use case? What's the purpose of these
little clusters? Jon has good points but I'm cautious to dismiss your idea
without hearing specifics about your plans.


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com

DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world’s most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.

---------- Forwarded message ----------
From: Jack Krupansky <ja...@gmail.com>
Date: Fri, Jan 22, 2016 at 3:30 PM
Subject: Re: Production with Single Node
To: user@cassandra.apache.org


The risks would be about the same as with a single-node Postgres or MySQL
database, except that you wouldn't have the benefit of full SQL.

How much data (rows, columns), what kind of load pattern (heavy write,
heavy update, heavy query), and what types of queries (primary key-only,
slices, filtering, secondary indexes, etc.)?

-- Jack Krupansky

On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <jo...@karoshealth.com>
wrote:

> After deploying a number of production systems with up to 10 Cassandra
> nodes each, we are looking at deploying a small, all-in-one-server system
> with only a single, local node (Cassandra 2.1.11).
>
> What are the risks of such a configuration?
>
> The virtual disk would be running RAID 5 and the disk controller would
> have a flash backed write-behind cache.
>
> What's the best way to configure Cassandra and/or respecify the hardware
> for an all-in-one-box solution?
>
> Thanks-in-advance!
>
> --John
>
>

Re: Production with Single Node

Posted by Jack Krupansky <ja...@gmail.com>.
The risks would be about the same as with a single-node Postgres or MySQL
database, except that you wouldn't have the benefit of full SQL.

How much data (rows, columns), what kind of load pattern (heavy write,
heavy update, heavy query), and what types of queries (primary key-only,
slices, filtering, secondary indexes, etc.)?

-- Jack Krupansky

On Fri, Jan 22, 2016 at 3:24 PM, John Lammers <jo...@karoshealth.com>
wrote:

> After deploying a number of production systems with up to 10 Cassandra
> nodes each, we are looking at deploying a small, all-in-one-server system
> with only a single, local node (Cassandra 2.1.11).
>
> What are the risks of such a configuration?
>
> The virtual disk would be running RAID 5 and the disk controller would
> have a flash backed write-behind cache.
>
> What's the best way to configure Cassandra and/or respecify the hardware
> for an all-in-one-box solution?
>
> Thanks-in-advance!
>
> --John
>
>