Posted to user@cassandra.apache.org by "Durity, Sean R" <SE...@homedepot.com> on 2018/03/20 17:13:51 UTC

RE: [EXTERNAL] Cassandra vs MySQL

I’m not sure there is a fair comparison. MySQL and Cassandra have different ways of solving related (but not necessarily the same) problems of storing and retrieving data.

The data models for MySQL and Cassandra are likely to be very different. The key for Cassandra is that you need to model for the queries that will be executed. If you cannot know the queries ahead of time, Cassandra is not the best choice. If table scans are typically required, Cassandra is not a good choice. If you need more than a few hundred tables in a cluster, Cassandra is not a good choice.

If multi-datacenter replication is required, Cassandra is an awesome choice. If you are going to always query by a partition key (or primary key), Cassandra is a great choice. The nice thing is that the performance scales linearly, so additional data is fine (as long as you add nodes) – again, if your data model is designed for Cassandra. If you like no-downtime upgrades and extreme reliability and availability, Cassandra is a great choice.
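To make the partition-key point concrete, here is a toy sketch (not how Cassandra is actually implemented). It assumes four hypothetical nodes and picks an owner with an md5 hash modulo the node count; real Cassandra places partitions on a token ring (Murmur3 by default) and replicates each one. The only thing it is meant to show is why a read by partition key can go to one owner while a scan has to ask everybody.

```python
import hashlib

# Toy model only: hypothetical node names, md5-modulo placement,
# no replication and no token ring.
NODES = ["node-a", "node-b", "node-c", "node-d"]


def node_for(partition_key):
    """Pick the node that owns a partition key in this toy model."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def nodes_touched_by_key_lookup(partition_key):
    """A read by partition key goes to the single owning node."""
    return {node_for(partition_key)}


def nodes_touched_by_table_scan():
    """A read without a partition key has to ask every node."""
    return set(NODES)


if __name__ == "__main__":
    # "patient-42" is a made-up key used only for illustration.
    print(nodes_touched_by_key_lookup("patient-42"))
    print(nodes_touched_by_table_scan())
```

Adding a node to `NODES` spreads keyed reads over more machines, while the scan gets wider, which is roughly why keyed access scales and table scans do not.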

Personally, I hope to never have to use/support MySQL again, and I love working with Cassandra. But, Cassandra is not the choice for all data problems.


Sean Durity

From: Oliver Ruebenacker [mailto:curoli@gmail.com]
Sent: Monday, March 12, 2018 3:58 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra vs MySQL


     Hello,
  We have a project currently using MySQL single-node with 5-6TB of data and some performance issues, and we plan to add data up to a total size of maybe 25-30TB.
  We are thinking of migrating to Cassandra. I have been trying to find benchmarks or other guidelines to compare MySQL and Cassandra, but most of them seem to be five years old or older.
  Is there some good more recent material?
  Thanks!
     Best, Oliver

--
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal <http://www.type2diabetesgenetics.org/>, Broad Institute <http://www.broadinstitute.org/>


________________________________

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

Re: [EXTERNAL] Cassandra vs MySQL

Posted by Carl Mueller <ca...@smartthings.com>.
Yes, Cassandra's big win is that once you get your data and applications
adapted to the platform, you have a clear path to very large scale and
resiliency. Um, assuming you have the dollars. It scales out on commodity
hardware, but isn't exactly efficient in the use of that hardware. I like
to say that Cassandra makes big data "bigger data" because of the
timestamp-per-cell and column name overhead and replication factor.
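A back-of-the-envelope sketch of the "bigger data" effect, under loudly stated assumptions: only the 30 TB figure comes from the original question. The replication factor of 3, the 1.5x per-cell overhead factor, and the 2 TB of usable disk per node are placeholders, not measurements, and compression could pull the real overhead well below that.

```python
import math


def cluster_footprint_tb(raw_tb, replication_factor, overhead_factor):
    """Total on-disk size across the whole cluster, in TB."""
    return raw_tb * overhead_factor * replication_factor


def nodes_needed(total_tb, usable_tb_per_node):
    """Smallest node count whose combined usable capacity covers total_tb."""
    return math.ceil(total_tb / usable_tb_per_node)


if __name__ == "__main__":
    # Assumed inputs: RF=3, 1.5x overhead, 2 TB usable per node.
    total = cluster_footprint_tb(raw_tb=30, replication_factor=3, overhead_factor=1.5)
    print(total)                     # 135.0
    print(nodes_needed(total, 2.0))  # 68
```

Plug in your own measured overhead and per-node capacity before drawing any budget conclusions from numbers like these.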


Re: [EXTERNAL] Cassandra vs MySQL

Posted by Jeff Jirsa <jj...@gmail.com>.
I suspect you're approaching this problem from the wrong side.

The decision of MySQL vs Cassandra isn't usually about performance, it's
about the other features that may impact/enable that performance.

- Will you have a data set that won't fit on any single MySQL Server?
- Will you want to write into two different hot datacenters at the same
time?
- Do you want to be able to restart any single server without impacting the
cluster?

If you answer yes to those, then Cassandra has an option to do so
trivially, where you'd have to build tooling with MySQL.

- Do you want to do arbitrary text searches?
- Do you need JOINs?
- Do you want to build indices on a lot of the columns and do ad-hoc
querying?

If you answer yes to those, they're far easier in MySQL than Cassandra.
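The two lists of questions above can be written down as a tiny checklist. This is only a sketch: tallying yes-answers and comparing the counts is an assumption made for illustration, not a scoring method anyone in this thread proposed, and the question strings are paraphrases.

```python
# Paraphrased from the two question lists above.
CASSANDRA_LEANING = [
    "dataset will not fit on any single MySQL server",
    "write into two hot datacenters at the same time",
    "restart any single server without impacting the cluster",
]
MYSQL_LEANING = [
    "arbitrary text searches",
    "JOINs",
    "indices on many columns with ad-hoc querying",
]


def lean(answers):
    """Return 'cassandra', 'mysql' or 'unclear' from a dict of question -> bool.

    Simple vote counting; unanswered questions count as 'no'.
    """
    cassandra = sum(1 for q in CASSANDRA_LEANING if answers.get(q, False))
    mysql = sum(1 for q in MYSQL_LEANING if answers.get(q, False))
    if cassandra > mysql:
        return "cassandra"
    if mysql > cassandra:
        return "mysql"
    return "unclear"


if __name__ == "__main__":
    print(lean({"JOINs": True, "arbitrary text searches": True}))  # mysql
```

In practice a single hard requirement (say, JOINs everywhere) can outweigh everything else, so treat the tally as a conversation starter rather than an answer.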

If you're just looking for "Cassandra can do X writes per second and MySQL
can do Y writes per second", those types of benchmarks are rarely relevant,
because in both cases they tend to require expert tuning to get the full
potential (and very few people are experts in both) and are data dependent
(and your data probably doesn't match the benchmarker's dataset).

If I had a dataset that was ~10-20 GB and wanted to do arbitrary reads on
the data, I'd choose MySQL unless I absolutely positively could not
tolerate downtime, in which case I'd go with Cassandra spanning multiple
datacenters. If I had a dataset that was 200TB, or 200PB, I'd choose
Cassandra, even if I could theoretically make MySQL do it faster, because
the extra effort in building the tooling to manage that many shards of
MySQL would be prohibitive to most organizations.








Re: [EXTERNAL] Cassandra vs MySQL

Posted by Joaquin Casares <jo...@thelastpickle.com>.
Hello Oliver,

The first thing I check when deciding whether a workload will work well
in Cassandra is its read patterns. Once the read
patterns can be written down on paper, we need to figure out how the write
patterns will populate the required tables. Since you know enough about
CQL, it's mainly about checking to see how denormalization is going to work
out for on-disk read access requests.
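A minimal sketch of that read-first exercise, using plain Python dicts as stand-ins for tables. The two read patterns, the field names, and the sample values are all hypothetical; the point is only that each read pattern gets its own table keyed the way it is queried, and every write has to populate all of them.

```python
from collections import defaultdict

# Hypothetical read patterns, one "table" per pattern:
#   1. all measurements for a sample       -> keyed by sample_id
#   2. all measurements of a given kind    -> keyed by kind
by_sample = defaultdict(list)
by_kind = defaultdict(list)


def record_measurement(sample_id, kind, value):
    """Write path: one logical write fans out to every table that serves a read."""
    row = {"sample_id": sample_id, "kind": kind, "value": value}
    by_sample[sample_id].append(row)
    by_kind[kind].append(row)


def measurements_for_sample(sample_id):
    """Read pattern 1: a single keyed lookup, no scan."""
    return by_sample.get(sample_id, [])


def measurements_of_kind(kind):
    """Read pattern 2: a single keyed lookup against the second table."""
    return by_kind.get(kind, [])


if __name__ == "__main__":
    record_measurement("s1", "glucose", 5.4)
    record_measurement("s2", "glucose", 6.1)
    print(len(measurements_of_kind("glucose")))  # 2
```

If a read pattern you need cannot be served by a keyed lookup into some table like these, that is usually the sign that the workload needs more thought before moving.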

Once the read and write patterns are known, we can see if Cassandra will be
a good fit for denormalizing your workflow and thereby benefiting from a
datastore that can scale out horizontally. If your datastore can scale out
horizontally then Cassandra should be faster than a single-node MySQL
server. If your datastore has too many relational requirements, is built
for a queue-like purpose, or hits other edge cases, then it doesn't matter
how fast Cassandra is if it's not the correct tool for the job.

I hope that helps align your discovery/investigation process. :)

Cheers,

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com


Re: [EXTERNAL] Cassandra vs MySQL

Posted by Oliver Ruebenacker <cu...@gmail.com>.
     Hello,

  Thanks for all the responses.

  I do know some SQL and CQL, so I know the main differences. You can do
joins in MySQL, but the bigger your data, the less likely you want to do
that.

  If you are a team that wants to consider migrating from MySQL to
Cassandra, you need some reason to believe that it is going to be faster.
What evidence is there?

  Even the Cassandra home page has references to benchmarks to make the
case for Cassandra. Unfortunately, they seem to be about five to six years
old. It doesn't make sense to keep them there if you just can't compare.

     Best, Oliver




-- 
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal
<http://www.type2diabetesgenetics.org/>, Broad Institute
<http://www.broadinstitute.org/>