Posted to user@cassandra.apache.org by cass savy <ca...@gmail.com> on 2016/11/03 23:22:37 UTC

Cassandra on Cloud platforms experience

I would like to hear from the community about their experiences or lessons
learned hosting Cassandra on cloud platforms such as:

1. Google Cloud Platform
2. AWS
3. Azure

1. Which cloud host is better, and why?
2. How does C* differ from vendor-provided NoSQL databases (Bigtable,
DynamoDB, Azure DocumentDB)?
3. From what I have investigated so far, AWS is more mature in its offerings.
Is Azure getting there, or is it there already?

4. What drives the choice of one over another: cost, infrastructure,
hardware SKU, availability, scalability, performance, ease of deployment and
maintenance, etc.?

Please let me know your thoughts and suggestions if you have done a
deep dive into these three cloud platforms for C*.


We use DataStax Cassandra and are exploring new use cases in AWS, and are also
evaluating it, or running a POC, in Azure/GCP.

Re: Cassandra on Cloud platforms experience

Posted by Oskar Kjellin <os...@gmail.com>.
Also, if you're running it on Azure, you should have a look at Netflix Priam. That really helped us automate things.

DynamoDB works well if you don't want to spend hours running your cluster. If I had the option to run on AWS, I would use DynamoDB when possible.

> On 4 nov. 2016, at 10:10, Oskar Kjellin <os...@gmail.com> wrote:
> 
> So I've run Cassandra on both AWS and Azure. I would strongly suggest that, if you have the option, you run as far away from Azure as you can.
> 
> Here's a list of issues I have had running Cassandra on Azure:
> 1. No native snitch.
> 2. No concept of availability zones. This makes it impossible for Cassandra to put replicas in different AZs, which will hurt your uptime and might cause data loss. (They do have something called a fault domain, though.)
> 3. The disks have IOPS in the floppy-disk range.
> 4. Even running SSDs will give you poor performance.
> 5. Beware of the global storage account limit. Scaling out hurts performance if you put the nodes on the same storage account, which is your only choice if you're using images.

Re: Cassandra on Cloud platforms experience

Posted by Oskar Kjellin <os...@gmail.com>.
It is good enough if you do not care about automation.
If you care about automation (and you really should), GossipingPropertyFileSnitch
is not good enough, as it has to be updated manually. The same goes for
availability sets: the fault domain isn't available to the instance as
metadata, as the availability zone is on AWS.
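
A minimal sketch of what automating that manual step could look like: GossipingPropertyFileSnitch reads each node's datacenter and rack from cassandra-rackdc.properties (keys dc and rack), so a provisioning script can render that file per node. How the fault domain number is obtained is left open here, since it is not available from instance metadata. The function name, the "fd" rack prefix and the example datacenter name are assumptions for illustration only.

```python
def render_rackdc(datacenter: str, fault_domain: int) -> str:
    """Return the text of a cassandra-rackdc.properties file.

    Hypothetical helper: the caller must supply the fault domain
    (for example from the deployment tooling), because the instance
    cannot look it up itself.
    """
    if not datacenter:
        raise ValueError("datacenter must be a non-empty string")
    if fault_domain < 0:
        raise ValueError("fault_domain must be non-negative")
    # Assumption: treat each Azure fault domain as one Cassandra rack.
    return f"dc={datacenter}\nrack=fd{fault_domain}\n"


if __name__ == "__main__":
    # "azure-west" is a made-up datacenter name.
    print(render_rackdc("azure-west", 2), end="")
```

Mapping each fault domain to its own rack is one possible choice; it lets rack-aware replica placement spread copies across fault domains.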

You still have a global limit on storage account IOPS. So even though you
might get the throughput you need with a few servers, adding more means
copying the VHD to multiple storage accounts, which once again makes
automation cumbersome.
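
A rough sketch of the arithmetic behind that limit, assuming every data disk may be driven at its full IOPS at the same time. The numbers in the example (20,000 IOPS per storage account, 500 IOPS per disk) are illustrative assumptions, not figures from this thread; check the provider's current limits before relying on them.

```python
import math


def storage_accounts_needed(nodes: int, disks_per_node: int,
                            per_disk_iops: int,
                            account_iops_limit: int) -> int:
    """Estimate how many storage accounts a cluster needs so that the
    combined disk IOPS never exceeds the per-account limit."""
    if min(nodes, disks_per_node, per_disk_iops, account_iops_limit) < 1:
        raise ValueError("all arguments must be positive")
    # Number of disks one account can serve at full speed.
    disks_per_account = account_iops_limit // per_disk_iops
    if disks_per_account < 1:
        raise ValueError("a single disk already exceeds the account limit")
    total_disks = nodes * disks_per_node
    return math.ceil(total_disks / disks_per_account)


if __name__ == "__main__":
    # Assumed figures: 30 nodes with 2 disks each.
    print(storage_accounts_needed(30, 2, 500, 20_000))
```

Under these assumed figures one account serves 40 disks, so growing from 20 to 30 two-disk nodes already needs a second account, and the VHD image has to be copied there.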

2016-11-04 11:29 GMT+01:00 Vladimir Yudovin <vl...@winguzone.com>:

> Hi,
>
> >1. No native snitch
> It's not a big problem. GossipingPropertyFileSnitch is good enough.
>
> >2. No concept of availability zones.
> Azure does have such a concept: the Availability Set.
> <https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-windows-infrastructure-availability-sets-guidelines/>
> It provides three fault domains (availability zones in Amazon terms) and 20
> update domains.
>
> >4. Even running SSDs will give you poor performance.
> It depends on disk size. A 1 TB SSD provides 5000 IOPS.
>
>
> So in short:
> Amazon - provides data-at-rest encryption, flexible EBS storage (or local
> disks), availability zones.
> Azure - provides data-at-rest encryption, less flexible storage (or local
> disks), availability zones.
> SoftLayer - no data encryption, but they have a unique feature:
> connectivity between different data centers (they call this VLAN spanning)
> without the need for a VPN or other tunneling. They don't have explicit
> availability zones, but you can put nodes in different DCs in the same
> region (some locations) with relatively low latency (1-1.5 ms), or purchase
> another VLAN in a different pod in the same DC for $25 per month.
>
> We provide Cassandra clusters on all of these providers in many locations
> worldwide.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra
> Launch your cluster in minutes.*

Re: Cassandra on Cloud platforms experience

Posted by Vladimir Yudovin <vl...@winguzone.com>.
Hi,

>1. No native snitch
It's not a big problem. GossipingPropertyFileSnitch is good enough.

>2. No concept of availability zones.
Azure does have such a concept: the Availability Set. It provides three fault domains (availability zones in Amazon terms) and 20 update domains.

>4. Even running SSDs will give you poor performance.
It depends on disk size. A 1 TB SSD provides 5000 IOPS.

So in short:

Amazon - provides data-at-rest encryption, flexible EBS storage (or local disks), availability zones.

Azure - provides data-at-rest encryption, less flexible storage (or local disks), availability zones.

SoftLayer - no data encryption, but they have a unique feature: connectivity between different data centers (they call this VLAN spanning) without the need for a VPN or other tunneling. They don't have explicit availability zones, but you can put nodes in different DCs in the same region (some locations) with relatively low latency (1-1.5 ms), or purchase another VLAN in a different pod in the same DC for $25 per month.

We provide Cassandra clusters on all of these providers in many locations worldwide.

Best regards, Vladimir Yudovin,

Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.


Re: Cassandra on Cloud platforms experience

Posted by Oskar Kjellin <os...@gmail.com>.
So I've run Cassandra on both AWS and Azure. I would strongly suggest that, if you have the option, you run as far away from Azure as you can.

Here's a list of issues I have had running Cassandra on Azure:
1. No native snitch.
2. No concept of availability zones. This makes it impossible for Cassandra to put replicas in different AZs, which will hurt your uptime and might cause data loss. (They do have something called a fault domain, though.)
3. The disks have IOPS in the floppy-disk range.
4. Even running SSDs will give you poor performance.
5. Beware of the global storage account limit. Scaling out hurts performance if you put the nodes on the same storage account, which is your only choice if you're using images.
