Posted to user@cassandra.apache.org by "Hiller, Dean" <De...@nrel.gov> on 2013/02/15 21:05:55 UTC

cassandra vs. mongodb quick question

So I found out MongoDB varies its node size from 1T to 42T per node depending on the profile.  So if I was going to be writing a lot but rarely changing rows, could I also use Cassandra with a per-node size of 20T+ or is that not advisable?

Thanks,
Dean

RE: cassandra vs. mongodb quick question (good additional info)

Posted by Kanwar Sangha <ka...@mavenir.com>.
“The limiting factors are the time it takes to repair, the time it takes to replace a node, the memory considerations for 100's of millions of rows. If the performance of those operations is acceptable to you, then go crazy”



If I have a node attached to a RAID and the node crashes but the data is still good on the drives, would it not just mean bringing up the node using the same storage?  Would this not be fast…?

Re: cassandra vs. mongodb quick question (good additional info)

Posted by Edward Capriolo <ed...@gmail.com>.
The theoretical maximum of 10G is not even close to what you actually get.

http://download.intel.com/support/network/sb/fedexcasestudyfinal.pdf


Re: cassandra vs. mongodb quick question (good additional info)

Posted by aaron morton <aa...@thelastpickle.com>.
If you are lazy like me, Wolfram Alpha can help:

http://www.wolframalpha.com/input/?i=transfer+42TB+at+10GbE&a=UnitClash_*TB.*Tebibytes--

10 hours 15 minutes 43.59 seconds
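
The same figure falls out of a quick script (a sketch, reading the 42TB as tebibytes, as the link above does, and assuming the full 10 Gb/s line rate):

    # Transfer time for 42 TiB over 10 GbE (line rate, no protocol overhead)
    size_bytes = 42 * 1024**4            # 42 TiB
    rate_bytes_per_s = 10e9 / 8          # 10 Gb/s = 1.25 GB/s
    seconds = size_bytes / rate_bytes_per_s
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    print(f"{int(hours)}h {int(minutes)}m {secs:.2f}s")  # -> 10h 15m 43.59s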

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/02/2013, at 11:31 AM, Wojciech Meler <wo...@gmail.com> wrote:

> you have 86400 seconds a day so 42T could take less than 12 hours on 10Gb link

Re: cassandra vs. mongodb quick question (good additional info)

Posted by Wojciech Meler <wo...@gmail.com>.
You have 86400 seconds in a day, so 42T could take less than 12 hours on a 10Gb link.

On 19 Feb 2013 02:01, "Hiller, Dean" <De...@nrel.gov> wrote:

> I thought about this more, and even with a 10Gbit network, it would take
> 40 days to bring up a replacement node if mongodb did truly have a 42T /
> node like I had heard.  I wrote the below email to the person I heard this
> from going back to basics which really puts some perspective on it….(and a
> lot of people don't even have a 10Gbit network like we do)

Re: cassandra vs. mongodb quick question (good additional info)

Posted by Edward Capriolo <ed...@gmail.com>.
The 40 TB use case you heard about is probably one 40TB MySQL machine that someone migrated to mongo so it would be "web scale".  Cassandra is NOT good with drives that big; get a blade center or a high-density chassis.


Re: cassandra vs. mongodb quick question (good additional info)

Posted by Edward Capriolo <ed...@gmail.com>.
Write once and compact is generally a bad fit for very large datasets.  It is like being able to jump 60 feet in the air, but your legs cannot withstand 10-foot drops.

http://wiki.apache.org/cassandra/LargeDataSetConsiderations




Re: cassandra vs. mongodb quick question (good additional info)

Posted by Bryan Talbot <bt...@aeriagames.com>.
There seem to be some data structures in Cassandra which scale with the number of rows stored and consume in-JVM memory without bound (other than the number of rows).  Even with 1.2, I think that index samples are still kept in-JVM, so you may need to tune index_interval.  Unfortunately that is a global value, so it will affect all CFs and not just the big ones that need it to be different.

There may be other issues (like during compaction) but that one pops out.  Prior to 1.2, bloom filters would be a big problem too.
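
As a rough illustration of how the sample count scales with rows (a sketch only; the per-sample byte cost is an assumed figure, not a measured one):

    # Back-of-envelope: in-JVM index sample memory vs. row count
    rows = 500_000_000       # "100's of millions of rows"
    index_interval = 128     # cassandra.yaml default; raising it keeps fewer samples
    bytes_per_sample = 50    # assumed: sampled key bytes plus JVM object overhead

    samples = rows // index_interval
    print(f"{samples:,} samples, roughly {samples * bytes_per_sample / 1024**2:.0f} MB on heap")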

-Bryan




Re: cassandra vs. mongodb quick question (good additional info)

Posted by "Hiller, Dean" <De...@nrel.gov>.
Heh, we just discovered that mistake a few minutes ago….thanks though.  I am now wondering, and may run a separate 6-node test cluster to see how compaction behaves on very large data sets.  We have tons of research data that just sits there, so I am wondering if 20T / node is now feasible with Cassandra (I mean, if MongoDB handles 42T like 10gen was telling my colleague, I would think we can with Cassandra).

Are there any reasons I should know up front that 20T per node won't work?  We have 20 disks per node, and this definitely has a different profile than previous Cassandra systems I have set up.  We don't really need any caching, as disk access is typically fine on reads.

Thanks,
Dean



Re: cassandra vs. mongodb quick question (good additional info)

Posted by Bryan Talbot <bt...@aeriagames.com>.
This calculation is incorrect btw.  10,000 GB transferred at 1.25 GB/sec would complete in about 8,000 seconds, which is just 2.2 hours and not 5.5 days.  The error is in the conversion (1hr/60secs), which is off by a factor of 60 since (1hr/3600secs) is the correct conversion.

-Bryan
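
A minimal sketch of the corrected arithmetic (assuming the full 1.25 GB/s is available, and half of it for the 50% case discussed earlier):

    # Corrected transfer time for 10,000 GB over 10 GbE
    size_gb = 10_000
    rate_gb_per_s = 1.25                   # 10 Gb/s = 1.25 GB/s

    full = size_gb / rate_gb_per_s         # seconds at full line rate
    half = size_gb / (rate_gb_per_s / 2)   # seconds if only 50% of the link is usable

    print(f"full link: {full:.0f} s = {full / 3600:.1f} h")  # full link: 8000 s = 2.2 h
    print(f"50% link:  {half:.0f} s = {half / 3600:.1f} h")  # 50% link:  16000 s = 4.4 h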



Re: cassandra vs. mongodb quick question (good additional info)

Posted by "Hiller, Dean" <De...@nrel.gov>.
I thought about this more, and even with a 10Gbit network, it would take 40 days to bring up a replacement node if MongoDB did truly have 42T / node like I had heard.  I wrote the below email to the person I heard this from, going back to basics, which really puts some perspective on it….(and a lot of people don't even have a 10Gbit network like we do)

Nodes are hooked up by at most a 10G network right now, where that is 10 gigabit.  We have been talking about 10 Terabytes on disk per node recently.

Googling "10 gigabit in gigabytes" gives me 1.25 gigabytes/second (yes, I could have divided by 8 in my head, but eh…of course when I saw the number, I went duh).

So trying to transfer 10 Terabytes, or 10,000 Gigabytes, to a node that we are bringing online to replace a dead node would take approximately 5 days???

This assumes no one else is using the bandwidth too ;).  10,000 Gigabytes * (1 sec/1.25 GB) * (1 hr/60 secs) * (1 day/24 hrs) = 5.555555 days.  This is more likely 11 days if we only use 50% of the network.

So bringing a new node up to speed is more like 11 days once it has crashed.  I think this is the main reason the 1 Terabyte guideline exists to begin with, right?

From an ops perspective, this could sound like a nightmare scenario of waiting 10 days…..maybe it is livable though.  Either way, I thought it would be good to share the numbers.  ALSO, that is assuming the bus with its 10 disks can keep up with 10G????  Can it?  What is the limit of throughput on a bus per second on the computers we have, as on Wikipedia there is a huge variance?

What is the rate of the disks too (multiplied by 10 of course)?  Will they keep up with a 10G rate for bringing a new node online?

This all comes into play even more when you want to double the size of your cluster, of course, as all nodes have to transfer half of what they have to the new nodes that come online (Cassandra actually has a very data center/rack aware topology to transfer data correctly and not use up all bandwidth unnecessarily…I am not sure MongoDB has that).  Anyways, just food for thought.
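
For what it's worth, a rough aggregate estimate for the disk question above suggests "barely" (a sketch; the 120 MB/s sequential figure is a typical spinning-disk assumption, not a measurement of any particular hardware):

    # Can 10 spinning disks in aggregate keep up with a 10 GbE link?
    disks = 10
    mb_per_s_per_disk = 120          # assumed sequential throughput per spinning disk
    link_gb_per_s = 10 / 8           # 10 Gb/s = 1.25 GB/s

    aggregate_gb_per_s = disks * mb_per_s_per_disk / 1000
    print(f"disks: {aggregate_gb_per_s:.2f} GB/s vs link: {link_gb_per_s:.2f} GB/s")
    # -> disks: 1.20 GB/s vs link: 1.25 GB/s; roughly at parity before bus,
    #    RAID controller and seek overheads are counted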


Re: cassandra vs. mongodb quick question

Posted by aaron morton <aa...@thelastpickle.com>.
My experience is that repair of 300GB of compressed data takes longer than 300GB of uncompressed, but I cannot point to an exact number. Calculating the differences is mostly CPU bound and works on the non-compressed data.

Streaming uses compression (after uncompressing the on-disk data).

So if you have 300GB of compressed data, take a look at how long repair takes and see if you are comfortable with that. You may also want to test replacing a node so you can get the procedure documented and understand how long it takes.

The idea of the soft 300GB to 500GB limit came about because of a number of cases where people had 1 TB on a single node and they were surprised it took days to repair or replace. If you know how long things may take, and that fits in your operations, then go with it.
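
A minimal way to put a number on that (a sketch; it assumes nodetool is on the PATH and uses an illustrative keyspace name):

    # Time a full repair run so its duration becomes a known, documented quantity
    import subprocess
    import time

    start = time.time()
    subprocess.run(["nodetool", "repair", "my_keyspace"], check=True)  # keyspace is illustrative
    print(f"repair took {(time.time() - start) / 3600:.1f} hours")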

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/02/2013, at 10:08 PM, Vegard Berget <po...@fantasista.no> wrote:

>  
> Just out of curiosity :
> 
> When using compression, does this affect this one way or another?  Is 300G (compressed) SSTable size, or total size of data?   
> 
> .vegard,
> 


Re: cassandra vs. mongodb quick question

Posted by Vegard Berget <po...@fantasista.no>.
 

Just out of curiosity:

When using compression, does this affect this one way or another?  Is 300G (compressed) SSTable size, or total size of data?

.vegard,


Re: cassandra vs. mongodb quick question

Posted by aaron morton <aa...@thelastpickle.com>.
If you have spinning disk and 1G networking and no virtual nodes, I would still say 300G to 500G is a soft limit.

If you are using virtual nodes, SSD, a JBOD disk configuration or faster networking, you may go higher.

The limiting factors are the time it takes to repair, the time it takes to replace a node, and the memory considerations for 100's of millions of rows. If the performance of those operations is acceptable to you, then go crazy.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/02/2013, at 9:05 AM, "Hiller, Dean" <De...@nrel.gov> wrote:

> So I found out MongoDB varies its node size from 1T to 42T per node depending on the profile.  So if I was going to be writing a lot but rarely changing rows, could I also use Cassandra with a per-node size of 20T+ or is that not advisable?
> 
> Thanks,
> Dean