Posted to user@cassandra.apache.org by Ryan Lowe <ry...@gmail.com> on 2011/08/28 20:47:09 UTC

Scaling Out / Replication Factor too?

We are working on a system that has super heavy traffic during specific
times... think of sporting events.  Other times we will get almost 0
traffic.  In order to handle the traffic during the events, we are planning
on scaling out Cassandra into a very large cluster.  The size of our data
is still quite small.  A single event's data might be 100MB in size max, but
we will be inserting that data very rapidly and needing to read it at the
same time.

Since traffic is very slow most of the time, we use a replication factor of 2
and a cluster size of 2 to handle it... and it handles that traffic perfectly.

Since dataset size is not really an issue, what is the best way for us to scale
out?  We are using an order preserving partitioner to do range scanning, so the
last time I tried to scale out our cluster we ended up with a very uneven
load.  The few nodes that contained the data were swamped, while
the rest were barely touched.

One other note: since we have very little data and lots of memory, we
turned the key and row caches up almost as high as they could go.

So my question is this... if I bring in 20+ nodes, should I increase the
replication factor as well?  It would seem to make sense that a higher
replication factor would help distribute load, or does it just mean that
writes take even longer?  What are some other suggestions on how to scale
up (and then back down) for a system that gets very high traffic in known,
small time windows?



Let me know if you need more info.

Thanks!
Ryan

Re: Scaling Out / Replication Factor too?

Posted by Ryan Lowe <ry...@gmail.com>.
I meant to also add that we do not necessarily care if reads are
somewhat stale... if two people reading from the cluster at the same time
get different results (say within a 5 minute window), that is acceptable.
Performance is the key thing.
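
(For what it's worth, that tolerance maps onto the usual replica-count rule: a
read is only guaranteed to see the latest write when the replicas read plus the
replicas written exceed the replication factor. A minimal sketch with made-up
consistency choices:)

    # R = replicas a read waits for, W = replicas a write waits for,
    # RF = replication factor. Reads are only guaranteed fresh if R + W > RF.
    def read_can_be_stale(r, w, rf):
        return r + w <= rf

    # RF=2 with reads and writes at CL.ONE (performance-first): stale reads
    # are possible, which is acceptable here.
    print(read_can_be_stale(r=1, w=1, rf=2))   # True
    # RF=3 with QUORUM reads and writes: reads always see the latest write.
    print(read_can_be_stale(r=2, w=2, rf=3))   # False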

Ryan


Re: Scaling Out / Replication Factor too?

Posted by Boris Yen <yu...@gmail.com>.
I am not sure, but I think the problem might be the order preserving
partitioner you are using. With an order preserving partitioner the data
can be skewed, meaning most of the data stays on only a few servers, which
can leave you with a few heavily loaded nodes.
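
(To make the skew concrete, here is a rough sketch, not actual partitioner
code, using made-up event keys and a made-up four-node ring: with an order
preserving partitioner the token is essentially the key itself, so keys that
share a prefix land on the same node, while hashing the key spreads them out.)

    import hashlib

    # Hypothetical row keys for a single event; the names are made up.
    keys = ["event42:play:%05d" % i for i in range(8)]

    # Made-up lexical boundaries for a four-node ring.
    boundaries = [("node1", "d"), ("node2", "h"), ("node3", "p"), ("node4", "~")]

    def node_for_ordered(key, ring):
        # Order preserving: the token is the key, so pick the first node
        # whose upper boundary sorts at or after the key.
        for node, upper in ring:
            if key <= upper:
                return node
        return ring[-1][0]

    def node_for_hashed(key, num_nodes):
        # Random-partitioner style: hash the key first, then place it.
        token = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return "node%d" % (token % num_nodes + 1)

    for k in keys:
        print(k, node_for_ordered(k, boundaries), node_for_hashed(k, 4))
    # Ordered: every key starts with "event42:", so they all land on node2.
    # Hashed: the same keys scatter across the ring.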


Re: Scaling Out / Replication Factor too?

Posted by Ryan Lowe <ry...@gmail.com>.
Edward,

This information (and the presentation) was very helpful... but I still have
a few more questions.

During a test run, I brought up 16 servers with an RF of 2 and a read repair
chance of 1.0.  However, like I mentioned, the load was only on a few
servers.  I attempted to increase the key and row caching to completely
cover our entire dataset, but CPU load on those few servers was still
extremely high.  Does the cache exist on every server in the cluster?  Would
turning off read repair (or turning it down dramatically) help reduce the
load on the heavily loaded servers?


Thanks!
Ryan



Re: Scaling Out / Replication Factor too?

Posted by Edward Capriolo <ed...@gmail.com>.
> So my question is this... if I bring in 20+ nodes, should I increase the
> replication factor as well?

Each write is done to all natural endpoints of the data. If you set the
replication factor equal to the number of nodes, write operations do not
scale, because each write has to happen on every node. The same thing is
true with read operations: even if you read at CL.ONE, read repair will
perform a read on all replicas.
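
(Back-of-the-envelope arithmetic, with a made-up write rate, shows why: each
node sees roughly total_writes * RF / N writes per second, so when RF equals N
every node still takes every write.)

    # Average per-node write rate: each write is applied on RF replicas.
    def per_node_writes(total_writes, rf, num_nodes):
        return total_writes * rf / float(num_nodes)

    total = 50000  # hypothetical writes/sec during an event

    # RF fixed at 3: adding nodes spreads the write load.
    print(per_node_writes(total, rf=3, num_nodes=4))    # 37500.0
    print(per_node_writes(total, rf=3, num_nodes=20))   # 7500.0

    # RF raised to match the cluster size: every node handles every write,
    # so adding nodes does not reduce the per-node write load at all.
    print(per_node_writes(total, rf=20, num_nodes=20))  # 50000.0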

* However, there is one caveat to this advice, which I covered in a
presentation I did:

http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache

The read_repair_chance setting controls how often a read repair is triggered.
You can increase the replication factor and lower read_repair_chance. This
gives you many servers capable of serving the same read without the burden of
repair reads across the other 19 nodes.
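
(A rough model of that trade-off, ignoring the digest-versus-data details of
read repair, is the expected number of replicas each CL.ONE read touches:
1 + read_repair_chance * (RF - 1).)

    # A CL.ONE read always touches one replica; with probability
    # read_repair_chance it also reads the remaining RF - 1 replicas.
    def expected_replicas_read(rf, read_repair_chance):
        return 1 + read_repair_chance * (rf - 1)

    # RF=20 with read repair always on: every read fans out to all 20 nodes.
    print(expected_replicas_read(rf=20, read_repair_chance=1.0))   # 20.0
    # RF=20 with read repair mostly off: reads stay cheap, and any of the
    # 20 replicas can serve them (the memcache-style setup from the slides).
    print(expected_replicas_read(rf=20, read_repair_chance=0.05))  # 1.95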

However this is NOT the standard method to scale out. The standard, and
probably better way in all but a few instances, is to leave the replication
factor alone and add more nodes.

Normally, people set the replication factor to 3. This gives you 3 nodes to
serve reads, and as long as the dataset is small, which is true in your case,
those reads are heavily cached. You would need a very high number of
reads/writes to bottleneck any node.

Raising and lowering the replication factor is not the way to go: changing the
replication factor involves more steps than just changing the variable, as
does growing and shrinking the cluster.

What to do about idling servers is another question. We have thought about
having our idle web servers join our Hadoop cluster at night and then
leave again in the morning :) Maybe you can have some fun with your
Cassandra gear in its idle time.

