Posted to user@cassandra.apache.org by Riyad Kalla <rk...@gmail.com> on 2011/11/07 06:50:30 UTC

Will writes with < ALL consistency eventually propagate?

I am new to Cassandra and was curious about the following scenario...

Let's say I have a ring of 5 servers. Ultimately I would like each server to be a full replica of the next (master-master-*). 

In a presentation I watched today on Cassandra, the presenter mentioned that the ring members will shard data and route your requests to the right host when they come into a server that doesn't physically contain the value you wanted. To the requesting client this is seamless except for the added latency.

If I wanted to avoid the routing and latency and ensure every server had the full data set, do I have to write with a consistency level of ALL and wait for all of those writes to return in my code, or can I write with a CL of 1 or 2 and let the ring propagate the rest of the copies to the other servers in the background after my code has continued executing?

I don't mind eventual consistency in my case, but I do (eventually) want all nodes to have all values, and cannot tell if this is the default behavior, or if sharding is the default and I can only force duplicates onto the other servers explicitly with a CL of ALL.

Best,
Riyad

Re: Will writes with < ALL consistency eventually propagate?

Posted by Anthony Ikeda <an...@gmail.com>.
It's your replication factor that determines how many nodes contain the data. So you would set the replication factor to 5 to ensure all nodes contain the data. 

Your consistency level is all about when the server should return to the client after writing. When one node has written the data (ConsistencyLevel.ONE)? Or when all nodes have been written to (ConsistencyLevel.ALL)?

Just remember that if your replication factor is set so that all nodes have a copy, you may not be able to survive the loss of a single node. 

Sent from my iPhone
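
A minimal sketch of that RF/CL split, using the Python cassandra-driver (a client library that postdates this thread; the keyspace, table, and address here are invented for illustration):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Connect through any node in the ring; it coordinates on our behalf.
    session = Cluster(["192.0.2.1"]).connect()

    # RF is a property of the keyspace: every row is stored on 5 nodes.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 5}
    """)
    session.execute("CREATE TABLE IF NOT EXISTS demo.kv (k text PRIMARY KEY, v text)")

    # CL is a property of each request: return after one replica acks...
    fast = SimpleStatement("INSERT INTO demo.kv (k, v) VALUES (%s, %s)",
                           consistency_level=ConsistencyLevel.ONE)
    # ...or block until all 5 replicas have acked.
    safe = SimpleStatement("INSERT INTO demo.kv (k, v) VALUES (%s, %s)",
                           consistency_level=ConsistencyLevel.ALL)
    session.execute(fast, ("key1", "value1"))
    session.execute(safe, ("key2", "value2"))

Either way the write eventually reaches all 5 replicas; the CL only controls how many acknowledgements the client waits for.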


Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Ahh, I see your point.

Thanks for the help Stephen.

On Mon, Nov 7, 2011 at 12:43 PM, Stephen Connolly <
stephen.alan.connolly@gmail.com> wrote:

> at that point, your cluster will either have so much data on each node
> that you will need to split them, keeping rf=5 so you have 10 nodes... or
> the intra cluster traffic will swamp you and you will split each node
> keeping rf=5 so you have 10 nodes again.
>
> safest thing is not to design with the assumption that rf=n
>
> - Stephen
>
> ---
> Sent from my Android phone, so random spelling mistakes, random nonsense
> words and other nonsense are a direct result of using swype to type on the
> screen
> On 7 Nov 2011 17:47, "Riyad Kalla" <rk...@gmail.com> wrote:
>
>> Stephen,
>>
>> I appreciate you making the point more strongly; I won't make this
>> decision lightly given the stress you are putting on it, but the technical
>> aspects of this make me curious...
>>
>> If I start with RF=N (number of nodes) now, and in 2 years
>> (hypothetically) my dataset is too large and I say to myself "Dangit,
>> Stephen was right...", couldn't I just change the RF to some smaller value,
>> say "3" at that point or would the Cassandra ring not rebalance the data
>> set nicely at that point?
>>
>> More specifically, would it not know how best to slowly remove extraneous
>> copies from the nodes and make the data more sparse among the ring members?
>>
>> Thanks for the hand-holding; it is helping me understand the operational
>> landscape quickly.
>>
>> -R
>>
>> On Mon, Nov 7, 2011 at 10:18 AM, Stephen Connolly <
>> stephen.alan.connolly@gmail.com> wrote:
>>
>>> Plan for the future....
>>>
>>> At some point your data set will become too big for the node that it
>>> is running on, or your load will force you to split nodes.... once you
>>> do that RF < N
>>>
>>> To solve performance issues with C* the solution is add more nodes
>>>
>>> To solve storage issues with C* the solution is add more nodes
>>>
>>> In most cases the solution in C* is add more nodes.
>>>
>>> Don't assume RF=Number of nodes as a core design decision of your
>>> application and you will not have your ass bitten
>>>
>>> ;-)
>>>
>>> -Stephen
>>> P.S. making the point more extreme to make it clear
>>>
>>> On 7 November 2011 15:04, Riyad Kalla <rk...@gmail.com> wrote:
>>> > Stephen,
>>> > Excellent breakdown; I appreciate all the detail.
>>> > Your last comment about RF being smaller than N (number of nodes) --
>>> in my
>>> > particular case my data set isn't particularly large (a few GB) and is
>>> > distributed globally across a handful of data centers. What I am
>>> utilizing
>>> > Cassandra for is the replication in order to minimize latency for
>>> requests.
>>> > So when a request comes into any location, I want each node in the
>>> ring to
>>> > contain the full data set so it never needs to defer to another member
>>> of
>>> > the ring to answer a question (even if this means eventual
>>> consistency,
>>> > that is alright in my case).
>>> > Given that, the way I've understood this discussion so far is I would
>>> have a
>>> > RF of N (my total node count) but my Consistency Level with all my
>>> writes
>>> > will *likely* be QUORUM -- I think that is a good/safe default for me
>>> to use
>>> > as writes aren't the scenario I need to optimize for latency; that
>>> being
>>> > said, I also don't want to wait for a ConsistencyLevel of ALL to
>>> complete
>>> > before my code continues though.
>>> > Would you agree with this assessment or am I missing the boat on
>>> something?
>>> > Best,
>>> > Riyad
>>> >
>>> > On Mon, Nov 7, 2011 at 7:42 AM, Stephen Connolly
>>> > <st...@gmail.com> wrote:
>>> >>
>>> >> Consistency Level is a pseudo-enum...
>>> >>
>>> >> you have the choice between
>>> >>
>>> >> ONE
>>> >> Quorum (and there are different types of this)
>>> >> ALL
>>> >>
>>> >> At CL=ONE, only one node is guaranteed to have got the write if the
>>> >> operation is a success.
>>> >> At CL=ALL, all nodes that the RF says it should be stored at must
>>> >> confirm the write before the operation succeeds, but a partial write
>>> >> will succeed eventually if at least one node recorded the write
>>> >> At CL=QUORUM, at least ((N/2)+1) nodes must confirm the write for the
>>> >> operation to succeed, otherwise failure, but a partial write will
>>> >> succeed eventually if at least one node recorded the write.
>>> >>
>>> >> Read repair will eventually ensure that the write is replicated across
>>> >> all RF nodes in the cluster.
>>> >>
>>> >> The N in QUORUM above depends on the type of QUORUM you choose, in
>>> >> general think N=RF unless you choose a fancy QUORUM.
>>> >>
>>> >> To have a consistent read, CL of write + CL of read must be > RF...
>>> >>
>>> >> Write at ONE, read at ONE => may not get the most recent write if RF >
>>> >> 1 [fastest write, fastest read] {data loss possible if node lost
>>> >> before read repair}
>>> >> Write at QUORUM, read at ONE => consistent read [moderate write,
>>> >> fastest read] {multiple nodes must be lost for data loss to be
>>> >> possible}
>>> >> Write at ALL, read at ONE => consistent read, writes may be blocked if
>>> >> any node fails [slowest write, fastest read]
>>> >>
>>> >> Write at ONE, read at QUORUM => may not get the most recent write if
>>> >> RF > 2 [fastest write, moderate read]  {data loss possible if node
>>> >> lost before read repair}
>>> >> Write at QUORUM, read at QUORUM => consistent read [moderate write,
>>> >> moderate read] {multiple nodes must be lost for data loss to be
>>> >> possible}
>>> >> Write at ALL, read at QUORUM => consistent read, writes may be blocked
>>> >> if any node fails [slowest write, moderate read]
>>> >>
>>> >> Write at ONE, read at ALL => consistent read, reads may fail if any
>>> >> node fails [fastest write, slowest read] {data loss possible if node
>>> >> lost before read repair}
>>> >> Write at QUORUM, read at ALL => consistent read, reads may fail if any
>>> >> node fails [moderate write, slowest read] {multiple nodes must be lost
>>> >> for data loss to be possible}
>>> >> Write at ALL, read at ALL => consistent read, writes may be blocked if
>>> >> any node fails, reads may fail if any node fails [slowest write,
>>> >> slowest read]
>>> >>
>>> >> Note: You can choose the CL for each and every operation. This is
>>> >> something that you should design into your application (unless you
>>> >> exclusively use QUORUM for all operations, in which case you are
>>> >> advised to bake the logic in, but it is less necessary)
>>> >>
>>> >> The other thing to remember is that RF does not have to equal the
>>> >> number of nodes in your cluster... in fact I would recommend designing
>>> >> your app on the basis that RF < number of nodes in your cluster...
>>> >> because at some point, when your data set grows big enough, you will
>>> >> end up with RF < number of nodes.
>>> >>
>>> >> -Stephen
>>> >>
>>> >> On 7 November 2011 13:03, Riyad Kalla <rk...@gmail.com> wrote:
>>> >> > Ah! Ok I was interpreting what you were saying to mean that if my
>>> RF was
>>> >> > too
>>> >> > high, then the ring would die if I lost one.
>>> >> > Ultimately what I want (I think) is:
>>> >> > Replication Factor: 5 (aka "all of my nodes")
>>> >> > Consistency Level: 2
>>> >> > Put another way, when I write a value, I want it to exist on two
>>> servers
>>> >> > *at
>>> >> > least* before I consider that write "successful" enough for my code
>>> to
>>> >> > continue, but in the background I would like Cassandra to keep
>>> copying
>>> >> > that
>>> >> > value around at its leisure until all the ring nodes know about it.
>>> >> > This sounds like what I need. Thanks for pointing me in the right
>>> >> > direction.
>>> >> > Best,
>>> >> > Riyad
>>> >> >
>>> >> > On Mon, Nov 7, 2011 at 5:47 AM, Anthony Ikeda
>>> >> > <an...@gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> Riyad, I'm also just getting to know the different settings and
>>> values
>>> >> >> myself :)
>>> >> >> I believe, and it also depends on your config, CL.ONE should
>>> ignore the
>>> >> >> loss of a node if your RF is 5, once you increase the CL then if
>>> you
>>> >> >> lose a
>>> >> >> node the CL is not met and you will get exceptions returned.
>>> >> >> Sent from my iPhone
>>> >> >> On 07/11/2011, at 4:32, Riyad Kalla <rk...@gmail.com> wrote:
>>> >> >>
>>> >> >> Anthony and Jaydeep, thank you for weighing in. I am glad to see
>>> that
>>> >> >> they
>>> >> >> are two different values (makes more sense mentally to me).
>>> >> >> Anthony, what you said caught my attention "to ensure all nodes
>>> have a
>>> >> >> copy you may not be able to survive the loss of a single node." --
>>> why
>>> >> >> would
>>> >> >> this be the case?
>>> >> >> I assumed (incorrectly?) that a node would simply disappear off
>>> the map
>>> >> >> until I could bring it back up again, at which point all the
>>> missing
>>> >> >> values
>>> >> >> that it didn't get while it was down, it would slowly retrieve from
>>> >> >> other
>>> >> >> members of the ring. Is this the wrong understanding?
>>> >> >> If forcing a replication factor equal to the number of nodes in my
>>> ring
>>> >> >> will cause a hard-stop when one ring member goes down (as I understood
>>> your
>>> >> >> comment
>>> >> >> to mean), it seems to me I should go with a much lower replication
>>> >> >> factor...
>>> >> >> something along the lines of 3 or roughly ceiling(N / 2) and just
>>> deal
>>> >> >> with
>>> >> >> the latency when one of the nodes has to route a request to another
>>> >> >> server
>>> >> >> when it doesn't contain the value.
>>> >> >> Is there a better way to accomplish what I want, or is keeping the
>>> >> >> replication factor that aggressively high generally a bad thing and
>>> >> >> using
>>> >> >> Cassandra in the "wrong" way?
>>> >> >> Thank you for the help.
>>> >> >> -Riyad
>>> >> >>
>>> >> >> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep
>>> >> >> <ch...@yahoo.co.in> wrote:
>>> >> >>>
>>> >> >>> Hi Riyad,
>>> >> >>> You can set replication = 5 (number of replicas) and write with
>>> CL =
>>> >> >>> ONE.
>>> >> >>> There is no hard requirement from Cassandra to write with CL=ALL
>>> to
>>> >> >>> replicate the data unless you need it. Considering your example,
>>> if
>>> >> >>> you
>>> >> >>> write with CL=ONE, it will still replicate your data to all 5
>>> >> >>> replicas
>>> >> >>> eventually.
>>> >> >>> Thank you,
>>> >> >>> Jaydeep
>>> >> >>
>>> >> >
>>> >> >
>>> >
>>> >
>>>
>>
>>
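
Stephen's matrix above boils down to the overlap rule he states (CL of write + CL of read > RF). A few lines of Python make the rule checkable; this is a sketch of the node-count arithmetic only, ignoring the fancier QUORUM variants:

    def replicas_required(cl, rf):
        """How many replicas must respond at a given consistency level."""
        return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]

    def read_is_consistent(write_cl, read_cl, rf):
        """True when the write and read replica sets are forced to overlap."""
        return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf

    # Reproduce a few rows of the matrix for RF=5:
    assert not read_is_consistent("ONE", "ONE", 5)    # 1 + 1 <= 5: may read stale data
    assert read_is_consistent("QUORUM", "QUORUM", 5)  # 3 + 3 > 5: consistent
    assert read_is_consistent("ONE", "ALL", 5)        # 1 + 5 > 5: consistent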

Re: Will writes with < ALL consistency eventually propagate?

Posted by Stephen Connolly <st...@gmail.com>.
at that point, your cluster will either have so much data on each node that
you will need to split them, keeping rf=5 so you have 10 nodes... or the
intra cluster traffic will swamp you and you will split each node keeping
rf=5 so you have 10 nodes again.

safest thing is not to design with the assumption that rf=n

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on the
screen

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Stephen,

I appreciate you making the point more strongly; I won't make this decision
lightly given the stress you are putting on it, but the technical aspects
of this make me curious...

If I start with RF=N (number of nodes) now, and in 2 years (hypothetically)
my dataset is too large and I say to myself "Dangit, Stephen was right...",
couldn't I just change the RF to some smaller value, say "3" at that point
or would the Cassandra ring not rebalance the data set nicely at that
point?

More specifically, would it not know how best to slowly remove extraneous
copies from the nodes and make the data more sparse among the ring members?

Thanks for the hand-holding; it is helping me understand the operational
landscape quickly.

-R
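
For what it's worth, the schema side of lowering RF later is a single statement (shown here with the later Python driver and an invented keyspace); the catch is operational, since each node keeps its extra copies until told otherwise:

    from cassandra.cluster import Cluster

    session = Cluster(["192.0.2.1"]).connect()

    # Drop the keyspace's replication factor from 5 to 3.
    session.execute("""
        ALTER KEYSPACE demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
    """)

    # The ALTER alone deletes nothing: afterwards run `nodetool cleanup`
    # on each node to remove the replicas it no longer owns under the
    # new factor.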


Re: Will writes with < ALL consistency eventually propagate?

Posted by Stephen Connolly <st...@gmail.com>.
Plan for the future....

At some point your data set will become too big for the node that it
is running on, or your load will force you to split nodes.... once you
do that RF < N

To solve performance issues with C* the solution is add more nodes

To solve storage issues with C* the solution is add more nodes

In most cases the solution in C* is add more nodes.

Don't assume RF=Number of nodes as a core design decision of your
application and you will not have your ass bitten

;-)

-Stephen
P.S. making the point more extreme to make it clear


Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Perfect, thank you Robert.


Re: Will writes with < ALL consistency eventually propagate?

Posted by Robert Jackson <ro...@promedicalinc.com>.

| I've not looked at setting up rings to replicate with each other
| before... is that process pretty well documented/explained or is
| this a black box that I am slowly wading into?

Take a look at: 

http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers 

Robert Jackson 

Re: Will writes with < ALL consistency eventually propagate?

Posted by Peter Schuller <pe...@infidyne.com>.
> handful of nodes that I write to with a CL of QUORUM (or thereabouts).

If your goal is to service reads w/o waiting for remote servers, you
probably would want to use LOCAL_QUORUM (quorum within a data center)
or ONE for reads. That however assumes an RF of >= 3 in each data
center (which means many copies in total if you have many data
centers).

Carefully plan how many copies you want in each DC, keeping in mind
that 3+ is required for LOCAL_QUORUM to be useful, and keeping in mind
that if you have only one copy per DC the latency to data will be
vastly increased whenever a node is down even if temporarily.

And consider whether you require that written data is guaranteed to be
visible globally upon subsequent read or not.

I don't know the situation, but I suspect some application logic on
top of Cassandra is useful for a CDN like situation. At least if you
have a lot of data and you care about cost efficiency. If it's okay to
just say RF=3 per DC (so e.g. 9 copies for 3 DCs) and e.g. use
LOCAL_QUORUM for reads, and use LOCAL_QUORUM against each DC for
writes (or else just having less strict consistency guarantees), then
that's easy. It just won't be cost-effective if you start having to
scale to lots and lots of data (or writes).


-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
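
A sketch of the layout Peter describes, again with the Python driver (the DC names, table, and address are invented): NetworkTopologyStrategy pins 3 copies in each named data center, and LOCAL_QUORUM then needs acks from only 2 of the 3 local replicas, so no cross-DC round trip sits on the read path.

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Connect through a node in the local data center.
    session = Cluster(["192.0.2.1"]).connect()

    # Three replicas per DC, as suggested above (9 copies total for 3 DCs).
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS cdn
        WITH replication = {'class': 'NetworkTopologyStrategy',
                            'us_east': 3, 'eu_west': 3, 'ap_south': 3}
    """)
    session.execute("CREATE TABLE IF NOT EXISTS cdn.assets (k text PRIMARY KEY, v text)")

    # Quorum among the local DC's replicas only (2 of 3 here).
    read = SimpleStatement("SELECT v FROM cdn.assets WHERE k = %s",
                           consistency_level=ConsistencyLevel.LOCAL_QUORUM)
    row = session.execute(read, ("asset-123",)).one()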

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Peter,

It sounds like what I might want to deploy is a ring per data center in this
case, and have each data center replicate to one another (to ensure they all
have full copies of the data), but inside each data-center-specific ring have
a handful of nodes that I write to with a CL of QUORUM (or thereabouts).

I've not looked at setting up rings to replicate with each other before...
is that process pretty well documented/explained or is this a black box
that I am slowly wading into?

(watching Andrew's talk from Acunu now to get a better idea of this).

-R


Re: Will writes with < ALL consistency eventually propagate?

Posted by Peter Schuller <pe...@infidyne.com>.
> Thanks for the additional insight on this -- think of a CDN that needs to
> respond to requests, distributed around the globe. Ultimately you would hope
> that each edge location could respond as quickly as possible (RF=N) but if
> each of the ring members keep open/active connections to each other, and a
> request comes in to an edge location that does not contain a copy of the
> data, does it request the data from the node that does, then cache it (in
> the case of more requests coming into that edge location with the same
> request) or does it reply once and forget it, requiring *each* subsequent
> request to that node to always phone back home to the node that actually
> contains it?
> The CDN/edge-server scenario works particularly well to illustrate my goals,
> if visualizing that helps.
> Look forward to your thoughts.

Nodes will never cache any data. Nodes have the data that they own
according to the ring topology and the replication factor (to the
extent that the data has been replicated); the node you happen to talk
to is merely a "co-ordinator" of a request; essentially a proxy with
intelligent routing to the correct hosts.

In the CDN situation, if you're talking about e.g. having a group of
servers in one "place" (network topologically distinct location, such
as geographically distinct) then a better fit than RF=N is probably to
use multi-site support and say that you want a certain number of
copies for each location and have all clients talk to the most local
"site".

But that's assuming you want to try to model this using just
Cassandra's replication to begin with. Dynamically caching wherever
data is accessed is a good idea for a CDN use-case (probably), but is
not something that Cassandra does itself, internally. It's really
difficult to know what the best solution is for a CDN; and in your
case you imply that it's really *not* a CDN and it's just an analogy
;)

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Peter,

Thanks for the additional insight on this -- think of a CDN that needs to
respond to requests, distributed around the globe. Ultimately you would
hope that each edge location could respond as quickly as possible (RF=N)
but if each of the ring members keep open/active connections to each other,
and a request comes in to an edge location that does not contain a copy of
the data, does it request the data from the node that does, then cache it
(in the case of more requests coming into that edge location with the same
request) or does it reply once and forget it, requiring *each* subsequent
request to that node to always phone back home to the node that actually
contains it?

The CDN/edge-server scenario works particularly well to illustrate my
goals, if visualizing that helps.

Look forward to your thoughts.

-R

On Mon, Nov 7, 2011 at 8:05 PM, Peter Schuller
<pe...@infidyne.com> wrote:

> > Given that, the way I've understood this discussion so far is I would
> have a
> > RF of N (my total node count) but my Consistency Level with all my writes
> > will *likely* be QUORUM -- I think that is a good/safe default for me to
> use
> > as writes aren't the scenario I need to optimize for latency; that being
> > said, I also don't want to wait for a ConsistencyLevel of ALL to complete
> > before my code continues though.
> > Would you agree with this assessment or am I missing the boat on
> something?
>
> Are you *sure* you care about latency to the degree that data being
> non-local actually matters to your application?
>
> Normally you don't set RF=N unless you have particularly special
> requirements. The extra latency implied by another network round-trip
> is certainly greater than zero, but in many practical situations
> outliers and the behavior in case of e.g. node problems is much more
> important than an extra millisecond or two on the average request.
> Setting RF=N causes a larger data set on each node, in addition to
> causing more nodes to be involved in every request. Consider whether
> it's a better use of resources to set RF to e.g. 3 instead, and let
> the ring grow independently. That is what one normally does.
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>

Re: Will writes with < ALL consistency eventually propagate?

Posted by Peter Schuller <pe...@infidyne.com>.
> Given that, the way I've understood this discussion so far is I would have a
> RF of N (my total node count) but my Consistency Level with all my writes
> will *likely* be QUORUM -- I think that is a good/safe default for me to use
> as writes aren't the scenario I need to optimize for latency; that being
> said, I also don't want to wait for a ConsistencyLevel of ALL to complete
> before my code continues though.
> Would you agree with this assessment or am I missing the boat on something?

Are you *sure* you care about latency to the degree that data being
non-local actually matters to your application?

Normally you don't set RF=N unless you have particularly special
requirements. The extra latency implied by another network round-trip
is certainly greater than zero, but in many practical situations
outliers and the behavior in case of e.g. node problems is much more
important than an extra millisecond or two on the average request.
Setting RF=N causes a larger data set on each node, in addition to
causing more nodes to be involved in every request. Consider whether
it's a better use of resources to set RF to e.g. 3 instead, and let
the ring grow independently. That is what one normally does.
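
As a concrete (and hypothetical) sketch of that advice: a keyspace pinned
at RF = 3 keeps three copies of every row no matter how many nodes later
join the ring.

    from cassandra.cluster import Cluster

    # Any node can act as coordinator; the address is hypothetical.
    session = Cluster(['10.0.0.1']).connect()

    # RF = 3: three replicas per row, independent of cluster size, so the
    # ring can grow without rewriting this setting.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS app WITH replication =
        {'class': 'SimpleStrategy', 'replication_factor': 3}
    """)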

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Stephen,

Excellent breakdown; I appreciate all the detail.

Your last comment about RF being smaller than N (number of nodes) -- in my
particular case my data set isn't particularly large (a few GB) and is
distributed globally across a handful of data centers. What I am utilizing
Cassandra for is the replication in order to minimize latency for requests.

So when a request comes into any location, I want each node in the ring to
contain the full data set so it never needs to defer to another member of
the ring to answer a question (even if this means eventual consistency,
that is alright in my case).

Given that, the way I've understood this discussion so far is I would have
a RF of N (my total node count) but my Consistency Level with all my writes
will *likely* be QUORUM -- I think that is a good/safe default for me to
use as writes aren't the scenario I need to optimize for latency; that
being said, I also don't want to wait for a ConsistencyLevel of ALL to
complete before my code continues though.

Would you agree with this assessment or am I missing the boat on something?

Best,
Riyad

On Mon, Nov 7, 2011 at 7:42 AM, Stephen Connolly <
stephen.alan.connolly@gmail.com> wrote:

> Consistency Level is a pseudo-enum...
>
> you have the choice between
>
> ONE
> Quorum (and there are different types of this)
> ALL
>
> At CL=ONE, only one node is guaranteed to have got the write if the
> operation is a success.
> At CL=ALL, all nodes that the RF says it should be stored at must
> confirm the write before the operation succeeds, but a partial write
> will succeed eventually if at least one node recorded the write
> At CL=QUORUM, at least ((N/2)+1) nodes must confirm the write for the
> operation to succeed, otherwise failure, but a partial write will
> succeed eventually if at least one node recorded the write.
>
> Read repair will eventually ensure that the write is replicated across
> all RF nodes in the cluster.
>
> The N in QUORUM above depends on the type of QUORUM you choose, in
> general think N=RF unless you choose a fancy QUORUM.
>
> To have a consistent read, (replicas confirming the write) + (replicas
> consulted by the read) must be > RF...
>
> Write at ONE, read at ONE => may not get the most recent write if RF >
> 1 [fastest write, fastest read] {data loss possible if node lost
> before read repair}
> Write at QUORUM, read at ONE => may not get the most recent write if
> RF > 2 [moderate write, fastest read] {multiple nodes must be lost for
> data loss to be possible}
> Write at ALL, read at ONE => consistent read, writes may be blocked if
> any node fails [slowest write, fastest read]
>
> Write at ONE, read at QUORUM => may not get the most recent write if
> RF > 2 [fastest write, moderate read]  {data loss possible if node
> lost before read repair}
> Write at QUORUM, read at QUORUM => consistent read [moderate write,
> moderate read] {multiple nodes must be lost for data loss to be
> possible}
> Write at ALL, read at QUORUM => consistent read, writes may be blocked
> if any node fails [slowest write, moderate read]
>
> Write at ONE, read at ALL => consistent read, reads may fail if any
> node fails [fastest write, slowest read] {data loss possible if node
> lost before read repair}
> Write at QUORUM, read at ALL => consistent read, reads may fail if any
> node fails [moderate write, slowest read] {multiple nodes must be lost
> for data loss to be possible}
> Write at ALL, read at ALL => consistent read, writes may be blocked if
> any node fails, reads may fail if any node fails [slowest write,
> slowest read]
>
> Note: You can choose the CL for each and every operation. This is
> something that you should design into your application (unless you
> exclusively use QUORUM for all operations, in which case you are
> advised to bake the logic in, but it is less necessary)
>
> The other thing to remember is that RF does not have to equal the
> number of nodes in your cluster... in fact I would recommend designing
> your app on the basis that RF < number of nodes in your cluster...
> because at some point, when your data set grows big enough, you will
> end up with RF < number of nodes.
>
> -Stephen
>
> On 7 November 2011 13:03, Riyad Kalla <rk...@gmail.com> wrote:
> > Ah! Ok I was interpreting what you were saying to mean that if my RF was
> too
> > high, then the ring would die if I lost one.
> > Ultimately what I want (I think) is:
> > Replication Factor: 5 (aka "all of my nodes")
> > Consistency Level: 2
> > Put another way, when I write a value, I want it to exist on two servers
> *at
> > least* before I consider that write "successful" enough for my code to
> > continue, but in the background I would like Cassandra to keep copying
> that
> > value around at its leisure until all the ring nodes know about it.
> > This sounds like what I need. Thanks for pointing me in the right
> direction.
> > Best,
> > Riyad
> >
> > On Mon, Nov 7, 2011 at 5:47 AM, Anthony Ikeda <
> anthony.ikeda.dev@gmail.com>
> > wrote:
> >>
> >> Riyad, I'm also just getting to know the different settings and values
> >> myself :)
> >> I believe, and it also depends on your config, that CL.ONE should tolerate
> >> the loss of a node if your RF is 5; once you increase the CL, then if you
> >> lose a node the CL is not met and you will get exceptions returned.
> >> Sent from my iPhone
> >> On 07/11/2011, at 4:32, Riyad Kalla <rk...@gmail.com> wrote:
> >>
> >> Anthony and Jaydeep, thank you for weighing in. I am glad to see that
> >> they are two different values (makes more sense mentally to me).
> >> Anthony, what you said caught my attention "to ensure all nodes have a
> >> copy you may not be able to survive the loss of a single node." -- why
> >> would this be the case?
> >> I assumed (incorrectly?) that a node would simply disappear off the map
> >> until I could bring it back up again, at which point it would slowly
> >> retrieve all the missing values it didn't get while it was down from
> >> other members of the ring. Is this the wrong understanding?
> >> If forcing a replication factor equal to the number of nodes in my ring
> >> will cause a hard-stop when one node goes down (as I understood your
> >> comment to mean), it seems to me I should go with a much lower
> >> replication factor... something along the lines of 3 or roughly
> >> ceiling(N / 2) and just deal with the latency when one of the nodes has
> >> to route a request to another server when it doesn't contain the value.
> >> Is there a better way to accomplish what I want, or is keeping the
> >> replication factor that aggressively high generally a bad thing and
> >> using Cassandra in the "wrong" way?
> >> Thank you for the help.
> >> -Riyad
> >>
> >> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep
> >> <ch...@yahoo.co.in> wrote:
> >>>
> >>> Hi Riyad,
> >>> You can set replication = 5 (number of replicas) and write with CL = ONE.
> >>> There is no hard requirement from Cassandra to write with CL=ALL to
> >>> replicate the data unless you need it. Considering your example, if you
> >>> write with CL=ONE it will still replicate your data to all 5 replicas
> >>> eventually.
> >>> Thank you,
> >>> Jaydeep
> >>> ________________________________
> >>> From: Riyad Kalla <rk...@gmail.com>
> >>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> >>> Sent: Sunday, 6 November 2011 9:50 PM
> >>> Subject: Will writes with < ALL consistency eventually propagate?
> >>>
> >>> I am new to Cassandra and was curious about the following scenario...
> >>>
> >>> Let's say I have a ring of 5 servers. Ultimately I would like each
> >>> server to be a full replication of the next (master-master-*).
> >>>
> >>> In a presentation I watched today on Cassandra, the presenter mentioned
> >>> that the ring members will shard data and route your requests to the
> >>> right host when they come into a server that doesn't physically contain
> >>> the value you wanted. To the requesting client this is seamless except
> >>> for the added latency.
> >>>
> >>> If I wanted to avoid the routing and latency and ensure every server
> >>> had the full data set, do I have to write with a consistency level of
> >>> ALL and wait for all of those writes to return in my code, or can I
> >>> write with a CL of 1 or 2 and let the ring propagate the rest of the
> >>> copies to the other servers in the background after my code has
> >>> continued executing?
> >>>
> >>> I don't mind eventual consistency in my case, but I do (eventually)
> >>> want all nodes to have all values and cannot tell if this is default
> >>> behavior, or if sharding is the default and I can only force duplicates
> >>> onto the other servers explicitly with a CL of ALL.
> >>>
> >>> Best,
> >>> Riyad
> >>>
> >>
> >
> >
>

Re: Will writes with < ALL consistency eventually propagate?

Posted by Stephen Connolly <st...@gmail.com>.
Consistency Level is a pseudo-enum...

you have the choice between

ONE
Quorum (and there are different types of this)
ALL

At CL=ONE, only one node is guaranteed to have got the write if the
operation is a success.
At CL=ALL, all nodes that the RF says it should be stored at must
confirm the write before the operation succeeds, but a partial write
will succeed eventually if at least one node recorded the write
At CL=QUORUM, at least ((N/2)+1) nodes must confirm the write for the
operation to succeed, otherwise failure, but a partial write will
succeed eventually if at least one node recorded the write.

Read repair will eventually ensure that the write is replicated across
all RF nodes in the cluster.

The N in QUORUM above depends on the type of QUORUM you choose, in
general think N=RF unless you choose a fancy QUORUM.

To have a consistent read, (replicas confirming the write) + (replicas
consulted by the read) must be > RF...

Write at ONE, read at ONE => may not get the most recent write if RF >
1 [fastest write, fastest read] {data loss possible if node lost
before read repair}
Write at QUORUM, read at ONE => may not get the most recent write if
RF > 2 [moderate write, fastest read] {multiple nodes must be lost for
data loss to be possible}
Write at ALL, read at ONE => consistent read, writes may be blocked if
any node fails [slowest write, fastest read]

Write at ONE, read at QUORUM => may not get the most recent write if
RF > 2 [fastest write, moderate read]  {data loss possible if node
lost before read repair}
Write at QUORUM, read at QUORUM => consistent read [moderate write,
moderate read] {multiple nodes must be lost for data loss to be
possible}
Write at ALL, read at QUORUM => consistent read, writes may be blocked
if any node fails [slowest write, moderate read]

Write at ONE, read at ALL => consistent read, reads may fail if any
node fails [fastest write, slowest read] {data loss possible if node
lost before read repair}
Write at QUORUM, read at ALL => consistent read, reads may fail if any
node fails [moderate write, slowest read] {multiple nodes must be lost
for data loss to be possible}
Write at ALL, read at ALL => consistent read, writes may be blocked if
any node fails, reads may fail if any node fails [slowest write,
slowest read]
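
All nine rows above fall out of that single overlap rule; a small
self-contained illustration in Python (taking N = RF, per the note above):

    # Replicas each level waits for, given a replication factor.
    def replicas(level, rf):
        return {'ONE': 1, 'QUORUM': rf // 2 + 1, 'ALL': rf}[level]

    # A read is guaranteed to overlap the latest write when
    # (replicas written) + (replicas read) > RF.
    def consistent(write_cl, read_cl, rf):
        return replicas(write_cl, rf) + replicas(read_cl, rf) > rf

    rf = 5
    for w in ('ONE', 'QUORUM', 'ALL'):
        for r in ('ONE', 'QUORUM', 'ALL'):
            verdict = 'consistent' if consistent(w, r, rf) \
                else 'eventually consistent'
            print('write %-6s / read %-6s -> %s' % (w, r, verdict))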

Note: You can choose the CL for each and every operation. This is
something that you should design into your application (unless you
exclusively use QUORUM for all operations, in which case you are
advised to bake the logic in, but it is less necessary)
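
In client terms that per-operation choice is just a property on each
statement; a sketch with the (modern) DataStax Python driver, where the
keyspace and table names are hypothetical:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['10.0.0.1']).connect('app')

    # Each statement carries its own consistency level.
    write = SimpleStatement("INSERT INTO events (k, v) VALUES (%s, %s)",
                            consistency_level=ConsistencyLevel.QUORUM)
    read = SimpleStatement("SELECT v FROM events WHERE k = %s",
                           consistency_level=ConsistencyLevel.ONE)

    session.execute(write, ('k1', 'v1'))  # waits for a quorum of replicas
    session.execute(read, ('k1',))        # returns after one replica answers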

The other thing to remember is that RF does not have to equal the
number of nodes in your cluster... in fact I would recommend designing
your app on the basis that RF < number of nodes in your cluster...
because at some point, when your data set grows big enough, you will
end up with RF < number of nodes.

-Stephen

On 7 November 2011 13:03, Riyad Kalla <rk...@gmail.com> wrote:
> Ah! Ok I was interpreting what you were saying to mean that if my RF was too
> high, then the ring would die if I lost one.
> Ultimately what I want (I think) is:
> Replication Factor: 5 (aka "all of my nodes")
> Consistency Level: 2
> Put another way, when I write a value, I want it to exist on two servers *at
> least* before I consider that write "successful" enough for my code to
> continue, but in the background I would like Cassandra to keep copying that
> value around at its leisure until all the ring nodes know about it.
> This sounds like what I need. Thanks for pointing me in the right direction.
> Best,
> Riyad
>
> On Mon, Nov 7, 2011 at 5:47 AM, Anthony Ikeda <an...@gmail.com>
> wrote:
>>
>> Riyad, I'm also just getting to know the different settings and values
>> myself :)
>> I believe, and it also depends on your config, that CL.ONE should tolerate the
>> loss of a node if your RF is 5; once you increase the CL, then if you lose a
>> node the CL is not met and you will get exceptions returned.
>> Sent from my iPhone
>> On 07/11/2011, at 4:32, Riyad Kalla <rk...@gmail.com> wrote:
>>
>> Anthony and Jaydeep, thank you for weighing in. I am glad to see that they
>> are two different values (makes more sense mentally to me).
>> Anthony, what you said caught my attention "to ensure all nodes have a
>> copy you may not be able to survive the loss of a single node." -- why would
>> this be the case?
>> I assumed (incorrectly?) that a node would simply disappear off the map
>> until I could bring it back up again, at which point it would slowly retrieve
>> all the missing values it didn't get while it was down from other members of
>> the ring. Is this the wrong understanding?
>> If forcing a replication factor equal to the number of nodes in my ring
>> will cause a hard-stop when one node goes down (as I understood your comment
>> to mean), it seems to me I should go with a much lower replication factor...
>> something along the lines of 3 or roughly ceiling(N / 2) and just deal with
>> the latency when one of the nodes has to route a request to another server
>> when it doesn't contain the value.
>> Is there a better way to accomplish what I want, or is keeping the
>> replication factor that aggressively high generally a bad thing and using
>> Cassandra in the "wrong" way?
>> Thank you for the help.
>> -Riyad
>>
>> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep
>> <ch...@yahoo.co.in> wrote:
>>>
>>> Hi Riyad,
>>> You can set replication = 5 (number of replicas) and write with CL = ONE.
>>> There is no hard requirement from Cassandra to write with CL=ALL to
>>> replicate the data unless you need it. Considering your example, if you
>>> write with CL=ONE it will still replicate your data to all 5 replicas
>>> eventually.
>>> Thank you,
>>> Jaydeep
>>> ________________________________
>>> From: Riyad Kalla <rk...@gmail.com>
>>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> Sent: Sunday, 6 November 2011 9:50 PM
>>> Subject: Will writes with < ALL consistency eventually propagate?
>>>
>>> I am new to Cassandra and was curious about the following scenario...
>>>
>>> Let's say I have a ring of 5 servers. Ultimately I would like each server
>>> to be a full replication of the next (master-master-*).
>>>
>>> In a presentation I watched today on Cassandra, the presenter mentioned
>>> that the ring members will shard data and route your requests to the right
>>> host when they come into a server that doesn't physically contain the value
>>> you wanted. To the requesting client this is seamless except for the added
>>> latency.
>>>
>>> If I wanted to avoid the routing and latency and ensure every server had
>>> the full data set, do I have to write with a consistency level of ALL and
>>> wait for all of those writes to return in my code, or can I write with a CL
>>> of 1 or 2 and let the ring propagate the rest of the copies to the other
>>> servers in the background after my code has continued executing?
>>>
>>> I don't mind eventual consistency in my case, but I do (eventually) want
>>> all nodes to have all values and cannot tell if this is default behavior, or
>>> if sharding is the default and I can only force duplicates onto the other
>>> servers explicitly with a CL of ALL.
>>>
>>> Best,
>>> Riyad
>>>
>>
>
>

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Ah! Ok I was interpreting what you were saying to mean that if my RF was
too high, then the ring would die if I lost one.

Ultimately what I want (I think) is:

Replication Factor: 5 (aka "all of my nodes")
Consistency Level: 2

Put another way, when I write a value, I want it to exist on two servers
*at least* before I consider that write "successful" enough for my code to
continue, but in the background I would like Cassandra to keep copying that
value around at its leisure until all the ring nodes know about it.

This sounds like what I need. Thanks for pointing me in the right direction.
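
For what it's worth, that exact pattern -- acknowledge from two replicas,
replicate to all five in the background -- would look something like the
following with the (modern) DataStax Python driver; the schema is made up:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['10.0.0.1']).connect()
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo WITH replication =
        {'class': 'SimpleStrategy', 'replication_factor': 5}
    """)
    session.execute(
        "CREATE TABLE IF NOT EXISTS demo.kv (k text PRIMARY KEY, v text)")

    # CL=TWO: return once two replicas acknowledge; the remaining three
    # copies arrive asynchronously via normal replication.
    write = SimpleStatement("INSERT INTO demo.kv (k, v) VALUES (%s, %s)",
                            consistency_level=ConsistencyLevel.TWO)
    session.execute(write, ('some-key', 'some-value'))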

Best,
Riyad

On Mon, Nov 7, 2011 at 5:47 AM, Anthony Ikeda
<an...@gmail.com> wrote:

> Riyad, I'm also just getting to know the different settings and values
> myself :)
>
> I believe, and it also depends on your config, that CL.ONE should tolerate the
> loss of a node if your RF is 5; once you increase the CL, then if you lose a
> node the CL is not met and you will get exceptions returned.
>
> Sent from my iPhone
>
> On 07/11/2011, at 4:32, Riyad Kalla <rk...@gmail.com> wrote:
>
> Anthony and Jaydeep, thank you for weighing in. I am glad to see that they
> are two different values (makes more sense mentally to me).
>
> Anthony, what you said caught my attention "to ensure all nodes have a
> copy you may not be able to survive the loss of a single node." -- why
> would this be the case?
>
> I assumed (incorrectly?) that a node would simply disappear off the map
> until I could bring it back up again, at which point it would slowly retrieve
> all the missing values it didn't get while it was down from other members of
> the ring. Is this the wrong understanding?
>
> If forcing a replication factor equal to the number of nodes in my ring
> will cause a hard-stop when one node goes down (as I understood your
> comment to mean), it seems to me I should go with a much lower replication
> factor... something along the lines of 3 or roughly ceiling(N / 2) and just
> deal with the latency when one of the nodes has to route a request to
> another server when it doesn't contain the value.
>
> Is there a better way to accomplish what I want, or is keeping the
> replication factor that aggressively high generally a bad thing and using
> Cassandra in the "wrong" way?
>
> Thank you for the help.
>
> -Riyad
>
> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep <
> chovatia_jaydeep@yahoo.co.in> wrote:
>
>> Hi Riyad,
>>
>> You can set replication = 5 (number of replicas) and write with CL = ONE.
>> There is no hard requirement from Cassandra to write with CL=ALL to
>> replicate the data unless you need it. Considering your example, if you
>> write with CL=ONE it will still replicate your data to all 5 replicas
>> eventually.
>>
>> Thank you,
>> Jaydeep
>> ------------------------------
>> *From:* Riyad Kalla <rk...@gmail.com>
>> *To:* "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> *Sent:* Sunday, 6 November 2011 9:50 PM
>> *Subject:* Will writes with < ALL consistency eventually propagate?
>>
>> I am new to Cassandra and was curious about the following scenario...
>>
>> Let's say I have a ring of 5 servers. Ultimately I would like each server
>> to be a full replication of the next (master-master-*).
>>
>> In a presentation I watched today on Cassandra, the presenter mentioned
>> that the ring members will shard data and route your requests to the right
>> host when they come into a server that doesn't physically contain the value
>> you wanted. To the requesting client this is seamless except for the added
>> latency.
>>
>> If I wanted to avoid the routing and latency and ensure every server had
>> the full data set, do I have to write with a consistency level of ALL and
>> wait for all of those writes to return in my code, or can I write with a CL
>> of 1 or 2 and let the ring propagate the rest of the copies to the other
>> servers in the background after my code has continued executing?
>>
>> I don't mind eventual consistency in my case, but I do (eventually) want
>> all nodes to have all values and cannot tell if this is default behavior,
>> or if sharding is the default and I can only force duplicates onto the
>> other servers explicitly with a CL of ALL.
>>
>> Best,
>> Riyad
>>
>>
>

Re: Will writes with < ALL consistency eventually propagate?

Posted by Anthony Ikeda <an...@gmail.com>.
Riyad, I'm also just getting to know the different settings and values myself :)

I believe, and it also depends on your config, that CL.ONE should tolerate the loss of a node if your RF is 5; once you increase the CL, then if you lose a node the CL is not met and you will get exceptions returned.
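
With the Python driver, for instance, that failure surfaces as an
Unavailable error the client can catch (a sketch; the keyspace and table
names are hypothetical):

    from cassandra import ConsistencyLevel, Unavailable
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['10.0.0.1']).connect('demo')

    stmt = SimpleStatement("INSERT INTO kv (k, v) VALUES (%s, %s)",
                           consistency_level=ConsistencyLevel.ALL)
    try:
        session.execute(stmt, ('k', 'v'))
    except Unavailable as exc:
        # The coordinator knew up front that too few replicas were alive
        # to satisfy CL=ALL.
        print('only %d of %d required replicas are up'
              % (exc.alive_replicas, exc.required_replicas))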

Sent from my iPhone

On 07/11/2011, at 4:32, Riyad Kalla <rk...@gmail.com> wrote:

> Anthony and Jaydeep, thank you for weighing in. I am glad to see that they are two different values (makes more sense mentally to me).
> 
> Anthony, what you said caught my attention "to ensure all nodes have a copy you may not be able to survive the loss of a single node." -- why would this be the case?
> 
> I assumed (incorrectly?) that a node would simply disappear off the map until I could bring it back up again, at which point all the missing values that it didn't get while it was done, it would slowly retrieve from other members of the ring. Is this the wrong understanding?
> 
> If forcing a replication factor equal to the number of nodes in my ring will cause a hard-stop when one ring goes down (as I understood your comment to mean), it seems to me I should go with a much lower replication factor... something along the lines of 3 or roughly ceiling(N / 2) and just deal with the latency when one of the nodes has to route a request to another server when it doesn't contain the value.
> 
> Is there a better way to accomplish what I want, or is keeping the replication factor that aggressively high generally a bad thing and using Cassandra in the "wrong" way?
> 
> Thank you for the help.
> 
> -Riyad
> 
> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep <ch...@yahoo.co.in> wrote:
> Hi Riyad,
> 
> You can set replication = 5 (number of replicas) and write with CL = ONE. There is no hard requirement from Cassandra to write with CL=ALL to replicate the data unless you need it. Considering your example, If you write with CL=ONE then also it will replicate your data to all 5 replicas eventually.
> 
> Thank you,
> Jaydeep
> From: Riyad Kalla <rk...@gmail.com>
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Sent: Sunday, 6 November 2011 9:50 PM
> Subject: Will writes with < ALL consistency eventually propagate?
> 
> I am new to Cassandra and was curious about the following scenario...
> 
> Let's say I have a ring of 5 servers. Ultimately I would like each server to be a full replication of the next (master-master-*). 
> 
> In a presentation I watched today on Cassandra, the presenter mentioned that the ring members will shard data and route your requests to the right host when they come into a server that doesn't physically contain the value you wanted. To the requesting client this is seamless except for the added latency.
> 
> If I wanted to avoid the routing and latency and ensure every server had the full data set, do I have to write with a consistency level of ALL and wait for all of those writes to return in my code, or can I write with a CL of 1 or 2 and let the ring propagate the rest of the copies to the other servers in the background after my code has continued executing?
> 
> I don't mind eventual consistency in my case, but I do (eventually) want all nodes to have all values and cannot tell if this is default behavior, or if sharding is the default and I can only force duplicates onto the other servers explicitly with a CL of ALL.
> 
> Best,
> Riyad
> 
> 

Re: Will writes with < ALL consistency eventually propagate?

Posted by Riyad Kalla <rk...@gmail.com>.
Anthony and Jaydeep, thank you for weighing in. I am glad to see that they
are two different values (makes more sense mentally to me).

Anthony, what you said caught my attention "to ensure all nodes have a copy
you may not be able to survive the loss of a single node." -- why would
this be the case?

I assumed (incorrectly?) that a node would simply disappear off the map
until I could bring it back up again, at which point it would slowly retrieve
all the missing values it didn't get while it was down from other members of
the ring. Is this the wrong understanding?

If forcing a replication factor equal to the number of nodes in my ring
will cause a hard-stop when one node goes down (as I understood your
comment to mean), it seems to me I should go with a much lower replication
factor... something along the lines of 3 or roughly ceiling(N / 2) and just
deal with the latency when one of the nodes has to route a request to
another server when it doesn't contain the value.

Is there a better way to accomplish what I want, or is keeping the
replication factor that aggressively high generally a bad thing and using
Cassandra in the "wrong" way?

Thank you for the help.

-Riyad

On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep <
chovatia_jaydeep@yahoo.co.in> wrote:

> Hi Riyad,
>
> You can set replication = 5 (number of replicas) and write with CL = ONE.
> There is no hard requirement from Cassandra to write with CL=ALL to
> replicate the data unless you need it. Considering your example, if you
> write with CL=ONE it will still replicate your data to all 5 replicas
> eventually.
>
> Thank you,
> Jaydeep
> ------------------------------
> *From:* Riyad Kalla <rk...@gmail.com>
> *To:* "user@cassandra.apache.org" <us...@cassandra.apache.org>
> *Sent:* Sunday, 6 November 2011 9:50 PM
> *Subject:* Will writes with < ALL consistency eventually propagate?
>
> I am new to Cassandra and was curious about the following scenario...
>
> Let's say I have a ring of 5 servers. Ultimately I would like each server
> to be a full replication of the next (master-master-*).
>
> In a presentation I watched today on Cassandra, the presenter mentioned
> that the ring members will shard data and route your requests to the right
> host when they come into a server that doesn't physically contain the value
> you wanted. To the requesting client this is seamless except for the added
> latency.
>
> If I wanted to avoid the routing and latency and ensure every server had
> the full data set, do I have to write with a consistency level of ALL and
> wait for all of those writes to return in my code, or can I write with a CL
> of 1 or 2 and let the ring propagate the rest of the copies to the other
> servers in the background after my code has continued executing?
>
> I don't mind eventual consistency in my case, but I do (eventually) want
> all nodes to have all values and cannot tell if this is default behavior,
> or if sharding is the default and I can only force duplicates onto the
> other servers explicitly with a CL of ALL.
>
> Best,
> Riyad
>
>

Re: Will writes with < ALL consistency eventually propagate?

Posted by chovatia jaydeep <ch...@yahoo.co.in>.
Hi Riyad,

You can set replication = 5 (number of replicas) and write with CL = ONE. There is no hard requirement from Cassandra to write with CL=ALL to replicate the data unless you need it. Considering your example, if you write with CL=ONE it will still replicate your data to all 5 replicas eventually.
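
In driver terms (a hypothetical sketch with the modern Python driver),
such a write returns as soon as a single replica acknowledges, and the
other four copies follow in the background:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['10.0.0.1']).connect('demo')  # keyspace with RF = 5

    write = SimpleStatement("INSERT INTO kv (k, v) VALUES (%s, %s)",
                            consistency_level=ConsistencyLevel.ONE)
    session.execute(write, ('k', 'v'))  # one ack; four replicas catch up later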

Thank you,

Jaydeep

________________________________
From: Riyad Kalla <rk...@gmail.com>
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Sent: Sunday, 6 November 2011 9:50 PM
Subject: Will writes with < ALL consistency eventually propagate?

I am new to Cassandra and was curious about the following scenario...

Let's say I have a ring of 5 servers. Ultimately I would like each server to be a full replication of the next (master-master-*). 

In a presentation I watched today on Cassandra, the presenter mentioned that the ring members will shard data and route your requests to the right host when they come into a server that doesn't physically contain the value you wanted. To the requesting client this is seamless except for the added latency.

If I wanted to avoid the routing and latency and ensure every server had the full data set, do I have to write with a consistency level of ALL and wait for all of those writes to return in my code, or can I write with a CL of 1 or 2 and let the ring propagate the rest of the copies to the other servers in the background after my code has continued executing?

I don't mind eventual consistency in my case, but I do (eventually) want all nodes to have all values and cannot tell if this is default behavior, or if sharding is the default and I can only force duplicates onto the other servers explicitly with a CL of ALL.

Best,
Riyad