Posted to user@cassandra.apache.org by Henry Luo <hl...@choicestream.com> on 2010/10/11 15:53:47 UTC

Multi Data Center Strategy

We have an application that updates its rows frequently. We use a replication factor of 3 and are moving to multiple data centers. We would like to accomplish the following setup:

Data are replicated to other data centers. RackAwareStrategy seems to be able to handle that; however:

1)      We don't need the data replicated across data centers for each individual update, since rows get overwritten a lot. Rather, we'd like them to be replicated 'once in a while', say after x updates to a row, or after y minutes. In essence, it's a delayed write of only the latest copy to the remote nodes.

2)      We would like reads to be handled only by the nodes local to the client, i.e., don't bother sending read requests to the remote data center, since it's too slow. If the request fails locally, just let it fail.

Is there a way to accomplish this by:

1)      Configuring Cassandra in a certain way

2)      Changing/adding a replication strategy

3)      Changing/adding some core Cassandra features

4)      Using another layer in front of Cassandra - I'd rather not go this route.

Thanks.

Henry



Re: Multi Data Center Strategy

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Oct 11, 2010 at 9:53 AM, Henry Luo <hl...@choicestream.com> wrote:
> We have an application that updates its rows frequently. We use a
> replication factor of 3 and are moving to multiple data centers. We would
> like to accomplish the following setup:
>
> Data are replicated to other data centers. RackAwareStrategy seems to be
> able to handle that; however:
>
> 1)      We don’t need the data replicated across data centers for each
> individual update, since rows get overwritten a lot. Rather, we’d like them
> to be replicated ‘once in a while’, say after x updates to a row, or after
> y minutes. In essence, it’s a delayed write of only the latest copy to the
> remote nodes.
>
> 2)      We would like reads to be handled only by the nodes local to the
> client, i.e., don’t bother sending read requests to the remote data center,
> since it’s too slow. If the request fails locally, just let it fail.
>
> Is there a way to accomplish this by:
>
> 1)      Configuring Cassandra in a certain way
>
> 2)      Changing/adding a replication strategy
>
> 3)      Changing/adding some core Cassandra features
>
> 4)      Using another layer in front of Cassandra – I’d rather not go this
> route.
>
> Thanks.
>
> Henry

1) No, you cannot do this. In 0.7 you can set read_repair_chance
to a low percentage so that read repairs happen on only a fraction of
reads.

2) If your multi-data-center configuration is set up properly with
respect to snitches, you read at consistency level ONE or DCQuorum,
and you set read_repair_chance to a low percentage as described above,
you will achieve this.

Edward
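
To make the effect of a low read_repair_chance in the reply above concrete, here is a minimal toy model in Python. It is not Cassandra code and does not talk to a cluster. It assumes, for illustration only, that each read independently triggers a read repair with probability read_repair_chance. The function name count_read_repairs and its parameters are hypothetical.

```python
import random


def count_read_repairs(num_reads, read_repair_chance, seed=0):
    """Return how many of num_reads would trigger a read repair.

    Toy assumption (not taken from Cassandra's source): every read
    triggers a repair independently with probability read_repair_chance.
    """
    if not 0.0 <= read_repair_chance <= 1.0:
        raise ValueError("read_repair_chance must be between 0.0 and 1.0")
    rng = random.Random(seed)  # seeded so repeated runs give the same count
    # rng.random() is uniform on [0.0, 1.0), so the comparison below is
    # true with probability read_repair_chance.
    return sum(1 for _ in range(num_reads) if rng.random() < read_repair_chance)


if __name__ == "__main__":
    for chance in (0.0, 0.1, 1.0):
        repairs = count_read_repairs(10000, chance)
        print("read_repair_chance=%.1f: %d of 10000 reads trigger a repair"
              % (chance, repairs))
```

Under this model, a setting of 0.1 means roughly one read in ten also contacts the other replicas (including remote ones) for repair, while the remaining reads are served by the replica the snitch considers closest.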