Posted to user@cassandra.apache.org by Andreas Rudolph <An...@Spontech-Spine.com> on 2012/01/10 09:52:50 UTC

How to control location of data?

Hi!

We're evaluating Cassandra for our storage needs. One of the key benefits we see is the online replication of data, that is, an easy way to share data across nodes. But we need precise control over which group of nodes specific parts of a key space (columns/column families) are stored on. Now we're having trouble understanding the documentation. Could anyone help us find some answers to our questions?

What does the term "replica" mean: If a key is stored on exactly three nodes in a cluster, is it correct then to say that there are three replicas of that key or are there just two replicas (copies) and one original?
What is the relation between the Cassandra concepts "Partitioner" and "Replica Placement Strategy"? According to documentation found on the DataStax web site and the architecture internals page of the Cassandra Wiki, the first storage location of a key (and its associated data) is determined by the "Partitioner", whereas additional storage locations are defined by the "Replica Placement Strategy". I'm wondering if I could completely redefine the way nodes are selected to store a key just by implementing my own subclass of AbstractReplicationStrategy and configuring that subclass into the key space.
How can I prevent the "Partitioner" from being consulted at all when determining which node stores a key first?
Is a key space always distributed across the whole cluster? Is it possible to configure Cassandra so that more or less freely chosen parts of a key space (columns) are stored on arbitrarily chosen nodes?

Any tips would be much appreciated :-)

Re: AW: How to control location of data?

Posted by Maki Watanabe <wa...@gmail.com>.
Small correction:
The token range for each node is (Previous_token, My_token],
where "(" means exclusive and "]" means inclusive.
So in the following case N1 is responsible for (X, A], i.e. from just after X up to and including A.
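That lookup rule can be sketched in a few lines (node names and tokens are taken from Roland's example below; the bisect-based helper is just an illustration, not Cassandra's code):

```python
from bisect import bisect_left

# Ring from the example below; each node owns (previous_token, my_token].
ring = [("A", "N1"), ("D", "N2"), ("M", "N3"), ("X", "N4")]  # sorted by token
tokens = [t for t, _ in ring]

def primary_for(token):
    # The first node whose token is >= the key's token owns it; past the
    # last node token we wrap around to the first node (the X -> A seam).
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]
```

Note that under this corrected rule the example key B lands on N2 rather than N1, which is exactly what the correction is about.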

maki

2012/1/11 Roland Gude <ro...@yoochoose.com>:
>
>
> Each node in the cluster is assigned a token (can be done automatically –
> but usually should not)
>
> The token of a node is the start token of the partition it is responsible
> for (and the token of the next node is the end token of the current token's
> partition)
>
>
>
> Assume you have the following nodes/tokens (which are usually numbers but
> for the example I will use letters)
>
>
>
> N1/A
>
> N2/D
>
> N3/M
>
> N4/X
>
>
>
> This means that N1 is responsible (primary) for [A-D)
>
>        N2 for [D-M)
>
>        N3 for [M-X)
>
> And N4 for [X-A)
>
>
>
> If you have a replication factor of 1 data will go on the nodes like this:
>
>
>
> B -> N1
>
> E->N2
>
> X->N4
>
>
>
> And so on
>
> If you have a higher replication factor, the placement strategy decides
> which node will take replicas of which partition (becoming secondary node
> for that partition)
>
> Simple strategy will just put the replica on the next node in the ring
>
> So same example as above but RF of 2 and simple strategy:
>
>
>
> B-> N1 and N2
>
> E -> N2 and N3
>
> X -> N4 and N1
>
>
>
> Other strategies can factor in things like “put  data in another datacenter”
> or “put data in another rack” or such things.
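The placement just described can be modelled as a toy sketch (this keeps the example's convention that a node owns [its token, next node's token), so the B/E/X examples match; it is not Cassandra's actual code):

```python
from bisect import bisect_right

# Toy model of SimpleStrategy placement: the owning node plus the next
# RF-1 nodes clockwise around the ring.
RING = [("A", "N1"), ("D", "N2"), ("M", "N3"), ("X", "N4")]  # sorted by token

def replicas(key_token, rf):
    tokens = [t for t, _ in RING]
    i = bisect_right(tokens, key_token) - 1  # last node token <= key token
    return [RING[(i + k) % len(RING)][1] for k in range(rf)]
```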
>
>
>
> Even though the terms primary and secondary imply some difference in quality
> or consistency, this is not the case. If a node is responsible for a piece of
> data, it will store it.
>
>
>
>
>
> But placement of the replicas is usually only relevant for availability
> reasons (i.e. disaster recovery etc.)
>
> Actual location should mean nothing to most applications as you can ask any
> node for the data you want and it will provide it to you (fetching it from
> the responsible nodes).
>
> This should be sufficient in almost all cases.
>
>
>
> So in the above example again, you can ask N3 “what data is available” and
> it will tell you: B, E and X, or you could ask it “give me X” and it will
> fetch it from N4 or N1 or both of them depending on consistency
> configuration and return the data to you.
>
>
>
>
>
> So actually if you use Cassandra – for the application the actual storage
> location of the data should not matter. It will be available anywhere in the
> cluster if it is stored on any reachable node.
>
>
>
> Von: Andreas Rudolph [mailto:andreas.rudolph@spontech-spine.com]
> Gesendet: Dienstag, 10. Januar 2012 15:06
> An: user@cassandra.apache.org
> Betreff: Re: AW: How to control location of data?
>
>
>
> Hi!
>
>
>
> Thank you for your last reply. I'm still wondering if I got you right...
>
>
>
> ...
>
> A partitioner decides into which partition a piece of data belongs
>
> Does your statement imply that the partitioner does not take any decisions
> at all on the (physical) storage location? Or put another way: What do you
> mean with "partition"?
>
>
>
> To quote http://wiki.apache.org/cassandra/ArchitectureInternals:
> "... AbstractReplicationStrategy controls what nodes get secondary,
> tertiary, etc. replicas of each key range. Primary replica is always
> determined by the token ring (...)"
>
>
>
> ...
>
> You can select different placement strategies and partitioners for different
> keyspaces, thereby choosing known data to be stored on known hosts.
>
> This is however discouraged for various reasons – e.g. you need a lot of
> knowledge about your data to keep the cluster balanced. What is your use case
> for this requirement? There is probably a more suitable solution.
>
>
>
> What we want is to partition the cluster with respect to key spaces.
>
> That is we want to establish an association between nodes and key spaces so
> that a node of the cluster holds data from a key space if and only if that
> node is a *member* of that key space.
>
>
>
> To our knowledge Cassandra has no built-in way to specify such a
> membership-relation. Therefore we thought of implementing our own replica
> placement strategy until we started to assume that the partitioner had to be
> replaced, too, to accomplish the task.
>
>
>
> Do you have any ideas?
>
>
>
>
>
> [original question quoted in full; trimmed]
>
>




Re: AW: AW: How to control location of data?

Posted by Andreas Rudolph <an...@spontech-spine.com>.
Hi!

> ... 
> So actually if you use Cassandra – for the application the actual storage location of the data should not matter. It will be available anywhere in the cluster if it is stored on any reachable node.
I suspected as much: Cassandra does not provide a mechanism to strictly constrain which nodes in a cluster hold the data for a specific key space, because Cassandra is not designed for that purpose.

Thank you very much for your effort and detailed explanation.

> [earlier messages quoted in full; trimmed]



Re: How to control location of data?

Posted by Viktor Jevdokimov <vj...@gmail.com>.
The idea behind a client that controls the location of data is performance:
avoiding unnecessary network round-trips between nodes and unnecessary caching
of backup ranges. This mostly applies to reads at CL.ONE with RF>1.

How it works (in our case):

Our client uses describe_ring, which returns the ring for a specified keyspace
with token ranges and replica endpoints for each range. The first node in the
list for a token range is a kind of primary; the others are backup replicas.

For most single-key requests the client calculates the key's token and connects
to the node that is primary for that token. If the primary is down, the next
endpoint from the list of endpoints for that token range is used.

This way the network load between nodes is much lower. In our case, when load
balancing simply rotated across all nodes we saw about 100 Mbps of load on a
node; with the approach above, only about 25 Mbps.

Each node's caches fill up with data from its own primary range, avoiding
caching of the replica ranges that the node also holds because of RF.

The downside is that when the primary node is not accessible, the backup node
we switch to has no warm cache for that range.
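The routing described above can be sketched roughly as follows. The token ranges, IP addresses, and helper names here are invented for illustration; a real client gets the ranges from describe_ring and derives the token from the key with the cluster's partitioner:

```python
# Each entry: (start_token, end_token, endpoints); ranges are (start, end]
# and the first endpoint in the list is the "primary" for the range.
RING = [
    (100, 200, ["10.0.0.2", "10.0.0.3"]),
    (200, 300, ["10.0.0.3", "10.0.0.1"]),
    (300, 100, ["10.0.0.1", "10.0.0.2"]),  # wraps around the ring's seam
]

def in_range(token, start, end):
    if start < end:
        return start < token <= end
    return token > start or token <= end  # wrapping range

def endpoint_for(token, down=frozenset()):
    # Prefer the primary endpoint; fall back to the backups if it is down.
    for start, end, endpoints in RING:
        if in_range(token, start, end):
            for ep in endpoints:
                if ep not in down:
                    return ep
    raise RuntimeError("no live endpoint for token %r" % token)
```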


2012/1/11 Andreas Rudolph <an...@spontech-spine.com>

> Hi!
>
> ...
> Again, it's probably a bad idea.
>
> I agree on that, now.
>
> Thank you.
>
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11/01/2012, at 4:56 AM, Roland Gude wrote:
>
> [quoted text trimmed]

Re: How to control location of data?

Posted by Andreas Rudolph <an...@spontech-spine.com>.
Hi!

> ...
> Again, it's probably a bad idea. 
I agree on that, now.

Thank you.

> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 11/01/2012, at 4:56 AM, Roland Gude wrote:
> 
>> [quoted text trimmed]



Re: How to control location of data?

Posted by aaron morton <aa...@thelastpickle.com>.
> What we want is to partition the cluster with respect to key spaces.
Why do you want to do this ? (It's probably a bad idea) 

Background in here on the partitioner, placement strategy and the snitch http://thelastpickle.com/2011/02/07/Introduction-to-Cassandra/

Now here's how to do it….

Use the NetworkTopologyStrategy and the PropertyFileSnitch (not the RackInferringSnitch, see http://goo.gl/cLPjb). 

1) In the PropertyFileSnitch place the nodes into different Data Centers, see conf/cassandra-topology.properties for examples. e.g. nodes 1,2 and 3 in DC1 then nodes 4,5,6 in DC 2
2) In the definition of the Keyspace use the NetworkTopologyStrategy and place replicas in the DC's that contain the nodes you want. e.g. ks1 with strategy_options={DC1:3} and ks2 with strategy_options={DC2:3}
3) You will still want to use the RandomPartitioner. 
4) Rows with the same key in different keyspaces cannot be written to or read from in the same request.
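As a concrete sketch of steps 1 and 2 (the IP addresses and keyspace names are invented, and the cassandra-cli syntax is approximate for Cassandra releases of that era):

```
# conf/cassandra-topology.properties (step 1): pin nodes to data centers
10.0.0.1=DC1:RAC1
10.0.0.2=DC1:RAC1
10.0.0.3=DC1:RAC1
10.0.0.4=DC2:RAC1
10.0.0.5=DC2:RAC1
10.0.0.6=DC2:RAC1

# cassandra-cli (step 2): place each keyspace's replicas in one DC only
create keyspace ks1
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC1:3};

create keyspace ks2
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2:3};
```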

Again, it's probably a bad idea. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/01/2012, at 4:56 AM, Roland Gude wrote:

> [quoted text trimmed]


AW: AW: How to control location of data?

Posted by Roland Gude <ro...@yoochoose.com>.
Each node in the cluster is assigned a token (this can be done automatically, but usually should not be).
A node's token is the start token of the partition it is responsible for, and the next node's token is that partition's end token.

Assume you have the following nodes/tokens (tokens are usually numbers, but for this example I will use letters):

N1/A
N2/D
N3/M
N4/X

This means that N1 is responsible (primary) for [A-D)
       N2 for [D-M)
       N3 for [M-X)
And N4 for [X-A)

If you have a replication factor of 1, data will go on the nodes like this:

B -> N1
E -> N2
X -> N4

And so on.
If you have a higher replication factor, the placement strategy decides which nodes take replicas of which partition (becoming secondary nodes for that partition).
SimpleStrategy will just put the replica on the next node in the ring.
So, same example as above but with an RF of 2 and SimpleStrategy:

B -> N1 and N2
E -> N2 and N3
X -> N4 and N1
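The ring walk described above can be sketched in a few lines. This is a toy model, not Cassandra code: tokens are letters as in the example, the names RING and replicas are invented for illustration, and (per the correction elsewhere in this thread) real Cassandra ranges are (previous_token, my_token] rather than the [my_token, next_token) convention used in this example.

```python
# Toy model of a token ring with SimpleStrategy-style placement.
# Tokens are letters for readability; real Cassandra tokens are numbers.
from bisect import bisect_right

RING = [("A", "N1"), ("D", "N2"), ("M", "N3"), ("X", "N4")]  # sorted by token

def replicas(key_token, rf=1):
    tokens = [t for t, _ in RING]
    # Primary: last node whose token <= key_token, following the
    # [my_token, next_token) convention of the example above.
    i = bisect_right(tokens, key_token) - 1
    if i < 0:
        i = len(RING) - 1  # keys before the first token wrap to the last node
    # SimpleStrategy: walk clockwise for the remaining replicas.
    return [RING[(i + k) % len(RING)][1] for k in range(rf)]

print(replicas("B", rf=1))  # ['N1']
print(replicas("E", rf=2))  # ['N2', 'N3']
print(replicas("X", rf=2))  # ['N4', 'N1']
```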

Other strategies can factor in constraints like "put data in another datacenter" or "put data in another rack".

Even though the terms primary and secondary suggest some notion of quality or consistency, that is not the case: if a node is responsible for a piece of data, it will store it.


But placement of the replicas is usually only relevant for availability reasons (i.e. disaster recovery etc.)
Actual location should mean nothing to most applications as you can ask any node for the data you want and it will provide it to you (fetching it from the responsible nodes).
This should be sufficient in almost all cases.

So in the above example again, you can ask N3 "what data is available" and it will tell you: B, E and X, or you could ask it "give me X" and it will fetch it from N4 or N1 or both of them depending on consistency configuration and return the data to you.


So if you use Cassandra, the actual storage location of the data should not matter to the application: the data will be available anywhere in the cluster as long as it is stored on some reachable node.



Re: AW: How to control location of data?

Posted by Andreas Rudolph <an...@spontech-spine.com>.
Hi!

Thank you for your last reply. I'm still wondering if I got you right...

> ... 
> A partitioner decides into which partition a piece of data belongs
Does your statement imply that the partitioner does not make any decisions at all about the (physical) storage location? Or, put another way: what do you mean by "partition"?

To quote http://wiki.apache.org/cassandra/ArchitectureInternals: "... AbstractReplicationStrategy controls what nodes get secondary, tertiary, etc. replicas of each key range. Primary replica is always determined by the token ring (...)"

> ... 
> You can select different placement strategies and partitioners for different keyspaces, thereby choosing known data to be stored on known hosts.
> This is, however, discouraged for various reasons, e.g. you need a lot of knowledge about your data to keep the cluster balanced. What is your use case for this requirement? There is probably a more suitable solution.
>  
What we want is to partition the cluster with respect to key spaces.
That is, we want to establish an association between nodes and key spaces so that a node of the cluster holds data from a key space if and only if that node is a *member* of that key space.

To our knowledge Cassandra has no built-in way to specify such a membership-relation. Therefore we thought of implementing our own replica placement strategy until we started to assume that the partitioner had to be replaced, too, to accomplish the task.

Do you have any ideas?
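A keyspace-membership strategy along these lines could be modeled as follows. This is a hypothetical Python sketch of the selection logic only; Cassandra's actual extension point is the Java class AbstractReplicationStrategy, and the names KEYSPACE_MEMBERS, RING and natural_endpoints here are invented for illustration.

```python
# Hypothetical sketch (not Cassandra's Java API): a placement strategy that
# only ever returns replicas drawn from a per-keyspace member set.
KEYSPACE_MEMBERS = {"ks_sales": ["N1", "N2"], "ks_logs": ["N3", "N4"]}
RING = ["N1", "N2", "N3", "N4"]  # nodes in ring order

def natural_endpoints(keyspace, primary, rf):
    members = KEYSPACE_MEMBERS[keyspace]
    # Start from the primary's ring position and walk clockwise,
    # keeping only nodes that are members of the keyspace.
    start = RING.index(primary)
    picked = []
    for k in range(len(RING)):
        node = RING[(start + k) % len(RING)]
        if node in members:
            picked.append(node)
        if len(picked) == rf:
            break
    return picked

print(natural_endpoints("ks_sales", "N3", 2))  # ['N1', 'N2']
```

One consequence of such a design: the partitioner would not have to be replaced. It still picks a starting position on the ring; the strategy simply skips any node that is not a member of the keyspace, so every replica (including the first) lands on a member node.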





AW: How to control location of data?

Posted by Roland Gude <ro...@yoochoose.com>.
Hi,

I think everything is called a replica, so if data is on 3 nodes you have 3 replicas. There is no such thing as an "original".

A partitioner decides into which partition a piece of data belongs.
A replica placement strategy decides which partition goes on which node.

You cannot suppress the partitioner.

You can select different placement strategies and partitioners for different keyspaces, thereby choosing known data to be stored on known hosts.
This is, however, discouraged for various reasons, e.g. you need a lot of knowledge about your data to keep the cluster balanced. What is your use case for this requirement? There is probably a more suitable solution.
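To illustrate what the partitioner contributes: it maps a key to a token, nothing more. The sketch below mimics the idea behind Cassandra's RandomPartitioner (an MD5 hash over a large token space); the exact token values it produces are illustrative and not what Cassandra itself would compute for the same keys.

```python
# Minimal sketch of a partitioner's job: map a key to a token.
# Modeled loosely on RandomPartitioner, which hashes keys with MD5
# into a 2**127-sized token space.
import hashlib

def token(key: bytes) -> int:
    return int.from_bytes(hashlib.md5(key).digest(), "big") % (2**127)
```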

Von: Andreas Rudolph [mailto:Andreas.Rudolph@Spontech-Spine.com]
Gesendet: Dienstag, 10. Januar 2012 09:53
An: user@cassandra.apache.org
Betreff: How to control location of data?

Hi!

We're evaluating Cassandra for our storage needs. One of the key benefits we see is the online replication of data, that is, an easy way to share data across nodes. But we need to precisely control which node group specific parts of a key space (columns/column families) are stored on. Now we're having trouble understanding the documentation. Could anyone help us find answers to our questions?

*  What does the term "replica" mean? If a key is stored on exactly three nodes in a cluster, is it correct to say that there are three replicas of that key, or are there just two replicas (copies) and one original?

*  What is the relation between the Cassandra concepts "Partitioner" and "Replica Placement Strategy"? According to documentation found on the DataStax web site and the architecture internals page of the Cassandra Wiki, the first storage location of a key (and its associated data) is determined by the "Partitioner", whereas additional storage locations are defined by the "Replica Placement Strategy". I'm wondering if I could completely redefine the way nodes are selected to store a key by just implementing my own subclass of AbstractReplicationStrategy and configuring that subclass into the key space.

*  How can I prevent the "Partitioner" from being consulted at all to determine which node stores a key first?

*  Is a key space always distributed across the whole cluster? Is it possible to configure Cassandra in such a way that more or less freely chosen parts of a key space (columns) are stored on arbitrarily chosen nodes?

Any tips would be much appreciated :-)