Posted to user@cassandra.apache.org by Matthias Zeilinger <Ma...@bwinparty.com> on 2013/04/11 15:13:16 UTC

multiple Datacenter values in PropertyFileSnitch

Hi,

I would like to create a big cluster for many applications.
Within this cluster I would like to separate the data for each application, which can easily be done via different virtual datacenters and the correct replication strategy.
What I would like to know is whether I can specify multiple values for one node in the PropertyFileSnitch configuration, so that I can use one node for more than one application.
For example:
6 nodes:
3 for App A
3 for App B
4 for App C

I want to have such a configuration:
Node 1 - DC-A & DC-C
Node 2 - DC-B & DC-C
Node 3 - DC-A & DC-C
Node 4 - DC-B & DC-C
Node 5 - DC-A
Node 6 - DC-B

Is this possible or does anyone have another solution for this?


Thx & br matthias

Re: multiple Datacenter values in PropertyFileSnitch

Posted by aaron morton <aa...@thelastpickle.com>.
> So that 2 apps with same and very high load pattern are not clashing.
I'm not sure what the advantage is of putting two apps in the same cluster but then using the replication strategy properties to keep them on different nodes. The reason to put apps in the same cluster is to share the resources.

Having a different number of nodes in different DCs and mixing the RF between them can get complicated.

What sort of load are you considering? IMHO the simple thing to do is some capacity planning and, when in doubt, start with one multi-DC cluster with the same RF in both DCs.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com



Re: multiple Datacenter values in PropertyFileSnitch

Posted by Andras Szerdahelyi <an...@ignitionone.com>.
I would replicate your different keyspaces to different DCs and scale those appropriately.
So, for example, the HighLoad KS replicates to really-huge-dc, which would have 10 nodes, and the LowerLoad KS replicates to smaller-dc with 5 nodes.
The idea is that you do not mix your different keyspaces in the same datacenter (this is possible with NetworkTopologyStrategy). Alternatively, for redundancy/HA purposes, you place a single replica in the other keyspace's DC but direct your applications to the "primary" DC of the keyspace, with LOCAL_QUORUM or ONE reads.
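
To make the layout above concrete, here is a minimal Python sketch. It is only a model for reasoning about the plan, not Cassandra code: the node names, the keyspace names and the replication factors (3 in the primary DC, 1 in the other DC) are assumptions chosen for illustration. Only the DC names and node counts come from the example above.

```python
# Hypothetical node names; each node belongs to exactly one DC.
TOPOLOGY = {f"huge-{i}": "really-huge-dc" for i in range(1, 11)}
TOPOLOGY.update({f"small-{i}": "smaller-dc" for i in range(1, 6)})

# Per-keyspace replication factor per DC (values are assumptions).
REPLICATION = {
    # Lives only in its primary DC.
    "high_load_ks": {"really-huge-dc": 3},
    # Primary DC plus a single replica in the other DC for HA.
    "lower_load_ks": {"smaller-dc": 3, "really-huge-dc": 1},
}


def candidate_nodes(keyspace):
    """Return the nodes that may hold replicas of the given keyspace."""
    dcs = {dc for dc, rf in REPLICATION[keyspace].items() if rf > 0}
    return sorted(node for node, dc in TOPOLOGY.items() if dc in dcs)


if __name__ == "__main__":
    for ks in REPLICATION:
        print(ks, len(candidate_nodes(ks)), "candidate nodes")
```

With these assumed settings, high_load_ks never touches the nodes in smaller-dc, while lower_load_ks also places one copy in really-huge-dc. Applications for each keyspace would still connect to that keyspace's primary DC.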

Regards,
Andras



RE: multiple Datacenter values in PropertyFileSnitch

Posted by Matthias Zeilinger <Ma...@bwinparty.com>.
I'm using a separate keyspace for each application.
What I want is to split the nodes up by load pattern, so that two apps with the same, very high load pattern do not clash.

For other load patterns I want to use a different split.

Is there any best practice, or should I scale out so that the complete load can be distributed across all nodes?

Br,
Matthias Zeilinger
Production Operation - Shared Services

P: +43 (0) 50 858-31185
M: +43 (0) 664 85-34459
E: matthias.zeilinger@bwinparty.com

bwin.party services (Austria) GmbH
Marxergasse 1B
A-1030 Vienna

www.bwinparty.com



Re: multiple Datacenter values in PropertyFileSnitch

Posted by aaron morton <aa...@thelastpickle.com>.
A node can only exist in one DC and one rack. 

Use different keyspaces as suggested. 
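
As a rough illustration of the one-DC-one-rack rule: PropertyFileSnitch reads cassandra-topology.properties, where each line has the form <ip>=<dc>:<rack>. The Python sketch below is not part of Cassandra; it only mimics that format with made-up addresses to show that each node maps to a single (DC, rack) pair, so there is no place to put a second DC.

```python
def parse_topology(text):
    """Parse '<ip>=<dc>:<rack>' lines into a dict of ip -> (dc, rack)."""
    topology = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        node, _, location = line.partition("=")
        dc, sep, rack = location.partition(":")
        if not sep or not dc.strip() or not rack.strip():
            raise ValueError(
                f"expected <dc>:<rack> for {node.strip()!r}, got {location!r}"
            )
        # One key, one value: a node has exactly one DC and one rack.
        topology[node.strip()] = (dc.strip(), rack.strip())
    return topology


# Hypothetical addresses, DC names taken from the question.
EXAMPLE = """
# node = datacenter:rack
10.0.0.1=DC-A:RAC1
10.0.0.2=DC-B:RAC1
10.0.0.3=DC-C:RAC1
"""

if __name__ == "__main__":
    for ip, (dc, rack) in parse_topology(EXAMPLE).items():
        print(ip, dc, rack)
```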

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com



Re: multiple Datacenter values in PropertyFileSnitch

Posted by Jabbar Azam <aj...@gmail.com>.
Hello,

I'm not an expert but I don't think you can do what you want. The way to
separate data for applications on the same cluster is to use different
tables for different applications or use multiple keyspaces, a keyspace per
application. The replication factor you specify for each keyspace specifies
how many copies of the data are stored in each datacenter.

You can't specify that data for a particular application is stored on a
specific node, unless that node is in its own cluster.

I think of a Cassandra cluster as a shared resource where all the
applications have access to all the nodes in the cluster.
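
A keyspace per application with its own per-datacenter replication factor could look roughly like the sketch below. It only builds the CQL 3 statement text with NetworkTopologyStrategy and does not talk to a cluster. The keyspace names, DC names and replication factors are assumptions for illustration.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with one RF entry per datacenter."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in replicas_per_dc.items():
        parts.append(f"'{dc}': {int(rf)}")
    options = ", ".join(parts)
    return "CREATE KEYSPACE " + keyspace + " WITH replication = {" + options + "};"


if __name__ == "__main__":
    # Hypothetical applications, each with its own keyspace.
    print(create_keyspace_cql("app_a", {"DC1": 3}))
    print(create_keyspace_cql("app_b", {"DC1": 3, "DC2": 2}))
```

The DC names used in the replication options have to match the names the snitch reports for the nodes, otherwise no replicas can be placed for that entry.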


Thanks

Jabbar Azam

