Posted to user@cassandra.apache.org by Bryan Cheng <br...@blockcypher.com> on 2016/04/01 01:35:54 UTC

Re: Multi DC setup for analytics

I'm jumping into this thread late, so sorry if this has been covered
before. But am I correct in reading that you have two different Cassandra
rings, not talking to each other at all, and you want to have a shared DC
with a third Cassandra ring?

I'm not sure what you want to do is possible.

If I had the luxury of starting from scratch, the design I would use is:
All three DCs in one cluster, as 3 datacenters. DC3 is the analytics DC.
DC1's keyspaces are replicated to DC1 and DC3 only.
DC2's keyspaces are replicated to DC2 and DC3 only.

Then you have DC3 with all data from both DC1 and DC2 to run analytics on,
and no cross-talk between DC1 and DC2.
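
As a rough sketch of the layout above: per-keyspace replication like this is
normally expressed with NetworkTopologyStrategy, listing only the datacenters
that should hold replicas. The small Python helper below just builds the
ALTER KEYSPACE statements; the keyspace names (dc1_app, dc2_app) and the
replica counts are made-up placeholders, not values from this thread.

```python
def replication_cql(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy.

    dc_replicas maps a datacenter name to its replica count. A datacenter
    that is not listed receives no replicas of the keyspace.
    """
    if not dc_replicas:
        raise ValueError("at least one datacenter is required")
    options = ", ".join(
        "'{}': {}".format(dc, rf) for dc, rf in sorted(dc_replicas.items())
    )
    return (
        "ALTER KEYSPACE {} WITH replication = "
        "{{'class': 'NetworkTopologyStrategy', {}}};".format(keyspace, options)
    )


# Hypothetical keyspace names and replica counts, for illustration only.
print(replication_cql("dc1_app", {"DC1": 3, "DC3": 2}))  # no replicas in DC2
print(replication_cql("dc2_app", {"DC2": 3, "DC3": 2}))  # no replicas in DC1
```

The generated statements would be run through cqlsh. Existing data would
still need to be streamed to the DC3 nodes afterwards (the rebuild step
discussed further down the thread).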

If you cannot rebuild your existing clusters, you may want to consider
using something like Spark to ETL your data out of DC1 and DC2 into a new
cluster at DC3. At that point you're running a data warehouse and lose some
of the advantages of seamless cluster membership.

On Wed, Mar 30, 2016 at 5:43 AM, Anishek Agarwal <an...@gmail.com> wrote:

> Hey Guys,
>
> We did the necessary changes and were trying to get this back on track,
> but hit another wall,
>
> we have two Clusters in Different DC ( DC1 and DC2) with cluster names (
> CLUSTER_1, CLUSTER_2)
>
> we want to have a common analytics cluster in DC3 with cluster name
> (CLUSTER_3). -- looks like this can't be done, so we have to setup two
> different analytics cluster ? can't we just get data from CLUSTER_1/2 to
> same cluster CLUSTER_3 ?
>
> thanks
> anishek
>
> On Mon, Mar 21, 2016 at 3:31 PM, Anishek Agarwal <an...@gmail.com>
> wrote:
>
>> Hey Clint,
>>
>> we have two separate rings which don't talk to each other but both having
>> the same DC name "DCX".
>>
>> @Raja,
>>
>> We had already gone towards the path you suggested.
>>
>> thanks all
>> anishek
>>
>> On Fri, Mar 18, 2016 at 8:01 AM, Reddy Raja <ar...@gmail.com> wrote:
>>
>>> Yes. Here are the steps.
>>> You will have to change the DC Names first.
>>> DC1 and DC2 would be independent clusters.
>>>
>>> Create a new DC, DC3 and include these two DC's on DC3.
>>>
>>> This should work well.
>>>
>>>
>>> On Thu, Mar 17, 2016 at 11:03 PM, Clint Martin <
>>> clintlmartin@coolfiretechnologies.com> wrote:
>>>
>>>> When you say you have two logical DCs both with the same name, are you
>>>> saying that you have two clusters of servers both with the same DC name,
>>>> neither of which currently talk to each other? I.e., they are two separate
>>>> rings?
>>>>
>>>> Or do you mean that you have two keyspaces in one cluster?
>>>>
>>>> Or?
>>>>
>>>> Clint
>>>> On Mar 14, 2016 2:11 AM, "Anishek Agarwal" <an...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are using cassandra 2.0.17 and have two logical DC having different
>>>>> Keyspaces but both having same logical name DC1.
>>>>>
>>>>> we want to setup another cassandra cluster for analytics which should
>>>>> get data from both the above DC.
>>>>>
>>>>> if we setup the new DC with name DC2 and follow the steps
>>>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
>>>>> will it work ?
>>>>>
>>>>> I would think we would have to first change the names of the existing
>>>>> clusters so they have different names, and then go with adding another
>>>>> DC getting data from these?
>>>>>
>>>>> Also as soon as we add the node the data starts moving... this will
>>>>> all be only real time changes done to the cluster right ? we still have to
>>>>> do the rebuild to get the data for tokens for node in new cluster ?
>>>>>
>>>>> Thanks
>>>>> Anishek
>>>>>
>>>>
>>>
>>>
>>> --
>>> "In this world, you either have an excuse or a story. I preferred to
>>> have a story"
>>>
>>
>>
>

Re: Multi DC setup for analytics

Posted by Anishek Agarwal <an...@gmail.com>.
Hey Bryan,

Thanks for the info; we had inferred as much. The only other thing we
tried was starting two separate instances in the analytics cluster on the
same set of machines, each talking to its respective DC. We dropped that
within two minutes, as we would have to change ports on at least one of
the existing DCs so that, when they join the analytics cluster, they are
on the same port.

For now we are just getting another set of machines for this.


I knew about the pattern of using a separate analytics cluster for
Cassandra, but thought we could join one across two clusters. My bad. Now
that I think of it, it would have been better to have just one DC
for realtime prod requests instead of two.

Are there ways of merging existing clusters into one cluster in Cassandra?

