Posted to user@cassandra.apache.org by Patrick Julien <pj...@gmail.com> on 2011/04/08 19:17:52 UTC

Pyramid Organization of Data

We have a pilot project running in which all our historical data
worldwide would be stored in Cassandra.  So far, we have been
successful at getting the write and read throughput we need; in fact,
we are coming in more than 27% over our needed capacity and well beyond
what we were able to achieve with MySQL, very impressive.

However, one thing that escapes me is how we should organize access
across the different data centers.

The scenario is the following:

- We have data centers in North America, London, Tokyo and so on.
- The relative cost of data centers is very different; e.g., the TCO for
one server in Tokyo is about the same as for 5 such servers in New
York.
- We want to have access to all the data from North America, hence we
would run Hadoop/Pig queries from the New York/North America data
center only.

The problem is this: we would like the historical data from Tokyo to
stay in Tokyo and only be replicated to New York, the data from London
to stay in London and only be replicated to New York, and so on for all
data centers.

Is this currently possible with Cassandra?  I believe we would need to
run multiple clusters and migrate data manually from the other data
centers to North America to achieve this.  Any suggestions would also
be welcome.

Re: Pyramid Organization of Data

Posted by Joe Stump <jo...@joestump.net>.
A few lines of Java in a custom partitioning or rack-aware strategy might be able to achieve this.
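For example, something along these lines would be the place to hook in. This is a minimal sketch assuming the 0.7-era org.apache.cassandra.locator signatures (package names, constructor arguments and method signatures may differ in your release), and the class name is hypothetical:

// Sketch only: assumes the 0.7-era locator API; treat as an outline,
// not a drop-in strategy class.
import java.net.InetAddress;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.config.ConfigurationException;
import org.apache.cassandra.dht.Token;
import org.apache.cassandra.locator.IEndpointSnitch;
import org.apache.cassandra.locator.NetworkTopologyStrategy;
import org.apache.cassandra.locator.TokenMetadata;

public class PyramidStrategy extends NetworkTopologyStrategy // hypothetical name
{
    public PyramidStrategy(String table, TokenMetadata tokenMetadata,
                           IEndpointSnitch snitch, Map<String, String> configOptions)
            throws ConfigurationException
    {
        super(table, tokenMetadata, snitch, configOptions);
    }

    @Override
    public List<InetAddress> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
    {
        // Start from the normal per-datacenter placement (e.g. {Tokyo:2, NYC:1})
        // and adjust the endpoint list here if the stock behaviour is not enough.
        return super.calculateNaturalEndpoints(token, metadata);
    }
}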

--Joe

--
Typed with big fingers on a small keyboard. 

On Apr 8, 2011, at 13:17, Patrick Julien <pj...@gmail.com> wrote:

> We have a pilot project running where all our historical data
> worldwide would be stored using cassandra.  So far, we have been
> successful at getting the write and read throughput we need, in fact,
> coming in over 27% over our needed capacity and well beyond what we
> were able to achieve with mysql, very impressive.
> 
> However, one thing that escapes me is how we should organize different
> data center access.
> 
> The scenario is the following:
> 
> - We have data centers in North America, London, Tokyo and so on.
> - The relative cost of data centers is very different, e.g., TCO for
> one server in Tokyo is about the same than 5 such computers in New
> York.
> - We want to have access to all the data from North America, hence we
> would run Hadoop/Pig queries from the New York/North America data
> center only.
> 
> The problem is this: we would like the historical data from Tokyo to
> stay in Tokyo and only be replicated to New York.  The one in London
> to be in London and only be replicated to New York and so on for all
> data centers.
> 
> Is this currently possible with Cassandra?  I believe we would need to
> run multiple clusters and migrate data manually from data centers to
> North America to achieve this.  Also, any suggestions would also be
> welcomed.

Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
On Thu, Apr 14, 2011 at 4:47 PM, Adrian Cockcroft
<ac...@netflix.com> wrote:
> What you are asking for breaks the eventual consistency model, so you need to create a separate cluster in NYC that collects the same updates but has a much longer setting to timeout the data for deletion, or doesn't get the deletes.
>
> One way is to have a trigger on writes on your pyramid nodes in NY that copies data over to the long term analysis cluster. The two clusters won't be eventually consistent in the presence of failures, but with RF=3 you will get up to three triggers for each write, so you get three chances to get the copy done.
>


Yes, that's one of the scenarios we're contemplating.  However, there
aren't any triggers at the Cassandra level, and even if there were, we
would get them multiple times.

So far, I believe my best bet is to run two clusters: one global
cluster that has NY and the satellite sites, and another that is NY
specific and serves as the archive site.

We would then write a placement strategy in NY that decorates the
configured placement strategy so that it copies the row over to
the archive site before passing it on to the non-archive NY cluster.

Re: Pyramid Organization of Data

Posted by Adrian Cockcroft <ac...@netflix.com>.
What you are asking for breaks the eventual consistency model, so you need to create a separate cluster in NYC that collects the same updates but has a much longer timeout before data is deleted, or that doesn't get the deletes at all.

One way is to have a trigger on writes on your pyramid nodes in NY that copies data over to the long-term analysis cluster. The two clusters won't be eventually consistent in the presence of failures, but with RF=3 you will get up to three triggers for each write, so you get three chances to get the copy done.
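To make those duplicate firings harmless, the copy itself can be idempotent, e.g. by reusing the original column timestamp so a replayed copy just overwrites itself. A rough Thrift-client sketch, assuming the 0.7-era Thrift API; the "HistoricalData" column family and the helper name are placeholders, not anything from this thread:

// Sketch only: assumes set_keyspace() has already been called on archiveClient.
import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class ArchiveCopier
{
    // Copies one column to the archive cluster. Because the original timestamp
    // is reused, running this up to three times for the same write converges on
    // the same result: Cassandra keeps the highest-timestamp value, so replays
    // are harmless.
    public static void copyToArchive(Cassandra.Client archiveClient,
                                     ByteBuffer key,
                                     ByteBuffer columnName,
                                     ByteBuffer columnValue,
                                     long originalTimestamp) throws Exception
    {
        Column column = new Column();
        column.setName(columnName);
        column.setValue(columnValue);
        column.setTimestamp(originalTimestamp);

        archiveClient.insert(key,
                             new ColumnParent("HistoricalData"), // placeholder CF name
                             column,
                             ConsistencyLevel.LOCAL_QUORUM);
    }
}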

Adrian

On Apr 14, 2011, at 10:18 AM, "Patrick Julien" <pj...@gmail.com> wrote:

> Thanks for your input Adrian, we've pretty much settled on this too.
> What I'm trying to figure out is how we do deletes.
> 
> We want to do deletes in the satellites because:
> 
> a) we'll run out of disk space very quickly with the amount of data we have
> b) we don't need more than 3 days worth of history in the satellites,
> we're currently planning for 7 days of capacity
> 
> However, the deletes will get replicated back to NY.  In NY, we don't
> want that, we want to run hadoop/pig over all that data dating back to
> several months/years.  Even if we set the replication factor of the
> satellites to 1 and NY to 3, we'll run out of space very quickly in
> the satellites.
> 
> 
> On Thu, Apr 14, 2011 at 11:23 AM, Adrian Cockcroft
> <ac...@netflix.com> wrote:
>> We have similar requirements for wide area backup/archive at Netflix.
>> I think what you want is a replica with RF of at least 3 in NY for all the
>> satellites, then each satellite could have a lower RF, but if you want safe
>> local quorum I would use 3 everywhere.
>> Then NY is the sum of all the satellites, so that makes most use of the disk
>> space.
>> For archival storage I suggest you use snapshots in NY and save compressed
>> tar files of each keyspace in NY. We've been working on this to allow full
>> and incremental backup and restore from our EC2 hosted Cassandra clusters
>> to/from S3. Full backup/restore works fine, incremental and per-keyspace
>> restore is being worked on.
>> Adrian
>> From: Patrick Julien <pj...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Thu, 14 Apr 2011 05:38:54 -0700
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Re: Pyramid Organization of Data
>> 
>> Thanks,  I'm still working the problem so anything I find out I will post
>> here.
>> 
>> Yes, you're right, that is the question I am asking.
>> 
>> No, adding more storage is not a solution since new york would have several
>> hundred times more storage.
>> 
>> On Apr 14, 2011 6:38 AM, "aaron morton" <aa...@thelastpickle.com> wrote:
>>> I think your question is "NY is the archive, after a certain amount of
>>> time we want to delete the row from the original DC but keep it in the
>>> archive in NY."
>>> 
>>> Once you delete a row, it's deleted as far as the client is concerned.
>>> GCGraceSeconds is only concerned with when the tombstone marker can be
>>> removed. If NY has a replica of a row from Tokyo and the row is deleted in
>>> either DC, it will be deleted in the other DC as well.
>>> 
>>> Some thoughts...
>>> 1) Add more storage in the satellite DC's, then tilt you chair to
>>> celebrate a job well done :)
>>> 2) Run two clusters as you say.
>>> 3) Just thinking out loud, and I know this does not work now. Would it be
>>> possible to support per CF strategy options, so an archive CF only
>>> replicates to NY ? Can think of possible problems with repair and
>>> LOCAL_QUORUM, out of interest what else would it break?
>>> 
>>> Hope that helps.
>>> Aaron
>>> 
>>> 
>>> 
>>> On 14 Apr 2011, at 10:17, Patrick Julien wrote:
>>> 
>>>> We have been successful in implementing, at scale, the comments you
>>>> posted here. I'm wondering what we can do about deleting data
>>>> however.
>>>> 
>>>> The way I see it, we have considerably more storage capacity in NY,
>>>> but not in the other sites. Using this technique here, it occurs to
>>>> me that we would replicate non-NY deleted rows back to NY. Is there a
>>>> way to tell NY not to tombstone rows?
>>>> 
>>>> The ideas I have so far:
>>>> 
>>>> - Set GCGracePeriod to be much higher in NY than in the other sites.
>>>> This way we can get to tombstone'd rows well beyond their disk life in
>>>> other sites.
>>>> - A variant on this solution is to set the TTL on rows in non NY sites
>>>> and again, set the GCGracePeriod to be considerably higher in NY
>>>> - break this up to multiple clusters and do one write from the client
>>>> to the its 'local' cluster and one write to the NY cluster.
>>>> 
>>>> 
>>>> 
>>>> On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>>> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
>>>>> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
>>>>> 2, NYC: 1}, for example.
>>>>> 
>>>>> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com>
>>>>> wrote:
>>>>>> I'm familiar with this material. I hadn't thought of it from this
>>>>>> angle but I believe what you're suggesting is that the different data
>>>>>> centers would hold a different properties file for node discovery
>>>>>> instead of using auto-discovery.
>>>>>> 
>>>>>> So Tokyo, and others, would have a configuration that make it
>>>>>> oblivious to the non New York data centers.
>>>>>> New York would have a configuration that would give it knowledge of no
>>>>>> other data center.
>>>>>> 
>>>>>> Would that work? Wouldn't the NY data center wonder where these other
>>>>>> writes are coming from?
>>>>>> 
>>>>>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com>
>>>>>> wrote:
>>>>>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com>
>>>>>>> wrote:
>>>>>>>> The problem is this: we would like the historical data from Tokyo to
>>>>>>>> stay in Tokyo and only be replicated to New York. The one in London
>>>>>>>> to be in London and only be replicated to New York and so on for all
>>>>>>>> data centers.
>>>>>>>> 
>>>>>>>> Is this currently possible with Cassandra? I believe we would need to
>>>>>>>> run multiple clusters and migrate data manually from data centers to
>>>>>>>> North America to achieve this. Also, any suggestions would also be
>>>>>>>> welcomed.
>>>>>>> 
>>>>>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>>>>>> per-datacenter:
>>>>>>> 
>>>>>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>>>>> 
>>>>>>> --
>>>>>>> Jonathan Ellis
>>>>>>> Project Chair, Apache Cassandra
>>>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>>>> http://www.datastax.com
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>> http://www.datastax.com
>>>>> 
>>> 
>> 
> 

Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
Thanks for your input, Adrian; we've pretty much settled on this too.
What I'm trying to figure out is how we do deletes.

We want to do deletes in the satellites because:

a) we'll run out of disk space very quickly with the amount of data we have
b) we don't need more than 3 days' worth of history in the satellites;
we're currently planning for 7 days of capacity

However, the deletes will get replicated back to NY.  In NY, we don't
want that: we want to run Hadoop/Pig over all of that data dating back
several months or years.  Even if we set the replication factor of the
satellites to 1 and NY to 3, we'll run out of space very quickly in
the satellites.


On Thu, Apr 14, 2011 at 11:23 AM, Adrian Cockcroft
<ac...@netflix.com> wrote:
> We have similar requirements for wide area backup/archive at Netflix.
> I think what you want is a replica with RF of at least 3 in NY for all the
> satellites, then each satellite could have a lower RF, but if you want safe
> local quorum I would use 3 everywhere.
> Then NY is the sum of all the satellites, so that makes most use of the disk
> space.
> For archival storage I suggest you use snapshots in NY and save compressed
> tar files of each keyspace in NY. We've been working on this to allow full
> and incremental backup and restore from our EC2 hosted Cassandra clusters
> to/from S3. Full backup/restore works fine, incremental and per-keyspace
> restore is being worked on.
> Adrian
> From: Patrick Julien <pj...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Thu, 14 Apr 2011 05:38:54 -0700
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: Pyramid Organization of Data
>
> Thanks,  I'm still working the problem so anything I find out I will post
> here.
>
> Yes, you're right, that is the question I am asking.
>
> No, adding more storage is not a solution since new york would have several
> hundred times more storage.
>
> On Apr 14, 2011 6:38 AM, "aaron morton" <aa...@thelastpickle.com> wrote:
>> I think your question is "NY is the archive, after a certain amount of
>> time we want to delete the row from the original DC but keep it in the
>> archive in NY."
>>
>> Once you delete a row, it's deleted as far as the client is concerned.
>> GCGraceSeconds is only concerned with when the tombstone marker can be
>> removed. If NY has a replica of a row from Tokyo and the row is deleted in
>> either DC, it will be deleted in the other DC as well.
>>
>> Some thoughts...
>> 1) Add more storage in the satellite DC's, then tilt you chair to
>> celebrate a job well done :)
>> 2) Run two clusters as you say.
>> 3) Just thinking out loud, and I know this does not work now. Would it be
>> possible to support per CF strategy options, so an archive CF only
>> replicates to NY ? Can think of possible problems with repair and
>> LOCAL_QUORUM, out of interest what else would it break?
>>
>> Hope that helps.
>> Aaron
>>
>>
>>
>> On 14 Apr 2011, at 10:17, Patrick Julien wrote:
>>
>>> We have been successful in implementing, at scale, the comments you
>>> posted here. I'm wondering what we can do about deleting data
>>> however.
>>>
>>> The way I see it, we have considerably more storage capacity in NY,
>>> but not in the other sites. Using this technique here, it occurs to
>>> me that we would replicate non-NY deleted rows back to NY. Is there a
>>> way to tell NY not to tombstone rows?
>>>
>>> The ideas I have so far:
>>>
>>> - Set GCGracePeriod to be much higher in NY than in the other sites.
>>> This way we can get to tombstone'd rows well beyond their disk life in
>>> other sites.
>>> - A variant on this solution is to set the TTL on rows in non NY sites
>>> and again, set the GCGracePeriod to be considerably higher in NY
>>> - break this up to multiple clusters and do one write from the client
>>> to the its 'local' cluster and one write to the NY cluster.
>>>
>>>
>>>
>>> On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
>>>> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
>>>> 2, NYC: 1}, for example.
>>>>
>>>> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com>
>>>> wrote:
>>>>> I'm familiar with this material. I hadn't thought of it from this
>>>>> angle but I believe what you're suggesting is that the different data
>>>>> centers would hold a different properties file for node discovery
>>>>> instead of using auto-discovery.
>>>>>
>>>>> So Tokyo, and others, would have a configuration that make it
>>>>> oblivious to the non New York data centers.
>>>>> New York would have a configuration that would give it knowledge of no
>>>>> other data center.
>>>>>
>>>>> Would that work? Wouldn't the NY data center wonder where these other
>>>>> writes are coming from?
>>>>>
>>>>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com>
>>>>> wrote:
>>>>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com>
>>>>>> wrote:
>>>>>>> The problem is this: we would like the historical data from Tokyo to
>>>>>>> stay in Tokyo and only be replicated to New York. The one in London
>>>>>>> to be in London and only be replicated to New York and so on for all
>>>>>>> data centers.
>>>>>>>
>>>>>>> Is this currently possible with Cassandra? I believe we would need to
>>>>>>> run multiple clusters and migrate data manually from data centers to
>>>>>>> North America to achieve this. Also, any suggestions would also be
>>>>>>> welcomed.
>>>>>>
>>>>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>>>>> per-datacenter:
>>>>>>
>>>>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>>>>
>>>>>> --
>>>>>> Jonathan Ellis
>>>>>> Project Chair, Apache Cassandra
>>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>>> http://www.datastax.com
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of DataStax, the source for professional Cassandra support
>>>> http://www.datastax.com
>>>>
>>
>

Re: Pyramid Organization of Data

Posted by Adrian Cockcroft <ac...@netflix.com>.
We have similar requirements for wide area backup/archive at Netflix.

I think what you want is a replica with an RF of at least 3 in NY for all the satellites; each satellite could then have a lower RF, but if you want a safe local quorum I would use 3 everywhere.

Then NY is the sum of all the satellites, so that makes the most use of the disk space.

For archival storage I suggest you use snapshots in NY and save compressed tar files of each keyspace in NY. We've been working on this to allow full and incremental backup and restore from our EC2-hosted Cassandra clusters to/from S3. Full backup/restore works fine; incremental and per-keyspace restore is being worked on.

Adrian

From: Patrick Julien <pj...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Thu, 14 Apr 2011 05:38:54 -0700
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Pyramid Organization of Data


Thanks,  I'm still working the problem so anything I find out I will post here.

Yes, you're right, that is the question I am asking.

No, adding more storage is not a solution since new york would have several hundred times more storage.

On Apr 14, 2011 6:38 AM, "aaron morton" <aa...@thelastpickle.com> wrote:
> I think your question is "NY is the archive, after a certain amount of time we want to delete the row from the original DC but keep it in the archive in NY."
>
> Once you delete a row, it's deleted as far as the client is concerned. GCGraceSeconds is only concerned with when the tombstone marker can be removed. If NY has a replica of a row from Tokyo and the row is deleted in either DC, it will be deleted in the other DC as well.
>
> Some thoughts...
> 1) Add more storage in the satellite DC's, then tilt you chair to celebrate a job well done :)
> 2) Run two clusters as you say.
> 3) Just thinking out loud, and I know this does not work now. Would it be possible to support per CF strategy options, so an archive CF only replicates to NY ? Can think of possible problems with repair and LOCAL_QUORUM, out of interest what else would it break?
>
> Hope that helps.
> Aaron
>
>
>
> On 14 Apr 2011, at 10:17, Patrick Julien wrote:
>
>> We have been successful in implementing, at scale, the comments you
>> posted here. I'm wondering what we can do about deleting data
>> however.
>>
>> The way I see it, we have considerably more storage capacity in NY,
>> but not in the other sites. Using this technique here, it occurs to
>> me that we would replicate non-NY deleted rows back to NY. Is there a
>> way to tell NY not to tombstone rows?
>>
>> The ideas I have so far:
>>
>> - Set GCGracePeriod to be much higher in NY than in the other sites.
>> This way we can get to tombstone'd rows well beyond their disk life in
>> other sites.
>> - A variant on this solution is to set the TTL on rows in non NY sites
>> and again, set the GCGracePeriod to be considerably higher in NY
>> - break this up to multiple clusters and do one write from the client
>> to the its 'local' cluster and one write to the NY cluster.
>>
>>
>>
>> On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
>>> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
>>> 2, NYC: 1}, for example.
>>>
>>> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com> wrote:
>>>> I'm familiar with this material. I hadn't thought of it from this
>>>> angle but I believe what you're suggesting is that the different data
>>>> centers would hold a different properties file for node discovery
>>>> instead of using auto-discovery.
>>>>
>>>> So Tokyo, and others, would have a configuration that make it
>>>> oblivious to the non New York data centers.
>>>> New York would have a configuration that would give it knowledge of no
>>>> other data center.
>>>>
>>>> Would that work? Wouldn't the NY data center wonder where these other
>>>> writes are coming from?
>>>>
>>>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>>>>>> The problem is this: we would like the historical data from Tokyo to
>>>>>> stay in Tokyo and only be replicated to New York. The one in London
>>>>>> to be in London and only be replicated to New York and so on for all
>>>>>> data centers.
>>>>>>
>>>>>> Is this currently possible with Cassandra? I believe we would need to
>>>>>> run multiple clusters and migrate data manually from data centers to
>>>>>> North America to achieve this. Also, any suggestions would also be
>>>>>> welcomed.
>>>>>
>>>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>>>> per-datacenter:
>>>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>> http://www.datastax.com
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>

Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
Thanks.  I'm still working the problem, so anything I find out I will post
here.

Yes, you're right, that is the question I am asking.

No, adding more storage is not a solution, since New York would have several
hundred times more storage.
On Apr 14, 2011 6:38 AM, "aaron morton" <aa...@thelastpickle.com> wrote:
> I think your question is "NY is the archive, after a certain amount of
time we want to delete the row from the original DC but keep it in the
archive in NY."
>
> Once you delete a row, it's deleted as far as the client is concerned.
GCGraceSeconds is only concerned with when the tombstone marker can be
removed. If NY has a replica of a row from Tokyo and the row is deleted in
either DC, it will be deleted in the other DC as well.
>
> Some thoughts...
> 1) Add more storage in the satellite DC's, then tilt you chair to
celebrate a job well done :)
> 2) Run two clusters as you say.
> 3) Just thinking out loud, and I know this does not work now. Would it be
possible to support per CF strategy options, so an archive CF only
replicates to NY ? Can think of possible problems with repair and
LOCAL_QUORUM, out of interest what else would it break?
>
> Hope that helps.
> Aaron
>
>
>
> On 14 Apr 2011, at 10:17, Patrick Julien wrote:
>
>> We have been successful in implementing, at scale, the comments you
>> posted here. I'm wondering what we can do about deleting data
>> however.
>>
>> The way I see it, we have considerably more storage capacity in NY,
>> but not in the other sites. Using this technique here, it occurs to
>> me that we would replicate non-NY deleted rows back to NY. Is there a
>> way to tell NY not to tombstone rows?
>>
>> The ideas I have so far:
>>
>> - Set GCGracePeriod to be much higher in NY than in the other sites.
>> This way we can get to tombstone'd rows well beyond their disk life in
>> other sites.
>> - A variant on this solution is to set the TTL on rows in non NY sites
>> and again, set the GCGracePeriod to be considerably higher in NY
>> - break this up to multiple clusters and do one write from the client
>> to the its 'local' cluster and one write to the NY cluster.
>>
>>
>>
>> On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
>>> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
>>> 2, NYC: 1}, for example.
>>>
>>> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com>
wrote:
>>>> I'm familiar with this material. I hadn't thought of it from this
>>>> angle but I believe what you're suggesting is that the different data
>>>> centers would hold a different properties file for node discovery
>>>> instead of using auto-discovery.
>>>>
>>>> So Tokyo, and others, would have a configuration that make it
>>>> oblivious to the non New York data centers.
>>>> New York would have a configuration that would give it knowledge of no
>>>> other data center.
>>>>
>>>> Would that work? Wouldn't the NY data center wonder where these other
>>>> writes are coming from?
>>>>
>>>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com>
wrote:
>>>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com>
wrote:
>>>>>> The problem is this: we would like the historical data from Tokyo to
>>>>>> stay in Tokyo and only be replicated to New York. The one in London
>>>>>> to be in London and only be replicated to New York and so on for all
>>>>>> data centers.
>>>>>>
>>>>>> Is this currently possible with Cassandra? I believe we would need to
>>>>>> run multiple clusters and migrate data manually from data centers to
>>>>>> North America to achieve this. Also, any suggestions would also be
>>>>>> welcomed.
>>>>>
>>>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>>>> per-datacenter:
>>>>>
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>> http://www.datastax.com
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>

Re: Pyramid Organization of Data

Posted by aaron morton <aa...@thelastpickle.com>.
I think your question is "NY is the archive, after a certain amount of time we want to delete the row from the original DC but keep it in the archive in NY."

Once you delete a row, it's deleted as far as the client is concerned. GCGraceSeconds is only concerned with when the tombstone marker can be removed. If NY has a replica of a row from Tokyo and the row is deleted in either DC, it will be deleted in the other DC as well.

Some thoughts...
1) Add more storage in the satellite DCs, then tilt your chair to celebrate a job well done :)
2) Run two clusters as you say. 
3) Just thinking out loud, and I know this does not work now: would it be possible to support per-CF strategy options, so that an archive CF only replicates to NY? I can think of possible problems with repair and LOCAL_QUORUM; out of interest, what else would it break?

Hope that helps.
Aaron


 
On 14 Apr 2011, at 10:17, Patrick Julien wrote:

> We have been successful in implementing, at scale, the comments you
> posted here.  I'm wondering what we can do about deleting data
> however.
> 
> The way I see it, we have considerably more storage capacity in NY,
> but not in the other sites.  Using this technique here, it occurs to
> me that we would replicate non-NY deleted rows back to NY.  Is there a
> way to tell NY not to tombstone rows?
> 
> The ideas I have so far:
> 
> - Set GCGracePeriod to be much higher in NY than in the other sites.
> This way we can get to tombstone'd rows well beyond their disk life in
> other sites.
> - A variant on this solution is to set the TTL on rows in non NY sites
> and again, set the GCGracePeriod to be considerably higher in NY
> - break this up to multiple clusters and do one write from the client
> to the its 'local' cluster and one write to the NY cluster.
> 
> 
> 
> On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
>> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
>> 2, NYC: 1}, for example.
>> 
>> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com> wrote:
>>> I'm familiar with this material.  I hadn't thought of it from this
>>> angle but I believe what you're suggesting is that the different data
>>> centers would hold a different properties file for node discovery
>>> instead of using auto-discovery.
>>> 
>>> So Tokyo, and others, would have a configuration that make it
>>> oblivious to the non New York data centers.
>>> New York would have a configuration that would give it knowledge of no
>>> other data center.
>>> 
>>> Would that work?  Wouldn't the NY data center wonder where these other
>>> writes are coming from?
>>> 
>>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>>>>> The problem is this: we would like the historical data from Tokyo to
>>>>> stay in Tokyo and only be replicated to New York.  The one in London
>>>>> to be in London and only be replicated to New York and so on for all
>>>>> data centers.
>>>>> 
>>>>> Is this currently possible with Cassandra?  I believe we would need to
>>>>> run multiple clusters and migrate data manually from data centers to
>>>>> North America to achieve this.  Also, any suggestions would also be
>>>>> welcomed.
>>>> 
>>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>>> per-datacenter:
>>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>> 
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of DataStax, the source for professional Cassandra support
>>>> http://www.datastax.com
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>> 


Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
We have been successful in implementing, at scale, the comments you
posted here.  I'm wondering, however, what we can do about deleting
data.

The way I see it, we have considerably more storage capacity in NY,
but not in the other sites.  Using this technique here, it occurs to
me that we would replicate non-NY deleted rows back to NY.  Is there a
way to tell NY not to tombstone rows?

The ideas I have so far:

- Set GCGraceSeconds to be much higher in NY than in the other sites.
This way we can get to tombstoned rows well beyond their disk life in
the other sites.
- A variant on this solution is to set the TTL on rows in non-NY sites
and, again, set GCGraceSeconds to be considerably higher in NY (a
sketch of such a write is below).
- Break this up into multiple clusters and do one write from the client
to its 'local' cluster and one write to the NY cluster.
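For the TTL variant, a satellite-side write would look roughly like the sketch below with the 0.7+ Thrift API (Column.ttl was added in 0.7). TTLs are set per column rather than per row; the column family name and the 3-day TTL are illustrative only, and note that within a single cluster the expiry would apply to the NY replicas as well, which is why the two-cluster option keeps coming up.

// Sketch only: assumes the 0.7+ Thrift API and that set_keyspace()
// has already been called on the client; names below are placeholders.
import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class SatelliteWriter
{
    private static final int THREE_DAYS_IN_SECONDS = 3 * 24 * 60 * 60;

    // Writes a column that Cassandra will expire automatically after three
    // days, without the client ever issuing an explicit delete.
    public static void writeWithTtl(Cassandra.Client client,
                                    ByteBuffer rowKey,
                                    String columnName,
                                    String value) throws Exception
    {
        Column column = new Column();
        column.setName(ByteBuffer.wrap(columnName.getBytes("UTF-8")));
        column.setValue(ByteBuffer.wrap(value.getBytes("UTF-8")));
        column.setTimestamp(System.currentTimeMillis() * 1000); // microsecond convention
        column.setTtl(THREE_DAYS_IN_SECONDS);

        client.insert(rowKey,
                      new ColumnParent("HistoricalData"), // placeholder CF name
                      column,
                      ConsistencyLevel.LOCAL_QUORUM);
    }
}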



On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
> 2, NYC: 1}, for example.
>
> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com> wrote:
>> I'm familiar with this material.  I hadn't thought of it from this
>> angle but I believe what you're suggesting is that the different data
>> centers would hold a different properties file for node discovery
>> instead of using auto-discovery.
>>
>> So Tokyo, and others, would have a configuration that make it
>> oblivious to the non New York data centers.
>> New York would have a configuration that would give it knowledge of no
>> other data center.
>>
>> Would that work?  Wouldn't the NY data center wonder where these other
>> writes are coming from?
>>
>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>>>> The problem is this: we would like the historical data from Tokyo to
>>>> stay in Tokyo and only be replicated to New York.  The one in London
>>>> to be in London and only be replicated to New York and so on for all
>>>> data centers.
>>>>
>>>> Is this currently possible with Cassandra?  I believe we would need to
>>>> run multiple clusters and migrate data manually from data centers to
>>>> North America to achieve this.  Also, any suggestions would also be
>>>> welcomed.
>>>
>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>> per-datacenter:
>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
Thank you, I get it now.

On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> No, I'm suggesting you have a Tokyo keyspace that gets replicated as
> {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London:
> 2, NYC: 1}, for example.
>
> On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com> wrote:
>> I'm familiar with this material.  I hadn't thought of it from this
>> angle but I believe what you're suggesting is that the different data
>> centers would hold a different properties file for node discovery
>> instead of using auto-discovery.
>>
>> So Tokyo, and others, would have a configuration that make it
>> oblivious to the non New York data centers.
>> New York would have a configuration that would give it knowledge of no
>> other data center.
>>
>> Would that work?  Wouldn't the NY data center wonder where these other
>> writes are coming from?
>>
>> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>>>> The problem is this: we would like the historical data from Tokyo to
>>>> stay in Tokyo and only be replicated to New York.  The one in London
>>>> to be in London and only be replicated to New York and so on for all
>>>> data centers.
>>>>
>>>> Is this currently possible with Cassandra?  I believe we would need to
>>>> run multiple clusters and migrate data manually from data centers to
>>>> North America to achieve this.  Also, any suggestions would also be
>>>> welcomed.
>>>
>>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>>> per-datacenter:
>>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

Re: Pyramid Organization of Data

Posted by Jonathan Ellis <jb...@gmail.com>.
No, I'm suggesting you have a Tokyo keyspace that gets replicated as
{Tokyo: 2, NYC: 1}, a London keyspace that gets replicated as {London:
2, NYC: 1}, for example.
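Concretely, the Tokyo keyspace could be defined with those per-datacenter strategy options roughly as in the sketch below, using the 0.7/0.8-era Thrift API (the datacenter names must match what your endpoint snitch reports, and the equivalent cassandra-cli syntax varies by version):

// Sketch only: per-keyspace, per-datacenter replication via NetworkTopologyStrategy.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;

public class KeyspaceSetup
{
    // Defines a "Tokyo" keyspace whose rows live twice in the Tokyo DC and once
    // in NYC; a "London" keyspace would be the same with {London:2, NYC:1}.
    public static void createTokyoKeyspace(Cassandra.Client client) throws Exception
    {
        Map<String, String> strategyOptions = new HashMap<String, String>();
        strategyOptions.put("Tokyo", "2");
        strategyOptions.put("NYC", "1");

        KsDef ksDef = new KsDef();
        ksDef.setName("Tokyo");
        ksDef.setStrategy_class("org.apache.cassandra.locator.NetworkTopologyStrategy");
        ksDef.setStrategy_options(strategyOptions);
        ksDef.setCf_defs(new ArrayList<CfDef>()); // add your column family definitions here

        client.system_add_keyspace(ksDef);
    }
}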

On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien <pj...@gmail.com> wrote:
> I'm familiar with this material.  I hadn't thought of it from this
> angle but I believe what you're suggesting is that the different data
> centers would hold a different properties file for node discovery
> instead of using auto-discovery.
>
> So Tokyo, and others, would have a configuration that make it
> oblivious to the non New York data centers.
> New York would have a configuration that would give it knowledge of no
> other data center.
>
> Would that work?  Wouldn't the NY data center wonder where these other
> writes are coming from?
>
> On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>>> The problem is this: we would like the historical data from Tokyo to
>>> stay in Tokyo and only be replicated to New York.  The one in London
>>> to be in London and only be replicated to New York and so on for all
>>> data centers.
>>>
>>> Is this currently possible with Cassandra?  I believe we would need to
>>> run multiple clusters and migrate data manually from data centers to
>>> North America to achieve this.  Also, any suggestions would also be
>>> welcomed.
>>
>> NetworkTopologyStrategy allows configuration replicas per-keyspace,
>> per-datacenter:
>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Pyramid Organization of Data

Posted by Patrick Julien <pj...@gmail.com>.
I'm familiar with this material.  I hadn't thought of it from this
angle but I believe what you're suggesting is that the different data
centers would hold a different properties file for node discovery
instead of using auto-discovery.

So Tokyo, and the others, would have a configuration that makes them
oblivious to the non-New York data centers.
New York would have a configuration that gives it knowledge of no
other data center.

Would that work?  Wouldn't the NY data center wonder where these other
writes are coming from?

On Fri, Apr 8, 2011 at 6:38 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
>> The problem is this: we would like the historical data from Tokyo to
>> stay in Tokyo and only be replicated to New York.  The one in London
>> to be in London and only be replicated to New York and so on for all
>> data centers.
>>
>> Is this currently possible with Cassandra?  I believe we would need to
>> run multiple clusters and migrate data manually from data centers to
>> North America to achieve this.  Also, any suggestions would also be
>> welcomed.
>
> NetworkTopologyStrategy allows configuration replicas per-keyspace,
> per-datacenter:
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

Re: Pyramid Organization of Data

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien <pj...@gmail.com> wrote:
> The problem is this: we would like the historical data from Tokyo to
> stay in Tokyo and only be replicated to New York.  The one in London
> to be in London and only be replicated to New York and so on for all
> data centers.
>
> Is this currently possible with Cassandra?  I believe we would need to
> run multiple clusters and migrate data manually from data centers to
> North America to achieve this.  Also, any suggestions would also be
> welcomed.

NetworkTopologyStrategy allows configuring replicas per keyspace,
per datacenter:
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com