Posted to user@cassandra.apache.org by Gary <da...@gmail.com> on 2010/03/19 20:16:28 UTC

Digg's data model

I am new to Bigtable-like data models and have a question. Take
Digg as an example: I want to find the list of users who dug a URL, and also
the list of URLs a user dug. What should the data model look like for these
queries to be efficient? If I use the username and the URL as two row keys,
then when a user digs a URL I have to update two rows, so I need a
transaction to keep the data consistent.

Any thoughts?

Thanks,
Gary

Re: Digg's data model

Posted by Joe Stump <jo...@joestump.net>.
On Mar 19, 2010, at 1:16 PM, Gary wrote:

> I am new to Bigtable-like data models and have a question. Take Digg as an example: I want to find the list of users who dug a URL, and also the list of URLs a user dug. What should the data model look like for these queries to be efficient? If I use the username and the URL as two row keys, then when a user digs a URL I have to update two rows, so I need a transaction to keep the data consistent.

I'll let my former colleagues at Digg chime in here, but I will say you should probably start wrapping your brain around two things that fly quickly out the window with distributed, eventually consistent systems like Cassandra:

1. Normalization 
2. Transactions

:)

--Joe

Re: Digg's data model

Posted by Chris Goffinet <go...@digg.com>.
On Mar 20, 2010, at 9:10 AM, Jeremy Dunck wrote:

> On Sat, Mar 20, 2010 at 10:40 AM, Chris Goffinet <go...@digg.com> wrote:
>>> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup scenarios do you recommend?
>> 
>> Worst case scenario (total failure), we opted to do global snapshots every 24 hours. This creates hard links to the SSTables on each node. We copy those SSTables to HDFS on a daily basis. We also wrote a patch that logs every event going into the commit log to Scribe, so we have a rolling commit log in HDFS. In the event that the entire cluster is corrupted, we can take the last 24-hour snapshot plus the commit log from right after that snapshot and get the cluster back to the last known good state.
> 
> Doesn't this leave you open to corruption you don't discover within 24 hours?

No. We aren't storing the actual commit log structure; we have our own.

-Chris


Re: Digg's data model

Posted by Jeremy Dunck <jd...@gmail.com>.
On Sat, Mar 20, 2010 at 10:40 AM, Chris Goffinet <go...@digg.com> wrote:
>> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup scenarios do you recommend?
>
> Worst case scenario (total failure), we opted to do global snapshots every 24 hours. This creates hard links to the SSTables on each node. We copy those SSTables to HDFS on a daily basis. We also wrote a patch that logs every event going into the commit log to Scribe, so we have a rolling commit log in HDFS. In the event that the entire cluster is corrupted, we can take the last 24-hour snapshot plus the commit log from right after that snapshot and get the cluster back to the last known good state.

Doesn't this leave you open to corruption you don't discover within 24 hours?

Re: Digg's data model

Posted by Ned Wolpert <ne...@imemories.com>.
I'm curious why you are storing the backups (SSTables and commit logs) in
HDFS instead of something like Lustre. Are your backups using Hadoop's
map/reduce somehow, or is it just for convenience?

On Sat, Mar 20, 2010 at 8:40 AM, Chris Goffinet <go...@digg.com> wrote:

> > 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup
> > scenarios do you recommend?
>
> Worst case scenario (total failure), we opted to do global snapshots every
> 24 hours. This creates hard links to the SSTables on each node. We copy
> those SSTables to HDFS on a daily basis. We also wrote a patch that logs
> every event going into the commit log to Scribe, so we have a rolling
> commit log in HDFS. In the event that the entire cluster is corrupted, we
> can take the last 24-hour snapshot plus the commit log from right after
> that snapshot and get the cluster back to the last known good state.
>
> -Chris




-- 
Virtually, Ned Wolpert

"Settle thy studies, Faustus, and begin..."   --Marlowe

Re: Digg's data model

Posted by Chris Goffinet <go...@digg.com>.
> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup scenarios do you recommend?

Worst case scenario (total failure), we opted to do global snapshots every 24 hours. This creates hard links to the SSTables on each node. We copy those SSTables to HDFS on a daily basis. We also wrote a patch that logs every event going into the commit log to Scribe, so we have a rolling commit log in HDFS. In the event that the entire cluster is corrupted, we can take the last 24-hour snapshot plus the commit log from right after that snapshot and get the cluster back to the last known good state.
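
A rough sketch of the snapshot-and-copy half of that flow, as a daily cron job (the hostnames and paths are made up for illustration; the Scribe commit-log patch is not shown):

    #!/usr/bin/env python
    # Sketch of a daily snapshot-and-ship job. Assumes `nodetool` and the
    # `hadoop` CLI are on PATH; hostnames and paths are hypothetical.
    import datetime
    import subprocess

    NODES = ["cass01", "cass02", "cass03"]   # hypothetical node hostnames
    DATA_DIR = "/var/lib/cassandra/data"     # hypothetical data directory
    HDFS_DEST = "/backups/cassandra"         # hypothetical HDFS target

    def snapshot_and_ship():
        tag = datetime.date.today().isoformat()
        for node in NODES:
            # `nodetool snapshot` hard-links the node's current SSTables,
            # so taking the snapshot itself is cheap.
            subprocess.check_call(["nodetool", "-h", node, "snapshot", tag])
            # Copy the hard-linked SSTables up to HDFS (assumes the snapshot
            # directory is reachable from this host, e.g. via a shared mount).
            subprocess.check_call([
                "hadoop", "fs", "-put",
                "%s/snapshots/%s" % (DATA_DIR, tag),
                "%s/%s/%s" % (HDFS_DEST, node, tag),
            ])

    if __name__ == "__main__":
        snapshot_and_ship()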

-Chris

Re: Digg's data model

Posted by Joe Stump <jo...@joestump.net>.
On Mar 20, 2010, at 2:53 AM, Lenin Gali wrote:

> 1. Eventual consistency: Given a volume of 5K writes/sec, of which roughly 1,500/sec are updates and the rest are inserts, what kind of latency can be expected before consistency is reached?

Depending on the size of the cluster, you're not looking at much latency at all: on the order of tens of milliseconds.

> 2. Performance: Are there any benchmarks on how many writes/sec and reads/sec Cassandra supports on an "n node" cluster? A node can be of variable size, and I would also like to know the hardware/software details of the cluster.

Cassandra's performance is impressive. We had one node spike at 103,000 reads a second with a load average of only about 6, which is high, but not alarmingly so.

> 3. EC2: Has anyone run Cassandra on EC2? What kind of transaction volume are they using it for, and how has their experience with Cassandra on EC2 been?

We have a 15-node cluster on EC2. We have a patch that adds a rack-aware replication strategy specifically for EC2, placing replicas so that each key has one copy in each availability zone (AZ). We run Cassandra across 3 AZs on large instances, with the ephemeral drives in a RAID 0 setup with XFS.

You might also be interested in this:
http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html
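
Very roughly, the placement idea behind that patch looks like this (a conceptual sketch in Python, not the actual code; the node-to-AZ mapping is made up):

    # Sketch of AZ-aware replica placement (the concept, not the real patch).
    # Walk the token ring from the key's primary position and take the first
    # node seen in each distinct availability zone.
    NODE_AZ = {  # hypothetical node -> availability zone mapping
        "10.0.0.1": "us-east-1a", "10.0.0.2": "us-east-1b",
        "10.0.0.3": "us-east-1c", "10.0.0.4": "us-east-1a",
    }

    def replicas_for(ring, primary_index, replication_factor=3):
        """ring is the token-ordered list of node addresses."""
        chosen, seen_azs = [], set()
        for i in range(len(ring)):
            node = ring[(primary_index + i) % len(ring)]
            az = NODE_AZ[node]
            if az not in seen_azs:
                chosen.append(node)
                seen_azs.add(az)
            if len(chosen) == replication_factor:
                break
        return chosen

    # e.g. replicas_for(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"], 0)
    # -> ["10.0.0.1", "10.0.0.2", "10.0.0.3"], one replica per AZ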

> 4. Overhead and issues: What are the typical nightmare scenarios one could face when using Cassandra for heavy write/read intensive systems?

We haven't run into any, but when we do find hot spots in the cluster, we bootstrap a new node into the cluster with a token range that alleviates the hot spot. This has been rather painless in our experience.
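
For example, the new node's initial token can be chosen as the midpoint of the hot range (a sketch, assuming RandomPartitioner's 0..2**127 token space):

    # Sketch: choose an initial_token that splits a hot range in half,
    # assuming RandomPartitioner's 0 .. 2**127 token space.
    def midpoint_token(left_token, right_token, ring_size=2**127):
        span = (right_token - left_token) % ring_size
        return (left_token + span // 2) % ring_size

    # Bootstrap the new node with initial_token=midpoint_token(t1, t2),
    # where t1 and t2 are the tokens bounding the hot range.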

> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup scenarios do you recommend?

There isn't one that I know of. This is what the replication factor is for. We keep three copies of each key in three different datacenters. That's our backup strategy. 

> Also, does Cassandra support counters?

Not yet, but there's active work happening in this area.

> Digg's article said they are going to contribute their work to open source; any idea when that would be?

Chris Goffinet is a committer on Cassandra. Digg's changes are contributed back upstream on an almost daily basis.

--Joe

Re: Digg's data model

Posted by Chris Goffinet <cg...@chrisgoffinet.com>.
> Also, does Cassandra support counters? Digg's article said they are going to contribute their work to open source; any idea when that would be?
> 

All of the custom work has been pushed upstream from Digg, and that continues. We have a few operational tools we will be releasing that will go into contrib/. I would say counters with vector clocks should have some JIRA patches up in a few weeks; we have someone working full-time on that.
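
For anyone unfamiliar with vector clocks, the core idea fits in a few lines (a generic sketch of the technique, not the actual patch):

    # Generic vector-clock sketch (the idea, not the actual Cassandra patch).
    # Each node keeps its own entry; merging takes the elementwise max, and
    # two clocks conflict when neither dominates the other.
    def merge(a, b):
        """Combine two vector clocks of the form {node_id: count, ...}."""
        return dict((n, max(a.get(n, 0), b.get(n, 0))) for n in set(a) | set(b))

    def dominates(a, b):
        """True if clock a is causally at or after clock b."""
        return all(a.get(n, 0) >= c for n, c in b.items())

    # Concurrent updates (neither clock dominates) must be reconciled; for
    # counters, e.g., by summing the per-node contributions.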

> Thanks in advance for sharing your experience
> 
> Lenin
> 
> 
> 
> On Fri, Mar 19, 2010 at 1:03 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> Jeff Hodsdon edited the new link in:
> http://about.digg.com/blog/looking-future-cassandra
> 
> On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall <na...@vervewireless.com> wrote:
> > Gary,
> > Did you see this article linked from the Cassandra wiki?
> > http://about.digg.com/node/564
> >
> > See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
> > examples like the above. In general, you structure your data according
> > to how it will be queried. This can lead to duplication, but that is
> > one of the trade-offs for performance and scale.
> >
> > Digg folks - the "Looking to the Future with Cassandra" linked on the
> > wiki is no longer available. I found that article quite helpful
> > originally. Is there a chance this could be re-posted?
> >
> > Cheers,
> > -Nate
> >
> > On Fri, Mar 19, 2010 at 12:16 PM, Gary <da...@gmail.com> wrote:
> >> I am new to Bigtable-like data models and have a question. Take Digg as
> >> an example: I want to find the list of users who dug a URL, and also the
> >> list of URLs a user dug. What should the data model look like for these
> >> queries to be efficient? If I use the username and the URL as two row
> >> keys, then when a user digs a URL I have to update two rows, so I need a
> >> transaction to keep the data consistent.
> >> Any thoughts?
> >> Thanks,
> >> Gary
> >
> 
> 
> 
> -- 
> twitter: leningali
> skype: galilenin
> Cell:513.382.3371


Re: Digg's data model

Posted by Lenin Gali <ga...@gmail.com>.
Hi,
I have several questions, and I hope some of you can share your experiences
with each or all of the following. I am especially curious about Twitter's
and Digg's experience, as they might be processing at this kind of scale.

1. Eventual consistency: Given a volume of 5K writes/sec, of which roughly
1,500/sec are updates and the rest are inserts, what kind of latency can be
expected before consistency is reached?

2. Performance: Are there any benchmarks on how many writes/sec and
reads/sec Cassandra supports on an "n node" cluster? A node can be of
variable size, and I would also like to know the hardware/software details
of the cluster.

3. EC2: Has anyone run Cassandra on EC2? What kind of transaction volume
are they using it for, and how has their experience with Cassandra on EC2
been?

4. Overhead and issues: What are the typical nightmare scenarios one could
face when using Cassandra for heavy write/read intensive systems?

5. Backups: If there is a 4 or 5 TB Cassandra cluster, what backup
scenarios do you recommend?

Also, does Cassandra support counters? Digg's article said they are going to
contribute their work to open source; any idea when that would be?

Thanks in advance for sharing your experience

Lenin



On Fri, Mar 19, 2010 at 1:03 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> Jeff Hodsdon edited the new link in:
> http://about.digg.com/blog/looking-future-cassandra
>
> On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall <na...@vervewireless.com>
> wrote:
> > Gary,
> > Did you see this article linked from the Cassandra wiki?
> > http://about.digg.com/node/564
> >
> > See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
> > examples like the above. In general, you structure your data according
> > to how it will be queried. This can lead to duplication, but that is
> > one of the trade-offs for performance and scale.
> >
> > Digg folks - the "Looking to the Future with Cassandra" linked on the
> > wiki is no longer available. I found that article quite helpful
> > originally. Is there a chance this could be re-posted?
> >
> > Cheers,
> > -Nate
> >
> > On Fri, Mar 19, 2010 at 12:16 PM, Gary <da...@gmail.com> wrote:
> >> I am new to Bigtable-like data models and have a question. Take Digg as
> >> an example: I want to find the list of users who dug a URL, and also the
> >> list of URLs a user dug. What should the data model look like for these
> >> queries to be efficient? If I use the username and the URL as two row
> >> keys, then when a user digs a URL I have to update two rows, so I need a
> >> transaction to keep the data consistent.
> >> Any thoughts?
> >> Thanks,
> >> Gary
> >
>



-- 
twitter: leningali
skype: galilenin
Cell:513.382.3371

Re: Digg's data model

Posted by Jonathan Ellis <jb...@gmail.com>.
Jeff Hodsdon edited the new link in:
http://about.digg.com/blog/looking-future-cassandra

On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall <na...@vervewireless.com> wrote:
> Gary,
> Did you see this article linked from the Cassandra wiki?
> http://about.digg.com/node/564
>
> See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
> examples like the above. In general, you structure your data according
> to how it will be queried. This can lead to duplication, but that is
> one of the trade-offs for performance and scale.
>
> Digg folks - the "Looking to the Future with Cassandra" linked on the
> wiki is no longer available. I found that article quite helpful
> originally. Is there a chance this could be re-posted?
>
> Cheers,
> -Nate
>
> On Fri, Mar 19, 2010 at 12:16 PM, Gary <da...@gmail.com> wrote:
>> I am new to Bigtable-like data models and have a question. Take Digg as an
>> example: I want to find the list of users who dug a URL, and also the list
>> of URLs a user dug. What should the data model look like for these queries
>> to be efficient? If I use the username and the URL as two row keys, then
>> when a user digs a URL I have to update two rows, so I need a transaction
>> to keep the data consistent.
>> Any thoughts?
>> Thanks,
>> Gary
>

Re: Digg's data model

Posted by Nathan McCall <na...@vervewireless.com>.
Gary,
Did you see this article linked from the Cassandra wiki?
http://about.digg.com/node/564

See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
examples like the above. In general, you structure your data according
to how it will be queried. This can lead to duplication, but that is
one of the trade-offs for performance and scale.

Digg folks - the "Looking to the Future with Cassandra" linked on the
wiki is no longer available. I found that article quite helpful
originally. Is there a chance this could be re-posted?

Cheers,
-Nate

On Fri, Mar 19, 2010 at 12:16 PM, Gary <da...@gmail.com> wrote:
> I am new to Bigtable-like data models and have a question. Take Digg as an
> example: I want to find the list of users who dug a URL, and also the list
> of URLs a user dug. What should the data model look like for these queries
> to be efficient? If I use the username and the URL as two row keys, then
> when a user digs a URL I have to update two rows, so I need a transaction
> to keep the data consistent.
> Any thoughts?
> Thanks,
> Gary

Re: Digg's data model

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-03-19 19:16, Gary wrote:
> I am new to Bigtable-like data models and have a question. Take Digg as an
> example: I want to find the list of users who dug a URL, and also the list
> of URLs a user dug. What should the data model look like for these queries
> to be efficient? If I use the username and the URL as two row keys, then
> when a user digs a URL I have to update two rows, so I need a transaction
> to keep the data consistent.

Yes, you need to insert two items into Cassandra. There are a few
solutions to the consistency issue:

(1) Live with it. For a site like Digg, immediate consistency just isn't
that important.
(2) Treat one direction as the canonical source. When reading from the
other direction, verify that each item exists in the other direction.
(3) Add locking above Cassandra, using a system like ZooKeeper.
(4) Don't use Cassandra.
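
To make (1) and (2) concrete, here is a sketch with two column families, one per query direction (I'm using the pycassa client here purely for illustration; the keyspace and column family names are made up):

    import time
    import pycassa

    # Two column families, one per query direction (names are illustrative):
    #   UserDiggs: row key = username, column names = URLs the user dug
    #   UrlDiggs:  row key = URL,      column names = users who dug it
    pool = pycassa.ConnectionPool("Digg", ["localhost:9160"])
    user_diggs = pycassa.ColumnFamily(pool, "UserDiggs")
    url_diggs = pycassa.ColumnFamily(pool, "UrlDiggs")

    def digg(user, url):
        ts = str(time.time())
        # Denormalized double write: no transaction, so the two inserts
        # can land at slightly different times (option 1: live with it).
        user_diggs.insert(user, {url: ts})
        url_diggs.insert(url, {user: ts})

    def users_who_dug(url):
        # Option 2: treat UserDiggs as canonical and verify on read.
        try:
            candidates = url_diggs.get(url)  # {user: ts, ...}
        except pycassa.NotFoundException:
            return []
        verified = []
        for user in candidates:
            try:
                user_diggs.get(user, columns=[url])  # confirm in canonical CF
                verified.append(user)
            except pycassa.NotFoundException:
                pass  # drop entries not confirmed in the canonical direction
        return verified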

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]