Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2011/08/24 04:58:06 UTC

Memory overhead of vector clocks…. how often are they pruned?

I had a thread going the other day about vector clock memory usage, which
noted that a vector clock is a series of (client id, clock) entries plus a
timestamp that makes it possible to prune old entries. I'm specifically
curious here how often old entries are pruned.

If you're storing small columns within Cassandra, say just an integer, the
vector clock overhead could easily take up far more space than the data that
is actually in your database.

However, if they are pruned, then this shouldn't really be a problem.

How much memory is this wasting?

Thoughts?


On Aug 19, Jonathan Ellis <jbellis@gmail.com> wrote to user:

The problem with naive last write wins is that writes don't always
arrive at each replica in the same order.  So no, that's a
non-starter.

Vector clocks are a series of (client id, clock) entries, and usually
a timestamp so you can prune old entries.  Obviously implementations
can vary, but to pick a specific example, Voldemort [1] uses 2 bytes
per client id, a variable number (at least one) of bytes for the
clock, and 8 bytes for the timestamp.

[1]
https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/versioning/VectorClock.java
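
For a rough sense of scale, the byte counts quoted above can be plugged into a small calculation. This is only a sketch: the function names are made up for illustration, it assumes one timestamp per clock rather than one per entry, and it ignores any length prefixes or in-memory object overhead that a real implementation would add.

```python
# Byte sizes as quoted above for Voldemort.
CLIENT_ID_BYTES = 2   # per (client id, clock) entry
TIMESTAMP_BYTES = 8   # assumed here: one timestamp per vector clock


def vector_clock_bytes(num_entries, clock_bytes=1):
    """Rough size of one vector clock (hypothetical helper, not Voldemort's API).

    clock_bytes is the variable-length counter size, at least one byte.
    """
    if num_entries < 0 or clock_bytes < 1:
        raise ValueError("need num_entries >= 0 and clock_bytes >= 1")
    return num_entries * (CLIENT_ID_BYTES + clock_bytes) + TIMESTAMP_BYTES


def overhead_ratio(value_bytes, num_entries, clock_bytes=1):
    """Vector clock size divided by the size of the stored value."""
    return vector_clock_bytes(num_entries, clock_bytes) / value_bytes


# A 4-byte integer written through 3 distinct clients, 1-byte counters.
clock_size = vector_clock_bytes(3)
ratio = overhead_ratio(4, 3)
print(clock_size, ratio)
```

Under these assumptions the clock is several times larger than a 4-byte value, which is the concern raised above; how much this matters in practice depends on how many distinct clients touch a value and how aggressively entries are pruned.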


-- 

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*

Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Kevin Burton <bu...@spinn3r.com>.
Yeah… I gathered that this was a memory for availability tradeoff… I was
just curious how much memory was involved.

It seems a shame to waste so much memory, and I can't shake the feeling
that a lot of this is unnecessary.

In some situations I could see Cassandra using up to 4x more memory than a
traditional sharded DB.

I was thinking that it might be possible to only keep timestamps for RECENT
data.

A machine can only write at a limited rate, so if we give clients only a
1-5 minute window in which to complete a write (after which they lose the
ability to write), then Cassandra would only need to keep timestamps for the
most recent 5 minutes of data.

There are a lot of details I need to work out here because I'd like to
understand the Cassandra internals more.  Obviously this will depend on a
lot of implementation details whether it's possible or not…

But I still can't shake the feeling that 4x additional memory is a non-starter.

On Wed, Aug 24, 2011 at 3:11 PM, Ryan King <ry...@twitter.com> wrote:

> We did have a Clock construct for a while, but it never made it into a
> released version (afaik). We thought about using them for counters.
>
> Timestamps are endemic to the data model and therefore can never be
> pruned. Cassandra basically trades memory for availability here.
>
> -ryan




Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Ryan King <ry...@twitter.com>.
We did have a Clock construct for a while, but it never made it into a
released version (afaik). We thought about using them for counters.

Timestamps are endemic to the data model and therefore can never be
pruned. Cassandra basically trades memory for availability here.

-ryan
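
To illustrate why a per-column timestamp is enough to get the same answer on every replica, here is a minimal sketch of timestamp-based reconciliation. It is not Cassandra's code: the names are hypothetical, and breaking timestamp ties by comparing values is an assumption made so that the example is deterministic.

```python
from functools import reduce
from itertools import permutations


def reconcile(a, b):
    """Pick the winner of two (timestamp, value) versions of a column.

    The higher timestamp wins; on a tie the larger value wins (assumed rule).
    Tuple comparison in Python does exactly this, so max() is sufficient.
    """
    return max(a, b)


# Three writes to the same column, each stamped by the writer.
writes = [(1001, b"x"), (1003, b"z"), (1002, b"y")]

# Apply the writes in every possible arrival order and collect the outcomes.
results = {reduce(reconcile, order) for order in permutations(writes)}
print(results)
```

Because the rule only looks at the timestamps carried with the data, the arrival order at a replica does not change the result. It also shows why the timestamp has to stay with the column: without it, a late-arriving older write could not be recognised as older.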

On Wed, Aug 24, 2011 at 10:54 AM, Jeremy Hanna
<je...@gmail.com> wrote:
> At the point that book was written (about a year ago it was finalized), vector clocks were planned.  In August or September of last year, they were removed.  0.7 was released in January.  The ticket for vector clocks is here and you can see the reasoning for not using them at the bottom.  https://issues.apache.org/jira/browse/CASSANDRA-580

Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Jeremy Hanna <je...@gmail.com>.
At the point that book was written (about a year ago it was finalized), vector clocks were planned.  In August or September of last year, they were removed.  0.7 was released in January.  The ticket for vector clocks is here and you can see the reasoning for not using them at the bottom.  https://issues.apache.org/jira/browse/CASSANDRA-580

On Aug 24, 2011, at 12:41 PM, Kevin Burton wrote:

> This is really interesting… I can track it down but there are a number of references to Cassandra HAVING vector clocks … which would make sense that I can't find out how much memory they are using :-P
> 
> "Cassandra: The Definitive Guide" … which I was reading the other night says that they were introduced in 0.7 but that they're still figuring out what to do with them:
> 
> http://books.google.com/books?id=MKGSbCbEdg0C&pg=PA50&lpg=PA50&dq=Cassandra's+clock+was+introduced+in+version+0.7,+but+its+fate+is+uncertain&source=bl&ots=XoQz3tFa1C&sig=Lhdu5j1xRcTPmP4-YQONhxzfRTU&hl=en&ei=MzdVTurWEJTSiAKU5vXoDA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBkQ6AEwAA#v=onepage&q&f=false
> 
> … so… are 'timestamps' pruned?  
> 
> Even this mechanism seems like it will dominate the amount of memory used in Cassandra.  I could see many installs requiring 2-3x more memory to run Cassandra unless there is a pruning mechanism or some way to minimize their use.
> 
> Kevin


Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Kevin Burton <bu...@spinn3r.com>.
This is really interesting… I can track them down, but there are a number of
references to Cassandra HAVING vector clocks… which would explain why I
can't find out how much memory they are using :-P

"Cassandra: The Definitive Guide" … which I was reading the other night says
that they were introduced in 0.7 but that they're still figuring out what to
do with them:

http://books.google.com/books?id=MKGSbCbEdg0C&pg=PA50&lpg=PA50&dq=Cassandra's+clock+was+introduced+in+version+0.7,+but+its+fate+is+uncertain&source=bl&ots=XoQz3tFa1C&sig=Lhdu5j1xRcTPmP4-YQONhxzfRTU&hl=en&ei=MzdVTurWEJTSiAKU5vXoDA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBkQ6AEwAA#v=onepage&q&f=false

… so… are 'timestamps' pruned?

Even this mechanism seems like it will dominate the amount of memory used in
Cassandra.  I could see many installs requiring 2-3x more memory to run
Cassandra unless there is a pruning mechanism or some way to minimize their
use.

Kevin


On Wed, Aug 24, 2011 at 9:05 AM, Ryan King <ry...@twitter.com> wrote:

> On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I had a thread going the other day about vector clock memory usage and
>> that it is a series of (clock id, clock):ts and the ability to prune old
>> entries … I'm specifically curious here how often old entries are pruned.
>>
>> If you're storing small columns within cassandra.  Say just an integer.
>>  The vector clock overhead could easily use up far more data than is
>> actually in your database.
>>
>> However, if they are pruned, then this shouldn't really be a problem.
>>
>> How much memory is this wasting?
>>
>
> I think there is some confusion here: Cassandra doesn't use vector clocks.
>
> -ryan



Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Ryan King <ry...@twitter.com>.
On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> I had a thread going the other day about vector clock memory usage and that
> it is a series of (clock id, clock):ts and the ability to prune old entries
> … I'm specifically curious here how often old entries are pruned.
>
> If you're storing small columns within cassandra.  Say just an integer.
>  The vector clock overhead could easily use up far more data than is
> actually in your database.
>
> However, if they are pruned, then this shouldn't really be a problem.
>
> How much memory is this wasting?
>

I think there is some confusion here: Cassandra doesn't use vector clocks.

-ryan



Re: Memory overhead of vector clocks…. how often are they pruned?

Posted by Radim Kolar <hs...@sendmail.cz>.
From my point of view, vector clocks are too much overhead. If you sync the
clocks in your cluster using NTP (which you should do anyway), you will get
clock precision better than 1/1000 s, which is good enough.

All my machines running NTP have an offset below 1/1000 s. They run FreeBSD;
Linux is not that precise in clock syncing.

      remote           local      st poll reach  delay   offset    disp
=======================================================================
*barricade.rack9 64.6.104.18      2   64  377 0.06187  0.000996 0.05093
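
A check like this could be scripted. The sketch below assumes the column layout shown in the header above (remote, local, st, poll, reach, delay, offset, disp) and that the offset is reported in seconds; the function name and the 1 ms threshold constant are made up for illustration.

```python
MAX_OFFSET_SECONDS = 0.001  # the 1/1000 s bound mentioned above


def peer_offset_seconds(line):
    """Extract the offset column from one peer line (hypothetical helper).

    Assumed columns: remote, local, st, poll, reach, delay, offset, disp.
    """
    fields = line.split()
    if len(fields) != 8:
        raise ValueError("expected 8 whitespace-separated columns")
    return float(fields[6])


line = "*barricade.rack9 64.6.104.18      2   64  377 0.06187  0.000996 0.05093"
offset = peer_offset_seconds(line)
within_bound = abs(offset) < MAX_OFFSET_SECONDS
print(offset, within_bound)
```

Note that this only says the local clock is close to its peer. Two writes issued on different machines less than one offset apart can still be stamped in the wrong order.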