Posted to user@cassandra.apache.org by David Boxenhorn <da...@lookin2.com> on 2011/01/16 15:53:25 UTC

Tombstone lifespan after multiple deletions

If I delete a row, and later on delete it again, before GCGraceSeconds has
elapsed, does the tombstone live longer?

In other words, if I have the following scenario:

GCGraceSeconds = 10 days
On day 1 I delete a row
On day 5 I delete the row again

Will the tombstone be removed on day 10 or day 15?

Re: Tombstone lifespan after multiple deletions

Posted by Zhu Han <sc...@gmail.com>.
I'm not clear here. Are you worried that the later-inserted tombstone prevents
the whole row from being reclaimed, so that the storage space cannot be freed?

To my knowledge, after a major compaction only the row key and the tombstone
are kept. Is it a big deal?

best regards,
hanzhu


On Tue, Jan 18, 2011 at 9:41 PM, David Boxenhorn <da...@lookin2.com> wrote:

> Thanks, Aaron, but I'm not 100% clear.
>
> My situation is this: My use case spins off rows (not columns) that I no
> longer need and want to delete. It is possible that these rows were never
> created in the first place, or were already deleted. This is a very large
> cleanup task that normally deletes a lot of rows, and the last thing that I
> want to do is create tombstones for rows that didn't exist in the first
> place, or lengthen the life on disk of tombstones of rows that are already
> deleted.
>
> So the question is: before I delete, do I have to retrieve the row to see
> if it exists in the first place?
>
>
>
> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
>> AFAIK that's not necessary, there is no need to worry about previous
>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>> or remove are going to throw an error.
>>
>> All the columns that were (roughly speaking) present at your first
>> deletion will be available for GC at the end of the first tombstones life.
>> Same for the second.
>>
>> Say you were to write a col between the two deletes with the same name as
>> one present at the start. The first version of the col is avail for GC after
>> tombstone 1, and the second after tombstone 2.
>>
>> Hope that helps
>> Aaron
>>
>> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>> Thanks. In other words, before I delete something, I should check to see
>> whether it exists as a live row in the first place.
>>
>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
>>
>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com> wrote:
>>> > If I delete a row, and later on delete it again, before GCGraceSeconds has
>>> > elapsed, does the tombstone live longer?
>>>
>>> Each delete is a new tombstone, which should answer your question.
>>>
>>> -ryan
>>>
>>> > In other words, if I have the following scenario:
>>> >
>>> > GCGraceSeconds = 10 days
>>> > On day 1 I delete a row
>>> > On day 5 I delete the row again
>>> >
>>> > Will the tombstone be removed on day 10 or day 15?
>>> >
>>>
>>
>>
>

Re: Tombstone lifespan after multiple deletions

Posted by Zhu Han <sc...@gmail.com>.
If the tombstone is older than the row or column inserted later, is the
tombstone skipped entirely after compaction?

best regards,
hanzhu


On Wed, Jan 19, 2011 at 11:16 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> If you mean that multiple tombstones for the same row or column should
> be merged into a single one at compaction time, then yes, that is what
> happens.
>
> On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
> <ge...@gmail.com> wrote:
> > Maybe it could be taken into account when the compaction is executed,
> > if I only have a consecutive list of uninterrupted tombstones it could
> > only care about the first. It sounds like the-way-it-should-be, maybe
> > as a part of the "row-reduce" process.
> >
> > Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.
> >
> > //GK
> > http://twitter.com/germanklf
> > http://code.google.com/p/seide/
> >
> > On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne <sy...@riptano.com> wrote:
> >> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >>> Thanks, Aaron, but I'm not 100% clear.
> >>>
> >>> My situation is this: My use case spins off rows (not columns) that I no
> >>> longer need and want to delete. It is possible that these rows were never
> >>> created in the first place, or were already deleted. This is a very large
> >>> cleanup task that normally deletes a lot of rows, and the last thing that I
> >>> want to do is create tombstones for rows that didn't exist in the first
> >>> place, or lengthen the life on disk of tombstones of rows that are already
> >>> deleted.
> >>>
> >>> So the question is: before I delete, do I have to retrieve the row to see if
> >>> it exists in the first place?
> >>
> >> Yes, in your situation you do.
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Re: Tombstone lifespan after multiple deletions

Posted by Zhu Han <sc...@gmail.com>.
On Wed, Jan 19, 2011 at 8:41 PM, Germán Kondolf <ge...@gmail.com> wrote:

> On Wed, Jan 19, 2011 at 12:59 AM, Zhu Han <sc...@gmail.com> wrote:
> >
> >
> > On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf <ge...@gmail.com>
> > wrote:
> >>
> >> Yes, that's what I meant, but correct me if I'm wrong, when a deletion
> >> comes after another deletion for the same row or column will the
> >> gc-before count against the last one, isn't it?
> >>
> > IIRC, after compaction. even if the row key is not wiped, all the CF are
> > replaced by the youngest tombstone.  I do not understand very clearly the
> > benefit of wiping out the whole row as early as possible.
> >
>

The only problem I see is that the bloom filter might fill up if too many
tombstones are inserted for rows that never existed.
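
To illustrate the concern, here is a minimal sketch of a toy Bloom filter in
Python. It assumes that a row key is added to the filter even when the row
holds nothing but a tombstone. The class, its sizes, and the hashing scheme
are hypothetical and much smaller than anything Cassandra uses; the point is
only that every blind delete of a never-written row sets bits.

```python
import hashlib


class ToyBloomFilter:
    """Minimal Bloom filter; sizes and hash scheme are illustrative only."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

    def fill_ratio(self):
        return sum(self.bits) / self.num_bits


def fill_after_blind_deletes(num_keys):
    """Add one tombstone-only row key per blind delete and report the fill."""
    bf = ToyBloomFilter()
    for n in range(num_keys):
        # Assumption: a tombstone for a never-written row still adds its key.
        bf.add(f"never-existed-{n}")
    return bf.fill_ratio()
```

The fuller the filter, the more often might_contain answers "maybe" for keys
that are not there, which is the cost being described.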

>
> I think it is not a "benefit", but a potencial issue, if you delete
> columns or rows without checking them before you could make them live
> as long as you keep issuing deletions, maybe it's a strange use-case,
> but certainly Cassandra provides new non-traditional ways of
> processing high-volume of information.
>
> As the original example depicted clearly:
> day 1 -> insert Row1.Col1
> day 2 -> delete Row1.Col1
> day 11 (before gc-grace-seconds) -> delete Row1.Col1
>
> In the last command I've extended the life of a tombstone, maybe the
> check before the deletion could have a performance impact in the
> process, so I think it might be handled server-side instead of
> client-side.
>
> //GK
> http://twitter.com/germanklf
> http://code.google.com/p/seide/
>

Re: Tombstone lifespan after multiple deletions

Posted by Germán Kondolf <ge...@gmail.com>.
On Wed, Jan 19, 2011 at 11:52 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 6:41 AM, Germán Kondolf
> <ge...@gmail.com> wrote:
>> As the original example depicted clearly:
>> day 1 -> insert Row1.Col1
>> day 2 -> delete Row1.Col1
>> day 11 (before gc-grace-seconds) -> delete Row1.Col1
>>
>> In the last command I've extended the life of a tombstone, maybe the
>> check before the deletion could have a performance impact in the
>> process, so I think it might be handled server-side instead of
>> client-side.
>
> It has performance implications no matter where you do it, which is
> why we're not going to do it on the server. :)
>
> "Writes [or deletes] don't cause reads" is a basic design decision.
> This is a much bigger win than the very narrow corner case of being
> able to remove a tombstone marker a little earlier.
>

I totally agree on that. I would never propose a read before a write
server-side; my bad, I didn't make that clear.

The idea is that in the reduce process during a compaction we could
change the logic to take the oldest expiration time instead of the
youngest. I should take a look at the code to see if it's feasible.

A workaround that needs only configuration is to reduce gc-grace-seconds
enough to avoid this undesired "tombstone-keep-alive".
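
The difference between the two policies can be sketched in a few lines of
Python. This is a toy model, not Cassandra code: it measures time in whole
days, assumes nothing was written between the deletes, and the function name
purge_day and its policy argument are made up for illustration.

```python
def purge_day(deletion_days, gc_grace_days, policy="youngest"):
    """Day on which a merged tombstone becomes eligible for removal.

    deletion_days: days on which deletes were issued for the same row/column.
    policy: "youngest" keeps the latest deletion time (the behaviour described
            in this thread); "oldest" is the alternative proposed above.
    """
    if not deletion_days:
        raise ValueError("at least one deletion is required")
    if policy == "youngest":
        base = max(deletion_days)
    elif policy == "oldest":
        base = min(deletion_days)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return base + gc_grace_days


# The example from this thread: deletes on day 2 and day 11, 10-day grace.
print(purge_day([2, 11], 10, "youngest"))  # 21
print(purge_day([2, 11], 10, "oldest"))    # 12
# The configuration workaround: a shorter grace period shrinks the window.
print(purge_day([2, 11], 3, "youngest"))   # 14
```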

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

Re: Tombstone lifespan after multiple deletions

Posted by Jonathan Ellis <jb...@gmail.com>.
On Wed, Jan 19, 2011 at 6:41 AM, Germán Kondolf
<ge...@gmail.com> wrote:
> As the original example depicted clearly:
> day 1 -> insert Row1.Col1
> day 2 -> delete Row1.Col1
> day 11 (before gc-grace-seconds) -> delete Row1.Col1
>
> In the last command I've extended the life of a tombstone, maybe the
> check before the deletion could have a performance impact in the
> process, so I think it might be handled server-side instead of
> client-side.

It has performance implications no matter where you do it, which is
why we're not going to do it on the server. :)

"Writes [or deletes] don't cause reads" is a basic design decision.
This is a much bigger win than the very narrow corner case of being
able to remove a tombstone marker a little earlier.
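
A rough way to picture the trade-off is an append-only toy store in Python.
Nothing here is Cassandra's API: ToyStore, blind_delete, checked_delete and
compare are hypothetical names, and the read counter only stands in for the
real cost of a read.

```python
class ToyStore:
    """Append-only toy store that counts reads; not Cassandra's API."""

    TOMBSTONE = object()

    def __init__(self):
        self.log = []    # (key, value) entries, newest last
        self.reads = 0

    def write(self, key, value):
        self.log.append((key, value))

    def read(self, key):
        self.reads += 1
        for k, v in reversed(self.log):
            if k == key:
                return None if v is self.TOMBSTONE else v
        return None

    def blind_delete(self, key):
        # A delete is just another append; nothing is read first.
        self.write(key, self.TOMBSTONE)

    def checked_delete(self, key):
        # Client-side variant: pay for a read, skip the tombstone if not live.
        if self.read(key) is not None:
            self.write(key, self.TOMBSTONE)

    def tombstone_count(self):
        return sum(1 for _, v in self.log if v is self.TOMBSTONE)


def compare(keys_to_clean, live_keys):
    """Return (reads, tombstones) for blind and checked cleanup runs."""
    blind, checked = ToyStore(), ToyStore()
    for store in (blind, checked):
        for key in live_keys:
            store.write(key, "data")
    for key in keys_to_clean:
        blind.blind_delete(key)
        checked.checked_delete(key)
    return {
        "blind": (blind.reads, blind.tombstone_count()),
        "checked": (checked.reads, checked.tombstone_count()),
    }


# Cleaning four keys when only "a" is live:
print(compare(["a", "b", "c", "d"], ["a"]))
# {'blind': (0, 4), 'checked': (4, 1)}
```

The blind run writes more tombstones but never reads; the checked run trades
one read per key for fewer tombstones.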

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Tombstone lifespan after multiple deletions

Posted by Germán Kondolf <ge...@gmail.com>.
On Wed, Jan 19, 2011 at 12:59 AM, Zhu Han <sc...@gmail.com> wrote:
>
>
> On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf <ge...@gmail.com>
> wrote:
>>
>> Yes, that's what I meant, but correct me if I'm wrong, when a deletion
>> comes after another deletion for the same row or column will the gc-before
>> count against the last one, isn't it?
>>
> IIRC, after compaction. even if the row key is not wiped, all the CF are
> replaced by the youngest tombstone.  I do not understand very clearly the
> benefit of wiping out the whole row as early as possible.
>

I think it is not a "benefit" but a potential issue: if you delete
columns or rows without checking them first, you could make their
tombstones live as long as you keep issuing deletions. Maybe it's a
strange use case, but Cassandra certainly provides new, non-traditional
ways of processing high volumes of information.

As the original example depicted clearly:
day 1 -> insert Row1.Col1
day 2 -> delete Row1.Col1
day 11 (before gc-grace-seconds) -> delete Row1.Col1

With the last command I've extended the life of a tombstone. Checking
before the deletion could have a performance impact on the
process, so I think it might be handled server-side instead of
client-side.

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/


Re: Tombstone lifespan after multiple deletions

Posted by Zhu Han <sc...@gmail.com>.
On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf <ge...@gmail.com> wrote:

> Yes, that's what I meant, but correct me if I'm wrong, when a deletion
> comes after another deletion for the same row or column will the gc-before
> count against the last one, isn't it?
>

IIRC, after compaction, even if the row key is not wiped, all the CFs are
replaced by the youngest tombstone. I do not clearly understand the
benefit of wiping out the whole row as early as possible.



Re: Tombstone lifespan after multiple deletions

Posted by Germán Kondolf <ge...@gmail.com>.
Yes, that's what I meant. But correct me if I'm wrong: when a deletion comes after another deletion for the same row or column, gc-before will count against the last one, won't it?

Maybe, knowing that all the subsequent versions of a deletion are deletions too, it could check the first timestamp against gc-grace-seconds when reducing & compacting.

// Germán Kondolf
http://twitter.com/germanklf
http://code.google.com/p/seide/
// @i4



Re: Tombstone lifespan after multiple deletions

Posted by Jonathan Ellis <jb...@gmail.com>.
If you mean that multiple tombstones for the same row or column should
be merged into a single one at compaction time, then yes, that is what
happens.
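
As a rough illustration of that merge, the sketch below compacts toy
"SSTables" in Python. It is a simplification under stated assumptions: each
table is a plain dict, a single day number stands in for all timestamps, and
the names Cell and compact are invented here rather than taken from the
Cassandra code base.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    timestamp: int            # day the write or delete was issued
    value: Optional[str]      # None marks a tombstone


def compact(sstables, now, gc_grace_days):
    """Merge toy SSTables (dicts of key -> Cell) into one.

    For each key the cell with the highest timestamp wins, so several
    tombstones for the same key collapse into the newest one. A surviving
    tombstone is dropped once gc_grace_days have passed since its timestamp.
    """
    merged = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell.timestamp > merged[key].timestamp:
                merged[key] = cell
    return {
        key: cell
        for key, cell in merged.items()
        if cell.value is not None or now < cell.timestamp + gc_grace_days
    }


# The scenario from the original question: deletes on day 1 and day 5.
tables = [{"row1": Cell(1, None)}, {"row1": Cell(5, None)}]
print(compact(tables, now=12, gc_grace_days=10))
# {'row1': Cell(timestamp=5, value=None)}
print(compact(tables, now=15, gc_grace_days=10))
# {}
```

In this model the single merged tombstone carries the day-5 timestamp, so it
is still present on day 12 and gone from day 15 onward.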




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Tombstone lifespan after multiple deletions

Posted by Germán Kondolf <ge...@gmail.com>.
Maybe it could be taken into account when the compaction is executed,
if I only have a consecutive list of uninterrupted tombstones it could
only care about the first. It sounds like the-way-it-should-be, maybe
as a part of the "row-reduce" process.

Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne <sy...@riptano.com> wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com> wrote:
>> Thanks, Aaron, but I'm not 100% clear.
>>
>> My situation is this: My use case spins off rows (not columns) that I no
>> longer need and want to delete. It is possible that these rows were never
>> created in the first place, or were already deleted. This is a very large
>> cleanup task that normally deletes a lot of rows, and the last thing that I
>> want to do is create tombstones for rows that didn't exist in the first
>> place, or lengthen the life on disk of tombstones of rows that are already
>> deleted.
>>
>> So the question is: before I delete, do I have to retrieve the row to see if
>> it exists in the first place?
>
> Yes, in your situation you do.
>
>>
>>
>>
>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com>
>> wrote:
>>>
>>> AFAIK that's not necessary, there is no need to worry about previous
>>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>>> or remove are going to throw an error.
>>> All the columns that were (roughly speaking) present at your first
>>> deletion will be available for GC at the end of the first tombstones life.
>>> Same for the second.
>>> Say you were to write a col between the two deletes with the same name as
>>> one present at the start. The first version of the col is avail for GC after
>>> tombstone 1, and the second after tombstone 2.
>>> Hope that helps
>>> Aaron
>>> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>
>>> Thanks. In other words, before I delete something, I should check to see
>>> whether it exists as a live row in the first place.
>>>
>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
>>>>
>>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com>
>>>> wrote:
>>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>>> > has
>>>> > elapsed, does the tombstone live longer?
>>>>
>>>> Each delete is a new tombstone, which should answer your question.
>>>>
>>>> -ryan
>>>>
>>>> > In other words, if I have the following scenario:
>>>> >
>>>> > GCGraceSeconds = 10 days
>>>> > On day 1 I delete a row
>>>> > On day 5 I delete the row again
>>>> >
>>>> > Will the tombstone be removed on day 10 or day 15?
>>>> >
>>>
>>
>>
>

Re: Tombstone lifespan after multiple deletions

Posted by Aaron Morton <aa...@thelastpickle.com>.
Sylvain,

Just to check my knowledge. Is this only the case if the delete is sent without a super column or predicate? What about a delete for a specific column that did not exist?

Thanks
Aaron 

On 19/01/2011, at 2:58 AM, David Boxenhorn <da...@lookin2.com> wrote:

> Thanks. 
> 
> On Tue, Jan 18, 2011 at 3:55 PM, Sylvain Lebresne <sy...@riptano.com> wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com> wrote:
> > Thanks, Aaron, but I'm not 100% clear.
> >
> > My situation is this: My use case spins off rows (not columns) that I no
> > longer need and want to delete. It is possible that these rows were never
> > created in the first place, or were already deleted. This is a very large
> > cleanup task that normally deletes a lot of rows, and the last thing that I
> > want to do is create tombstones for rows that didn't exist in the first
> > place, or lengthen the life on disk of tombstones of rows that are already
> > deleted.
> >
> > So the question is: before I delete, do I have to retrieve the row to see if
> > it exists in the first place?
> 
> Yes, in your situation you do.
> 
> >
> >
> >
> > On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com>
> > wrote:
> >>
> >> AFAIK that's not necessary, there is no need to worry about previous
> >> deletes. You can delete stuff that does not even exist, neither batch_mutate
> >> or remove are going to throw an error.
> >> All the columns that were (roughly speaking) present at your first
> >> deletion will be available for GC at the end of the first tombstones life.
> >> Same for the second.
> >> Say you were to write a col between the two deletes with the same name as
> >> one present at the start. The first version of the col is avail for GC after
> >> tombstone 1, and the second after tombstone 2.
> >> Hope that helps
> >> Aaron
> >> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >>
> >> Thanks. In other words, before I delete something, I should check to see
> >> whether it exists as a live row in the first place.
> >>
> >> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
> >>>
> >>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com>
> >>> wrote:
> >>> > If I delete a row, and later on delete it again, before GCGraceSeconds
> >>> > has
> >>> > elapsed, does the tombstone live longer?
> >>>
> >>> Each delete is a new tombstone, which should answer your question.
> >>>
> >>> -ryan
> >>>
> >>> > In other words, if I have the following scenario:
> >>> >
> >>> > GCGraceSeconds = 10 days
> >>> > On day 1 I delete a row
> >>> > On day 5 I delete the row again
> >>> >
> >>> > Will the tombstone be removed on day 10 or day 15?
> >>> >
> >>
> >
> >
> 

Re: Tombstone lifespan after multiple deletions

Posted by David Boxenhorn <da...@lookin2.com>.
Thanks.

On Tue, Jan 18, 2011 at 3:55 PM, Sylvain Lebresne <sy...@riptano.com> wrote:

> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com> wrote:
> > Thanks, Aaron, but I'm not 100% clear.
> >
> > My situation is this: My use case spins off rows (not columns) that I no
> > longer need and want to delete. It is possible that these rows were never
> > created in the first place, or were already deleted. This is a very large
> > cleanup task that normally deletes a lot of rows, and the last thing that I
> > want to do is create tombstones for rows that didn't exist in the first
> > place, or lengthen the life on disk of tombstones of rows that are already
> > deleted.
> >
> > So the question is: before I delete, do I have to retrieve the row to see if
> > it exists in the first place?
>
> Yes, in your situation you do.
>
> >
> >
> >
> > On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com>
> > wrote:
> >>
> >> AFAIK that's not necessary, there is no need to worry about previous
> >> deletes. You can delete stuff that does not even exist, neither batch_mutate
> >> or remove are going to throw an error.
> >> All the columns that were (roughly speaking) present at your first
> >> deletion will be available for GC at the end of the first tombstones life.
> >> Same for the second.
> >> Say you were to write a col between the two deletes with the same name as
> >> one present at the start. The first version of the col is avail for GC after
> >> tombstone 1, and the second after tombstone 2.
> >> Hope that helps
> >> Aaron
> >> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >>
> >> Thanks. In other words, before I delete something, I should check to see
> >> whether it exists as a live row in the first place.
> >>
> >> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
> >>>
> >>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com>
> >>> wrote:
> >>> > If I delete a row, and later on delete it again, before GCGraceSeconds
> >>> > has
> >>> > elapsed, does the tombstone live longer?
> >>>
> >>> Each delete is a new tombstone, which should answer your question.
> >>>
> >>> -ryan
> >>>
> >>> > In other words, if I have the following scenario:
> >>> >
> >>> > GCGraceSeconds = 10 days
> >>> > On day 1 I delete a row
> >>> > On day 5 I delete the row again
> >>> >
> >>> > Will the tombstone be removed on day 10 or day 15?
> >>> >
> >>
> >
> >
>

Re: Tombstone lifespan after multiple deletions

Posted by Sylvain Lebresne <sy...@riptano.com>.
On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com> wrote:
> Thanks, Aaron, but I'm not 100% clear.
>
> My situation is this: My use case spins off rows (not columns) that I no
> longer need and want to delete. It is possible that these rows were never
> created in the first place, or were already deleted. This is a very large
> cleanup task that normally deletes a lot of rows, and the last thing that I
> want to do is create tombstones for rows that didn't exist in the first
> place, or lengthen the life on disk of tombstones of rows that are already
> deleted.
>
> So the question is: before I delete, do I have to retrieve the row to see if
> it exists in the first place?

Yes, in your situation you do.
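
The read-before-delete pattern being confirmed here can be sketched in Python. This is only an illustration against a toy in-memory store: `InMemoryStore` and `delete_if_live` are hypothetical names invented for the example, not part of any Cassandra client API.

```python
class InMemoryStore:
    """Toy stand-in for a column family (hypothetical, for illustration only)."""

    def __init__(self):
        self.live_rows = {}    # row key -> columns
        self.tombstones = {}   # row key -> day of the most recent delete

    def delete(self, key, day):
        # A blind delete always records a tombstone, even if no row was there.
        self.live_rows.pop(key, None)
        self.tombstones[key] = day


def delete_if_live(store, key, day):
    """Read first, and only issue the delete when the row is currently live."""
    if key not in store.live_rows:
        return False           # never existed or already deleted: write nothing
    store.delete(key, day)
    return True


store = InMemoryStore()
store.live_rows["a"] = {"col": 1}
delete_if_live(store, "a", 1)      # live row: tombstone written on day 1
delete_if_live(store, "a", 5)      # already deleted: tombstone stays at day 1
delete_if_live(store, "never", 5)  # never existed: no tombstone created
```

The trade-off is an extra read per key, and in a real cluster the check and the delete are not atomic, so a concurrent write could still slip in between them.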

>
>
>
> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com>
> wrote:
>>
>> AFAIK that's not necessary, there is no need to worry about previous
>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>> or remove are going to throw an error.
>> All the columns that were (roughly speaking) present at your first
>> deletion will be available for GC at the end of the first tombstones life.
>> Same for the second.
>> Say you were to write a col between the two deletes with the same name as
>> one present at the start. The first version of the col is avail for GC after
>> tombstone 1, and the second after tombstone 2.
>> Hope that helps
>> Aaron
>> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>> Thanks. In other words, before I delete something, I should check to see
>> whether it exists as a live row in the first place.
>>
>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
>>>
>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com>
>>> wrote:
>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>> > has
>>> > elapsed, does the tombstone live longer?
>>>
>>> Each delete is a new tombstone, which should answer your question.
>>>
>>> -ryan
>>>
>>> > In other words, if I have the following scenario:
>>> >
>>> > GCGraceSeconds = 10 days
>>> > On day 1 I delete a row
>>> > On day 5 I delete the row again
>>> >
>>> > Will the tombstone be removed on day 10 or day 15?
>>> >
>>
>
>

Re: Tombstone lifespan after multiple deletions

Posted by David Boxenhorn <da...@lookin2.com>.
Thanks, Aaron, but I'm not 100% clear.

My situation is this: My use case spins off rows (not columns) that I no
longer need and want to delete. It is possible that these rows were never
created in the first place, or were already deleted. This is a very large
cleanup task that normally deletes a lot of rows, and the last thing that I
want to do is create tombstones for rows that didn't exist in the first
place, or lengthen the life on disk of tombstones of rows that are already
deleted.

So the question is: before I delete, do I have to retrieve the row to see if
it exists in the first place?



On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com> wrote:

> AFAIK that's not necessary, there is no need to worry about previous
> deletes. You can delete stuff that does not even exist, neither batch_mutate
> or remove are going to throw an error.
>
> All the columns that were (roughly speaking) present at your first deletion
> will be available for GC at the end of the first tombstones life. Same for
> the second.
>
> Say you were to write a col between the two deletes with the same name as
> one present at the start. The first version of the col is avail for GC after
> tombstone 1, and the second after tombstone 2.
>
> Hope that helps
> Aaron
>
> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:
>
> Thanks. In other words, before I delete something, I should check to see
> whether it exists as a live row in the first place.
>
> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
>
>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com> wrote:
>> > If I delete a row, and later on delete it again, before GCGraceSeconds has
>> > elapsed, does the tombstone live longer?
>>
>> Each delete is a new tombstone, which should answer your question.
>>
>> -ryan
>>
>> > In other words, if I have the following scenario:
>> >
>> > GCGraceSeconds = 10 days
>> > On day 1 I delete a row
>> > On day 5 I delete the row again
>> >
>> > Will the tombstone be removed on day 10 or day 15?
>> >
>>
>
>

Re: Tombstone lifespan after multiple deletions

Posted by Aaron Morton <aa...@thelastpickle.com>.
AFAIK that's not necessary; there is no need to worry about previous deletes. You can delete stuff that does not even exist; neither batch_mutate nor remove is going to throw an error.

All the columns that were (roughly speaking) present at your first deletion will be available for GC at the end of the first tombstone's life. Same for the second.

Say you were to write a col between the two deletes with the same name as one present at the start. The first version of the col is avail for GC after tombstone 1, and the second after tombstone 2. 
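
The pairing described above (each version of a column is covered by the first tombstone written at or after it) can be sketched as follows. This is a simplified model using plain integer timestamps; `shadowing_tombstone` is a hypothetical helper for illustration, not Cassandra code.

```python
from bisect import bisect_left


def shadowing_tombstone(write_ts, tombstone_timestamps):
    """Return the earliest tombstone timestamp that is >= write_ts, or None.

    Assumption of this sketch: a delete covers every write whose timestamp
    is less than or equal to the delete's own timestamp.
    """
    ordered = sorted(tombstone_timestamps)
    i = bisect_left(ordered, write_ts)   # first index with ordered[i] >= write_ts
    return ordered[i] if i < len(ordered) else None


tombstones = [100, 500]                     # two row deletes
print(shadowing_tombstone(50, tombstones))  # written before the first delete
print(shadowing_tombstone(300, tombstones)) # written between the two deletes
print(shadowing_tombstone(600, tombstones)) # written after both: still live
```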

Hope that helps
Aaron
On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com> wrote:

> Thanks. In other words, before I delete something, I should check to see whether it exists as a live row in the first place. 
> 
> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:
> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com> wrote:
> > If I delete a row, and later on delete it again, before GCGraceSeconds has
> > elapsed, does the tombstone live longer?
> 
> Each delete is a new tombstone, which should answer your question.
> 
> -ryan
> 
> > In other words, if I have the following scenario:
> >
> > GCGraceSeconds = 10 days
> > On day 1 I delete a row
> > On day 5 I delete the row again
> >
> > Will the tombstone be removed on day 10 or day 15?
> >
> 

Re: Tombstone lifespan after multiple deletions

Posted by David Boxenhorn <da...@lookin2.com>.
Thanks. In other words, before I delete something, I should check to see
whether it exists as a live row in the first place.

On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ry...@twitter.com> wrote:

> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com>
> wrote:
> > If I delete a row, and later on delete it again, before GCGraceSeconds has
> > elapsed, does the tombstone live longer?
>
> Each delete is a new tombstone, which should answer your question.
>
> -ryan
>
> > In other words, if I have the following scenario:
> >
> > GCGraceSeconds = 10 days
> > On day 1 I delete a row
> > On day 5 I delete the row again
> >
> > Will the tombstone be removed on day 10 or day 15?
> >
>

Re: Tombstone lifespan after multiple deletions

Posted by Ryan King <ry...@twitter.com>.
On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <da...@lookin2.com> wrote:
> If I delete a row, and later on delete it again, before GCGraceSeconds has
> elapsed, does the tombstone live longer?

Each delete is a new tombstone, which should answer your question.

-ryan

> In other words, if I have the following scenario:
>
> GCGraceSeconds = 10 days
> On day 1 I delete a row
> On day 5 I delete the row again
>
> Will the tombstone be removed on day 10 or day 15?
>
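
To make the arithmetic in the question concrete, here is a minimal Python sketch. It assumes a simplified model, not Cassandra's actual compaction code: repeated deletes of the same row collapse to the most recent one, and that tombstone must age for the full grace period before a compaction may drop it. The function name `tombstone_purge_day` is made up for the example.

```python
def tombstone_purge_day(delete_days, gc_grace_days):
    """Earliest day on which a compaction could drop the row's tombstone.

    Simplified model (an assumption of this sketch): only the latest delete
    matters, and it must be at least gc_grace_days old.
    """
    if not delete_days:
        raise ValueError("row was never deleted")
    return max(delete_days) + gc_grace_days


print(tombstone_purge_day([1], 10))     # single delete on day 1
print(tombstone_purge_day([1, 5], 10))  # deleted again on day 5
```

Under this model the second delete moves the earliest purge date from 10 days after day 1 to 10 days after day 5, which is the "day 15" case in the question. The tombstone is only physically removed when a compaction actually processes it after that point.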