Posted to user@cassandra.apache.org by AJ <aj...@dude.podzone.net> on 2011/06/13 14:59:02 UTC

Docs: "Why do deleted keys show up during range scans?"

http://wiki.apache.org/cassandra/FAQ#range_ghosts

"So to special case leaving out result entries for deletions, we would 
have to check the entire rest of the row to make sure there is no 
undeleted data anywhere else either (in which case leaving the key out 
would be an error)."

The above doesn't read well and I don't get it.  Can anyone rephrase it 
or elaborate?

Thanks!

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by AJ <aj...@dude.podzone.net>.

Thanks, but right now I'm thinking, RTFC ;o)

On 6/14/2011 4:37 PM, aaron morton wrote:
>> While you can delete a row, if I understand correctly, what happens is a
>> tombstone is created which matches every column, so in effect it is
>> deleting the columns, not the whole row.
> A tombstone is created at the level of the delete, rather than for every column. Otherwise imagine deleting a row with 1 million columns.
>
> Tombstones are created at the Column, Super Column and Row level. Deleting at the row level writes a row level tombstone. All these different tombstones are resolved during the read process.
>
> My understanding of "So to special case leaving out result entries for deletions, we would have to check the entire rest of the row to make sure there is no undeleted data anywhere else either (in which case leaving the key out would be an error)." is...
>
> Resolving the predicate to determine if a row contains the specified columns is a (somewhat) bounded operation. Determining if a row has ANY non-deleted columns is a potentially unbounded operation that could involve lots-o-io.  Imagine a row with 1 million columns, and the first 100,000 have been deleted.
>
> For each row in the result set you can say either :
>
> 1) It has 1 or more of the columns I requested.
> 2) It has none of the columns I requested.
> 3) It has no columns, but Cassandra decided it was too much work to conclusively prove that. After all, I asked if it had some specific columns, not if it had any columns.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15 Jun 2011, at 04:25, Jeremiah Jordan wrote:
>
>> Also, tombstones are not "attached" anywhere.  A tombstone is just a
>> column with a special value which says "I was deleted".  And I am pretty
>> sure they go into SSTables etc the exact same way regular columns do.
>>
>> -----Original Message-----
>> From: Jeremiah Jordan [mailto:JEREMIAH.JORDAN@morningstar.com]
>> Sent: Tuesday, June 14, 2011 11:22 AM
>> To: user@cassandra.apache.org
>> Subject: RE: Docs: "Why do deleted keys show up during range scans?"
>>
>> I am pretty sure how Cassandra works will make sense to you if you think
>> of it that way, that rows do not get deleted, columns get deleted.
>> While you can delete a row, if I understand correctly, what happens is a
>> tombstone is created which matches every column, so in effect it is
>> deleting the columns, not the whole row.  A row key will not be
>> forgotten/deleted until there are no columns or tombstones which
>> reference it.  Until there are no references to that row key in any
>> SSTables you can still get that key back from the API.
>>
>> -Jeremiah
>>
>> -----Original Message-----
>> From: AJ [mailto:aj@dude.podzone.net]
>> Sent: Monday, June 13, 2011 12:11 PM
>> To: user@cassandra.apache.org
>> Subject: Re: Docs: "Why do deleted keys show up during range scans?"
>>
>> On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>>> store the query inverted.
>>>
>>> that way empty ->   deleted
>>>
>> I don't know what that means... get the other columns?  Can you
>> elaborate?  Is there docs for this or is this a hack/workaround?
>>
>>> the tombstones are stored for each column that had data IIRC... but at
>>> this point my grok of C* is lacking
>> I suspected this, but wasn't sure.  It sounds like when a row is
>> deleted, a tombstone is not "attached" to the row, but to each column???
>> So, if all columns are deleted then the row is considered deleted?
>> Hmmm, that doesn't sound right, but that doesn't mean it isn't ! ;o)
>

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by aaron morton <aa...@thelastpickle.com>.

> While you can delete a row, if I understand correctly, what happens is a
> tombstone is created which matches every column, so in effect it is
> deleting the columns, not the whole row. 

A tombstone is created at the level of the delete, rather than for every column. Otherwise imagine deleting a row with 1 million columns.

Tombstones are created at the Column, Super Column and Row level. Deleting at the row level writes a row level tombstone. All these different tombstones are resolved during the read process. 

My understanding of "So to special case leaving out result entries for deletions, we would have to check the entire rest of the row to make sure there is no undeleted data anywhere else either (in which case leaving the key out would be an error)." is...

Resolving the predicate to determine if a row contains the specified columns is a (somewhat) bounded operation. Determining if a row has ANY non-deleted columns is a potentially unbounded operation that could involve lots-o-io.  Imagine a row with 1 million columns, and the first 100,000 have been deleted.

For each row in the result set you can say either :

1) It has 1 or more of the columns I requested.
2) It has none of the columns I requested. 
3) It has no columns, but Cassandra decided it was too much work to conclusively prove that. After all, I asked if it had some specific columns, not if it had any columns.
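
A toy model of the point above, in plain Python. This is only a sketch and not how Cassandra actually stores or reads data: the Cell and Row classes, the timestamps, and the function names are all made up for illustration. It shows why answering "does this row have the columns I asked for?" touches only the named columns, while answering "does this row have ANY live column?" may have to walk past every deleted column first.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Cell:
    value: Optional[bytes]  # None stands in for a column-level tombstone
    timestamp: int


@dataclass
class Row:
    cells: Dict[str, Cell] = field(default_factory=dict)
    row_tombstone: Optional[int] = None  # timestamp of a row-level delete, if any


def is_live(row: Row, name: str) -> bool:
    """A cell is live if it exists, is not a tombstone, and is newer than any row-level delete."""
    cell = row.cells.get(name)
    if cell is None or cell.value is None:
        return False
    return row.row_tombstone is None or cell.timestamp > row.row_tombstone


def slice_by_names(row: Row, names: Iterable[str]) -> Dict[str, bytes]:
    """Bounded work: only the requested column names are examined."""
    return {n: row.cells[n].value for n in names if is_live(row, n)}


def has_any_live_column(row: Row) -> Tuple[bool, int]:
    """Potentially unbounded work: returns (answer, number of cells examined)."""
    examined = 0
    for name in row.cells:
        examined += 1
        if is_live(row, name):
            return True, examined
    return False, examined


# Ten columns, the first three deleted individually.
partly_deleted = Row(cells={f"c{i}": Cell(None if i < 3 else b"v", timestamp=1)
                            for i in range(10)})
# Ten columns written at timestamp 1, then the whole row deleted at timestamp 5.
row_deleted = Row(cells={f"c{i}": Cell(b"v", timestamp=1) for i in range(10)},
                  row_tombstone=5)

print(slice_by_names(partly_deleted, ["c0", "c1"]))  # {} although the row still has live data
print(has_any_live_column(partly_deleted))           # (True, 4)
print(has_any_live_column(row_deleted))              # (False, 10)
```

Both rows return an empty slice for the requested names, and only the full scan can tell them apart. With 1 million columns instead of ten, that scan is the work the FAQ says a range scan avoids.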

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 15 Jun 2011, at 04:25, Jeremiah Jordan wrote:

> Also, tombstones are not "attached" anywhere.  A tombstone is just a
> column with a special value which says "I was deleted".  And I am pretty
> sure they go into SSTables etc the exact same way regular columns do.
> 
> -----Original Message-----
> From: Jeremiah Jordan [mailto:JEREMIAH.JORDAN@morningstar.com] 
> Sent: Tuesday, June 14, 2011 11:22 AM
> To: user@cassandra.apache.org
> Subject: RE: Docs: "Why do deleted keys show up during range scans?"
> 
> I am pretty sure how Cassandra works will make sense to you if you think
> of it that way, that rows do not get deleted, columns get deleted.
> While you can delete a row, if I understand correctly, what happens is a
> tombstone is created which matches every column, so in effect it is
> deleting the columns, not the whole row.  A row key will not be
> forgotten/deleted until there are no columns or tombstones which
> reference it.  Until there are no references to that row key in any
> SSTables you can still get that key back from the API.
> 
> -Jeremiah
> 
> -----Original Message-----
> From: AJ [mailto:aj@dude.podzone.net]
> Sent: Monday, June 13, 2011 12:11 PM
> To: user@cassandra.apache.org
> Subject: Re: Docs: "Why do deleted keys show up during range scans?"
> 
> On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>> 
>> store the query inverted.
>> 
>> that way empty ->  deleted
>> 
> I don't know what that means... get the other columns?  Can you
> elaborate?  Is there docs for this or is this a hack/workaround?
> 
>> the tombstones are stored for each column that had data IIRC... but at
>> this point my grok of C* is lacking
> I suspected this, but wasn't sure.  It sounds like when a row is
> deleted, a tombstone is not "attached" to the row, but to each column???
> So, if all columns are deleted then the row is considered deleted?
> Hmmm, that doesn't sound right, but that doesn't mean it isn't ! ;o)

RE: Docs: "Why do deleted keys show up during range scans?"

Posted by Jeremiah Jordan <JE...@morningstar.com>.

Also, tombstones are not "attached" anywhere.  A tombstone is just a
column with a special value which says "I was deleted".  And I am pretty
sure they go into SSTables etc the exact same way regular columns do.

-----Original Message-----
From: Jeremiah Jordan [mailto:JEREMIAH.JORDAN@morningstar.com] 
Sent: Tuesday, June 14, 2011 11:22 AM
To: user@cassandra.apache.org
Subject: RE: Docs: "Why do deleted keys show up during range scans?"

I am pretty sure how Cassandra works will make sense to you if you think
of it that way, that rows do not get deleted, columns get deleted.
While you can delete a row, if I understand correctly, what happens is a
tombstone is created which matches every column, so in effect it is
deleting the columns, not the whole row.  A row key will not be
forgotten/deleted until there are no columns or tombstones which
reference it.  Until there are no references to that row key in any
SSTables you can still get that key back from the API.

-Jeremiah

-----Original Message-----
From: AJ [mailto:aj@dude.podzone.net]
Sent: Monday, June 13, 2011 12:11 PM
To: user@cassandra.apache.org
Subject: Re: Docs: "Why do deleted keys show up during range scans?"

On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>
> store the query inverted.
>
> that way empty ->  deleted
>
I don't know what that means... get the other columns?  Can you
elaborate?  Is there docs for this or is this a hack/workaround?

> the tombstones are stored for each column that had data IIRC... but at
> this point my grok of C* is lacking
I suspected this, but wasn't sure.  It sounds like when a row is
deleted, a tombstone is not "attached" to the row, but to each column???
So, if all columns are deleted then the row is considered deleted?
Hmmm, that doesn't sound right, but that doesn't mean it isn't ! ;o)

RE: Docs: "Why do deleted keys show up during range scans?"

Posted by Jeremiah Jordan <JE...@morningstar.com>.

I am pretty sure how Cassandra works will make sense to you if you think
of it that way, that rows do not get deleted, columns get deleted.
While you can delete a row, if I understand correctly, what happens is a
tombstone is created which matches every column, so in effect it is
deleting the columns, not the whole row.  A row key will not be
forgotten/deleted until there are no columns or tombstones which
reference it.  Until there are no references to that row key in any
SSTables you can still get that key back from the API.

-Jeremiah

-----Original Message-----
From: AJ [mailto:aj@dude.podzone.net] 
Sent: Monday, June 13, 2011 12:11 PM
To: user@cassandra.apache.org
Subject: Re: Docs: "Why do deleted keys show up during range scans?"

On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>
> store the query inverted.
>
> that way empty ->  deleted
>
I don't know what that means... get the other columns?  Can you
elaborate?  Is there docs for this or is this a hack/workaround?

> the tombstones are stored for each column that had data IIRC... but at
> this point my grok of C* is lacking
I suspected this, but wasn't sure.  It sounds like when a row is
deleted, a tombstone is not "attached" to the row, but to each column???
So, if all columns are deleted then the row is considered deleted?
Hmmm, that doesn't sound right, but that doesn't mean it isn't ! ;o)

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by AJ <aj...@dude.podzone.net>.

On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>
> store the query inverted.
>
> that way empty ->  deleted
>
I don't know what that means... get the other columns?  Can you 
elaborate?  Is there docs for this or is this a hack/workaround?

> the tombstones are stored for each column that had data IIRC... but at
> this point my grok of C* is lacking
I suspected this, but wasn't sure.  It sounds like when a row is 
deleted, a tombstone is not "attached" to the row, but to each 
column???  So, if all columns are deleted then the row is considered 
deleted?  Hmmm, that doesn't sound right, but that doesn't mean it isn't 
! ;o)

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by Stephen Connolly <st...@gmail.com>.

On 13 June 2011 17:09, AJ <aj...@dude.podzone.net> wrote:
> On 6/13/2011 9:25 AM, Stephen Connolly wrote:
>>
>> On 13 June 2011 16:14, AJ<aj...@dude.podzone.net>  wrote:
>>>
>>> On 6/13/2011 7:03 AM, Stephen Connolly wrote:
>>>>
>>>> It returns the set of columns for the set of rows... how do you
>>>> determine the difference between a completely empty row and a row that
>>>> just does not have any of the matching columns?
>>>
>>> I would expect it to not return anything (no row at all) for both of
>>> those
>>> cases.  Are you saying that an empty row is returned for rows that do not
>>> match the predicate?  So, if I perform a range slice where the range is
>>> every row of the CF and the slice equates to no matches and I have 1
>>> million
>>> rows in the CF, then I will get a result set of 1 million empty rows?
>>>
>> No I am saying that for each row that matches, you will get a result,
>> even if the columns that you request happen to be empty for that
>> specific row.
>>
>
> Ok, this I understand I guess.  If I query a range of rows and want only a
> certain column and a row does not have that column, I would like to know
> that.

deleted rows don't have the column either which is the point.

>
>> Likewise, any deleted rows in the same row range will show as empty
>> because C* would have a ton of work to figure out the difference
>> between being deleted and being empty.
>>
>
> But, if a row does indeed have the column, but that row was deleted, why
> would I get an empty row?  You say because of a ton of work.  So, the
> tombstone for the row is not stored "close-by" for quick access... or
> something like that?  At any rate, how do I figure out if the empty row is
> empty because it was deleted?  Sorry if I'm being dense.
>

store the query inverted.

that way empty -> deleted

the tombstones are stored for each column that had data IIRC... but at
this point my grok of C* is lacking

>
>

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by AJ <aj...@dude.podzone.net>.

On 6/13/2011 9:25 AM, Stephen Connolly wrote:
> On 13 June 2011 16:14, AJ<aj...@dude.podzone.net>  wrote:
>> On 6/13/2011 7:03 AM, Stephen Connolly wrote:
>>> It returns the set of columns for the set of rows... how do you
>>> determine the difference between a completely empty row and a row that
>>> just does not have any of the matching columns?
>> I would expect it to not return anything (no row at all) for both of those
>> cases.  Are you saying that an empty row is returned for rows that do not
>> match the predicate?  So, if I perform a range slice where the range is
>> every row of the CF and the slice equates to no matches and I have 1 million
>> rows in the CF, then I will get a result set of 1 million empty rows?
>>
> No I am saying that for each row that matches, you will get a result,
> even if the columns that you request happen to be empty for that
> specific row.
>

Ok, this I understand I guess.  If I query a range of rows and want only 
a certain column and a row does not have that column, I would like to 
know that.

> Likewise, any deleted rows in the same row range will show as empty
> because C* would have a ton of work to figure out the difference
> between being deleted and being empty.
>

But, if a row does indeed have the column, but that row was deleted, why 
would I get an empty row?  You say because of a ton of work.  So, the 
tombstone for the row is not stored "close-by" for quick access... or 
something like that?  At any rate, how do I figure out if the empty row 
is empty because it was deleted?  Sorry if I'm being dense.

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by Stephen Connolly <st...@gmail.com>.

On 13 June 2011 16:14, AJ <aj...@dude.podzone.net> wrote:
> On 6/13/2011 7:03 AM, Stephen Connolly wrote:
>>
>> It returns the set of columns for the set of rows... how do you
>> determine the difference between a completely empty row and a row that
>> just does not have any of the matching columns?
>
> I would expect it to not return anything (no row at all) for both of those
> cases.  Are you saying that an empty row is returned for rows that do not
> match the predicate?  So, if I perform a range slice where the range is
> every row of the CF and the slice equates to no matches and I have 1 million
> rows in the CF, then I will get a result set of 1 million empty rows?
>
No I am saying that for each row that matches, you will get a result,
even if the columns that you request happen to be empty for that
specific row.

Likewise, any deleted rows in the same row range will show as empty
because C* would have a ton of work to figure out the difference
between being deleted and being empty.

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by AJ <aj...@dude.podzone.net>.

On 6/13/2011 7:03 AM, Stephen Connolly wrote:
> It returns the set of columns for the set of rows... how do you
> determine the difference between a completely empty row and a row that
> just does not have any of the matching columns?

I would expect it to not return anything (no row at all) for both of 
those cases.  Are you saying that an empty row is returned for rows that 
do not match the predicate?  So, if I perform a range slice where the 
range is every row of the CF and the slice equates to no matches and I 
have 1 million rows in the CF, then I will get a result set of 1 million 
empty rows?

Re: Docs: "Why do deleted keys show up during range scans?"

Posted by Stephen Connolly <st...@gmail.com>.

It returns the set of columns for the set of rows... how do you
determine the difference between a completely empty row and a row that
just does not have any of the matching columns?

Well, the answer is that Cassandra does not go and check whether there
are any columns outside of the range you are querying, so it will just
return the row as empty (for the column range you specified). Your
code needs to be robust enough to understand that an empty list of
columns is ambiguous: either there are no columns at all for that row
key (i.e. it is deleted and awaiting tombstone expiry & gc), or the
row only has columns outside the range you queried.
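
A minimal sketch of that client-side handling, in plain Python with no Cassandra client involved. The result shape (a list of (row key, column dict) pairs) and the name drop_range_ghosts are assumptions made for illustration, not part of any Cassandra API:

```python
from typing import Dict, List, Tuple

# Assumed (hypothetical) shape of a range-slice result:
# one (row_key, {column_name: value}) pair per key in the range.
RangeSlice = List[Tuple[str, Dict[str, bytes]]]


def drop_range_ghosts(rows: RangeSlice) -> RangeSlice:
    """Keep only the rows that returned at least one requested column.

    An empty column dict is ambiguous: the row may be deleted and awaiting
    gc, or it may only have columns outside the requested range. Either way
    it carries no data for this query, so it is skipped.
    """
    return [(key, cols) for key, cols in rows if cols]


result = [
    ("alice", {"email": b"alice@example.com"}),
    ("bob", {}),  # deleted row, or a row with only other columns
    ("carol", {"email": b"carol@example.com"}),
]
live = drop_range_ghosts(result)
print([key for key, _ in live])  # prints ['alice', 'carol']
```

This only hides the empty rows. It cannot tell you which of them were actually deleted, which is the same limitation described above.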

On 13 June 2011 13:59, AJ <aj...@dude.podzone.net> wrote:
> http://wiki.apache.org/cassandra/FAQ#range_ghosts
>
> "So to special case leaving out result entries for deletions, we would have
> to check the entire rest of the row to make sure there is no undeleted data
> anywhere else either (in which case leaving the key out would be an error)."
>
> The above doesn't read well and I don't get it.  Can anyone rephrase it or
> elaborate?
>
> Thanks!
>