You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@cassandra.apache.org by Scott McCarty <sc...@metajinx.com> on 2011/01/28 03:19:17 UTC

Has anyone seen column deletes that seem not to actually delete the column?

Hi,

I'm trying to figure out what's going on with some column removes that don't
seem to be taking hold.

This particular test is being done on a single node cluster running 0.6.8
with CL=QUORUM on the writes (which shouldn't matter, I'd think).

What I'm seeing in our client log files is that a column remove is being
done on a particular rowkey but when I later use cassandra-cli to query that
rowkey the column shows up again.  We're using batch_mutate with a list of
about 500 modifications in the batch.  The timestamp on the column is 6
months ago and the timestamp we're setting on the mutation is the current
time (in microseconds).  I've verified that I'm not inserting the same
column back into the rowkey after it's deleted.

We're using hector on the client side.  I'm seeing no exceptions anywhere
during the rather long run (about 2 hours of thumping the server pretty
thoroughly...2.5GB of log files).  This dropping of the delete appears to be
very rare as we're doing tens of thousands of them and it looks like only a
small number aren't working.

I looked at the server logs and the only thing of interest there is that a
few minutes after the supposed delete the server switches in a new memtable.
 Here's the log:

2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.ColumnFamilyStore] -
View has reached its threshold; switching in a fresh Memtable at
CommitLogContext(file='/dblogs/core/commitlog/CommitLog-1296142416683.log',
position=43038896)
2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.ColumnFamilyStore] -
Enqueuing flush of Memtable-View@1961066949(67679287 bytes, 248 operations)
2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.Memtable] - Writing
Memtable-View@1961066949(67679287 bytes, 248 operations)
2011-01-27 07:45:29,351 INFO [org.apache.cassandra.db.Memtable] - Completed
flushing /opt/cassandra/storage/core/data/Panama/View-14593-Data.db (1572707
bytes)
2011-01-27 07:46:42,667 INFO [org.apache.cassandra.db.ColumnFamilyStore] -
CFTest1 has reached its threshold; switching in a fresh Memtable at
CommitLogContext(file='/dblogs/core/commitlog/CommitLog-1296142416683.log',
position=47447132)
2011-01-27 07:46:42,668 INFO [org.apache.cassandra.db.ColumnFamilyStore] -
Enqueuing flush of Memtable-CFTest1@277830944(38486 bytes, 501 operations)
2011-01-27 07:46:42,668 INFO [org.apache.cassandra.db.Memtable] - Writing
Memtable-CFTest1@277830944(38486 bytes, 501 operations)
2011-01-27 07:46:43,073 INFO [org.apache.cassandra.db.Memtable] - Completed
flushing /opt/cassandra/storage/core/data/Panama/CFTest1-1258-Data.db (45750
bytes)
[snip]

This is about 5 minutes after the delete.

I definitely can't say that it's not a client-side failure where that
exception is getting silently swallowed.  My concern is that the delete
request makes it to the server but somehow gets lost (on a one node
cluster...).

Does anyone know if there's a known issue related to losing delete requests?

Thanks,
  Scott

Re: Has anyone seen column deletes that seem not to actually delete the column?

Posted by Aaron Morton <aa...@thelastpickle.com>.

No current issues with delete that I know of. To be safe can you upgrade to 0.7 or 0.6.10 ? 

Some things to check:
- When you read the column back in the CLI does it have the timestamp you expected?
- Until proven otherwise assume it's a client side thing, can you add some more logging into your app and turn on DEBUG logging in cassandra. Then when you notice the fault stop your running task and try to produce.
- The CLI works at CL ONE, so there is a small possibility that you could connect to a node that did not get the delete. Are there any WARN messages about dropped messages ?

If you are still having troubles provide some more info on how the delete is being performed. 

Good luck
Aaron
On 28 Jan, 2011,at 03:19 PM, Scott McCarty <sc...@metajinx.com> wrote:

Hi,

I'm trying to figure out what's going on with some column removes that don't seem to be taking hold.

This particular test is being done on a single node cluster running 0.6.8 with CL=QUORUM on the writes (which shouldn't matter, I'd think).

What I'm seeing in our client log files is that a column remove is being done on a particular rowkey but when I later use cassandra-cli to query that rowkey the column shows up again.  We're using batch_mutate with a list of about 500 modifications in the batch.  The timestamp on the column is 6 months ago and the timestamp we're setting on the mutation is the current time (in microseconds).  I've verified that I'm not inserting the same column back into the rowkey after it's deleted.

We're using hector on the client side.  I'm seeing no exceptions anywhere during the rather long run (about 2 hours of thumping the server pretty thoroughly...2.5GB of log files).  This dropping of the delete appears to be very rare as we're doing tens of thousands of them and it looks like only a small number aren't working.

I looked at the server logs and the only thing of interest there is that a few minutes after the supposed delete the server switches in a new memtable.  Here's the log:

2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.ColumnFamilyStore] - View has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/dblogs/core/commitlog/CommitLog-1296142416683.log', position=43038896)
2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.ColumnFamilyStore] - Enqueuing flush of Memtable-View@1961066949(67679287 bytes, 248 operations)
2011-01-27 07:45:28,646 INFO [org.apache.cassandra.db.Memtable] - Writing Memtable-View@1961066949(67679287 bytes, 248 operations)
2011-01-27 07:45:29,351 INFO [org.apache.cassandra.dbMemtable] - Completed flushing /opt/cassandra/storage/core/data/Panama/View-14593-Data.db (1572707 bytes)
2011-01-27 07:46:42,667 INFO [org.apache.cassandra.db.ColumnFamilyStore] - CFTest1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/dblogs/core/commitlog/CommitLog-1296142416683.log', position=47447132)
2011-01-27 07:46:42,668 INFO [org.apache.cassandra.db.ColumnFamilyStore] - Enqueuing flush of Memtable-CFTest1@277830944(38486 bytes, 501 operations)
2011-01-27 07:46:42,668 INFO [org.apache.cassandra.db.Memtable] - Writing Memtable-CFTest1@277830944(38486 bytes, 501 operations)
2011-01-27 07:46:43,073 INFO [org.apache.cassandra.db.Memtable] - Completed flushing /opt/cassandra/storage/core/data/Panama/CFTest1-1258-Data.db (45750 bytes)
[snip]

This is about 5 minutes after the delete.

I definitely can't say that it's not a client-side failure where that exception is getting silently swallowed.  My concern is that the delete request makes it to the server but somehow gets lost (on a one node cluster...).

Does anyone know if there's a known issue related to losing delete requests?

Thanks,
  Scott