Posted to commits@cassandra.apache.org by "Alex Petrov (JIRA)" <ji...@apache.org> on 2017/03/20 16:59:41 UTC

[jira] [Comment Edited] (CASSANDRA-13337) Dropping column results in "corrupt" SSTable

    [ https://issues.apache.org/jira/browse/CASSANDRA-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933059#comment-15933059 ] 

Alex Petrov edited comment on CASSANDRA-13337 at 3/20/17 4:59 PM:
------------------------------------------------------------------

There's another way to reproduce the same issue with slightly different steps:

{code}
CREATE KEYSPACE IF NOT EXISTS "test" WITH REPLICATION = {'class' : 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': 1 };

CREATE TABLE IF NOT EXISTS "test"."reproduce" (pk1 int, ck1 int, v1 int, v2 int, v3 int, v4 int, v5 int, PRIMARY KEY(pk1, ck1));

UPDATE "test"."reproduce" SET v1 = 1, v2 = 1, v3 = 3, v4 = 4 WHERE pk1 = 1 AND ck1 = 0;

ALTER TABLE "test"."reproduce" DROP v2;
ALTER TABLE "test"."reproduce" DROP v3;
ALTER TABLE "test"."reproduce" DROP v4;

SELECT * FROM test.reproduce;
{code}

But essentially the problem is that we return empty rows from local storage. For example, when {{UPDATE}} is used to set only a subset of columns and those columns are later dropped, a query ends up with an empty row. This wouldn't happen with {{INSERT}}, since {{INSERT}} sets liveness for the row.
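
To make the liveness point concrete, here is a toy model in Java. It is only a sketch, not Cassandra's actual row implementation: {{ToyRow}}, its methods, and the boolean liveness flag are hypothetical names invented for illustration, and the model assumes that only {{INSERT}} sets row liveness while {{UPDATE}} writes cells alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a stored row. Hypothetical; not Cassandra's real Row class. */
final class ToyRow {
    // Assumption: true only when the row was written by INSERT.
    private final boolean hasPrimaryKeyLiveness;
    // Cells for regular columns: column name -> value.
    private final Map<String, Integer> cells;

    private ToyRow(boolean hasPrimaryKeyLiveness, Map<String, Integer> cells) {
        this.hasPrimaryKeyLiveness = hasPrimaryKeyLiveness;
        this.cells = cells;
    }

    /** Row created by INSERT: carries liveness in this model. */
    static ToyRow insert() {
        return new ToyRow(true, new LinkedHashMap<>());
    }

    /** Row created by UPDATE: cells only, no liveness in this model. */
    static ToyRow update() {
        return new ToyRow(false, new LinkedHashMap<>());
    }

    /** Returns a copy of this row with one more cell written. */
    ToyRow set(String column, int value) {
        Map<String, Integer> copy = new LinkedHashMap<>(cells);
        copy.put(column, value);
        return new ToyRow(hasPrimaryKeyLiveness, copy);
    }

    /** Returns a copy of this row with the cells of dropped columns removed. */
    ToyRow drop(String... columns) {
        Map<String, Integer> copy = new LinkedHashMap<>(cells);
        for (String column : columns) {
            copy.remove(column);
        }
        return new ToyRow(hasPrimaryKeyLiveness, copy);
    }

    /** A row is empty when it has neither liveness nor any remaining cell. */
    boolean isEmpty() {
        return !hasPrimaryKeyLiveness && cells.isEmpty();
    }

    public static void main(String[] args) {
        ToyRow updated = ToyRow.update().set("x", 1).drop("x");
        ToyRow inserted = ToyRow.insert().set("x", 1).drop("x");
        System.out.println("UPDATE then DROP x, empty row: " + updated.isEmpty());
        System.out.println("INSERT then DROP x, empty row: " + inserted.isEmpty());
    }
}
```

In this model the {{UPDATE}} row has nothing left once its only written column is dropped, while the {{INSERT}} row is still kept alive by its liveness flag, which mirrors the difference described above.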

I just see a single small problem: 

{code}
        createTable("CREATE TABLE %s(k int PRIMARY KEY, x int, y int)");
        execute("UPDATE %s SET x = 1 WHERE k = 0");
        flush(doFlush); // (1)
        execute("ALTER TABLE %s DROP x");
{code}

If we flush at point {{(1)}}, we end up with a single row, {{row(1, null)}}. However, if we do not flush and query directly from the memtable, we end up with an empty result.


was (Author: ifesdjeen):
There's another way to reproduce the same issue with slightly different steps:

{code}
CREATE KEYSPACE IF NOT EXISTS "test" WITH REPLICATION = {'class' : 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': 1 };

CREATE TABLE IF NOT EXISTS "test"."reproduce" (pk1 int, ck1 int, v1 int, v2 int, v3 int, v4 int, v5 int, PRIMARY KEY(pk1, ck1));

UPDATE "test"."reproduce" SET v1 = 1, v2 = 1, v3 = 3, v4 = 4 WHERE pk1 = 1 AND ck1 = 0;

ALTER TABLE "test"."reproduce" DROP v2;
ALTER TABLE "test"."reproduce" DROP v3;
ALTER TABLE "test"."reproduce" DROP v4;

SELECT * FROM test.reproduce;
{code}

But essentially the problem is that we do return empty rows from local storage. For example, when {{UPDATE}} was used to set only a subset of rows, then the rows that were used in {{UPDATE}} get dropped. When trying to query, we end up with an empty row. This wouldn't happen with {{INSERT}} since for {{INSERT}} we have liveness set.

I just see a single small problem: 

{code}
        createTable("CREATE TABLE %s(k int PRIMARY KEY, x int, y int)");
        execute("UPDATE %s SET x = 1 WHERE k = 0");
        flush(doFlush); // (1)
        execute("ALTER TABLE %s DROP x");
{code}

If we do flush at point {{1}}, we will end up with a single row {{row(1, null)}}. However, if we do not do flush and query directly from sstable, we end up with an empty result.

> Dropping column results in "corrupt" SSTable
> --------------------------------------------
>
>                 Key: CASSANDRA-13337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13337
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Jonas Borgström
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.x, 3.11.x
>
>
> It seems that dropping a column can make SSTables containing rows with writes to only the dropped column uncompactable.
> Also, Cassandra <= 3.9 and <= 3.0.11 will even refuse to start, with the same stack trace:
> {code}
> cqlsh -e "create keyspace test with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"
> cqlsh -e "create table test.test(pk text primary key, x text, y text)"
> cqlsh -e "update test.test set x='1' where pk='1'"
> nodetool flush
> cqlsh -e "update test.test set x='1', y='1' where pk='1'"
> nodetool flush
> cqlsh -e "alter table test.test drop x"
> nodetool compact test test
> error: Corrupt empty row found in unfiltered partition
> -- StackTrace --
> java.io.IOException: Corrupt empty row found in unfiltered partition
> 	at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:382)
> 	at org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:87)
> 	at org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:65)
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> 	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.doCompute(SSTableIdentityIterator.java:123)
> 	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:100)
> 	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30)
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> 	at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
> 	at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> 	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:369)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:189)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158)
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> 	at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:509)
> 	at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:369)
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> 	at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129)
> 	at org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:58)
> 	at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:67)
> 	at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
> 	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
> 	at org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:227)
> 	at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:190)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
> 	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
> 	at org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:610)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}


