Posted to commits@cassandra.apache.org by "Hannu Kröger (JIRA)" <ji...@apache.org> on 2018/03/23 06:31:00 UTC

[jira] [Comment Edited] (CASSANDRA-14336) sstableloader fails if sstables contains removed columns

    [ https://issues.apache.org/jira/browse/CASSANDRA-14336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410870#comment-16410870 ] 

Hannu Kröger edited comment on CASSANDRA-14336 at 3/23/18 6:30 AM:
-------------------------------------------------------------------

If this is really the expected behaviour, then the following flow won't always work:
 1) Take snapshot on cluster A
 2) Dump schema on cluster A
 3) Load schema on cluster B
 4) Load data with sstableloader on cluster B
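
As a rough sketch, the four steps above might look like this in shell. The host names, keyspace, snapshot tag and paths are hypothetical placeholders rather than values from this ticket, and copying the snapshot files over to cluster B is left out. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: SRC_HOST, DEST_HOST, KS, TAG and SNAPSHOT_COPY are made-up
# placeholders, not values taken from this ticket.
SRC_HOST=${SRC_HOST:-cluster-a.example.com}
DEST_HOST=${DEST_HOST:-cluster-b.example.com}
KS=${KS:-test}
TAG=${TAG:-migration}
# Directory holding the copied snapshot files, laid out as <keyspace>/<table>,
# because sstableloader derives both names from the last two path components.
SNAPSHOT_COPY=${SNAPSHOT_COPY:-/tmp/$KS/mytable}

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate() {
    # 1) Take snapshot on cluster A
    run nodetool -h "$SRC_HOST" snapshot -t "$TAG" "$KS"
    # 2) Dump schema on cluster A
    run sh -c "cqlsh $SRC_HOST -e 'DESCRIBE KEYSPACE $KS' > $KS-schema.cql"
    # 3) Load schema on cluster B
    run cqlsh "$DEST_HOST" -f "$KS-schema.cql"
    # 4) Load data with sstableloader on cluster B
    run sstableloader -d "$DEST_HOST" "$SNAPSHOT_COPY"
}

migrate
```

Running it with DRY_RUN=0 would execute the commands for real. Step 4 is the one that can fail with "Unknown column ... during deserialization" when the snapshot contains sstables written before a column was dropped.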

I'm not 100% sure it would work even if we were to load the snapshot back into cluster A.

Here, step 4 will work only if

1) No columns were ever dropped

OR

2) Compactions have been run on all affected sstables after the columns were dropped

OR

3) nodetool upgradesstables -a (--include-all-sstables) was executed on cluster A

I don't think this is very practical. Also, I don't think it is documented.
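
For illustration, here is a minimal sketch of what checking for dropped columns and applying workaround 3 on cluster A could look like before the snapshot is taken. It assumes Cassandra 3.0 or later, where the system_schema.dropped_columns table exists, and it uses the nodetool subcommand upgradesstables with its -a option to include all sstables. The keyspace and table names are placeholders, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: KS and TABLE are placeholder names.
KS=${KS:-test}
TABLE=${TABLE:-bug3_source}

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the columns that were ever dropped from the table. An empty result
# corresponds to condition 1 (no columns were ever dropped).
check_dropped_columns() {
    run cqlsh -e "SELECT column_name FROM system_schema.dropped_columns WHERE keyspace_name = '$KS' AND table_name = '$TABLE';"
}

# Rewrite every sstable of the table, including those already on the current
# format version (-a), which corresponds to condition 3.
rewrite_sstables() {
    run nodetool upgradesstables -a "$KS" "$TABLE"
}

check_dropped_columns
rewrite_sstables
```

Whether rewriting the sstables is acceptable on a production cluster depends on data volume, since it rewrites every sstable of the table.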



> sstableloader fails if sstables contains removed columns
> --------------------------------------------------------
>
>                 Key: CASSANDRA-14336
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14336
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Hannu Kröger
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Major
>
> If I copy the schema and try to load sstables with sstableloader, loading sometimes fails with:
> {code:java}
> Exception in thread "main" org.apache.cassandra.tools.BulkLoadException: java.lang.RuntimeException: Failed to list files in /tmp/test/bug3_dest-acdc
>     at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:93)
>     at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:48)
> Caused by: java.lang.RuntimeException: Failed to list files in /tmp/test/bug3_dest-acdc
>     at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:77)
>     at org.apache.cassandra.db.lifecycle.LifecycleTransaction.getFiles(LifecycleTransaction.java:561)
>     at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:76)
>     at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:165)
>     at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:80)
>     ... 1 more
> Caused by: java.lang.RuntimeException: Unknown column d during deserialization
>     at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:321)
>     at org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:440)
>     at org.apache.cassandra.io.sstable.SSTableLoader.lambda$openSSTables$0(SSTableLoader.java:121)
>     at org.apache.cassandra.db.lifecycle.LogAwareFileLister.lambda$innerList$2(LogAwareFileLister.java:99)
>     at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>     at java.util.TreeMap$EntrySpliterator.forEachRemaining(TreeMap.java:2969)
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>     at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>     at org.apache.cassandra.db.lifecycle.LogAwareFileLister.innerList(LogAwareFileLister.java:101)
>     at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:73)
>     ... 5 more{code}
> This requires that columns have been dropped from the source table and that sstables written under the old schema still exist.
> This is very easy to reproduce. I used the following script:
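> {code:java}
> # The keyspace $KS must already exist; this script does not create it.
> KS=test
> SRCTABLE=bug3_source
> DESTTABLE=bug3_dest
> DATADIR=/usr/local/var/lib/cassandra/data
> TMPDIR=/tmp
> # The source table has column d; the destination table does not.
> cqlsh -e "CREATE TABLE $KS.$SRCTABLE(a int primary key, b int, c int, d int);"
> cqlsh -e "CREATE TABLE $KS.$DESTTABLE(a int primary key, b int, c int);"
> # Write and flush one sstable while d is still part of the schema.
> cqlsh -e "INSERT INTO $KS.$SRCTABLE(a,b,c,d) values(1,2,3,4);"
> nodetool flush $KS $SRCTABLE
> # Drop d; the sstable flushed above still lists it in its header.
> cqlsh -e "ALTER TABLE $KS.$SRCTABLE DROP d;"
> nodetool flush $KS $SRCTABLE
> # Copy the source sstables into a <keyspace>/<table> directory and load them.
> mkdir -p $TMPDIR/$KS/$DESTTABLE-acdc
> cp $DATADIR/$KS/$SRCTABLE-*/* $TMPDIR/$KS/$DESTTABLE-acdc
> sstableloader -d 127.0.0.1 $TMPDIR/$KS/$DESTTABLE-acdc{code}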


