Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2011/02/22 00:28:38 UTC
[jira] Created: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Cleanup can create sstables whose contents do not match their advertised version
--------------------------------------------------------------------------------
Key: CASSANDRA-2211
URL: https://issues.apache.org/jira/browse/CASSANDRA-2211
Project: Cassandra
Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
{code}
if (Range.isTokenInRanges(row.getKey().token, ranges))
{
    writer = maybeCreateWriter(sstable, compactionFileLocation, expectedBloomFilterSize, writer);
    writer.append(new EchoedRow(row));
    totalkeysWritten++;
}
else
{
    while (row.hasNext())
    {
        IColumn column = row.next();
        if (indexedColumns.contains(column.name()))
            Table.cleanupIndexEntry(cfs, row.getKey().key, column);
    }
}
{code}
... that is, we copy rows that haven't changed to the new sstable without deserializing them. But the new sstable is created with CURRENT_VERSION, which may not be the version the old data was written in.
(This could cause symptoms similar to CASSANDRA-2195 but I do not think it is the cause of that bug; IIRC the cluster in question there was not upgraded from an older Cassandra.)
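
To make the failure mode concrete, here is a minimal toy sketch. It assumes two hypothetical formats, "v1" (2-byte length prefix) and "v2" (4-byte length prefix); these are illustrative only and are not Cassandra's actual sstable layouts. The class and method names are likewise invented for this example. Bytes written as v1 and echoed unchanged into a container that advertises v2 no longer read back correctly.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class VersionMismatchSketch {
    // Hypothetical formats: "v1" uses a 2-byte length prefix, anything else a 4-byte prefix.
    static byte[] serialize(String value, String version) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        if (version.equals("v1")) {
            return ByteBuffer.allocate(2 + data.length).putShort((short) data.length).put(data).array();
        }
        return ByteBuffer.allocate(4 + data.length).putInt(data.length).put(data).array();
    }

    // Reads a value back, trusting the advertised version to pick the length-prefix width.
    static String deserialize(byte[] bytes, String advertisedVersion) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int length = advertisedVersion.equals("v1") ? buf.getShort() : buf.getInt();
        byte[] data = new byte[length];
        buf.get(data);
        return new String(data, StandardCharsets.UTF_8);
    }

    // True only if the bytes decode to the expected value under the advertised version.
    static boolean readsBack(byte[] bytes, String advertisedVersion, String expected) {
        try {
            return expected.equals(deserialize(bytes, advertisedVersion));
        } catch (RuntimeException e) {
            // A misread length prefix typically surfaces as a buffer underflow or bad array size.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] oldBytes = serialize("key", "v1");
        System.out.println("read as v1: " + readsBack(oldBytes, "v1", "key"));
        // Echoing the same bytes into a container labelled v2 is the mismatch described above.
        System.out.println("read as v2: " + readsBack(oldBytes, "v2", "key"));
    }
}
```

The point of the sketch is only that an echoed row carries its original encoding with it, so the version label on the destination has to agree with the source.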
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2211:
--------------------------------------
Attachment: 2211.txt
> Cleanup can create sstables whose contents do not match their advertised version
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2211
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2211
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Fix For: 0.7.3
>
> Attachments: 2211.txt
>
> Original Estimate: 4h
> Remaining Estimate: 4h
>
> Since cleanup switched to per-sstable operation (CASSANDRA-1916), the main loop looks like this:
> {code}
> if (Range.isTokenInRanges(row.getKey().token, ranges))
> {
>     writer = maybeCreateWriter(sstable, compactionFileLocation, expectedBloomFilterSize, writer);
>     writer.append(new EchoedRow(row));
>     totalkeysWritten++;
> }
> else
> {
>     while (row.hasNext())
>     {
>         IColumn column = row.next();
>         if (indexedColumns.contains(column.name()))
>             Table.cleanupIndexEntry(cfs, row.getKey().key, column);
>     }
> }
> {code}
> ... that is, we copy rows that haven't changed to the new sstable without deserializing them, using EchoedRow. But the new sstable is created with CURRENT_VERSION, which may not be the version the old data was written in.
> (This could cause symptoms similar to CASSANDRA-2195 but I do not think it is the cause of that bug; IIRC the cluster in question there was not upgraded from an older Cassandra.)
[jira] Commented: (CASSANDRA-2211) Cleanup can create sstables
whose contents do not match their advertised version
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997629#comment-12997629 ]
Jonathan Ellis commented on CASSANDRA-2211:
-------------------------------------------
A better fix would be to echo the row if the data is already in the current version, and rewrite it otherwise. This would (a) fit better with our policy of not keeping code around to write old versions and (b) allow a better upgrade path to version N + 1 (which doesn't support the old-version sstables) than major compaction. I'll see if I can do that tonight.
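
A rough sketch of that "echo if current, otherwise rewrite" decision follows. Everything in it is a hypothetical stand-in: the "v1"/"v2" version strings, the StoredRow holder, and the rewrite() placeholder are invented for illustration and are not taken from the attached patch. The sketch tags each row with a version for simplicity; in practice the version would belong to the source sstable.

```java
import java.util.ArrayList;
import java.util.List;

public class EchoOrRewriteSketch {
    // Hypothetical stand-in for the format version the writer produces.
    static final String CURRENT_VERSION = "v2";

    // Hypothetical row holder: a serialized payload plus the version it was written in.
    static final class StoredRow {
        final String version;
        final String payload;

        StoredRow(String version, String payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    // Placeholder for "deserialize in the old format, then reserialize in the current one".
    static StoredRow rewrite(StoredRow row) {
        return new StoredRow(CURRENT_VERSION, row.payload);
    }

    // Echo rows that are already in the current format; rewrite everything else.
    static StoredRow prepare(StoredRow row) {
        return CURRENT_VERSION.equals(row.version) ? row : rewrite(row);
    }

    // Every row handed to the new sstable now matches the version it advertises.
    static List<StoredRow> cleanup(List<StoredRow> rows) {
        List<StoredRow> out = new ArrayList<>();
        for (StoredRow row : rows) {
            out.add(prepare(row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<StoredRow> rows = new ArrayList<>();
        rows.add(new StoredRow("v1", "old-format row"));
        rows.add(new StoredRow("v2", "current-format row"));
        for (StoredRow row : cleanup(rows)) {
            System.out.println(row.version + ": " + row.payload);
        }
    }
}
```

With this shape the writer only ever emits the current format, so no code for writing old versions has to be kept around.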
[jira] Updated: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2211:
--------------------------------------
Description:
Since cleanup switched to per-sstable operation (CASSANDRA-1916), the main loop looks like this:
{code}
if (Range.isTokenInRanges(row.getKey().token, ranges))
{
    writer = maybeCreateWriter(sstable, compactionFileLocation, expectedBloomFilterSize, writer);
    writer.append(new EchoedRow(row));
    totalkeysWritten++;
}
else
{
    while (row.hasNext())
    {
        IColumn column = row.next();
        if (indexedColumns.contains(column.name()))
            Table.cleanupIndexEntry(cfs, row.getKey().key, column);
    }
}
{code}
... that is, we copy rows that haven't changed to the new sstable without deserializing them, using EchoedRow. But the new sstable is created with CURRENT_VERSION, which may not be the version the old data was written in.
(This could cause symptoms similar to CASSANDRA-2195 but I do not think it is the cause of that bug; IIRC the cluster in question there was not upgraded from an older Cassandra.)
was:
{code}
if (Range.isTokenInRanges(row.getKey().token, ranges))
{
    writer = maybeCreateWriter(sstable, compactionFileLocation, expectedBloomFilterSize, writer);
    writer.append(new EchoedRow(row));
    totalkeysWritten++;
}
else
{
    while (row.hasNext())
    {
        IColumn column = row.next();
        if (indexedColumns.contains(column.name()))
            Table.cleanupIndexEntry(cfs, row.getKey().key, column);
    }
}
{code}
... that is, rows that haven't changed we copy to the new sstable without deserializing. But, the new sstable is created with CURRENT_VERSION which may not be what the old data consisted of.
(This could cause symptoms similar to CASSANDRA-2195 but I do not think it is the cause of that bug; IIRC the cluster in question there was not upgraded from an older Cassandra.)
Remaining Estimate: 4h (was: 2h)
Original Estimate: 4h (was: 2h)
[jira] Updated: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2211:
--------------------------------------
Attachment: (was: 2211.txt)
[jira] Updated: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-2211:
----------------------------------------
Attachment: 0001-2211-v3.patch
+1 on the patch. I'm just attaching a v3 that simply uses getDefaultGcBefore() throughout CompactionManager (to make things cleaner).
Sadly, this is not the only place where we echo data incorrectly; cf. CASSANDRA-2216.
[jira] Updated: (CASSANDRA-2211) Cleanup can create sstables whose
contents do not match their advertised version
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2211:
--------------------------------------
Attachment: 2211-v2.txt
v2 as described above.
[jira] Commented: (CASSANDRA-2211) Cleanup can create sstables
whose contents do not match their advertised version
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997845#comment-12997845 ]
Hudson commented on CASSANDRA-2211:
-----------------------------------
Integrated in Cassandra-0.7 #303 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/303/])
fix for cleanup writing old-format data into new-version sstable
patch by jbellis; reviewed by slebresne for CASSANDRA-2211