Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2013/10/07 11:14:42 UTC

[jira] [Commented] (CASSANDRA-4988) Fix concurrent addition of collection columns

    [ https://issues.apache.org/jira/browse/CASSANDRA-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788002#comment-13788002 ] 

Sylvain Lebresne commented on CASSANDRA-4988:
---------------------------------------------

Each collection already has its own entry in schema_columns, and all the information is there, so I don't think we need a new table here. The information is already redundant. It's just that because the comparator object needs to know the collections (to correctly implement AbstractType.compareCollectionMembers), we currently include the collection names in the comparator's serialized form. We should stop doing that, since we already have all the information we need.

In fact, in 2.0 the 'comparator' field in schema_columnfamilies is entirely useless: all the information it contains can be reconstructed from schema_columns. So the right solution is probably to stop saving that field at all and to reconstruct it from schema_columns instead. That would also solve the other problem I've discussed above, the "concurrent modification of comparator components".

Of course, we'd need to be careful with backward compatibility if we do so.
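
As a rough illustration of rebuilding the names->comparator map from per-column rows, here is a minimal sketch. Everything in it is hypothetical: ColumnRow, its fields, the simplified type strings and rebuildCollectionTypes are stand-ins for illustration only, not Cassandra's actual classes or the schema_columns layout.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these types are Cassandra's real classes.
class SchemaColumnsSketch {

    // Simplified stand-in for one row of schema_columns.
    static final class ColumnRow {
        final String name;
        final String validator;     // simplified type string, assumed for illustration
        final boolean isCollection;

        ColumnRow(String name, String validator, boolean isCollection) {
            this.name = name;
            this.validator = validator;
            this.isCollection = isCollection;
        }
    }

    // Rebuild the names->comparator map from the per-column rows instead of
    // reading it back from a single serialized comparator string.
    static Map<String, String> rebuildCollectionTypes(List<ColumnRow> rows) {
        Map<String, String> byName = new TreeMap<>();
        for (ColumnRow row : rows) {
            if (row.isCollection) {
                byName.put(row.name, row.validator);
            }
        }
        return byName;
    }

    public static void main(String[] args) {
        List<ColumnRow> rows = Arrays.asList(
            new ColumnRow("id", "UUIDType", false),
            new ColumnRow("tags", "SetType(UTF8Type)", true),
            new ColumnRow("scores", "MapType(UTF8Type,Int32Type)", true));
        System.out.println(rebuildCollectionTypes(rows));
    }
}
```

Because each collection lives in its own row, nothing in this sketch depends on a shared serialized value that two schema updates could overwrite.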

> Fix concurrent addition of collection columns
> ---------------------------------------------
>
>                 Key: CASSANDRA-4988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4988
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0.2
>
>
> It is currently not safe to update the schema by adding multiple collection columns to the same table. The reason is that with collections, the comparator embeds a map of names->comparator for each collection column (since different maps can have different key types, for example). And when serialized on disk in the schema table, the comparator is serialized as a string with that map as one column. So if new collection columns are added concurrently, the additions may not be merged correctly.
> One option to fix this would be to stop serializing the names->comparator map of ColumnToCollectionType in toString(), and do one of:
> # reconstruct that map from the information stored in schema_columns. The downside I can see is that code-wise this may not be super clean to do.
> # change ColumnToCollectionType so that instead of having its own names->comparator map, it just stores a pointer to the CFMetaData that contains it, and when it needs to find the exact comparator for a collection column, it uses CFMetaData.column_metadata directly. The downside is that creating a dependency from a comparator to a CFMetaData feels a bit backward.
> Not sure which is the best solution of the two, honestly.
> While probably more anecdotal, we also now allow changing the type of the comparator in some cases (for example, updating to BytesType is always allowed), and doing so concurrently on multiple components of a composite comparator is also not safe for a similar reason. I'm not sure how to fix that one.
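
The merge problem described in the issue can be sketched as follows. This is a toy model under a stated assumption: concurrent schema writes are reconciled cell by cell, with the highest timestamp winning. The Cell class, the merge helper and the "collections{...}" strings are hypothetical and do not reflect Cassandra's real serialization format.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of cell-level reconciliation; not Cassandra code.
class SchemaMergeSketch {

    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge two versions of a row: for each cell name, keep the cell
    // with the highest timestamp (assumed reconciliation rule).
    static Map<String, Cell> merge(Map<String, Cell> a, Map<String, Cell> b) {
        Map<String, Cell> out = new TreeMap<>(a);
        for (Map.Entry<String, Cell> e : b.entrySet()) {
            Cell existing = out.get(e.getKey());
            if (existing == null || e.getValue().timestamp > existing.timestamp) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Both nodes rewrite the same "comparator" cell: one addition is lost.
        Map<String, Cell> nodeA = new TreeMap<>();
        nodeA.put("comparator", new Cell("collections{tags}", 1L));
        Map<String, Cell> nodeB = new TreeMap<>();
        nodeB.put("comparator", new Cell("collections{scores}", 2L));
        System.out.println(merge(nodeA, nodeB).get("comparator").value);

        // Each node writes its own per-column cell: both additions survive.
        Map<String, Cell> colsA = new TreeMap<>();
        colsA.put("tags", new Cell("SetType", 1L));
        Map<String, Cell> colsB = new TreeMap<>();
        colsB.put("scores", new Cell("MapType", 2L));
        System.out.println(merge(colsA, colsB).keySet());
    }
}
```

In the first case the whole map is one value, so the later write replaces the earlier one; in the second, the two additions touch different cells and the merge is a simple union.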



--
This message was sent by Atlassian JIRA
(v6.1#6144)