Posted to commits@cassandra.apache.org by "Benjamin Lerer (Jira)" <ji...@apache.org> on 2022/07/13 13:52:00 UTC
[jira] [Commented] (CASSANDRA-17601) IllegalStateException with prepared queries selecting static columns in mixed 3.0.x/4.x clusters

    [ https://issues.apache.org/jira/browse/CASSANDRA-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566323#comment-17566323 ] 

Benjamin Lerer commented on CASSANDRA-17601:
--------------------------------------------

[~jonmeredith] I am still trying to wrap my mind around the problem.
If the problem is the pre-computation, why do we not simply switch to {{OnRequestColumnFilterFactory}} for all scenarios?
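
To make the question concrete, here is a minimal sketch of the difference between a filter that is built once at prepare time and one that is built on each request. It is not Cassandra code: the class {{FilterFactorySketch}}, the field {{minClusterVersion}}, and the string "filters" are hypothetical stand-ins, and the only assumption taken from the description below is that the filter variant depends on the lowest version seen in gossip.

```java
import java.util.function.Supplier;

public class FilterFactorySketch {
    // Hypothetical stand-in for the lowest version the node currently sees in gossip.
    static volatile String minClusterVersion = "4.0";

    // Hypothetical stand-in for the version-dependent filter builder.
    static String buildFilter() {
        return minClusterVersion.startsWith("3.")
               ? "all statics"
               : "queried statics only";
    }

    // Pre-computed: the decision is frozen when the statement is prepared.
    static Supplier<String> precomputed() {
        String frozen = buildFilter();
        return () -> frozen;
    }

    // On request: the decision is re-evaluated on every execution.
    static Supplier<String> onRequest() {
        return FilterFactorySketch::buildFilter;
    }

    public static void main(String[] args) {
        // Statement is prepared while the node believes the cluster is fully upgraded.
        Supplier<String> pre = precomputed();
        Supplier<String> req = onRequest();

        // Gossip later reveals a 3.0.x node.
        minClusterVersion = "3.0";

        System.out.println(pre.get()); // prints "queried statics only"
        System.out.println(req.get()); // prints "all statics"
    }
}
```

In this toy model the on-request variant follows the node's current view of the cluster, while the pre-computed one keeps whatever view was in place when the statement was prepared.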
  


> IllegalStateException with prepared queries selecting static columns in mixed 3.0.x/4.x clusters
> ------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17601
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17601
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Consistency/Coordination
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.0.x, 4.1-beta, 4.x
>
>
> Clusters that contain prepared statements that partially select static columns before the upgrade will fail to execute those statements coordinated from the 4.x nodes until the upgrade completes.
> h2. Reproduction
> Setup (before upgrade)
> {code:java}
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor':3};
> CREATE TABLE ks1.tbl1 (pk1 int,
> ck2 int,
> s3 int static,
> s4 int static,
> c5 int,
> PRIMARY KEY (pk1, ck2));
> INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);
> {code}
> Prepared Statement (prepare before upgrade)
> {code:java}
> SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;
> {code}
> Exception on 3.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
> at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> Exception on 4.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [ColumnDefinition{name=s3, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1},
> ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The root cause is that CASSANDRA-16686 changed ColumnFilters to be built and deserialized based on the versions the coordinating node thinks are running in the cluster. That
> knowledge is always incorrect when statements are re-prepared on startup, and may be incorrect while nodes are still reaching their final version.
> h2. Sequence of events:
> Prepared statements are persisted in {{system.prepared_statements}} to be re-prepared on future startup.
> When the 4.x node starts up after upgrade, in {{org.apache.cassandra.service.CassandraDaemon#setup}} it calls {{QueryProcessor.instance.preloadPreparedStatements}} *before* the {{Gossiper}} is started by a call to {{StorageService.instance.initServer()}} later in {{setup}}.
> As part of preparing statements, when possible a {{ColumnFilterFactory}} is created that returns a {{ColumnFilter}} built at the time the query is prepared.
> After the changes from CASSANDRA-16686, the {{ColumnFilter}} builder constructs different column filter variants depending on the lowest version reported in gossip, which it checks via {{org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized}}. If this runs before the Gossiper is enabled, the lowest version it sees is {{SystemKeyspace.CURRENT_VERSION}}, causing the {{ColumnFilter}} builder to create a column filter as if the cluster were fully upgraded.
> For the query above, the ColumnFilter creates an ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter.
> The participating 3.0.x nodes do not understand the new flag and create a {{ColumnFilter}} equivalent to a {{WildcardColumnFilter}}. The participating 4.x nodes do understand the new flag; however, the deserializer takes the lower-than-3.4 path because other 3.0 nodes are known about, and creates a {{WildcardColumnFilter}}.
> The fetchedColumns sent by the ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter only contains the queried static columns, however the pre-3.4 sstable iterator returns all regular and static columns, causing an IllegalStateException when the serialized response is sent back.
> The ISE clears once all nodes in the cluster think they are upgraded to the current version and behave as the originally prepared query intended.
> h2. Related Problems
> _Non-deterministic behavior of 4.0.x/4.1.x nodes_
> If the prepared statements are cleared and/or freshly prepared when the cluster is in mixed 3.0/4.0 mode, the pre-built ColumnFilter will remain in the mixed mode version until re-prepared on a restart or cache clear/eviction.
> As upgradeFromVersionMemoized times out and is recalculated after the upgrade reaches a single version, individual nodes will make a local decision on column filter building and deserializing.
> Nodes that update upgradeFromVersionMemoized early and then coordinate requests may cause the same ISE on nodes responding to the read command that still hold the previous version.
> _Digest Mismatches_
> If {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} {{ColumnFilter}} instances are incorrectly sent to 3.0.x nodes, those nodes will ignore the included list of columns and compute a different digest than the one computed locally on a 4.0.x coordinator.
> h1. Proposed fix
> In discussion with [~ifesdjeen], he suggested that one way to resolve this is to deprecate (or just remove) the {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter and no longer build it, always selecting all static columns.
> This would just leave {{WildCardColumnFilter}} and {{SelectionColumnFilter}} with {{ALL_COLUMNS}} or {{ONLY_QUERIED_COLUMNS}}.
> This is a potential performance regression for unusual schemas with very large numbers of static columns, but seems unlikely in practice.
> /cc: [~blerer] 
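
The failure mode in the description can be illustrated with a minimal sketch, assuming the serializer encodes a row's columns as a bitmap over the columns the filter declared as fetched. The class {{SubsetCheckSketch}} and its {{encodeBitmap}} method are hypothetical stand-ins written for this illustration; they are not the actual {{Columns$Serializer.encodeBitmap}} implementation.

```java
import java.util.List;

public class SubsetCheckSketch {
    // Hypothetical stand-in for the subset check seen in the stack traces:
    // every column in the row must appear in the superset declared by the filter.
    static long encodeBitmap(List<String> columns, List<String> superset) {
        long bitmap = 0L;
        for (String column : columns) {
            int index = superset.indexOf(column);
            if (index < 0)
                throw new IllegalStateException(columns + " is not a subset of " + superset);
            bitmap |= 1L << index;
        }
        return bitmap;
    }

    public static void main(String[] args) {
        // The coordinator's filter declared only the queried static column.
        List<String> fetchedStatics = List.of("s3");
        // The replica's iterator returned every static column.
        List<String> rowStatics = List.of("s3", "s4");
        try {
            encodeBitmap(rowStatics, fetchedStatics);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "[s3, s4] is not a subset of [s3]"
        }
    }
}
```

With the table from the reproduction, the row carries both {{s3}} and {{s4}} while the declared superset only holds {{s3}}, so the check fails in the same way as the first trace.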


