You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Caleb Rackliffe (Jira)" <ji...@apache.org> on 2022/12/02 07:11:00 UTC

[jira] [Commented] (CASSANDRA-17719) CEP-15: Multi-Partition Transaction CQL Support (Alpha)

    [ https://issues.apache.org/jira/browse/CASSANDRA-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642322#comment-17642322 ] 

Caleb Rackliffe commented on CASSANDRA-17719:
---------------------------------------------

I'll be posting a patch for review soon, but before I do, here's a summary of what it does and does not (yet) contain. As the title of this issue now indicates, what we'll hopefully commit here is a basic CQL integration, with a follow-up Jira addressing those items we still need to complete before the CEP-15/Accord work completes and we release something to our users.

*In this Patch*

* Support for the basic syntax outlined in the description of this issue
* Support for all scalar, (frozen and unfrozen) collection, and tuple-based C* data types in LET references in SELECT, IF, and INSERT/UPDATE
* Read-only and write-only transactions
* Basic statement validation (ensuring LET deals with single rows, transaction-specific syntax is not supported outside one, ignoring user timestamps, etc.) and tests to go along with it (see {{TransactionStatementTest}})
* Basic authorization and audit log support
* Support for bind variables in transactions
* Support for shorthand decrement operations (which are not supported in standard CQL) in transactions
* Copious in-JVM dtests that cover the above cases (see {{AccordCQLTest}})
* A new configurable timeout for transaction coordination
* CQL-based simulator support (although as of this writing it can be finicky, depending on the number of threads the run uses)
* Support for executing transactions at {{NODE_LOCAL}} consistency level (which doesn't really matter, outside of the fact I wanted to make sure {{TransactionStatement#executeLocally()}} worked)

*Not in this Patch (...and to be dealt with in a follow-up issue, which will also have more detail on these items)*

* Useful error messages for every validation failure
* Complete {{estimatedSizeOnHeap()}} implementations for all the {{Txn*}} classes
* Determining whether there's a way leverage the {{Selector}} hierarchy at all to squeeze out a bit more reuse around {{LET}} references we return in {{SELECT}}.
* An attempt to make the {{Bound}} hierarchy serializable, prepare {{Bound}} subtypes from transaction statement conditions, and use them directly for predicate evaluation.
* Partial state replication for transaction conditions.
* Support for returning the result of IF statements to the client.
* Constant terms on the RHS of LET statements (ex. {{LET a = ?}} and bind a scalar value)
* LET references on the RHS of condition predicates
* Support for {{CONTAINS}}, {{CONTAINS KEY}}, and {{IN}} operators in condition predicates
* LET references to nested UDTs
* Support for mixed conditional and unconditional updates (although this might not even be something we need in a first release of Accord)
* Latency metrics for transaction execution at the coordinator
* Reasonable JavaDoc for the major new classes and concepts (although this is something that I'd feel better adding if we're past initial review and major changes aren't very likely)
* Updates to the user-facing CQL docs (which would include bumping the CQL language version)
* A definitive answer for how transactions will interact w/ TTLs
* Validation/testing to ensure we do not execute transactions w/ potentially harmful Guardrails
* New transaction-specific tracing support

> CEP-15: Multi-Partition Transaction CQL Support (Alpha)
> -------------------------------------------------------
>
>                 Key: CASSANDRA-17719
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17719
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Accord, CQL/Syntax
>            Reporter: Blake Eggleston
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: NA
>
>
> The dev list thread regarding CQL transaction support seems to have converged on a syntax.
>  
> The thread is here: [https://lists.apache.org/thread/5sds3968mnnk42c24pvgwphg4qvo2xk0]
>  
> The message proposing the syntax ultimately agreed on is here: [https://lists.apache.org/thread/y289tczngj68bqpoo7gkso3bzmtf86pl]
>  
> I'll describe my understanding of  the agreed syntax here for, but I'd suggest reading through the thread.
>  
> The example query is this:
> {code:sql}
> BEGIN TRANSACTION
>   LET car = (SELECT miles_driven, is_running FROM cars WHERE model=’pinto’);
>   LET user = (SELECT miles_driven FROM users WHERE name=’Blake’);
>   SELECT car.is_running, car.miles_driven;
>   IF car.is_running THEN
>     UPDATE users SET miles_driven = user.miles_driven + 30 WHERE name='blake';
>     UPDATE cars SET miles_driven = car.miles_driven + 30 WHERE model='pinto';
>   END IF
> COMMIT TRANSACTION
> {code}
> Sections are described below, and we want to require the statement enforces an order on the different types of clauses. First, assignments, then select(s), then conditional updates. This may be relaxed in the future, but is meant to prevent users from interleaving updates and assignments and expecting read your own write behavior that we don't want to support in v1. 
> h3. Reference assignments
> {code:sql}
>   LET car = (SELECT miles_driven, is_running FROM cars WHERE model=’pinto’){code}
>  
> The first part is basically assigning the result of a SELECT statement to a named reference that can be used in updates/conditions and be returned to the user. Tuple names belong to a global scope and must not clash with other LET statements or update statements (more on that in the updates section). Also, the select statement must either be a point read, with all primary key columns defined, or explicitly set a limit of 1. 
> h3. Selection
> Data to returned to client. Currently, we're only supporting a single select statement. Either a normal select statement, or one returning values assigned by LET statements as shown in the example. Ultimately we'll want to support multiple select statements and returning the results to the client. Although that will require a protocol change.
> h3. Updates
> Normal inserts/updates/deletes with the following extensions:
>  * Inserts and updates can reference values assigned earlier in the statement
>  * Updates can reference their own columns:
> {code:java}
> miles_driven = miles_driven + 30{code}
>  - or -
> {code:java}
> miles_driven += 30{code}
> These will of course need to generate any required reads behind the scenes. There's no precedence of table vs reference names, so if a relevant column name and reference name conflict, the query needs to fail to parse.
> h3. If blocks 
> {code:sql}
>   IF <some conditions> THEN
>     <some update>;
>     <another update>;
>   END IF
> {code}
>  
> For v1, we only need to support a single condition in the format above. In the future, we'd like to support multiple branches with multiple conditions like:
>  
> {code:sql}
>   IF <some conditions> THEN
>     <some update>;
>     <another update>;
>   ELSE IF <another condition> THEN
>     <yet another update>;
>   ELSE
>     <an update>;
>   END IF
> {code}
>  
> h3. Conditions
> Comparisons of value references to literals or other references. IS NULL / IS NOT NULL also needs to be supported. Multiple comparisons need to be supported, but v1 only needs to support AND'ing them together.
> {code:java}
> Supported operators: =, !=, >, >=, <, <=, IS NULL, IS NOT NULL
> <value_reference> = 5
> <value_reference> IS NOT NULL
> IF car IS NOT NULL AND car.miles_driven > 30
> IF car.miles_driven = user.miles_driven{code}
> (Note that {{IS[ NOT ]NULL}} can apply to both tuple references and individually dereferenced columns.)
> CONTAINS and CONTAINS KEY require indexed collections, and will not be supported in v1.
> h3. Implementation notes
> I have a proof of concept I'd created to demo the initially proposed syntax here: [https://github.com/bdeggleston/cassandra/tree/accord-cql-poc],  It could be used as a starting point, a source of ideas for approaches, or not used at all. The main thing to keep in mind is that value references prevent creating read commands and mutations until later in the transaction process, and potentially on another machine, which means we can't create accord transactions with fully formed mutations and read commands. CQL statements and terms are also not serializable, and would require a ton of work to properly serialize, so there will need to be some intermediate stage that can be serialized.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org