You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cassandra.apache.org by Caleb Rackliffe <ca...@gmail.com> on 2022/07/22 16:21:06 UTC

Re: CEP-15 multi key transaction syntax

Avi brought up an interesting point around NULLness checking in
CASSANDRA-17762 <https://issues.apache.org/jira/browse/CASSANDRA-17762>...

In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in
> a condition. To test for NULLness, you use IS NULL or IS NOT NULL. But LWT
> uses IF col = NULL as a NULLness test. This is likely to confuse people
> coming from SQL and hamper attempts to extend the dialect.


We can leave that Jira open to address what to do in the legacy LWT case,
but I'd support a SQL-congruent syntax here (IS NULL or IS NOT NULL), where
we have something closer to a blank slate.

Thoughts?

On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <ab...@aber.io> wrote:

> The new syntax looks great, and I’m really excited to see this coming
> together.
>
> One piece of feedback on the proposed syntax is around the use of “=“ as a
> declaration in addition to its current use as an equality operator in a
> WHERE clause and an assignment operator in an UPDATE:
>
> BEGIN TRANSACTION
>   LET car_miles = miles_driven, car_is_running = is_running FROM cars
> WHERE model=’pinto’
>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>   SELECT something else from some other table
>   IF NOT car_is_running THEN ABORT
>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
> COMMIT TRANSACTION
>
> This is supported in languages like PL/pgSQL, but in a normal SQL query
> kind of local declaration is often expressed as an alias (SELECT col AS
> new_col), subquery alias (SELECT col) t, or common table expression (WITH t
> AS (SELECT col)).
>
> Here’s an example of an alternative to the proposed syntax that I’d find
> more readable:
>
> BEGIN TRANSACTION
>   WITH car_miles, car_is_running AS (SELECT miles_driven, is_running FROM
> cars WHERE model=’pinto’),
>   user_miles AS (SELECT miles_driven FROM users WHERE name=’blake’)
>   IF NOT car_is_running THEN ABORT
>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
> COMMIT TRANSACTION
>
> There’s also the option of naming the transaction like a subquery, and
> supporting LET via AS (this one I’m less sure about but wanted to propose
> anyway):
>
> BEGIN TRANSACTION t1
>   SELECT miles_driven AS t1.car_miles, is_running AS t1.car_is_running
> FROM cars WHERE model=’pinto’;
>   SELECT miles_driven AS t1.user_miles FROM users WHERE name=’blake’;
>   IF NOT car_is_running THEN ABORT
>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
> COMMIT TRANSACTION
>
> This also has the benefit of resolving ambiguity in case of naming
> conflicts with existing (or future) column names.
>
> --
> Abe
>

Re: CEP-15 multi key transaction syntax

Posted by Patrick McFadin <pm...@gmail.com>.
I'm really happy to see CEP-15 getting closer to a final implementation.
I'm going to walk through my reasoning for your proposals wrt trying to
explain this to somebody new.

Looking at all the options, the first thing that comes up for me is the
Cassandra project's complicated relationship with NULL.  We have prior art
with EXISTS/NOT EXISTS when creating new tables. IS NULL/IS NOT NULL is
used in materialized views similarly to proposals 2,4 and 5.

CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name
  AS SELECT [ (column_list) ]
  FROM [keyspace_name.]table_name
  [ WHERE column_name IS NOT NULL
  [ AND column_name IS NOT NULL ... ] ]
  [ AND relation [ AND ... ] ]
  PRIMARY KEY ( column_list )
  [ WITH [ table_properties ]
  [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] ] ;

 Based on that, I believe 1 and 3 would just confuse users, so -1 on those.

Trying to explain the difference between row and column operations with
LET, I can't see the difference between a row and column in #2.

#4 introduces a boolean instead of column names and just adds more syntax.

#5 is verbose and, in my opinion, easier to reason when writing a query.
Thinking top down, I need to know if these exact rows and/or column values
exist before changing them, so I'll define them first. Then I'll iterate
over the state I created in my actual changes so I know I'm changing
precisely what I want.

#5 could use a bit more to be clearer to somebody who doesn't write CQL
queries daily and wouldn't require memorizing subtle differences. It should
be similar to all the other syntax, so learning a little about CQL will let
you move into more without completely re-learning the new syntax.

So I propose #6)
BEGIN TRANSACTION
  LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all
columns
  LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0;
  SELECT row1, row2
  IF row1 IS NULL AND row2.v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

I added the SELECT in the LET just so it's straightforward, you are
reading, and it's just like doing a regular select, but you are assigning
it to a variable.

I removed the confusing 'row1.v' and replaced it with 'row1' I can't see
why you would need the '.v' vs having the complete variable I created in
the statement above.

EOL

Patrick

On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe <ca...@gmail.com>
wrote:

> ...and one more option...
>
> 5.) Introduce tuple assignments, removing all ambiguity around row vs.
> column operations.
>
> BEGIN TRANSACTION
>   LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all columns
>   LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0;
>   SELECT row1.v, row2.v
>   IF row1 IS NULL AND row2.v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
>
>
> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe <ca...@gmail.com>
> wrote:
>
>> via Benedict, here is a 4th option:
>>
>> 4.) Similar to #2, but don't rely on the key element being NULL.
>>
>> If the read returns no result, x effectively becomes NULL. Otherwise, it
>> remains true/NOT NULL.
>>
>> BEGIN TRANSACTION
>>   LET x = true FROM ks.tbl WHERE k=0 AND c=0;
>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>   SELECT x, row2_v
>>   IF x IS NULL AND row2_v = 3 THEN
>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>   END IF
>> COMMIT TRANSACTION
>>
>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe <
>> calebrackliffe@gmail.com> wrote:
>>
>>> Hello again everyone!
>>>
>>> I've been working on a prototype
>>> <https://issues.apache.org/jira/browse/CASSANDRA-17719> in
>>> CASSANDRA-17719 for a grammar that roughly corresponds to what we've agreed
>>> on in this thread. One thing that isn't immediately obvious to me is how
>>> the LET syntax handles cases where we want to check for the plain existence
>>> of a row in IF. For example, in this hybrid of the originally proposed
>>> syntax and something more like what we've agreed on (and the RETURNING just
>>> to distinguish between that and SELECT), this could be pretty
>>> straightforward:
>>>
>>> BEGIN TRANSACTION
>>>   SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1;
>>>   SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2;
>>>   RETURNING row1.v, row2.v
>>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>   END IF
>>> COMMIT TRANSACTION
>>>
>>> The NOT EXISTS operator has row1 to work with. One the other hand, w/
>>> the LET syntax and no naming of reads, it's not clear what the best
>>> solution would be. Here are a few possibilities:
>>>
>>> 1.) Provide a few built-in functions that operate on a whole result row.
>>> If we assume a SQL style IS NULL and IS NOT NULL (see my last post here)
>>> for operations on particular columns, this probably eliminates the need for
>>> EXISTS/NOT EXISTS as well.
>>>
>>> BEGIN TRANSACTION
>>>   LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0;
>>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>>   SELECT row1_missing, row2_v
>>>   IF row1_missing AND row2_v = 3 THEN
>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>   END IF
>>> COMMIT TRANSACTION
>>>
>>> 2.) Assign and check the first primary key element to determine whether
>>> the row exists.
>>>
>>> BEGIN TRANSACTION
>>>   LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0;
>>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>>   SELECT row1_k, row2_v
>>>   IF row1_k IS NULL AND row2_v = 3 THEN
>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>   END IF
>>> COMMIT TRANSACTION
>>>
>>> 3.) Reconsider the LET concept toward something that allows us to
>>> explicitly name our reads again.
>>>
>>> BEGIN TRANSACTION
>>>   WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1;
>>>   WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2;
>>>   SELECT row1.v, row2.v
>>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>   END IF
>>> COMMIT TRANSACTION
>>>
>>> I don't have a strong affinity for any of these, although #1 seems the
>>> most awkward.
>>>
>>> Does anyone have any other alternatives? Preference for one of the above
>>> options?
>>>
>>> Thanks!
>>>
>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe <
>>> calebrackliffe@gmail.com> wrote:
>>>
>>>> Avi brought up an interesting point around NULLness checking in
>>>> CASSANDRA-17762 <https://issues.apache.org/jira/browse/CASSANDRA-17762>
>>>> ...
>>>>
>>>> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE
>>>>> in a condition. To test for NULLness, you use IS NULL or IS NOT NULL. But
>>>>> LWT uses IF col = NULL as a NULLness test. This is likely to confuse people
>>>>> coming from SQL and hamper attempts to extend the dialect.
>>>>
>>>>
>>>> We can leave that Jira open to address what to do in the legacy LWT
>>>> case, but I'd support a SQL-congruent syntax here (IS NULL or IS NOT
>>>> NULL), where we have something closer to a blank slate.
>>>>
>>>> Thoughts?
>>>>
>>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <ab...@aber.io> wrote:
>>>>
>>>>> The new syntax looks great, and I’m really excited to see this coming
>>>>> together.
>>>>>
>>>>> One piece of feedback on the proposed syntax is around the use of “=“
>>>>> as a declaration in addition to its current use as an equality operator in
>>>>> a WHERE clause and an assignment operator in an UPDATE:
>>>>>
>>>>> BEGIN TRANSACTION
>>>>>   LET car_miles = miles_driven, car_is_running = is_running FROM cars
>>>>> WHERE model=’pinto’
>>>>>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>>>>>   SELECT something else from some other table
>>>>>   IF NOT car_is_running THEN ABORT
>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>>> COMMIT TRANSACTION
>>>>>
>>>>> This is supported in languages like PL/pgSQL, but in a normal SQL
>>>>> query kind of local declaration is often expressed as an alias (SELECT col
>>>>> AS new_col), subquery alias (SELECT col) t, or common table expression
>>>>> (WITH t AS (SELECT col)).
>>>>>
>>>>> Here’s an example of an alternative to the proposed syntax that I’d
>>>>> find more readable:
>>>>>
>>>>> BEGIN TRANSACTION
>>>>>   WITH car_miles, car_is_running AS (SELECT miles_driven, is_running
>>>>> FROM cars WHERE model=’pinto’),
>>>>>   user_miles AS (SELECT miles_driven FROM users WHERE name=’blake’)
>>>>>   IF NOT car_is_running THEN ABORT
>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>>> COMMIT TRANSACTION
>>>>>
>>>>> There’s also the option of naming the transaction like a subquery, and
>>>>> supporting LET via AS (this one I’m less sure about but wanted to propose
>>>>> anyway):
>>>>>
>>>>> BEGIN TRANSACTION t1
>>>>>   SELECT miles_driven AS t1.car_miles, is_running AS t1.car_is_running
>>>>> FROM cars WHERE model=’pinto’;
>>>>>   SELECT miles_driven AS t1.user_miles FROM users WHERE name=’blake’;
>>>>>   IF NOT car_is_running THEN ABORT
>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>>> COMMIT TRANSACTION
>>>>>
>>>>> This also has the benefit of resolving ambiguity in case of naming
>>>>> conflicts with existing (or future) column names.
>>>>>
>>>>> --
>>>>> Abe
>>>>>
>>>>

Re: CEP-15 multi key transaction syntax

Posted by Caleb Rackliffe <ca...@gmail.com>.
...and one more option...

5.) Introduce tuple assignments, removing all ambiguity around row vs.
column operations.

BEGIN TRANSACTION
  LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all columns
  LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0;
  SELECT row1.v, row2.v
  IF row1 IS NULL AND row2.v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION



On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe <ca...@gmail.com>
wrote:

> via Benedict, here is a 4th option:
>
> 4.) Similar to #2, but don't rely on the key element being NULL.
>
> If the read returns no result, x effectively becomes NULL. Otherwise, it
> remains true/NOT NULL.
>
> BEGIN TRANSACTION
>   LET x = true FROM ks.tbl WHERE k=0 AND c=0;
>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>   SELECT x, row2_v
>   IF x IS NULL AND row2_v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe <ca...@gmail.com>
> wrote:
>
>> Hello again everyone!
>>
>> I've been working on a prototype
>> <https://issues.apache.org/jira/browse/CASSANDRA-17719> in
>> CASSANDRA-17719 for a grammar that roughly corresponds to what we've agreed
>> on in this thread. One thing that isn't immediately obvious to me is how
>> the LET syntax handles cases where we want to check for the plain existence
>> of a row in IF. For example, in this hybrid of the originally proposed
>> syntax and something more like what we've agreed on (and the RETURNING just
>> to distinguish between that and SELECT), this could be pretty
>> straightforward:
>>
>> BEGIN TRANSACTION
>>   SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1;
>>   SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2;
>>   RETURNING row1.v, row2.v
>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>   END IF
>> COMMIT TRANSACTION
>>
>> The NOT EXISTS operator has row1 to work with. One the other hand, w/ the
>> LET syntax and no naming of reads, it's not clear what the best solution
>> would be. Here are a few possibilities:
>>
>> 1.) Provide a few built-in functions that operate on a whole result row.
>> If we assume a SQL style IS NULL and IS NOT NULL (see my last post here)
>> for operations on particular columns, this probably eliminates the need for
>> EXISTS/NOT EXISTS as well.
>>
>> BEGIN TRANSACTION
>>   LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0;
>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>   SELECT row1_missing, row2_v
>>   IF row1_missing AND row2_v = 3 THEN
>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>   END IF
>> COMMIT TRANSACTION
>>
>> 2.) Assign and check the first primary key element to determine whether
>> the row exists.
>>
>> BEGIN TRANSACTION
>>   LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0;
>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>   SELECT row1_k, row2_v
>>   IF row1_k IS NULL AND row2_v = 3 THEN
>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>   END IF
>> COMMIT TRANSACTION
>>
>> 3.) Reconsider the LET concept toward something that allows us to
>> explicitly name our reads again.
>>
>> BEGIN TRANSACTION
>>   WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1;
>>   WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2;
>>   SELECT row1.v, row2.v
>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>   END IF
>> COMMIT TRANSACTION
>>
>> I don't have a strong affinity for any of these, although #1 seems the
>> most awkward.
>>
>> Does anyone have any other alternatives? Preference for one of the above
>> options?
>>
>> Thanks!
>>
>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe <
>> calebrackliffe@gmail.com> wrote:
>>
>>> Avi brought up an interesting point around NULLness checking in
>>> CASSANDRA-17762 <https://issues.apache.org/jira/browse/CASSANDRA-17762>
>>> ...
>>>
>>> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE
>>>> in a condition. To test for NULLness, you use IS NULL or IS NOT NULL. But
>>>> LWT uses IF col = NULL as a NULLness test. This is likely to confuse people
>>>> coming from SQL and hamper attempts to extend the dialect.
>>>
>>>
>>> We can leave that Jira open to address what to do in the legacy LWT
>>> case, but I'd support a SQL-congruent syntax here (IS NULL or IS NOT
>>> NULL), where we have something closer to a blank slate.
>>>
>>> Thoughts?
>>>
>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <ab...@aber.io> wrote:
>>>
>>>> The new syntax looks great, and I’m really excited to see this coming
>>>> together.
>>>>
>>>> One piece of feedback on the proposed syntax is around the use of “=“
>>>> as a declaration in addition to its current use as an equality operator in
>>>> a WHERE clause and an assignment operator in an UPDATE:
>>>>
>>>> BEGIN TRANSACTION
>>>>   LET car_miles = miles_driven, car_is_running = is_running FROM cars
>>>> WHERE model=’pinto’
>>>>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>>>>   SELECT something else from some other table
>>>>   IF NOT car_is_running THEN ABORT
>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>> COMMIT TRANSACTION
>>>>
>>>> This is supported in languages like PL/pgSQL, but in a normal SQL query
>>>> kind of local declaration is often expressed as an alias (SELECT col AS
>>>> new_col), subquery alias (SELECT col) t, or common table expression (WITH t
>>>> AS (SELECT col)).
>>>>
>>>> Here’s an example of an alternative to the proposed syntax that I’d
>>>> find more readable:
>>>>
>>>> BEGIN TRANSACTION
>>>>   WITH car_miles, car_is_running AS (SELECT miles_driven, is_running
>>>> FROM cars WHERE model=’pinto’),
>>>>   user_miles AS (SELECT miles_driven FROM users WHERE name=’blake’)
>>>>   IF NOT car_is_running THEN ABORT
>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>> COMMIT TRANSACTION
>>>>
>>>> There’s also the option of naming the transaction like a subquery, and
>>>> supporting LET via AS (this one I’m less sure about but wanted to propose
>>>> anyway):
>>>>
>>>> BEGIN TRANSACTION t1
>>>>   SELECT miles_driven AS t1.car_miles, is_running AS t1.car_is_running
>>>> FROM cars WHERE model=’pinto’;
>>>>   SELECT miles_driven AS t1.user_miles FROM users WHERE name=’blake’;
>>>>   IF NOT car_is_running THEN ABORT
>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>>> COMMIT TRANSACTION
>>>>
>>>> This also has the benefit of resolving ambiguity in case of naming
>>>> conflicts with existing (or future) column names.
>>>>
>>>> --
>>>> Abe
>>>>
>>>

Re: CEP-15 multi key transaction syntax

Posted by Caleb Rackliffe <ca...@gmail.com>.
via Benedict, here is a 4th option:

4.) Similar to #2, but don't rely on the key element being NULL.

If the read returns no result, x effectively becomes NULL. Otherwise, it
remains true/NOT NULL.

BEGIN TRANSACTION
  LET x = true FROM ks.tbl WHERE k=0 AND c=0;
  LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
  SELECT x, row2_v
  IF x IS NULL AND row2_v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe <ca...@gmail.com>
wrote:

> Hello again everyone!
>
> I've been working on a prototype
> <https://issues.apache.org/jira/browse/CASSANDRA-17719> in
> CASSANDRA-17719 for a grammar that roughly corresponds to what we've agreed
> on in this thread. One thing that isn't immediately obvious to me is how
> the LET syntax handles cases where we want to check for the plain existence
> of a row in IF. For example, in this hybrid of the originally proposed
> syntax and something more like what we've agreed on (and the RETURNING just
> to distinguish between that and SELECT), this could be pretty
> straightforward:
>
> BEGIN TRANSACTION
>   SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1;
>   SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2;
>   RETURNING row1.v, row2.v
>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
> The NOT EXISTS operator has row1 to work with. One the other hand, w/ the
> LET syntax and no naming of reads, it's not clear what the best solution
> would be. Here are a few possibilities:
>
> 1.) Provide a few built-in functions that operate on a whole result row.
> If we assume a SQL style IS NULL and IS NOT NULL (see my last post here)
> for operations on particular columns, this probably eliminates the need for
> EXISTS/NOT EXISTS as well.
>
> BEGIN TRANSACTION
>   LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0;
>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>   SELECT row1_missing, row2_v
>   IF row1_missing AND row2_v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
> 2.) Assign and check the first primary key element to determine whether
> the row exists.
>
> BEGIN TRANSACTION
>   LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0;
>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>   SELECT row1_k, row2_v
>   IF row1_k IS NULL AND row2_v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
> 3.) Reconsider the LET concept toward something that allows us to
> explicitly name our reads again.
>
> BEGIN TRANSACTION
>   WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1;
>   WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2;
>   SELECT row1.v, row2.v
>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>   END IF
> COMMIT TRANSACTION
>
> I don't have a strong affinity for any of these, although #1 seems the
> most awkward.
>
> Does anyone have any other alternatives? Preference for one of the above
> options?
>
> Thanks!
>
> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe <ca...@gmail.com>
> wrote:
>
>> Avi brought up an interesting point around NULLness checking in
>> CASSANDRA-17762 <https://issues.apache.org/jira/browse/CASSANDRA-17762>
>> ...
>>
>> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE
>>> in a condition. To test for NULLness, you use IS NULL or IS NOT NULL. But
>>> LWT uses IF col = NULL as a NULLness test. This is likely to confuse people
>>> coming from SQL and hamper attempts to extend the dialect.
>>
>>
>> We can leave that Jira open to address what to do in the legacy LWT case,
>> but I'd support a SQL-congruent syntax here (IS NULL or IS NOT NULL),
>> where we have something closer to a blank slate.
>>
>> Thoughts?
>>
>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <ab...@aber.io> wrote:
>>
>>> The new syntax looks great, and I’m really excited to see this coming
>>> together.
>>>
>>> One piece of feedback on the proposed syntax is around the use of “=“ as
>>> a declaration in addition to its current use as an equality operator in a
>>> WHERE clause and an assignment operator in an UPDATE:
>>>
>>> BEGIN TRANSACTION
>>>   LET car_miles = miles_driven, car_is_running = is_running FROM cars
>>> WHERE model=’pinto’
>>>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>>>   SELECT something else from some other table
>>>   IF NOT car_is_running THEN ABORT
>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>> COMMIT TRANSACTION
>>>
>>> This is supported in languages like PL/pgSQL, but in a normal SQL query
>>> kind of local declaration is often expressed as an alias (SELECT col AS
>>> new_col), subquery alias (SELECT col) t, or common table expression (WITH t
>>> AS (SELECT col)).
>>>
>>> Here’s an example of an alternative to the proposed syntax that I’d find
>>> more readable:
>>>
>>> BEGIN TRANSACTION
>>>   WITH car_miles, car_is_running AS (SELECT miles_driven, is_running
>>> FROM cars WHERE model=’pinto’),
>>>   user_miles AS (SELECT miles_driven FROM users WHERE name=’blake’)
>>>   IF NOT car_is_running THEN ABORT
>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>> COMMIT TRANSACTION
>>>
>>> There’s also the option of naming the transaction like a subquery, and
>>> supporting LET via AS (this one I’m less sure about but wanted to propose
>>> anyway):
>>>
>>> BEGIN TRANSACTION t1
>>>   SELECT miles_driven AS t1.car_miles, is_running AS t1.car_is_running
>>> FROM cars WHERE model=’pinto’;
>>>   SELECT miles_driven AS t1.user_miles FROM users WHERE name=’blake’;
>>>   IF NOT car_is_running THEN ABORT
>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>>> COMMIT TRANSACTION
>>>
>>> This also has the benefit of resolving ambiguity in case of naming
>>> conflicts with existing (or future) column names.
>>>
>>> --
>>> Abe
>>>
>>

Re: CEP-15 multi key transaction syntax

Posted by Caleb Rackliffe <ca...@gmail.com>.
Hello again everyone!

I've been working on a prototype
<https://issues.apache.org/jira/browse/CASSANDRA-17719> in CASSANDRA-17719
for a grammar that roughly corresponds to what we've agreed on in this
thread. One thing that isn't immediately obvious to me is how the LET
syntax handles cases where we want to check for the plain existence of a
row in IF. For example, in this hybrid of the originally proposed syntax
and something more like what we've agreed on (and the RETURNING just to
distinguish between that and SELECT), this could be pretty straightforward:

BEGIN TRANSACTION
  SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1;
  SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2;
  RETURNING row1.v, row2.v
  IF row1 NOT EXISTS AND row2.v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

The NOT EXISTS operator has row1 to work with. One the other hand, w/ the
LET syntax and no naming of reads, it's not clear what the best solution
would be. Here are a few possibilities:

1.) Provide a few built-in functions that operate on a whole result row. If
we assume a SQL style IS NULL and IS NOT NULL (see my last post here) for
operations on particular columns, this probably eliminates the need for
EXISTS/NOT EXISTS as well.

BEGIN TRANSACTION
  LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0;
  LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
  SELECT row1_missing, row2_v
  IF row1_missing AND row2_v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

2.) Assign and check the first primary key element to determine whether the
row exists.

BEGIN TRANSACTION
  LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0;
  LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
  SELECT row1_k, row2_v
  IF row1_k IS NULL AND row2_v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

3.) Reconsider the LET concept toward something that allows us to
explicitly name our reads again.

BEGIN TRANSACTION
  WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1;
  WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2;
  SELECT row1.v, row2.v
  IF row1 NOT EXISTS AND row2.v = 3 THEN
    INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
  END IF
COMMIT TRANSACTION

I don't have a strong affinity for any of these, although #1 seems the most
awkward.

Does anyone have any other alternatives? Preference for one of the above
options?

Thanks!

On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe <ca...@gmail.com>
wrote:

> Avi brought up an interesting point around NULLness checking in
> CASSANDRA-17762 <https://issues.apache.org/jira/browse/CASSANDRA-17762>...
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in
>> a condition. To test for NULLness, you use IS NULL or IS NOT NULL. But LWT
>> uses IF col = NULL as a NULLness test. This is likely to confuse people
>> coming from SQL and hamper attempts to extend the dialect.
>
>
> We can leave that Jira open to address what to do in the legacy LWT case,
> but I'd support a SQL-congruent syntax here (IS NULL or IS NOT NULL),
> where we have something closer to a blank slate.
>
> Thoughts?
>
> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <ab...@aber.io> wrote:
>
>> The new syntax looks great, and I’m really excited to see this coming
>> together.
>>
>> One piece of feedback on the proposed syntax is around the use of “=“ as
>> a declaration in addition to its current use as an equality operator in a
>> WHERE clause and an assignment operator in an UPDATE:
>>
>> BEGIN TRANSACTION
>>   LET car_miles = miles_driven, car_is_running = is_running FROM cars
>> WHERE model=’pinto’
>>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>>   SELECT something else from some other table
>>   IF NOT car_is_running THEN ABORT
>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>> COMMIT TRANSACTION
>>
>> This is supported in languages like PL/pgSQL, but in a normal SQL query
>> kind of local declaration is often expressed as an alias (SELECT col AS
>> new_col), subquery alias (SELECT col) t, or common table expression (WITH t
>> AS (SELECT col)).
>>
>> Here’s an example of an alternative to the proposed syntax that I’d find
>> more readable:
>>
>> BEGIN TRANSACTION
>>   WITH car_miles, car_is_running AS (SELECT miles_driven, is_running FROM
>> cars WHERE model=’pinto’),
>>   user_miles AS (SELECT miles_driven FROM users WHERE name=’blake’)
>>   IF NOT car_is_running THEN ABORT
>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>> COMMIT TRANSACTION
>>
>> There’s also the option of naming the transaction like a subquery, and
>> supporting LET via AS (this one I’m less sure about but wanted to propose
>> anyway):
>>
>> BEGIN TRANSACTION t1
>>   SELECT miles_driven AS t1.car_miles, is_running AS t1.car_is_running
>> FROM cars WHERE model=’pinto’;
>>   SELECT miles_driven AS t1.user_miles FROM users WHERE name=’blake’;
>>   IF NOT car_is_running THEN ABORT
>>   UPDATE users SET miles_driven = user_miles + 30 WHERE name='blake';
>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE model='pinto';
>> COMMIT TRANSACTION
>>
>> This also has the benefit of resolving ambiguity in case of naming
>> conflicts with existing (or future) column names.
>>
>> --
>> Abe
>>
>