Posted to commits@cassandra.apache.org by "Vincent White (JIRA)" <ji...@apache.org> on 2017/01/25 07:59:26 UTC

[jira] [Commented] (CASSANDRA-13127) Materialized Views: View row expires too soon

    [ https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837348#comment-15837348 ] 

Vincent White commented on CASSANDRA-13127:
-------------------------------------------

I think I pretty much understand what's happening here. It basically all stems from the base upsert behaviour (creating a row via {{UPDATE}}, so the primary key columns don't exist on their own, vs {{INSERT}}). I'm still not sure it matches the MV docs though, and the comments in the code say things like:
{code}1) either the columns for the base and view PKs are exactly the same: in that case, the view entry should live as long as the base row lives. This means the view entry should only expire once *everything* in the base row has expired. Which means the row TTL should be the max of any other TTL.{code} I think the logic in {{computeLivenessInfoForEntry}} doesn't make sense for updates because it only ever expected inserts. It leads to some funky behaviour if you're mixing updates, inserts and TTLs. I didn't test with deletes but I guess they could cause similar results.
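To make the difference between comparing TTLs and comparing expiration times concrete, here is a minimal standalone sketch using the timings from the issue description (INSERT with TTL 10 at t=0, UPDATE of {{v}} with TTL 8 at t=6). The names {{LivenessSketch}}, {{Expiring}}, {{writtenAt}}, {{viewExpirationByTtl}} and {{viewExpirationByExpirationTime}} are hypothetical stand-ins for illustration only; this is not Cassandra's {{LivenessInfo}} or {{Cell}} API, and it only models the comparison, not the rest of the view update path.

```java
import java.util.List;

class LivenessSketch {
    /** Hypothetical stand-in for a TTL'd write: its TTL and its absolute expiration time, in seconds. */
    static final class Expiring {
        final int ttl;
        final int localExpirationTime;

        Expiring(int ttl, int localExpirationTime) {
            this.ttl = ttl;
            this.localExpirationTime = localExpirationTime;
        }

        /** A write made at time nowInSec with the given TTL expires at nowInSec + ttl. */
        static Expiring writtenAt(int nowInSec, int ttl) {
            return new Expiring(ttl, nowInSec + ttl);
        }
    }

    /** Models the TTL comparison: a cell only wins if its TTL is larger than the current one. */
    static int viewExpirationByTtl(Expiring rowLiveness, List<Expiring> cells) {
        int ttl = rowLiveness.ttl;
        int expirationTime = rowLiveness.localExpirationTime;
        for (Expiring cell : cells) {
            if (cell.ttl > ttl) {
                ttl = cell.ttl;
                expirationTime = cell.localExpirationTime;
            }
        }
        return expirationTime;
    }

    /** Models the expiration-time comparison: a cell wins if it expires later. */
    static int viewExpirationByExpirationTime(Expiring rowLiveness, List<Expiring> cells) {
        int expirationTime = rowLiveness.localExpirationTime;
        for (Expiring cell : cells) {
            if (cell.localExpirationTime > expirationTime) {
                expirationTime = cell.localExpirationTime;
            }
        }
        return expirationTime;
    }

    public static void main(String[] args) {
        Expiring row = Expiring.writtenAt(0, 10); // INSERT ... USING TTL 10 at t=0
        Expiring v = Expiring.writtenAt(6, 8);    // UPDATE ... USING TTL 8 SET v = 0 at t=6

        // TTL 8 is not greater than TTL 10, so the row's expiration (t=10) is kept.
        System.out.println("compare TTLs: view entry expires at t="
                + viewExpirationByTtl(row, List.of(v)));
        // The cell expires at t=14, which is later than t=10, so it wins.
        System.out.println("compare expiration times: view entry expires at t="
                + viewExpirationByExpirationTime(row, List.of(v)));
    }
}
```

Under these assumptions the TTL comparison yields t=10 while the base cell {{v}} lives until t=14, which matches the reported symptom of the view row disappearing while the base row is still live.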

Simply patching computeLivenessInfoForEntry like:
{code:title=ViewUpdateGenerator.java#computeLivenessInfoForEntry}
            int expirationTime = baseLiveness.localExpirationTime();
            for (Cell cell : baseRow.cells())
            {

-                if (cell.ttl() > ttl)
+                if (cell.localDeletionTime() > expirationTime)
                {
                    ttl = cell.ttl();
                    expirationTime = cell.localDeletionTime();
                }
            }
-            return ttl == baseLiveness.ttl()
+            return expirationTime == baseLiveness.localExpirationTime()
                 ? baseLiveness
                 : LivenessInfo.withExpirationTime(baseLiveness.timestamp(), ttl, expirationTime);
        }
{code} isn't enough because it leads to further unexpected behaviour where update statements will resurrect previously TTL'd MV entries in some cases. If an update statement sets a column that could cause the update of _any_ view in that keyspace, it will resurrect entries in views that have PKs made up of only columns from the base PK, regardless of whether the statement updates non-PK columns in that view. If the update statement only sets values of columns that don't appear in the keyspace's MVs, then no TTL'd MV entries for that PK will be resurrected. If there was never an entry in the MV for that MV PK, then it won't create a new one. This is because upserts don't create new MV entries unless they set the value of a non-PK column in that view (with or without this patch).

I don't think I've seen it referenced anywhere, but is that the intended behaviour when using upserts and materialized views? That an {{UPDATE}} to a column not in a view will not create an entry in an MV if the view's PK is only made up of columns from the base table's PK, but the matching {{INSERT}} statement will?

> Materialized Views: View row expires too soon
> ---------------------------------------------
>
>                 Key: CASSANDRA-13127
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Duarte Nunes
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry the TTLs are compared instead of the expiration times, but I'm not sure I'm getting that far ahead in the code when updating a column that's not in the view.


