You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2023/01/14 09:50:24 UTC

[GitHub] [cassandra] belliottsmith commented on a diff in pull request #2075: (Accord) Integrate accord-core range transaction support (but do not support them within C* yet)

belliottsmith commented on code in PR #2075:
URL: https://github.com/apache/cassandra/pull/2075#discussion_r1070243359


##########
src/java/org/apache/cassandra/service/accord/txn/TxnNamedRead.java:
##########
@@ -114,8 +117,13 @@ public PartitionKey key()
     public Future<Data> read(boolean isForWriteTxn, SafeCommandStore safeStore, Timestamp executeAt)
     {
         SinglePartitionReadCommand command = (SinglePartitionReadCommand) get();
-        AccordCommandsForKey cfk = (AccordCommandsForKey) safeStore.commandsForKey(key);
-        int nowInSeconds = cfk.nowInSecondsFor(executeAt, isForWriteTxn);
+        // TODO (required, safety): before release, double check reasoning that this is safe
+//        AccordCommandsForKey cfk = ((SafeAccordCommandStore)safeStore).commandsForKey(key);
+//        int nowInSeconds = cfk.nowInSecondsFor(executeAt, isForWriteTxn);
+        // It's fine for our nowInSeconds to lag slightly our insertion timestamp, as to the user
+        // this simply looks like the transaction witnessed TTL'd data and the data then expired
+        // immediately after the transaction executed, and this simplifies things a great deal
+        int nowInSeconds = (int) TimeUnit.MICROSECONDS.toSeconds(executeAt.hlc());

Review Comment:
   The `executeAt` is more than 8 bytes so we cannot _guarantee_ being able to fit it into the normal C* register, so we have to either support timestamps larger than 8 bytes, or we have to be able to compress them into 8 bytes in a manner that is consistent across replicas for a given key. 
   
   It is pretty much out of scope of this patch to worry about this, but as you mentioned I believe in the other tickets, and as I have as an action item, this should be _mostly_ solvable with in-memory summaries. My currently vaguely formulated approach would be to track for contiguous ranges the maximum witnessed _evicted_ hlc we have applied to disk, and to bring a given record's state back into memory if we are to apply a transaction within the range with a hlc equal or lower to this. Since we will mostly not evict from cache until older timestamps have expired this should avoid any additional work.
   
   In general, our caching needs lots of work, both to make it hyper lean and to have several layers of varying degrees of summary, but I do not believe we can readily address that until the immutable state refactor lands.
   
   The `nowInSeconds` I think just does not need to worry and can always use the `hlc`. We don't care about overlaps there, as it only affects TTL expiration and tombstone collection (the latter of which we probably won't even use gc grace for eventually, since we have perfect consistency). I made this change partially in this patch I think, which was probably poor hygiene, but was envisaged as part of the asynchrony work for `calculateDeps` that we are punting to another patch, so I can roll those back too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org