You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by GitBox <gi...@apache.org> on 2022/11/18 12:11:18 UTC

[GitHub] [jackrabbit-oak] jelmini opened a new pull request, #760: OAK-10006 Prevent writes to Azure storage if lock is not held

jelmini opened a new pull request, #760:
URL: https://github.com/apache/jackrabbit-oak/pull/760

   If the lock is lost to another instance, all writes attempts should fail with an exception.
   If the lock fails to be renewed, all writes are suspended (blocked) until the renewal succeed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [jackrabbit-oak] jelmini commented on a diff in pull request #760: OAK-10006 Prevent writes to Azure storage if lock is not held

Posted by GitBox <gi...@apache.org>.
jelmini commented on code in PR #760:
URL: https://github.com/apache/jackrabbit-oak/pull/760#discussion_r1049651644


##########
oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java:
##########
@@ -237,6 +237,28 @@ private static GarbageCollectionStrategy newGarbageCollectionStrategy() {
         log.debug("TAR files: {}", tarFiles);
     }
 
+    private void onLockStatusChanged(SegmentNodeStorePersistence.LockStatus lockStatus) {
+        switch (lockStatus) {
+            case LOST:
+                try {
+                    if (tarFiles != null) {
+                        tarFiles.close();

Review Comment:
   Right. Thanks for spotting this, @ahanikel !
   I suppose we should add an `unsafeShutdown()` method to `TarFiles`, `TarWriter` and `AbstractRemoteSegmentArchiveWriter`, so that we can shut them down without attempting any finalisation writes to the remote storage, which would anyway fail or corrupt the data.
   What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [jackrabbit-oak] ahanikel commented on a diff in pull request #760: OAK-10006 Prevent writes to Azure storage if lock is not held

Posted by GitBox <gi...@apache.org>.
ahanikel commented on code in PR #760:
URL: https://github.com/apache/jackrabbit-oak/pull/760#discussion_r1027810157


##########
oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/AzureRepositoryLock.java:
##########
@@ -121,6 +132,21 @@ private void refreshLease() {
         }
     }
 
+    private void notifyLockStatusChange(LockStatus renewal) {
+        try {
+            lockStatusChangedCallback.accept(renewal);
+        } catch (RuntimeException e) {
+            log.warn("Exception in lock status change callback", e);
+        }
+    }
+
+    void doRenewLease() throws StorageException {
+        BlobRequestOptions blobRequestOptions = new BlobRequestOptions();
+        blobRequestOptions.setMaximumExecutionTimeInMs(Math.max(INTERVAL - renewalInterval - 1, INTERVAL / 2 - 1));

Review Comment:
   Could you add a comment here that explains this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [jackrabbit-oak] ahanikel commented on a diff in pull request #760: OAK-10006 Prevent writes to Azure storage if lock is not held

Posted by GitBox <gi...@apache.org>.
ahanikel commented on code in PR #760:
URL: https://github.com/apache/jackrabbit-oak/pull/760#discussion_r1027835132


##########
oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java:
##########
@@ -237,6 +237,28 @@ private static GarbageCollectionStrategy newGarbageCollectionStrategy() {
         log.debug("TAR files: {}", tarFiles);
     }
 
+    private void onLockStatusChanged(SegmentNodeStorePersistence.LockStatus lockStatus) {
+        switch (lockStatus) {
+            case LOST:
+                try {
+                    if (tarFiles != null) {
+                        tarFiles.close();

Review Comment:
   This will write the graph and references to the tar file (`archive.close()` in `TarWriter`), shouldn't we avoid that? Also it will try to flush the write queue (`q.flush()` in `AbstractRemoteSegmentArchiveWriter`), so if it's not empty, it will wait there forever, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [jackrabbit-oak] jelmini commented on a diff in pull request #760: OAK-10006 Prevent writes to Azure storage if lock is not held

Posted by GitBox <gi...@apache.org>.
jelmini commented on code in PR #760:
URL: https://github.com/apache/jackrabbit-oak/pull/760#discussion_r1049715995


##########
oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/AzureRepositoryLock.java:
##########
@@ -121,6 +132,21 @@ private void refreshLease() {
         }
     }
 
+    private void notifyLockStatusChange(LockStatus renewal) {
+        try {
+            lockStatusChangedCallback.accept(renewal);
+        } catch (RuntimeException e) {
+            log.warn("Exception in lock status change callback", e);
+        }
+    }
+
+    void doRenewLease() throws StorageException {
+        BlobRequestOptions blobRequestOptions = new BlobRequestOptions();
+        blobRequestOptions.setMaximumExecutionTimeInMs(Math.max(INTERVAL - renewalInterval - 1, INTERVAL / 2 - 1));

Review Comment:
   Ok, I will add a comment.
   In summary, previously `renewLease()` was called with the default `BlobRequestOptions`. However, those defaults  can be set globally for the client instance and may be not a good fit for calling `renewLease()` as it could block for too long. As the lease is acquired for 60 seconds and `renewLease()` is called by default every 30 seconds, I think it should timeout after at most 29 seconds (giving 1 sec margin), so that we can call `notifyLockStatusChange(LockStatus.RENEWAL_FAILED)` and suspend writes before the lease becomes available to be acquired by another instance. 
   BTW, I have just noticed that values should be in millis and thus multiplied by 1000...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org