You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@ignite.apache.org by "sanpwc (via GitHub)" <gi...@apache.org> on 2023/05/17 07:50:42 UTC

[GitHub] [ignite-3] sanpwc opened a new pull request, #2078: IGNITE-19428 PlacementDriver API implemented.

sanpwc opened a new pull request, #2078:
URL: https://github.com/apache/ignite-3/pull/2078

   …e implemented.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196533267


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Because it's maximum reasonable time to await for lease appearance based on lease propagation.
   First of all, from the one hand leaseInterval should be gt than MAX_CLOCK_SKEW by design, thus we've safe with clock drift here, on the other hand if distribution duration will be more than leaseInterval we may see an inappropriate gap which is inappropriate.
   Another approach with awaiting ms safe time, seems to be to invasive, code will become ugly, so I've finished with await bounding with given lease interval.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] vldpyatkov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "vldpyatkov (via GitHub)" <gi...@apache.org>.
vldpyatkov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198211553


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   I disagree that we can use long lease interval to limited waiting time. Timeout transaction is possible to use here, if I correctly understand how it is used in production, but the limitation is imposed externally.
   Here, safe time is only possible to use. If it is not possible for now, you will be able to write TODO.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196699129


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   Anyway, this javadoc seems to be broken. At least we can remove the part " until {@code stopLeas}".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196473114


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   I can't get this. Could you rephrase?



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   It's not obvious from this description that this is the interval for newly granted lease that is not accepted yet, and is never used for actual leases.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   why `longLeaseInterval` is used here?



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -234,4 +224,22 @@ public String toString() {
                 + ", prolongable=" + prolongable
                 + '}';
     }
+
+    @Override
+    public boolean equals(Object o) {

Review Comment:
   Could you please replace `LeaseSerializationTest#assertLeasesEqual` with this `equals` method?



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/LeaseMeta.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.placementdriver;
+
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+
+/**
+ * Replica lease meta.
+ */
+public interface LeaseMeta {

Review Comment:
   Shouldn't this interface be in `placement-driver-api`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196517510


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   It's an Ctrl+C/Ctrl+V artifact. LeaseMeta should be removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196777949


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Same as above, that wasn't part of my changes, I've just used comment from original code. @vldpyatkov Any opinion about given question.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197861881


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/PlacementDriver.java:
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.placementdriver;
+
+import java.util.concurrent.CompletableFuture;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.replicator.ReplicationGroupId;
+
+/**
+ * Service that provides an ability to await and retrieve primary replicas for replication groups.
+ */
+public interface PlacementDriver {
+    /**
+     * Returns a future for the primary replica for the specified replication group whose expiration time (the right border of the
+     * corresponding lease interval) is greater than or equal to the timestamp passed as a parameter. Please pay attention that there are
+     * no restriction on the lease start time (left border), it can either be less or greater than or equal to proposed timestamp.
+     * Given method will await for an appropriate primary replica appearance if there's no already existing one. Such awaiting logic is
+     * unbounded, so it's mandatory to use explicit await termination like {@code orTimeout}.
+     *
+     * @param groupId Replication group id.
+     * @param timestamp Timestamp reference value.
+     * @return Primary replica future.
+     */
+    CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp);
+
+    /**
+     * Same as {@link #awaitPrimaryReplica(ReplicationGroupId, HybridTimestamp)} despite the fact that given method await logic is bounded.
+     * It will wait for a primary replica for a reasonable period of time, and complete a future with {@code TimeoutException} if a matching
+     * lease isn't found. Generally speaking reasonable here means enough for distribution across cluster nodes.
+     *
+     * @param replicationGroupId Replication group id.
+     * @param timestamp Timestamp reference value.
+     * @return Primary replica future.
+     */
+    CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp);
+}

Review Comment:
   Added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196777949


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Same as above, that wasn't part of my changes, I've just used comment from original code. However I don't have any objections to your proposal. Will use the suggestion. Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1199111998


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   > Timeout transaction is possible to use here
   Nope
   
   > To use safe time is absolutely correct
   Yes. I've reworked getPrimaryReplica to use the safe time. And created a ticket in order to fix the race in meta storage safe time publication https://issues.apache.org/jira/browse/IGNITE-19532 @SammyVimes is notified and hopefully will fix it within next week. It worth to mention that I've run corresponding test 1000 times - no failures, so I event didn't mute it with the issue.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   > Timeout transaction is possible to use here
   Nope
   
   > To use safe time is absolutely correct
   
   Yes. I've reworked getPrimaryReplica to use the safe time. And created a ticket in order to fix the race in meta storage safe time publication https://issues.apache.org/jira/browse/IGNITE-19532 @SammyVimes is notified and hopefully will fix it within next week. It worth to mention that I've run corresponding test 1000 times - no failures, so I event didn't mute it with the issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196526749


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Because same value is used in LeaseUpdater for the purposes you've mentioned. Here it's used in order to specify max time to await for lease to be populated locally within lease across nodes distribution process. So, to be honest I believe that comment is fine. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197848836


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -138,10 +175,16 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
 
                 if (msEntry.empty()) {
                     leases.remove(grpId);
+                    tryRemoveTracker(grpId);
                 } else {
                     Lease lease = fromBytes(msEntry.value());
 
+                    assert lease.getExpirationTime() != MAX_VALUE : "INFINITE lease expiration time isn't expected";

Review Comment:
   Also removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1199111998


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   > Timeout transaction is possible to use here
   
   Nope
   
   > To use safe time is absolutely correct
   
   Yes. I've reworked getPrimaryReplica to use the safe time. And created a ticket in order to fix the race in meta storage safe time publication https://issues.apache.org/jira/browse/IGNITE-19532 @SammyVimes is notified and hopefully will fix it within next week. It worth to mention that I've run corresponding test 1000 times - no failures, so I event didn't mute it with the issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196517510


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   It's an Ctrl+C/Ctrl+V artifact. LeaseMeta should be removed. Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197846348


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -92,6 +121,11 @@ public void startTrack() {
                 Lease lease = fromBytes(entry.value());
 
                 leases.put(grpId, lease);
+
+                assert lease.getExpirationTime() != MAX_VALUE : "INFINITE lease expiration time isn't expected";

Review Comment:
   Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc merged pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc merged PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] vldpyatkov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "vldpyatkov (via GitHub)" <gi...@apache.org>.
vldpyatkov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198826153


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   To use safe time is absolutely correct, but any timeout cannot be used as a replacement, because any timeout might be violated. Time to spread information across cluster does not have time limitation.
   We have only guaranties are related to clock of cluster, but this cannot be applied to safe time. Safe time on different nodes might be arbitrary different. Hence, we cannot to replace safe time check by checking timeout.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198822068


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Consider the following example:
   - longLeaseInterval is 2 min
   - cluster is experiencing some problems
   - user runs very important transaction with a timeout 10 min, they are aware that there are some problems but have nothing to do with it right now, that's why the transaction timeout is rather long
   - transaction starts, successfully performs some operation within, let say, 1 minute, and is going to be committed with commit timestamp 1000
   - commit operation hangs for 2 minutes and then the transaction is rolled back at timestamp 1120 (let max clock skew be 0), because it was unable to get lease for timestamp 1000 withing this time interval. Actually lease for the interval (500, 1002) was received on corresponding node at local time 1121, but the transaction is already rolled back at this time.
   
   So we have the transaction with timeout 10 minutes that was committed by user but rolled back by cluster after ~3 minutes from the moment it started. It is actually implicit commit timeout, which serves also as implicit and non-configurable transaction timeout. This looks at least weird from the user's point of view.
   
   I understand that longLeaseInterval is long enough to count it as a reasonable time of data propagation, but the consequence is that we can roll back any transaction if some of its operations exceed "reasonable" time which is not related to transaction timeout, and therefore, arbitrarily roll back any transaction before it reaches its timeout. This behavior impacts user experience, seems to be invalid and shouldn't exist.
   
   So, the transaction timeout instead of safe time seems to be more correct here. I suggest to add parameter (timeout) for this method and leave TODO for reworking it to safe time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196773652


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   Well, it wasn't part of my changes.
   @vldpyatkov Could you please clarify what does it mean? Do you have any objections for {@code stopLeas} removal?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196533267


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Because it's maximum reasonable time to await for lease appearance based on lease propagation.
   First of all, from the one hand leaseInterval should be gt the MAX_CLOCK_SKEW by design thus we've safe with clock drift here, on the other hand if distribution duration will be more than leaseInterval we may see an inappropriate gap.
   Another approach with awaiting ms safe time, seems to be to invasive, code will become ugly, so I've finished with await bounding with given lease interval.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197846010


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/LeaseUpdater.java:
##########
@@ -74,7 +73,7 @@ public class LeaseUpdater {
 
     private final AtomicBoolean active = new AtomicBoolean();
 
-    /** Long lease interval. The interval is used between lease granting attempts. */
+    /** Long lease interval in {@code TimeUnit.MILLISECONDS}. The interval is used between lease granting attempts. */

Review Comment:
   Shame on me))



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197845234


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Fixed,



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197841119


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -23,17 +23,19 @@
 
 import java.nio.ByteBuffer;
 import java.nio.charset.StandardCharsets;
+import java.util.Objects;
 import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.placementdriver.LeaseMeta;
 
 /**
  * A lease representation in memory.
  * The real lease is stored in Meta storage.
  */
-public class Lease {
+public class Lease implements LeaseMeta {
     /** The object is used when nothing holds the lease. Empty lease is always expired. */
     public static Lease EMPTY_LEASE = new Lease(null, MIN_VALUE, MIN_VALUE);
 
-    /** A node that holds a lease until {@code stopLeas}. */
+    /** A node that holds a lease until {@code stopLeas}LeaseMeta. */

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196521875


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/Lease.java:
##########
@@ -234,4 +224,22 @@ public String toString() {
                 + ", prolongable=" + prolongable
                 + '}';
     }
+
+    @Override
+    public boolean equals(Object o) {

Review Comment:
   Yep. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196714214


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   Then it would be more correct to change it to "The interval is used on the beginning of lease granting process", because "between lease granting attempts" is still not clear, what do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198822068


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Consider the following example:
   - longLeaseInterval is 2 min
   - cluster is experiencing some problems
   - user runs very important transaction with a timeout 10 min, they are aware that there are some problems but have nothing to do with it right now, that's why the transaction timeout is rather long
   - transaction starts, successfully performs some operation within, let say, 1 minute, and is going to be committed with commit timestamp 1000
   - commit operation hangs for 2 minutes and then the transaction is rolled back at timestamp 1120 (let max clock skew be 0), because it was unable to get lease for timestamp 1000 withing this time interval. Actually lease for the interval (500, 1002) was received on corresponding node at local time 1121, but the transaction is already rolled back at this time.
   
   So we have the transaction with timeout 10 minutes that was committed by user but rolled back by cluster after ~3 minutes from the moment it started. It is actually implicit commit timeout, which serves also as implicit and non-configurable transaction timeout. This looks at least weird from the user's point of view.
   
   I understand that longLeaseInterval is long enough to count it as a reasonable time of data propagation, but the consequence is that we can roll back any transaction if some of its operations exceeds "reasonable" time which is not related to transaction timeout, and therefore, arbitrarily roll back any transaction before it reaches its timeout. This behavior impacts user experience, seems to be invalid and shouldn't exist.
   
   So, the transaction timeout instead of safe time seems to be more correct here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198822068


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Consider the following example:
   - longLeaseInterval is 2 min
   - cluster is experiencing some problems
   - user runs very important transaction with a timeout 10 min, they are aware that there are some problems but have nothing to do with it right now, that's why the transaction timeout is rather long
   - transaction starts, successfully performs some operation within, let say, 1 minute, and is going to be committed with commit timestamp 1000
   - commit operation hangs for 2 minutes and then the transaction is rolled back at timestamp 1120 (let max clock skew be 0), because it was unable to get lease for timestamp 1000 withing this time interval. Actually lease for the interval (500, 1002) was received on corresponding node at local time 1121, but the transaction is already rolled back at this time.
   
   So we have the transaction with timeout 10 minutes that was committed by user but rolled back by cluster after ~3 minutes from the moment it started. It is actually implicit commit timeout, which serves also as implicit and non-configurable transaction timeout. This looks at least weird from the user's point of view.
   
   I understand that longLeaseInterval is long enough to count it as a reasonable time of data propagation, but the consequence is that we can roll back any transaction if some of its operations exceed "reasonable" time which is not related to transaction timeout, and therefore, arbitrarily roll back any transaction before it reaches its timeout. This behavior impacts user experience, seems to be invalid and shouldn't exist.
   
   So, the transaction timeout instead of safe time seems to be more correct here. I suggest to add parameter (timeout) for this method and leave TODO to rework it to safe time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196546149


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/LeaseMeta.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.placementdriver;
+
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+
+/**
+ * Replica lease meta.
+ */
+public interface LeaseMeta {

Review Comment:
   You are right. Same as for PlacementDriver interface. Both moved to `placement-driver-api`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] vldpyatkov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "vldpyatkov (via GitHub)" <gi...@apache.org>.
vldpyatkov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1196235603


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/PlacementDriver.java:
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.placementdriver;
+
+import java.util.concurrent.CompletableFuture;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.replicator.ReplicationGroupId;
+
+/**
+ * Service that provides an ability to await and retrieve primary replicas for replication groups.
+ */
+public interface PlacementDriver {
+    /**
+     * Returns a future for the primary replica for the specified replication group whose expiration time (the right border of the
+     * corresponding lease interval) is greater than or equal to the timestamp passed as a parameter. Please pay attention that there are
+     * no restriction on the lease start time (left border), it can either be less or greater than or equal to proposed timestamp.
+     * Given method will await for an appropriate primary replica appearance if there's no already existing one. Such awaiting logic is
+     * unbounded, so it's mandatory to use explicit await termination like {@code orTimeout}.
+     *
+     * @param groupId Replication group id.
+     * @param timestamp Timestamp reference value.
+     * @return Primary replica future.
+     */
+    CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp);
+
+    /**
+     * Same as {@link #awaitPrimaryReplica(ReplicationGroupId, HybridTimestamp)} despite the fact that given method await logic is bounded.
+     * It will wait for a primary replica for a reasonable period of time, and complete a future with {@code TimeoutException} if a matching
+     * lease isn't found. Generally speaking reasonable here means enough for distribution across cluster nodes.
+     *
+     * @param replicationGroupId Replication group id.
+     * @param timestamp Timestamp reference value.
+     * @return Primary replica future.
+     */
+    CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp);
+}

Review Comment:
   Add empty line in the end of the file.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.
+     * @param busyLock Busy lock to protect {@link PlacementDriver} methods on corresponding manager stop.
      */
-    public LeaseTracker(VaultManager vaultManager, MetaStorageManager msManager) {
+    public LeaseTracker(
+            VaultManager vaultManager,
+            MetaStorageManager msManager,
+            long longLeaseInterval,
+            IgniteSpinBusyLock busyLock
+    ) {
         this.vaultManager = vaultManager;
         this.msManager = msManager;
+        this.longLeaseInterval = longLeaseInterval;
+        this.busyLock = busyLock;

Review Comment:
   It is forbidden to take the lock object from where it was created.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -92,6 +121,11 @@ public void startTrack() {
                 Lease lease = fromBytes(entry.value());
 
                 leases.put(grpId, lease);
+
+                assert lease.getExpirationTime() != MAX_VALUE : "INFINITE lease expiration time isn't expected";

Review Comment:
   I do not understand which is the reason to check expiration time to MAX_VALUE. We never use the constant to define a lease.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.

Review Comment:
   I do not have any objection. I think the both definition are clear.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -138,10 +175,16 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
 
                 if (msEntry.empty()) {
                     leases.remove(grpId);
+                    tryRemoveTracker(grpId);
                 } else {
                     Lease lease = fromBytes(msEntry.value());
 
+                    assert lease.getExpirationTime() != MAX_VALUE : "INFINITE lease expiration time isn't expected";

Review Comment:
   The same as above, using of MAX_VALUE is unexpected.



##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/LeaseUpdater.java:
##########
@@ -74,7 +73,7 @@ public class LeaseUpdater {
 
     private final AtomicBoolean active = new AtomicBoolean();
 
-    /** Long lease interval. The interval is used between lease granting attempts. */
+    /** Long lease interval in {@code TimeUnit.MILLISECONDS}. The interval is used between lease granting attempts. */

Review Comment:
   Why do you use {@code TimeUnit.MILLISECONDS} (as a part of code)?
   Millisecond has a well known definition. I think the word "milliseconds" is enough.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] sanpwc commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "sanpwc (via GitHub)" <gi...@apache.org>.
sanpwc commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1197860260


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -65,12 +84,22 @@ public class LeaseTracker {
      *
      * @param vaultManager Vault manager.
      * @param msManager Meta storage manager.
+     * @param longLeaseInterval Long lease interval in {@code TimeUnit.MILLISECONDS}.  The interval is used between lease granting attempts.
+     * @param busyLock Busy lock to protect {@link PlacementDriver} methods on corresponding manager stop.
      */
-    public LeaseTracker(VaultManager vaultManager, MetaStorageManager msManager) {
+    public LeaseTracker(
+            VaultManager vaultManager,
+            MetaStorageManager msManager,
+            long longLeaseInterval,
+            IgniteSpinBusyLock busyLock
+    ) {
         this.vaultManager = vaultManager;
         this.msManager = msManager;
+        this.longLeaseInterval = longLeaseInterval;
+        this.busyLock = busyLock;

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198822068


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Consider the following example:
   - longLeaseInterval is 2 min
   - cluster is experiencing some problems
   - user runs very important transaction with a timeout 10 min, they are aware that there are some problems but have nothing to do with it right now, that's why the transaction timeout is rather long
   - transaction starts, successfully performs some operation within, let say, 1 minute, and is going to be committed with commit timestamp 1000
   - commit operation hangs for 2 minutes and then the transaction is rolled back at timestamp 1120 (let max clock skew be 0), because it was unable to get lease for timestamp 1000 withing this time interval. Actually lease for the interval (500, 1002) was received on corresponding node at local time 1121, but the transaction is already rolled back at this time.
   
   So we have the transaction with timeout 10 minutes that was committed by user but rolled back by cluster after ~3 minutes. It is actually implicit commit timeout, which serves also as implicit and non-configurable transaction timeout. This looks at least weird from the user's point of view.
   
   I understand that longLeaseInterval is long enough to count it as a reasonable time of data propagation, but the consequence is that we can roll back any transaction if some of its operations exceeds "reasonable" time which is not related to transaction timeout, and therefore, arbitrarily roll back any transaction before it reaches its timeout. This behavior impacts user experience, seems to be invalid and shouldn't exist.
   
   So, the transaction timeout instead of safe time seems to be more correct here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ignite-3] denis-chudov commented on a diff in pull request #2078: IGNITE-19428 PlacementDriver API implemented.

Posted by "denis-chudov (via GitHub)" <gi...@apache.org>.
denis-chudov commented on code in PR #2078:
URL: https://github.com/apache/ignite-3/pull/2078#discussion_r1198822068


##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/leases/LeaseTracker.java:
##########
@@ -152,4 +195,46 @@ public CompletableFuture<Void> onUpdate(WatchEvent event) {
         public void onError(Throwable e) {
         }
     }
+
+    @Override
+    public CompletableFuture<LeaseMeta> awaitPrimaryReplica(ReplicationGroupId groupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            return primaryReplicaWaiters.computeIfAbsent(groupId, id -> new PendingIndependentComparableValuesTracker<>(MIN_VALUE))
+                    .waitFor(timestamp);
+        } finally {
+            busyLock.leaveBusy();
+        }
+    }
+
+    @Override
+    public CompletableFuture<LeaseMeta> getPrimaryReplica(ReplicationGroupId replicationGroupId, HybridTimestamp timestamp) {
+        if (!busyLock.enterBusy()) {
+            return failedFuture(new NodeStoppingException("Component is stopping."));
+        }
+        try {
+            // There's no sense in awaiting previously detected primary replica more than lease interval.
+            return awaitPrimaryReplica(replicationGroupId, timestamp).orTimeout(longLeaseInterval, TimeUnit.MILLISECONDS);

Review Comment:
   Consider the following example:
   - longLeaseInterval is 2 min
   - cluster is experiencing some problems
   - user runs very important transaction with a timeout 10 min, they are aware that there are some problems but have nothing to do with it right now, that's why the transaction timeout is rather long
   - transaction starts, successfully performs some operation within, let say, 1 minute, and is going to be committed with commit timestamp 1000
   - commit operation hangs for 2 minutes and then the transaction is rolled back at timestamp 1120 (let max clock skew be 0), because it was unable to get lease for timestamp 1000 withing this time interval. Actually lease for the interval (500, 1002) was received on corresponding node at local time 1121, but the transaction is already rolled back at this time.
   
   So we have the transaction with timeout 10 minutes that was committed by user but rolled back by cluster after ~3 minutes from the moment it started. It is actually implicit commit timeout, which serves also as implicit and non-configurable transaction timeout. This looks at least weird from the user's point of view.
   
   I understand that longLeaseInterval is long enough to count it as a reasonable time of data propagation, but the consequence is that we can roll back any transaction if some of its operations exceed "reasonable" time which is not related to transaction timeout, and therefore, arbitrarily roll back any transaction before it reaches its timeout. This behavior impacts user experience, seems to be invalid and shouldn't exist.
   
   So, the transaction timeout instead of safe time seems to be more correct here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@ignite.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org