You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/06/08 03:37:55 UTC

[GitHub] [pulsar] sursingh opened a new pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

sursingh opened a new pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862


   Fixes #10860
   
   ### Motivation
   
   When doing operations on multiple brokers, the state is not always consistent. Sometimes the operations don't seem to get replicated to other brokers in the cluster. The queries from the brokers returns different results. See #10860 for details
   
   ### Modifications
   
   1. While creating a new object, we add the object to cache. However we are not adding a watch on the corresponding path in zookeeper. So when other brokers change/delete the object, the broker that added the object is not notified. With this change we will do a get(), which will add a watch on the corresponding path. This will also add the object to the cache.
   
   2. Similarly while deleting, we are adding an empty object to the cache. Any local request will now return a `object not found` error, without needing to go the zookeeper. However if the object get re-added on any other broker, we will not be notified. We will continue to return `object not found` error, even though an instance of the object now exist. This change remove the addition of empty object to the cache. If a request is received for the object, we will do a get from the zookeeper and if it still doesn't exist, an empty object will get added. Also a watch will get added for the corresponding path. This will ensure that cache will get update the object is again added by some other broker.
   
   ### Verifying this change
   
   - Added a test case that clearly demonstrates the issue and ensured that behavior is fixed after the change
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API: (no)
     - The schema: (no)
     - The default values of configurations: (no)
     - The wire protocol: (no)
     - The rest endpoints: (no)
     - The admin cli options: (no)
     - Anything that affects deployment: (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647596995



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -209,8 +211,17 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         store.put(path, content, Optional.of(-1L))
                 .thenAccept(stat -> {
                     // Make sure we have the value cached before the operation is completed
-                    objCache.put(path, FutureUtils.value(Optional.of(new CacheGetResult<>(value, stat))));
-                    future.complete(null);
+                    // In addition to caching the value, we need to add a watch on the path,
+                    // so when/if it changes on any other node, we are notified and we can
+                    // update the cache
+                    objCache.get(path).whenComplete( (stat2, ex) -> {

Review comment:
       No worries for ZK implementation, I'll be making the change to use the persistent recursive watch for master branch. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647122774



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -209,8 +211,17 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         store.put(path, content, Optional.of(-1L))
                 .thenAccept(stat -> {
                     // Make sure we have the value cached before the operation is completed
-                    objCache.put(path, FutureUtils.value(Optional.of(new CacheGetResult<>(value, stat))));
-                    future.complete(null);
+                    // In addition to caching the value, we need to add a watch on the path,
+                    // so when/if it changes on any other node, we are notified and we can
+                    // update the cache
+                    objCache.get(path).whenComplete( (stat2, ex) -> {

Review comment:
       For this one, we can do the following: 
    * Commit it with this approach
    * Cherry Pick for 2.8 branch
    * For 2.9 branch, go with persistent-recursive watched in ZK and we can eliminate this get() operation again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sursingh commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
sursingh commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647586105



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -229,7 +240,11 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         return store.delete(path, Optional.empty())
                 .thenAccept(v -> {
                     // Mark in the cache that the object was removed
-                    objCache.put(path, FutureUtils.value(Optional.empty()));
+                    // objCache.put(path, FutureUtils.value(Optional.empty()));

Review comment:
       Yes the cache gets invalidated on delete. I will remove the commented line




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sursingh commented on pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
sursingh commented on pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#issuecomment-856414386


   @merlimat : Can you help review this change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] ravi-vaidyanathan commented on pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
ravi-vaidyanathan commented on pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#issuecomment-856883975


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sursingh commented on pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
sursingh commented on pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#issuecomment-856899368


   @Anonymitaet This is only a bugfix. We don't need to update the docs for this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647122881



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -229,7 +240,11 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         return store.delete(path, Optional.empty())
                 .thenAccept(v -> {
                     // Mark in the cache that the object was removed
-                    objCache.put(path, FutureUtils.value(Optional.empty()));
+                    // objCache.put(path, FutureUtils.value(Optional.empty()));

Review comment:
       I take this back, the cache is anyway invalidated in `AbstractMetadataStore.delete()`. We can just remove the commented line here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sursingh commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
sursingh commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647590049



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -209,8 +211,17 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         store.put(path, content, Optional.of(-1L))
                 .thenAccept(stat -> {
                     // Make sure we have the value cached before the operation is completed
-                    objCache.put(path, FutureUtils.value(Optional.of(new CacheGetResult<>(value, stat))));
-                    future.complete(null);
+                    // In addition to caching the value, we need to add a watch on the path,
+                    // so when/if it changes on any other node, we are notified and we can
+                    // update the cache
+                    objCache.get(path).whenComplete( (stat2, ex) -> {

Review comment:
       Thanks @merlimat. I will commit this as is. Will work on moving this to ZK implementation in a separate PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sursingh closed pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
sursingh closed pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#discussion_r647101647



##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -229,7 +240,11 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         return store.delete(path, Optional.empty())
                 .thenAccept(v -> {
                     // Mark in the cache that the object was removed
-                    objCache.put(path, FutureUtils.value(Optional.empty()));
+                    // objCache.put(path, FutureUtils.value(Optional.empty()));

Review comment:
       I think that we could either invalidate it from the cache here, do the read to set the watch, or again, since it's mostly related with the ZK, do it there

##########
File path: pulsar-metadata/src/main/java/org/apache/pulsar/metadata/cache/impl/MetadataCacheImpl.java
##########
@@ -209,8 +211,17 @@ public MetadataCacheImpl(MetadataStore store, MetadataSerde<T> serde) {
         store.put(path, content, Optional.of(-1L))
                 .thenAccept(stat -> {
                     // Make sure we have the value cached before the operation is completed
-                    objCache.put(path, FutureUtils.value(Optional.of(new CacheGetResult<>(value, stat))));
-                    future.complete(null);
+                    // In addition to caching the value, we need to add a watch on the path,
+                    // so when/if it changes on any other node, we are notified and we can
+                    // update the cache
+                    objCache.get(path).whenComplete( (stat2, ex) -> {

Review comment:
       I think this logic should belong more to the ZK implementation, where we can set the watch after the put




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat merged pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
merlimat merged pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] Anonymitaet commented on pull request #10862: [Issue 10860][pulsar-metadata] Ensure metadata cache consistency across brokers

Posted by GitBox <gi...@apache.org>.
Anonymitaet commented on pull request #10862:
URL: https://github.com/apache/pulsar/pull/10862#issuecomment-856639147


   @sursingh thanks for your contribution. For this PR, do we need to update docs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org