You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/22 15:50:04 UTC

[GitHub] [pulsar] andrasbeni opened a new pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

andrasbeni opened a new pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804


   
   
   Master Issue: #14505
   
   ### Motivation
   
   This is part 1 of PIP-145: Improve performance of regex subscriptions
   
   
   ### Modifications
   
   This change covers applying subscription pattern on broker side and
   using hashes to skip updates when nothing changed.
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   This change added unit tests 
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API: no
     - The schema: no
     - The default values of configurations: no
     - The wire protocol: yes: New optional fields are added to existing commands
     - The rest endpoints: no
     - The admin cli options: no
     - Anything that affects deployment: no
   
   ### Documentation
   
   Check the box below or label this PR directly (if you have committer privilege).
   
   Need to update docs? 
   
   - [ ] `doc-required` 
     
     (If you need help on updating docs, create a doc issue)
     
   - [ ] `no-need-doc` 
     
     (Please explain why)
     
   - [ ] `doc` 
     
     (If this PR contains doc changes)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat commented on pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#issuecomment-1083805037


   /pulsarbot rerun-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat removed a comment on pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat removed a comment on pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#issuecomment-1083177019


   /pulsarbot rerun-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] Jason918 commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
Jason918 commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r840182914



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java
##########
@@ -1968,12 +1978,26 @@ protected void handleGetTopicsOfNamespace(CommandGetTopicsOfNamespace commandGet
                 if (isAuthorized) {
                     getBrokerService().pulsar().getNamespaceService().getListOfTopics(namespaceName, mode)
                         .thenAccept(topics -> {
+                            boolean filterTopics = enableSubscriptionPatternEvaluation && topicsPattern.isPresent()
+                                    && topicsPattern.get().length() <= maxSubscriptionPatternLength;

Review comment:
       > From a user perspective there would be no visible difference.
   
   The user here is more like cluster maintainer who have access to broker config to enable this feature.
   
   
   
   > The point is to avoid "unbound" regexes on server side.
   
   Yes, this limitation is reasonable. But I think we need a warn log here for cluster maintainer, as they may not know exactly how long the topic pattern string is used.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r838578185



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java
##########
@@ -1968,12 +1978,26 @@ protected void handleGetTopicsOfNamespace(CommandGetTopicsOfNamespace commandGet
                 if (isAuthorized) {
                     getBrokerService().pulsar().getNamespaceService().getListOfTopics(namespaceName, mode)
                         .thenAccept(topics -> {
+                            boolean filterTopics = enableSubscriptionPatternEvaluation && topicsPattern.isPresent()
+                                    && topicsPattern.get().length() <= maxSubscriptionPatternLength;

Review comment:
       From a user perspective there would be no visible difference. If the filter is not applied on server, the client library will filter it. 
   
   The point is to avoid "unbound" regexes on server side. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] hangc0276 commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r839105307



##########
File path: pulsar-broker/src/test/java/org/apache/pulsar/client/impl/PatternTopicsConsumerImplTest.java
##########
@@ -420,66 +419,6 @@ public void testBinaryProtoToGetTopicsOfNamespaceAll() throws Exception {
         producer4.close();
     }
 

Review comment:
       Why remote those tests?

##########
File path: pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java
##########
@@ -914,6 +914,20 @@
     )
     private int resourceUsageTransportPublishIntervalInSecs = 60;
 
+    @FieldContext(
+            dynamic = false,
+            category = CATEGORY_POLICIES,
+            doc = "Enables evaluating subscription pattern on broker side."
+    )
+    private boolean enableBrokerSideSubscriptionPatternEvaluation = true;

Review comment:
       Do we need enable this feature by default ?

##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java
##########
@@ -1968,12 +1978,26 @@ protected void handleGetTopicsOfNamespace(CommandGetTopicsOfNamespace commandGet
                 if (isAuthorized) {
                     getBrokerService().pulsar().getNamespaceService().getListOfTopics(namespaceName, mode)
                         .thenAccept(topics -> {
+                            boolean filterTopics = enableSubscriptionPatternEvaluation && topicsPattern.isPresent()
+                                    && topicsPattern.get().length() <= maxSubscriptionPatternLength;

Review comment:
       It will be hard to debug why the server side filter doesn't take affect when the regex pattern length > max length if we do not expose a log.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] github-actions[bot] commented on pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#issuecomment-1075349695


   @andrasbeni:Thanks for your contribution. For this PR, do we need to update docs?
   (The [PR template contains info about doc](https://github.com/apache/pulsar/blob/master/.github/PULL_REQUEST_TEMPLATE.md#documentation), which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] Jason918 commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
Jason918 commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r838472984



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java
##########
@@ -1968,12 +1978,26 @@ protected void handleGetTopicsOfNamespace(CommandGetTopicsOfNamespace commandGet
                 if (isAuthorized) {
                     getBrokerService().pulsar().getNamespaceService().getListOfTopics(namespaceName, mode)
                         .thenAccept(topics -> {
+                            boolean filterTopics = enableSubscriptionPatternEvaluation && topicsPattern.isPresent()
+                                    && topicsPattern.get().length() <= maxSubscriptionPatternLength;

Review comment:
       Should we provide some way to expose the situation if `topicsPattern.get().length() <= maxSubscriptionPatternLength`? Or else it would very hard for user to know why topic pattern not working .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat commented on pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#issuecomment-1083177019


   /pulsarbot rerun-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r838097252



##########
File path: pulsar-common/src/main/java/org/apache/pulsar/common/topics/TopicList.java
##########
@@ -0,0 +1,66 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.common.topics;
+
+import com.google.common.hash.Hashing;
+import java.nio.charset.StandardCharsets;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+import org.apache.pulsar.common.naming.TopicName;
+
+public class TopicList {

Review comment:
       We could use `@UtilityClass` lombok annotation here to prevent instances of this class from being created.

##########
File path: pulsar-common/src/main/java/org/apache/pulsar/common/topics/TopicList.java
##########
@@ -0,0 +1,66 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.common.topics;
+
+import com.google.common.hash.Hashing;
+import java.nio.charset.StandardCharsets;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+import org.apache.pulsar.common.naming.TopicName;
+
+public class TopicList {
+
+    // get topics that match 'topicsPattern' from original topics list
+    // return result should contain only topic names, without partition part
+    public static List<String> filterTopics(List<String> original, String regex) {
+        Pattern topicsPattern = Pattern.compile(regex);
+        return filterTopics(original, topicsPattern);
+    }
+    public static List<String> filterTopics(List<String> original, Pattern topicsPattern) {
+
+        final Pattern shortenedTopicsPattern = topicsPattern.toString().contains("://")
+                ? Pattern.compile(topicsPattern.toString().split("\\:\\/\\/")[1]) : topicsPattern;

Review comment:
       We could try to pre-compile the `\\:\\/\\/` regex so that we don't need to parse it each time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] andrasbeni commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
andrasbeni commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r839621226



##########
File path: pulsar-broker/src/test/java/org/apache/pulsar/client/impl/PatternTopicsConsumerImplTest.java
##########
@@ -420,66 +419,6 @@ public void testBinaryProtoToGetTopicsOfNamespaceAll() throws Exception {
         producer4.close();
     }
 

Review comment:
       The methods that these tests verify have been moved to org.apache.pulsar.common.topics.TopicList in order to make them available to both broker and client. 
   So the tests have also been moved to the corresponding test class: TopicListTest.
   The added benefit of this is that the new test class does not start a broker for each test case, so the tests in question will run a few milliseconds faster. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat closed pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
merlimat closed pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] hangc0276 commented on a change in pull request #14804: [Issue 14505] PIP-145: Enable evaluating subscription pattern on broker side

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on a change in pull request #14804:
URL: https://github.com/apache/pulsar/pull/14804#discussion_r839105307



##########
File path: pulsar-broker/src/test/java/org/apache/pulsar/client/impl/PatternTopicsConsumerImplTest.java
##########
@@ -420,66 +419,6 @@ public void testBinaryProtoToGetTopicsOfNamespaceAll() throws Exception {
         producer4.close();
     }
 

Review comment:
       Why remove those tests?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org