You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/06 20:16:08 UTC

[GitHub] [pulsar] devinbost opened a new pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

devinbost opened a new pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154


   Part 1 of [Proposal] PIP 84: Cluster-Wide and Function-Specific Producer Defaults
   For details, see the [mailing list discussion](https://lists.apache.org/thread.html/rc58361593fbd20bb6ca714aed0d84ffbaf54c305c0824d3bfed0c8ef%40%3Cdev.pulsar.apache.org%3E)
   
   Fixes #9986
   
   This PR adds these properties to `ProducerConfig`:
   ```
       public Boolean batchingDisabled;
       public Boolean chunkingEnabled;
       public Boolean blockIfQueueFullDisabled;
       public CompressionType compressionType;
       public HashingScheme hashingScheme;
       public MessageRoutingMode messageRoutingMode;
       public Long batchingMaxPublishDelay;
   ```
   
   It updates the methods that convert between the `ProducerConfig` and the protobuf `ProducerSpec`. 
   It adds logic to use the new function producer settings when they are provided; otherwise, it reverts to the prior behavior. 
   It also refactors the `FunctionResultRouter` to prevent a race condition on `isBatchingEnabled`. 
   
   It adds these properties to `ProducerSpec` in `Function.proto`:
   ```
       bool batchingDisabled = 6;
       bool chunkingEnabled = 7;
       bool blockIfQueueFullDisabled = 8;
       CompressionType compressionType = 9;
       HashingScheme hashingScheme = 10;
       MessageRoutingMode messageRoutingMode = 11;
       int64 batchingMaxPublishDelay = 12;
   ```
   
   and adds these new enums to Function.proto:
   
   ```
   enum CompressionType {
       LZ4 = 0;
       NONE = 1;
       ZLIB = 2;
       ZSTD = 3;
       SNAPPY = 4;
   }
   
   enum HashingScheme {
       MURMUR3_32HASH = 0;
       JAVA_STRING_HASH = 1;
   }
   
   enum MessageRoutingMode {
       CUSTOM_PARTITION = 0;
       SINGLE_PARTITION = 1;
       ROUND_ROBIN_PARTITION = 2;
   }
   ```
   
   The new protobuf files have not yet been generated for Python and Go functions, but those changes are planned in a subsequent PR. 
   
   Existing tests have been updated to include the new properties, and additional tests have been provided to ensure that new configs merge properly into existing configs during function registration/update. 
   
   Changes did not need to be made to the Admin CLI parameters because the Admin CLI already provides a way for the `ProducerConfig` to be provided as a JSON string. 
   
   Docs will be provided in a subsequent PR. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on a change in pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#discussion_r612026409



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/ContextImpl.java
##########
@@ -480,27 +481,54 @@ public void recordMetric(String metricName, double value) {
 
         if (producer == null) {
 
-            Producer<O> newProducer = ((ProducerBuilderImpl<O>) pulsar.getProducerBuilder().clone())
-                    .schema(schema)
-                    .blockIfQueueFull(true)
-                    .enableBatching(true)
-                    .batchingMaxPublishDelay(10, TimeUnit.MILLISECONDS)
-                    .compressionType(CompressionType.LZ4)
-                    .hashingScheme(HashingScheme.Murmur3_32Hash) //
-                    .messageRoutingMode(MessageRoutingMode.CustomPartition)
-                    .messageRouter(FunctionResultRouter.of())
-                    // set send timeout to be infinity to prevent potential deadlock with consumer
-                    // that might happen when consumer is blocked due to unacked messages
-                    .sendTimeout(0, TimeUnit.SECONDS)
-                    .topic(topicName)
-                    .properties(InstanceUtils.getProperties(componentType,
-                            FunctionCommon.getFullyQualifiedName(
-                                    this.config.getFunctionDetails().getTenant(),
-                                    this.config.getFunctionDetails().getNamespace(),
-                                    this.config.getFunctionDetails().getName()),
-                            this.config.getInstanceId()))
-                    .create();
-
+            Producer<O> newProducer = null;
+
+            if (this.config.getFunctionDetails() != null && this.config.getFunctionDetails().getSink() != null &&
+            this.config.getFunctionDetails().getSink().getProducerSpec() != null){

Review comment:
       `this.config.getFunctionDetails().getSink().getProducerSpec()` will never return null.  This is technically changing the existing behavior of `publish` when called from `Context`.  I don't think this is the right change. If a user want to use a custom producer to produce in the function (instead of just returning), the user should be able to:
   
   1. Instantiate a PulsarClient, create a producer, and sent messages (already can do this today)
   2. We can add a new method called "newProducerBuilder()" to the Context in which the returned builder automatically inherits some of the settings such as authentication from the function.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-1058891683


   The pr had no activity for 30 days, mark with Stale label.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814481381


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814529310


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814572084


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on a change in pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on a change in pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#discussion_r612084685



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/ContextImpl.java
##########
@@ -480,27 +481,54 @@ public void recordMetric(String metricName, double value) {
 
         if (producer == null) {
 
-            Producer<O> newProducer = ((ProducerBuilderImpl<O>) pulsar.getProducerBuilder().clone())
-                    .schema(schema)
-                    .blockIfQueueFull(true)
-                    .enableBatching(true)
-                    .batchingMaxPublishDelay(10, TimeUnit.MILLISECONDS)
-                    .compressionType(CompressionType.LZ4)
-                    .hashingScheme(HashingScheme.Murmur3_32Hash) //
-                    .messageRoutingMode(MessageRoutingMode.CustomPartition)
-                    .messageRouter(FunctionResultRouter.of())
-                    // set send timeout to be infinity to prevent potential deadlock with consumer
-                    // that might happen when consumer is blocked due to unacked messages
-                    .sendTimeout(0, TimeUnit.SECONDS)
-                    .topic(topicName)
-                    .properties(InstanceUtils.getProperties(componentType,
-                            FunctionCommon.getFullyQualifiedName(
-                                    this.config.getFunctionDetails().getTenant(),
-                                    this.config.getFunctionDetails().getNamespace(),
-                                    this.config.getFunctionDetails().getName()),
-                            this.config.getInstanceId()))
-                    .create();
-
+            Producer<O> newProducer = null;
+
+            if (this.config.getFunctionDetails() != null && this.config.getFunctionDetails().getSink() != null &&
+            this.config.getFunctionDetails().getSink().getProducerSpec() != null){

Review comment:
       @jerrypeng Thanks for reviewing the PR. I want to better understand your suggestion. I'm hearing several things, so please correct any of my misunderstandings:
   1. Instead of performing the null check on `this.config.getFunctionDetails().getSink().getProducerSpec()`, I could skip the null check and just pass through the protobuf properties directly. The defaults they use match the existing configurations, but it sounds like there's a concern that it could still change existing behavior? 
   2. You're saying that I could define a new method, `newProducerBuilder()` to inherit the protobuf settings (instead of passing them through directly, like I mentioned above), but it sounds like there are other settings I'd want to inherit as well (such as relating to authentication)? 
   
   Would the `newProducerBuilder()` method be used instead of `getProducer()` by functions that want to use the new settings? It sounds like that would require a code change for any existing function that we want to update to use the new settings. Is that right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814659786


   @jerrypeng @sijie @codelipenghui @freeznet I got all the tests to pass (after 5 runs...), and this PR needs a review. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814620701


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814664233


   Also @eolivelli 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-814660393


   This PR is the derivative of #9987 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] github-actions[bot] commented on pull request #10154: [functions] Enable custom function producer behavior - Part 1 of Proposed PIP 84

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10154:
URL: https://github.com/apache/pulsar/pull/10154#issuecomment-1058891879


   @devinbost:Thanks for your contribution. For this PR, do we need to update docs?
   (The [PR template contains info about doc](https://github.com/apache/pulsar/blob/master/.github/PULL_REQUEST_TEMPLATE.md#documentation), which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org