You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Baozi <wu...@icloud.com.INVALID> on 2022/05/12 08:51:51 UTC

[DISCUSS] PIP-166: Function add NONE delivery semantics

Hi Pulsar community,

I open a https://github.com/apache/pulsar/issues/15560 for Function add NONE delivery semantics

Let me know what you think.


Thanks,
Baodi Shi


## Motivation

Currently Function supports three delivery semantics, and also provides autoAck to control whether to automatically ack.
Because autoAck affects the delivery semantics of Function, it can be confusing for users to understand the relationship between these two parameters.

For example, when the user configures `Guarantees == ATMOST_ONCE` and `autoAck == false`, then the framework will not help the user to ack messages, and the processing semantics may become `ATLEAST_ONCE`.

The delivery semantics provided by Function should be clear. When the user sets the guarantees, the framework should ensure point-to-point semantic processing and cannot be affected by other parameters.

## Goal

Added `NONE` delivery semantics and delete `autoAck` config.

The original intention of `autoAck` semantics is that users want to control the timing of ack by themselves. When autoAck == false, the processing semantics provided by the framework should be invalid. Then we can add `NONE` processing semantics to replace the autoAck == false scenario.

When the user configuration `ProcessingGuarantees == NONE`, the framework does not help the user to do any ack operations, and the ack is left to the user to handle. In other cases, the framework guarantees processing semantics.

## API Changes
1. Add `NONE` type to ProcessingGuarantees
``` java
public enum ProcessingGuarantees {
      ATLEAST_ONCE,
      ATMOST_ONCE,
      EFFECTIVELY_ONCE,
      NONE
}
```

2. Delete autoAck config in FunctionConfig
``` java
public class FunctionConfig {
-    private Boolean autoAck;
}
```

## Implementation

1. In `PulsarSinkAtLeastOnceProcessor` and `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE` can be ack.

<https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276>

2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked immediately after receiving the message, no longer affected by the autoAck configuration.

https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276

3. When user call `record.ack()` in function, just  `ProcessingGuarantees == NONE` can be work.

## Plan test
The main test and assert is that when ProcessingGuarantees == NONE, the function framework will not do any ack operations for the user.

## Compatibility
1. This change will invalidate the user's setting of autoAck, which should be explained in the documentation and provide parameter verification to remind the user.
2. Runtimes of other languages ​​need to maintain consistent processing logic (python, go).



Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by 石宝迪 <wu...@icloud.com.INVALID>.
>> This I don't understand fully. When you introduce MANUAL
> ProcessingGuarantee configuration, isn't configuration universal to all
> languages? When someone uses pulsar-admin to create a function, regardless
> of the language, they can specify Processing Guarantee. If we're adding
> this ability and documenting it, isn't supporting all clients mandatory? I
> definitely think it can be separate pull requests, but it seems important
> IMO to deliver consistent developer experience to the person developing
> functions.

Yes, to support all clients. I didn't express it clearly before. Since our changes are backward compatible, before the change is released, we can open multiple PR  to iteratively implement the runtime of each language. When all languages ​​are supported, publish documentation to inform users.


Thanks,
Baodi Shi

> 2022年5月30日 22:0613,Asaf Mesika <as...@gmail.com> 写道:
> 
> On Mon, May 30, 2022 at 4:24 PM Baozi <wu...@icloud.com.invalid>
> wrote:
> 
>> Hi, Rui.
>> 
>> Thanks review.
>> 
>>> 1. API changes should also contain the changes of `Function.proto`,
>> including new `ProcessingGuarantees` option and `autoAck`.
>> 
>> 
>> I added to pip.
>> 
>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>> support similar `record.ack()` function from the context, if no, it might
>> have some API changes for different runtime we well.
>> 
>> 
>> I added to Compatibility. The goal of this PIP is to keep other language
>> runtimes consistent with java, but it needs to be iterated slowly. We will
>> support java runtimes first.
>> 
>> This I don't understand fully. When you introduce MANUAL
> ProcessingGuarantee configuration, isn't configuration universal to all
> languages? When someone uses pulsar-admin to create a function, regardless
> of the language, they can specify Processing Guarantee. If we're adding
> this ability and documenting it, isn't supporting all clients mandatory? I
> definitely think it can be separate pull requests, but it seems important
> IMO to deliver consistent developer experience to the person developing
> functions.
> 
> 
> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
>>> 
>>> Hi Baodi,
>>> 
>>> Nice work. Put some suggestions below, ptal.
>>> 
>>> 1. API changes should also contain the changes of `Function.proto`,
>> including new `ProcessingGuarantees` option and `autoAck`.
>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>> support similar `record.ack()` function from the context, if no, it might
>> have some API changes for different runtime we well.
>>> 
>>> 
>>> Best,
>>> 
>>> Rui Fu
>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>> - Added --> add
>>>> 
>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
>> As
>>>> you carefully explained, ProcessingGuarantee comes does to the fact that
>>>> the function executor calls acknowledge, in specific timing:
>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>> immediately acknowledged and only then the function is executed, thus
>>>> guaranteeing it will not run more than once
>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>> finished
>>>> execution, thus it will be run at least once.
>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
>> the
>>>> message, inside the function.
>>>> 
>>>> I think if you couple that change with adding the explanation I wrote
>>>> above to the documentation it will become crystal clear (hopefully)
>> what is
>>>> a Processing Guarantee exactly and what each value signifies.
>>>> 
>>>> 3. Removing autoAck from Function Config breaks backward compatibility,
>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>> release handle breaking change?
>>>> 
>>>> 4. Regarding Implementation (1), isn't the class itself
>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>> understand how you use that enum value *inside* the class/
>>>> 
>>>> 
>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>>>> 
>>>>> Some suggestions:
>>>>> 
>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>> code.
>>>>> And documented clearly it's deprecated for the following 2~3 release.
>> And
>>>>> then delete it.
>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>> defaults
>>>>> to ATLEAST_ONCE.
>>>>> 3. Need to let users know the behavior when they call `record.ack()`
>> inside
>>>>> the function implementation.
>>>>> 
>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi Pulsar community,
>>>>>> 
>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
>> add
>>>>>> NONE delivery semantics
>>>>>> 
>>>>>> Let me know what you think.
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Baodi Shi
>>>>>> 
>>>>>> 
>>>>>> ## Motivation
>>>>>> 
>>>>>> Currently Function supports three delivery semantics, and also
>> provides
>>>>>> autoAck to control whether to automatically ack.
>>>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>>>> confusing for users to understand the relationship between these two
>>>>>> parameters.
>>>>>> 
>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>>>> `autoAck == false`, then the framework will not help the user to ack
>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>> 
>>>>>> The delivery semantics provided by Function should be clear. When the
>>>>> user
>>>>>> sets the guarantees, the framework should ensure point-to-point
>> semantic
>>>>>> processing and cannot be affected by other parameters.
>>>>>> 
>>>>>> ## Goal
>>>>>> 
>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>> 
>>>>>> The original intention of `autoAck` semantics is that users want to
>>>>>> control the timing of ack by themselves. When autoAck == false, the
>>>>>> processing semantics provided by the framework should be invalid.
>> Then we
>>>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>>>> scenario.
>>>>>> 
>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>> framework
>>>>>> does not help the user to do any ack operations, and the ack is left
>> to
>>>>> the
>>>>>> user to handle. In other cases, the framework guarantees processing
>>>>>> semantics.
>>>>>> 
>>>>>> ## API Changes
>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>> ``` java
>>>>>> public enum ProcessingGuarantees {
>>>>>> ATLEAST_ONCE,
>>>>>> ATMOST_ONCE,
>>>>>> EFFECTIVELY_ONCE,
>>>>>> NONE
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>> ``` java
>>>>>> public class FunctionConfig {
>>>>>> - private Boolean autoAck;
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> ## Implementation
>>>>>> 
>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
>> NONE`
>>>>>> can be ack.
>>>>>> 
>>>>>> <
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>> 
>>>>>> 
>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
>> acked
>>>>>> immediately after receiving the message, no longer affected by the
>>>>> autoAck
>>>>>> configuration.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>> 
>>>>>> 3. When user call `record.ack()` in function, just
>> `ProcessingGuarantees
>>>>>> == NONE` can be work.
>>>>>> 
>>>>>> ## Plan test
>>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
>> the
>>>>>> function framework will not do any ack operations for the user.
>>>>>> 
>>>>>> ## Compatibility
>>>>>> 1. This change will invalidate the user's setting of autoAck, which
>>>>> should
>>>>>> be explained in the documentation and provide parameter verification
>> to
>>>>>> remind the user.
>>>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>>>> logic (python, go).
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Best Regards,
>>>>> Neng
>>>>> 
>> 
>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
On Mon, May 30, 2022 at 4:24 PM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi, Rui.
>
> Thanks review.
>
> > 1. API changes should also contain the changes of `Function.proto`,
> including new `ProcessingGuarantees` option and `autoAck`.
>
>
> I added to pip.
>
> > 2. Please be sure the other language runtimes (like Python, Golang) do
> support similar `record.ack()` function from the context, if no, it might
> have some API changes for different runtime we well.
>
>
> I added to Compatibility. The goal of this PIP is to keep other language
> runtimes consistent with java, but it needs to be iterated slowly. We will
> support java runtimes first.
>
> This I don't understand fully. When you introduce MANUAL
ProcessingGuarantee configuration, isn't configuration universal to all
languages? When someone uses pulsar-admin to create a function, regardless
of the language, they can specify Processing Guarantee. If we're adding
this ability and documenting it, isn't supporting all clients mandatory? I
definitely think it can be separate pull requests, but it seems important
IMO to deliver consistent developer experience to the person developing
functions.



> Thanks,
> Baodi Shi
>
> > 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >
> > Hi Baodi,
> >
> > Nice work. Put some suggestions below, ptal.
> >
> > 1. API changes should also contain the changes of `Function.proto`,
> including new `ProcessingGuarantees` option and `autoAck`.
> > 2. Please be sure the other language runtimes (like Python, Golang) do
> support similar `record.ack()` function from the context, if no, it might
> have some API changes for different runtime we well.
> >
> >
> > Best,
> >
> > Rui Fu
> > 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >> 1. "Added NONE delivery semantics and delete autoAck config."
> >> - Added --> add
> >>
> >> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
> As
> >> you carefully explained, ProcessingGuarantee comes does to the fact that
> >> the function executor calls acknowledge, in specific timing:
> >> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >> immediately acknowledged and only then the function is executed, thus
> >> guaranteeing it will not run more than once
> >> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> finished
> >> execution, thus it will be run at least once.
> >> - `MANUAL` - Signals to the user that it is up to them to acknowledge
> the
> >> message, inside the function.
> >>
> >> I think if you couple that change with adding the explanation I wrote
> >> above to the documentation it will become crystal clear (hopefully)
> what is
> >> a Processing Guarantee exactly and what each value signifies.
> >>
> >> 3. Removing autoAck from Function Config breaks backward compatibility,
> >> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >> release handle breaking change?
> >>
> >> 4. Regarding Implementation (1), isn't the class itself
> >> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >> understand how you use that enum value *inside* the class/
> >>
> >>
> >> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>
> >>> Some suggestions:
> >>>
> >>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> code.
> >>> And documented clearly it's deprecated for the following 2~3 release.
> And
> >>> then delete it.
> >>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> defaults
> >>> to ATLEAST_ONCE.
> >>> 3. Need to let users know the behavior when they call `record.ack()`
> inside
> >>> the function implementation.
> >>>
> >>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> >>> wrote:
> >>>
> >>>> Hi Pulsar community,
> >>>>
> >>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
> add
> >>>> NONE delivery semantics
> >>>>
> >>>> Let me know what you think.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>
> >>>> ## Motivation
> >>>>
> >>>> Currently Function supports three delivery semantics, and also
> provides
> >>>> autoAck to control whether to automatically ack.
> >>>> Because autoAck affects the delivery semantics of Function, it can be
> >>>> confusing for users to understand the relationship between these two
> >>>> parameters.
> >>>>
> >>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
> >>>> `autoAck == false`, then the framework will not help the user to ack
> >>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>
> >>>> The delivery semantics provided by Function should be clear. When the
> >>> user
> >>>> sets the guarantees, the framework should ensure point-to-point
> semantic
> >>>> processing and cannot be affected by other parameters.
> >>>>
> >>>> ## Goal
> >>>>
> >>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>
> >>>> The original intention of `autoAck` semantics is that users want to
> >>>> control the timing of ack by themselves. When autoAck == false, the
> >>>> processing semantics provided by the framework should be invalid.
> Then we
> >>>> can add `NONE` processing semantics to replace the autoAck == false
> >>>> scenario.
> >>>>
> >>>> When the user configuration `ProcessingGuarantees == NONE`, the
> framework
> >>>> does not help the user to do any ack operations, and the ack is left
> to
> >>> the
> >>>> user to handle. In other cases, the framework guarantees processing
> >>>> semantics.
> >>>>
> >>>> ## API Changes
> >>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>> ``` java
> >>>> public enum ProcessingGuarantees {
> >>>> ATLEAST_ONCE,
> >>>> ATMOST_ONCE,
> >>>> EFFECTIVELY_ONCE,
> >>>> NONE
> >>>> }
> >>>> ```
> >>>>
> >>>> 2. Delete autoAck config in FunctionConfig
> >>>> ``` java
> >>>> public class FunctionConfig {
> >>>> - private Boolean autoAck;
> >>>> }
> >>>> ```
> >>>>
> >>>> ## Implementation
> >>>>
> >>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
> NONE`
> >>>> can be ack.
> >>>>
> >>>> <
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>
> >>>>
> >>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
> acked
> >>>> immediately after receiving the message, no longer affected by the
> >>> autoAck
> >>>> configuration.
> >>>>
> >>>>
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>
> >>>> 3. When user call `record.ack()` in function, just
> `ProcessingGuarantees
> >>>> == NONE` can be work.
> >>>>
> >>>> ## Plan test
> >>>> The main test and assert is that when ProcessingGuarantees == NONE,
> the
> >>>> function framework will not do any ack operations for the user.
> >>>>
> >>>> ## Compatibility
> >>>> 1. This change will invalidate the user's setting of autoAck, which
> >>> should
> >>>> be explained in the documentation and provide parameter verification
> to
> >>>> remind the user.
> >>>> 2. Runtimes of other languages need to maintain consistent processing
> >>>> logic (python, go).
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Neng
> >>>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Baozi <wu...@icloud.com.INVALID>.
Hi, Rui.

Thanks review.

> 1. API changes should also contain the changes of `Function.proto`, including new `ProcessingGuarantees` option and `autoAck`.


I added to pip.

> 2. Please be sure the other language runtimes (like Python, Golang) do support similar `record.ack()` function from the context, if no, it might have some API changes for different runtime we well.


I added to Compatibility. The goal of this PIP is to keep other language runtimes consistent with java, but it needs to be iterated slowly. We will support java runtimes first.

Thanks,
Baodi Shi

> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> 
> Hi Baodi,
> 
> Nice work. Put some suggestions below, ptal.
> 
> 1. API changes should also contain the changes of `Function.proto`, including new `ProcessingGuarantees` option and `autoAck`.
> 2. Please be sure the other language runtimes (like Python, Golang) do support similar `record.ack()` function from the context, if no, it might have some API changes for different runtime we well.
> 
> 
> Best,
> 
> Rui Fu
> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>> 1. "Added NONE delivery semantics and delete autoAck config."
>> - Added --> add
>> 
>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
>> you carefully explained, ProcessingGuarantee comes does to the fact that
>> the function executor calls acknowledge, in specific timing:
>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>> immediately acknowledged and only then the function is executed, thus
>> guaranteeing it will not run more than once
>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
>> execution, thus it will be run at least once.
>> - `MANUAL` - Signals to the user that it is up to them to acknowledge the
>> message, inside the function.
>> 
>> I think if you couple that change with adding the explanation I wrote
>> above to the documentation it will become crystal clear (hopefully) what is
>> a Processing Guarantee exactly and what each value signifies.
>> 
>> 3. Removing autoAck from Function Config breaks backward compatibility,
>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>> release handle breaking change?
>> 
>> 4. Regarding Implementation (1), isn't the class itself
>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>> understand how you use that enum value *inside* the class/
>> 
>> 
>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>> 
>>> Some suggestions:
>>> 
>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
>>> And documented clearly it's deprecated for the following 2~3 release. And
>>> then delete it.
>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
>>> to ATLEAST_ONCE.
>>> 3. Need to let users know the behavior when they call `record.ack()` inside
>>> the function implementation.
>>> 
>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
>>> wrote:
>>> 
>>>> Hi Pulsar community,
>>>> 
>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function add
>>>> NONE delivery semantics
>>>> 
>>>> Let me know what you think.
>>>> 
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>> 
>>>> ## Motivation
>>>> 
>>>> Currently Function supports three delivery semantics, and also provides
>>>> autoAck to control whether to automatically ack.
>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>> confusing for users to understand the relationship between these two
>>>> parameters.
>>>> 
>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>> `autoAck == false`, then the framework will not help the user to ack
>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>> 
>>>> The delivery semantics provided by Function should be clear. When the
>>> user
>>>> sets the guarantees, the framework should ensure point-to-point semantic
>>>> processing and cannot be affected by other parameters.
>>>> 
>>>> ## Goal
>>>> 
>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>> 
>>>> The original intention of `autoAck` semantics is that users want to
>>>> control the timing of ack by themselves. When autoAck == false, the
>>>> processing semantics provided by the framework should be invalid. Then we
>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>> scenario.
>>>> 
>>>> When the user configuration `ProcessingGuarantees == NONE`, the framework
>>>> does not help the user to do any ack operations, and the ack is left to
>>> the
>>>> user to handle. In other cases, the framework guarantees processing
>>>> semantics.
>>>> 
>>>> ## API Changes
>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>> ``` java
>>>> public enum ProcessingGuarantees {
>>>> ATLEAST_ONCE,
>>>> ATMOST_ONCE,
>>>> EFFECTIVELY_ONCE,
>>>> NONE
>>>> }
>>>> ```
>>>> 
>>>> 2. Delete autoAck config in FunctionConfig
>>>> ``` java
>>>> public class FunctionConfig {
>>>> - private Boolean autoAck;
>>>> }
>>>> ```
>>>> 
>>>> ## Implementation
>>>> 
>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
>>>> can be ack.
>>>> 
>>>> <
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>> 
>>>> 
>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
>>>> immediately after receiving the message, no longer affected by the
>>> autoAck
>>>> configuration.
>>>> 
>>>> 
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>> 
>>>> 3. When user call `record.ack()` in function, just `ProcessingGuarantees
>>>> == NONE` can be work.
>>>> 
>>>> ## Plan test
>>>> The main test and assert is that when ProcessingGuarantees == NONE, the
>>>> function framework will not do any ack operations for the user.
>>>> 
>>>> ## Compatibility
>>>> 1. This change will invalidate the user's setting of autoAck, which
>>> should
>>>> be explained in the documentation and provide parameter verification to
>>>> remind the user.
>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>> logic (python, go).
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best Regards,
>>> Neng
>>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
Regarding
>
>
>    1. I would add: Validate existing ProcessingGuarantee test for
>    AtMostOnce, AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>
> Remove "I would add:"

On Tue, May 31, 2022 at 12:24 PM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi, Asaf.
>
> Thanks review.
>
> >> I'm not entirely sure that is accurate. The Effectively-Once as I
> > understand it is achieved using transactions, thus the consumption of
> that
> > message and production of any messages, as a result, are considered one
> > atomic unit - either message acknowledged and messages produced or
> neither.
>
>
> Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE
> + Message Deduplication.
>
> @Neng Lu @Rui Fu Can help make sure?
>
> >> I would issue a WARN when reading configuring the function (thus emitted
> > once) when the user actively configured autoAck=false and warn them that
> > this configuration is deprecated and they should switch to the MANUAL
> > ProcessingGuarantee configuration option.
>
>
> Added to API Change(2)
>
> >> suggest you clarify what existing behavior remains for backward
> > compatibility with the appropriate comment that this is deprecated and
> will
> > be removed.
>
> Yes, I have rewritten it, please see Implementation(1)
>
> > 5. Regarding Test Plan
> > * I would add: Validate the test of autoAck=false still works (you
> haven't
> > broken anything)
> > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> > AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>
>
> Nice, I added to PIP.
>
>
> Thanks,
> Baodi Shi
>
> > 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> >
> > Thanks for applying the fixes.
> >
> > 1. Regarding
> >
> >>
> >>   - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
> >>   finished execution. Depends on pulsar deduplication, and provides
> >>   end-to-end effectively once processing.
> >>
> >> I'm not entirely sure that is accurate. The Effectively-Once as I
> > understand it is achieved using transactions, thus the consumption of
> that
> > message and production of any messages, as a result, are considered one
> > atomic unit - either message acknowledged and messages produced or
> neither.
> >
> > 2. Regarding
> >
> >>
> >>   1. Indication of autoAck is deprecated, and not use it in the code.
> >>   (and also Function.proto)
> >>
> >> * I would issue a WARN when reading configuring the function (thus
> emitted
> > once) when the user actively configured autoAck=false and warn them that
> > this configuration is deprecated and they should switch to the MANUAL
> > ProcessingGuarantee configuration option.
> >
> > 3. Regarding
> >
> >>
> >>   1. When the delivery semantic is ATMOST_ONCE, the message will be
> >>   acked immediately after receiving the message, no longer affected by
> the
> >>   autoAck configuration.
> >>
> >> I suggest you clarify what existing behavior remains for backward
> > compatibility with the appropriate comment that this is deprecated and
> will
> > be removed.
> >
> > 4. Regarding
> >
> >>
> >>   1.
> >>
> >>   When user call record.ack() in function, just ProcessingGuarantees ==
> >>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
> >>   call record.ack() is invalid(print warn log).
> >>
> >> That might blast WARN messages to the user. Perhaps save the fact that
> you
> > have printed a WARN message once and only print when the variable is not
> > set?
> >
> > 5. Regarding Test Plan
> > * I would add: Validate the test of autoAck=false still works (you
> haven't
> > broken anything)
> > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> > AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >
> >
> >
> > On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> > wrote:
> >
> >> Hi, Mesika.
> >>
> >> Thanks review.
> >>
> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> NONE.
> >> As
> >>>> you carefully explained, ProcessingGuarantee comes does to the fact
> that
> >>>> the function executor calls acknowledge, in specific timing:
> >>
> >>
> >> Added, Refer to the latest pip.
> >> https://github.com/apache/pulsar/issues/15560
> >>
> >>>> 3. Removing autoAck from Function Config breaks backward
> compatibility,
> >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >>>> release handle breaking change?
> >>
> >> As suggested by @neng, They will be marked as deprecated first and
> clearly
> >> stated in the documentation. Remove it after 2~3 release.
> >>
> >>>> 4. Regarding Implementation (1), isn't the class itself
> >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>> understand how you use that enum value *inside* the class/
> >>
> >> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
> PIP
> >> API Changes(3)
> >>
> >> Thanks,
> >> Baodi Shi
> >>
> >>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >>>
> >>> Hi Baodi,
> >>>
> >>> Nice work. Put some suggestions below, ptal.
> >>>
> >>> 1. API changes should also contain the changes of `Function.proto`,
> >> including new `ProcessingGuarantees` option and `autoAck`.
> >>> 2. Please be sure the other language runtimes (like Python, Golang) do
> >> support similar `record.ack()` function from the context, if no, it
> might
> >> have some API changes for different runtime we well.
> >>>
> >>>
> >>> Best,
> >>>
> >>> Rui Fu
> >>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >>>> 1. "Added NONE delivery semantics and delete autoAck config."
> >>>> - Added --> add
> >>>>
> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> NONE.
> >> As
> >>>> you carefully explained, ProcessingGuarantee comes does to the fact
> that
> >>>> the function executor calls acknowledge, in specific timing:
> >>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >>>> immediately acknowledged and only then the function is executed, thus
> >>>> guaranteeing it will not run more than once
> >>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> >> finished
> >>>> execution, thus it will be run at least once.
> >>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
> >> the
> >>>> message, inside the function.
> >>>>
> >>>> I think if you couple that change with adding the explanation I wrote
> >>>> above to the documentation it will become crystal clear (hopefully)
> >> what is
> >>>> a Processing Guarantee exactly and what each value signifies.
> >>>>
> >>>> 3. Removing autoAck from Function Config breaks backward
> compatibility,
> >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >>>> release handle breaking change?
> >>>>
> >>>> 4. Regarding Implementation (1), isn't the class itself
> >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>> understand how you use that enum value *inside* the class/
> >>>>
> >>>>
> >>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>>>
> >>>>> Some suggestions:
> >>>>>
> >>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> >> code.
> >>>>> And documented clearly it's deprecated for the following 2~3 release.
> >> And
> >>>>> then delete it.
> >>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> >> defaults
> >>>>> to ATLEAST_ONCE.
> >>>>> 3. Need to let users know the behavior when they call `record.ack()`
> >> inside
> >>>>> the function implementation.
> >>>>>
> >>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> >> .invalid>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Pulsar community,
> >>>>>>
> >>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
> >> add
> >>>>>> NONE delivery semantics
> >>>>>>
> >>>>>> Let me know what you think.
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Baodi Shi
> >>>>>>
> >>>>>>
> >>>>>> ## Motivation
> >>>>>>
> >>>>>> Currently Function supports three delivery semantics, and also
> >> provides
> >>>>>> autoAck to control whether to automatically ack.
> >>>>>> Because autoAck affects the delivery semantics of Function, it can
> be
> >>>>>> confusing for users to understand the relationship between these two
> >>>>>> parameters.
> >>>>>>
> >>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
> and
> >>>>>> `autoAck == false`, then the framework will not help the user to ack
> >>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>>>
> >>>>>> The delivery semantics provided by Function should be clear. When
> the
> >>>>> user
> >>>>>> sets the guarantees, the framework should ensure point-to-point
> >> semantic
> >>>>>> processing and cannot be affected by other parameters.
> >>>>>>
> >>>>>> ## Goal
> >>>>>>
> >>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>>>
> >>>>>> The original intention of `autoAck` semantics is that users want to
> >>>>>> control the timing of ack by themselves. When autoAck == false, the
> >>>>>> processing semantics provided by the framework should be invalid.
> >> Then we
> >>>>>> can add `NONE` processing semantics to replace the autoAck == false
> >>>>>> scenario.
> >>>>>>
> >>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
> >> framework
> >>>>>> does not help the user to do any ack operations, and the ack is left
> >> to
> >>>>> the
> >>>>>> user to handle. In other cases, the framework guarantees processing
> >>>>>> semantics.
> >>>>>>
> >>>>>> ## API Changes
> >>>>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>>>> ``` java
> >>>>>> public enum ProcessingGuarantees {
> >>>>>> ATLEAST_ONCE,
> >>>>>> ATMOST_ONCE,
> >>>>>> EFFECTIVELY_ONCE,
> >>>>>> NONE
> >>>>>> }
> >>>>>> ```
> >>>>>>
> >>>>>> 2. Delete autoAck config in FunctionConfig
> >>>>>> ``` java
> >>>>>> public class FunctionConfig {
> >>>>>> - private Boolean autoAck;
> >>>>>> }
> >>>>>> ```
> >>>>>>
> >>>>>> ## Implementation
> >>>>>>
> >>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
> >> NONE`
> >>>>>> can be ack.
> >>>>>>
> >>>>>> <
> >>>>>>
> >>>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>>>
> >>>>>>
> >>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
> >> acked
> >>>>>> immediately after receiving the message, no longer affected by the
> >>>>> autoAck
> >>>>>> configuration.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>>>
> >>>>>> 3. When user call `record.ack()` in function, just
> >> `ProcessingGuarantees
> >>>>>> == NONE` can be work.
> >>>>>>
> >>>>>> ## Plan test
> >>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
> >> the
> >>>>>> function framework will not do any ack operations for the user.
> >>>>>>
> >>>>>> ## Compatibility
> >>>>>> 1. This change will invalidate the user's setting of autoAck, which
> >>>>> should
> >>>>>> be explained in the documentation and provide parameter verification
> >> to
> >>>>>> remind the user.
> >>>>>> 2. Runtimes of other languages need to maintain consistent
> processing
> >>>>>> logic (python, go).
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards,
> >>>>> Neng
> >>>>>
> >>
> >>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Neng Lu <fr...@gmail.com>.
Hi Baodi,

Thanks for the reply and update of the PIP.

1. Pulsar Functions currently isn't integrated with the Transaction feature
yet, so there's no EXACTLY_ONCE support.

2. And Yes, "EFFECTIVELY_ONCE = ATLEAST_ONCE + Message Deduplication"



On Tue, May 31, 2022 at 9:16 AM 石宝迪 <wu...@icloud.com.invalid>
wrote:

> >> If you fail to start the function, you immediately break people's
> > functions when they upgrade to this version. How about notifying them
> once
> > via logger (WARN)?
>
>
> I tend to fail. Although this breaks the current logic. but the current
> implementation can be considered is a bug.
>
> > It will flood their logs if they used it wrong. Maybe write to log once?
>
>
> Agree, I changed PIP.
>
> Thanks,
> Baodi Shi
>
> > 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
> >
> > Hi Baodi,
> >
> > Regarding
> >
> >>
> >>   1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
> >>   be true. If the validation fails, let the function fail to start (This
> >>   temporarily resolves the configuration ambiguity). When autoAck is
> >>   subsequently removed, the message will be acked immediately after
> receiving
> >>   the message.
> >>
> >>
> >> If you fail to start the function, you immediately break people's
> > functions when they upgrade to this version. How about notifying them
> once
> > via logger (WARN)?
> >
> > Regarding
> >
> >>
> >>   1.
> >>
> >>
> >>   When user call record.ack() in function, just ProcessingGuarantees ==
> >>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
> >>   call record.ack() is invalid(print warn log).
> >>
> >> It will flood their logs if they used it wrong. Maybe write to log once?
> >
> > On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> > wrote:
> >
> >> Hi, Asaf.
> >>
> >> Thanks review.
> >>
> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>> understand it is achieved using transactions, thus the consumption of
> >> that
> >>> message and production of any messages, as a result, are considered one
> >>> atomic unit - either message acknowledged and messages produced or
> >> neither.
> >>
> >>
> >> Not using transactions now, I understand: EFFECTIVELY_ONCE =
> ATLEAST_ONCE
> >> + Message Deduplication.
> >>
> >> @Neng Lu @Rui Fu Can help make sure?
> >>
> >>>> I would issue a WARN when reading configuring the function (thus
> emitted
> >>> once) when the user actively configured autoAck=false and warn them
> that
> >>> this configuration is deprecated and they should switch to the MANUAL
> >>> ProcessingGuarantee configuration option.
> >>
> >>
> >> Added to API Change(2)
> >>
> >>>> suggest you clarify what existing behavior remains for backward
> >>> compatibility with the appropriate comment that this is deprecated and
> >> will
> >>> be removed.
> >>
> >> Yes, I have rewritten it, please see Implementation(1)
> >>
> >>> 5. Regarding Test Plan
> >>> * I would add: Validate the test of autoAck=false still works (you
> >> haven't
> >>> broken anything)
> >>> * I would add: Validate existing ProcessingGuarantee test for
> AtMostOnce,
> >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>
> >>
> >> Nice, I added to PIP.
> >>
> >>
> >> Thanks,
> >> Baodi Shi
> >>
> >>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> >>>
> >>> Thanks for applying the fixes.
> >>>
> >>> 1. Regarding
> >>>
> >>>>
> >>>>  - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
> >>>>  finished execution. Depends on pulsar deduplication, and provides
> >>>>  end-to-end effectively once processing.
> >>>>
> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>> understand it is achieved using transactions, thus the consumption of
> >> that
> >>> message and production of any messages, as a result, are considered one
> >>> atomic unit - either message acknowledged and messages produced or
> >> neither.
> >>>
> >>> 2. Regarding
> >>>
> >>>>
> >>>>  1. Indication of autoAck is deprecated, and not use it in the code.
> >>>>  (and also Function.proto)
> >>>>
> >>>> * I would issue a WARN when reading configuring the function (thus
> >> emitted
> >>> once) when the user actively configured autoAck=false and warn them
> that
> >>> this configuration is deprecated and they should switch to the MANUAL
> >>> ProcessingGuarantee configuration option.
> >>>
> >>> 3. Regarding
> >>>
> >>>>
> >>>>  1. When the delivery semantic is ATMOST_ONCE, the message will be
> >>>>  acked immediately after receiving the message, no longer affected by
> >> the
> >>>>  autoAck configuration.
> >>>>
> >>>> I suggest you clarify what existing behavior remains for backward
> >>> compatibility with the appropriate comment that this is deprecated and
> >> will
> >>> be removed.
> >>>
> >>> 4. Regarding
> >>>
> >>>>
> >>>>  1.
> >>>>
> >>>>  When user call record.ack() in function, just ProcessingGuarantees ==
> >>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
> user
> >>>>  call record.ack() is invalid(print warn log).
> >>>>
> >>>> That might blast WARN messages to the user. Perhaps save the fact that
> >> you
> >>> have printed a WARN message once and only print when the variable is
> not
> >>> set?
> >>>
> >>> 5. Regarding Test Plan
> >>> * I would add: Validate the test of autoAck=false still works (you
> >> haven't
> >>> broken anything)
> >>> * I would add: Validate existing ProcessingGuarantee test for
> AtMostOnce,
> >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>>
> >>>
> >>>
> >>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
> >> .invalid>
> >>> wrote:
> >>>
> >>>> Hi, Mesika.
> >>>>
> >>>> Thanks review.
> >>>>
> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >> NONE.
> >>>> As
> >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
> >> that
> >>>>>> the function executor calls acknowledge, in specific timing:
> >>>>
> >>>>
> >>>> Added, Refer to the latest pip.
> >>>> https://github.com/apache/pulsar/issues/15560
> >>>>
> >>>>>> 3. Removing autoAck from Function Config breaks backward
> >> compatibility,
> >>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> Pulsar
> >>>>>> release handle breaking change?
> >>>>
> >>>> As suggested by @neng, They will be marked as deprecated first and
> >> clearly
> >>>> stated in the documentation. Remove it after 2~3 release.
> >>>>
> >>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>> understand how you use that enum value *inside* the class/
> >>>>
> >>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
> >> PIP
> >>>> API Changes(3)
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >>>>>
> >>>>> Hi Baodi,
> >>>>>
> >>>>> Nice work. Put some suggestions below, ptal.
> >>>>>
> >>>>> 1. API changes should also contain the changes of `Function.proto`,
> >>>> including new `ProcessingGuarantees` option and `autoAck`.
> >>>>> 2. Please be sure the other language runtimes (like Python, Golang)
> do
> >>>> support similar `record.ack()` function from the context, if no, it
> >> might
> >>>> have some API changes for different runtime we well.
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Rui Fu
> >>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
> >>>>>> - Added --> add
> >>>>>>
> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >> NONE.
> >>>> As
> >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
> >> that
> >>>>>> the function executor calls acknowledge, in specific timing:
> >>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >>>>>> immediately acknowledged and only then the function is executed,
> thus
> >>>>>> guaranteeing it will not run more than once
> >>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> >>>> finished
> >>>>>> execution, thus it will be run at least once.
> >>>>>> - `MANUAL` - Signals to the user that it is up to them to
> acknowledge
> >>>> the
> >>>>>> message, inside the function.
> >>>>>>
> >>>>>> I think if you couple that change with adding the explanation I
> wrote
> >>>>>> above to the documentation it will become crystal clear (hopefully)
> >>>> what is
> >>>>>> a Processing Guarantee exactly and what each value signifies.
> >>>>>>
> >>>>>> 3. Removing autoAck from Function Config breaks backward
> >> compatibility,
> >>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> Pulsar
> >>>>>> release handle breaking change?
> >>>>>>
> >>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>> understand how you use that enum value *inside* the class/
> >>>>>>
> >>>>>>
> >>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Some suggestions:
> >>>>>>>
> >>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> >>>> code.
> >>>>>>> And documented clearly it's deprecated for the following 2~3
> release.
> >>>> And
> >>>>>>> then delete it.
> >>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> >>>> defaults
> >>>>>>> to ATLEAST_ONCE.
> >>>>>>> 3. Need to let users know the behavior when they call
> `record.ack()`
> >>>> inside
> >>>>>>> the function implementation.
> >>>>>>>
> >>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> >>>> .invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Pulsar community,
> >>>>>>>>
> >>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
> Function
> >>>> add
> >>>>>>>> NONE delivery semantics
> >>>>>>>>
> >>>>>>>> Let me know what you think.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Baodi Shi
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ## Motivation
> >>>>>>>>
> >>>>>>>> Currently Function supports three delivery semantics, and also
> >>>> provides
> >>>>>>>> autoAck to control whether to automatically ack.
> >>>>>>>> Because autoAck affects the delivery semantics of Function, it can
> >> be
> >>>>>>>> confusing for users to understand the relationship between these
> two
> >>>>>>>> parameters.
> >>>>>>>>
> >>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
> >> and
> >>>>>>>> `autoAck == false`, then the framework will not help the user to
> ack
> >>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>>>>>
> >>>>>>>> The delivery semantics provided by Function should be clear. When
> >> the
> >>>>>>> user
> >>>>>>>> sets the guarantees, the framework should ensure point-to-point
> >>>> semantic
> >>>>>>>> processing and cannot be affected by other parameters.
> >>>>>>>>
> >>>>>>>> ## Goal
> >>>>>>>>
> >>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>>>>>
> >>>>>>>> The original intention of `autoAck` semantics is that users want
> to
> >>>>>>>> control the timing of ack by themselves. When autoAck == false,
> the
> >>>>>>>> processing semantics provided by the framework should be invalid.
> >>>> Then we
> >>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
> false
> >>>>>>>> scenario.
> >>>>>>>>
> >>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
> >>>> framework
> >>>>>>>> does not help the user to do any ack operations, and the ack is
> left
> >>>> to
> >>>>>>> the
> >>>>>>>> user to handle. In other cases, the framework guarantees
> processing
> >>>>>>>> semantics.
> >>>>>>>>
> >>>>>>>> ## API Changes
> >>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>>>>>> ``` java
> >>>>>>>> public enum ProcessingGuarantees {
> >>>>>>>> ATLEAST_ONCE,
> >>>>>>>> ATMOST_ONCE,
> >>>>>>>> EFFECTIVELY_ONCE,
> >>>>>>>> NONE
> >>>>>>>> }
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> 2. Delete autoAck config in FunctionConfig
> >>>>>>>> ``` java
> >>>>>>>> public class FunctionConfig {
> >>>>>>>> - private Boolean autoAck;
> >>>>>>>> }
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> ## Implementation
> >>>>>>>>
> >>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
> !=
> >>>> NONE`
> >>>>>>>> can be ack.
> >>>>>>>>
> >>>>>>>> <
> >>>>>>>>
> >>>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
> be
> >>>> acked
> >>>>>>>> immediately after receiving the message, no longer affected by the
> >>>>>>> autoAck
> >>>>>>>> configuration.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>>>>>
> >>>>>>>> 3. When user call `record.ack()` in function, just
> >>>> `ProcessingGuarantees
> >>>>>>>> == NONE` can be work.
> >>>>>>>>
> >>>>>>>> ## Plan test
> >>>>>>>> The main test and assert is that when ProcessingGuarantees ==
> NONE,
> >>>> the
> >>>>>>>> function framework will not do any ack operations for the user.
> >>>>>>>>
> >>>>>>>> ## Compatibility
> >>>>>>>> 1. This change will invalidate the user's setting of autoAck,
> which
> >>>>>>> should
> >>>>>>>> be explained in the documentation and provide parameter
> verification
> >>>> to
> >>>>>>>> remind the user.
> >>>>>>>> 2. Runtimes of other languages need to maintain consistent
> >> processing
> >>>>>>>> logic (python, go).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards,
> >>>>>>> Neng
> >>>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

-- 
Best Regards,
Neng

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Baodi Shi <ba...@icloud.com.INVALID>.
Hi, I found some problems with `FunctionWindows` when I implemented this pip, and I added it to PIP: Implementation[4].

After I submit the first PR, you can refer to it.

Thanks,
Baodi Shi

> On Jun 2, 2022, at 18:4232, 石宝迪 <wu...@icloud.com.INVALID> wrote:
> 
>> Ok. I would add in the Compatability change another section with bold or
>> capital letters to highlight you're creating a breaking change. It should
>> be reflected in the release notes somehow - don't know the process for that.
> 
> Ok, I added to `Incompatible case`. PTAL.
> 
> 
> Thanks,
> Baodi Shi
> 
>> 2022年6月2日 18:0404,Asaf Mesika <as...@gmail.com> 写道:
>> 
>>> 
>>> I tend to fail. Although this breaks the current logic. but the current
>>> implementation can be considered is a bug.
>> 
>> Ok. I would add in the Compatability change another section with bold or
>> capital letters to highlight you're creating a breaking change. It should
>> be reflected in the release notes somehow - don't know the process for that.
>> 
>> On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wu...@icloud.com.invalid>
>> wrote:
>> 
>>>>> If you fail to start the function, you immediately break people's
>>>> functions when they upgrade to this version. How about notifying them
>>> once
>>>> via logger (WARN)?
>>> 
>>> 
>>> I tend to fail. Although this breaks the current logic. but the current
>>> implementation can be considered is a bug.
>>> 
>>>> It will flood their logs if they used it wrong. Maybe write to log once?
>>> 
>>> 
>>> Agree, I changed PIP.
>>> 
>>> Thanks,
>>> Baodi Shi
>>> 
>>>> 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
>>>> 
>>>> Hi Baodi,
>>>> 
>>>> Regarding
>>>> 
>>>>> 
>>>>> 1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>>>>> be true. If the validation fails, let the function fail to start (This
>>>>> temporarily resolves the configuration ambiguity). When autoAck is
>>>>> subsequently removed, the message will be acked immediately after
>>> receiving
>>>>> the message.
>>>>> 
>>>>> 
>>>>> If you fail to start the function, you immediately break people's
>>>> functions when they upgrade to this version. How about notifying them
>>> once
>>>> via logger (WARN)?
>>>> 
>>>> Regarding
>>>> 
>>>>> 
>>>>> 1.
>>>>> 
>>>>> 
>>>>> When user call record.ack() in function, just ProcessingGuarantees ==
>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>>>> call record.ack() is invalid(print warn log).
>>>>> 
>>>>> It will flood their logs if they used it wrong. Maybe write to log once?
>>>> 
>>>> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolongbao@icloud.com
>>> .invalid>
>>>> wrote:
>>>> 
>>>>> Hi, Asaf.
>>>>> 
>>>>> Thanks review.
>>>>> 
>>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>>> understand it is achieved using transactions, thus the consumption of
>>>>> that
>>>>>> message and production of any messages, as a result, are considered one
>>>>>> atomic unit - either message acknowledged and messages produced or
>>>>> neither.
>>>>> 
>>>>> 
>>>>> Not using transactions now, I understand: EFFECTIVELY_ONCE =
>>> ATLEAST_ONCE
>>>>> + Message Deduplication.
>>>>> 
>>>>> @Neng Lu @Rui Fu Can help make sure?
>>>>> 
>>>>>>> I would issue a WARN when reading configuring the function (thus
>>> emitted
>>>>>> once) when the user actively configured autoAck=false and warn them
>>> that
>>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>>> ProcessingGuarantee configuration option.
>>>>> 
>>>>> 
>>>>> Added to API Change(2)
>>>>> 
>>>>>>> suggest you clarify what existing behavior remains for backward
>>>>>> compatibility with the appropriate comment that this is deprecated and
>>>>> will
>>>>>> be removed.
>>>>> 
>>>>> Yes, I have rewritten it, please see Implementation(1)
>>>>> 
>>>>>> 5. Regarding Test Plan
>>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>>> haven't
>>>>>> broken anything)
>>>>>> * I would add: Validate existing ProcessingGuarantee test for
>>> AtMostOnce,
>>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>>> 
>>>>> 
>>>>> Nice, I added to PIP.
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Baodi Shi
>>>>> 
>>>>>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
>>>>>> 
>>>>>> Thanks for applying the fixes.
>>>>>> 
>>>>>> 1. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>>>>>> finished execution. Depends on pulsar deduplication, and provides
>>>>>>> end-to-end effectively once processing.
>>>>>>> 
>>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>>> understand it is achieved using transactions, thus the consumption of
>>>>> that
>>>>>> message and production of any messages, as a result, are considered one
>>>>>> atomic unit - either message acknowledged and messages produced or
>>>>> neither.
>>>>>> 
>>>>>> 2. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1. Indication of autoAck is deprecated, and not use it in the code.
>>>>>>> (and also Function.proto)
>>>>>>> 
>>>>>>> * I would issue a WARN when reading configuring the function (thus
>>>>> emitted
>>>>>> once) when the user actively configured autoAck=false and warn them
>>> that
>>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>>> ProcessingGuarantee configuration option.
>>>>>> 
>>>>>> 3. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be
>>>>>>> acked immediately after receiving the message, no longer affected by
>>>>> the
>>>>>>> autoAck configuration.
>>>>>>> 
>>>>>>> I suggest you clarify what existing behavior remains for backward
>>>>>> compatibility with the appropriate comment that this is deprecated and
>>>>> will
>>>>>> be removed.
>>>>>> 
>>>>>> 4. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1.
>>>>>>> 
>>>>>>> When user call record.ack() in function, just ProcessingGuarantees ==
>>>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
>>> user
>>>>>>> call record.ack() is invalid(print warn log).
>>>>>>> 
>>>>>>> That might blast WARN messages to the user. Perhaps save the fact that
>>>>> you
>>>>>> have printed a WARN message once and only print when the variable is
>>> not
>>>>>> set?
>>>>>> 
>>>>>> 5. Regarding Test Plan
>>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>>> haven't
>>>>>> broken anything)
>>>>>> * I would add: Validate existing ProcessingGuarantee test for
>>> AtMostOnce,
>>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
>>>>> .invalid>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi, Mesika.
>>>>>>> 
>>>>>>> Thanks review.
>>>>>>> 
>>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>>> NONE.
>>>>>>> As
>>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>>> that
>>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>>> 
>>>>>>> 
>>>>>>> Added, Refer to the latest pip.
>>>>>>> https://github.com/apache/pulsar/issues/15560
>>>>>>> 
>>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>>> compatibility,
>>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>>> Pulsar
>>>>>>>>> release handle breaking change?
>>>>>>> 
>>>>>>> As suggested by @neng, They will be marked as deprecated first and
>>>>> clearly
>>>>>>> stated in the documentation. Remove it after 2~3 release.
>>>>>>> 
>>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>>> 
>>>>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
>>>>> PIP
>>>>>>> API Changes(3)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Baodi Shi
>>>>>>> 
>>>>>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
>>>>>>>> 
>>>>>>>> Hi Baodi,
>>>>>>>> 
>>>>>>>> Nice work. Put some suggestions below, ptal.
>>>>>>>> 
>>>>>>>> 1. API changes should also contain the changes of `Function.proto`,
>>>>>>> including new `ProcessingGuarantees` option and `autoAck`.
>>>>>>>> 2. Please be sure the other language runtimes (like Python, Golang)
>>> do
>>>>>>> support similar `record.ack()` function from the context, if no, it
>>>>> might
>>>>>>> have some API changes for different runtime we well.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Rui Fu
>>>>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>>>>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>>>>>>> - Added --> add
>>>>>>>>> 
>>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>>> NONE.
>>>>>>> As
>>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>>> that
>>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>>>>>>> immediately acknowledged and only then the function is executed,
>>> thus
>>>>>>>>> guaranteeing it will not run more than once
>>>>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>>>>>>> finished
>>>>>>>>> execution, thus it will be run at least once.
>>>>>>>>> - `MANUAL` - Signals to the user that it is up to them to
>>> acknowledge
>>>>>>> the
>>>>>>>>> message, inside the function.
>>>>>>>>> 
>>>>>>>>> I think if you couple that change with adding the explanation I
>>> wrote
>>>>>>>>> above to the documentation it will become crystal clear (hopefully)
>>>>>>> what is
>>>>>>>>> a Processing Guarantee exactly and what each value signifies.
>>>>>>>>> 
>>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>>> compatibility,
>>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>>> Pulsar
>>>>>>>>> release handle breaking change?
>>>>>>>>> 
>>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Some suggestions:
>>>>>>>>>> 
>>>>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>>>>>>> code.
>>>>>>>>>> And documented clearly it's deprecated for the following 2~3
>>> release.
>>>>>>> And
>>>>>>>>>> then delete it.
>>>>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>>>>>>> defaults
>>>>>>>>>> to ATLEAST_ONCE.
>>>>>>>>>> 3. Need to let users know the behavior when they call
>>> `record.ack()`
>>>>>>> inside
>>>>>>>>>> the function implementation.
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
>>>>>>> .invalid>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Pulsar community,
>>>>>>>>>>> 
>>>>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
>>> Function
>>>>>>> add
>>>>>>>>>>> NONE delivery semantics
>>>>>>>>>>> 
>>>>>>>>>>> Let me know what you think.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Baodi Shi
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> ## Motivation
>>>>>>>>>>> 
>>>>>>>>>>> Currently Function supports three delivery semantics, and also
>>>>>>> provides
>>>>>>>>>>> autoAck to control whether to automatically ack.
>>>>>>>>>>> Because autoAck affects the delivery semantics of Function, it can
>>>>> be
>>>>>>>>>>> confusing for users to understand the relationship between these
>>> two
>>>>>>>>>>> parameters.
>>>>>>>>>>> 
>>>>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
>>>>> and
>>>>>>>>>>> `autoAck == false`, then the framework will not help the user to
>>> ack
>>>>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>>>>>>> 
>>>>>>>>>>> The delivery semantics provided by Function should be clear. When
>>>>> the
>>>>>>>>>> user
>>>>>>>>>>> sets the guarantees, the framework should ensure point-to-point
>>>>>>> semantic
>>>>>>>>>>> processing and cannot be affected by other parameters.
>>>>>>>>>>> 
>>>>>>>>>>> ## Goal
>>>>>>>>>>> 
>>>>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>>>>>>> 
>>>>>>>>>>> The original intention of `autoAck` semantics is that users want
>>> to
>>>>>>>>>>> control the timing of ack by themselves. When autoAck == false,
>>> the
>>>>>>>>>>> processing semantics provided by the framework should be invalid.
>>>>>>> Then we
>>>>>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
>>> false
>>>>>>>>>>> scenario.
>>>>>>>>>>> 
>>>>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>>>>>>> framework
>>>>>>>>>>> does not help the user to do any ack operations, and the ack is
>>> left
>>>>>>> to
>>>>>>>>>> the
>>>>>>>>>>> user to handle. In other cases, the framework guarantees
>>> processing
>>>>>>>>>>> semantics.
>>>>>>>>>>> 
>>>>>>>>>>> ## API Changes
>>>>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>>>>>>> ``` java
>>>>>>>>>>> public enum ProcessingGuarantees {
>>>>>>>>>>> ATLEAST_ONCE,
>>>>>>>>>>> ATMOST_ONCE,
>>>>>>>>>>> EFFECTIVELY_ONCE,
>>>>>>>>>>> NONE
>>>>>>>>>>> }
>>>>>>>>>>> ```
>>>>>>>>>>> 
>>>>>>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>>>>>>> ``` java
>>>>>>>>>>> public class FunctionConfig {
>>>>>>>>>>> - private Boolean autoAck;
>>>>>>>>>>> }
>>>>>>>>>>> ```
>>>>>>>>>>> 
>>>>>>>>>>> ## Implementation
>>>>>>>>>>> 
>>>>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
>>> !=
>>>>>>> NONE`
>>>>>>>>>>> can be ack.
>>>>>>>>>>> 
>>>>>>>>>>> <
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
>>> be
>>>>>>> acked
>>>>>>>>>>> immediately after receiving the message, no longer affected by the
>>>>>>>>>> autoAck
>>>>>>>>>>> configuration.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>>>>>>> 
>>>>>>>>>>> 3. When user call `record.ack()` in function, just
>>>>>>> `ProcessingGuarantees
>>>>>>>>>>> == NONE` can be work.
>>>>>>>>>>> 
>>>>>>>>>>> ## Plan test
>>>>>>>>>>> The main test and assert is that when ProcessingGuarantees ==
>>> NONE,
>>>>>>> the
>>>>>>>>>>> function framework will not do any ack operations for the user.
>>>>>>>>>>> 
>>>>>>>>>>> ## Compatibility
>>>>>>>>>>> 1. This change will invalidate the user's setting of autoAck,
>>> which
>>>>>>>>>> should
>>>>>>>>>>> be explained in the documentation and provide parameter
>>> verification
>>>>>>> to
>>>>>>>>>>> remind the user.
>>>>>>>>>>> 2. Runtimes of other languages need to maintain consistent
>>>>> processing
>>>>>>>>>>> logic (python, go).
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Best Regards,
>>>>>>>>>> Neng
>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by PengHui Li <pe...@apache.org>.
> It should be reflected in the release notes somehow - don't know the
process for that.

Yes, we are using the label `release/important-notice` to track the
important things we need to highlight in the release not.
I have added the label.

I support the proposal.

Thanks,
Penghui

On Thu, Jun 2, 2022 at 6:42 PM 石宝迪 <wu...@icloud.com.invalid>
wrote:

> > Ok. I would add in the Compatability change another section with bold or
> > capital letters to highlight you're creating a breaking change. It should
> > be reflected in the release notes somehow - don't know the process for
> that.
>
> Ok, I added to `Incompatible case`. PTAL.
>
>
> Thanks,
> Baodi Shi
>
> > 2022年6月2日 18:0404,Asaf Mesika <as...@gmail.com> 写道:
> >
> >>
> >> I tend to fail. Although this breaks the current logic. but the current
> >> implementation can be considered is a bug.
> >
> > Ok. I would add in the Compatability change another section with bold or
> > capital letters to highlight you're creating a breaking change. It should
> > be reflected in the release notes somehow - don't know the process for
> that.
> >
> > On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wu...@icloud.com.invalid>
> > wrote:
> >
> >>>> If you fail to start the function, you immediately break people's
> >>> functions when they upgrade to this version. How about notifying them
> >> once
> >>> via logger (WARN)?
> >>
> >>
> >> I tend to fail. Although this breaks the current logic. but the current
> >> implementation can be considered is a bug.
> >>
> >>> It will flood their logs if they used it wrong. Maybe write to log
> once?
> >>
> >>
> >> Agree, I changed PIP.
> >>
> >> Thanks,
> >> Baodi Shi
> >>
> >>> 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
> >>>
> >>> Hi Baodi,
> >>>
> >>> Regarding
> >>>
> >>>>
> >>>>  1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
> >>>>  be true. If the validation fails, let the function fail to start
> (This
> >>>>  temporarily resolves the configuration ambiguity). When autoAck is
> >>>>  subsequently removed, the message will be acked immediately after
> >> receiving
> >>>>  the message.
> >>>>
> >>>>
> >>>> If you fail to start the function, you immediately break people's
> >>> functions when they upgrade to this version. How about notifying them
> >> once
> >>> via logger (WARN)?
> >>>
> >>> Regarding
> >>>
> >>>>
> >>>>  1.
> >>>>
> >>>>
> >>>>  When user call record.ack() in function, just ProcessingGuarantees ==
> >>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
> user
> >>>>  call record.ack() is invalid(print warn log).
> >>>>
> >>>> It will flood their logs if they used it wrong. Maybe write to log
> once?
> >>>
> >>> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolongbao@icloud.com
> >> .invalid>
> >>> wrote:
> >>>
> >>>> Hi, Asaf.
> >>>>
> >>>> Thanks review.
> >>>>
> >>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>>>> understand it is achieved using transactions, thus the consumption of
> >>>> that
> >>>>> message and production of any messages, as a result, are considered
> one
> >>>>> atomic unit - either message acknowledged and messages produced or
> >>>> neither.
> >>>>
> >>>>
> >>>> Not using transactions now, I understand: EFFECTIVELY_ONCE =
> >> ATLEAST_ONCE
> >>>> + Message Deduplication.
> >>>>
> >>>> @Neng Lu @Rui Fu Can help make sure?
> >>>>
> >>>>>> I would issue a WARN when reading configuring the function (thus
> >> emitted
> >>>>> once) when the user actively configured autoAck=false and warn them
> >> that
> >>>>> this configuration is deprecated and they should switch to the MANUAL
> >>>>> ProcessingGuarantee configuration option.
> >>>>
> >>>>
> >>>> Added to API Change(2)
> >>>>
> >>>>>> suggest you clarify what existing behavior remains for backward
> >>>>> compatibility with the appropriate comment that this is deprecated
> and
> >>>> will
> >>>>> be removed.
> >>>>
> >>>> Yes, I have rewritten it, please see Implementation(1)
> >>>>
> >>>>> 5. Regarding Test Plan
> >>>>> * I would add: Validate the test of autoAck=false still works (you
> >>>> haven't
> >>>>> broken anything)
> >>>>> * I would add: Validate existing ProcessingGuarantee test for
> >> AtMostOnce,
> >>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>>>
> >>>>
> >>>> Nice, I added to PIP.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> >>>>>
> >>>>> Thanks for applying the fixes.
> >>>>>
> >>>>> 1. Regarding
> >>>>>
> >>>>>>
> >>>>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
> >>>>>> finished execution. Depends on pulsar deduplication, and provides
> >>>>>> end-to-end effectively once processing.
> >>>>>>
> >>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>>>> understand it is achieved using transactions, thus the consumption of
> >>>> that
> >>>>> message and production of any messages, as a result, are considered
> one
> >>>>> atomic unit - either message acknowledged and messages produced or
> >>>> neither.
> >>>>>
> >>>>> 2. Regarding
> >>>>>
> >>>>>>
> >>>>>> 1. Indication of autoAck is deprecated, and not use it in the code.
> >>>>>> (and also Function.proto)
> >>>>>>
> >>>>>> * I would issue a WARN when reading configuring the function (thus
> >>>> emitted
> >>>>> once) when the user actively configured autoAck=false and warn them
> >> that
> >>>>> this configuration is deprecated and they should switch to the MANUAL
> >>>>> ProcessingGuarantee configuration option.
> >>>>>
> >>>>> 3. Regarding
> >>>>>
> >>>>>>
> >>>>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be
> >>>>>> acked immediately after receiving the message, no longer affected by
> >>>> the
> >>>>>> autoAck configuration.
> >>>>>>
> >>>>>> I suggest you clarify what existing behavior remains for backward
> >>>>> compatibility with the appropriate comment that this is deprecated
> and
> >>>> will
> >>>>> be removed.
> >>>>>
> >>>>> 4. Regarding
> >>>>>
> >>>>>>
> >>>>>> 1.
> >>>>>>
> >>>>>> When user call record.ack() in function, just ProcessingGuarantees
> ==
> >>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
> >> user
> >>>>>> call record.ack() is invalid(print warn log).
> >>>>>>
> >>>>>> That might blast WARN messages to the user. Perhaps save the fact
> that
> >>>> you
> >>>>> have printed a WARN message once and only print when the variable is
> >> not
> >>>>> set?
> >>>>>
> >>>>> 5. Regarding Test Plan
> >>>>> * I would add: Validate the test of autoAck=false still works (you
> >>>> haven't
> >>>>> broken anything)
> >>>>> * I would add: Validate existing ProcessingGuarantee test for
> >> AtMostOnce,
> >>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
> >>>> .invalid>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi, Mesika.
> >>>>>>
> >>>>>> Thanks review.
> >>>>>>
> >>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >>>> NONE.
> >>>>>> As
> >>>>>>>> you carefully explained, ProcessingGuarantee comes does to the
> fact
> >>>> that
> >>>>>>>> the function executor calls acknowledge, in specific timing:
> >>>>>>
> >>>>>>
> >>>>>> Added, Refer to the latest pip.
> >>>>>> https://github.com/apache/pulsar/issues/15560
> >>>>>>
> >>>>>>>> 3. Removing autoAck from Function Config breaks backward
> >>>> compatibility,
> >>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> >> Pulsar
> >>>>>>>> release handle breaking change?
> >>>>>>
> >>>>>> As suggested by @neng, They will be marked as deprecated first and
> >>>> clearly
> >>>>>> stated in the documentation. Remove it after 2~3 release.
> >>>>>>
> >>>>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>>>> understand how you use that enum value *inside* the class/
> >>>>>>
> >>>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the
> latest
> >>>> PIP
> >>>>>> API Changes(3)
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Baodi Shi
> >>>>>>
> >>>>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >>>>>>>
> >>>>>>> Hi Baodi,
> >>>>>>>
> >>>>>>> Nice work. Put some suggestions below, ptal.
> >>>>>>>
> >>>>>>> 1. API changes should also contain the changes of `Function.proto`,
> >>>>>> including new `ProcessingGuarantees` option and `autoAck`.
> >>>>>>> 2. Please be sure the other language runtimes (like Python, Golang)
> >> do
> >>>>>> support similar `record.ack()` function from the context, if no, it
> >>>> might
> >>>>>> have some API changes for different runtime we well.
> >>>>>>>
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> Rui Fu
> >>>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >>>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
> >>>>>>>> - Added --> add
> >>>>>>>>
> >>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >>>> NONE.
> >>>>>> As
> >>>>>>>> you carefully explained, ProcessingGuarantee comes does to the
> fact
> >>>> that
> >>>>>>>> the function executor calls acknowledge, in specific timing:
> >>>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >>>>>>>> immediately acknowledged and only then the function is executed,
> >> thus
> >>>>>>>> guaranteeing it will not run more than once
> >>>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> >>>>>> finished
> >>>>>>>> execution, thus it will be run at least once.
> >>>>>>>> - `MANUAL` - Signals to the user that it is up to them to
> >> acknowledge
> >>>>>> the
> >>>>>>>> message, inside the function.
> >>>>>>>>
> >>>>>>>> I think if you couple that change with adding the explanation I
> >> wrote
> >>>>>>>> above to the documentation it will become crystal clear
> (hopefully)
> >>>>>> what is
> >>>>>>>> a Processing Guarantee exactly and what each value signifies.
> >>>>>>>>
> >>>>>>>> 3. Removing autoAck from Function Config breaks backward
> >>>> compatibility,
> >>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> >> Pulsar
> >>>>>>>> release handle breaking change?
> >>>>>>>>
> >>>>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>>>> understand how you use that enum value *inside* the class/
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com>
> wrote:
> >>>>>>>>
> >>>>>>>>> Some suggestions:
> >>>>>>>>>
> >>>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in
> the
> >>>>>> code.
> >>>>>>>>> And documented clearly it's deprecated for the following 2~3
> >> release.
> >>>>>> And
> >>>>>>>>> then delete it.
> >>>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> >>>>>> defaults
> >>>>>>>>> to ATLEAST_ONCE.
> >>>>>>>>> 3. Need to let users know the behavior when they call
> >> `record.ack()`
> >>>>>> inside
> >>>>>>>>> the function implementation.
> >>>>>>>>>
> >>>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <
> wudixiaolongbao@icloud.com
> >>>>>> .invalid>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Pulsar community,
> >>>>>>>>>>
> >>>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
> >> Function
> >>>>>> add
> >>>>>>>>>> NONE delivery semantics
> >>>>>>>>>>
> >>>>>>>>>> Let me know what you think.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Baodi Shi
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ## Motivation
> >>>>>>>>>>
> >>>>>>>>>> Currently Function supports three delivery semantics, and also
> >>>>>> provides
> >>>>>>>>>> autoAck to control whether to automatically ack.
> >>>>>>>>>> Because autoAck affects the delivery semantics of Function, it
> can
> >>>> be
> >>>>>>>>>> confusing for users to understand the relationship between these
> >> two
> >>>>>>>>>> parameters.
> >>>>>>>>>>
> >>>>>>>>>> For example, when the user configures `Guarantees ==
> ATMOST_ONCE`
> >>>> and
> >>>>>>>>>> `autoAck == false`, then the framework will not help the user to
> >> ack
> >>>>>>>>>> messages, and the processing semantics may become
> `ATLEAST_ONCE`.
> >>>>>>>>>>
> >>>>>>>>>> The delivery semantics provided by Function should be clear.
> When
> >>>> the
> >>>>>>>>> user
> >>>>>>>>>> sets the guarantees, the framework should ensure point-to-point
> >>>>>> semantic
> >>>>>>>>>> processing and cannot be affected by other parameters.
> >>>>>>>>>>
> >>>>>>>>>> ## Goal
> >>>>>>>>>>
> >>>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>>>>>>>
> >>>>>>>>>> The original intention of `autoAck` semantics is that users want
> >> to
> >>>>>>>>>> control the timing of ack by themselves. When autoAck == false,
> >> the
> >>>>>>>>>> processing semantics provided by the framework should be
> invalid.
> >>>>>> Then we
> >>>>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
> >> false
> >>>>>>>>>> scenario.
> >>>>>>>>>>
> >>>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
> >>>>>> framework
> >>>>>>>>>> does not help the user to do any ack operations, and the ack is
> >> left
> >>>>>> to
> >>>>>>>>> the
> >>>>>>>>>> user to handle. In other cases, the framework guarantees
> >> processing
> >>>>>>>>>> semantics.
> >>>>>>>>>>
> >>>>>>>>>> ## API Changes
> >>>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>>>>>>>> ``` java
> >>>>>>>>>> public enum ProcessingGuarantees {
> >>>>>>>>>> ATLEAST_ONCE,
> >>>>>>>>>> ATMOST_ONCE,
> >>>>>>>>>> EFFECTIVELY_ONCE,
> >>>>>>>>>> NONE
> >>>>>>>>>> }
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>> 2. Delete autoAck config in FunctionConfig
> >>>>>>>>>> ``` java
> >>>>>>>>>> public class FunctionConfig {
> >>>>>>>>>> - private Boolean autoAck;
> >>>>>>>>>> }
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>> ## Implementation
> >>>>>>>>>>
> >>>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
> >> !=
> >>>>>> NONE`
> >>>>>>>>>> can be ack.
> >>>>>>>>>>
> >>>>>>>>>> <
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
> >> be
> >>>>>> acked
> >>>>>>>>>> immediately after receiving the message, no longer affected by
> the
> >>>>>>>>> autoAck
> >>>>>>>>>> configuration.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>>>>>>>
> >>>>>>>>>> 3. When user call `record.ack()` in function, just
> >>>>>> `ProcessingGuarantees
> >>>>>>>>>> == NONE` can be work.
> >>>>>>>>>>
> >>>>>>>>>> ## Plan test
> >>>>>>>>>> The main test and assert is that when ProcessingGuarantees ==
> >> NONE,
> >>>>>> the
> >>>>>>>>>> function framework will not do any ack operations for the user.
> >>>>>>>>>>
> >>>>>>>>>> ## Compatibility
> >>>>>>>>>> 1. This change will invalidate the user's setting of autoAck,
> >> which
> >>>>>>>>> should
> >>>>>>>>>> be explained in the documentation and provide parameter
> >> verification
> >>>>>> to
> >>>>>>>>>> remind the user.
> >>>>>>>>>> 2. Runtimes of other languages need to maintain consistent
> >>>> processing
> >>>>>>>>>> logic (python, go).
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best Regards,
> >>>>>>>>> Neng
> >>>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by 石宝迪 <wu...@icloud.com.INVALID>.
> Ok. I would add in the Compatability change another section with bold or
> capital letters to highlight you're creating a breaking change. It should
> be reflected in the release notes somehow - don't know the process for that.

Ok, I added to `Incompatible case`. PTAL.


Thanks,
Baodi Shi

> 2022年6月2日 18:0404,Asaf Mesika <as...@gmail.com> 写道:
> 
>> 
>> I tend to fail. Although this breaks the current logic. but the current
>> implementation can be considered is a bug.
> 
> Ok. I would add in the Compatability change another section with bold or
> capital letters to highlight you're creating a breaking change. It should
> be reflected in the release notes somehow - don't know the process for that.
> 
> On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wu...@icloud.com.invalid>
> wrote:
> 
>>>> If you fail to start the function, you immediately break people's
>>> functions when they upgrade to this version. How about notifying them
>> once
>>> via logger (WARN)?
>> 
>> 
>> I tend to fail. Although this breaks the current logic. but the current
>> implementation can be considered is a bug.
>> 
>>> It will flood their logs if they used it wrong. Maybe write to log once?
>> 
>> 
>> Agree, I changed PIP.
>> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
>>> 
>>> Hi Baodi,
>>> 
>>> Regarding
>>> 
>>>> 
>>>>  1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>>>>  be true. If the validation fails, let the function fail to start (This
>>>>  temporarily resolves the configuration ambiguity). When autoAck is
>>>>  subsequently removed, the message will be acked immediately after
>> receiving
>>>>  the message.
>>>> 
>>>> 
>>>> If you fail to start the function, you immediately break people's
>>> functions when they upgrade to this version. How about notifying them
>> once
>>> via logger (WARN)?
>>> 
>>> Regarding
>>> 
>>>> 
>>>>  1.
>>>> 
>>>> 
>>>>  When user call record.ack() in function, just ProcessingGuarantees ==
>>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>>>  call record.ack() is invalid(print warn log).
>>>> 
>>>> It will flood their logs if they used it wrong. Maybe write to log once?
>>> 
>>> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolongbao@icloud.com
>> .invalid>
>>> wrote:
>>> 
>>>> Hi, Asaf.
>>>> 
>>>> Thanks review.
>>>> 
>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>> understand it is achieved using transactions, thus the consumption of
>>>> that
>>>>> message and production of any messages, as a result, are considered one
>>>>> atomic unit - either message acknowledged and messages produced or
>>>> neither.
>>>> 
>>>> 
>>>> Not using transactions now, I understand: EFFECTIVELY_ONCE =
>> ATLEAST_ONCE
>>>> + Message Deduplication.
>>>> 
>>>> @Neng Lu @Rui Fu Can help make sure?
>>>> 
>>>>>> I would issue a WARN when reading configuring the function (thus
>> emitted
>>>>> once) when the user actively configured autoAck=false and warn them
>> that
>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>> ProcessingGuarantee configuration option.
>>>> 
>>>> 
>>>> Added to API Change(2)
>>>> 
>>>>>> suggest you clarify what existing behavior remains for backward
>>>>> compatibility with the appropriate comment that this is deprecated and
>>>> will
>>>>> be removed.
>>>> 
>>>> Yes, I have rewritten it, please see Implementation(1)
>>>> 
>>>>> 5. Regarding Test Plan
>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>> haven't
>>>>> broken anything)
>>>>> * I would add: Validate existing ProcessingGuarantee test for
>> AtMostOnce,
>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>> 
>>>> 
>>>> Nice, I added to PIP.
>>>> 
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
>>>>> 
>>>>> Thanks for applying the fixes.
>>>>> 
>>>>> 1. Regarding
>>>>> 
>>>>>> 
>>>>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>>>>> finished execution. Depends on pulsar deduplication, and provides
>>>>>> end-to-end effectively once processing.
>>>>>> 
>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>> understand it is achieved using transactions, thus the consumption of
>>>> that
>>>>> message and production of any messages, as a result, are considered one
>>>>> atomic unit - either message acknowledged and messages produced or
>>>> neither.
>>>>> 
>>>>> 2. Regarding
>>>>> 
>>>>>> 
>>>>>> 1. Indication of autoAck is deprecated, and not use it in the code.
>>>>>> (and also Function.proto)
>>>>>> 
>>>>>> * I would issue a WARN when reading configuring the function (thus
>>>> emitted
>>>>> once) when the user actively configured autoAck=false and warn them
>> that
>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>> ProcessingGuarantee configuration option.
>>>>> 
>>>>> 3. Regarding
>>>>> 
>>>>>> 
>>>>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be
>>>>>> acked immediately after receiving the message, no longer affected by
>>>> the
>>>>>> autoAck configuration.
>>>>>> 
>>>>>> I suggest you clarify what existing behavior remains for backward
>>>>> compatibility with the appropriate comment that this is deprecated and
>>>> will
>>>>> be removed.
>>>>> 
>>>>> 4. Regarding
>>>>> 
>>>>>> 
>>>>>> 1.
>>>>>> 
>>>>>> When user call record.ack() in function, just ProcessingGuarantees ==
>>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
>> user
>>>>>> call record.ack() is invalid(print warn log).
>>>>>> 
>>>>>> That might blast WARN messages to the user. Perhaps save the fact that
>>>> you
>>>>> have printed a WARN message once and only print when the variable is
>> not
>>>>> set?
>>>>> 
>>>>> 5. Regarding Test Plan
>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>> haven't
>>>>> broken anything)
>>>>> * I would add: Validate existing ProcessingGuarantee test for
>> AtMostOnce,
>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
>>>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi, Mesika.
>>>>>> 
>>>>>> Thanks review.
>>>>>> 
>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>> NONE.
>>>>>> As
>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>> that
>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>> 
>>>>>> 
>>>>>> Added, Refer to the latest pip.
>>>>>> https://github.com/apache/pulsar/issues/15560
>>>>>> 
>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>> compatibility,
>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>> Pulsar
>>>>>>>> release handle breaking change?
>>>>>> 
>>>>>> As suggested by @neng, They will be marked as deprecated first and
>>>> clearly
>>>>>> stated in the documentation. Remove it after 2~3 release.
>>>>>> 
>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>> 
>>>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
>>>> PIP
>>>>>> API Changes(3)
>>>>>> 
>>>>>> Thanks,
>>>>>> Baodi Shi
>>>>>> 
>>>>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
>>>>>>> 
>>>>>>> Hi Baodi,
>>>>>>> 
>>>>>>> Nice work. Put some suggestions below, ptal.
>>>>>>> 
>>>>>>> 1. API changes should also contain the changes of `Function.proto`,
>>>>>> including new `ProcessingGuarantees` option and `autoAck`.
>>>>>>> 2. Please be sure the other language runtimes (like Python, Golang)
>> do
>>>>>> support similar `record.ack()` function from the context, if no, it
>>>> might
>>>>>> have some API changes for different runtime we well.
>>>>>>> 
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Rui Fu
>>>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>>>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>>>>>> - Added --> add
>>>>>>>> 
>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>> NONE.
>>>>>> As
>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>> that
>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>>>>>> immediately acknowledged and only then the function is executed,
>> thus
>>>>>>>> guaranteeing it will not run more than once
>>>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>>>>>> finished
>>>>>>>> execution, thus it will be run at least once.
>>>>>>>> - `MANUAL` - Signals to the user that it is up to them to
>> acknowledge
>>>>>> the
>>>>>>>> message, inside the function.
>>>>>>>> 
>>>>>>>> I think if you couple that change with adding the explanation I
>> wrote
>>>>>>>> above to the documentation it will become crystal clear (hopefully)
>>>>>> what is
>>>>>>>> a Processing Guarantee exactly and what each value signifies.
>>>>>>>> 
>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>> compatibility,
>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>> Pulsar
>>>>>>>> release handle breaking change?
>>>>>>>> 
>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Some suggestions:
>>>>>>>>> 
>>>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>>>>>> code.
>>>>>>>>> And documented clearly it's deprecated for the following 2~3
>> release.
>>>>>> And
>>>>>>>>> then delete it.
>>>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>>>>>> defaults
>>>>>>>>> to ATLEAST_ONCE.
>>>>>>>>> 3. Need to let users know the behavior when they call
>> `record.ack()`
>>>>>> inside
>>>>>>>>> the function implementation.
>>>>>>>>> 
>>>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
>>>>>> .invalid>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Pulsar community,
>>>>>>>>>> 
>>>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
>> Function
>>>>>> add
>>>>>>>>>> NONE delivery semantics
>>>>>>>>>> 
>>>>>>>>>> Let me know what you think.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Baodi Shi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> ## Motivation
>>>>>>>>>> 
>>>>>>>>>> Currently Function supports three delivery semantics, and also
>>>>>> provides
>>>>>>>>>> autoAck to control whether to automatically ack.
>>>>>>>>>> Because autoAck affects the delivery semantics of Function, it can
>>>> be
>>>>>>>>>> confusing for users to understand the relationship between these
>> two
>>>>>>>>>> parameters.
>>>>>>>>>> 
>>>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
>>>> and
>>>>>>>>>> `autoAck == false`, then the framework will not help the user to
>> ack
>>>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>>>>>> 
>>>>>>>>>> The delivery semantics provided by Function should be clear. When
>>>> the
>>>>>>>>> user
>>>>>>>>>> sets the guarantees, the framework should ensure point-to-point
>>>>>> semantic
>>>>>>>>>> processing and cannot be affected by other parameters.
>>>>>>>>>> 
>>>>>>>>>> ## Goal
>>>>>>>>>> 
>>>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>>>>>> 
>>>>>>>>>> The original intention of `autoAck` semantics is that users want
>> to
>>>>>>>>>> control the timing of ack by themselves. When autoAck == false,
>> the
>>>>>>>>>> processing semantics provided by the framework should be invalid.
>>>>>> Then we
>>>>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
>> false
>>>>>>>>>> scenario.
>>>>>>>>>> 
>>>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>>>>>> framework
>>>>>>>>>> does not help the user to do any ack operations, and the ack is
>> left
>>>>>> to
>>>>>>>>> the
>>>>>>>>>> user to handle. In other cases, the framework guarantees
>> processing
>>>>>>>>>> semantics.
>>>>>>>>>> 
>>>>>>>>>> ## API Changes
>>>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>>>>>> ``` java
>>>>>>>>>> public enum ProcessingGuarantees {
>>>>>>>>>> ATLEAST_ONCE,
>>>>>>>>>> ATMOST_ONCE,
>>>>>>>>>> EFFECTIVELY_ONCE,
>>>>>>>>>> NONE
>>>>>>>>>> }
>>>>>>>>>> ```
>>>>>>>>>> 
>>>>>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>>>>>> ``` java
>>>>>>>>>> public class FunctionConfig {
>>>>>>>>>> - private Boolean autoAck;
>>>>>>>>>> }
>>>>>>>>>> ```
>>>>>>>>>> 
>>>>>>>>>> ## Implementation
>>>>>>>>>> 
>>>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
>> !=
>>>>>> NONE`
>>>>>>>>>> can be ack.
>>>>>>>>>> 
>>>>>>>>>> <
>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
>> be
>>>>>> acked
>>>>>>>>>> immediately after receiving the message, no longer affected by the
>>>>>>>>> autoAck
>>>>>>>>>> configuration.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>>>>>> 
>>>>>>>>>> 3. When user call `record.ack()` in function, just
>>>>>> `ProcessingGuarantees
>>>>>>>>>> == NONE` can be work.
>>>>>>>>>> 
>>>>>>>>>> ## Plan test
>>>>>>>>>> The main test and assert is that when ProcessingGuarantees ==
>> NONE,
>>>>>> the
>>>>>>>>>> function framework will not do any ack operations for the user.
>>>>>>>>>> 
>>>>>>>>>> ## Compatibility
>>>>>>>>>> 1. This change will invalidate the user's setting of autoAck,
>> which
>>>>>>>>> should
>>>>>>>>>> be explained in the documentation and provide parameter
>> verification
>>>>>> to
>>>>>>>>>> remind the user.
>>>>>>>>>> 2. Runtimes of other languages need to maintain consistent
>>>> processing
>>>>>>>>>> logic (python, go).
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Neng
>>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
>
> I tend to fail. Although this breaks the current logic. but the current
> implementation can be considered is a bug.

Ok. I would add in the Compatability change another section with bold or
capital letters to highlight you're creating a breaking change. It should
be reflected in the release notes somehow - don't know the process for that.

On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wu...@icloud.com.invalid>
wrote:

> >> If you fail to start the function, you immediately break people's
> > functions when they upgrade to this version. How about notifying them
> once
> > via logger (WARN)?
>
>
> I tend to fail. Although this breaks the current logic. but the current
> implementation can be considered is a bug.
>
> > It will flood their logs if they used it wrong. Maybe write to log once?
>
>
> Agree, I changed PIP.
>
> Thanks,
> Baodi Shi
>
> > 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
> >
> > Hi Baodi,
> >
> > Regarding
> >
> >>
> >>   1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
> >>   be true. If the validation fails, let the function fail to start (This
> >>   temporarily resolves the configuration ambiguity). When autoAck is
> >>   subsequently removed, the message will be acked immediately after
> receiving
> >>   the message.
> >>
> >>
> >> If you fail to start the function, you immediately break people's
> > functions when they upgrade to this version. How about notifying them
> once
> > via logger (WARN)?
> >
> > Regarding
> >
> >>
> >>   1.
> >>
> >>
> >>   When user call record.ack() in function, just ProcessingGuarantees ==
> >>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
> >>   call record.ack() is invalid(print warn log).
> >>
> >> It will flood their logs if they used it wrong. Maybe write to log once?
> >
> > On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> > wrote:
> >
> >> Hi, Asaf.
> >>
> >> Thanks review.
> >>
> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>> understand it is achieved using transactions, thus the consumption of
> >> that
> >>> message and production of any messages, as a result, are considered one
> >>> atomic unit - either message acknowledged and messages produced or
> >> neither.
> >>
> >>
> >> Not using transactions now, I understand: EFFECTIVELY_ONCE =
> ATLEAST_ONCE
> >> + Message Deduplication.
> >>
> >> @Neng Lu @Rui Fu Can help make sure?
> >>
> >>>> I would issue a WARN when reading configuring the function (thus
> emitted
> >>> once) when the user actively configured autoAck=false and warn them
> that
> >>> this configuration is deprecated and they should switch to the MANUAL
> >>> ProcessingGuarantee configuration option.
> >>
> >>
> >> Added to API Change(2)
> >>
> >>>> suggest you clarify what existing behavior remains for backward
> >>> compatibility with the appropriate comment that this is deprecated and
> >> will
> >>> be removed.
> >>
> >> Yes, I have rewritten it, please see Implementation(1)
> >>
> >>> 5. Regarding Test Plan
> >>> * I would add: Validate the test of autoAck=false still works (you
> >> haven't
> >>> broken anything)
> >>> * I would add: Validate existing ProcessingGuarantee test for
> AtMostOnce,
> >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>
> >>
> >> Nice, I added to PIP.
> >>
> >>
> >> Thanks,
> >> Baodi Shi
> >>
> >>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> >>>
> >>> Thanks for applying the fixes.
> >>>
> >>> 1. Regarding
> >>>
> >>>>
> >>>>  - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
> >>>>  finished execution. Depends on pulsar deduplication, and provides
> >>>>  end-to-end effectively once processing.
> >>>>
> >>>> I'm not entirely sure that is accurate. The Effectively-Once as I
> >>> understand it is achieved using transactions, thus the consumption of
> >> that
> >>> message and production of any messages, as a result, are considered one
> >>> atomic unit - either message acknowledged and messages produced or
> >> neither.
> >>>
> >>> 2. Regarding
> >>>
> >>>>
> >>>>  1. Indication of autoAck is deprecated, and not use it in the code.
> >>>>  (and also Function.proto)
> >>>>
> >>>> * I would issue a WARN when reading configuring the function (thus
> >> emitted
> >>> once) when the user actively configured autoAck=false and warn them
> that
> >>> this configuration is deprecated and they should switch to the MANUAL
> >>> ProcessingGuarantee configuration option.
> >>>
> >>> 3. Regarding
> >>>
> >>>>
> >>>>  1. When the delivery semantic is ATMOST_ONCE, the message will be
> >>>>  acked immediately after receiving the message, no longer affected by
> >> the
> >>>>  autoAck configuration.
> >>>>
> >>>> I suggest you clarify what existing behavior remains for backward
> >>> compatibility with the appropriate comment that this is deprecated and
> >> will
> >>> be removed.
> >>>
> >>> 4. Regarding
> >>>
> >>>>
> >>>>  1.
> >>>>
> >>>>  When user call record.ack() in function, just ProcessingGuarantees ==
> >>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
> user
> >>>>  call record.ack() is invalid(print warn log).
> >>>>
> >>>> That might blast WARN messages to the user. Perhaps save the fact that
> >> you
> >>> have printed a WARN message once and only print when the variable is
> not
> >>> set?
> >>>
> >>> 5. Regarding Test Plan
> >>> * I would add: Validate the test of autoAck=false still works (you
> >> haven't
> >>> broken anything)
> >>> * I would add: Validate existing ProcessingGuarantee test for
> AtMostOnce,
> >>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >>>
> >>>
> >>>
> >>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
> >> .invalid>
> >>> wrote:
> >>>
> >>>> Hi, Mesika.
> >>>>
> >>>> Thanks review.
> >>>>
> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >> NONE.
> >>>> As
> >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
> >> that
> >>>>>> the function executor calls acknowledge, in specific timing:
> >>>>
> >>>>
> >>>> Added, Refer to the latest pip.
> >>>> https://github.com/apache/pulsar/issues/15560
> >>>>
> >>>>>> 3. Removing autoAck from Function Config breaks backward
> >> compatibility,
> >>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> Pulsar
> >>>>>> release handle breaking change?
> >>>>
> >>>> As suggested by @neng, They will be marked as deprecated first and
> >> clearly
> >>>> stated in the documentation. Remove it after 2~3 release.
> >>>>
> >>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>> understand how you use that enum value *inside* the class/
> >>>>
> >>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
> >> PIP
> >>>> API Changes(3)
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >>>>>
> >>>>> Hi Baodi,
> >>>>>
> >>>>> Nice work. Put some suggestions below, ptal.
> >>>>>
> >>>>> 1. API changes should also contain the changes of `Function.proto`,
> >>>> including new `ProcessingGuarantees` option and `autoAck`.
> >>>>> 2. Please be sure the other language runtimes (like Python, Golang)
> do
> >>>> support similar `record.ack()` function from the context, if no, it
> >> might
> >>>> have some API changes for different runtime we well.
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Rui Fu
> >>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
> >>>>>> - Added --> add
> >>>>>>
> >>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> >> NONE.
> >>>> As
> >>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
> >> that
> >>>>>> the function executor calls acknowledge, in specific timing:
> >>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >>>>>> immediately acknowledged and only then the function is executed,
> thus
> >>>>>> guaranteeing it will not run more than once
> >>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> >>>> finished
> >>>>>> execution, thus it will be run at least once.
> >>>>>> - `MANUAL` - Signals to the user that it is up to them to
> acknowledge
> >>>> the
> >>>>>> message, inside the function.
> >>>>>>
> >>>>>> I think if you couple that change with adding the explanation I
> wrote
> >>>>>> above to the documentation it will become crystal clear (hopefully)
> >>>> what is
> >>>>>> a Processing Guarantee exactly and what each value signifies.
> >>>>>>
> >>>>>> 3. Removing autoAck from Function Config breaks backward
> >> compatibility,
> >>>>>> thus shouldn't this be strongly reflected in the PIP - how does
> Pulsar
> >>>>>> release handle breaking change?
> >>>>>>
> >>>>>> 4. Regarding Implementation (1), isn't the class itself
> >>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>>>> understand how you use that enum value *inside* the class/
> >>>>>>
> >>>>>>
> >>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Some suggestions:
> >>>>>>>
> >>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> >>>> code.
> >>>>>>> And documented clearly it's deprecated for the following 2~3
> release.
> >>>> And
> >>>>>>> then delete it.
> >>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> >>>> defaults
> >>>>>>> to ATLEAST_ONCE.
> >>>>>>> 3. Need to let users know the behavior when they call
> `record.ack()`
> >>>> inside
> >>>>>>> the function implementation.
> >>>>>>>
> >>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> >>>> .invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Pulsar community,
> >>>>>>>>
> >>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
> Function
> >>>> add
> >>>>>>>> NONE delivery semantics
> >>>>>>>>
> >>>>>>>> Let me know what you think.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Baodi Shi
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ## Motivation
> >>>>>>>>
> >>>>>>>> Currently Function supports three delivery semantics, and also
> >>>> provides
> >>>>>>>> autoAck to control whether to automatically ack.
> >>>>>>>> Because autoAck affects the delivery semantics of Function, it can
> >> be
> >>>>>>>> confusing for users to understand the relationship between these
> two
> >>>>>>>> parameters.
> >>>>>>>>
> >>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
> >> and
> >>>>>>>> `autoAck == false`, then the framework will not help the user to
> ack
> >>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>>>>>
> >>>>>>>> The delivery semantics provided by Function should be clear. When
> >> the
> >>>>>>> user
> >>>>>>>> sets the guarantees, the framework should ensure point-to-point
> >>>> semantic
> >>>>>>>> processing and cannot be affected by other parameters.
> >>>>>>>>
> >>>>>>>> ## Goal
> >>>>>>>>
> >>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>>>>>
> >>>>>>>> The original intention of `autoAck` semantics is that users want
> to
> >>>>>>>> control the timing of ack by themselves. When autoAck == false,
> the
> >>>>>>>> processing semantics provided by the framework should be invalid.
> >>>> Then we
> >>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
> false
> >>>>>>>> scenario.
> >>>>>>>>
> >>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
> >>>> framework
> >>>>>>>> does not help the user to do any ack operations, and the ack is
> left
> >>>> to
> >>>>>>> the
> >>>>>>>> user to handle. In other cases, the framework guarantees
> processing
> >>>>>>>> semantics.
> >>>>>>>>
> >>>>>>>> ## API Changes
> >>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>>>>>> ``` java
> >>>>>>>> public enum ProcessingGuarantees {
> >>>>>>>> ATLEAST_ONCE,
> >>>>>>>> ATMOST_ONCE,
> >>>>>>>> EFFECTIVELY_ONCE,
> >>>>>>>> NONE
> >>>>>>>> }
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> 2. Delete autoAck config in FunctionConfig
> >>>>>>>> ``` java
> >>>>>>>> public class FunctionConfig {
> >>>>>>>> - private Boolean autoAck;
> >>>>>>>> }
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> ## Implementation
> >>>>>>>>
> >>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
> !=
> >>>> NONE`
> >>>>>>>> can be ack.
> >>>>>>>>
> >>>>>>>> <
> >>>>>>>>
> >>>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
> be
> >>>> acked
> >>>>>>>> immediately after receiving the message, no longer affected by the
> >>>>>>> autoAck
> >>>>>>>> configuration.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>>>>>
> >>>>>>>> 3. When user call `record.ack()` in function, just
> >>>> `ProcessingGuarantees
> >>>>>>>> == NONE` can be work.
> >>>>>>>>
> >>>>>>>> ## Plan test
> >>>>>>>> The main test and assert is that when ProcessingGuarantees ==
> NONE,
> >>>> the
> >>>>>>>> function framework will not do any ack operations for the user.
> >>>>>>>>
> >>>>>>>> ## Compatibility
> >>>>>>>> 1. This change will invalidate the user's setting of autoAck,
> which
> >>>>>>> should
> >>>>>>>> be explained in the documentation and provide parameter
> verification
> >>>> to
> >>>>>>>> remind the user.
> >>>>>>>> 2. Runtimes of other languages need to maintain consistent
> >> processing
> >>>>>>>> logic (python, go).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards,
> >>>>>>> Neng
> >>>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by 石宝迪 <wu...@icloud.com.INVALID>.
>> If you fail to start the function, you immediately break people's
> functions when they upgrade to this version. How about notifying them once
> via logger (WARN)?


I tend to fail. Although this breaks the current logic. but the current implementation can be considered is a bug.

> It will flood their logs if they used it wrong. Maybe write to log once?


Agree, I changed PIP.

Thanks,
Baodi Shi

> 2022年5月31日 23:5720,Asaf Mesika <as...@gmail.com> 写道:
> 
> Hi Baodi,
> 
> Regarding
> 
>> 
>>   1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>>   be true. If the validation fails, let the function fail to start (This
>>   temporarily resolves the configuration ambiguity). When autoAck is
>>   subsequently removed, the message will be acked immediately after receiving
>>   the message.
>> 
>> 
>> If you fail to start the function, you immediately break people's
> functions when they upgrade to this version. How about notifying them once
> via logger (WARN)?
> 
> Regarding
> 
>> 
>>   1.
>> 
>> 
>>   When user call record.ack() in function, just ProcessingGuarantees ==
>>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>   call record.ack() is invalid(print warn log).
>> 
>> It will flood their logs if they used it wrong. Maybe write to log once?
> 
> On Tue, May 31, 2022 at 12:24 PM Baozi <wu...@icloud.com.invalid>
> wrote:
> 
>> Hi, Asaf.
>> 
>> Thanks review.
>> 
>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>> understand it is achieved using transactions, thus the consumption of
>> that
>>> message and production of any messages, as a result, are considered one
>>> atomic unit - either message acknowledged and messages produced or
>> neither.
>> 
>> 
>> Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE
>> + Message Deduplication.
>> 
>> @Neng Lu @Rui Fu Can help make sure?
>> 
>>>> I would issue a WARN when reading configuring the function (thus emitted
>>> once) when the user actively configured autoAck=false and warn them that
>>> this configuration is deprecated and they should switch to the MANUAL
>>> ProcessingGuarantee configuration option.
>> 
>> 
>> Added to API Change(2)
>> 
>>>> suggest you clarify what existing behavior remains for backward
>>> compatibility with the appropriate comment that this is deprecated and
>> will
>>> be removed.
>> 
>> Yes, I have rewritten it, please see Implementation(1)
>> 
>>> 5. Regarding Test Plan
>>> * I would add: Validate the test of autoAck=false still works (you
>> haven't
>>> broken anything)
>>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>> 
>> 
>> Nice, I added to PIP.
>> 
>> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
>>> 
>>> Thanks for applying the fixes.
>>> 
>>> 1. Regarding
>>> 
>>>> 
>>>>  - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>>>  finished execution. Depends on pulsar deduplication, and provides
>>>>  end-to-end effectively once processing.
>>>> 
>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>> understand it is achieved using transactions, thus the consumption of
>> that
>>> message and production of any messages, as a result, are considered one
>>> atomic unit - either message acknowledged and messages produced or
>> neither.
>>> 
>>> 2. Regarding
>>> 
>>>> 
>>>>  1. Indication of autoAck is deprecated, and not use it in the code.
>>>>  (and also Function.proto)
>>>> 
>>>> * I would issue a WARN when reading configuring the function (thus
>> emitted
>>> once) when the user actively configured autoAck=false and warn them that
>>> this configuration is deprecated and they should switch to the MANUAL
>>> ProcessingGuarantee configuration option.
>>> 
>>> 3. Regarding
>>> 
>>>> 
>>>>  1. When the delivery semantic is ATMOST_ONCE, the message will be
>>>>  acked immediately after receiving the message, no longer affected by
>> the
>>>>  autoAck configuration.
>>>> 
>>>> I suggest you clarify what existing behavior remains for backward
>>> compatibility with the appropriate comment that this is deprecated and
>> will
>>> be removed.
>>> 
>>> 4. Regarding
>>> 
>>>> 
>>>>  1.
>>>> 
>>>>  When user call record.ack() in function, just ProcessingGuarantees ==
>>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>>>  call record.ack() is invalid(print warn log).
>>>> 
>>>> That might blast WARN messages to the user. Perhaps save the fact that
>> you
>>> have printed a WARN message once and only print when the variable is not
>>> set?
>>> 
>>> 5. Regarding Test Plan
>>> * I would add: Validate the test of autoAck=false still works (you
>> haven't
>>> broken anything)
>>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>> 
>>> 
>>> 
>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
>> .invalid>
>>> wrote:
>>> 
>>>> Hi, Mesika.
>>>> 
>>>> Thanks review.
>>>> 
>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>> NONE.
>>>> As
>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>> that
>>>>>> the function executor calls acknowledge, in specific timing:
>>>> 
>>>> 
>>>> Added, Refer to the latest pip.
>>>> https://github.com/apache/pulsar/issues/15560
>>>> 
>>>>>> 3. Removing autoAck from Function Config breaks backward
>> compatibility,
>>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>>>> release handle breaking change?
>>>> 
>>>> As suggested by @neng, They will be marked as deprecated first and
>> clearly
>>>> stated in the documentation. Remove it after 2~3 release.
>>>> 
>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>> understand how you use that enum value *inside* the class/
>>>> 
>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
>> PIP
>>>> API Changes(3)
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
>>>>> 
>>>>> Hi Baodi,
>>>>> 
>>>>> Nice work. Put some suggestions below, ptal.
>>>>> 
>>>>> 1. API changes should also contain the changes of `Function.proto`,
>>>> including new `ProcessingGuarantees` option and `autoAck`.
>>>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>>>> support similar `record.ack()` function from the context, if no, it
>> might
>>>> have some API changes for different runtime we well.
>>>>> 
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Rui Fu
>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>>>> - Added --> add
>>>>>> 
>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>> NONE.
>>>> As
>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>> that
>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>>>> immediately acknowledged and only then the function is executed, thus
>>>>>> guaranteeing it will not run more than once
>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>>>> finished
>>>>>> execution, thus it will be run at least once.
>>>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
>>>> the
>>>>>> message, inside the function.
>>>>>> 
>>>>>> I think if you couple that change with adding the explanation I wrote
>>>>>> above to the documentation it will become crystal clear (hopefully)
>>>> what is
>>>>>> a Processing Guarantee exactly and what each value signifies.
>>>>>> 
>>>>>> 3. Removing autoAck from Function Config breaks backward
>> compatibility,
>>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>>>> release handle breaking change?
>>>>>> 
>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>> understand how you use that enum value *inside* the class/
>>>>>> 
>>>>>> 
>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>>>>>> 
>>>>>>> Some suggestions:
>>>>>>> 
>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>>>> code.
>>>>>>> And documented clearly it's deprecated for the following 2~3 release.
>>>> And
>>>>>>> then delete it.
>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>>>> defaults
>>>>>>> to ATLEAST_ONCE.
>>>>>>> 3. Need to let users know the behavior when they call `record.ack()`
>>>> inside
>>>>>>> the function implementation.
>>>>>>> 
>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
>>>> .invalid>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Pulsar community,
>>>>>>>> 
>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
>>>> add
>>>>>>>> NONE delivery semantics
>>>>>>>> 
>>>>>>>> Let me know what you think.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Baodi Shi
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ## Motivation
>>>>>>>> 
>>>>>>>> Currently Function supports three delivery semantics, and also
>>>> provides
>>>>>>>> autoAck to control whether to automatically ack.
>>>>>>>> Because autoAck affects the delivery semantics of Function, it can
>> be
>>>>>>>> confusing for users to understand the relationship between these two
>>>>>>>> parameters.
>>>>>>>> 
>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
>> and
>>>>>>>> `autoAck == false`, then the framework will not help the user to ack
>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>>>> 
>>>>>>>> The delivery semantics provided by Function should be clear. When
>> the
>>>>>>> user
>>>>>>>> sets the guarantees, the framework should ensure point-to-point
>>>> semantic
>>>>>>>> processing and cannot be affected by other parameters.
>>>>>>>> 
>>>>>>>> ## Goal
>>>>>>>> 
>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>>>> 
>>>>>>>> The original intention of `autoAck` semantics is that users want to
>>>>>>>> control the timing of ack by themselves. When autoAck == false, the
>>>>>>>> processing semantics provided by the framework should be invalid.
>>>> Then we
>>>>>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>>>>>> scenario.
>>>>>>>> 
>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>>>> framework
>>>>>>>> does not help the user to do any ack operations, and the ack is left
>>>> to
>>>>>>> the
>>>>>>>> user to handle. In other cases, the framework guarantees processing
>>>>>>>> semantics.
>>>>>>>> 
>>>>>>>> ## API Changes
>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>>>> ``` java
>>>>>>>> public enum ProcessingGuarantees {
>>>>>>>> ATLEAST_ONCE,
>>>>>>>> ATMOST_ONCE,
>>>>>>>> EFFECTIVELY_ONCE,
>>>>>>>> NONE
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>>>> ``` java
>>>>>>>> public class FunctionConfig {
>>>>>>>> - private Boolean autoAck;
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> ## Implementation
>>>>>>>> 
>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
>>>> NONE`
>>>>>>>> can be ack.
>>>>>>>> 
>>>>>>>> <
>>>>>>>> 
>>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
>>>> acked
>>>>>>>> immediately after receiving the message, no longer affected by the
>>>>>>> autoAck
>>>>>>>> configuration.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>>>> 
>>>>>>>> 3. When user call `record.ack()` in function, just
>>>> `ProcessingGuarantees
>>>>>>>> == NONE` can be work.
>>>>>>>> 
>>>>>>>> ## Plan test
>>>>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
>>>> the
>>>>>>>> function framework will not do any ack operations for the user.
>>>>>>>> 
>>>>>>>> ## Compatibility
>>>>>>>> 1. This change will invalidate the user's setting of autoAck, which
>>>>>>> should
>>>>>>>> be explained in the documentation and provide parameter verification
>>>> to
>>>>>>>> remind the user.
>>>>>>>> 2. Runtimes of other languages need to maintain consistent
>> processing
>>>>>>>> logic (python, go).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Neng
>>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
Hi Baodi,

Regarding

>
>    1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>    be true. If the validation fails, let the function fail to start (This
>    temporarily resolves the configuration ambiguity). When autoAck is
>    subsequently removed, the message will be acked immediately after receiving
>    the message.
>
>
> If you fail to start the function, you immediately break people's
functions when they upgrade to this version. How about notifying them once
via logger (WARN)?

Regarding

>
>    1.
>
>
>    When user call record.ack() in function, just ProcessingGuarantees ==
>    MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>    call record.ack() is invalid(print warn log).
>
> It will flood their logs if they used it wrong. Maybe write to log once?

On Tue, May 31, 2022 at 12:24 PM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi, Asaf.
>
> Thanks review.
>
> >> I'm not entirely sure that is accurate. The Effectively-Once as I
> > understand it is achieved using transactions, thus the consumption of
> that
> > message and production of any messages, as a result, are considered one
> > atomic unit - either message acknowledged and messages produced or
> neither.
>
>
> Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE
> + Message Deduplication.
>
> @Neng Lu @Rui Fu Can help make sure?
>
> >> I would issue a WARN when reading configuring the function (thus emitted
> > once) when the user actively configured autoAck=false and warn them that
> > this configuration is deprecated and they should switch to the MANUAL
> > ProcessingGuarantee configuration option.
>
>
> Added to API Change(2)
>
> >> suggest you clarify what existing behavior remains for backward
> > compatibility with the appropriate comment that this is deprecated and
> will
> > be removed.
>
> Yes, I have rewritten it, please see Implementation(1)
>
> > 5. Regarding Test Plan
> > * I would add: Validate the test of autoAck=false still works (you
> haven't
> > broken anything)
> > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> > AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>
>
> Nice, I added to PIP.
>
>
> Thanks,
> Baodi Shi
>
> > 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> >
> > Thanks for applying the fixes.
> >
> > 1. Regarding
> >
> >>
> >>   - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
> >>   finished execution. Depends on pulsar deduplication, and provides
> >>   end-to-end effectively once processing.
> >>
> >> I'm not entirely sure that is accurate. The Effectively-Once as I
> > understand it is achieved using transactions, thus the consumption of
> that
> > message and production of any messages, as a result, are considered one
> > atomic unit - either message acknowledged and messages produced or
> neither.
> >
> > 2. Regarding
> >
> >>
> >>   1. Indication of autoAck is deprecated, and not use it in the code.
> >>   (and also Function.proto)
> >>
> >> * I would issue a WARN when reading configuring the function (thus
> emitted
> > once) when the user actively configured autoAck=false and warn them that
> > this configuration is deprecated and they should switch to the MANUAL
> > ProcessingGuarantee configuration option.
> >
> > 3. Regarding
> >
> >>
> >>   1. When the delivery semantic is ATMOST_ONCE, the message will be
> >>   acked immediately after receiving the message, no longer affected by
> the
> >>   autoAck configuration.
> >>
> >> I suggest you clarify what existing behavior remains for backward
> > compatibility with the appropriate comment that this is deprecated and
> will
> > be removed.
> >
> > 4. Regarding
> >
> >>
> >>   1.
> >>
> >>   When user call record.ack() in function, just ProcessingGuarantees ==
> >>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
> >>   call record.ack() is invalid(print warn log).
> >>
> >> That might blast WARN messages to the user. Perhaps save the fact that
> you
> > have printed a WARN message once and only print when the variable is not
> > set?
> >
> > 5. Regarding Test Plan
> > * I would add: Validate the test of autoAck=false still works (you
> haven't
> > broken anything)
> > * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> > AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> >
> >
> >
> > On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> > wrote:
> >
> >> Hi, Mesika.
> >>
> >> Thanks review.
> >>
> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> NONE.
> >> As
> >>>> you carefully explained, ProcessingGuarantee comes does to the fact
> that
> >>>> the function executor calls acknowledge, in specific timing:
> >>
> >>
> >> Added, Refer to the latest pip.
> >> https://github.com/apache/pulsar/issues/15560
> >>
> >>>> 3. Removing autoAck from Function Config breaks backward
> compatibility,
> >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >>>> release handle breaking change?
> >>
> >> As suggested by @neng, They will be marked as deprecated first and
> clearly
> >> stated in the documentation. Remove it after 2~3 release.
> >>
> >>>> 4. Regarding Implementation (1), isn't the class itself
> >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>> understand how you use that enum value *inside* the class/
> >>
> >> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
> PIP
> >> API Changes(3)
> >>
> >> Thanks,
> >> Baodi Shi
> >>
> >>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >>>
> >>> Hi Baodi,
> >>>
> >>> Nice work. Put some suggestions below, ptal.
> >>>
> >>> 1. API changes should also contain the changes of `Function.proto`,
> >> including new `ProcessingGuarantees` option and `autoAck`.
> >>> 2. Please be sure the other language runtimes (like Python, Golang) do
> >> support similar `record.ack()` function from the context, if no, it
> might
> >> have some API changes for different runtime we well.
> >>>
> >>>
> >>> Best,
> >>>
> >>> Rui Fu
> >>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >>>> 1. "Added NONE delivery semantics and delete autoAck config."
> >>>> - Added --> add
> >>>>
> >>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
> NONE.
> >> As
> >>>> you carefully explained, ProcessingGuarantee comes does to the fact
> that
> >>>> the function executor calls acknowledge, in specific timing:
> >>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >>>> immediately acknowledged and only then the function is executed, thus
> >>>> guaranteeing it will not run more than once
> >>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> >> finished
> >>>> execution, thus it will be run at least once.
> >>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
> >> the
> >>>> message, inside the function.
> >>>>
> >>>> I think if you couple that change with adding the explanation I wrote
> >>>> above to the documentation it will become crystal clear (hopefully)
> >> what is
> >>>> a Processing Guarantee exactly and what each value signifies.
> >>>>
> >>>> 3. Removing autoAck from Function Config breaks backward
> compatibility,
> >>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >>>> release handle breaking change?
> >>>>
> >>>> 4. Regarding Implementation (1), isn't the class itself
> >>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >>>> understand how you use that enum value *inside* the class/
> >>>>
> >>>>
> >>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>>>
> >>>>> Some suggestions:
> >>>>>
> >>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> >> code.
> >>>>> And documented clearly it's deprecated for the following 2~3 release.
> >> And
> >>>>> then delete it.
> >>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> >> defaults
> >>>>> to ATLEAST_ONCE.
> >>>>> 3. Need to let users know the behavior when they call `record.ack()`
> >> inside
> >>>>> the function implementation.
> >>>>>
> >>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> >> .invalid>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Pulsar community,
> >>>>>>
> >>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
> >> add
> >>>>>> NONE delivery semantics
> >>>>>>
> >>>>>> Let me know what you think.
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Baodi Shi
> >>>>>>
> >>>>>>
> >>>>>> ## Motivation
> >>>>>>
> >>>>>> Currently Function supports three delivery semantics, and also
> >> provides
> >>>>>> autoAck to control whether to automatically ack.
> >>>>>> Because autoAck affects the delivery semantics of Function, it can
> be
> >>>>>> confusing for users to understand the relationship between these two
> >>>>>> parameters.
> >>>>>>
> >>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
> and
> >>>>>> `autoAck == false`, then the framework will not help the user to ack
> >>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>>>
> >>>>>> The delivery semantics provided by Function should be clear. When
> the
> >>>>> user
> >>>>>> sets the guarantees, the framework should ensure point-to-point
> >> semantic
> >>>>>> processing and cannot be affected by other parameters.
> >>>>>>
> >>>>>> ## Goal
> >>>>>>
> >>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>>>
> >>>>>> The original intention of `autoAck` semantics is that users want to
> >>>>>> control the timing of ack by themselves. When autoAck == false, the
> >>>>>> processing semantics provided by the framework should be invalid.
> >> Then we
> >>>>>> can add `NONE` processing semantics to replace the autoAck == false
> >>>>>> scenario.
> >>>>>>
> >>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
> >> framework
> >>>>>> does not help the user to do any ack operations, and the ack is left
> >> to
> >>>>> the
> >>>>>> user to handle. In other cases, the framework guarantees processing
> >>>>>> semantics.
> >>>>>>
> >>>>>> ## API Changes
> >>>>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>>>> ``` java
> >>>>>> public enum ProcessingGuarantees {
> >>>>>> ATLEAST_ONCE,
> >>>>>> ATMOST_ONCE,
> >>>>>> EFFECTIVELY_ONCE,
> >>>>>> NONE
> >>>>>> }
> >>>>>> ```
> >>>>>>
> >>>>>> 2. Delete autoAck config in FunctionConfig
> >>>>>> ``` java
> >>>>>> public class FunctionConfig {
> >>>>>> - private Boolean autoAck;
> >>>>>> }
> >>>>>> ```
> >>>>>>
> >>>>>> ## Implementation
> >>>>>>
> >>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
> >> NONE`
> >>>>>> can be ack.
> >>>>>>
> >>>>>> <
> >>>>>>
> >>>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>>>
> >>>>>>
> >>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
> >> acked
> >>>>>> immediately after receiving the message, no longer affected by the
> >>>>> autoAck
> >>>>>> configuration.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>>>
> >>>>>> 3. When user call `record.ack()` in function, just
> >> `ProcessingGuarantees
> >>>>>> == NONE` can be work.
> >>>>>>
> >>>>>> ## Plan test
> >>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
> >> the
> >>>>>> function framework will not do any ack operations for the user.
> >>>>>>
> >>>>>> ## Compatibility
> >>>>>> 1. This change will invalidate the user's setting of autoAck, which
> >>>>> should
> >>>>>> be explained in the documentation and provide parameter verification
> >> to
> >>>>>> remind the user.
> >>>>>> 2. Runtimes of other languages need to maintain consistent
> processing
> >>>>>> logic (python, go).
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards,
> >>>>> Neng
> >>>>>
> >>
> >>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Baozi <wu...@icloud.com.INVALID>.
Hi, Asaf.

Thanks review. 

>> I'm not entirely sure that is accurate. The Effectively-Once as I
> understand it is achieved using transactions, thus the consumption of that
> message and production of any messages, as a result, are considered one
> atomic unit - either message acknowledged and messages produced or neither.


Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE + Message Deduplication.

@Neng Lu @Rui Fu Can help make sure?

>> I would issue a WARN when reading configuring the function (thus emitted
> once) when the user actively configured autoAck=false and warn them that
> this configuration is deprecated and they should switch to the MANUAL
> ProcessingGuarantee configuration option.


Added to API Change(2)

>> suggest you clarify what existing behavior remains for backward
> compatibility with the appropriate comment that this is deprecated and will
> be removed.

Yes, I have rewritten it, please see Implementation(1)

> 5. Regarding Test Plan
> * I would add: Validate the test of autoAck=false still works (you haven't
> broken anything)
> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> AtLeastOnce, ExactlyOnce still works (when autoAck=true)


Nice, I added to PIP.


Thanks,
Baodi Shi

> 2022年5月30日 22:0011,Asaf Mesika <as...@gmail.com> 写道:
> 
> Thanks for applying the fixes.
> 
> 1. Regarding
> 
>> 
>>   - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>   finished execution. Depends on pulsar deduplication, and provides
>>   end-to-end effectively once processing.
>> 
>> I'm not entirely sure that is accurate. The Effectively-Once as I
> understand it is achieved using transactions, thus the consumption of that
> message and production of any messages, as a result, are considered one
> atomic unit - either message acknowledged and messages produced or neither.
> 
> 2. Regarding
> 
>> 
>>   1. Indication of autoAck is deprecated, and not use it in the code.
>>   (and also Function.proto)
>> 
>> * I would issue a WARN when reading configuring the function (thus emitted
> once) when the user actively configured autoAck=false and warn them that
> this configuration is deprecated and they should switch to the MANUAL
> ProcessingGuarantee configuration option.
> 
> 3. Regarding
> 
>> 
>>   1. When the delivery semantic is ATMOST_ONCE, the message will be
>>   acked immediately after receiving the message, no longer affected by the
>>   autoAck configuration.
>> 
>> I suggest you clarify what existing behavior remains for backward
> compatibility with the appropriate comment that this is deprecated and will
> be removed.
> 
> 4. Regarding
> 
>> 
>>   1.
>> 
>>   When user call record.ack() in function, just ProcessingGuarantees ==
>>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>   call record.ack() is invalid(print warn log).
>> 
>> That might blast WARN messages to the user. Perhaps save the fact that you
> have printed a WARN message once and only print when the variable is not
> set?
> 
> 5. Regarding Test Plan
> * I would add: Validate the test of autoAck=false still works (you haven't
> broken anything)
> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
> 
> 
> 
> On Mon, May 30, 2022 at 4:09 PM Baozi <wu...@icloud.com.invalid>
> wrote:
> 
>> Hi, Mesika.
>> 
>> Thanks review.
>> 
>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
>> As
>>>> you carefully explained, ProcessingGuarantee comes does to the fact that
>>>> the function executor calls acknowledge, in specific timing:
>> 
>> 
>> Added, Refer to the latest pip.
>> https://github.com/apache/pulsar/issues/15560
>> 
>>>> 3. Removing autoAck from Function Config breaks backward compatibility,
>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>> release handle breaking change?
>> 
>> As suggested by @neng, They will be marked as deprecated first and clearly
>> stated in the documentation. Remove it after 2~3 release.
>> 
>>>> 4. Regarding Implementation (1), isn't the class itself
>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>> understand how you use that enum value *inside* the class/
>> 
>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP
>> API Changes(3)
>> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
>>> 
>>> Hi Baodi,
>>> 
>>> Nice work. Put some suggestions below, ptal.
>>> 
>>> 1. API changes should also contain the changes of `Function.proto`,
>> including new `ProcessingGuarantees` option and `autoAck`.
>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>> support similar `record.ack()` function from the context, if no, it might
>> have some API changes for different runtime we well.
>>> 
>>> 
>>> Best,
>>> 
>>> Rui Fu
>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>> - Added --> add
>>>> 
>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
>> As
>>>> you carefully explained, ProcessingGuarantee comes does to the fact that
>>>> the function executor calls acknowledge, in specific timing:
>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>> immediately acknowledged and only then the function is executed, thus
>>>> guaranteeing it will not run more than once
>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>> finished
>>>> execution, thus it will be run at least once.
>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
>> the
>>>> message, inside the function.
>>>> 
>>>> I think if you couple that change with adding the explanation I wrote
>>>> above to the documentation it will become crystal clear (hopefully)
>> what is
>>>> a Processing Guarantee exactly and what each value signifies.
>>>> 
>>>> 3. Removing autoAck from Function Config breaks backward compatibility,
>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>> release handle breaking change?
>>>> 
>>>> 4. Regarding Implementation (1), isn't the class itself
>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>> understand how you use that enum value *inside* the class/
>>>> 
>>>> 
>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>>>> 
>>>>> Some suggestions:
>>>>> 
>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>> code.
>>>>> And documented clearly it's deprecated for the following 2~3 release.
>> And
>>>>> then delete it.
>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>> defaults
>>>>> to ATLEAST_ONCE.
>>>>> 3. Need to let users know the behavior when they call `record.ack()`
>> inside
>>>>> the function implementation.
>>>>> 
>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi Pulsar community,
>>>>>> 
>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
>> add
>>>>>> NONE delivery semantics
>>>>>> 
>>>>>> Let me know what you think.
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Baodi Shi
>>>>>> 
>>>>>> 
>>>>>> ## Motivation
>>>>>> 
>>>>>> Currently Function supports three delivery semantics, and also
>> provides
>>>>>> autoAck to control whether to automatically ack.
>>>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>>>> confusing for users to understand the relationship between these two
>>>>>> parameters.
>>>>>> 
>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>>>> `autoAck == false`, then the framework will not help the user to ack
>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>> 
>>>>>> The delivery semantics provided by Function should be clear. When the
>>>>> user
>>>>>> sets the guarantees, the framework should ensure point-to-point
>> semantic
>>>>>> processing and cannot be affected by other parameters.
>>>>>> 
>>>>>> ## Goal
>>>>>> 
>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>> 
>>>>>> The original intention of `autoAck` semantics is that users want to
>>>>>> control the timing of ack by themselves. When autoAck == false, the
>>>>>> processing semantics provided by the framework should be invalid.
>> Then we
>>>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>>>> scenario.
>>>>>> 
>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>> framework
>>>>>> does not help the user to do any ack operations, and the ack is left
>> to
>>>>> the
>>>>>> user to handle. In other cases, the framework guarantees processing
>>>>>> semantics.
>>>>>> 
>>>>>> ## API Changes
>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>> ``` java
>>>>>> public enum ProcessingGuarantees {
>>>>>> ATLEAST_ONCE,
>>>>>> ATMOST_ONCE,
>>>>>> EFFECTIVELY_ONCE,
>>>>>> NONE
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>> ``` java
>>>>>> public class FunctionConfig {
>>>>>> - private Boolean autoAck;
>>>>>> }
>>>>>> ```
>>>>>> 
>>>>>> ## Implementation
>>>>>> 
>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
>> NONE`
>>>>>> can be ack.
>>>>>> 
>>>>>> <
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>> 
>>>>>> 
>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
>> acked
>>>>>> immediately after receiving the message, no longer affected by the
>>>>> autoAck
>>>>>> configuration.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>> 
>>>>>> 3. When user call `record.ack()` in function, just
>> `ProcessingGuarantees
>>>>>> == NONE` can be work.
>>>>>> 
>>>>>> ## Plan test
>>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
>> the
>>>>>> function framework will not do any ack operations for the user.
>>>>>> 
>>>>>> ## Compatibility
>>>>>> 1. This change will invalidate the user's setting of autoAck, which
>>>>> should
>>>>>> be explained in the documentation and provide parameter verification
>> to
>>>>>> remind the user.
>>>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>>>> logic (python, go).
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Best Regards,
>>>>> Neng
>>>>> 
>> 
>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
Thanks for applying the fixes.

1. Regarding

>
>    - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>    finished execution. Depends on pulsar deduplication, and provides
>    end-to-end effectively once processing.
>
> I'm not entirely sure that is accurate. The Effectively-Once as I
understand it is achieved using transactions, thus the consumption of that
message and production of any messages, as a result, are considered one
atomic unit - either message acknowledged and messages produced or neither.

2. Regarding

>
>    1. Indication of autoAck is deprecated, and not use it in the code.
>    (and also Function.proto)
>
> * I would issue a WARN when reading configuring the function (thus emitted
once) when the user actively configured autoAck=false and warn them that
this configuration is deprecated and they should switch to the MANUAL
ProcessingGuarantee configuration option.

3. Regarding

>
>    1. When the delivery semantic is ATMOST_ONCE, the message will be
>    acked immediately after receiving the message, no longer affected by the
>    autoAck configuration.
>
> I suggest you clarify what existing behavior remains for backward
compatibility with the appropriate comment that this is deprecated and will
be removed.

4. Regarding

>
>    1.
>
>    When user call record.ack() in function, just ProcessingGuarantees ==
>    MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>    call record.ack() is invalid(print warn log).
>
> That might blast WARN messages to the user. Perhaps save the fact that you
have printed a WARN message once and only print when the variable is not
set?

5. Regarding Test Plan
* I would add: Validate the test of autoAck=false still works (you haven't
broken anything)
* I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
AtLeastOnce, ExactlyOnce still works (when autoAck=true)



On Mon, May 30, 2022 at 4:09 PM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi, Mesika.
>
> Thanks review.
>
> >> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
> As
> >> you carefully explained, ProcessingGuarantee comes does to the fact that
> >> the function executor calls acknowledge, in specific timing:
>
>
> Added, Refer to the latest pip.
> https://github.com/apache/pulsar/issues/15560
>
> >> 3. Removing autoAck from Function Config breaks backward compatibility,
> >> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >> release handle breaking change?
>
> As suggested by @neng, They will be marked as deprecated first and clearly
> stated in the documentation. Remove it after 2~3 release.
>
> >> 4. Regarding Implementation (1), isn't the class itself
> >> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >> understand how you use that enum value *inside* the class/
>
> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP
> API Changes(3)
>
> Thanks,
> Baodi Shi
>
> > 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> >
> > Hi Baodi,
> >
> > Nice work. Put some suggestions below, ptal.
> >
> > 1. API changes should also contain the changes of `Function.proto`,
> including new `ProcessingGuarantees` option and `autoAck`.
> > 2. Please be sure the other language runtimes (like Python, Golang) do
> support similar `record.ack()` function from the context, if no, it might
> have some API changes for different runtime we well.
> >
> >
> > Best,
> >
> > Rui Fu
> > 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> >> 1. "Added NONE delivery semantics and delete autoAck config."
> >> - Added --> add
> >>
> >> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE.
> As
> >> you carefully explained, ProcessingGuarantee comes does to the fact that
> >> the function executor calls acknowledge, in specific timing:
> >> - `AT_MOST_ONCE` - When the message is read by the client, it is
> >> immediately acknowledged and only then the function is executed, thus
> >> guaranteeing it will not run more than once
> >> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
> finished
> >> execution, thus it will be run at least once.
> >> - `MANUAL` - Signals to the user that it is up to them to acknowledge
> the
> >> message, inside the function.
> >>
> >> I think if you couple that change with adding the explanation I wrote
> >> above to the documentation it will become crystal clear (hopefully)
> what is
> >> a Processing Guarantee exactly and what each value signifies.
> >>
> >> 3. Removing autoAck from Function Config breaks backward compatibility,
> >> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> >> release handle breaking change?
> >>
> >> 4. Regarding Implementation (1), isn't the class itself
> >> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> >> understand how you use that enum value *inside* the class/
> >>
> >>
> >> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
> >>
> >>> Some suggestions:
> >>>
> >>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
> code.
> >>> And documented clearly it's deprecated for the following 2~3 release.
> And
> >>> then delete it.
> >>> 2. For `PulsarSinkAtLeastOnceProcessor` and
> >>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
> defaults
> >>> to ATLEAST_ONCE.
> >>> 3. Need to let users know the behavior when they call `record.ack()`
> inside
> >>> the function implementation.
> >>>
> >>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolongbao@icloud.com
> .invalid>
> >>> wrote:
> >>>
> >>>> Hi Pulsar community,
> >>>>
> >>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
> add
> >>>> NONE delivery semantics
> >>>>
> >>>> Let me know what you think.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Baodi Shi
> >>>>
> >>>>
> >>>> ## Motivation
> >>>>
> >>>> Currently Function supports three delivery semantics, and also
> provides
> >>>> autoAck to control whether to automatically ack.
> >>>> Because autoAck affects the delivery semantics of Function, it can be
> >>>> confusing for users to understand the relationship between these two
> >>>> parameters.
> >>>>
> >>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
> >>>> `autoAck == false`, then the framework will not help the user to ack
> >>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
> >>>>
> >>>> The delivery semantics provided by Function should be clear. When the
> >>> user
> >>>> sets the guarantees, the framework should ensure point-to-point
> semantic
> >>>> processing and cannot be affected by other parameters.
> >>>>
> >>>> ## Goal
> >>>>
> >>>> Added `NONE` delivery semantics and delete `autoAck` config.
> >>>>
> >>>> The original intention of `autoAck` semantics is that users want to
> >>>> control the timing of ack by themselves. When autoAck == false, the
> >>>> processing semantics provided by the framework should be invalid.
> Then we
> >>>> can add `NONE` processing semantics to replace the autoAck == false
> >>>> scenario.
> >>>>
> >>>> When the user configuration `ProcessingGuarantees == NONE`, the
> framework
> >>>> does not help the user to do any ack operations, and the ack is left
> to
> >>> the
> >>>> user to handle. In other cases, the framework guarantees processing
> >>>> semantics.
> >>>>
> >>>> ## API Changes
> >>>> 1. Add `NONE` type to ProcessingGuarantees
> >>>> ``` java
> >>>> public enum ProcessingGuarantees {
> >>>> ATLEAST_ONCE,
> >>>> ATMOST_ONCE,
> >>>> EFFECTIVELY_ONCE,
> >>>> NONE
> >>>> }
> >>>> ```
> >>>>
> >>>> 2. Delete autoAck config in FunctionConfig
> >>>> ``` java
> >>>> public class FunctionConfig {
> >>>> - private Boolean autoAck;
> >>>> }
> >>>> ```
> >>>>
> >>>> ## Implementation
> >>>>
> >>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
> >>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
> NONE`
> >>>> can be ack.
> >>>>
> >>>> <
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >>>>>
> >>>>
> >>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
> acked
> >>>> immediately after receiving the message, no longer affected by the
> >>> autoAck
> >>>> configuration.
> >>>>
> >>>>
> >>>>
> >>>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >>>>
> >>>> 3. When user call `record.ack()` in function, just
> `ProcessingGuarantees
> >>>> == NONE` can be work.
> >>>>
> >>>> ## Plan test
> >>>> The main test and assert is that when ProcessingGuarantees == NONE,
> the
> >>>> function framework will not do any ack operations for the user.
> >>>>
> >>>> ## Compatibility
> >>>> 1. This change will invalidate the user's setting of autoAck, which
> >>> should
> >>>> be explained in the documentation and provide parameter verification
> to
> >>>> remind the user.
> >>>> 2. Runtimes of other languages need to maintain consistent processing
> >>>> logic (python, go).
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Neng
> >>>
>
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Baozi <wu...@icloud.com.INVALID>.
Hi, Mesika.

Thanks review.

>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
>> you carefully explained, ProcessingGuarantee comes does to the fact that
>> the function executor calls acknowledge, in specific timing:


Added, Refer to the latest pip. https://github.com/apache/pulsar/issues/15560

>> 3. Removing autoAck from Function Config breaks backward compatibility,
>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>> release handle breaking change?

As suggested by @neng, They will be marked as deprecated first and clearly stated in the documentation. Remove it after 2~3 release.

>> 4. Regarding Implementation (1), isn't the class itself
>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>> understand how you use that enum value *inside* the class/

I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest PIP API Changes(3)

Thanks,
Baodi Shi

> 2022年5月30日 12:5128,Rui Fu <rf...@apache.org> 写道:
> 
> Hi Baodi,
> 
> Nice work. Put some suggestions below, ptal.
> 
> 1. API changes should also contain the changes of `Function.proto`, including new `ProcessingGuarantees` option and `autoAck`.
> 2. Please be sure the other language runtimes (like Python, Golang) do support similar `record.ack()` function from the context, if no, it might have some API changes for different runtime we well.
> 
> 
> Best,
> 
> Rui Fu
> 在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
>> 1. "Added NONE delivery semantics and delete autoAck config."
>> - Added --> add
>> 
>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
>> you carefully explained, ProcessingGuarantee comes does to the fact that
>> the function executor calls acknowledge, in specific timing:
>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>> immediately acknowledged and only then the function is executed, thus
>> guaranteeing it will not run more than once
>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
>> execution, thus it will be run at least once.
>> - `MANUAL` - Signals to the user that it is up to them to acknowledge the
>> message, inside the function.
>> 
>> I think if you couple that change with adding the explanation I wrote
>> above to the documentation it will become crystal clear (hopefully) what is
>> a Processing Guarantee exactly and what each value signifies.
>> 
>> 3. Removing autoAck from Function Config breaks backward compatibility,
>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>> release handle breaking change?
>> 
>> 4. Regarding Implementation (1), isn't the class itself
>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>> understand how you use that enum value *inside* the class/
>> 
>> 
>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>> 
>>> Some suggestions:
>>> 
>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
>>> And documented clearly it's deprecated for the following 2~3 release. And
>>> then delete it.
>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
>>> to ATLEAST_ONCE.
>>> 3. Need to let users know the behavior when they call `record.ack()` inside
>>> the function implementation.
>>> 
>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
>>> wrote:
>>> 
>>>> Hi Pulsar community,
>>>> 
>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function add
>>>> NONE delivery semantics
>>>> 
>>>> Let me know what you think.
>>>> 
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>> 
>>>> ## Motivation
>>>> 
>>>> Currently Function supports three delivery semantics, and also provides
>>>> autoAck to control whether to automatically ack.
>>>> Because autoAck affects the delivery semantics of Function, it can be
>>>> confusing for users to understand the relationship between these two
>>>> parameters.
>>>> 
>>>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>>>> `autoAck == false`, then the framework will not help the user to ack
>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>> 
>>>> The delivery semantics provided by Function should be clear. When the
>>> user
>>>> sets the guarantees, the framework should ensure point-to-point semantic
>>>> processing and cannot be affected by other parameters.
>>>> 
>>>> ## Goal
>>>> 
>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>> 
>>>> The original intention of `autoAck` semantics is that users want to
>>>> control the timing of ack by themselves. When autoAck == false, the
>>>> processing semantics provided by the framework should be invalid. Then we
>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>> scenario.
>>>> 
>>>> When the user configuration `ProcessingGuarantees == NONE`, the framework
>>>> does not help the user to do any ack operations, and the ack is left to
>>> the
>>>> user to handle. In other cases, the framework guarantees processing
>>>> semantics.
>>>> 
>>>> ## API Changes
>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>> ``` java
>>>> public enum ProcessingGuarantees {
>>>> ATLEAST_ONCE,
>>>> ATMOST_ONCE,
>>>> EFFECTIVELY_ONCE,
>>>> NONE
>>>> }
>>>> ```
>>>> 
>>>> 2. Delete autoAck config in FunctionConfig
>>>> ``` java
>>>> public class FunctionConfig {
>>>> - private Boolean autoAck;
>>>> }
>>>> ```
>>>> 
>>>> ## Implementation
>>>> 
>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
>>>> can be ack.
>>>> 
>>>> <
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>> 
>>>> 
>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
>>>> immediately after receiving the message, no longer affected by the
>>> autoAck
>>>> configuration.
>>>> 
>>>> 
>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>> 
>>>> 3. When user call `record.ack()` in function, just `ProcessingGuarantees
>>>> == NONE` can be work.
>>>> 
>>>> ## Plan test
>>>> The main test and assert is that when ProcessingGuarantees == NONE, the
>>>> function framework will not do any ack operations for the user.
>>>> 
>>>> ## Compatibility
>>>> 1. This change will invalidate the user's setting of autoAck, which
>>> should
>>>> be explained in the documentation and provide parameter verification to
>>>> remind the user.
>>>> 2. Runtimes of other languages need to maintain consistent processing
>>>> logic (python, go).
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best Regards,
>>> Neng
>>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Rui Fu <rf...@apache.org>.
Hi Baodi,

Nice work. Put some suggestions below, ptal.

1. API changes should also contain the changes of `Function.proto`, including new `ProcessingGuarantees` option and `autoAck`.
2. Please be sure the other language runtimes (like Python, Golang) do support similar `record.ack()` function from the context, if no, it might have some API changes for different runtime we well.


Best,

Rui Fu
在 2022年5月29日 +0800 22:18,Asaf Mesika <as...@gmail.com>,写道:
> 1. "Added NONE delivery semantics and delete autoAck config."
> - Added --> add
>
> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
> you carefully explained, ProcessingGuarantee comes does to the fact that
> the function executor calls acknowledge, in specific timing:
> - `AT_MOST_ONCE` - When the message is read by the client, it is
> immediately acknowledged and only then the function is executed, thus
> guaranteeing it will not run more than once
> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
> execution, thus it will be run at least once.
> - `MANUAL` - Signals to the user that it is up to them to acknowledge the
> message, inside the function.
>
> I think if you couple that change with adding the explanation I wrote
> above to the documentation it will become crystal clear (hopefully) what is
> a Processing Guarantee exactly and what each value signifies.
>
> 3. Removing autoAck from Function Config breaks backward compatibility,
> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
> release handle breaking change?
>
> 4. Regarding Implementation (1), isn't the class itself
> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
> understand how you use that enum value *inside* the class/
>
>
> On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:
>
> > Some suggestions:
> >
> > 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
> > And documented clearly it's deprecated for the following 2~3 release. And
> > then delete it.
> > 2. For `PulsarSinkAtLeastOnceProcessor` and
> > `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
> > to ATLEAST_ONCE.
> > 3. Need to let users know the behavior when they call `record.ack()` inside
> > the function implementation.
> >
> > On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
> > wrote:
> >
> > > Hi Pulsar community,
> > >
> > > I open a https://github.com/apache/pulsar/issues/15560 for Function add
> > > NONE delivery semantics
> > >
> > > Let me know what you think.
> > >
> > >
> > > Thanks,
> > > Baodi Shi
> > >
> > >
> > > ## Motivation
> > >
> > > Currently Function supports three delivery semantics, and also provides
> > > autoAck to control whether to automatically ack.
> > > Because autoAck affects the delivery semantics of Function, it can be
> > > confusing for users to understand the relationship between these two
> > > parameters.
> > >
> > > For example, when the user configures `Guarantees == ATMOST_ONCE` and
> > > `autoAck == false`, then the framework will not help the user to ack
> > > messages, and the processing semantics may become `ATLEAST_ONCE`.
> > >
> > > The delivery semantics provided by Function should be clear. When the
> > user
> > > sets the guarantees, the framework should ensure point-to-point semantic
> > > processing and cannot be affected by other parameters.
> > >
> > > ## Goal
> > >
> > > Added `NONE` delivery semantics and delete `autoAck` config.
> > >
> > > The original intention of `autoAck` semantics is that users want to
> > > control the timing of ack by themselves. When autoAck == false, the
> > > processing semantics provided by the framework should be invalid. Then we
> > > can add `NONE` processing semantics to replace the autoAck == false
> > > scenario.
> > >
> > > When the user configuration `ProcessingGuarantees == NONE`, the framework
> > > does not help the user to do any ack operations, and the ack is left to
> > the
> > > user to handle. In other cases, the framework guarantees processing
> > > semantics.
> > >
> > > ## API Changes
> > > 1. Add `NONE` type to ProcessingGuarantees
> > > ``` java
> > > public enum ProcessingGuarantees {
> > > ATLEAST_ONCE,
> > > ATMOST_ONCE,
> > > EFFECTIVELY_ONCE,
> > > NONE
> > > }
> > > ```
> > >
> > > 2. Delete autoAck config in FunctionConfig
> > > ``` java
> > > public class FunctionConfig {
> > > - private Boolean autoAck;
> > > }
> > > ```
> > >
> > > ## Implementation
> > >
> > > 1. In `PulsarSinkAtLeastOnceProcessor` and
> > > `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
> > > can be ack.
> > >
> > > <
> > >
> > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> > > >
> > >
> > > 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
> > > immediately after receiving the message, no longer affected by the
> > autoAck
> > > configuration.
> > >
> > >
> > >
> > https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> > >
> > > 3. When user call `record.ack()` in function, just `ProcessingGuarantees
> > > == NONE` can be work.
> > >
> > > ## Plan test
> > > The main test and assert is that when ProcessingGuarantees == NONE, the
> > > function framework will not do any ack operations for the user.
> > >
> > > ## Compatibility
> > > 1. This change will invalidate the user's setting of autoAck, which
> > should
> > > be explained in the documentation and provide parameter verification to
> > > remind the user.
> > > 2. Runtimes of other languages need to maintain consistent processing
> > > logic (python, go).
> > >
> > >
> > >
> >
> > --
> > Best Regards,
> > Neng
> >

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
1. "Added NONE delivery semantics and delete autoAck config."
 - Added --> add

 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of NONE. As
you carefully explained, ProcessingGuarantee comes does to the fact that
the function executor calls acknowledge, in specific timing:
 - `AT_MOST_ONCE` - When the message is read by the client, it is
immediately acknowledged and only then the function is executed, thus
guaranteeing it will not run more than once
 - `AT_LEAST_ONCE` - Message is acknowledged *after* the function finished
execution, thus it will be run at least once.
 - `MANUAL` - Signals to the user that it is up to them to acknowledge the
message, inside the function.

 I think if you couple that change with adding the explanation I wrote
above to the documentation it will become crystal clear (hopefully) what is
a Processing Guarantee exactly and what each value signifies.

3. Removing autoAck from Function Config breaks backward compatibility,
thus shouldn't this be strongly reflected in the PIP - how does Pulsar
release handle breaking change?

4. Regarding Implementation (1), isn't the class itself
PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
understand how you use that enum value *inside* the class/


On Fri, May 27, 2022 at 1:08 AM Neng Lu <fr...@gmail.com> wrote:

> Some suggestions:
>
> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
> And documented clearly it's deprecated for the following 2~3 release. And
> then delete it.
> 2. For `PulsarSinkAtLeastOnceProcessor` and
> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
> to ATLEAST_ONCE.
> 3. Need to let users know the behavior when they call `record.ack()` inside
> the function implementation.
>
> On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
> wrote:
>
> > Hi Pulsar community,
> >
> > I open a https://github.com/apache/pulsar/issues/15560 for Function add
> > NONE delivery semantics
> >
> > Let me know what you think.
> >
> >
> > Thanks,
> > Baodi Shi
> >
> >
> > ## Motivation
> >
> > Currently Function supports three delivery semantics, and also provides
> > autoAck to control whether to automatically ack.
> > Because autoAck affects the delivery semantics of Function, it can be
> > confusing for users to understand the relationship between these two
> > parameters.
> >
> > For example, when the user configures `Guarantees == ATMOST_ONCE` and
> > `autoAck == false`, then the framework will not help the user to ack
> > messages, and the processing semantics may become `ATLEAST_ONCE`.
> >
> > The delivery semantics provided by Function should be clear. When the
> user
> > sets the guarantees, the framework should ensure point-to-point semantic
> > processing and cannot be affected by other parameters.
> >
> > ## Goal
> >
> > Added `NONE` delivery semantics and delete `autoAck` config.
> >
> > The original intention of `autoAck` semantics is that users want to
> > control the timing of ack by themselves. When autoAck == false, the
> > processing semantics provided by the framework should be invalid. Then we
> > can add `NONE` processing semantics to replace the autoAck == false
> > scenario.
> >
> > When the user configuration `ProcessingGuarantees == NONE`, the framework
> > does not help the user to do any ack operations, and the ack is left to
> the
> > user to handle. In other cases, the framework guarantees processing
> > semantics.
> >
> > ## API Changes
> > 1. Add `NONE` type to ProcessingGuarantees
> > ``` java
> > public enum ProcessingGuarantees {
> >       ATLEAST_ONCE,
> >       ATMOST_ONCE,
> >       EFFECTIVELY_ONCE,
> >       NONE
> > }
> > ```
> >
> > 2. Delete autoAck config in FunctionConfig
> > ``` java
> > public class FunctionConfig {
> > -    private Boolean autoAck;
> > }
> > ```
> >
> > ## Implementation
> >
> > 1. In `PulsarSinkAtLeastOnceProcessor` and
> > `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
> > can be ack.
> >
> > <
> >
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> > >
> >
> > 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
> > immediately after receiving the message, no longer affected by the
> autoAck
> > configuration.
> >
> >
> >
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
> >
> > 3. When user call `record.ack()` in function, just  `ProcessingGuarantees
> > == NONE` can be work.
> >
> > ## Plan test
> > The main test and assert is that when ProcessingGuarantees == NONE, the
> > function framework will not do any ack operations for the user.
> >
> > ## Compatibility
> > 1. This change will invalidate the user's setting of autoAck, which
> should
> > be explained in the documentation and provide parameter verification to
> > remind the user.
> > 2. Runtimes of other languages need to maintain consistent processing
> > logic (python, go).
> >
> >
> >
>
> --
> Best Regards,
> Neng
>

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Baozi <wu...@icloud.com.INVALID>.
Hi, Neng.

Thank review.

> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
> And documented clearly it's deprecated for the following 2~3 release. And
> then delete it.

Great! I changed PIP. Refer to the latest pip. https://github.com/apache/pulsar/issues/15560

> 2. For `PulsarSinkAtLeastOnceProcessor` and
> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
> to ATLEAST_ONCE.

I took Mesika's suggestion and changed NONE to MANUAL. I add new PulsarSinkProcessor implements: PulsarSinkManualProcessor. So, if `MANUAL ` is configured, Will not use PulsarSinkAtLeastOnceProcessor and PulsarSinkEffectivelyOnceProcessor. Refer to the latest PIP API Changes(3)

> 3. Need to let users know the behavior when they call `record.ack()` inside
> the function implementation.


According to the description of  Implementation(2), We will add to the documentation to tell the user.

Thanks,
Baodi Shi

> 2022年5月27日 06:0836,Neng Lu <fr...@gmail.com> 写道:
> 
> Some suggestions:
> 
> 1. Instead of deleting the `autoAck`, keep it but not use it in the code.
> And documented clearly it's deprecated for the following 2~3 release. And
> then delete it.
> 2. For `PulsarSinkAtLeastOnceProcessor` and
> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
> to ATLEAST_ONCE.
> 3. Need to let users know the behavior when they call `record.ack()` inside
> the function implementation.
> 
> On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
> wrote:
> 
>> Hi Pulsar community,
>> 
>> I open a https://github.com/apache/pulsar/issues/15560 for Function add
>> NONE delivery semantics
>> 
>> Let me know what you think.
>> 
>> 
>> Thanks,
>> Baodi Shi
>> 
>> 
>> ## Motivation
>> 
>> Currently Function supports three delivery semantics, and also provides
>> autoAck to control whether to automatically ack.
>> Because autoAck affects the delivery semantics of Function, it can be
>> confusing for users to understand the relationship between these two
>> parameters.
>> 
>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>> `autoAck == false`, then the framework will not help the user to ack
>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>> 
>> The delivery semantics provided by Function should be clear. When the user
>> sets the guarantees, the framework should ensure point-to-point semantic
>> processing and cannot be affected by other parameters.
>> 
>> ## Goal
>> 
>> Added `NONE` delivery semantics and delete `autoAck` config.
>> 
>> The original intention of `autoAck` semantics is that users want to
>> control the timing of ack by themselves. When autoAck == false, the
>> processing semantics provided by the framework should be invalid. Then we
>> can add `NONE` processing semantics to replace the autoAck == false
>> scenario.
>> 
>> When the user configuration `ProcessingGuarantees == NONE`, the framework
>> does not help the user to do any ack operations, and the ack is left to the
>> user to handle. In other cases, the framework guarantees processing
>> semantics.
>> 
>> ## API Changes
>> 1. Add `NONE` type to ProcessingGuarantees
>> ``` java
>> public enum ProcessingGuarantees {
>>      ATLEAST_ONCE,
>>      ATMOST_ONCE,
>>      EFFECTIVELY_ONCE,
>>      NONE
>> }
>> ```
>> 
>> 2. Delete autoAck config in FunctionConfig
>> ``` java
>> public class FunctionConfig {
>> -    private Boolean autoAck;
>> }
>> ```
>> 
>> ## Implementation
>> 
>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
>> can be ack.
>> 
>> <
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>> 
>> 
>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
>> immediately after receiving the message, no longer affected by the autoAck
>> configuration.
>> 
>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>> 
>> 3. When user call `record.ack()` in function, just  `ProcessingGuarantees
>> == NONE` can be work.
>> 
>> ## Plan test
>> The main test and assert is that when ProcessingGuarantees == NONE, the
>> function framework will not do any ack operations for the user.
>> 
>> ## Compatibility
>> 1. This change will invalidate the user's setting of autoAck, which should
>> be explained in the documentation and provide parameter verification to
>> remind the user.
>> 2. Runtimes of other languages ​​need to maintain consistent processing
>> logic (python, go).
>> 
>> 
>> 
> 
> -- 
> Best Regards,
> Neng


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Neng Lu <fr...@gmail.com>.
Some suggestions:

1. Instead of deleting the `autoAck`, keep it but not use it in the code.
And documented clearly it's deprecated for the following 2~3 release. And
then delete it.
2. For `PulsarSinkAtLeastOnceProcessor` and
`PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it defaults
to ATLEAST_ONCE.
3. Need to let users know the behavior when they call `record.ack()` inside
the function implementation.

On Thu, May 12, 2022 at 1:52 AM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi Pulsar community,
>
> I open a https://github.com/apache/pulsar/issues/15560 for Function add
> NONE delivery semantics
>
> Let me know what you think.
>
>
> Thanks,
> Baodi Shi
>
>
> ## Motivation
>
> Currently Function supports three delivery semantics, and also provides
> autoAck to control whether to automatically ack.
> Because autoAck affects the delivery semantics of Function, it can be
> confusing for users to understand the relationship between these two
> parameters.
>
> For example, when the user configures `Guarantees == ATMOST_ONCE` and
> `autoAck == false`, then the framework will not help the user to ack
> messages, and the processing semantics may become `ATLEAST_ONCE`.
>
> The delivery semantics provided by Function should be clear. When the user
> sets the guarantees, the framework should ensure point-to-point semantic
> processing and cannot be affected by other parameters.
>
> ## Goal
>
> Added `NONE` delivery semantics and delete `autoAck` config.
>
> The original intention of `autoAck` semantics is that users want to
> control the timing of ack by themselves. When autoAck == false, the
> processing semantics provided by the framework should be invalid. Then we
> can add `NONE` processing semantics to replace the autoAck == false
> scenario.
>
> When the user configuration `ProcessingGuarantees == NONE`, the framework
> does not help the user to do any ack operations, and the ack is left to the
> user to handle. In other cases, the framework guarantees processing
> semantics.
>
> ## API Changes
> 1. Add `NONE` type to ProcessingGuarantees
> ``` java
> public enum ProcessingGuarantees {
>       ATLEAST_ONCE,
>       ATMOST_ONCE,
>       EFFECTIVELY_ONCE,
>       NONE
> }
> ```
>
> 2. Delete autoAck config in FunctionConfig
> ``` java
> public class FunctionConfig {
> -    private Boolean autoAck;
> }
> ```
>
> ## Implementation
>
> 1. In `PulsarSinkAtLeastOnceProcessor` and
> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
> can be ack.
>
> <
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >
>
> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
> immediately after receiving the message, no longer affected by the autoAck
> configuration.
>
>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>
> 3. When user call `record.ack()` in function, just  `ProcessingGuarantees
> == NONE` can be work.
>
> ## Plan test
> The main test and assert is that when ProcessingGuarantees == NONE, the
> function framework will not do any ack operations for the user.
>
> ## Compatibility
> 1. This change will invalidate the user's setting of autoAck, which should
> be explained in the documentation and provide parameter verification to
> remind the user.
> 2. Runtimes of other languages ​​need to maintain consistent processing
> logic (python, go).
>
>
>

-- 
Best Regards,
Neng

Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by 石宝迪 <wu...@icloud.com.INVALID>.
Only commiters can modify wikis, we can discuss that first.


Thanks,
Baodi Shi

> 2022年5月30日 22:5828,Asaf Mesika <as...@gmail.com> 写道:
> 
> Still new to the PIP process: Shouldn't this PIP be added here
> <https://github.com/apache/pulsar/wiki#pulsar-improvement-proposals>?
> 
> On Thu, May 12, 2022 at 11:52 AM Baozi <wu...@icloud.com.invalid>
> wrote:
> 
>> Hi Pulsar community,
>> 
>> I open a https://github.com/apache/pulsar/issues/15560 for Function add
>> NONE delivery semantics
>> 
>> Let me know what you think.
>> 
>> 
>> Thanks,
>> Baodi Shi
>> 
>> 
>> ## Motivation
>> 
>> Currently Function supports three delivery semantics, and also provides
>> autoAck to control whether to automatically ack.
>> Because autoAck affects the delivery semantics of Function, it can be
>> confusing for users to understand the relationship between these two
>> parameters.
>> 
>> For example, when the user configures `Guarantees == ATMOST_ONCE` and
>> `autoAck == false`, then the framework will not help the user to ack
>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>> 
>> The delivery semantics provided by Function should be clear. When the user
>> sets the guarantees, the framework should ensure point-to-point semantic
>> processing and cannot be affected by other parameters.
>> 
>> ## Goal
>> 
>> Added `NONE` delivery semantics and delete `autoAck` config.
>> 
>> The original intention of `autoAck` semantics is that users want to
>> control the timing of ack by themselves. When autoAck == false, the
>> processing semantics provided by the framework should be invalid. Then we
>> can add `NONE` processing semantics to replace the autoAck == false
>> scenario.
>> 
>> When the user configuration `ProcessingGuarantees == NONE`, the framework
>> does not help the user to do any ack operations, and the ack is left to the
>> user to handle. In other cases, the framework guarantees processing
>> semantics.
>> 
>> ## API Changes
>> 1. Add `NONE` type to ProcessingGuarantees
>> ``` java
>> public enum ProcessingGuarantees {
>>      ATLEAST_ONCE,
>>      ATMOST_ONCE,
>>      EFFECTIVELY_ONCE,
>>      NONE
>> }
>> ```
>> 
>> 2. Delete autoAck config in FunctionConfig
>> ``` java
>> public class FunctionConfig {
>> -    private Boolean autoAck;
>> }
>> ```
>> 
>> ## Implementation
>> 
>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
>> can be ack.
>> 
>> <
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>> 
>> 
>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
>> immediately after receiving the message, no longer affected by the autoAck
>> configuration.
>> 
>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>> 
>> 3. When user call `record.ack()` in function, just  `ProcessingGuarantees
>> == NONE` can be work.
>> 
>> ## Plan test
>> The main test and assert is that when ProcessingGuarantees == NONE, the
>> function framework will not do any ack operations for the user.
>> 
>> ## Compatibility
>> 1. This change will invalidate the user's setting of autoAck, which should
>> be explained in the documentation and provide parameter verification to
>> remind the user.
>> 2. Runtimes of other languages ​​need to maintain consistent processing
>> logic (python, go).
>> 
>> 
>> 


Re: [DISCUSS] PIP-166: Function add NONE delivery semantics

Posted by Asaf Mesika <as...@gmail.com>.
Still new to the PIP process: Shouldn't this PIP be added here
<https://github.com/apache/pulsar/wiki#pulsar-improvement-proposals>?

On Thu, May 12, 2022 at 11:52 AM Baozi <wu...@icloud.com.invalid>
wrote:

> Hi Pulsar community,
>
> I open a https://github.com/apache/pulsar/issues/15560 for Function add
> NONE delivery semantics
>
> Let me know what you think.
>
>
> Thanks,
> Baodi Shi
>
>
> ## Motivation
>
> Currently Function supports three delivery semantics, and also provides
> autoAck to control whether to automatically ack.
> Because autoAck affects the delivery semantics of Function, it can be
> confusing for users to understand the relationship between these two
> parameters.
>
> For example, when the user configures `Guarantees == ATMOST_ONCE` and
> `autoAck == false`, then the framework will not help the user to ack
> messages, and the processing semantics may become `ATLEAST_ONCE`.
>
> The delivery semantics provided by Function should be clear. When the user
> sets the guarantees, the framework should ensure point-to-point semantic
> processing and cannot be affected by other parameters.
>
> ## Goal
>
> Added `NONE` delivery semantics and delete `autoAck` config.
>
> The original intention of `autoAck` semantics is that users want to
> control the timing of ack by themselves. When autoAck == false, the
> processing semantics provided by the framework should be invalid. Then we
> can add `NONE` processing semantics to replace the autoAck == false
> scenario.
>
> When the user configuration `ProcessingGuarantees == NONE`, the framework
> does not help the user to do any ack operations, and the ack is left to the
> user to handle. In other cases, the framework guarantees processing
> semantics.
>
> ## API Changes
> 1. Add `NONE` type to ProcessingGuarantees
> ``` java
> public enum ProcessingGuarantees {
>       ATLEAST_ONCE,
>       ATMOST_ONCE,
>       EFFECTIVELY_ONCE,
>       NONE
> }
> ```
>
> 2. Delete autoAck config in FunctionConfig
> ``` java
> public class FunctionConfig {
> -    private Boolean autoAck;
> }
> ```
>
> ## Implementation
>
> 1. In `PulsarSinkAtLeastOnceProcessor` and
> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees != NONE`
> can be ack.
>
> <
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
> >
>
> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be acked
> immediately after receiving the message, no longer affected by the autoAck
> configuration.
>
>
> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>
> 3. When user call `record.ack()` in function, just  `ProcessingGuarantees
> == NONE` can be work.
>
> ## Plan test
> The main test and assert is that when ProcessingGuarantees == NONE, the
> function framework will not do any ack operations for the user.
>
> ## Compatibility
> 1. This change will invalidate the user's setting of autoAck, which should
> be explained in the documentation and provide parameter verification to
> remind the user.
> 2. Runtimes of other languages ​​need to maintain consistent processing
> logic (python, go).
>
>
>