Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/06/10 07:08:00 UTC

[GitHub] [pulsar] KIC opened a new issue #4501: Pulsar functions should be able to return none and multiple values

URL: https://github.com/apache/pulsar/issues/4501
 
 
   First of all, I love Pulsar Functions; they are almost exactly what I have always wanted to build. Working on it alone as a hobby project, I never really finished it, so I am really happy someone else did! However, I am missing one important feature.
   
   **Is your feature request related to a problem? Please describe.**
   As I understand it, the current implementation of Pulsar Functions is a 1:1 relationship: one event in, one event out. This is very limiting, as one cannot even write a function to filter out events. There are also use cases where you get a batched message and need to "unpack" it into single events, or where you need to interpolate values from a previous event (which is held in the state).
   
   **Describe the solution you'd like**
   I propose that the interface should be something along the lines of `Function<I, ? extends Collection<O>>`. This way you can return nothing (an empty list), a single element, or a collection of, for example, interpolated values.
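
A minimal sketch of how such an interface could look, assuming a hypothetical `CollectionFunction` type (it is not part of the Pulsar API) and a stand-in `runAll` helper that plays the role of the runtime by publishing every returned element. It shows a filter that returns an empty collection and an "unpack" function that splits a comma-separated batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class MultiOutputSketch {

    /** Hypothetical interface (not the Pulsar API): zero, one, or many outputs per input. */
    @FunctionalInterface
    interface CollectionFunction<I, O> {
        Collection<O> process(I input);
    }

    // Filter: drop blank events by returning an empty collection.
    static final CollectionFunction<String, String> NON_EMPTY =
        s -> s.trim().isEmpty()
            ? Collections.<String>emptyList()
            : Collections.singletonList(s);

    // Unpack: split a batched, comma-separated message into single events.
    static final CollectionFunction<String, String> UNPACK =
        batch -> Arrays.asList(batch.split(","));

    /** Stand-in for the runtime: applies the function and "publishes" every returned element. */
    static <I, O> List<O> runAll(CollectionFunction<I, O> fn, List<I> inputs) {
        List<O> published = new ArrayList<>();
        for (I input : inputs) {
            published.addAll(fn.process(input));
        }
        return published;
    }

    public static void main(String[] args) {
        System.out.println(runAll(NON_EMPTY, Arrays.asList("a", "", "b"))); // [a, b]
        System.out.println(runAll(UNPACK, Arrays.asList("1,2,3", "4")));    // [1, 2, 3, 4]
    }
}
```

Because the function only returns values and never publishes them itself, delivery and retries stay with the caller, which is the point of the proposal.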
   
   **Describe alternatives you've considered**
   If you consider a `PublishFunction`, I see the following problem: the moment you also need to store state via the `Context`, you get a timing issue. What if you stored the state but then, for some reason, are not able to send to the topic? Or, even worse, what if you could send _n_ of _m_ messages and then the network fails? It would be easier if Pulsar handled all these cases outside of the function implementation.
   
   **Additional context**
   This use case does not necessarily require atomic transactions across different storage solutions. Functions just need to be deterministic. You then only need to know what is required to reproduce the failed "state" and which message was the last one sent to the target topic. You then store the state and send only the missing messages after the last one that was already sent.
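
The resume idea can be sketched roughly as follows. Everything here is hypothetical and in-memory: `FlakyTopic` simulates a target topic that fails once, and `ResumableSender` keeps a counter of outputs already sent, which a real runtime would have to store durably. None of these names exist in Pulsar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ResumeSketch {

    /** Hypothetical in-memory topic that fails exactly once, after a given number of sends. */
    static class FlakyTopic {
        final List<String> messages = new ArrayList<>();
        private int sendsUntilFailure;

        FlakyTopic(int sendsUntilFailure) {
            this.sendsUntilFailure = sendsUntilFailure;
        }

        void send(String msg) {
            if (sendsUntilFailure == 0) {
                sendsUntilFailure = -1; // fail only once, then behave normally
                throw new IllegalStateException("simulated network failure");
            }
            if (sendsUntilFailure > 0) {
                sendsUntilFailure--;
            }
            messages.add(msg);
        }
    }

    /** Tracks how many outputs of the current input were already sent (assumed durable in practice). */
    static class ResumableSender {
        private int alreadySent = 0;

        void deliver(Function<String, List<String>> fn, String input, FlakyTopic topic) {
            // Deterministic function: recomputing yields the same outputs in the same order.
            List<String> outputs = fn.apply(input);
            for (int i = alreadySent; i < outputs.size(); i++) {
                topic.send(outputs.get(i));
                alreadySent = i + 1; // advance only after a successful send
            }
            alreadySent = 0; // input fully delivered; reset for the next input
        }
    }

    /** Runs one input through a topic that fails after failAfter sends, retrying once. */
    static List<String> demo(String input, int failAfter) {
        Function<String, List<String>> unpack = s -> Arrays.asList(s.split(","));
        FlakyTopic topic = new FlakyTopic(failAfter);
        ResumableSender sender = new ResumableSender();
        try {
            sender.deliver(unpack, input, topic);
        } catch (IllegalStateException e) {
            sender.deliver(unpack, input, topic); // retry sends only the missing messages
        }
        return topic.messages;
    }

    public static void main(String[] args) {
        System.out.println(demo("a,b,c", 2)); // [a, b, c]
    }
}
```

In this sketch the retry recomputes all three outputs but skips the two that already reached the topic, so no message is duplicated and no cross-store transaction is needed.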
   
