Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/09/04 20:03:51 UTC

[GitHub] [pulsar] devinbost commented on issue #5116: Provide a flag for message properties pass through in functions

devinbost commented on issue #5116: Provide a flag for message properties pass through in functions
URL: https://github.com/apache/pulsar/issues/5116#issuecomment-528062821
 
 
   I was just about to create a new feature request until I saw this issue. 
   I want to provide an example use case that might be helpful to identify the value of this feature.
   
   Let's say we have a pipeline like this:
   
   > topic1 -> function1 -> topic2 -> function2 -> topic3 -> function3 -> . . . -> topicN -> functionN
   
   Suppose I want to build a message tracing capability for an application that sends a test message through the pipeline and listens to find out how far the message got in the pipeline. (Think of the X-Ray tracing feature for AWS Lambda.) If we can tag a message at the start of the pipeline, then our functions (which share a common base implementation) can automatically detect messages carrying the given tag and report back to our application when they receive one, so the application can tell how far the message got. If we could also attach a timestamp up front, then each subsequent function could compute and report the elapsed latency from the start of the pipeline up to that function.
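
To make the tracing idea concrete, here is a minimal sketch in plain Python. It does not use the Pulsar Functions API; `Message`, `make_stage`, and the property names `trace-id` and `trace-start-ms` are hypothetical stand-ins. It only illustrates what property pass-through would allow: each stage copies the incoming properties to its output and, when the trace tag is present, reports the elapsed time since the start timestamp.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical property names; nothing in Pulsar defines these.
TRACE_ID_KEY = "trace-id"
TRACE_START_KEY = "trace-start-ms"


@dataclass
class Message:
    """Stand-in for a message: a payload plus string properties."""
    payload: str
    properties: Dict[str, str] = field(default_factory=dict)


def make_stage(
    name: str,
    transform: Callable[[str], str],
    reports: List[dict],
    clock: Callable[[], int] = lambda: int(time.time() * 1000),
) -> Callable[[Message], Message]:
    """Wrap a payload transform so that it passes properties through
    and reports tagged messages to `reports` (the tracing application)."""

    def stage(msg: Message) -> Message:
        trace_id = msg.properties.get(TRACE_ID_KEY)
        if trace_id is not None:
            started = msg.properties.get(TRACE_START_KEY)
            latency = clock() - int(started) if started is not None else None
            reports.append(
                {"trace_id": trace_id, "stage": name, "latency_ms": latency}
            )
        # Property pass-through: the output carries the input's properties.
        return Message(transform(msg.payload), dict(msg.properties))

    return stage


# A three-stage pipeline with a fake clock so the latencies are repeatable.
reports: List[dict] = []
ticks = iter([105, 112, 130])
fake_clock = lambda: next(ticks)

pipeline = [
    make_stage("function1", str.upper, reports, fake_clock),
    make_stage("function2", str.strip, reports, fake_clock),
    make_stage("function3", lambda p: p + "!", reports, fake_clock),
]

msg = Message("ping", {TRACE_ID_KEY: "t-1", TRACE_START_KEY: "100"})
for stage in pipeline:
    msg = stage(msg)

for report in reports:
    print(report)
```

Without pass-through, the tag would be lost after `function1`, and the later stages would have nothing to report.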
   
   As another possible (though similar) application, we could randomly sample roughly 1 in every 1,000 messages (or some other rate) passing through a given pipeline for health, performance, or data contract checks, to ensure that messages behave as expected across the many possible logical paths through the pipeline. For example, if multiple actors produce to a single topic, this sampling could expose cases where a percentage of messages are malformed (but not malformed enough to create a backlog), such as a particular field being an empty string when it shouldn't be. Extensive validation of every message may be too expensive on a high-velocity pipeline, but it may be inexpensive on a sample of incoming messages. With a start timestamp carried in the message properties, sampling would also let us compute accurate end-to-end pipeline latencies and other statistics, which would be useful for SLAs.
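
The sampling idea can be sketched the same way. This is plain Python rather than Pulsar code, and `sample_and_validate`, the JSON message shape, and the `user_id` field are assumptions made up for illustration. Each message is selected with probability `rate`, and only the selected messages pay for the validation.

```python
import json
import random
from typing import Iterable, Tuple


def sample_and_validate(
    messages: Iterable[str],
    rate: float,
    required_field: str,
    rng: random.Random,
) -> Tuple[int, int]:
    """Validate a random sample of JSON messages.

    Returns (sampled, malformed), where a sampled message counts as
    malformed if `required_field` is missing or an empty string.
    """
    sampled = 0
    malformed = 0
    for raw in messages:
        # rng.random() is in [0, 1), so rate=1.0 samples everything
        # and rate=0.0 samples nothing.
        if rng.random() >= rate:
            continue
        sampled += 1
        record = json.loads(raw)
        if record.get(required_field, "") == "":
            malformed += 1
    return sampled, malformed


# Synthetic traffic: every 10th message has an empty "user_id".
traffic = [
    json.dumps({"user_id": "" if i % 10 == 0 else f"u{i}"})
    for i in range(100_000)
]

sampled, malformed = sample_and_validate(
    traffic, rate=0.001, required_field="user_id", rng=random.Random(42)
)
print(f"sampled={sampled} malformed={malformed}")
```

The share of malformed messages in the sample estimates the share in the full stream, with accuracy that depends on the sample size, so the rate is a trade-off between validation cost and how quickly a small defect rate becomes visible.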
