Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2017/12/21 01:34:39 UTC

[GitHub] rdhabalia opened a new pull request #1000: Make sure nextTuple emits tuple with non-null values

rdhabalia opened a new pull request #1000: Make sure nextTuple emits tuple with non-null values
URL: https://github.com/apache/incubator-pulsar/pull/1000
 
 
   ### Motivation
   
   Right now, PulsarSpout consumes messages from the queue, converts each one into a tuple, and emits it to the topology. However, if the converted tuple is null, PulsarSpout emits nothing when `nextTuple()` is triggered in the topology. It seems that when `nextTuple()` does not emit a tuple, the [Storm-spout](https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/trident/spout/RichSpoutBatchExecutor.java#L108) does not batch the tuples, which penalizes PulsarSpout.
   Recently, we have seen a client app using `PulsarSpout` that returns null for most messages while converting a Pulsar message to a tuple (`getMessageToValuesMapper()`), as a way to throw those messages away. Because of this behavior, most `nextTuple()` calls emit nothing, which significantly decreases overall spout throughput.
   
   ### Modifications
   
   - Make sure `nextTuple()` always emits a non-null tuple by skipping messages that generate a null tuple.
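
The skipping behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual `PulsarSpout` code and not the Storm API: the class `SkipNullSpoutSketch`, the `toValues` mapper, the `String` message type, and the `Consumer`-based emitter are all hypothetical stand-ins chosen so that the example runs without Pulsar or Storm on the classpath.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustrative stand-in only: not the PulsarSpout source and not the Storm API.
final class SkipNullSpoutSketch {

    // Hypothetical mapper: returns null for messages the application wants to drop.
    static List<Object> toValues(String message) {
        return message.startsWith("skip") ? null : List.of(message);
    }

    // Polls the queue until one message maps to a non-null tuple, emits it, and returns true.
    // Returns false only when the queue runs empty before an emittable message is found.
    static boolean nextTuple(Queue<String> pending,
                             Function<String, List<Object>> mapper,
                             Consumer<List<Object>> emitter) {
        String message;
        while ((message = pending.poll()) != null) {
            List<Object> values = mapper.apply(message);
            if (values != null) {
                emitter.accept(values);
                return true;
            }
            // Null tuple: drop the message and keep looking.
            // (Assumption: a real spout would also acknowledge the dropped message here.)
        }
        return false;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>(List.of("skip-1", "skip-2", "keep-3"));
        List<List<Object>> emitted = new ArrayList<>();
        boolean didEmit = nextTuple(pending, SkipNullSpoutSketch::toValues, emitted::add);
        System.out.println(didEmit + " " + emitted);
    }
}
```

In this sketch, a single `nextTuple` call passes over the two dropped messages and still emits one tuple, whereas returning as soon as the first message mapped to null would have ended the call with nothing emitted.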
   
   ### Result
   
   - It should improve spout processing throughput in use cases where the client app returns a null tuple for a Pulsar message.
   
