Posted to issues@metron.apache.org by "Ryan Merriman (JIRA)" <ji...@apache.org> on 2019/01/17 15:41:00 UTC

[jira] [Created] (METRON-1968) Messages are lost when a parser produces multiple messages and batch size is greater than 1

Ryan Merriman created METRON-1968:
-------------------------------------

             Summary: Messages are lost when a parser produces multiple messages and batch size is greater than 1
                 Key: METRON-1968
                 URL: https://issues.apache.org/jira/browse/METRON-1968
             Project: Metron
          Issue Type: Bug
            Reporter: Ryan Merriman


A bug was discovered where messages are lost when a parser produces multiple messages from a single input message. This happens whenever the batch size for that sensor is set to greater than 1. For example, consider a parser that produces 30 messages from a single input message (one tuple), and assume the batch size for this sensor/parser is set to 10. Currently the batch is flushed only after 10 tuples are received, and only 10 messages are written out, so the other 290 of the 300 produced messages are lost. I think the correct behavior would be for 3 batches of 10 messages to be flushed for every tuple, for a total of 300 messages written for every 10 tuples.

This is happening because the various writer classes/interfaces (BulkWriterComponent, BulkMessageWriter, KafkaWriter, etc.) assume a one-to-one relationship between messages and tuples. The root cause of this specific issue is [here|https://github.com/apache/metron/blob/master/metron-platform/metron-writer/src/main/java/org/apache/metron/writer/kafka/KafkaWriter.java#L236].
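
To make the numbers above concrete, here is a rough, self-contained sketch of the two flushing policies. It is not the actual Metron writer code: the class and method names are hypothetical, and the "one pending slot per tuple" map is only an assumed stand-in for the one-to-one relationship described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration only; none of these names exist in Metron.
class BatchSketch {

    // Assumed model of the reported behavior: one pending slot per tuple, so
    // additional messages from the same tuple overwrite each other, and the
    // batch is flushed once batchSize tuples have been seen.
    static int writtenWhenKeyedByTuple(int tuples, int messagesPerTuple, int batchSize) {
        Map<Integer, String> pending = new LinkedHashMap<>();
        int written = 0;
        for (int t = 0; t < tuples; t++) {
            for (int m = 0; m < messagesPerTuple; m++) {
                pending.put(t, "tuple-" + t + "-msg-" + m); // overwrites the previous message
            }
            if (pending.size() >= batchSize) {
                written += pending.size();
                pending.clear();
            }
        }
        return written;
    }

    // Behavior proposed in this issue: count messages rather than tuples and
    // flush as soon as batchSize messages are pending.
    static int writtenWhenCountedByMessage(int tuples, int messagesPerTuple, int batchSize) {
        List<String> pending = new ArrayList<>();
        int written = 0;
        for (int t = 0; t < tuples; t++) {
            for (int m = 0; m < messagesPerTuple; m++) {
                pending.add("tuple-" + t + "-msg-" + m);
                if (pending.size() >= batchSize) {
                    written += pending.size();
                    pending.clear();
                }
            }
        }
        return written;
    }

    public static void main(String[] args) {
        // 10 tuples, 30 messages per tuple, batch size 10
        System.out.println(writtenWhenKeyedByTuple(10, 30, 10));     // prints 10
        System.out.println(writtenWhenCountedByMessage(10, 30, 10)); // prints 300
    }
}
```

Under these assumptions, the first method writes only 10 messages for 10 tuples, while the second writes all 300, i.e. 3 batches of 10 per tuple.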


