Posted to issues@metron.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/11/23 14:05:58 UTC

[jira] [Commented] (METRON-227) Add Time-Based Flushing to Writer Bolt

    [ https://issues.apache.org/jira/browse/METRON-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690187#comment-15690187 ] 

ASF GitHub Bot commented on METRON-227:
---------------------------------------

Github user nickwallen commented on the issue:

    https://github.com/apache/incubator-metron/pull/188
  
    Opened https://issues.apache.org/jira/browse/INFRA-12959 for Apache Infra to close this PR.


> Add Time-Based Flushing to Writer Bolt
> --------------------------------------
>
>                 Key: METRON-227
>                 URL: https://issues.apache.org/jira/browse/METRON-227
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Domenic Puzio
>            Assignee: Matt Foley
>
> We need to change the BulkMessageWriterBolt and BulkWriterComponent to use time-based flushing when writing data to Elasticsearch or Solr.
> Currently, we set a batch size, and the Writer waits until that many tuples have accumulated. However, Storm has a timeout value that limits how long the Writer can wait: if the Writer does not reach the batch size before the timeout, the tuples are recycled through the topology. In addition, Storm allows only a limited number of pending messages that have not been acked; if too many messages are waiting for the bulk Writer, they are likewise recycled through the topology. This is not the desired behavior, and it directly degrades the performance of this Writer. We would like to be able to specify a time interval after which the topology flushes, writing the data it is currently holding to Elasticsearch or Solr even if the batch size has not been reached.
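
For illustration only, the behavior requested above could be sketched roughly as follows. This is not Metron's actual BulkWriterComponent code; the class TimedBatchBuffer, its constructor parameters, and the injected clock and sink are all hypothetical names chosen for this sketch. It assumes the caller invokes flushIfDue() periodically (for example from a timer or a Storm tick tuple) and picks a maximum age shorter than Storm's tuple timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Metron): buffers messages and hands them
 * to a sink when either the batch is full or the oldest message is too old.
 */
class TimedBatchBuffer<T> {
    private final int batchSize;            // flush when this many messages are held
    private final long maxAgeMillis;        // flush when the oldest message is this old
    private final LongSupplier clock;       // injected clock, e.g. System::currentTimeMillis
    private final Consumer<List<T>> sink;   // stands in for the bulk write to Elasticsearch/Solr
    private final List<T> pending = new ArrayList<>();
    private long oldestArrival;

    TimedBatchBuffer(int batchSize, long maxAgeMillis,
                     LongSupplier clock, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.sink = sink;
    }

    /** Buffers one message and flushes immediately if the batch is now due. */
    void add(T message) {
        if (pending.isEmpty()) {
            oldestArrival = clock.getAsLong();  // the age of a batch starts with its first message
        }
        pending.add(message);
        flushIfDue();
    }

    /** Flushes if the batch is full or stale; meant to be called periodically as well. */
    void flushIfDue() {
        if (pending.isEmpty()) {
            return;
        }
        boolean full = pending.size() >= batchSize;
        boolean stale = clock.getAsLong() - oldestArrival >= maxAgeMillis;
        if (full || stale) {
            sink.accept(new ArrayList<>(pending));  // pass a copy so the sink owns its list
            pending.clear();
        }
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In a real bolt, the sink would also need to ack the tuples it wrote; that bookkeeping is left out here to keep the size-or-time flush condition easy to see.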



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)