Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 02:19:27 UTC

[jira] [Updated] (STORM-24) Refactor internal routing to more efficiently send the same values to multiple tasks

     [ https://issues.apache.org/jira/browse/STORM-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-24:
------------------------------
    Component/s: storm-core

> Refactor internal routing to more efficiently send the same values to multiple tasks
> ------------------------------------------------------------------------------------
>
>                 Key: STORM-24
>                 URL: https://issues.apache.org/jira/browse/STORM-24
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: James Xu
>              Labels: HighPriority
>
> https://github.com/nathanmarz/storm/issues/408
> Storm should be more efficient when sending the same payload to multiple tasks. Rather than creating a separate tuple for each target task, the internal routing should send each target worker a single message of the form [list of task ids, payload]; the recipient then turns that message into a tuple for each listed task in that worker.
> This issue is a prerequisite for having a "stats" stream (for use in dynamically adjusting tasks), as the stats payload is fairly large.
> This issue comprises the following pieces:
> Internal routing changed from being [task id, tuple] to [list of task ids, tuple values, list of message ids]
> Transfer thread turns [list of task ids, tuple values, list of message ids] into as few messages as possible
> Routing thread needs modifications similar to the transfer thread's (the two should probably share code)
> Receiver transforms [list of task ids, tuple values, list of message ids] into a tuple for every task
> Serialization code needs to be refactored to understand this new format
> Tuples aren't created outright, but are created later, once the message reaches the destination worker (because message ids and tuple payload need to be kept separate)
> Another emitDirect that takes in a list of task ids
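
The pieces listed above can be sketched roughly as follows. This is a minimal illustration, not Storm's implementation: the class and method names (FanOutRoutingSketch, WorkerMessage, LocalTuple, groupByWorker, expand) are hypothetical, workers are identified by plain strings, and it assumes exactly one message id per target task, listed in the same order as the task ids. It shows the transfer side collapsing a fan-out into one message per destination worker, and the receiver side creating one tuple per task only after the message has arrived.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: every class and method name here is hypothetical
// and none of it is part of Storm's API.
final class FanOutRoutingSketch {

    // One message per destination worker:
    // [list of task ids, tuple values, list of message ids].
    static final class WorkerMessage {
        final List<Integer> taskIds = new ArrayList<>();
        final List<Object> values;                 // shared payload, carried once
        final List<Long> messageIds = new ArrayList<>();

        WorkerMessage(List<Object> values) {
            this.values = values;
        }
    }

    // What the receiving worker materializes for each of its local tasks.
    static final class LocalTuple {
        final int taskId;
        final List<Object> values;
        final long messageId;

        LocalTuple(int taskId, List<Object> values, long messageId) {
            this.taskId = taskId;
            this.values = values;
            this.messageId = messageId;
        }
    }

    // Transfer side: group the target tasks by worker so that the values are
    // sent once per worker rather than once per task.
    // Assumption: messageIds.get(i) belongs to taskIds.get(i).
    static Map<String, WorkerMessage> groupByWorker(
            List<Integer> taskIds, List<Object> values, List<Long> messageIds,
            Map<Integer, String> taskToWorker) {
        if (taskIds.size() != messageIds.size()) {
            throw new IllegalArgumentException("Need one message id per task id");
        }
        Map<String, WorkerMessage> perWorker = new LinkedHashMap<>();
        for (int i = 0; i < taskIds.size(); i++) {
            Integer taskId = taskIds.get(i);
            String worker = taskToWorker.get(taskId);
            if (worker == null) {
                throw new IllegalArgumentException("No worker assigned to task " + taskId);
            }
            WorkerMessage message =
                    perWorker.computeIfAbsent(worker, w -> new WorkerMessage(values));
            message.taskIds.add(taskId);
            message.messageIds.add(messageIds.get(i));
        }
        return perWorker;
    }

    // Receiver side: tuples are only created here, one per listed task.
    static List<LocalTuple> expand(WorkerMessage message) {
        List<LocalTuple> tuples = new ArrayList<>();
        for (int i = 0; i < message.taskIds.size(); i++) {
            tuples.add(new LocalTuple(
                    message.taskIds.get(i), message.values, message.messageIds.get(i)));
        }
        return tuples;
    }

    public static void main(String[] args) {
        Map<Integer, String> taskToWorker = new LinkedHashMap<>();
        taskToWorker.put(1, "worker-a");
        taskToWorker.put(2, "worker-a");
        taskToWorker.put(3, "worker-b");

        List<Object> values = Arrays.asList("stats", 42);
        Map<String, WorkerMessage> perWorker = groupByWorker(
                Arrays.asList(1, 2, 3), values, Arrays.asList(10L, 20L, 30L), taskToWorker);

        for (Map.Entry<String, WorkerMessage> entry : perWorker.entrySet()) {
            System.out.println(entry.getKey() + " receives tasks " + entry.getValue().taskIds
                    + " and expands them into " + expand(entry.getValue()).size() + " tuple(s)");
        }
    }
}
```

Keeping the message ids in their own list next to the shared values is what allows the payload to be carried once per worker while each task still ends up with its own tuple on the receiving side.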



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)