Posted to dev@storm.apache.org by "Jungtaek Lim (JIRA)" <ji...@apache.org> on 2014/10/10 02:45:34 UTC

[jira] [Commented] (STORM-510) Netty messaging client blocks transfer thread on reconnect

    [ https://issues.apache.org/jira/browse/STORM-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166082#comment-14166082 ] 

Jungtaek Lim commented on STORM-510:
------------------------------------

It may be covered by STORM-329. If STORM-329 does not cover this issue, I will try to address it.

> Netty messaging client blocks transfer thread on reconnect
> ----------------------------------------------------------
>
>                 Key: STORM-510
>                 URL: https://issues.apache.org/jira/browse/STORM-510
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Robert Joseph Evans
>            Priority: Critical
>
> The latest Netty client code attempts to reestablish the connection on failure as part of the send method call. It blocks until the connection is established or a timeout occurs. By default the timeout is about 30 seconds, which is also the default tuple timeout.
> This is exacerbated by the read lock that is held during the send, which prevents the node->socket mapping from changing while we are sending. The lock exists mostly so that we don't close connections while we are trying to write to them, which would cause an exception. As a result, if multiple workers on a node all get rescheduled, we wait the full 30-second timeout for each worker.
> send must be non-blocking in the current design of the worker. Otherwise it prevents other messages from being delivered and is likely to cause many messages to time out on a reschedule.
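
For illustration only, here is a minimal sketch of one way send could avoid blocking on reconnect: the caller only enqueues into a bounded buffer, and a background thread owns connecting and writing. This is not Storm's Netty client and not the change proposed in STORM-329. The names NonBlockingSender and Connection, the capacity parameter, and the 100 ms retry delay are all hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: send() never waits for a (re)connect. */
final class NonBlockingSender {

    /** Hypothetical stand-in for a real network connection. */
    interface Connection {
        void connect() throws Exception;            // may block for a long time
        void write(byte[] message) throws Exception;
    }

    private final BlockingQueue<byte[]> pending;
    private final Connection connection;
    private final Thread flusher;
    private volatile boolean closed = false;

    NonBlockingSender(Connection connection, int capacity) {
        this.connection = connection;
        this.pending = new LinkedBlockingQueue<>(capacity);
        this.flusher = new Thread(this::flushLoop, "sender-flusher");
        this.flusher.setDaemon(true);
        this.flusher.start();
    }

    /**
     * Returns immediately. Returns false when the buffer is full or the
     * sender is closed, so the caller decides whether to drop or retry.
     */
    boolean send(byte[] message) {
        return !closed && pending.offer(message);
    }

    /** Background loop: connects (possibly slowly), then drains the buffer. */
    private void flushLoop() {
        boolean connected = false;
        try {
            while (!closed) {
                try {
                    if (!connected) {
                        connection.connect();       // blocking happens here, off the caller's thread
                        connected = true;
                    }
                    byte[] message = pending.poll(100, TimeUnit.MILLISECONDS);
                    if (message != null) {
                        connection.write(message);
                    }
                } catch (InterruptedException e) {
                    throw e;                        // handled by the outer try
                } catch (Exception e) {
                    // In this sketch a message whose write fails is dropped.
                    connected = false;
                    Thread.sleep(100);              // brief pause before reconnecting
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // exit on close()
        }
    }

    void close() {
        closed = true;
        flusher.interrupt();
    }
}
```

With this shape, a slow or failing connect delays only the messages queued for that one destination, and the calling thread is free to keep delivering to other destinations. The trade-off is that the caller must handle a false return from send when the buffer fills up.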



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)