Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 03:22:27 UTC

[jira] [Closed] (STORM-75) Dead lock between ShellBolt and ShellProcess

     [ https://issues.apache.org/jira/browse/STORM-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg closed STORM-75.
-----------------------------
    Resolution: Fixed

> Dead lock between ShellBolt and ShellProcess
> --------------------------------------------
>
>                 Key: STORM-75
>                 URL: https://issues.apache.org/jira/browse/STORM-75
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-multilang
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/423
> ShellBolt creates a shell process and reads data from its output stream and error stream. The current implementation reads the error stream only once the output stream is closed, so anything the shell process writes to the error stream accumulates in that stream's buffer. When the buffer is full, the shell process blocks on its next write to the error stream, waiting for buffer space to become available. Meanwhile, ShellBolt is blocked waiting for data on the output stream from the shell process. Neither side can proceed, so the result is a deadlock.
> This behavior is dangerous because the problem stays hidden and rarely shows up in normal tests. Error output is normally too small to fill the error stream buffer, but after the system has been running in production for a while, the error output can accumulate until the buffer is full, and then the deadlock occurs. Nothing is written to the log, which makes the problem hard to debug.
> Here at Yahoo we use many native libraries that were built a long time ago and that sometimes write to the error stream when an error occurs. It is impossible for us to inspect all the direct and indirect native library dependencies and rebuild them all to remove every write to the error stream.
> As a workaround, we now redirect the error stream to /dev/null at the beginning of our shell process. In the long term, though, I think this should be fixed in ShellBolt and ShellProcess.
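
The deadlock described above can be avoided in general by draining the child's error stream on its own thread while the parent reads the output stream. The following is a minimal Java sketch of that idea only, not the change that was actually committed to ShellBolt or ShellProcess. The class name StderrDrainSketch and the methods drainInBackground and runAndCollectStdout are hypothetical names chosen for illustration, and the demo command assumes a POSIX "sh" and "head" are available.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Apache Storm.
class StderrDrainSketch {

    // Reads and discards everything on the given stream in a background
    // thread, so the child process never blocks on a full stderr pipe.
    static Thread drainInBackground(InputStream stream) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                while (stream.read(buf) != -1) {
                    // A real implementation would probably log this data
                    // instead of discarding it.
                }
            } catch (IOException ignored) {
                // The stream was closed; there is nothing more to drain.
            }
        }, "stderr-drain");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Starts the command, drains stderr concurrently, and returns the
    // lines the child wrote to stdout.
    static List<String> runAndCollectStdout(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        Thread drainer = drainInBackground(process.getErrorStream());

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        drainer.join();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // The child writes about 200 KB to stderr before writing to stdout.
        // That is typically more than a pipe buffer holds, so a parent that
        // read only stdout would be expected to hang here.
        List<String> out = runAndCollectStdout(Arrays.asList(
                "sh", "-c", "head -c 200000 /dev/zero 1>&2; echo done"));
        System.out.println(out);
    }
}
```

Merging the two streams with ProcessBuilder.redirectErrorStream(true) would also keep the error pipe from filling up, but it would mix stray error text into the stream that carries the multilang messages, which is why the sketch keeps the streams separate.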



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)