Posted to user@storm.apache.org by Patrick Lucas <pl...@yelp.com> on 2013/11/07 03:14:01 UTC

Re: multilang processes left running when one exits abnormally

Does anyone have any knowledge about this?

I investigated a bit further: I added logging for any signals received by the
components that are left running, and sure enough, Storm does not send them
any signal to exit when Storm itself exits after running locally.


On Tue, Aug 27, 2013 at 6:48 PM, Patrick Lucas <pl...@yelp.com> wrote:

> I have run into a problem when testing multilang topologies locally: if
> one multilang process exits abnormally, the others are left running and not
> killed by Storm[1]. When running in a cluster, all processes are killed
> correctly upon failure.
>
> It seems that the correct behavior here would be for Storm to kill all
> multilang processes before exiting, in accordance with the "fail-fast"
> philosophy.
>
> Is this perhaps fixed in a later version? I am running 0.8.3-wip3.
>
> Thanks,
> Patrick Lucas
>
> [1]Example output
>
> Starting subprocesses:
>
> ...
> 1892 [Thread-9] INFO  backtype.storm.task.ShellBolt  - Launched subprocess with pid 488
> ...
> 1946 [Thread-9] INFO  backtype.storm.spout.ShellSpout  - Launched subprocess with pid 494
> ...
> 1999 [Thread-9] INFO  backtype.storm.task.ShellBolt  - Launched subprocess with pid 498
> ...
>
> One subprocess fails:
>
> ...
> 2029 [Thread-24] ERROR backtype.storm.util  - Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: Pipe to subprocess seems to be broken! No output read.
> Shell Process Exception:
>
>
>         at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84)
>         at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:55)
>         at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:56)
>         at backtype.storm.disruptor$consume_loop_STAR_$fn__1597.invoke(disruptor.clj:67)
>         at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
>         at clojure.lang.AFn.run(AFn.java:24)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Pipe to subprocess seems to be broken! No output read.
> ...
> (the traceback is repeated multiple times)
>
> The other processes are left running:
>
> plucas     488 79.0  0.2 129352 105408 pts/55  R    18:18   0:01 python -m <snip>
> plucas     494 78.0  0.2 126844 100960 pts/55  R    18:18   0:01 python -m <snip>
>



-- 
Patrick Lucas