Posted to dev@storm.apache.org by "Philippe Guillebert (JIRA)" <ji...@apache.org> on 2014/12/25 14:52:13 UTC

[jira] [Created] (STORM-606) Attempting to call unbound fn during bolt prepare

Philippe Guillebert created STORM-606:
-----------------------------------------

             Summary: Attempting to call unbound fn during bolt prepare
                 Key: STORM-606
                 URL: https://issues.apache.org/jira/browse/STORM-606
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.3
            Reporter: Philippe Guillebert



We had a bunch of topologies running very well under Storm 0.8.2 until last
week, when we switched to Storm 0.9.2-incubating. We use the Clojure DSL
and Clojure 1.5.1 (only).

Since the change, we have a large topology (about 30 bolts, parallelism=10
or 20 per bolt, total 372 tasks on 10 workers) that fails on startup, with
several bolts showing the exception:

java.lang.RuntimeException: java.lang.IllegalStateException: Attempting to
call unbound fn: #'entry-dedup.bolt/dedup__ at
backtype.storm.clojure.ClojureBolt.prepare(ClojureBolt.java:77) ...

This can occur on one or several bolts at random and is not consistent
between restarts.

The topology is indeed quite slow to initialize (a dozen seconds) because
several models are loaded, but this was OK in 0.8.2.

Another (shorter) topology works most of the time but shows this behaviour
on some restarts.

We found a workaround that works most of the time: start the topology in
the INACTIVE state, wait 200 seconds, then activate it. But this doesn't
really solve our problem, because Storm sometimes rebalances the topologies
by itself and reassigns them without our little trick, effectively crashing
them.
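
For illustration, the delayed-activation part of the workaround could be
scripted roughly as below. This is only a sketch under assumptions: it
presumes the topology was already submitted in the INACTIVE state, that the
`storm` command-line client is on the PATH, and the function name and the
topology name "my-topology" are hypothetical. It does not cover the case
where Storm reassigns the topology by itself.

```python
import subprocess
import time


def activate_after_delay(topology_name, delay_seconds=200,
                         run=subprocess.check_call, sleep=time.sleep):
    """Wait, then activate a topology that was submitted as INACTIVE.

    `run` and `sleep` are injectable so the function can be exercised
    without a real cluster; by default they call the `storm` client and
    block for `delay_seconds` (200 s is the delay from the report above).
    """
    # Give the workers time to finish loading before tuples start flowing.
    sleep(delay_seconds)
    # `storm activate <name>` is the standard CLI command for activation.
    command = ["storm", "activate", topology_name]
    run(command)
    return command


if __name__ == "__main__":
    # Hypothetical topology name; replace with the real one.
    activate_after_delay("my-topology")
```

The delay is a fixed guess rather than a real readiness check, which is why
this only works "most of the time".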

The same behavior is present with Storm 0.9.3.

So maybe something changed in Storm that introduces a kind of race
condition during initialization of some bolts on larger topologies? Maybe
this is a consequence of the switch to Netty?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)