Posted to dev@storm.apache.org by "clay teahouse (JIRA)" <ji...@apache.org> on 2014/12/23 22:09:13 UTC

[jira] [Commented] (STORM-602) HdfsBolt dies when the hadoop node is not available

    [ https://issues.apache.org/jira/browse/STORM-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257536#comment-14257536 ] 

clay teahouse commented on STORM-602:
-------------------------------------

Further information on the issue:
1) When starting the topology, if the Hadoop nodes are not available, you get a "Worker died" message, and the HdfsBolt and the entire topology die.
java.lang.RuntimeException: ("Worker died")
......
2) If the topology is running and the Hadoop nodes then become unavailable, you get a connection-refused error. When the Hadoop nodes become available again, the HdfsBolt never recovers. It keeps giving the following error:
 org.apache.storm.hdfs.bolt.HdfsBolt - write/sync failed.
 All datanodes ....... are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1008) ~[hadoop-hdfs-2.2.0.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823) ~[hadoop-hdfs-2.2.0.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475) ~[hadoop-hdfs-2.2.0.jar:na]
If you restart the topology, everything is OK and the HdfsBolt can write to the HDFS nodes again.
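Since a restart clears the problem, the broken state appears to live in the bolt's cached output stream: the old write pipeline stays "bad" after the datanodes return. A minimal sketch of a reopen-on-failure pattern such a bolt could apply is below. This is not the storm-hdfs code; `Writer` and `WriterFactory` are hypothetical stand-ins for the HDFS output stream and for reopening it (e.g. via FileSystem), and the retry count is arbitrary. The point is only that a failed write/sync should discard the stream and reopen it, rather than let the RuntimeException escape execute() and halt the worker.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class ReopenOnFailureSketch {

    /** Hypothetical stand-in for an HDFS output stream that can go bad. */
    interface Writer {
        void write(String record) throws IOException;
        void close();
    }

    /** Hypothetical stand-in for (re)creating the stream against HDFS. */
    interface WriterFactory {
        Writer open() throws IOException;
    }

    private final WriterFactory factory;
    private Writer writer;

    ReopenOnFailureSketch(WriterFactory factory) throws IOException {
        this.factory = factory;
        this.writer = factory.open();
    }

    /**
     * Mirrors a bolt's execute(): on a write failure, discard the bad
     * stream and retry once on a fresh one; if that also fails, report
     * failure (a real bolt would fail the tuple for replay) instead of
     * throwing and killing the worker.
     */
    boolean process(String record) {
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                writer.write(record);
                return true;               // a real bolt would ack here
            } catch (IOException writeFailed) {
                writer.close();            // old pipeline is bad; drop it
                try {
                    writer = factory.open();   // reopen instead of dying
                } catch (IOException reopenFailed) {
                    return false;          // HDFS still down; fail the tuple
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Simulate the reported scenario: the first stream's pipeline is
        // broken ("All datanodes are bad"), the reopened stream works.
        AtomicInteger opens = new AtomicInteger();
        WriterFactory factory = () -> {
            int generation = opens.incrementAndGet();
            return new Writer() {
                public void write(String r) throws IOException {
                    if (generation == 1) {
                        throw new IOException("All datanodes are bad");
                    }
                }
                public void close() {}
            };
        };
        ReopenOnFailureSketch bolt = new ReopenOnFailureSketch(factory);
        System.out.println(bolt.process("record-1"));
        System.out.println(opens.get());
    }
}
```

Running the simulation, the first write fails, the stream is reopened, and the retry succeeds, so the record is processed without the process halting.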

> HdfsBolt dies when the hadoop node is not available
> ---------------------------------------------------
>
>                 Key: STORM-602
>                 URL: https://issues.apache.org/jira/browse/STORM-602
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-hdfs
>    Affects Versions: 0.9.3
>         Environment: Ubuntu 14.04
>            Reporter: clay teahouse
>
> When the Hadoop nodes are not available, HdfsBolt generates the following runtime error and dies, and the topology dies with it.
> 12154 [Thread-50-hdfsBolt2] ERROR backtype.storm.util - Halting process: ("Worker died")
> java.lang.RuntimeException: ("Worker died")
>         at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:319) [storm-core-0.9.3-SNAPSHOT.jar:0.9.3-SNAPSHOT]
>         at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
>         at backtype.storm.daemon.worker$fn__4770$fn__4771.invoke(worker.clj:452) [storm-core-0.9.3-SNAPSHOT.jar:0.9.3-SNAPSHOT]
>         at backtype.storm.daemon.executor$mk_executor_data$fn__3287$fn__3288.invoke(executor.clj:239) [storm-core-0.9.3-SNAPSHOT.jar:0.9.3-SNAPSHOT]
>         at backtype.storm.util$async_loop$fn__458.invoke(util.clj:467) [storm-core-0.9.3-SNAPSHOT.jar:0.9.3-SNAPSHOT]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)