Posted to commits@nifi.apache.org by "Aldrin Piri (JIRA)" <ji...@apache.org> on 2015/11/01 04:34:27 UTC

[jira] [Comment Edited] (NIFI-1091) Protocol exception can prevent app from receiving commands from bootstrap

    [ https://issues.apache.org/jira/browse/NIFI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984248#comment-14984248 ] 

Aldrin Piri edited comment on NIFI-1091 at 11/1/15 3:33 AM:
------------------------------------------------------------

A bit of misdirection with my original report.  This seems to be what happens when there is a flow.tar.stale (a result of the issue in NIFI-730).

To recreate in a cluster:
# Have a cluster established with manager and one or more nodes
# With all nodes stopped, create flow.tar.stale for the manager (to test recreation, I simply renamed flow.tar to flow.tar.stale)
# Start manager (this fails due to the flow.tar)
# Start one or more nodes; these fail with the above ProtocolException
# Attempt to shut down the nodes and observe the command fail due to SocketTimeoutException




was (Author: aldrin):
A bit of misdirection with my original report.  This seems to be what happens when there is a flow.tar.stale (a result of the issue in NIFI-730).

To recreate in a cluster:
# Have a cluster established with manager and one or more nodes
# With all nodes, stopped create flow.tar.stale (in my case to test recreation, I simply renamed flow.tar to flow.tar.stale) for manager
# Start manager (this fails due to the flow.tar)
# Start one or more nodes, these fail with the above ProtocolException
# Attempt to shutdown nodes and have command fails due to SocketTimeoutException



> Protocol exception can prevent app from receiving commands from bootstrap
> -------------------------------------------------------------------------
>
>                 Key: NIFI-1091
>                 URL: https://issues.apache.org/jira/browse/NIFI-1091
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 0.3.0
>            Reporter: Aldrin Piri
>         Attachments: nifi-1091.dump
>
>
> In the error reported in the NIFI-730 comment (https://issues.apache.org/jira/browse/NIFI-730?focusedCommentId=14984153&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14984153), it appears that the node's disconnection blocks it from receiving commands from the bootstrap, which fails with the following:
> {code}
> 2015-10-31 22:56:05,171 ERROR [main] org.apache.nifi.bootstrap.Command Failed to send shutdown command to port 53666 due to java.net.SocketTimeoutException: Read timed out
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_60]
>         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_60]
>         at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[na:1.8.0_60]
>         at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_60]
>         at java.net.SocketInputStream.read(SocketInputStream.java:223) ~[na:1.8.0_60]
>         at org.apache.nifi.bootstrap.RunNiFi.stop(RunNiFi.java:647) [nifi-bootstrap-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
>         at org.apache.nifi.bootstrap.RunNiFi.main(RunNiFi.java:206) [nifi-bootstrap-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> {code}
> The nifi-app.log shows continuous attempts to reconnect with the manager:
> {code}
> 2015-10-31 22:55:23,870 WARN [main] o.a.nifi.controller.StandardFlowService Failed to connect to cluster due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed to create socket due to: java.net.ConnectException: Connection refused
> org.apache.nifi.cluster.protocol.ProtocolException: Failed to create socket due to: java.net.ConnectException: Connection refused
> 	at org.apache.nifi.cluster.protocol.impl.NodeProtocolSenderImpl.createSocket(NodeProtocolSenderImpl.java:145) ~[nifi-framework-cluster-protocol-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.cluster.protocol.impl.NodeProtocolSenderImpl.requestConnection(NodeProtocolSenderImpl.java:68) ~[nifi-framework-cluster-protocol-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.cluster.protocol.impl.NodeProtocolSenderListener.requestConnection(NodeProtocolSenderListener.java:93) ~[nifi-framework-cluster-protocol-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.controller.StandardFlowService.connect(StandardFlowService.java:652) [nifi-framework-core-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:397) [nifi-framework-core-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:770) [nifi-jetty-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.NiFi.<init>(NiFi.java:137) [nifi-runtime-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.NiFi.main(NiFi.java:227) [nifi-runtime-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> Caused by: java.net.ConnectException: Connection refused
> 	at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_60]
> 	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_60]
> 	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_60]
> 	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_60]
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_60]
> 	at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_60]
> 	at java.net.Socket.connect(Socket.java:538) ~[na:1.8.0_60]
> 	at java.net.Socket.<init>(Socket.java:434) ~[na:1.8.0_60]
> 	at java.net.Socket.<init>(Socket.java:211) ~[na:1.8.0_60]
> 	at org.apache.nifi.io.socket.SocketUtils.createSocket(SocketUtils.java:59) ~[nifi-socket-utils-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	at org.apache.nifi.cluster.protocol.impl.NodeProtocolSenderImpl.createSocket(NodeProtocolSenderImpl.java:143) ~[nifi-framework-cluster-protocol-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
> 	... 7 common frames omitted
> {code}
> Will attach thread dump.
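
For readers unfamiliar with the bootstrap-side symptom, the first trace above is a read timeout: the connection to the command port succeeds, but no reply arrives before the socket's read timeout elapses. The following is a minimal, standalone sketch of that symptom only. It is not NiFi code and does not establish the cause of this issue. The class name, the command text, and the timeout values are hypothetical, chosen purely for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of NiFi or its bootstrap.
public class ReadTimeoutDemo {

    /**
     * Connects to a local listener that never replies and reports whether
     * the client's read ended in a SocketTimeoutException.
     */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        // Bind to an ephemeral loopback port. accept() is never called, so the
        // OS completes the TCP handshake from the listen backlog, but no
        // application code ever reads the command or writes a response.
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {

            // Limit how long a blocking read may wait for data.
            client.setSoTimeout(timeoutMillis);

            OutputStream out = client.getOutputStream();
            // Hypothetical command text; the real bootstrap protocol is not shown here.
            out.write("SHUTDOWN\n".getBytes(StandardCharsets.UTF_8));
            out.flush();

            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks until data arrives or the timeout elapses
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

The sketch shows that a successful connect followed by "Read timed out" points at a listener whose process is alive but not servicing the request, which is consistent with the behavior described in this issue.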



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)