Posted to issues@nifi.apache.org by "Reinhard Sell (Jira)" <ji...@apache.org> on 2021/06/28 15:54:00 UTC
[jira] [Created] (NIFI-8746) ListenRELP does not reliably recover from errors
Reinhard Sell created NIFI-8746:
-----------------------------------
Summary: ListenRELP does not reliably recover from errors
Key: NIFI-8746
URL: https://issues.apache.org/jira/browse/NIFI-8746
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 1.13.2
Reporter: Reinhard Sell
The ListenRELP processor sometimes does not recover from errors (e.g. RELPFrameException), in particular if such an error occurs more than once. A manual stop and start of the processor is then necessary to re-establish communication with the client (rsyslog).
h3. How to reproduce:
* Enable DEBUG logging for {{org.apache.nifi.processors.standard.ListenRELP}}
* Create a simple flow with a ListenRELP processor and set a valid port (e.g. 12345). Leave all other properties at their defaults, especially *Max Number of TCP Connections = 2*.
* Connect the output of ListenRELP to a funnel and start the processor.
* Install the tool {{nc}} (netcat).
* Use {{nc}} to provide some correct and also some invalid data as follows:
Start RELP session on command line:
{{$ nc 127.0.0.1 12345}}
Enter the following to open the connection:
{{1 open 0}}
Expect the following response:
{{1 rsp 7 200 OK}}
{{}}
Enter the following to submit a valid line:
{{2 syslog 3 abc}}
Expect the following response:
{{2 rsp 6 200 OK}}
Now enter an invalid line:
{{3 syslog -1}}
Expect a RELPFrameException in the logs and *no* response in the {{nc}} session.
NiFi will no longer respond on this connection, even to valid lines, which is acceptable
according to the RELP spec.
Press Ctrl-C to end the {{nc}} session.
Open a new {{nc}} session and repeat the same commands.
It should work a second time, since two TCP connections are allowed.
However, it will not work a third or fourth time: at some point ListenRELP stops responding at all, even on a completely new connection. The only way to recover from this state seems to be to stop and start the processor.
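The manual {{nc}} steps above could also be scripted. The following Python sketch is only an illustration, not part of the original report: the helper names {{build_frame}}, {{run_session}} and {{repeat_sessions}} are hypothetical, and it assumes each response arrives in a single {{recv}} call within the timeout.

```python
import socket


def build_frame(txnr: int, command: str, data: bytes = b"") -> bytes:
    """Build a RELP frame: TXNR SP COMMAND SP DATALEN [SP DATA] LF."""
    header = f"{txnr} {command} {len(data)}".encode("ascii")
    if data:
        return header + b" " + data + b"\n"
    return header + b"\n"


# The deliberately invalid frame from the steps above (negative data length).
INVALID_FRAME = b"3 syslog -1\n"


def run_session(host: str, port: int, timeout: float = 2.0) -> list:
    """Send open, one valid syslog frame and the invalid frame on one connection.

    Returns one entry per frame sent: the raw bytes received,
    or None if nothing arrived before the timeout.
    """
    frames = (
        build_frame(1, "open"),
        build_frame(2, "syslog", b"abc"),
        INVALID_FRAME,
    )
    responses = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for frame in frames:
            sock.sendall(frame)
            try:
                responses.append(sock.recv(4096))
            except socket.timeout:
                responses.append(None)
    return responses


def repeat_sessions(host: str, port: int, attempts: int = 4) -> list:
    """Run several sessions back to back, each on a fresh connection."""
    return [run_session(host, port) for _ in range(attempts)]
```

Calling {{repeat_sessions("127.0.0.1", 12345)}} against the flow described above corresponds to repeating the {{nc}} session four times. If the behaviour matches this report, the later sessions would be expected to return {{None}} even for the {{open}} frame.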
Also: At some point in time (after all connections have been used up?) the following DEBUG message is printed *very* often (several times per ms!):
{{o.a.nifi.processors.standard.ListenRELP ListenRELP[id=<uuid>] No more data available, returning for selection}}
This behaviour is a problem for our production setup: it does not happen very often, but it does happen, and data might be lost if this state is not detected and resolved fast enough.
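As a stopgap for detecting this state from the outside, a small probe could check whether the listener still answers a RELP {{open}} frame. This is a minimal sketch under assumptions: the function name {{relp_listener_responds}} is hypothetical, the probe briefly occupies one of the allowed TCP connections, and it closes the connection without a RELP {{close}} command, which is assumed (not verified) to be harmless.

```python
import socket


def relp_listener_responds(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if the listener answers a RELP 'open' frame with an 'rsp' frame."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"1 open 0\n")
            reply = sock.recv(4096)
    except OSError:
        # Covers refused connections and timeouts (socket.timeout is an OSError).
        return False
    # A healthy listener is expected to answer with "1 rsp ...".
    return reply.startswith(b"1 rsp ")
```

A result of {{False}} from {{relp_listener_responds("127.0.0.1", 12345)}} would indicate that the processor may need the manual stop and start described above.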
Disclaimer: Sending an invalid RELP frame is *not* what happens in our production environment. It's just a simple way to get ListenRELP into this state.
We are not sure about the root cause of the communication interruption; it may be a network or firewall issue. But the result looks very much like what is described here.