You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "Bryan Bende (JIRA)" <ji...@apache.org> on 2017/09/18 13:24:00 UTC

[jira] [Created] (NIFI-4391) PutTCP not properly closing connections

Bryan Bende created NIFI-4391:
---------------------------------

             Summary: PutTCP not properly closing connections
                 Key: NIFI-4391
                 URL: https://issues.apache.org/jira/browse/NIFI-4391
             Project: Apache NiFi
          Issue Type: Bug
    Affects Versions: 1.3.0, 1.2.0, 1.1.0, 1.0.0
            Reporter: Bryan Bende
            Assignee: Bryan Bende


Thread from the mailing list...

We are using NiFi PutTCP processors to send messages to a number of Moxa
onCell ip gateway devices.

These Moxa devices are running on a cellular network with not always the
most ideal connection. The Moxa only allows for a maximum of 2 simultaneous
client connections.

What we notice is that although we specify connection / read timeouts on
both PutTCP and the Moxa, that sometimes a connection get "stuck". (In the
moxa network monitoring we see 2 client sockets coming from PutTCP in the
ESTABLISHED state that never go away).

This doesn't always happen, but often enough for it to be considered a
problem, as it requires a restart of the moxa ports to clear the connections
(manual step). It typically happens when PutTCP experiences a Timeout.

On the PutTCP processors we have the following settings :

- Idle Connection Expiration : 30 seconds  (we've set this higher due to bad
gprs connection)
- Timeout : 10 seconds (this is only used as a timeout for establishing the
connection)

On the Moxas we have

- TCP alive check time : 2min (this should force the Moxa to close the
socket)

Yet for some reason the connection remains established.

Here's what I found out :

On the moxa I noticed a connection (with client port 48440) that is in
ESTABLISHED mode for 4+ hours. (blocking other connections). On the Moxa I
can see when the connection was established :

2017/09/17 14:20:29 [OpMode] Port01 Connect 10.192.2.90:48440

I can track that down in Nifi via the logs (unfortunately PutTCP doesn't log
client ports, but from the timestamp  I'm sure it's this connection :

{code}
2017-09-17 14:20:10,837 DEBUG [Timer-Driven Process Thread-10]
o.apache.nifi.processors.standard.PutTCP
PutTCP[id=80231a39-1008-1159-a6fa-1f9e3751d608] No available connections,
creating a new one...
2017-09-17 14:20:20,860 ERROR [Timer-Driven Process Thread-10]
o.apache.nifi.processors.standard.PutTCP
PutTCP[id=80231a39-1008-1159-a6fa-1f9e3751d608] No available connections,
and unable to create a new one, transferring
StandardFlowFileRecord[uuid=79f2a166-5211-4d2d-9275-03f0ce4d5b29,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1505641210025-1, container=default,
section=1], offset=84519, length=9],offset=0,name=23934743676390659,size=9]
to failure: java.net.SocketTimeoutException: Timed out connecting to
10.32.133.40:4001
2017-09-17 14:20:20,860 ERROR [Timer-Driven Process Thread-10]
o.apache.nifi.processors.standard.PutTCP
java.net.SocketTimeoutException: Timed out connecting to 10.32.133.40:4001
        at
org.apache.nifi.processor.util.put.sender.SocketChannelSender.open(SocketChannelSender.java:66)
~[nifi-processor-utils-1.1.0.jar:1.1.0]
        at
org.apache.nifi.processor.util.put.AbstractPutEventProcessor.createSender(AbstractPutEventProcessor.java:312)
~[nifi-processor-utils-1.1.0.jar:1.1.0]
        at
org.apache.nifi.processors.standard.PutTCP.createSender(PutTCP.java:121)
[nifi-standard-processors-1.1.0.jar:1.1.0]
        at
org.apache.nifi.processor.util.put.AbstractPutEventProcessor.acquireSender(AbstractPutEventProcessor.java:334)
~[nifi-processor-utils-1.1.0.jar:1.1.0]
        at
org.apache.nifi.processors.standard.PutTCP.onTrigger(PutTCP.java:176)
[nifi-standard-processors-1.1.0.jar:1.1.0]
        at
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099)
[nifi-framework-core-1.1.0.jar:1.1.0]
        at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
[nifi-framework-core-1.1.0.jar:1.1.0]
        at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
[nifi-framework-core-1.1.0.jar:1.1.0]
        at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
[nifi-framework-core-1.1.0.jar:1.1.0]
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[na:1.8.0_111]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
[na:1.8.0_111]
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
[na:1.8.0_111]
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
[na:1.8.0_111]
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_111]
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_111]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
{code}

On an OS level I indeed see the TCP connection originated from Nifi :

netstat -tn | grep 48440

tcp      711      0 10.192.2.90:48440       10.32.133.40:4001
ESTABLISHED


lsof -i TCP:48440

COMMAND  PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
java    3424 root 1864u  IPv4 404675057      0t0  TCP
NifiServer:48440->10.32.133.40:newoak (ESTABLISHED)

ps -ef | grep 3424

root      3424  3390  8 11:32 ?        00:44:33 java -classpath
/opt/nifi-1.1.0/./conf.....

Any ideas on how to debug this further ?

I looked at the code in SocketChannelSender.java:66 and was wondering it no
additional cleanup is necessary when throwing the SocketTimeoutException. If
the connection ends up getting established after the timeout, doesn't PutTCP
need to clean that up ?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)