Posted to issues@nifi.apache.org by "Sam Williams (Jira)" <ji...@apache.org> on 2021/12/09 03:44:00 UTC
[jira] [Created] (NIFI-9463) Large file downloads timeout
Sam Williams created NIFI-9463:
----------------------------------
Summary: Large file downloads timeout
Key: NIFI-9463
URL: https://issues.apache.org/jira/browse/NIFI-9463
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 1.15.0, 1.12.1
Environment: CentOS 7, Docker, 3-node cluster, SSL, certificate authentication, JVM Heap 4GB
Reporter: Sam Williams
When attempting to download large files (greater than 500MB) from a queue or from provenance, the request times out and the file does not download. The HTTP response from NiFi is:
{code:java}
HTTP ERROR 503: Service Unavailable
URI: /nifi-api/flowfile-queues/<queue-id>/flowfiles/<flowfile-id>/content
STATUS: 503
MESSAGE: Service Unavailable
SERVLET: jerseySpring
{code}
{code:java}
nifi-app.log:
<DTG> WARN [Replicate Request Thread-1337] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
java.net.SocketTimeoutException: timeout
<...>
{code}
{code:java}
nifi.properties:
nifi.cluster.node.connection.timeout=120 secs
nifi.cluster.node.read.timeout=120 secs
nifi.web.request.timeout=120 secs
{code}
As I have increased the timeout values and the JVM heap size, I have managed to download larger and larger files, but download time does not appear to scale linearly with file size (e.g. 500MB might take ~30sec, while 600MB takes ~90sec).
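To make the non-linearity concrete, here is a small sketch that turns the two rough timings above into effective throughput. The sizes and durations are the approximate figures quoted in this report, not precise measurements, and the class and method names are made up for illustration.

```java
import java.util.Locale;

// Illustrative only: converts the approximate timings quoted above
// into effective throughput so the slowdown is easier to see.
class DownloadRateSketch {

    // Effective throughput in MB/s for a download of sizeMb taking `seconds`.
    static double rateMbPerSec(double sizeMb, double seconds) {
        return sizeMb / seconds;
    }

    public static void main(String[] args) {
        double small = rateMbPerSec(500, 30); // roughly 16.7 MB/s
        double large = rateMbPerSec(600, 90); // roughly 6.7 MB/s

        System.out.printf(Locale.ROOT, "500MB: %.1f MB/s%n", small);
        System.out.printf(Locale.ROOT, "600MB: %.1f MB/s%n", large);
        // A 20% larger file sees about 2.5x lower effective throughput.
        System.out.printf(Locale.ROOT, "slowdown: %.1fx%n", small / large);
    }
}
```

If the transfer were purely bandwidth-bound, the two rates would be roughly equal, so a drop of this size suggests some other cost that grows with file size.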
This has been happening since at least 1.12.0, and I believe it relates to the implementation of the Jersey client; see NIFI-5112 ("Inefficiency in replicating requests across cluster").
My guess is that the flowfile content is streamed back to the node serving the UI, which buffers the content in memory before streaming it to the client.
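To illustrate the difference I have in mind, here is a minimal sketch contrasting the two relay strategies. This is not NiFi's code: `ContentRelaySketch`, `relayBuffered` and `relayStreaming` are hypothetical names, and the sketch only assumes that the node serving the UI reads the content from another node as an `InputStream` and writes it to the client as an `OutputStream`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration of the suspected behaviour; not taken from NiFi.
class ContentRelaySketch {

    // Suspected pattern: the whole body is held on the heap before any byte
    // is sent on, so memory use grows with the size of the flowfile.
    static long relayBuffered(InputStream fromNode, OutputStream toClient) throws IOException {
        byte[] all = fromNode.readAllBytes(); // Java 9+
        toClient.write(all);
        return all.length;
    }

    // Alternative pattern: copy in fixed-size chunks, so memory use stays
    // constant and the client starts receiving data immediately.
    static long relayStreaming(InputStream fromNode, OutputStream toClient) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = fromNode.read(chunk)) != -1) {
            toClient.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] content = new byte[1_000_000]; // stand-in for flowfile content
        ByteArrayOutputStream client = new ByteArrayOutputStream();
        long copied = relayStreaming(new ByteArrayInputStream(content), client);
        System.out.println("streamed bytes: " + copied);
    }
}
```

Both methods deliver the same bytes; the difference is that the buffered variant needs heap proportional to the file size and delays the first byte to the client, which would be consistent with larger heaps and longer timeouts allowing larger downloads.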
--
This message was sent by Atlassian Jira
(v8.20.1#820001)