Posted to issues@nifi.apache.org by "Mark Payne (Jira)" <ji...@apache.org> on 2021/11/18 15:54:00 UTC

[jira] [Updated] (NIFI-9382) Improve startup time when loading flow that uses many HDFS related processors

     [ https://issues.apache.org/jira/browse/NIFI-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-9382:
-----------------------------
    Status: Patch Available  (was: Open)

> Improve startup time when loading flow that uses many HDFS related processors
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-9382
>                 URL: https://issues.apache.org/jira/browse/NIFI-9382
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When starting NiFi, if a flow has many HDFS-related processors (hundreds to thousands), the startup time can be very long. In one case, I have a user flow with more than 1000 HDFS processors, and it takes 1-2 hours to fully start NiFi.
> This is because the HDFS client makes a lot of assumptions about the environment that it's running in. Unfortunately, these assumptions are not always true when running in NiFi. The use of static methods in the UserGroupInformation class means that in order to interact with an HDFS cluster using multiple Kerberos Principals, we have to create ClassLoader isolation, using a separate, duplicate ClassLoader for each HDFS processor.
> Because of this, the HDFS client components must be initialized once for each processor, and the initialization of the client is very expensive. We need to improve this so that we don't create a separate ClassLoader that loads hundreds or thousands of classes for each instance of the Processor.
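
The isolation described above can be illustrated with a minimal, self-contained Java sketch. It is not NiFi or Hadoop code: StaticLogin is a hypothetical stand-in for a class that keeps login state in a static field, and IsolatingLoader is a hypothetical loader that defines its own copy of that one class. The sketch assumes Java 9 or later and that the compiled classes are readable as resources on the classpath. It shows why each loader sees its own static state, and also why every extra loader has to load and initialize its classes again.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

final class ClassLoaderIsolationSketch {

    /** Hypothetical stand-in for a class that keeps login state in a static field. */
    public static class StaticLogin {
        private static String currentUser;
        public static void login(String user) { currentUser = user; }
        public static String currentUser() { return currentUser; }
    }

    /** Defines its own copy of StaticLogin instead of delegating to the parent loader. */
    static final class IsolatingLoader extends ClassLoader {
        private static final String TARGET = StaticLogin.class.getName();

        IsolatingLoader() {
            super(ClassLoaderIsolationSketch.class.getClassLoader());
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!TARGET.equals(name)) {
                return super.loadClass(name, resolve); // everything else is shared with the parent
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Re-read the class bytes and define a fresh copy owned by this loader.
                    String resource = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(resource)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Calls a static method on the copy of StaticLogin owned by the given loader. */
    private static Object call(ClassLoader loader, String method, Object... args) {
        try {
            Class<?> copy = Class.forName(StaticLogin.class.getName(), true, loader);
            Class<?>[] types = new Class<?>[args.length];
            java.util.Arrays.fill(types, String.class);
            Method m = copy.getMethod(method, types);
            m.setAccessible(true);
            return m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** One loader for everything: the second login overwrites the first. */
    static String sharedLogins() {
        StaticLogin.login("alice");
        StaticLogin.login("bob");
        return StaticLogin.currentUser();
    }

    /** One loader per "processor": each copy of the class keeps its own static field. */
    static String[] isolatedLogins() {
        ClassLoader a = new IsolatingLoader();
        ClassLoader b = new IsolatingLoader();
        call(a, "login", "alice");
        call(b, "login", "bob");
        return new String[] {
            (String) call(a, "currentUser"),
            (String) call(b, "currentUser")
        };
    }

    public static void main(String[] args) {
        System.out.println("shared:   " + sharedLogins());
        String[] isolated = isolatedLogins();
        System.out.println("isolated: " + isolated[0] + ", " + isolated[1]);
    }
}
```

In this sketch only one small class is duplicated per loader. In the situation the issue describes, the duplicated set is the HDFS client's hundreds or thousands of classes, repeated for every processor instance, which is where the startup cost comes from.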



--
This message was sent by Atlassian Jira
(v8.20.1#820001)