Posted to issues@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2018/12/05 15:00:00 UTC

[jira] [Updated] (NIFI-5868) Instrument robust timing information for ListFile

     [ https://issues.apache.org/jira/browse/NIFI-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-5868:
-----------------------------
    Fix Version/s: 1.9.0
           Status: Patch Available  (was: Open)

> Instrument robust timing information for ListFile
> -------------------------------------------------
>
>                 Key: NIFI-5868
>                 URL: https://issues.apache.org/jira/browse/NIFI-5868
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.9.0
>
>
> ListFile is used in many different contexts. We often see users with a specific use case, though, which is to run ListFile on a Primary Node in a cluster, in order to obtain a file listing of an NFS-mounted share. This works well in most cases, but whenever problems do arise, it is very difficult to understand what the problem is. It would be very helpful to have information such as:
>  * Is there a problem accessing a specific file on the NFS mount?
>  * Is there a problem obtaining a listing from the NFS mount?
>  * Is progress being made at all?
>  * How long is a listing taking right now?
>  * How long does a listing typically take?
>  * Is this problem related to NiFi or to the operating system / infrastructure?
> It would be helpful to track information about each disk access that is occurring, as well as the overall listing progress, and to issue warnings if we see clear problems arise. We can do this by recording how long each disk access takes, what file was being accessed, and what operation was being performed. If we capture this data in a rolling window, we can assess it to determine whether the listing is now taking longer than it did previously and alert on that, or alert if a specific disk operation is taking a long time.
> Gathering this information will likely be fairly heap intensive, so it is best to make the functionality optional.
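
The rolling-window idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch attached to this issue: the class name RollingTimingWindow, its methods, the fixed sample capacity, and the "slower than N times the average" rule are all hypothetical and are not part of the NiFi API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (not part of NiFi): keeps the most recent disk-access
 * timings in a bounded window so heap usage stays proportional to capacity.
 */
class RollingTimingWindow {

    /** One timed disk access: which file, which operation, how long it took. */
    static final class Sample {
        final String path;
        final String operation;
        final long nanos;

        Sample(String path, String operation, long nanos) {
            this.path = path;
            this.operation = operation;
            this.nanos = nanos;
        }
    }

    private final int capacity;
    private final Deque<Sample> samples = new ArrayDeque<>();
    private long totalNanos;

    RollingTimingWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records one access, evicting the oldest sample once the window is full. */
    synchronized void record(String path, String operation, long nanos) {
        if (samples.size() == capacity) {
            totalNanos -= samples.removeFirst().nanos;
        }
        samples.addLast(new Sample(path, operation, nanos));
        totalNanos += nanos;
    }

    /** Average duration over the current window, or 0.0 if nothing was recorded. */
    synchronized double averageNanos() {
        return samples.isEmpty() ? 0.0 : (double) totalNanos / samples.size();
    }

    /**
     * Returns true if the given duration exceeds the window average by more
     * than the given factor. An empty window never reports a slow access.
     */
    synchronized boolean isSlow(long nanos, double factor) {
        return !samples.isEmpty() && nanos > averageNanos() * factor;
    }

    synchronized int size() {
        return samples.size();
    }
}
```

A caller would wrap each disk access with System.nanoTime() before and after, check isSlow(...) against the existing window to decide whether to log a warning naming the file and operation, and then call record(...). Bounding the window by sample count is one way to cap the heap cost mentioned above; a time-based window would be an equally reasonable choice.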



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)