Posted to issues@nifi.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/23 12:41:00 UTC

[jira] [Commented] (NIFI-5228) Allow user to choose whether or not to add File Attributes as FlowFile Attributes when using ListFile

    [ https://issues.apache.org/jira/browse/NIFI-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487180#comment-16487180 ] 

ASF GitHub Bot commented on NIFI-5228:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2733#discussion_r190229672
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListFile.java ---
    @@ -255,43 +262,47 @@ public void onScheduled(final ProcessContext context) {
             final Path absPath = filePath.toAbsolutePath();
             final String absPathString = absPath.getParent().toString() + File.separator;
     
    +        final DateFormat formatter = new SimpleDateFormat(FILE_MODIFY_DATE_ATTR_FORMAT, Locale.US);
    --- End diff --
    
    We don't recommend ever using ThreadLocal in NiFi processors, because each run of a Processor may happen on a different thread. A large deployment could have hundreds of threads, and those threads stay around for the life of the instance, so cleanup is a little awkward. The pattern we commonly follow is a simple object pool: poll from a BlockingQueue, create a new object if the queue is empty, and put the object back when done. I did consider that here but decided the complexity it adds to the code was not worth it, given the low cost of creating the DateFormat.
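
    For reference, the BlockingQueue pattern described above might look roughly like the sketch below. This is only an illustration and not code from the pull request: the class name DateFormatPool, the pool capacity of 10, and the pattern string (standing in for FILE_MODIFY_DATE_ATTR_FORMAT) are all assumptions.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: a bounded pool of DateFormat instances shared across threads.
final class DateFormatPool {
    // Assumed pattern; the real value of FILE_MODIFY_DATE_ATTR_FORMAT may differ.
    private static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ssZ";

    // Bounded so the pool never holds more than 10 idle formatters (arbitrary choice).
    private final BlockingQueue<DateFormat> pool = new LinkedBlockingQueue<>(10);

    String format(final Date date) {
        // Non-blocking poll: returns null when no idle formatter is available.
        DateFormat formatter = pool.poll();
        if (formatter == null) {
            formatter = new SimpleDateFormat(PATTERN, Locale.US);
        }
        try {
            // Each formatter is held by one thread at a time, so this is safe
            // even though SimpleDateFormat itself is not thread-safe.
            return formatter.format(date);
        } finally {
            // offer() silently drops the formatter if the pool is already full.
            pool.offer(formatter);
        }
    }

    // Number of idle formatters currently in the pool.
    int pooledCount() {
        return pool.size();
    }
}
```

    Because poll() and offer() never block, a thread that finds the pool empty just creates its own formatter, and surplus formatters are discarded rather than accumulated. That keeps cleanup trivial compared with ThreadLocal, at the price of the extra code mentioned above.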


> Allow user to choose whether or not to add File Attributes as FlowFile Attributes when using ListFile
> -----------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5228
>                 URL: https://issues.apache.org/jira/browse/NIFI-5228
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.7.0
>
>
> The FetchFile processor adds several FlowFile attributes such as the file's owner, last accessed time, creation time, etc. While these certainly can be useful pieces of information and do serve a purpose, they can be expensive to determine in some configurations. In my use case, I have an Azure File Store mounted to an Ubuntu system with CIFS using SMB 3.0. The remote directory that I am listing has 7,000-8,000 files and takes about 3 minutes to perform the listing with ListFile.


