Posted to issues@nifi.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/01/13 17:04:01 UTC

[jira] [Commented] (NIFI-6992) Add "Batch Size" property to GetHDFSFileInfo processor

    [ https://issues.apache.org/jira/browse/NIFI-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014478#comment-17014478 ] 

ASF subversion and git services commented on NIFI-6992:
-------------------------------------------------------

Commit 26fcf8158a3e10c490db11340e95e1c7943eae06 in nifi's branch refs/heads/master from Tamas Palfy
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=26fcf81 ]

NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor - Added "Batch Size" property, takes effect only if "Destination" is set to "Content" and "Grouping" is set to "None"

NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor - Added validation for 'Batch Size' in 'GetHDFSFileInfo'.
NIFI-6992 - Changed 'GetHDFSFileInfo.BATCH_SIZE' validator from 'NON_NEGATIVE_INTEGER_VALIDATOR' to 'POSITIVE_INTEGER_VALIDATOR'. Added more tests.
NIFI-6992 - Removed 'AllowEmptyValidator'. 'Batch Size' in 'GetHDFSFileInfo' allows null but not empty String.
NIFI-6992 - 'Batch Size' in 'GetHDFSFileInfo' allows null but not empty String - cont.
NIFI-6992 - Fix: Unused import.

This closes #3966.

Signed-off-by: Peter Turcsanyi <tu...@apache.org>
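
The validation rules named in the commit message (a positive integer, null allowed, empty String rejected) can be illustrated with a minimal plain-Java sketch. This is not the code from the commit: the class and method names are hypothetical, and it leaves out NiFi's validator API and any expression-language handling.

```java
// Hypothetical illustration only; not taken from the NiFi code base.
final class BatchSizeValidationSketch {

    /**
     * Mirrors the rules described in the commit message:
     * null (property not set) is accepted, an empty String is rejected,
     * and any other value must parse as an integer greater than zero.
     */
    static boolean isValidBatchSize(String value) {
        if (value == null) {
            return true;  // property left unset
        }
        if (value.isEmpty()) {
            return false; // empty String is not allowed
        }
        try {
            return Integer.parseInt(value) > 0; // positive, so 0 is rejected
        } catch (NumberFormatException e) {
            return false; // not an integer at all
        }
    }
}
```

Under these assumptions "10" is accepted, while "0", "-1", "abc" and "" are rejected.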


> Add "Batch Size" property to GetHDFSFileInfo processor
> ------------------------------------------------------
>
>                 Key: NIFI-6992
>                 URL: https://issues.apache.org/jira/browse/NIFI-6992
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Tamas Palfy
>            Assignee: Tamas Palfy
>            Priority: Major
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> GetHDFSFileInfo creates 1 output flowfile for every HDFS file when Grouping is set to None.
> Other grouping strategies add a tree structure to the result instead of a flat structure.
> A solution is required that streams HDFS file info results to the output in batches, so that a huge number of flowfiles residing in a single session can be prevented.
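
The batching idea in the description above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the processor's actual implementation: the class name, the method name and the flush callback are hypothetical, and the callback stands in for whatever the processor does at the end of a batch (for example transferring the flowfiles and committing the session).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper; not taken from the NiFi code base.
final class BatchSketch {

    /**
     * Consumes items one at a time and hands them to 'flush' in groups of
     * at most 'batchSize', so no more than 'batchSize' items are held at once.
     * Returns the number of times 'flush' was called.
     */
    static <T> int forEachBatch(Iterator<T> items, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>();
        int flushes = 0;
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                flush.accept(batch);       // a full batch is handed over
                batch = new ArrayList<>(); // start collecting the next one
                flushes++;
            }
        }
        if (!batch.isEmpty()) {
            flush.accept(batch);           // remaining items form a last, smaller batch
            flushes++;
        }
        return flushes;
    }
}
```

With five items and a batch size of 2, the callback is invoked three times, with batches of 2, 2 and 1 items.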



--
This message was sent by Atlassian Jira
(v8.3.4#803005)