Posted to dev@nifi.apache.org by "dale.chang13" <da...@outlook.com> on 2016/03/31 14:35:11 UTC

Splitting Incoming FlowFile, Output Multiple FlowFiles

My specific use case calls for ingesting a CSV table with many rows and then
storing the individual rows in HBase and Solr. I would like to avoid
developing custom processors, but it seems like the SplitText and
SplitContent processors do not return individual FlowFiles, each with its
own attributes.

If a custom processor does turn out to be necessary, what would be the best
approach for taking one incoming FlowFile and emitting multiple FlowFiles
through the ProcessSession? Creating multiple ProcessSession instances?
Calling session.transfer within a loop?



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Splitting-Incoming-FlowFile-Output-Multiple-FlowFiles-tp8653.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Splitting Incoming FlowFile, Output Multiple FlowFiles

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

SplitText and SplitContent should be producing individual FlowFiles. Are
you seeing something different?

For SplitText you would set "Line Split Count" to 1 in order to get a
FlowFile for each line of the incoming CSV.

If you are processing extremely large files, it is generally recommended to
do a two-phase split, where the first SplitText has "Line Split Count" set
to something like 10,000-20,000 and a second SplitText has "Line Split
Count" set to 1.
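
To make the two-phase idea concrete, here is a rough sketch in plain Java.
It does not use the NiFi API and is not how SplitText is implemented; the
class name TwoPhaseSplitSketch and the methods split and twoPhaseSplit are
made up for illustration. It only shows the shape of the flow: a coarse
split into chunks of N lines, then a split of each chunk into single lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, no NiFi classes involved.
public class TwoPhaseSplitSketch {

    // One split pass: cut the lines into consecutive chunks of at most
    // linesPerChunk lines (loosely analogous to "Line Split Count").
    public static List<List<String>> split(List<String> lines, int linesPerChunk) {
        if (linesPerChunk < 1) {
            throw new IllegalArgumentException("linesPerChunk must be >= 1");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerChunk) {
            int end = Math.min(start + linesPerChunk, lines.size());
            chunks.add(new ArrayList<>(lines.subList(start, end)));
        }
        return chunks;
    }

    // Two passes: a coarse split first, then each chunk split into
    // single-line pieces. The result holds one entry per input line,
    // in the original order.
    public static List<String> twoPhaseSplit(List<String> lines, int firstPhaseCount) {
        List<String> rows = new ArrayList<>();
        for (List<String> chunk : split(lines, firstPhaseCount)) {
            for (List<String> single : split(chunk, 1)) {
                rows.add(single.get(0));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList("a,1", "b,2", "c,3", "d,4", "e,5");
        // First phase with a chunk size of 2, then one row per piece.
        System.out.println("chunks after phase one: " + split(csv, 2).size());
        System.out.println("rows after phase two: " + twoPhaseSplit(csv, 2));
    }
}
```

In a real flow the first count would be in the 10,000-20,000 range
mentioned above rather than 2; the small value here just keeps the example
easy to follow.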

-Bryan

