Posted to users@nifi.apache.org by James McMahon <js...@gmail.com> on 2019/12/04 21:54:06 UTC

Split a flow file to multiples, related by common key and with split count

I have a series of attributes that result from an EvaluateJsonPath. One of
those attributes, FNAME, appears to be a list of values like so:
["A","B","C"]. I want to split my flow file into one for each list element.
Each result needs the original content, all the original attributes, and
its own value from the list as a new attribute. I also need to know the
split count and be able to merge my flow files later, after evaluating the
results of the split.
How can I accomplish this?
Thanks very much in advance.

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Thank you Daeho. I will be back at work in just a few hours and will try
this approach. It sounds like it is just what I need. Thanks again.

On Wed, Dec 4, 2019 at 9:26 PM 노대호Daeho Ro <da...@bespinglobal.com>
wrote:

> Of course.
>
> There is a processor named SplitJson. It can split the JSON text by a
> defined key. For example, suppose there is a key named 'fname' with the
> value [a, b, c]. Once you split the JSON with that processor, each
> resulting JSON will have the same keys and values for the others, but
> 'fname' will be a for the first JSON, b for the second, and so on.
>
> After that, run EvaluateJsonPath for FNAME, and each split flowfile will
> have a, b, or c. Thus, I recommend placing the SplitJson processor in
> front of the EvaluateJsonPath processor.
>
> On Thu, Dec 5, 2019 at 10:58 AM, James McMahon <js...@gmail.com> wrote:
>
>> I don’t quite follow, Daeho. FNAME is an attribute that results *from*
>> EvaluateJsonPath. Can you explain what you mean by splitting the JSON key
>> before EvaluateJsonPath?
>> Jim
>>
>> On Wed, Dec 4, 2019 at 7:45 PM 노대호Daeho Ro <da...@bespinglobal.com>
>> wrote:
>>
>>> I think you can split the json key for FNAME just before the
>>> EvaluateJsonPath processor. Then, the fragment.* attributes will be
>>> automatically created.
>>>
>>> On Thu, Dec 5, 2019 at 8:24 AM, Matt Burgess <ma...@apache.org> wrote:
>>>
>>>> Jim,
>>>>
>>>> As of NiFi 1.8.0 [1], you should be able to do this with
>>>> UpdateAttribute -> DuplicateFlowFile -> UpdateAttribute pattern, the
>>>> first getting the number of values in the list via the count() EL
>>>> function, the second using that (minus 1) to generate duplicates, each
>>>> with a copy.index attribute set. That attribute can be used in another
>>>> UpdateAttribute with getDelimitedField() EL function for each flow
>>>> file to get its own value from FNAME. You may need to rename some of
>>>> the attributes to fragment.* in order to use a merge processor, but I
>>>> think all the necessary values are covered. Please let me know whether
>>>> this works for you. I added various improvements in order to support
>>>> use cases like this, but if I missed something I can certainly add it.
>>>>
>>>> Regards,
>>>> Matt
>>>>
>>>> [1] https://issues.apache.org/jira/browse/NIFI-5454
>>>>
>>>
>>>
>>> --
>>> 노대호  *Daeho Ro */ Service Dev.
>>> daeho.ro@bespinglobal.com / *M* +82 10-6366-2636
>>> *KR* 06167 13F, 327 Gangnam-daero, Seocho-gu, Seoul
>>> Bespin Global <https://bespinglobal.com/>
>>> The MSP holding the most cloud certifications in Korea • The only
>>> ISO-certified MSP in Korea • The only MSP in Korea, China, and Japan
>>> recognized by Gartner
>>> www.bespinglobal.com • ISO 27001:2013 • ISO 9001:2015 Certified
>>> ------------------------------
>>> *Confidentiality Note:* This email may contain confidential and/or
>>> private information.
>>> If you received this email in error please delete and notify the sender.
>>>
>>
>
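For readers who want to see the logic of Matt's UpdateAttribute ->
DuplicateFlowFile -> UpdateAttribute suggestion outside NiFi, here is a
minimal Python sketch. It is not NiFi configuration or Expression Language;
it only mimics the three steps on a plain dict of attributes. The function
name duplicate_per_value, the choice to parse FNAME as a JSON array, and the
way fragment.identifier is generated are assumptions made for illustration.

```python
import json
import uuid


def duplicate_per_value(attributes, list_attr="FNAME", new_attr="THIS_NAME"):
    """Return one attribute dict per value held in a list-valued attribute."""
    # Step 1 (first UpdateAttribute in the suggestion): count the list values.
    # Assumption: the attribute holds a JSON array such as ["A","B","C"].
    values = json.loads(attributes[list_attr])
    count = len(values)

    # Shared key so the copies can be grouped again later (assumed naming,
    # modeled on the fragment.* attributes mentioned in the thread).
    fragment_id = str(uuid.uuid4())

    flowfiles = []
    # Step 2 (DuplicateFlowFile): the original plus count - 1 copies,
    # each tagged with its own copy.index.
    for copy_index in range(count):
        copy_attrs = dict(attributes)
        copy_attrs["copy.index"] = str(copy_index)
        # Step 3 (second UpdateAttribute): each copy picks its own value.
        copy_attrs[new_attr] = values[copy_index]
        copy_attrs["fragment.identifier"] = fragment_id
        copy_attrs["fragment.index"] = str(copy_index)
        copy_attrs["fragment.count"] = str(count)
        flowfiles.append(copy_attrs)
    return flowfiles
```

Calling duplicate_per_value({"FNAME": '["A","B","C"]'}) would yield three
attribute sets that share one fragment.identifier. The content is untouched
in this approach, which is why only attributes are modeled here.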

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

ForkRecord... I’ve never had the chance to use that processor yet, but will
look into it tomorrow too. Thanks to both you guys for the options. -Jim


Re: Split a flow file to multiples, related by common key and with split count

Posted by Matt Burgess <ma...@apache.org>.

That's a good idea. I hadn't thought of that because I didn't know what
Jim's upstream flow looked like. You could also consider ForkRecord, which
might give you what you want if you fork on /FNAME.

Regards,
Matt


Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Nice. I like that, thank you Etienne. I also did a quick check and found
this, which looks pretty recent, and pretty good...
https://community.cloudera.com/t5/Community-Articles/Jolt-quick-reference-for-Nifi-Jolt-Processors/ta-p/244350


Re: Split a flow file to multiples, related by common key and with split count

Posted by Etienne Jouvin <la...@gmail.com>.

On "official site" ;)

https://jolt-demo.appspot.com/#inception


Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Absolutely. I am going to do that. When you started working with it, were
there any particularly helpful examples of its application you used to
learn it that you recommend?


Re: Split a flow file to multiples, related by common key and with split count

Posted by Etienne Jouvin <la...@gmail.com>.

Hello.

You are right. If it works and you are satisfied, you should keep your
solution.
By the way, JoltTransformJSON may be difficult at the very beginning, but
it is very powerful, and with some practice it becomes easy.

You may want to give it a try as a learning exercise.

Regards.

Etienne Jouvin


Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Hello Etienne. Yes, Matt may have mentioned that approach and I started to
look into it.

My initial thought was this: is it much of a savings? My rudimentary
process works in three processor steps, each simple to configure.
JoltTransformJSON would eliminate only one processor, and it looks
fairly complex to configure. It appears to require a Custom Transformation
Class Name, a Custom Module Directory, and a Jolt Specification. For folks
who have done it before, those may be an afterthought. But as is often the
case with NiFi, if you've never used a processor, it can be hard to find
concrete examples of how to configure NiFi processors, services, schemas,
etc. I opted to take the more familiar path, not being familiar with the
Jolt transformation processor.

I am happy to learn and will see if there's much out there in the way of
examples for configuring JoltTransformJSON. For now I'll use my less
elegant solution that works and gets me where I need to be: pumping data
through my production system.

Good suggestion. Thanks again.

On Thu, Dec 5, 2019 at 8:20 AM Etienne Jouvin <la...@gmail.com>
wrote:

> Hello.
>
> Why don't you first use a JoltTransformJSON processor to produce multiple
> elements in the JSON, one per value in the array, duplicating the common
> attributes in each?
> Then you do the split.
>
> Etienne
>
>
> On Thu, Dec 5, 2019 at 2:11 PM, James McMahon <js...@gmail.com> wrote:
>
>> Daeho and Matt, thank you for all your suggestions. You helped me get to
>> a solution. Here is how I unwound my incoming JSON with a simple flow.
>>
>> My incoming JSON flowfile looks like this:
>> {
>>   "KEY1":"value1",
>>   "KEY2":"value2",
>>   "FNAMES":["A","B","C","D"],
>>   "KEY4":2
>> }
>> My goal is to have a flowfile for each of A, B, C, and D, with attribute
>> THIS_NAME set to each singular FNAMES value, and also preserving KEY1,
>> KEY2, and KEY4 as flowfile attributes with their values pulled from the
>> JSON.
>>
>> Final flow: ListFile->FetchFile->EvaluateJsonPath->SplitJson->ExtractText
>>
>> EvaluateJsonPath grabs all JSON key/values to attributes. At this point
>> though, FNAMES attribute is ["A","B","C","D"] -- not quite what we require.
>>
>> SplitJson creates four flowfiles from one, with its JsonPath Expression
>> property set to $.FNAMES. We're almost home.
>>
>> The flowfile content is now just one of the singular values from FNAMES.
>> ExtractText creates attribute THIS_NAME, configured like this:
>>   Include Capture Group 0: false
>>   Dynamic property THIS_NAME: regex pattern (.*)
>> (Matching all content with (.*) is a bad idea in general wherever the
>> content can grow large, but not in our case, where we know the values in
>> the original JSON list are no larger than half a KB.)
>>
>> After this ExtractText step we have all our attributes, including
>> fragment.count of 4 and a common fragment.identifier we can later use to
>> reunite all after individual processing, with a MergeContent or similar.
>>
>> Thank you once again.
>>
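The EvaluateJsonPath -> SplitJson -> ExtractText flow described above can
be imitated in plain Python to see which attributes each split ends up
with. This is only a sketch of the data movement, not NiFi code: the
function names are hypothetical, the way non-string values are rendered as
attribute text is an assumption, and fragment.identifier is simply a random
UUID here.

```python
import json
import re
import uuid


def evaluate_json_path(content):
    """Copy every top-level JSON key into a string attribute."""
    doc = json.loads(content)
    return {
        key: value if isinstance(value, str)
        else json.dumps(value, separators=(",", ":"))
        for key, value in doc.items()
    }


def split_json(content, attributes, key="FNAMES"):
    """Emit one (content, attributes) pair per element of the array at key."""
    elements = json.loads(content)[key]
    fragment_id = str(uuid.uuid4())
    splits = []
    for index, element in enumerate(elements):
        attrs = dict(attributes)  # attributes set earlier travel with each split
        attrs["fragment.identifier"] = fragment_id
        attrs["fragment.index"] = str(index)
        attrs["fragment.count"] = str(len(elements))
        splits.append((str(element), attrs))  # content is now only the element
    return splits


def extract_text(content, attributes, name="THIS_NAME", pattern=r"(.*)"):
    """Store the first capture group of pattern as a new attribute."""
    attrs = dict(attributes)
    match = re.search(pattern, content)
    if match:
        attrs[name] = match.group(1)
    return content, attrs


def run_flow(content):
    attributes = evaluate_json_path(content)
    return [extract_text(c, a) for c, a in split_json(content, attributes)]


def ready_to_merge(flowfiles):
    """True when all fragments of a single split are present."""
    ids = {attrs["fragment.identifier"] for _, attrs in flowfiles}
    expected = int(flowfiles[0][1]["fragment.count"])
    return len(ids) == 1 and len(flowfiles) == expected
```

The ordering matters in this sketch for the same reason it does in the
flow: the other keys are captured as attributes before the split reduces
the content to a single FNAMES element.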
>> On Thu, Dec 5, 2019 at 6:36 AM 노대호Daeho Ro <da...@bespinglobal.com>
>> wrote:
>>
>>> Hm... I might be wrong.
>>>
>>> It wouldn't preserve the other keys, so you have to evaluate the other
>>> keys first, then split FNAMES and evaluate again. Sorry for the confusion.
>>>
>>> On Thu, Dec 5, 2019 at 8:29 PM, James McMahon <js...@gmail.com> wrote:
>>>
>>>> Typo in my initial reply. I did use $.FNAMES. It drops all the other
>>>> key/value pairs in the output split result flowfiles.
>>>> I configured my SplitJSON like so:
>>>> JsonPathExpression    $.FNAME*S*
>>>> Null Value Representation     empty string
>>>>
>>>> If there are two values in the json array for that key FNAME*S*, I do
>>>> get two output flowfiles. But the only value present in the output is the
>>>> value from the split of the value list of FNAMES. All my other JSON keys
>>>> and values are not present. How do I tell SplitJSON to also retain all the
>>>> key/values I did not split on?
>>>>
>>>> On Thu, Dec 5, 2019 at 6:15 AM 노대호Daeho Ro <da...@bespinglobal.com>
>>>> wrote:
>>>>
>>>>> Set the path to $.FNAMES; that will work, I guess.
>>>>>
>>>>> On Thu, Dec 5, 2019 at 8:10 PM, James McMahon <js...@gmail.com> wrote:
>>>>>
>>>>>> I should add that I also tried $.* for the JsonPathExpression.
>>>>>> That result also wasn't what I require, because it gave me 14
>>>>>> different flowfiles, each with only one value: the two that resulted
>>>>>> from the FNAMES key, and one for each of the other 12 keys that had
>>>>>> only one value.
>>>>>> My incoming JSON flowfile looks like this:
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":["A","B"],
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> This is what I need as output:
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":"A",
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> and
>>>>>>
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":"B",
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> How does one configure SplitJSON to accomplish that?
>>>>>>
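The output asked for above amounts to copying the whole document once per
FNAMES element and replacing the array with that single element. A minimal
Python sketch of that transformation follows. It is neither a SplitJson
configuration nor a Jolt specification; fan_out is a hypothetical helper
used only to make the target shape concrete.

```python
import copy


def fan_out(document, key="FNAMES"):
    """Return one copy of document per element of document[key]."""
    outputs = []
    for element in document[key]:
        item = copy.deepcopy(document)  # keep every other key and value
        item[key] = element             # replace the array with one element
        outputs.append(item)
    return outputs


incoming = {"KEY1": "value1", "KEY2": "value2", "FNAMES": ["A", "B"], "KEY13": 2}
for result in fan_out(incoming):
    print(result)
```

Each printed dict keeps KEY1, KEY2, and KEY13 and carries a single FNAMES
value, which is the shape shown in the two desired outputs.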
>>>>>> On Thu, Dec 5, 2019 at 5:59 AM James McMahon <js...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Daeho, I configured my SplitJSON like so:
>>>>>>> JsonPathExpression    $.FNAME
>>>>>>> Null Value Representation     empty string
>>>>>>>
>>>>>>> If there are two values in the json array for that key FNAME, I do
>>>>>>> get two output flowfiles. But the only value present in the output is the
>>>>>>> value from the split of the list. All my other JSON keys and values are not
>>>>>>> present. How do I tell SplitJSON to also retain all the key/values I did
>>>>>>> not split on?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Dec 4, 2019 at 9:26 PM 노대호Daeho Ro <
>>>>>>> daeho.ro@bespinglobal.com> wrote:
>>>>>>>
>>>>>>>> Of course.
>>>>>>>>
>>>>>>>> There is a processor, the name is SplitJson. It can split the JSON
>>>>>>>> text by defined key. For example, if there is a key name is 'fname' and has
>>>>>>>> the value [a, b, c]. Once you split the JSON by that processor, the
>>>>>>>> resulted JSON will have the same key and values for others but 'fname' will
>>>>>>>> be a for the first JSON , b for the second and so on.
>>>>>>>>
>>>>>>>> After that, do the EvaluateJsonPath for FNAME then it will have a
>>>>>>>> and b and c for each splited flowfiles. Thus, I recommend you to place the
>>>>>>>> SplitJson processor in front of the  EvaluateJsonPath processor.
>>>>>>>>
>>>>>>>> 2019년 12월 5일 (목) 오전 10:58, James McMahon <js...@gmail.com>님이
>>>>>>>> 작성:
>>>>>>>>
>>>>>>>>> I don’t quite follow, Daeho. FNAME is an attribute that results
>>>>>>>>> *from* EvaluateJSonPath. Can you explain what you mean by
>>>>>>>>> splitting the Jason key before EvaluateJSonPath?
>>>>>>>>> Jim
>>>>>>>>>
>>>>>>>>> On Wed, Dec 4, 2019 at 7:45 PM 노대호Daeho Ro <
>>>>>>>>> daeho.ro@bespinglobal.com> wrote:
>>>>>>>>>
>>>>>>>>>> I think you can split the json key for FNAME just before the
>>>>>>>>>> EvaluateJsonPath processor. Then, the fragment.* attributes will be
>>>>>>>>>> automatically created.
>>>>>>>>>>
>>>>>>>>>> 2019년 12월 5일 (목) 오전 8:24, Matt Burgess <ma...@apache.org>님이
>>>>>>>>>> 작성:
>>>>>>>>>>
>>>>>>>>>>> Jim,
>>>>>>>>>>>
>>>>>>>>>>> As of NiFi 1.8.0 [1], you should be able to do this with
>>>>>>>>>>> UpdateAttribute -> DuplicateFlowFile -> UpdateAttribute pattern,
>>>>>>>>>>> the first getting the number of values in the list via the count()
>>>>>>>>>>> EL function, the second using that (minus 1) to generate
>>>>>>>>>>> duplicates, each with a copy.index attribute set. That attribute
>>>>>>>>>>> can be used in another UpdateAttribute with getDelimitedField() EL
>>>>>>>>>>> function for each flow file to get its own value from FNAME. You
>>>>>>>>>>> may need to rename some of the attributes to fragment.* in order
>>>>>>>>>>> to use a merge processor, but I think all the necessary values are
>>>>>>>>>>> covered. Please let me know if this works for you or not, I added
>>>>>>>>>>> various improvements in order to support use cases like this, but
>>>>>>>>>>> if I missed something I can certainly add it.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Matt
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/NIFI-5454
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 4, 2019 at 4:54 PM James McMahon <
>>>>>>>>>>> jsmcmahon3@gmail.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > I have a series of attributes that result from an
>>>>>>>>>>> > EvaluateJsonPath. One of those attributes, FNAME, appears to be a
>>>>>>>>>>> > list of values like so: ["A","B","C"]. I want to split my flow
>>>>>>>>>>> > file into one for each list element. I need my results to have
>>>>>>>>>>> > the original content, all the original attributes, and its value
>>>>>>>>>>> > for the split result out of the list as a new attribute. I need
>>>>>>>>>>> > to also know the split count, and be able to later merge my flow
>>>>>>>>>>> > files after evaluating the results of the split.
>>>>>>>>>>> > How can I accomplish this?
>>>>>>>>>>> > Thanks very much in advance.
>>>>>>>>>>>

Re: Split a flow file to multiples, related by common key and with split count

Posted by Etienne Jouvin <la...@gmail.com>.

Hello.

Why not use a JoltTransformJSON processor first, to produce one JSON
element per value in the array, duplicating the common key/value pairs
into each element? Then do the split.
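
A minimal Python sketch of the reshaping this suggestion describes. It is
not a Jolt specification, only an illustration of the intended result: the
document is expanded into one element per array value, with the other
key/value pairs copied into each element, so a later split keeps them
together. The function name unwind and the sample document are assumptions
made for the example.

```python
import json


def unwind(doc, key):
    """Return one copy of doc per element of doc[key], with doc[key]
    replaced by that single element. The input dict is not modified."""
    return [{**doc, key: value} for value in doc[key]]


# Sample document, assumed for illustration (shape taken from this thread).
incoming = json.loads(
    '{"KEY1":"value1","KEY2":"value2","FNAMES":["A","B","C","D"],"KEY4":2}'
)
records = unwind(incoming, "FNAMES")

for record in records:
    print(json.dumps(record))
```

Each printed record carries KEY1, KEY2, and KEY4 unchanged, with FNAMES
holding a single value.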

Etienne


On Thu, Dec 5, 2019 at 2:11 PM James McMahon <js...@gmail.com> wrote:

> Daeho and Matt, thank you for all your suggestions. You helped me get to a
> solution. Here is how I unwound my incoming JSON with a simple flow.
>
> My incoming JSON flowfile looks like this:
> {
>   "KEY1":"value1",
>   "KEY2":"value2",
>   "FNAMES":["A","B","C","D"],
>   "KEY4":2
> }
> My goal is to have one flowfile for each of A, B, C, and D, with the
> attribute THIS_NAME set to that flowfile's single FNAMES value, while also
> preserving KEY1, KEY2, and KEY4 as flowfile attributes with their values
> pulled from the JSON.
>
> Final flow: ListFile->FetchFile->EvaluateJsonPath->SplitJson->ExtractText
>
> EvaluateJsonPath copies all the JSON key/value pairs into attributes. At
> this point, though, the FNAMES attribute is still ["A","B","C","D"], which
> is not quite what we require.
>
> SplitJson, with its JsonPath Expression property set to $.FNAMES, creates
> four flowfiles from the one. We're almost home.
>
> The content of each flowfile is now just one of the single values from
> FNAMES. ExtractText creates the attribute THIS_NAME, configured like this:
> Include Capture Group 0    false
> THIS_NAME (dynamic property)    (.*)
> (Capturing the entire content with a regex is a bad idea in general, in any
> situation where the content may grow large, but not in our case, where we
> know the values in the original JSON list are no larger than half a KB.)
>
> After this ExtractText step we have all our attributes, including a
> fragment.count of 4 and a common fragment.identifier that we can use later,
> with MergeContent or similar, to reunite the flowfiles after individual
> processing.
>
> Thank you once again.

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Daeho and Matt, thank you for all your suggestions. You helped me get to a
solution. Here is how I unwound my incoming JSON with a simple flow.

My incoming JSON flowfile looks like this:
{
  "KEY1":"value1",
  "KEY2":"value2",
  "FNAMES":["A","B","C","D"],
  "KEY4":2
}
My goal is to have one flowfile for each of A, B, C, and D, with the
attribute THIS_NAME set to that flowfile's single FNAMES value, while also
preserving KEY1, KEY2, and KEY4 as flowfile attributes with their values
pulled from the JSON.

Final flow: ListFile->FetchFile->EvaluateJsonPath->SplitJson->ExtractText

EvaluateJsonPath copies all the JSON key/value pairs into attributes. At
this point, though, the FNAMES attribute is still ["A","B","C","D"], which
is not quite what we require.

SplitJson, with its JsonPath Expression property set to $.FNAMES, creates
four flowfiles from the one. We're almost home.
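
The two steps above can be pictured with a small Python simulation. This is
only a sketch of the behaviour described in this thread, not NiFi code:
flowfiles are modelled as plain dicts, the helper names are made up, and it
assumes that each split inherits the parent's attributes and receives
fragment.identifier, fragment.index (taken here to start at 0), and
fragment.count.

```python
import json
import uuid


def evaluate_json_path(flowfile):
    """Copy each top-level JSON key into a string attribute (hypothetical
    stand-in for EvaluateJsonPath with one property per key)."""
    doc = json.loads(flowfile["content"])
    attrs = dict(flowfile["attributes"])
    for key, value in doc.items():
        # Strings are stored as-is; other values keep their JSON text.
        if isinstance(value, str):
            attrs[key] = value
        else:
            attrs[key] = json.dumps(value, separators=(",", ":"))
    return {"attributes": attrs, "content": flowfile["content"]}


def split_json(flowfile, key):
    """Emit one flowfile per element of the array at `key`. The content
    becomes the single element; attributes come from the parent."""
    values = json.loads(flowfile["content"])[key]
    fragment_id = str(uuid.uuid4())
    splits = []
    for index, value in enumerate(values):
        attrs = dict(flowfile["attributes"])
        attrs["fragment.identifier"] = fragment_id
        attrs["fragment.index"] = str(index)  # assumed 0-based
        attrs["fragment.count"] = str(len(values))
        content = value if isinstance(value, str) else json.dumps(value)
        splits.append({"attributes": attrs, "content": content})
    return splits


parent = {
    "attributes": {},
    "content": '{"KEY1":"value1","KEY2":"value2",'
               '"FNAMES":["A","B","C","D"],"KEY4":2}',
}
splits = split_json(evaluate_json_path(parent), "FNAMES")
for ff in splits:
    print(ff["content"], ff["attributes"]["KEY1"],
          ff["attributes"]["fragment.count"])
```

Running EvaluateJsonPath first matters in this model: the attributes exist
on the parent before the split, so every split carries them even though its
content has shrunk to a single value.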

The content of each flowfile is now just one of the single values from
FNAMES. ExtractText creates the attribute THIS_NAME, configured like this:
Include Capture Group 0    false
THIS_NAME (dynamic property)    (.*)
(Capturing the entire content with a regex is a bad idea in general, in any
situation where the content may grow large, but not in our case, where we
know the values in the original JSON list are no larger than half a KB.)
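
As a rough illustration of what the (.*) pattern captures, here is a Python
sketch that uses the re module as a stand-in for the Java regular
expressions NiFi uses. The function name extract_text and the
1024-character cap are assumptions made for the example (the cap mimics a
capture-length limit). Note that . does not match a newline by default, so
only the first line of the content is captured.

```python
import re


def extract_text(content, pattern=r"(.*)", max_capture_length=1024):
    """Return capture group 1 of the first match, truncated to
    max_capture_length characters, or None when nothing matches."""
    match = re.search(pattern, content)
    if match is None:
        return None
    return match.group(1)[:max_capture_length]


# Content of one split flowfile, as produced by the SplitJson step.
this_name = extract_text("A")
print(this_name)
```

With short single-line values such as the FNAMES entries, the whole content
ends up in THIS_NAME; multi-line or very large content would be cut off.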

After this ExtractText step we have all our attributes, including a
fragment.count of 4 and a common fragment.identifier that we can use later,
with MergeContent or similar, to reunite the flowfiles after individual
processing.
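
The reunification relies on grouping by fragment.identifier and waiting
until fragment.count members have arrived. Below is a minimal Python sketch
of that bookkeeping. It assumes flowfiles are modelled as dicts and that
fragment.index gives the order; the helper names are made up, and this
illustrates the idea only, not how MergeContent is implemented.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Group flowfiles by fragment.identifier and return, for each complete
    group, the contents ordered by fragment.index."""
    groups = defaultdict(list)
    for ff in flowfiles:
        groups[ff["attributes"]["fragment.identifier"]].append(ff)

    merged = {}
    for fragment_id, members in groups.items():
        expected = int(members[0]["attributes"]["fragment.count"])
        if len(members) != expected:
            continue  # bundle incomplete; a real flow would keep waiting
        members.sort(key=lambda ff: int(ff["attributes"]["fragment.index"]))
        merged[fragment_id] = [ff["content"] for ff in members]
    return merged


def make_flowfile(fragment_id, index, count, content):
    """Build a dict-based flowfile for the example."""
    return {
        "attributes": {
            "fragment.identifier": fragment_id,
            "fragment.index": str(index),
            "fragment.count": str(count),
        },
        "content": content,
    }


# Four fragments arriving out of order, plus one incomplete bundle.
sample = [
    make_flowfile("abc", 2, 4, "C"),
    make_flowfile("abc", 0, 4, "A"),
    make_flowfile("abc", 3, 4, "D"),
    make_flowfile("abc", 1, 4, "B"),
    make_flowfile("xyz", 0, 2, "only-half"),
]
result = defragment(sample)
print(result)
```

Only the bundle with all four fragments present is merged; the incomplete
one is left out.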

Thank you once again.

On Thu, Dec 5, 2019 at 6:36 AM 노대호Daeho Ro <da...@bespinglobal.com>
wrote:

> Hm... I might be wrong.
>
> SplitJson wouldn't preserve the other keys, so you have to evaluate the
> other keys first, then split FNAMES and evaluate again. Sorry for the
> confusion.

Re: Split a flow file to multiples, related by common key and with split count

Posted by 노대호Daeho Ro <da...@bespinglobal.com>.

Hm... I might be wrong.

SplitJson wouldn't preserve the other keys, so you have to evaluate the
other keys first, then split FNAMES and evaluate again. Sorry for the
confusion.

On Thu, Dec 5, 2019 at 8:29 PM James McMahon <js...@gmail.com> wrote:

> Typo in my initial reply. I did use $.FNAMES. It drops all the other
> key/value pairs in the output split result flowfiles.
> I configured my SplitJson like so:
> JsonPath Expression    $.FNAMES
> Null Value Representation     empty string
>
> If there are two values in the JSON array for the key FNAMES, I do get
> two output flowfiles. But the only value present in each output is the
> value from the split of the FNAMES list. None of my other JSON keys and
> values are present. How do I tell SplitJson to also retain all the
> key/values I did not split on?


Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

There was a typo in my initial reply: I did use $.FNAMES. It still drops all
the other key/value pairs from the split output flowfiles.
I configured my SplitJson like so:
JsonPath Expression    $.FNAMES
Null Value Representation     empty string

If there are two values in the JSON array for the key FNAMES, I do get two
output flowfiles. But the only content of each output flowfile is the single
value split out of the FNAMES list. None of my other JSON keys and values
are present. How do I tell SplitJson to also retain all the key/values I did
not split on?

On Thu, Dec 5, 2019 at 6:15 AM 노대호Daeho Ro <da...@bespinglobal.com>
wrote:

> Path to be $.FNAMES, that will work I guess.

Re: Split a flow file to multiples, related by common key and with split count

Posted by 노대호Daeho Ro <da...@bespinglobal.com>.

Set the path to $.FNAMES; I think that will work.

On Thu, Dec 5, 2019 at 8:10 PM, James McMahon <js...@gmail.com> wrote:

> I should add that I also tried this for JsonPathExpression $.*
> That result also wasn't what I require, because it gave me 14 different
> flowfiles each with only one value - - the two that resulted from the FNAME
> key, and one for each of the other 12 keys that had only one value.
> My incoming JSON flowfile looks like this:
> {
>   "KEY1":"value1",
>   "KEY2":"value2",
>    .
>    .
>   "FNAMES":["A","B"],
>   "KEY13":2
> }
>
> This is what I need as output:
> {
>   "KEY1":"value1",
>   "KEY2":"value2",
>    .
>    .
>   "FNAMES":"A",,
>   "KEY13":2
> }
>
> and
>
> {
>   "KEY1":"value1",
>   "KEY2":"value2",
>    .
>    .
>   "FNAMES":"B",
>   "KEY13":2
> }
>
> How does one configure SplitJSON to accomplish that?

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

I should add that I also tried $.* as the JsonPath Expression. That result
also isn't what I need: it gave me 14 separate flowfiles, each containing
only one value (the two that came from the FNAMES key, plus one for each of
the other 12 keys, which hold a single value each).
My incoming JSON flowfile looks like this:
{
  "KEY1":"value1",
  "KEY2":"value2",
   .
   .
  "FNAMES":["A","B"],
  "KEY13":2
}

This is what I need as output:
{
  "KEY1":"value1",
  "KEY2":"value2",
   .
   .
  "FNAMES":"A",
  "KEY13":2
}

and

{
  "KEY1":"value1",
  "KEY2":"value2",
   .
   .
  "FNAMES":"B",
  "KEY13":2
}

How does one configure SplitJSON to accomplish that?
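
For reference, the transformation being asked for can be sketched outside
NiFi in a few lines of Python. This only illustrates the desired result; it
is not a SplitJson configuration. The function name fan_out is made up for
the example, and reusing the fragment.* names for the bookkeeping values is
an assumption, chosen so the split count travels with each piece.

```python
import copy
import json
import uuid


def fan_out(record, key):
    """Return one (attributes, record copy) pair per element of the list at `key`."""
    values = record[key]
    if not isinstance(values, list):
        values = [values]  # a scalar is treated as a one-element list
    fragment_id = str(uuid.uuid4())  # shared id so the pieces can be regrouped
    results = []
    for index, value in enumerate(values):
        split = copy.deepcopy(record)  # keep every other key/value untouched
        split[key] = value             # replace the list with this one element
        attributes = {
            "fragment.identifier": fragment_id,
            "fragment.index": str(index),
            "fragment.count": str(len(values)),
        }
        results.append((attributes, split))
    return results


incoming = json.loads(
    '{"KEY1": "value1", "KEY2": "value2", "FNAMES": ["A", "B"], "KEY13": 2}'
)
for attributes, split in fan_out(incoming, "FNAMES"):
    print(attributes["fragment.index"], json.dumps(split))
```

Each result keeps KEY1 through KEY13 as they were and differs only in the
FNAMES value, which is the shape shown in the two output documents above.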

On Thu, Dec 5, 2019 at 5:59 AM James McMahon <js...@gmail.com> wrote:

> Daeho, I configured my SplitJSON like so:
> JsonPathExpression    $.FNAME
> Null Value Representation     empty string
>
> If there are two values in the json array for that key FNAME, I do get two
> output flowfiles. But the only value present in the output is the value
> from the split of the list. All my other JSON keys and values are not
> present. How do I tell SplitJSON to also retain all the key/values I did
> not split on?

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Daeho, I configured my SplitJson like so:
JsonPath Expression    $.FNAME
Null Value Representation     empty string

If there are two values in the JSON array for that key FNAME, I do get two
output flowfiles. But the only value present in each output is the value
split out of the list. None of my other JSON keys and values are present.
How do I tell SplitJson to also retain all the key/values I did not split
on?


On Wed, Dec 4, 2019 at 9:26 PM 노대호Daeho Ro <da...@bespinglobal.com>
wrote:

> Of course.
>
> There is a processor, the name is SplitJson. It can split the JSON text by
> defined key. For example, if there is a key name is 'fname' and has the
> value [a, b, c]. Once you split the JSON by that processor, the resulted
> JSON will have the same key and values for others but 'fname' will be a for
> the first JSON , b for the second and so on.
>
> After that, do the EvaluateJsonPath for FNAME then it will have a and b
> and c for each splited flowfiles. Thus, I recommend you to place the
> SplitJson processor in front of the  EvaluateJsonPath processor.

Re: Split a flow file to multiples, related by common key and with split count

Posted by 노대호Daeho Ro <da...@bespinglobal.com>.

Of course.

There is a processor named SplitJson. It can split JSON text on a given
key. For example, suppose there is a key named 'fname' with the value
[a, b, c]. Once you split the JSON with that processor, each resulting JSON
will have the same keys and values for everything else, but 'fname' will be
a in the first JSON, b in the second, and so on.

After that, run EvaluateJsonPath for FNAME, and each split flowfile will
have a, b, or c respectively. So I recommend placing the SplitJson
processor in front of the EvaluateJsonPath processor.
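
A note for readers following along: the replies to this message report that
SplitJson emits only the elements of the array selected by the JsonPath, so
the sibling keys are not carried into the splits. A rough Python model of
that reported behavior follows. It is not NiFi code: split_json_like is a
hypothetical name, only a top-level key such as $.fname is handled, and the
JSON encoding of each element is a choice made for the sketch.

```python
import json
import uuid


def split_json_like(content, key):
    """Model of splitting on `$.<key>`: each array element becomes its own content."""
    elements = json.loads(content)[key]
    if not isinstance(elements, list):
        # the sketch simply refuses anything that is not an array
        raise ValueError("the selected value is not an array")
    fragment_id = str(uuid.uuid4())
    splits = []
    for index, element in enumerate(elements):
        attributes = {
            "fragment.identifier": fragment_id,
            "fragment.index": str(index),
            "fragment.count": str(len(elements)),
        }
        # only the element itself is written out; "other" is not copied
        splits.append((attributes, json.dumps(element)))
    return splits


source = '{"other": "value", "fname": ["a", "b", "c"]}'
for attributes, content in split_json_like(source, "fname"):
    print(attributes["fragment.index"], content)
```

In this model the "other" key never reaches the output, which matches the
symptom described in the follow-up: the fragment.* attributes are there, but
the remaining keys would have to be restored by some other step.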

On Thu, Dec 5, 2019 at 10:58 AM, James McMahon <js...@gmail.com> wrote:

> I don’t quite follow, Daeho. FNAME is an attribute that results *from*
> EvaluateJSonPath. Can you explain what you mean by splitting the Jason key
> before EvaluateJSonPath?
> Jim

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

I don’t quite follow, Daeho. FNAME is an attribute that results *from*
EvaluateJsonPath. Can you explain what you mean by splitting the JSON key
before EvaluateJsonPath?
Jim

On Wed, Dec 4, 2019 at 7:45 PM 노대호Daeho Ro <da...@bespinglobal.com>
wrote:

> I think you can split the json key for FNAME just before the
> EvaluateJsonPath processor. Then, the fragment.* attributes will be
> automatically created.

Re: Split a flow file to multiples, related by common key and with split count

Posted by 노대호Daeho Ro <da...@bespinglobal.com>.

I think you can split the JSON on the FNAME key just before the
EvaluateJsonPath processor. The fragment.* attributes will then be
created automatically.

On Thu, Dec 5, 2019 at 8:24 AM, Matt Burgess <ma...@apache.org> wrote:

> Jim,
>
> As of NiFi 1.8.0 [1], you should be able to do this with
> UpdateAttribute -> DuplicateFlowFile -> UpdateAttribute pattern, the
> first getting the number of values in the list via the count() EL
> function, the second using that (minus 1) to generate duplicates, each
> with a copy.index attribute set. That attribute can be used in another
> UpdateAttribute with getDelimitedField() EL function for each flow
> file to get its own value from FNAME. You may need to rename some of
> the attributes to fragment.* in order to use a merge processor, but I
> think all the necessary values are covered. Please let me know if this
> works for you or not, I added various improvements in order to support
> use cases like this, but if I missed something I can certainly add it.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-5454
>
> On Wed, Dec 4, 2019 at 4:54 PM James McMahon <js...@gmail.com> wrote:
> >
> > I have a series of attributes that result from an EvaluateJSonPath. One
> of those attributes, FNAME, appears to be a list of values like so:
> [“A”,”B”,”C”]. I want to split my flow file into one for each list element.
> I need my results to have the original content, all the original
> attributes, and its value for the split result out of the list as a new
> attribute. I need to also know the split count, and be able to later merge
> my flow files after evaluating the results of the split.
> > How can I accomplish this?
> > Thanks very much in advance.
>


-- 
노대호 Daeho Ro / Service Dev.

Re: Split a flow file to multiples, related by common key and with split count

Posted by James McMahon <js...@gmail.com>.

Our NiFi version is 1.7. Is there an approach you can suggest for that
version, Matt?


Re: Split a flow file to multiples, related by common key and with split count

Posted by Matt Burgess <ma...@apache.org>.

Jim,

As of NiFi 1.8.0 [1], you should be able to do this with an
UpdateAttribute -> DuplicateFlowFile -> UpdateAttribute pattern. The
first UpdateAttribute gets the number of values in the list via the
count() EL function. DuplicateFlowFile uses that number (minus 1) to
generate duplicates, each with a copy.index attribute set. The second
UpdateAttribute then uses copy.index with the getDelimitedField() EL
function so that each flow file gets its own value from FNAME. You may
need to rename some of the attributes to fragment.* in order to use a
merge processor, but I think all the necessary values are covered.
Please let me know whether this works for you. I added various
improvements to support use cases like this, and if I missed something
I can certainly add it.

Regards,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-5454
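
For readers who want to see the bookkeeping spelled out, the three steps
described above can be modelled in plain Python. This is only a sketch of
the logic, not NiFi configuration or NiFi API: the function name and the
FNAME.value attribute are hypothetical, copy.index is treated as
zero-based with the original flow file at index 0, and FNAME is assumed
to hold a JSON-style list string.

```python
import json
import uuid


def split_by_list_attribute(attributes, content, list_attr="FNAME"):
    """Return one (attributes, content) pair per element of a list attribute."""
    values = json.loads(attributes[list_attr])  # '["A","B","C"]' -> ["A", "B", "C"]
    count = len(values)                         # step 1: count the values in the list
    group_id = str(uuid.uuid4())                # common key shared by every copy
    results = []
    # Step 2: the original (index 0) plus count - 1 duplicates.
    for copy_index in range(count):
        attrs = dict(attributes)                # every copy keeps all original attributes
        attrs["copy.index"] = str(copy_index)
        # Step 3: each copy picks its own value out of the list.
        attrs["FNAME.value"] = values[copy_index]  # hypothetical attribute name
        # Renamed to fragment.* so a merge step can regroup the copies.
        attrs["fragment.identifier"] = group_id
        attrs["fragment.index"] = str(copy_index)
        attrs["fragment.count"] = str(count)
        results.append((attrs, content))        # content is passed through unchanged
    return results
```

For example, split_by_list_attribute({"FNAME": '["A","B","C"]'}, b"payload")
returns three entries with the same content and the same
fragment.identifier, each carrying one of the three values and a
fragment.count of "3".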
