Posted to users@nifi.apache.org by ALAM Mahabub <ma...@alstomgroup.com> on 2020/04/18 01:26:49 UTC

Include parent fields into the output record fields in XML data

Hello All,
I am new to NiFi. I have the nested XML file below, and I need to keep the parent node <track id> together with its multiple <switch id> records in the same table. I am already able to separate them, but I am not able to align them in the same flow file attribute record.

So it would be highly appreciated if anyone could help me: how can I include the parent fields <track id> in the output <switch> record, and what should my flow be?


track_id | track_name    | switch_id | track_continue_course | pos
1        | TR_3B_ASW_ITW | 2         | Straight              | 554.05
1        | TR_3B_ASW_ITW | 3         | Straight              | 2654.64
1        | TR_3B_ASW_ITW | 4         | Straight              | 2767.56
…        | …             | …         | …                     | …

XML file:

[image: image001.jpg]

NiFi flow:
On the left side of the flow, I am able to split out track id=1, name="TR_3B_ASW_ITW".
On the right side of the flow, I am able to split out records with switch id=2, trackContinueCourse="straight", pos="544.05", etc.

Output needed:


track_id | track_name    | switch_id | track_continue_course | pos
1        | TR_3B_ASW_ITW | 2         | Straight              | 554.05
1        | TR_3B_ASW_ITW | 3         | Straight              | 2654.64
1        | TR_3B_ASW_ITW | 4         | Straight              | 2767.56
…        | …             | …         | …                     | …



[image: image003.jpg]
Regards,
Mahabub ALAM

________________________________
CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.

RE: Include parent fields into the output record fields in XML data

Posted by ALAM Mahabub <ma...@alstomgroup.com>.
Hello Matt,
Thanks for your nice explanation. Currently I solve this issue by extracting the parent fields to attributes and keeping them as flow file attributes all the way through (EvaluateXQuery --> SplitXml --> EvaluateXQuery --> EvaluateXPath), which is cumbersome and more complicated than the way you advise.

I will definitely try your way in my next processing.

Thanks for your answer.

Regards,
Mahabub ALAM

From: Matt Burgess <ma...@apache.org>
Sent: Tuesday, April 21, 2020 4:28 PM
To: users@nifi.apache.org
Subject: Re: Include parent fields into the output record fields in XML data

Take a look at the ForkRecord processor. It can be configured to split at a particular level while still keeping all the parent fields for each split. If that doesn't satisfy your use case, you might try converting to JSON and using JoltTransformJSON. Then you don't have to split, extract, merge. Normally I would recommend JoltTransformRecord (since you wouldn't have to convert to JSON first), but JoltTransformRecord applies the spec to each record, where JoltTransformJSON applies the spec to the entire input. For putting parent fields in child arrays, you'd need access to both.

Another alternative is to extract the parent fields to attributes, then use ForkRecord without keeping the parent fields. A handful of processors can then inject fields using Expression Language, such as JoltTransformRecord, UpdateRecord, etc.

If you can avoid the split, you should be able to use PutDatabaseRecord instead of PutSQL, so you wouldn't have to convert/generate SQL and execute the queries individually.

Regards,
Matt



Re: Include parent fields into the output record fields in XML data

Posted by Matt Burgess <ma...@apache.org>.
Take a look at the ForkRecord processor. It can be configured to split at a
particular level while still keeping all the parent fields for each split.
If that doesn't satisfy your use case, you might try converting to JSON and
using JoltTransformJSON. Then you don't have to split, extract, merge.
Normally I would recommend JoltTransformRecord (since you wouldn't have to
convert to JSON first), but JoltTransformRecord applies the spec to each
record, where JoltTransformJSON applies the spec to the entire input. For
putting parent fields in child arrays, you'd need access to both.
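
As a rough illustration of what "keeping all the parent fields for each split" means, here is a minimal sketch in plain Python. It is not NiFi code and does not show how ForkRecord is configured; it only reproduces the effect on one nested record. The field names come from the table in the original question, while the `switches` key and the function name `fork_with_parent_fields` are assumptions made for the example.

```python
from typing import Any, Dict, Iterator


def fork_with_parent_fields(
    record: Dict[str, Any], child_key: str
) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per child, copying the parent's fields into it."""
    # Everything except the child array counts as a parent field.
    parent_fields = {k: v for k, v in record.items() if k != child_key}
    for child in record.get(child_key, []):
        # Child values win if a name exists at both levels.
        yield {**parent_fields, **child}


# Hypothetical nested record; the real XML structure was only posted as an image.
track = {
    "track_id": 1,
    "track_name": "TR_3B_ASW_ITW",
    "switches": [
        {"switch_id": 2, "track_continue_course": "Straight", "pos": 554.05},
        {"switch_id": 3, "track_continue_course": "Straight", "pos": 2654.64},
        {"switch_id": 4, "track_continue_course": "Straight", "pos": 2767.56},
    ],
}

rows = list(fork_with_parent_fields(track, "switches"))
for row in rows:
    print(row)
```

Each resulting row carries track_id and track_name next to the switch fields, which is the shape of the requested output table.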

Another alternative is to extract the parent fields to attributes, then use
ForkRecord without keeping the parent fields. A handful of processors can
then inject fields using Expression Language, such as JoltTransformRecord,
UpdateRecord, etc.
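
The attribute-based alternative can be pictured the same way. The sketch below is plain Python, not a NiFi processor configuration: a dictionary stands in for the flow file attributes (which are string key/value pairs), and selected attributes are copied into every child record. The attribute names, the `filename` value, and the helper `inject_attributes` are assumptions for illustration.

```python
from typing import Dict, List


def inject_attributes(
    children: List[Dict[str, object]],
    attributes: Dict[str, str],
    wanted: List[str],
) -> List[Dict[str, object]]:
    """Return new child records with the selected attributes added to each."""
    injected = {name: attributes[name] for name in wanted}
    return [{**injected, **child} for child in children]


# Stand-in for attributes extracted earlier in the flow (values are strings).
flowfile_attributes = {
    "track_id": "1",
    "track_name": "TR_3B_ASW_ITW",
    "filename": "tracks.xml",  # hypothetical, deliberately not injected
}

# Stand-in for the child records left after forking without parent fields.
switch_records = [
    {"switch_id": "2", "track_continue_course": "Straight", "pos": "554.05"},
    {"switch_id": "3", "track_continue_course": "Straight", "pos": "2654.64"},
]

merged = inject_attributes(
    switch_records, flowfile_attributes, ["track_id", "track_name"]
)
for record in merged:
    print(record)
```

Only the attributes named in `wanted` are copied, so unrelated attributes such as `filename` stay out of the records.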

If you can avoid the split, you should be able to use PutDatabaseRecord
instead of PutSQL, so you wouldn't have to convert/generate SQL and execute
the queries individually.
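
The difference between generating one SQL statement per row and handing over a whole set of records can be illustrated outside NiFi. This is only an analogy using Python's built-in sqlite3 module, not a description of how PutDatabaseRecord or PutSQL work internally; the table name `track_switch` and its column types are assumptions.

```python
import sqlite3

# Flattened rows in the shape of the requested output table.
rows = [
    (1, "TR_3B_ASW_ITW", 2, "Straight", 554.05),
    (1, "TR_3B_ASW_ITW", 3, "Straight", 2654.64),
    (1, "TR_3B_ASW_ITW", 4, "Straight", 2767.56),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track_switch ("
    "track_id INTEGER, track_name TEXT, switch_id INTEGER, "
    "track_continue_course TEXT, pos REAL)"
)

# One parameterized statement applied to every record, instead of building
# and executing a separate SQL string for each row.
conn.executemany("INSERT INTO track_switch VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM track_switch").fetchone()[0]
print(count)
```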

Regards,
Matt


On Sat, Apr 18, 2020 at 10:43 AM ALAM Mahabub <ma...@alstomgroup.com>
wrote:

> Hello,
>
> Thanks for your answer, but I am still confused about how to do it. Can you
> please explain a bit more, with an example if you have one? I am sharing my
> XPath and XQuery below.
>
> *Left side, to take the track elements:*
>
> *EvaluateXQuery:*
>
> *EvaluateXPath:*
>
> *Right side, to take the switch elements:*
>
> *SplitXml:*
>
> *EvaluateXQuery:*
>
> *SplitXml:*
>
> *EvaluateXPath:*
>
> Regards,
>
> Mahabub ALAM
>

RE: Include parent fields into the output record fields in XML data

Posted by ALAM Mahabub <ma...@alstomgroup.com>.
Hello,
Thanks for your answer, but I am still confused about how to do it. Can you please explain a bit more, with an example if you have one? I am sharing my XPath and XQuery below.

Left side, to take the track elements:

EvaluateXQuery:

[image: image001.jpg]

EvaluateXPath:

[image: image003.jpg]


Right side, to take the switch elements:

SplitXml:

[image: image005.jpg]

EvaluateXQuery:

[image: image009.jpg]

SplitXml:

[image: image011.jpg]

EvaluateXPath:

[image: image012.jpg]




Regards,

Mahabub ALAM


From: Andy LoPresto <al...@apache.org>
Sent: Saturday, April 18, 2020 5:42 AM
To: users@nifi.apache.org
Subject: Re: Include parent fields into the output record fields in XML data

Can you share your XPath and XQuery properties? I think this should be possible with queries that return an array of results. If the results are in multiple attributes, you may be able to recombine them in the way you want using ExecuteScript or a ScriptedRecordSetWriter to translate them to CSV. You can also use the QueryRecord processor to perform SQL-like queries over large datasets in a flowfile which might be helpful in forming the output you’re looking for.

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69



Re: Include parent fields into the output record fields in XML data

Posted by Andy LoPresto <al...@apache.org>.
Can you share your XPath and XQuery properties? I think this should be possible with queries that return an array of results. If the results are in multiple attributes, you may be able to recombine them in the way you want using ExecuteScript or a ScriptedRecordSetWriter to translate them to CSV. You can also use the QueryRecord processor to perform SQL-like queries over large datasets in a flowfile which might be helpful in forming the output you’re looking for. 
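
To make the "recombine and translate to CSV" idea concrete, here is a minimal sketch in plain Python. It is not an ExecuteScript body or a ScriptedRecordSetWriter, and it does not use the NiFi scripting API; it only shows the target CSV shape, assuming the rows have already been recombined. The column order follows the table in the original question, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["track_id", "track_name", "switch_id", "track_continue_course", "pos"]


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize recombined rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Values are strings, as they would be if they came from attributes.
recombined = [
    {"track_id": "1", "track_name": "TR_3B_ASW_ITW", "switch_id": "2",
     "track_continue_course": "Straight", "pos": "554.05"},
    {"track_id": "1", "track_name": "TR_3B_ASW_ITW", "switch_id": "3",
     "track_continue_course": "Straight", "pos": "2654.64"},
]

csv_text = rows_to_csv(recombined)
print(csv_text)
```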
 
Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
