Posted to users@nifi.apache.org by ALAM Mahabub <ma...@alstomgroup.com> on 2020/04/18 01:26:49 UTC
Include parent fields into the output record fields in XML data
Hello All,
I am new to NiFi. I have the nested XML file below, and I need to keep the parent node <track id> together with its multiple <switch id> records in the same table. I am already able to separate them, but I cannot align them in the same flow file record.
I would highly appreciate it if anyone could help me include the parent fields <track id> in the output <switch> records, and advise what my flow should look like.
track_id  track_name     switch_id  track_continue_course  pos
1         TR_3B_ASW_ITW  2          Straight               554.05
1         TR_3B_ASW_ITW  3          Straight               2654.64
1         TR_3B_ASW_ITW  4          Straight               2767.56
…         …              …          …                      …
XML file:
[image: image001.jpg]
NiFi flow:
On the left side of the flow, I am able to extract the track fields: id=1, name="TR_3B_ASW_ITW".
On the right side of the flow, I am able to split out the switch records: id=2, trackContinueCourse="straight", pos="544.05", etc.
Output needed:
track_id  track_name     switch_id  track_continue_course  pos
1         TR_3B_ASW_ITW  2          Straight               554.05
1         TR_3B_ASW_ITW  3          Straight               2654.64
1         TR_3B_ASW_ITW  4          Straight               2767.56
…         …              …          …                      …
[image: image003.jpg]
Regards,
Mahabub ALAM
________________________________
CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
RE: Include parent fields into the output record fields in XML data
Posted by ALAM Mahabub <ma...@alstomgroup.com>.
Hello Matt,
Thanks for your nice explanation. I currently solve this issue by extracting the parent fields to attributes and carrying them along as flow file attributes (EvaluateXQuery --> SplitXml --> EvaluateXQuery --> EvaluateXPath), which is more cumbersome and complicated than the approach you advise.
I will definitely try your way in my next processing task.
Thanks for your answer.
Regards,
Mahabub ALAM
From: Matt Burgess <ma...@apache.org>
Sent: Tuesday, April 21, 2020 4:28 PM
To: users@nifi.apache.org
Subject: Re: Include parent fields into the output record fields in XML data
Take a look at the ForkRecord processor, it can be configured to split at a particular level, while still keeping all the parent fields for each split. If that doesn't satisfy your use case, you might try converting to JSON and using JoltTransformJSON. Then you don't have to split, extract, merge. Normally I would recommend JoltTransformRecord (since you wouldn't have to convert to JSON first), but JoltTransformRecord applies the spec to each record, where JoltTransformJSON applies the spec to the entire input. For putting parent fields in child arrays, you'd need access to both.
Another alternative is to extract the parent fields to attributes, then ForkRecord without keeping the parent fields, then there are a handful of processors that can inject fields using Expression Language, such as JoltTransformRecord, UpdateRecord, etc.
If you can avoid the split, you should be able to use PutDatabaseRecord instead of PutSQL, so you wouldn't have to convert/generate SQL and execute the queries individually.
Regards,
Matt
Re: Include parent fields into the output record fields in XML data
Posted by Matt Burgess <ma...@apache.org>.
Take a look at the ForkRecord processor. It can be configured to split at a
particular level while still keeping all the parent fields for each split.
If that doesn't satisfy your use case, you might try converting to JSON and
using JoltTransformJSON. Then you don't have to split, extract, merge.
Normally I would recommend JoltTransformRecord (since you wouldn't have to
convert to JSON first), but JoltTransformRecord applies the spec to each
record, where JoltTransformJSON applies the spec to the entire input. For
putting parent fields in child arrays, you'd need access to both.
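
To make the target shape concrete, here is a minimal Python sketch of the denormalization that ForkRecord (with parent fields kept) or a Jolt spec would need to produce. It is not NiFi code. The XML layout is an assumption, since the original file was shared only as an image; the element and attribute names (track, switch, id, name, trackContinueCourse, pos) are inferred from the question, and the flatten_switches helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed input layout; the real file was attached only as an image.
SAMPLE_XML = """
<tracks>
  <track id="1" name="TR_3B_ASW_ITW">
    <switch id="2" trackContinueCourse="Straight" pos="554.05"/>
    <switch id="3" trackContinueCourse="Straight" pos="2654.64"/>
    <switch id="4" trackContinueCourse="Straight" pos="2767.56"/>
  </track>
</tracks>
"""


def flatten_switches(xml_text):
    """Return one row per <switch>, each carrying its parent <track> fields."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for track in root.iter("track"):
        for switch in track.iter("switch"):
            rows.append({
                # Parent fields are repeated on every child row.
                "track_id": track.get("id"),
                "track_name": track.get("name"),
                # Child fields come from the <switch> element itself.
                "switch_id": switch.get("id"),
                "track_continue_course": switch.get("trackContinueCourse"),
                "pos": switch.get("pos"),
            })
    return rows


if __name__ == "__main__":
    for row in flatten_switches(SAMPLE_XML):
        print(row)
```

Each output row holds both the track fields and one switch's fields, which is the single-table layout asked for in the original question.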
Another alternative is to extract the parent fields to attributes, then use
ForkRecord without keeping the parent fields. A handful of processors can
then inject the fields using Expression Language, such as
JoltTransformRecord, UpdateRecord, etc.
If you can avoid the split, you should be able to use PutDatabaseRecord
instead of PutSQL, so you wouldn't have to convert/generate SQL and execute
the queries individually.
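
The difference between the two approaches can be sketched outside NiFi. The Python below uses the standard sqlite3 module purely as a stand-in: one parameterized statement is applied to the whole set of records, rather than generating and executing a separate SQL statement per split flow file. The table name track_switch, the column names, and the insert_records helper are assumptions for illustration, not anything NiFi prescribes.

```python
import sqlite3

# Hypothetical flattened records, one per switch, with the parent fields included.
RECORDS = [
    {"track_id": 1, "track_name": "TR_3B_ASW_ITW", "switch_id": 2,
     "track_continue_course": "Straight", "pos": 554.05},
    {"track_id": 1, "track_name": "TR_3B_ASW_ITW", "switch_id": 3,
     "track_continue_course": "Straight", "pos": 2654.64},
    {"track_id": 1, "track_name": "TR_3B_ASW_ITW", "switch_id": 4,
     "track_continue_course": "Straight", "pos": 2767.56},
]


def insert_records(conn, records):
    """Insert all records with one parameterized statement and one commit."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS track_switch ("
        "track_id INTEGER, track_name TEXT, switch_id INTEGER, "
        "track_continue_course TEXT, pos REAL)"
    )
    # executemany binds each dict to the named placeholders in turn.
    conn.executemany(
        "INSERT INTO track_switch VALUES "
        "(:track_id, :track_name, :switch_id, :track_continue_course, :pos)",
        records,
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    insert_records(connection, RECORDS)
    for row in connection.execute("SELECT * FROM track_switch ORDER BY switch_id"):
        print(row)
```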
Regards,
Matt
On Sat, Apr 18, 2020 at 10:43 AM ALAM Mahabub <ma...@alstomgroup.com>
wrote:
> Hello,
>
> Thanks for your answer, but I am still unsure how to do it. Could you
> please explain a bit more, with an example if you have one? I am sharing my
> XPath and XQuery settings below.
RE: Include parent fields into the output record fields in XML data
Posted by ALAM Mahabub <ma...@alstomgroup.com>.
Hello,
Thanks for your answer, but I am still unsure how to do it. Could you please explain a bit more, with an example if you have one? I am sharing my XPath and XQuery settings below.
Left side, to extract the track elements:
EvaluateXQuery:
[image: image001.jpg]
EvaluateXPath:
[image: image003.jpg]
Right side, to extract the switch elements:
SplitXml:
[image: image005.jpg]
EvaluateXQuery:
[image: image009.jpg]
SplitXml:
[image: image011.jpg]
EvaluateXPath:
[image: image012.jpg]
Regards,
Mahabub ALAM
From: Andy LoPresto <al...@apache.org>
Sent: Saturday, April 18, 2020 5:42 AM
To: users@nifi.apache.org
Subject: Re: Include parent fields into the output record fields in XML data
Can you share your XPath and XQuery properties? I think this should be possible with queries that return an array of results. If the results are in multiple attributes, you may be able to recombine them in the way you want using ExecuteScript or a ScriptedRecordSetWriter to translate them to CSV. You can also use the QueryRecord processor to perform SQL-like queries over large datasets in a flowfile which might be helpful in forming the output you’re looking for.
Re: Include parent fields into the output record fields in XML data
Posted by Andy LoPresto <al...@apache.org>.
Can you share your XPath and XQuery properties? I think this should be possible with queries that return an array of results. If the results are in multiple attributes, you may be able to recombine them in the way you want using ExecuteScript or a ScriptedRecordSetWriter to translate them to CSV. You can also use the QueryRecord processor to perform SQL-like queries over large datasets in a flowfile which might be helpful in forming the output you’re looking for.
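
As a rough illustration of the recombination step, the Python sketch below merges parent values held as flow file attributes with each child record and writes the result as CSV. It uses only the standard library and is not the NiFi scripting API; the attribute names (track.id, track.name), the record keys, and the to_csv helper are assumptions.

```python
import csv
import io

# Column order of the desired output table.
FIELDS = ["track_id", "track_name", "switch_id", "track_continue_course", "pos"]


def to_csv(attributes, switches):
    """Combine parent attributes with each switch record and render CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for switch in switches:
        writer.writerow({
            # Parent values come from the (assumed) flow file attributes.
            "track_id": attributes["track.id"],
            "track_name": attributes["track.name"],
            # Child values come from the individual switch record.
            "switch_id": switch["id"],
            "track_continue_course": switch["trackContinueCourse"],
            "pos": switch["pos"],
        })
    return out.getvalue()


if __name__ == "__main__":
    parent = {"track.id": "1", "track.name": "TR_3B_ASW_ITW"}
    children = [
        {"id": "2", "trackContinueCourse": "Straight", "pos": "554.05"},
        {"id": "3", "trackContinueCourse": "Straight", "pos": "2654.64"},
    ]
    print(to_csv(parent, children), end="")
```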
Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69