Posted to users@nifi.apache.org by Anuj Handa <an...@gmail.com> on 2016/04/30 04:43:32 UTC

XML Shredding.

Hello folks,

I am very new to NiFi and am trying to see if I can use it to shred XML data.
Right now I am reading a file that contains different types of XML messages;
examples are below. As a first step, I am trying to select one element type
(CashReportData) and write the whole element to a new file. Eventually,
different message types will go into different tables.

<Transaction xmlns="DataAPI"><CashReportData Version="3.0"
AppVersion="7.1.60.10976" MessageId="ded6819a-6955-4143-bf40-96c4a4a90c72"
Number="37527" SatNumber="0" TermNumber="1" FromBusinessDate="2013-11-27"
ToBusinessDate="2013-11-27"
TransactionDate="2013-11-27T23:00:02.05875-06:00" EmployeeId="5"
EmployeeName="anuj", <OrderTaxes><TaxByImposition ImpositionName="General
Sales Tax" ImpositionId="1"
Amount="127.2300"/></OrderTaxes></CashReportData></Transaction>

<Transaction xmlns="DataAPI"><DrawerCount Version="3.0"
ApplicationVersion="7.1.60.10976"
MessageId="f1d62b4e-8b35-4979-b5bf-4a313b757000" TransactionId="364"
Number="52794" SatNumber="0" TermNumber="1" BusinessDate="2013-11-27"
TransactionDate="2013-11-28T00:00:08.463" EmployeeId="9" EmployeeName="Anuj
DrawerId="1" <OrderTaxes><TaxByImposition ImpositionName="General Sales
Tax" ImpositionId="1"
Amount="98.6900"/></OrderTaxes></CashInReportData></DrawerCount></Transaction>

I got the error below when using EvaluateXPath:
[image: Inline image 1]

This is how the processor was configured:

[image: Inline image 2]
Any suggestions on what I could be doing wrong?

Anuj

Re: XML Shredding.

Posted by Anuj Handa <an...@gmail.com>.
Hi Jim,

The malformed XML might just be a copy-paste issue; the XML in the file is
well formed. The file had a BOM, which was causing the error. I removed it
manually and that fixed the problem.
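
For anyone who hits the same BOM problem, the removal can be scripted instead
of done by hand. This is only a minimal Python sketch that runs outside NiFi.
It assumes the file is UTF-8 and that the BOM is the three bytes EF BB BF at
the very start; the function name strip_utf8_bom is made up for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustrative input: a tiny document with a BOM in front of it.
sample = codecs.BOM_UTF8 + b'<Transaction xmlns="DataAPI"/>'
cleaned = strip_utf8_bom(sample)
```

Calling the function on content that has no BOM leaves it unchanged, so it is
safe to apply to every file.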

Google also helped me with the namespace issue.
/*/*[local-name()='CashReportData'] seems to work in EvaluateXPath if there is
only one XML doc in the file, but if there are multiple XML docs in the file
it gives the error below.

[image: Inline image 1]
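
The error image is not preserved here, but a file with several top-level
elements is not a single well-formed XML document, and XML parsers generally
reject it. One possible workaround, sketched in Python outside NiFi, is to
wrap the content in a synthetic root before parsing. The wrapper element name
and the function name are assumptions for illustration, and the sketch assumes
none of the documents carries an XML declaration or DOCTYPE.

```python
import xml.etree.ElementTree as ET


def parse_concatenated(text: str) -> list:
    """Parse several XML documents stored back to back in one string.

    Assumes no document has an XML declaration or DOCTYPE, since neither
    is allowed inside the synthetic wrapper element.
    """
    wrapped = "<wrapper>" + text + "</wrapper>"
    # The children of the wrapper are the original top-level elements.
    return list(ET.fromstring(wrapped))


# Two shortened messages in the style of the examples in this thread.
two_docs = (
    '<Transaction xmlns="DataAPI"><CashReportData Number="37527"/></Transaction>\n'
    '<Transaction xmlns="DataAPI"><DrawerCount Number="52794"/></Transaction>'
)

# Parsing the raw text directly fails because it has two root elements.
try:
    ET.fromstring(two_docs)
    direct_parse_failed = False
except ET.ParseError:
    direct_parse_failed = True

docs = parse_concatenated(two_docs)
```

Each entry in docs is then one Transaction element that can be handled on its
own.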

Anuj

On Sat, Apr 30, 2016 at 2:08 AM, James Wing <jv...@gmail.com> wrote:

> I believe you have a couple of problems:
>
> 1.) The XML needs to be well-formed. In your first example, the
> CashReportData element is not properly closed, same with DrawerCount in the
> second example.
> 2.) The XML has a namespace, which you would need to reference in your
> XPath. I'm not sure how to handle that in NiFi; you might need to remove
> it before running EvaluateXPath.
>
> An alternative approach might be to use an ExtractText processor if you
> wish to remove the text of the element.

Re: XML Shredding.

Posted by James Wing <jv...@gmail.com>.
I believe you have a couple of problems:

1.) The XML needs to be well-formed. In your first example, the
CashReportData element is not properly closed, same with DrawerCount in the
second example.
2.) The XML has a namespace, which you would need to reference in your
XPath. I'm not sure how to handle that in NiFi; you might need to remove
it before running EvaluateXPath.
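
To illustrate the namespace point outside NiFi, here is a small Python sketch
using the standard library's ElementTree. It shows why an unqualified name
does not match and two ways to reference the namespace. The prefix "d" is an
arbitrary choice, the sample document is shortened, and nothing here is a
statement about how EvaluateXPath itself is configured.

```python
import xml.etree.ElementTree as ET

# Shortened message with the same default namespace as the examples.
doc = (
    '<Transaction xmlns="DataAPI">'
    '<CashReportData Number="37527"><OrderTaxes/></CashReportData>'
    '</Transaction>'
)
root = ET.fromstring(doc)

# An unqualified name finds nothing, because the element really lives
# in the "DataAPI" namespace.
unqualified = root.find("CashReportData")

# Option 1: bind a prefix of our own choosing to the namespace URI.
ns = {"d": "DataAPI"}
by_prefix = root.find("d:CashReportData", ns)

# Option 2: spell out the namespace in Clark notation ({uri}localname).
by_clark = root.find("{DataAPI}CashReportData")

# Serialize the whole element, for example to write it to a new file.
element_xml = ET.tostring(by_prefix, encoding="unicode")
```

Both options select the same element; which one reads better is a matter of
taste.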

An alternative approach might be to use an ExtractText processor if you
wish to remove the text of the element.
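
The text-matching idea can be illustrated with a regular expression. This is
a Python sketch using the re module, not an ExtractText configuration. The
pattern and the function name are assumptions for illustration, and the
pattern does not handle self-closing CashReportData elements or elements of
the same name nested inside each other.

```python
import re

# Non-greedy match from the opening tag to the first matching closing tag.
# DOTALL lets "." span line breaks inside the element.
PATTERN = re.compile(r"<CashReportData\b.*?</CashReportData>", re.DOTALL)


def extract_cash_reports(text: str) -> list:
    """Return the raw text of every CashReportData element found in text."""
    return PATTERN.findall(text)


# Two shortened messages; only the first contains a CashReportData element.
sample = (
    '<Transaction xmlns="DataAPI"><CashReportData Number="37527">\n'
    '<OrderTaxes/></CashReportData></Transaction>\n'
    '<Transaction xmlns="DataAPI"><DrawerCount Number="52794">'
    '</DrawerCount></Transaction>'
)
matches = extract_cash_reports(sample)
```

Because this works on raw text, it ignores namespaces entirely and does not
require the file to be a single well-formed document.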
