Posted to users@nifi.apache.org by "Nathamuni, Ramanujam" <RN...@tiaa.org> on 2016/09/01 20:45:07 UTC

need idea - xml to hbase loading

Please give me some pointers to start working :)


*************************************************************************
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete it.

TIAA
*************************************************************************

RE: need idea - xml to hbase loading

Posted by "Nathamuni, Ramanujam" <RN...@tiaa.org>.
Hello All,

I was able to get this flow working and get the data posted to the HBase table. Thanks to you all for your ideas and help.

Thanks,
Ram


Below are the high-level technical details of how it works:

1.)    Download the XSLT stylesheet from https://github.com/bojanbjelic/xml2json and store it on your NiFi server, or in another location that NiFi can access.

2.)    Make sure your XML is available and well formed. The Linux "xmllint" command can check this.

3.)    Have your Hadoop configuration information available, and make sure the HBase controller service is configured and enabled.




4.)    Create the HBase table on the Hadoop cluster and make a note of its name and column family (in this example the table is nifijson and the column family is json1; the command to create the table follows).

        The command is for a normal (non-kerberized) Hadoop cluster. If the cluster is kerberized, first get a ticket and keytab from your KDC admin or Hadoop admin team.



hbase shell

create 'nifijson', {NAME => 'nifitest'}




5.)    Now we have the XML file, the XSLT stylesheet, the HBase table and the Hadoop configuration information. Let us see what the flow looks like. I added a Write_JSON_PutFile processor to double-check the JSON output.
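
As a complement to the xmllint check in step 2, here is a minimal Python sketch of the same kind of well-formedness test using only the standard library. The function name is_well_formed is made up for illustration and is not part of NiFi or xmllint. It only checks that the document parses; it does not validate against a schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, else False.

    Hypothetical helper for illustration; it performs no schema validation.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    good = "<records><record><name>a</name></record></records>"
    bad = "<records><record></records>"  # <record> is never closed
    print(is_well_formed(good))
    print(is_well_formed(bad))
```

Running a check like this before dropping files into the input directory can save a round trip through the failure relationship of the transform step.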



Let us look at each processor and see how it is configured:


1.)    GET-XML-File








2.)    XML_to_Json_XSLT_TransformerXml






3.)    Write_HBASE_Table_PutHbaseJson (please note the Row Identifier, for which I used the flow file ID; this needs to be changed based on your data requirements)


       # verification - on the hbase table
             hbase shell
              scan 'nifijson'
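
To make the Row Identifier point more concrete, here is a rough Python sketch of how a flat JSON document ends up as HBase cells: each top-level field becomes a column qualifier under one column family, and all cells share one row key. This is a simplified model of what PutHBaseJSON does, not its actual code. The function name json_to_cells, the row key "row-0001" and the column family "cf" are made up for illustration, and storing nested values as JSON text is an assumption (the real processor handles complex fields according to its own settings).

```python
import json


def json_to_cells(row_id: str, column_family: str, document: str) -> list:
    """Map one flat JSON object to (row, 'family:qualifier', value) tuples.

    Simplified model for illustration only, not the NiFi implementation.
    """
    fields = json.loads(document)
    if not isinstance(fields, dict):
        raise ValueError("expected a single JSON object at the root")
    cells = []
    for name, value in fields.items():
        if value is None:
            continue  # fields with a null value are skipped
        # Assumption: non-string values (numbers, nested objects) are stored as JSON text.
        text = value if isinstance(value, str) else json.dumps(value)
        cells.append((row_id, f"{column_family}:{name}", text))
    return cells


if __name__ == "__main__":
    doc = '{"name": "widget", "qty": 3, "note": null}'
    for cell in json_to_cells("row-0001", "cf", doc):
        print(cell)
```

If the row key is the flow file ID, every flow file creates a new row. Choosing a key from the data itself (for example a record ID) means that reloading the same record overwrites the existing row instead.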


RE: need idea - xml to hbase loading

Posted by "Nathamuni, Ramanujam" <RN...@tiaa.org>.
I am trying a similar approach; hopefully I will get it working. I think the challenge is going to be the XSLT transformer for my use case.


Re: need idea - xml to hbase loading

Posted by "gpool@live.co.za" <gp...@live.co.za>.
Thanks, I'll check it out.


Re: need idea - xml to hbase loading

Posted by Joey Frazee <jo...@icloud.com>.
Guillaume, 

I've had a fair bit of success using TransformXml to transform XML into JSON and then PutHBaseJSON to persist the data in HBase.

There are lots of XML to JSON stylesheets out there but this is what I've been using: https://github.com/bojanbjelic/xml2json

If you take this approach, do make sure you update the mime.type attribute after the transformation with an UpdateAttribute; otherwise the provenance data viewer will treat the newly created JSON as XML.
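
For readers who want to see the idea outside NiFi, here is a small Python sketch of the two steps above: convert the XML content to JSON, then update the mime.type attribute so that downstream tools treat the content as JSON. The mapping is deliberately simplified (XML attributes are ignored, repeated tags become lists) and is not the mapping produced by the xml2json stylesheet. The names element_to_dict and transform are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an element to plain Python data (simplified mapping)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def transform(content: str, attributes: dict):
    """Return (json_content, new_attributes) for XML content and its attributes."""
    root = ET.fromstring(content)
    json_content = json.dumps({root.tag: element_to_dict(root)})
    # Mirror the UpdateAttribute step: the content is JSON now, so say so.
    new_attributes = {**attributes, "mime.type": "application/json"}
    return json_content, new_attributes


if __name__ == "__main__":
    xml = "<person><name>Ann</name><tag>a</tag><tag>b</tag></person>"
    content, attrs = transform(xml, {"mime.type": "application/xml"})
    print(content)
    print(attrs["mime.type"])
```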

-joey


Re: need idea - xml to hbase loading

Posted by Guillaume Pool <gp...@live.co.za>.
Hi,

I don't think there is a specific converter, but you can use EvaluateXPath to store values from the XML document as flowfile attributes.

Depending on the input, you can use SplitXml to split the document at the level that gives you one flowfile per record, then extract attributes with EvaluateXPath, then use AttributesToJSON to put those attributes into a JSON document, then use PutHBaseJSON to write it to HBase.

That is a rough idea of how I would go about it. It all depends on what you are trying to achieve.
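
The same split, extract and convert sequence can be sketched in plain Python to show the data shape at each step. This is only an illustration of the idea, not how the NiFi processors are implemented. The function name records_to_json and the sample element names are made up, and the paths use the limited XPath subset that the standard library supports.

```python
import json
import xml.etree.ElementTree as ET


def records_to_json(xml_text: str, record_path: str, field_paths: dict) -> list:
    """Split a document into records and return one flat JSON string per record.

    record_path selects the record elements (the "split" step); field_paths maps
    an output field name to a path evaluated inside each record (the "extract" step).
    """
    root = ET.fromstring(xml_text)
    documents = []
    for record in root.findall(record_path):
        attributes = {}
        for name, path in field_paths.items():
            value = record.findtext(path)
            if value is not None:  # missing elements are simply left out
                attributes[name] = value.strip()
        documents.append(json.dumps(attributes))  # the "attributes to JSON" step
    return documents


if __name__ == "__main__":
    xml = (
        "<people>"
        "<person><name>Ann</name><city>Oslo</city></person>"
        "<person><name>Bob</name></person>"
        "</people>"
    )
    for doc in records_to_json(xml, "./person", {"name": "name", "city": "city"}):
        print(doc)
```

Each resulting document is flat, which is the shape a JSON-to-HBase step can map directly to one row with one column per field.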

Regards
Guillaume
