Posted to users@nifi.apache.org by Joe Witt <jo...@gmail.com> on 2016/10/04 00:22:47 UTC

Re: ELT on Nifi

Carlos,

I think you're right that more can be done to support a broad range of
transforms and styles of transforms.  The approach you're suggesting makes
sense for the style you prefer and I could envision such a processor that
can execute the transform/statements you're showing in that JSON sample.
Are you proposing to contribute such a processor?

Thanks
Joe

On Mon, Oct 3, 2016 at 2:25 PM, Carlos Manuel Fernandes (DSI) <
carlos.antonio.fernandes@cgd.pt> wrote:

> Hi all,
>
> When I first saw NiFi, I tried to build a classical ETL/ELT flow, and this
> question comes up again and again among new users.
>
> NiFi has very good processors for *Extract* and *Load*. The problem arises
> with Transform, because ETL/ELT tools have specific “processors” (e.g.
> map, SCD) bound to data warehouse (DW) concepts and sometimes bound to a
> specific database (e.g. SCDNetezza). The transform processors in NiFi are
> general purpose and not tied to these concepts. The immediate solution is
> to create a lot of custom script processors, but then the ELT metadata
> (the SQL) ends up as attributes or code of the processors, which is not an
> ideal solution.
>
> But if we put the *Transform* logic outside NiFi, for example in a JSON
> structure, then it is relatively easy to construct an ELT NiFi template
> capable of running generic ELT flows.
>
> Example of an ELT JSON structure (the “steps” inside each “flow” are to be
> executed by PutSQL in the same transaction):
>
> {
>        "Transformer": [{
>              "name": "foo1",
>              "type": "Map",
>              "description": "Summarize the table foo from table bar",
>              "flow": [{
>                     "step": 1,
>                     "description": "delete all data",
>                     "stmt": "delete from foo"
>              }, {
>                     "step": 2,
>                     "description": "Sum c2 by c1",
>                     "stmt": "insert into foo(c1, c2) select c1, sum(c2) from bar group by c1"
>              }]
>        }, {
>              "name": "foo2",
>              "type": "SCD - Slowly Changing Dimensions type 1",
>              "description": "Update a prod table based on stage table",
>              "flow": [{
>                     "step": 1,
>                     "description": "Process type 1",
>                     "stmt": "Update Prod Set Prod.columns = Stage.Columns From Stage Inner Join Prod on Stage.key = Prod.key Where Stage.IsType1 = 1"
>              }]
>        }]
> }
>
> Example of a NiFi template that executes that JSON structure:
>
> [image: image001.png]
>
> Does this make sense? Please give me feedback.
>
> Carlos
>
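
The idea in the quoted message, keeping the transform SQL in a JSON structure and running each flow's steps in one transaction, can be sketched outside NiFi. The following is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database as a stand-in for the target database, it is not a NiFi processor or template, and the names `run_transformer`, `run_spec`, and `demo` are hypothetical. Only the "foo1" transformer is included, because the "foo2" statement uses placeholder column names.

```python
import json
import sqlite3

# The "foo1" transformer from the JSON structure in the message above.
ELT_SPEC = """
{
  "Transformer": [{
    "name": "foo1",
    "type": "Map",
    "description": "Summarize the table foo from table bar",
    "flow": [
      {"step": 1, "description": "delete all data",
       "stmt": "delete from foo"},
      {"step": 2, "description": "Sum c2 by c1",
       "stmt": "insert into foo(c1, c2) select c1, sum(c2) from bar group by c1"}
    ]
  }]
}
"""


def run_transformer(conn, transformer):
    """Execute one transformer's steps, ordered by "step", as one transaction."""
    steps = sorted(transformer["flow"], key=lambda s: s["step"])
    # "with conn" commits if every statement succeeds and rolls back otherwise.
    with conn:
        for step in steps:
            conn.execute(step["stmt"])


def run_spec(conn, spec_text):
    """Run every transformer in the spec; return the names that completed."""
    done = []
    for transformer in json.loads(spec_text)["Transformer"]:
        run_transformer(conn, transformer)
        done.append(transformer["name"])
    return done


def demo():
    # Assumption: SQLite stands in for the real warehouse database.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table bar (c1 text, c2 integer)")
    conn.execute("create table foo (c1 text, c2 integer)")
    conn.executemany("insert into bar values (?, ?)",
                     [("a", 1), ("a", 2), ("b", 5)])
    conn.execute("insert into foo values ('stale', 0)")
    conn.commit()
    run_spec(conn, ELT_SPEC)
    return conn.execute("select c1, c2 from foo order by c1").fetchall()


if __name__ == "__main__":
    print(demo())
```

Running the script prints `[('a', 3), ('b', 5)]`: the stale row is deleted and the summary is inserted in the same transaction. If any step fails, the whole flow is rolled back, so `foo` is never left empty halfway through.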

RE: ELT on Nifi

Posted by "Carlos Manuel Fernandes (DSI)" <ca...@cgd.pt>.
Hi Joe,

I can contribute the template whose image I sent earlier. As for building a processor, my Java skills are not strong enough for that task; I mostly program in Groovy. If someone takes on that task, I can help with ideas and testing.

Thanks

Carlos


