Posted to users@nifi.apache.org by Joe Witt <jo...@gmail.com> on 2016/10/04 00:22:47 UTC
Re: ELT on Nifi
Carlos,
I think you're right that more can be done to support a broad range of
transforms and styles of transforms. The approach you're suggesting makes
sense for the style you prefer and I could envision such a processor that
can execute the transform/statements you're showing in that JSON sample.
Are you proposing to contribute such a processor?
Thanks
Joe
On Mon, Oct 3, 2016 at 2:25 PM, Carlos Manuel Fernandes (DSI) <
carlos.antonio.fernandes@cgd.pt> wrote:
> Hi all,
>
> When I first saw NiFi, I tried to build a classical ETL/ELT flow, and
> this question comes up repeatedly for new users.
>
> NiFi has very good processors for *Extract* and *Load*. The problem
> arises with *Transform*, because ETL/ELT tools have specific
> “processors” (e.g. map, SCD, etc.) bound to DW concepts and sometimes
> bound to a specific database (e.g. SCDNetezza). The transform
> processors in NiFi are general purpose and not tied to these
> concepts. The immediate solution is to create a lot of custom script
> processors, but then the ELT metadata (the SQL) ends up as attributes or
> code of the processors, which is not an ideal solution.
>
> But if we put the *Transform* logic outside of NiFi, for example in
> a JSON structure, then it is relatively easy to construct an ELT NiFi
> template capable of running generic ELT flows.
>
> Example of an ELT JSON structure (the “steps” inside each “flow” are to be
> executed by PutSQL in the same transaction):
>
> {
>   "Transformer": [{
>     "name": "foo1",
>     "type": "Map",
>     "description": "Summarize the table foo from table bar",
>     "flow": [{
>       "step": 1,
>       "description": "delete all data",
>       "stmt": "delete from foo"
>     }, {
>       "step": 2,
>       "description": "Sum c2 by c1",
>       "stmt": "insert into foo(c1, c2) select c1,sum(c2) from bar group by c1"
>     }]
>   }, {
>     "name": "foo2",
>     "type": "SCD - Slowly Changing Dimensions type 1",
>     "description": "Update a prod table based on stage table",
>     "flow": [{
>       "step": 1,
>       "description": "Process type 1",
>       "stmt": "Update Prod Set Prod.columns = Stage.Columns From Stage Inner Join Prod on Stage.key = Prod.key Where Stage.IsType1 = 1"
>     }]
>   }]
> }
>
> Example of a NiFi template that executes this JSON structure:
>
> [cid:image001.png@01D21E64.24D94F70]
>
> Does this make sense? Please give me feedback.
>
> Carlos
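
The idea in the quoted message (keep the SQL steps in a JSON document and run each "flow" in one transaction) can be sketched outside NiFi as follows. This is only an illustration under stated assumptions: Python's standard-library sqlite3 module stands in for PutSQL and for a real warehouse database, and the function names run_transformer and demo, the table definitions, and the sample rows are hypothetical. Only the JSON layout and the two SQL statements of "foo1" come from the thread.

```python
import json
import sqlite3

# Trimmed copy of the "foo1" transformer from the message above
# (per-step descriptions omitted for brevity).
ELT_SPEC = json.loads("""
{
  "Transformer": [{
    "name": "foo1",
    "type": "Map",
    "description": "Summarize the table foo from table bar",
    "flow": [
      {"step": 1, "stmt": "delete from foo"},
      {"step": 2, "stmt": "insert into foo(c1, c2) select c1,sum(c2) from bar group by c1"}
    ]
  }]
}
""")


def run_transformer(conn, transformer):
    """Run all steps of one transformer, in step order, as one transaction."""
    steps = sorted(transformer["flow"], key=lambda s: s["step"])
    try:
        for step in steps:
            conn.execute(step["stmt"])
        conn.commit()      # every step succeeded: make the changes permanent
    except Exception:
        conn.rollback()    # any failure: undo the steps already executed
        raise


def demo():
    # Hypothetical in-memory tables standing in for the real warehouse.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table bar (c1 text, c2 integer)")
    conn.execute("create table foo (c1 text, c2 integer)")
    conn.executemany("insert into bar values (?, ?)",
                     [("a", 1), ("a", 2), ("b", 5)])
    conn.execute("insert into foo values ('stale', 0)")
    conn.commit()

    for transformer in ELT_SPEC["Transformer"]:
        run_transformer(conn, transformer)
    return conn.execute("select c1, c2 from foo order by c1").fetchall()


if __name__ == "__main__":
    print(demo())
```

Because the delete and the insert share one transaction, a failure in step 2 leaves foo as it was before step 1, which is the behaviour the message asks of PutSQL.
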
RE: ELT on Nifi
Posted by "Carlos Manuel Fernandes (DSI)" <ca...@cgd.pt>.
Hi Joe,
I can contribute the template whose image I sent before. As for building a processor, I'm not skilled enough in Java for that task; I mostly program in Groovy. If someone takes on that task, I can help with ideas and tests.
Thanks
Carlos