Posted to users@nifi.apache.org by Gunjan Dave <gu...@gmail.com> on 2016/08/20 05:46:38 UTC

NiFi Real World Usage Queries

Hello NiFi Team,
Firstly, let me congratulate you on the fantastic work that your team is
doing and also on the release of the 1.0 (beta) version. Your team is doing
great, and NiFi opens up many possibilities and use cases that are currently
neither envisioned nor marketed.

To be able to use NiFi to its fullest capacity, can I request your help in
addressing some of my queries?

1) Parallel Development
          How would a team of 50 developers work in parallel on designing
and implementing a common data flow? How would version control work? How can
we ensure that the work of one developer is not overwritten by another? Can
we version control the flow.xml file outside of NiFi?
Coming from a Microsoft BizTalk background, where the orchestration data flow
design XML is version controlled in TFS, can we do the same for flow.xml in
Git? Can developers merge their local flow.xml files externally and deploy the
merged flow.xml.gz into UAT?

2) User Authorization
      I understand from the docs that user authorization needs to be managed
in an authorization XML file. Is this a manual process? I would have some 50
to 100 developers using it, so it would be a pain to manage the authorization
manually via an XML file.
Is there some other way, like LDAP, for authorization?

3) Sensitive data: can values like DBCP connection passwords be kept in an
external file in encrypted form, and if so, how?

4) Functional testing the entire flow
      I understand each processor can be run on its own against a mock, but
is there a way I can test the entire flow in an automated manner before going
live? Can there be an option of reading the flow XML file into a separate test
harness and executing specific portions of the flow end to end?
Can MiNiFi be used for this purpose? I do understand that MiNiFi was created
for a different purpose, but can it also be used for some form of automated
testing?

5) How does the debug flow processor work? I could not find enough
documentation on it.

7) Is there a detailed infrastructure recommendation and setup guide for a
NiFi cluster, such as best practices and some sample setup patterns?

Thanking in advance.

Thanks
Gunjan Dave

Re: NiFi Real World Usage Queries

Posted by Joe Witt <jo...@gmail.com>.
Responses provided in-line. Apologies for the delay and brevity on
some of these. Great questions, all worth discussion.

On Sat, Aug 20, 2016 at 1:46 AM, Gunjan Dave <gu...@gmail.com> wrote:
> Hello NiFi Team,
> Firstly, let me congratulate you on the fantastic work that your team is
> doing and also on the release of the 1.0 (beta) version. Your team is doing
> great, and NiFi opens up many possibilities and use cases that are currently
> neither envisioned nor marketed.
>
> To be able to use NiFi to its fullest capacity, can I request your help in
> addressing some of my queries?
>
> 1) Parallel Development
>           How would a team of 50 developers work in parallel on designing
> and implementing a common data flow? How would version control work? How can
> we ensure that the work of one developer is not overwritten by another? Can
> we version control the flow.xml file outside of NiFi?
> Coming from a Microsoft BizTalk background, where the orchestration data flow
> design XML is version controlled in TFS, can we do the same for flow.xml in
> Git? Can developers merge their local flow.xml files externally and deploy the
> merged flow.xml.gz into UAT?

The development team would generally build flows locally and produce
templates. Those templates can then be deployed to production. In
production, the expectation is that the flow can be tweaked directly
or replaced by a new version of the template. There is no worry about
developers overriding each other unless they are changing the same
component at the same time. In that case the system will reject the
second editor, but that would be a pretty rare case.
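
On the narrower question of keeping flow.xml.gz in Git: the file is
gzip-compressed XML, so one option is to store a decompressed, indented
copy that diffs cleanly. The following is only a minimal sketch of that
idea, not a NiFi feature and not a way to merge flows safely. The
function name flow_to_text is hypothetical.

```python
import gzip
import xml.dom.minidom


def flow_to_text(gz_bytes: bytes) -> str:
    """Decompress flow.xml.gz content and return indented XML text.

    Hypothetical helper for producing a diff-friendly copy of the flow;
    it does not validate the flow or understand NiFi's schema.
    """
    raw_xml = gzip.decompress(gz_bytes)
    dom = xml.dom.minidom.parseString(raw_xml)
    pretty = dom.toprettyxml(indent="  ")
    # toprettyxml can emit whitespace-only lines when the input was
    # already indented; drop them so repeated runs give stable output.
    lines = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(lines) + "\n"


# Example usage (the path is an assumption about a local install):
# with open("conf/flow.xml.gz", "rb") as fh:
#     print(flow_to_text(fh.read()))
```

A readable copy like this helps with reviewing changes. Deploying a
hand-merged flow.xml.gz is a separate matter that the template approach
described above avoids.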

> 2) User Authorization
>       I understand from the docs that user authorization needs to be managed
> in an authorization XML file. Is this a manual process? I would have some 50
> to 100 developers using it, so it would be a pain to manage the authorization
> manually via an XML file.
> Is there some other way, like LDAP, for authorization?

Right, this would be cumbersome if using our built-in file-based
mechanism. You could instead integrate with an external authorization
system such as Apache Ranger (incubating), or implement an alternative
authorization provider that leverages an authorization system you
already have.

>
> 3) Sensitive data: can values like DBCP connection passwords be kept in an
> external file in encrypted form, and if so, how?

Sensitive values like DBCP connection passwords are encrypted by NiFi
whenever/wherever the flow configuration is serialized, and they are
not sent back to the client again because the client does not need
those values. So it should already be well handled.
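
One way to spot-check this on a decompressed copy of the flow is
sketched below. It rests on assumptions that are not stated in this
thread: that properties are serialized as <property> elements with
<name> and <value> children, and that encrypted values appear wrapped
as enc{...} in hex. The function name plaintext_candidates is
hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed shape of an encrypted value in the serialized flow: enc{<hex>}
ENC_PATTERN = re.compile(r"^enc\{[0-9a-fA-F]+\}$")


def plaintext_candidates(flow_xml: str, sensitive_names: set[str]) -> list[str]:
    """Return names of the listed properties whose value is not enc{...}.

    Assumes <property><name>..</name><value>..</value></property>
    elements; adjust the element names if your flow file differs.
    """
    root = ET.fromstring(flow_xml)
    flagged = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value")
        # Only non-empty values of the named properties are checked.
        if name in sensitive_names and value:
            if not ENC_PATTERN.match(value.strip()):
                flagged.append(name)
    return flagged
```

An empty result for a name such as "Password" would be consistent with
the behaviour described above; a non-empty result only means the value
did not match the assumed pattern and deserves a closer look.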

>
> 4) Functional testing the entire flow
>       I understand each processor can be run on its own against a mock, but
> is there a way I can test the entire flow in an automated manner before going
> live? Can there be an option of reading the flow XML file into a separate test
> harness and executing specific portions of the flow end to end?
> Can MiNiFi be used for this purpose? I do understand that MiNiFi was created
> for a different purpose, but can it also be used for some form of automated
> testing?

Generally this is what local testing and/or templates are helpful for.
We should do more, though, and perhaps offer a programmatic way.

> 5) How does the debug flow processor work? I could not find enough
> documentation on it.

This is really for debugging framework behavior.

>
> 7) Is there a detailed infrastructure recommendation and setup guide for a
> NiFi cluster, such as best practices and some sample setup patterns?

Take a look at the admin guide, best practices/install guide, and user
guide.  It should help you get on your way.

>
> Thanking in advance.
>
> Thanks
> Gunjan Dave