Posted to users@nifi.apache.org by Richard Hanson <rh...@mailbox.org> on 2017/04/02 10:45:54 UTC

Some question

I am new to NiFi and am evaluating it after trying out some basic functions. I have a few questions:

- What is the maximum cluster size NiFi can support?

- Is it possible to create a workflow without the GUI? Something like a Travis CI .yaml file: the user creates a workflow file and lets NiFi execute it (submitted manually or programmatically).

- Is it possible to run NiFi embedded? The thread at http://apache-nifi-developer-list.39713.n7.nabble.com/Possibility-of-running-NiFi-embedded-in-a-test-td820.html suggests that NiFi cannot run embedded, but I want to double-check (my use case is not unit testing).

Thanks

Re: Some question

Posted by Juan Sequeiros <he...@gmail.com>.
Ravi,

Yes sir.

On Mon, Apr 17, 2017 at 12:38 PM Ravi Papisetti (rpapiset) <
rpapiset@cisco.com> wrote:

> This is a very useful discussion for me as well.
>
> You mean step 2 should be something like: take flow.xml.gz from
> NiFi_Home/conf and check it in to git (step 3)?
>
> Thanks,
>
> Ravi Papisetti

Re: Some question

Posted by "Ravi Papisetti (rpapiset)" <rp...@cisco.com>.
This is a very useful discussion for me as well.

You mean step 2 should be something like: take flow.xml.gz from NiFi_Home/conf and check it in to git (step 3)?

Thanks,
Ravi Papisetti

From: Juan Sequeiros <he...@gmail.com>
Reply-To: "users@nifi.apache.org" <us...@nifi.apache.org>
Date: Monday, April 17, 2017 at 11:14 AM
To: Richard Hanson <rh...@mailbox.org>, "users@nifi.apache.org" <us...@nifi.apache.org>, Andy LoPresto <al...@apache.org>
Subject: Re: Some question

yes :)
Except for number 2: you don't export the flow, it's happening dynamically as you edit through the UI.


Re: Some question

Posted by Juan Sequeiros <he...@gmail.com>.
yes :)
Except for number 2: you don't export the flow, it's happening dynamically
as you edit through the UI.
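
A minimal sketch of what "no export step" could look like in practice, assuming the live flow sits at conf/flow.xml.gz (as discussed in this thread) and that a local git checkout is used for tracking. The function names and the repo layout are hypothetical, not part of NiFi. The copy is only refreshed when the decompressed XML differs, since two gzip files with the same content are not guaranteed to be byte-identical.

```python
import gzip
import shutil
from pathlib import Path


def flow_changed(live: Path, tracked: Path) -> bool:
    """Return True when the live flow differs from the tracked copy.

    The decompressed XML is compared, because the gzip container itself
    can differ (for example in its header) even when the XML is the same.
    """
    if not tracked.exists():
        return True
    with gzip.open(live, "rb") as a, gzip.open(tracked, "rb") as b:
        return a.read() != b.read()


def snapshot_flow(nifi_conf: Path, repo_dir: Path) -> bool:
    """Copy conf/flow.xml.gz into repo_dir if it changed.

    Returns True when the tracked copy was updated. The repo_dir layout
    is an assumption made for this sketch.
    """
    live = nifi_conf / "flow.xml.gz"
    tracked = repo_dir / "flow.xml.gz"
    if not flow_changed(live, tracked):
        return False
    repo_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(live, tracked)
    return True
```

When snapshot_flow returns True, the usual git add and git commit in the checkout would record the new version.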


On Mon, Apr 17, 2017 at 9:00 AM Richard Hanson <rh...@mailbox.org> wrote:

> Does that mean the procedure for deploying NiFi is:
>
> 1. Configure the flow graph in the UI (for example, on my local
> development workstation)
>
> 2. Export the flow graph to local disk (e.g. flow.xml.gz)
>
> 3. Check the flow graph in to an SCM such as git
>
> 4. Deploy using Ansible and its git module to (a) pull the flow graph
> from the SCM and (b) deploy it to the target server
>
>
> Another minor question: is it enough to replace flow.xml.gz under the conf
> directory with my customized flow graph? I ask because nifi.properties
> shows nifi.flow.configuration.file pointing to conf/flow.xml.gz.
>
>
> Thank you again for the advice!

Re: Some question

Posted by Richard Hanson <rh...@mailbox.org>.
Does that mean the procedure for deploying NiFi is:

1. Configure the flow graph in the UI (for example, on my local development workstation)

2. Export the flow graph to local disk (e.g. flow.xml.gz)

3. Check the flow graph in to an SCM such as git

4. Deploy using Ansible and its git module to (a) pull the flow graph from the SCM and (b) deploy it to the target server


Another minor question: is it enough to replace flow.xml.gz under the conf directory with my customized flow graph? I ask because nifi.properties shows nifi.flow.configuration.file pointing to conf/flow.xml.gz.
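
To make the question concrete, here is a rough sketch of the replace-the-file step. It reads nifi.flow.configuration.file from conf/nifi.properties (the property named above), keeps a backup of the existing file, and copies the new flow into place. The function names are hypothetical, and two things are assumptions rather than confirmed behaviour: that a relative path in the property is resolved against the NiFi home directory, and that NiFi is stopped while the file is swapped.

```python
import shutil
from pathlib import Path

FLOW_PROPERTY = "nifi.flow.configuration.file"


def flow_file_path(nifi_home: Path) -> Path:
    """Resolve the flow file location from conf/nifi.properties."""
    props = nifi_home / "conf" / "nifi.properties"
    for line in props.read_text().splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == FLOW_PROPERTY:
            # Assumption: relative values are relative to the NiFi home.
            return (nifi_home / value.strip()).resolve()
    raise KeyError(f"{FLOW_PROPERTY} not found in {props}")


def install_flow(new_flow: Path, nifi_home: Path) -> Path:
    """Back up the current flow file (if any) and copy new_flow into place."""
    target = flow_file_path(nifi_home)
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(new_flow, target)
    return target
```

An Ansible play could do the same with its copy module; the Python version is only meant to spell out the steps.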


Thank you again for the advice! 



> On 17 April 2017 at 14:25 Juan Sequeiros <he...@gmail.com> wrote:
> 
> Good morning Richard,
>
> We have a deployment strategy similar to what you describe.
> The flow graph (flow.xml.gz, if running a single instance) is checked in to git, and we use the Ansible git module to check it out.
>
> Since we are in cluster mode, we actually check in the flow.tar. When we deploy a member node, we don't pull the flow from git; instead the node gets the flow from the NCM. A similar method will apply to you: just check it out.
>
> Regarding starting a flow, we have NiFi run as a service.
>
> If you haven't already, I recommend reading the Admin Guide [1] and NiFi In Depth [2].
>
> [1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html
> [2] https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html
>
> Thanks,
>
> Juan

Re: Some question

Posted by Juan Sequeiros <he...@gmail.com>.
Good morning Richard,

We have a deployment strategy similar to what you describe.
The flow graph (flow.xml.gz, if running a single instance) is checked in to
git, and we use the Ansible git module to check it out.

Since we are in cluster mode, we actually check in the flow.tar. When we
deploy a member node, we don't pull the flow from git; instead the node
gets the flow from the NCM. A similar method will apply to you: just check
it out.

Regarding starting a flow, we have NiFi run as a service.

If you haven't already, I recommend reading the Admin Guide [1] and NiFi In
Depth [2].

[1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html
[2] https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html

Thanks,

Juan

On Mon, Apr 17, 2017 at 5:24 AM Richard Hanson <rh...@mailbox.org> wrote:

> Sorry for the late reply, and thanks for the helpful insights!
>
> The second answer leads me to another question. I need to automate the
> process (automatically deploying NiFi to a remote production server). My
> search results point to using HDP, which looks Hortonworks-specific.
>
> https://community.hortonworks.com/articles/58330/automation-to-deploy-hdp-25nifi-10-clusters-runnin.html
>
> Is this the only (recommended) way to deploy NiFi?
>
> I am looking for a solution (e.g. Ansible) for automatically deploying
> NiFi. My requirements are basically: 1. installing and configuring NiFi,
> 2. creating the flow graph, 3. starting the flow. So in general there
> should be no manual configuration (opening a browser, creating the flow
> in the UI, etc.). How can I achieve this?
>
> Thanks

Re: Some question

Posted by Richard Hanson <rh...@mailbox.org>.
Sorry for the late reply, and thanks for the helpful insights!


The second answer leads me to another question. I need to automate the process (automatically deploying NiFi to a remote production server). My search results point to using HDP, which looks Hortonworks-specific.


https://community.hortonworks.com/articles/58330/automation-to-deploy-hdp-25nifi-10-clusters-runnin.html


Is this the only (recommended) way to deploy NiFi?


I am looking for a solution (e.g. Ansible) for automatically deploying NiFi. My requirements are basically: 1. installing and configuring NiFi, 2. creating the flow graph, 3. starting the flow. So in general there should be no manual configuration (opening a browser, creating the flow in the UI, etc.). How can I achieve this?
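
For requirement 3 (starting the flow without a browser), one possible direction is the NiFi REST API mentioned in the earlier answer. The sketch below only builds the HTTP request; it does not send it. It assumes an unsecured instance and the scheduling endpoint PUT /flow/process-groups/{id} with a body of the form {"id": ..., "state": "RUNNING"}, which is my understanding of the 1.x API and should be checked against the REST API documentation for the version in use. The base URL, the group id and the function name are placeholders.

```python
import json
import urllib.request


def build_start_request(base_url: str, group_id: str) -> urllib.request.Request:
    """Build a PUT request asking NiFi to start a process group.

    base_url is the API root, e.g. http://localhost:8080/nifi-api
    (an assumed, unsecured test instance). group_id is the id of the
    process group to start.
    """
    body = json.dumps({"id": group_id, "state": "RUNNING"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/flow/process-groups/{group_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the result to urllib.request.urlopen would send it; a secured instance would additionally need authentication, which is left out here.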


Thanks





> On 03 April 2017 at 20:10 Andy LoPresto <al...@apache.org> wrote:
>
> Hi Richard,
>
> 1. NiFi does not have a defined maximum cluster size. For the best performance, we usually recommend fewer than 10 nodes per cluster. If you have high performance needs, we have generally seen better results with multiple smaller clusters than with one large one. In this way, you can have hundreds of nodes processing the data in parallel, but the cluster administration overhead does not tax a single cluster coordinator to death.
> 2. While it is technically possible to create and define a flow.xml.gz file by hand, this would be incredibly frustrating, as the components and connections need a large number of defined values and must be validated in many unique ways. The UI and API allow this to happen in a convenient manner. If you genuinely wish to define the flow without a UI, take a look at existing flow.xml.gz files to get an understanding of the flow definition format.
> 3. NiFi can run on small hardware, such as a Raspberry Pi. You may also be interested in MiNiFi [1], a sub-project of NiFi. MiNiFi is a “headless agent” tool designed to run on lightweight or shared systems and extend the reach and capabilities of NiFi to the “edge” of data collection. MiNiFi offers two versions: a Java version [2], which has a high degree of compatibility with NiFi (many of the native processors are available), and a C++ version [3], which is extremely compact but has a limited set of processors at this time. MiNiFi may also be a better fit for your “non-UI workflow”, as the flow can be defined using the GUI of NiFi and then exported as YAML to the MiNiFi agent, or written directly as YAML if desired.
>
> [1] https://nifi.apache.org/minifi/index.html
> [2] https://github.com/apache/nifi-minifi
> [3] https://github.com/apache/nifi-minifi-cpp
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: Some question

Posted by Andy LoPresto <al...@apache.org>.
Hi Richard,

1. NiFi does not have a defined maximum cluster size. For the best performance, we usually recommend fewer than 10 nodes per cluster. If you have high performance needs, we have generally seen better results with multiple smaller clusters than with one large one. In this way, you can have hundreds of nodes processing the data in parallel, but the cluster administration overhead does not tax a single cluster coordinator to death.
2. While it is technically possible to create and define a flow.xml.gz file by hand, this would be incredibly frustrating, as the components and connections need a large number of defined values and must be validated in many unique ways. The UI and API allow this to happen in a convenient manner. If you genuinely wish to define the flow without a UI, take a look at existing flow.xml.gz files to get an understanding of the flow definition format.
3. NiFi can run on small hardware, such as a Raspberry Pi. You may also be interested in MiNiFi [1], a sub-project of NiFi. MiNiFi is a “headless agent” tool designed to run on lightweight or shared systems and extend the reach and capabilities of NiFi to the “edge” of data collection. MiNiFi offers two versions: a Java version [2], which has a high degree of compatibility with NiFi (many of the native processors are available), and a C++ version [3], which is extremely compact but has a limited set of processors at this time. MiNiFi may also be a better fit for your “non-UI workflow”, as the flow can be defined using the GUI of NiFi and then exported as YAML to the MiNiFi agent, or written directly as YAML if desired.

[1] https://nifi.apache.org/minifi/index.html
[2] https://github.com/apache/nifi-minifi
[3] https://github.com/apache/nifi-minifi-cpp
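
For anyone following the suggestion in point 2 to study an existing flow.xml.gz, a small sketch like the following may help as a starting point. It decompresses the file and lists the name and class of each processor. The element names (processor, name, class) are an assumption based on flow files from 1.x releases and may differ in other versions; the function name is made up for the example.

```python
import gzip
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, Tuple


def list_processors(flow_path: Path) -> List[Tuple[str, str]]:
    """Return (name, class) for every <processor> element in the flow.

    Assumes each processor element carries <name> and <class> children;
    missing children are reported as empty strings.
    """
    with gzip.open(flow_path, "rb") as fh:
        root = ET.parse(fh).getroot()
    return [
        (proc.findtext("name", default=""), proc.findtext("class", default=""))
        for proc in root.iter("processor")
    ]
```

Printing the result for a flow built in the UI gives a quick inventory before reading the full XML.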

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Apr 2, 2017, at 3:45 AM, Richard Hanson <rh...@mailbox.org> wrote:
> 
> I am new to NiFi and am evaluating it after trying out some basic functions. I have a few questions:
>
> - What is the maximum cluster size NiFi can support?
>
> - Is it possible to create a workflow without the GUI? Something like a Travis CI .yaml file: the user creates a workflow file and lets NiFi execute it (submitted manually or programmatically).
>
> - Is it possible to run NiFi embedded? The thread at http://apache-nifi-developer-list.39713.n7.nabble.com/Possibility-of-running-NiFi-embedded-in-a-test-td820.html suggests that NiFi cannot run embedded, but I want to double-check (my use case is not unit testing).
> 
> Thanks