Posted to users@nifi.apache.org by Boris Tyukin <bo...@boristyukin.com> on 2018/03/01 16:29:10 UTC

setting processor concurrency based on the development/production environment

Hello NiFi community,

I started using NiFi recently and fell in love with it! We run NiFi 1.6 along
with the new NiFi Registry, and I am trying to figure out how to promote a NiFi
flow created in a VM environment to our cluster.

One of the issues is the "Concurrent Tasks" processor parameter. I bump it to 2
or 4 for some processors in my flow when I develop / test it in the VM.

Then we deploy this flow to a beefy cluster node (with 48 cores) and want
to change concurrency to let's say 8 or 10 or 12 for some processors.

Then I work on a new version/make some changes in my VM, and need to be
more conservative with concurrency, so I set it back to 2 or 4.

Then the story repeats...

Is there a better way than setting this parameter manually? I do not believe
I can use a variable there, so I have to type the actual number of tasks.


Thanks
Boris

Re: setting processor concurrency based on the development/production environment

Posted by Boris Tyukin <bo...@boristyukin.com>.
Opened NIFI-4921: better support for promoting NiFi processor parameters
between dev and prod environments

Thanks!

On Thu, Mar 1, 2018 at 11:43 AM, Bryan Bende <bb...@gmail.com> wrote:

> Hello,
>
> Glad you are having success with NiFi + NiFi Registry!
>
> You brought up an interesting point about the concurrent tasks...
>
> I think we may want to consider making the concurrent tasks work
> similar to variables, in that we capture the concurrent tasks that the
> flow was developed with and would use it initially, but then if you
> have modified this value in the target environment it would not
> trigger a local change and would be retained across upgrades so that
> you don't have to reset it.
>
> For now you could probably always leave the versioned flow with the
> lower value of 2, then once you are in prod you bump it to 4 until the
> next upgrade is available, you then revert the local changes, do the
> upgrade, and put it back to 4, but it's not ideal because it shows a
> local change the entire time.
>
> I don't think there is much you can do differently right now, but I
> think this is a valid case to create a JIRA for.
>
> Thanks,
>
> Bryan
>

Re: setting processor concurrency based on the development/production environment

Posted by Andy LoPresto <al...@apache.org>.
Kevin,

The release cadence is a very loose schedule. Usually “point releases” (minor version upgrades) come approximately 10 to 14 weeks after the previous one (e.g. 1.5.0 -> 1.6.0). In some cases, a bug fix release (1.5.0 -> 1.5.1) will be released on a shorter time frame if deemed necessary.

As 1.5.0 was released in mid-January, I would expect (note, I am not committing to) a release vote towards mid-to-late April. In general, a “critical mass” of new features is reached and a committer/member of the PMC begins a release discussion thread on the mailing list, and the community weighs in with their thoughts on releasing a new version.

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Mar 2, 2018, at 2:54 PM, Kevin Verhoeven <Ke...@ds-iq.com> wrote:
> 
> I am also interested in modifying concurrency based on dev/prod environment.
> 
> Do you know when 1.6 will be released? Is there a published roadmap for the releases?
> 
> Thanks!
> 
> Kevin
> 
> From: Ed B [mailto:bdesert@gmail.com]
> Sent: Friday, March 2, 2018 10:23 AM
> To: users@nifi.apache.org
> Subject: Re: setting processor concurrency based on the development/production environment
> 
> Boris,
> as a workaround, I believe that after deploying the flow to another environment, you can run post-deployment scripts.
> You could use the REST API's PUT method to update processors by ID.
> <image001.png>
> 
> 
> 
> On Fri, Mar 2, 2018 at 12:04 PM Andrew Grande <aperepel@gmail.com> wrote:
> There are 2 efforts, with somewhat different focus. You are already aware of the community-driven nipyapi, but there's also an official module I mentioned before, which will be included with the 1.6 release.
> 
> Andrew
> 
> 
> On Fri, Mar 2, 2018, 8:59 AM Boris Tyukin <boris@boristyukin.com> wrote:
> Hi Andrew,
> 
> thanks for the idea. I've been playing with nipyapi recently so might give this a try.
> 
> Thanks
> 
> On Thu, Mar 1, 2018 at 7:32 PM, Andrew Grande <aperepel@gmail.com> wrote:
> Boris,
> 
> Here's an idea you could explore _today_.
> 
> Assume your dev and prod flows live in different buckets/registry instances. Given that you are trying out NiFi 1.6, you should be able to extract the versioned flow from DEV and process it to change the concurrency level for PROD before committing it to the prod registry instance. Any script which understands JSON would do. nifi-toolkit-cli will take care of extracting and moving flow versions.
> 
> It's not ideal (yes, would like concurrency to be a customizable flow var), and it assumes an explicit process to promote between environments, but technically it is possible already. The user experience can be improved in the future.
> 
> Andrew
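
A minimal sketch of the JSON-rewriting step Andrew describes, assuming the flow version has already been exported to JSON by some other tool. The field names `flowContents`, `processors`, `processGroups`, and `concurrentlySchedulableTaskCount` are assumptions about the exported versioned-flow layout and should be checked against a real export; the processor names and override values are purely hypothetical.

```python
import copy

# Hypothetical production overrides: processor name -> concurrent tasks.
PROD_CONCURRENCY = {"ConvertRecord": 8, "PutHDFS": 12}

# Hypothetical, heavily trimmed stand-in for a flow exported from the DEV registry.
DEV_FLOW = {
    "flowContents": {
        "processors": [
            {"name": "ConvertRecord", "concurrentlySchedulableTaskCount": 2},
        ],
        "processGroups": [
            {
                "processors": [
                    {"name": "PutHDFS", "concurrentlySchedulableTaskCount": 4},
                ],
                "processGroups": [],
            }
        ],
    }
}


def apply_concurrency(group, overrides):
    """Set concurrent tasks on matching processors in a group and its children.

    Returns the number of processors that were changed.
    """
    changed = 0
    for processor in group.get("processors", []):
        name = processor.get("name")
        if name in overrides:
            processor["concurrentlySchedulableTaskCount"] = overrides[name]
            changed += 1
    for child in group.get("processGroups", []):
        changed += apply_concurrency(child, overrides)
    return changed


def promote(snapshot, overrides):
    """Return a modified copy of the snapshot; the DEV input is left untouched."""
    result = copy.deepcopy(snapshot)
    apply_concurrency(result.get("flowContents", {}), overrides)
    return result
```

Calling `promote(DEV_FLOW, PROD_CONCURRENCY)` yields a new dictionary that could be serialized with `json.dumps` and imported into the prod registry. Matching on processor name is a simplification; matching on a stable identifier would be safer if names are not unique.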
> 
> 
> On Thu, Mar 1, 2018, 1:52 PM Kevin Doran <kdoran@apache.org> wrote:
> I think you could put it under either project. Ultimately, if we go with that approach, most (all?) of the logic/enhancement would be in the NiFi code base during save version / import flow / change version operations, so probably best to create it there.
> 
> Glad you are finding NiFi useful.
> 
> Cheers,
> Kevin
> 
> From: Boris Tyukin <boris@boristyukin.com>
> Reply-To: <users@nifi.apache.org>
> Date: Thursday, March 1, 2018 at 13:44
> To: <users@nifi.apache.org>
> Subject: Re: setting processor concurrency based on the development/production environment
> 
> thanks Bryan and Kevin. I will be happy to open a JIRA - would it be a NiFi JIRA or a NiFi Registry one?
> 
> I like the approach that Bryan suggested.
> 
> I guess for now I will just color code the processors that need to be changed in production.
> 
> P.S. I really, really like where NiFi is going...I've looked at StreamSets and Cask, but for my purposes, I was looking for a tool where I can process various tables without creating a flow per table. I was able to create a very simple flow in NiFi that will handle 25 tables. My next project is to handle 600 tables in near real-time. I just couldn't see how I would do that with StreamSets or Cask, where you have to create a pipeline per table. I was only able to do something similar with Apache Airflow, but Airflow cannot do things in near real-time. The concept of FlowFiles with attributes is a genius idea, and I am blown away by all the possibilities to extend the functionality of NiFi with custom processors and Groovy scripts. Awesome job, guys.
> 
> On Thu, Mar 1, 2018 at 1:29 PM, Kevin Doran <kdoran@apache.org> wrote:
> Hi Boris,
> 
> Good point regarding concurrent tasks; thanks for sharing!
> 
> This is a great candidate for something that one should be able to create environment-specific values for, as Bryan suggests. I agree we should create a NiFi JIRA to track this enhancement.
> 
> Thanks,
> Kevin
> 


RE: setting processor concurrency based on the development/production environment

Posted by Kevin Verhoeven <Ke...@ds-iq.com>.
I am also interested in modifying concurrency based on dev/prod environment.

Do you know when 1.6 will be released? Is there a published roadmap for the releases?

Thanks!

Kevin




Re: setting processor concurrency based on the development/production environment

Posted by Boris Tyukin <bo...@boristyukin.com>.
thanks guys, good to see so many responses and ideas. You have a great
community here!

Looks like I am making the right choice with NiFi!

Boris


Re: setting processor concurrency based on the development/production environment

Posted by Ed B <bd...@gmail.com>.
Boris,
as a workaround, I believe that after deploying the flow to another
environment, you can run post-deployment scripts.
You could use the REST API's PUT method to update processors by ID.
[image: image.png]
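
A minimal sketch of such a post-deployment step, using only the Python standard library. It assumes the NiFi REST API accepts `PUT /processors/{id}` with a body carrying the current `revision.version` and a `component.config.concurrentlySchedulableTaskCount` value; those names are given from memory of the NiFi REST API and should be verified against the documentation for your version. The base URL, processor ID, and numbers are hypothetical.

```python
import json
import urllib.request


def build_update(processor_id, revision_version, concurrent_tasks):
    """Build the request body for changing a processor's concurrent tasks."""
    return {
        # The revision is assumed to have to match the server's current value.
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {"concurrentlySchedulableTaskCount": concurrent_tasks},
        },
    }


def build_request(base_url, processor_id, revision_version, concurrent_tasks):
    """Prepare (but do not send) the PUT request for one processor."""
    body = json.dumps(
        build_update(processor_id, revision_version, concurrent_tasks)
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/processors/{processor_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

The prepared request could then be sent with `urllib.request.urlopen(request)`. In practice the script would probably first GET the processor to read its current revision, and the processor may need to be stopped before its configuration can be changed; both points are assumptions to confirm in your environment.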




Re: setting processor concurrency based on the development/production environment

Posted by Andrew Grande <ap...@gmail.com>.
There are 2 efforts, with somewhat different focus. You are already aware
of the community-driven nipyapi, but there's also an official module
I mentioned before, which will be included with the 1.6 release.

Andrew

On Fri, Mar 2, 2018, 8:59 AM Boris Tyukin <bo...@boristyukin.com> wrote:

> Hi Andrew,
>
> thanks for the idea. I've been playing with nipyapi recently so might give
> this a try.
>
> Thanks

Re: setting processor concurrency based on the development/production environment

Posted by Boris Tyukin <bo...@boristyukin.com>.
Hi Andrew,

thanks for the idea. I've been playing with nipyapi recently so might give
this a try.

Thanks

On Thu, Mar 1, 2018 at 7:32 PM, Andrew Grande <ap...@gmail.com> wrote:

> Boris,
>
> Here's an idea you could explore _today_.
>
> Assume your dev and prod flows live in different buckets or registry
> instances. Given that you are trying out NiFi 1.6, you should be able to
> extract the versioned flow from DEV and process it to change the
> concurrency level for PROD before committing it to the prod registry
> instance. Any script that understands JSON would do. nifi-toolkit-cli will
> take care of extracting and moving flow versions.
>
> It's not ideal (yes, I would like concurrency to be a customizable flow
> variable), and it assumes an explicit process to promote between
> environments, but technically it is possible already. The user experience
> can be improved in the future.
>
> Andrew

Re: setting processor concurrency based on the development/production environment

Posted by Andrew Grande <ap...@gmail.com>.
Boris,

Here's an idea you could explore _today_.

Assume your dev and prod flows live in different buckets or registry
instances. Given that you are trying out NiFi 1.6, you should be able to
extract the versioned flow from DEV and process it to change the
concurrency level for PROD before committing it to the prod registry
instance. Any script that understands JSON would do. nifi-toolkit-cli will
take care of extracting and moving flow versions.
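
A minimal sketch of the JSON-processing step, in Python. It assumes the
flow version has already been exported to JSON (for example with
nifi-toolkit-cli), that the root process group sits under "flowContents"
with "processors" and nested "processGroups" lists, and that the
Concurrent Tasks setting is stored in a "concurrentlySchedulableTaskCount"
field. Check these names against a real export before relying on them. The
processor names and values are made up for illustration.

```python
import json


def set_concurrency(group, overrides):
    """Apply per-processor overrides to a process group and its children.

    `overrides` maps a processor name to the desired number of concurrent
    tasks. Returns how many processors were updated.
    """
    updated = 0
    for processor in group.get("processors", []):
        name = processor.get("name")
        if name in overrides:
            # Assumed field name for "Concurrent Tasks" in the exported JSON.
            processor["concurrentlySchedulableTaskCount"] = overrides[name]
            updated += 1
    for child in group.get("processGroups", []):
        updated += set_concurrency(child, overrides)
    return updated


def retune_snapshot(snapshot_json, overrides):
    """Return the snapshot JSON text with PROD concurrency values applied."""
    snapshot = json.loads(snapshot_json)
    # Assumed layout: the root process group sits under "flowContents".
    set_concurrency(snapshot["flowContents"], overrides)
    return json.dumps(snapshot, indent=2)


# Hypothetical DEV export, reduced to the fields this sketch touches.
dev_export = json.dumps({
    "flowContents": {
        "processors": [
            {"name": "FetchTable", "concurrentlySchedulableTaskCount": 2}
        ],
        "processGroups": [
            {"processors": [
                {"name": "ConvertRecord", "concurrentlySchedulableTaskCount": 4}
            ]}
        ],
    }
})

# Hypothetical PROD values for a 48-core node.
prod_export = retune_snapshot(dev_export, {"FetchTable": 8, "ConvertRecord": 12})
print(prod_export)
```

Matching on processor name is only a convenience here; names are not
guaranteed to be unique, so a real script would probably key on a stable
processor identifier instead.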

It's not ideal (yes, I would like concurrency to be a customizable flow
variable), and it assumes an explicit process to promote between
environments, but technically it is possible already. The user experience
can be improved in the future.

Andrew

On Thu, Mar 1, 2018, 1:52 PM Kevin Doran <kd...@apache.org> wrote:

> I think you could put it under either project. Ultimately, if we go with
> that approach, most (all?) of the logic/enhancement would be in the NiFi
> code base during save version / import flow / change version operations, so
> probably best to create it there.
>
>
>
> Glad you are finding NiFi useful.
>
>
>
> Cheers,
>
> Kevin

Re: setting processor concurrency based on the development/production environment

Posted by Kevin Doran <kd...@apache.org>.
I think you could put it under either project. Ultimately, if we go with that approach, most (all?) of the logic/enhancement would be in the NiFi code base during save version / import flow / change version operations, so probably best to create it there.

 

Glad you are finding NiFi useful.

 

Cheers,

Kevin

 

From: Boris Tyukin <bo...@boristyukin.com>
Reply-To: <us...@nifi.apache.org>
Date: Thursday, March 1, 2018 at 13:44
To: <us...@nifi.apache.org>
Subject: Re: setting processor concurrency based on the development/production environment

 

Thanks, Bryan and Kevin. I will be happy to open a JIRA - would it be a NiFi JIRA or a NiFi Registry JIRA?

 

I like the approach that Bryan suggested.

 

I guess for now I will just color code the processors that need to be changed in production.

 

P.S. I really, really like where NiFi is going... I've looked at StreamSets and Cask, but for my purposes I was looking for a tool where I can process various tables without creating a flow per table. I was able to create a very simple flow in NiFi that will handle 25 tables. My next project is to handle 600 tables in near real-time. I just couldn't see how I would do that with StreamSets or Cask, where you have to create a pipeline per table. I was only able to do something similar with Apache Airflow, but Airflow cannot do things in near real-time. The concept of FlowFiles with attributes is a genius idea, and I am blown away by all the possibilities to extend the functionality of NiFi with custom processors and Groovy scripts. Awesome job, guys.

 


Re: setting processor concurrency based on the development/production environment

Posted by Boris Tyukin <bo...@boristyukin.com>.
Thanks, Bryan and Kevin. I will be happy to open a JIRA - would it be a
NiFi JIRA or a NiFi Registry JIRA?

I like the approach that Bryan suggested.

I guess for now I will just color code the processors that need to be
changed in production.

P.S. I really, really like where NiFi is going... I've looked at StreamSets
and Cask, but for my purposes I was looking for a tool where I can process
various tables without creating a flow per table. I was able to create a
very simple flow in NiFi that will handle 25 tables. My next project is to
handle 600 tables in near real-time. I just couldn't see how I would do that
with StreamSets or Cask, where you have to create a pipeline per table. I
was only able to do something similar with Apache Airflow, but
Airflow cannot do things in near real-time. The concept of FlowFiles with
attributes is a genius idea, and I am blown away by all the possibilities
to extend the functionality of NiFi with custom processors and Groovy
scripts. Awesome job, guys.

On Thu, Mar 1, 2018 at 1:29 PM, Kevin Doran <kd...@apache.org> wrote:

> Hi Boris,
>
> Good point regarding concurrent tasks; thanks for sharing!
>
> This is a great candidate for something that one should be able to create
> environment-specific values for, as Bryan suggests. I agree we should
> create a NiFi JIRA to track this enhancement.
>
> Thanks,
> Kevin

Re: setting processor concurrency based on the development/production environment

Posted by Kevin Doran <kd...@apache.org>.
Hi Boris,

Good point regarding concurrent tasks; thanks for sharing!

This is a great candidate for something that one should be able to create environment-specific values for, as Bryan suggests. I agree we should create a NiFi JIRA to track this enhancement. 

Thanks,
Kevin

On 3/1/18, 11:44, "Bryan Bende" <bb...@gmail.com> wrote:

    Hello,
    
    Glad you are having success with NiFi + NiFi Registry!
    
    You brought up an interesting point about the concurrent tasks...
    
    I think we may want to consider making the concurrent tasks work
    similar to variables, in that we capture the concurrent tasks that the
    flow was developed with and would use it initially, but then if you
    have modified this value in the target environment it would not
    trigger a local change and would be retained across upgrades so that
    you don't have to reset it.
    
    For now you could probably always leave the versioned flow with the
    lower value of 2. Once you are in prod, you bump it to 4 until the
    next upgrade is available. You then revert the local changes, do the
    upgrade, and put it back to 4. It's not ideal, because it shows a
    local change the entire time.
    
    I don't think there is much you can do differently right now, but I
    think this is a valid case to create a JIRA for.
    
    Thanks,
    
    Bryan
    



Re: setting processor concurrency based on the development/production environment

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

Glad you are having success with NiFi + NiFi Registry!

You brought up an interesting point about the concurrent tasks...

I think we may want to consider making the concurrent tasks work
similar to variables, in that we capture the concurrent tasks that the
flow was developed with and would use it initially, but then if you
have modified this value in the target environment it would not
trigger a local change and would be retained across upgrades so that
you don't have to reset it.
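
To make the proposal concrete, here is a rough Python sketch of how such
an upgrade could treat concurrent tasks. It only illustrates the idea and
is not how NiFi behaves today; the function name and the plain
dictionaries (processor name -> concurrent tasks) are hypothetical.

```python
def concurrency_after_upgrade(deployed, local, incoming):
    """Pick concurrent-task values when moving to a new flow version.

    deployed: values from the flow version currently checked out
    local:    values actually configured in this environment
    incoming: values from the flow version being upgraded to
    """
    result = {}
    for name, new_value in incoming.items():
        if name in local and local[name] != deployed.get(name):
            # Tuned in this environment: keep the local value.
            result[name] = local[name]
        else:
            # Untouched locally (or a new processor): take the versioned value.
            result[name] = new_value
    return result


deployed = {"FetchTable": 2, "ConvertRecord": 2}
local = {"FetchTable": 8, "ConvertRecord": 2}  # FetchTable was tuned in prod
incoming = {"FetchTable": 4, "ConvertRecord": 4, "PutHDFS": 2}

print(concurrency_after_upgrade(deployed, local, incoming))
# {'FetchTable': 8, 'ConvertRecord': 4, 'PutHDFS': 2}
```

The value tuned in prod survives the upgrade, while processors nobody
touched locally simply follow the versioned flow.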

For now you could probably always leave the versioned flow with the
lower value of 2. Once you are in prod, you bump it to 4 until the
next upgrade is available. You then revert the local changes, do the
upgrade, and put it back to 4. It's not ideal, because it shows a
local change the entire time.

I don't think there is much you can do differently right now, but I
think this is a valid case to create a JIRA for.

Thanks,

Bryan

On Thu, Mar 1, 2018 at 11:29 AM, Boris Tyukin <bo...@boristyukin.com> wrote:
> Hello NiFi community,
>
> started using NiFi recently and fell in love with it! We run 1.6 NiFi alone
> with new NiFi registry and I am trying to figure out how to promote NiFi
> flow, created in VM environment to our cluster.
>
> One of the things is "Concurrent Tasks" processor parameter. I bump it to 2
> or 4 for some processors in my flow, when I develop / test it in VM.
>
> Then we deploy this flow to a beefy cluster node (with 48 cores) and want to
> change concurrency to let's say 8 or 10 or 12 for some processors.
>
> Then I work on a new version/make some changes in my VM, and need to be more
> shy with concurrency so set it back to 2 or 4.
>
> Then the story repeats...
>
> Is there a better way than to manually set this parameter? I do not believe
> I can use a variable there and have to type the actual number of tasks.
>
>
> Thanks
> Boris