Posted to users@nifi.apache.org by Emanuel Oliveira <em...@gmail.com> on 2020/02/07 21:43:47 UTC

upgrade flow with running components using state (the default provided by ZooKeeper)

Hi,

I wonder: is it possible to upgrade a PG flow to a new version when it
contains processors using state?
FYI, the new flow uses the exact same processors/versions; there are just
minor tweaks to some properties, etc.

Best Regards,
*Emanuel Oliveira*

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Emanuel Oliveira <em...@gmail.com>.
I see. So how easy is it to migrate the state of specific processors, and
how?
I guess using the ZK CLI? Or grabbing the state from the XML directly?

Best Regards,
*Emanuel Oliveira*



On Sat, Feb 8, 2020 at 6:36 PM Bryan Bende <bb...@gmail.com> wrote:

> There is a ZK migrator in the toolkit that can transfer all state from one
> ZK to another, for the scenario where you are moving everything to a
> new cluster.
>
> Other than that, state is not part of versioned flows because it is
> specific to the environment.

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Bryan Bende <bb...@gmail.com>.
There is a ZK migrator in the toolkit that can transfer all state from one
ZK to another, for the scenario where you are moving everything to a
new cluster.

Other than that, state is not part of versioned flows because it is
specific to the environment.
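
For readers who want to script the two-step migration mentioned above, here
is a minimal sketch that only builds the command lines and does not run them.
The flags (-r to export, -s to import, -z for the ZooKeeper connect string
plus path, -f for the file) are as I recall them from the NiFi Toolkit Guide,
so confirm them against the script's help output for your version. The host
names, script path, and file path are hypothetical. Because state is tied to
component UUIDs, migrated state is presumably only picked up by components
that have the same IDs in the destination.

```python
import shlex


def zk_migrator_commands(source_zk, dest_zk, export_file,
                         script="./bin/zk-migrator.sh"):
    """Return (export, import) argument lists for a two-step state migration.

    The script location and flags are assumptions; verify them with the
    toolkit's own help output before running anything.
    """
    # Step 1: read state out of the source ZooKeeper into a JSON file.
    export_cmd = [script, "-r", "-z", source_zk, "-f", export_file]
    # Step 2: send the contents of that file to the destination ZooKeeper.
    import_cmd = [script, "-s", "-z", dest_zk, "-f", export_file]
    return export_cmd, import_cmd


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    export_cmd, import_cmd = zk_migrator_commands(
        "zk-a.example.com:2181/nifi/components",
        "zk-b.example.com:2181/nifi/components",
        "/tmp/zk-state-export.json",
    )
    print(shlex.join(export_cmd))
    print(shlex.join(import_cmd))
```

Building the argument lists separately from executing them makes it easy to
review the exact commands before touching either ZooKeeper ensemble.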

On Sat, Feb 8, 2020 at 12:41 PM Emanuel Oliveira <em...@gmail.com> wrote:

> Great, good to know. Thanks Bryan, I am going to take a look at that ZK CLI
> for sure.
>
> One last question. It is good to know that deploying a new version of a PG
> flow via Registry keeps the state of processors (linked by their UUID).
> But what about deploying a flow that has already been running on cluster A
> into another cluster B? How can we copy/move the state of a processor from
> cluster A into cluster B?
>
> Best Regards,
> *Emanuel Oliveira*

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Emanuel Oliveira <em...@gmail.com>.
Great, good to know. Thanks Bryan, I am going to take a look at that ZK CLI
for sure.

One last question. It is good to know that deploying a new version of a PG
flow via Registry keeps the state of processors (linked by their UUID).
But what about deploying a flow that has already been running on cluster A
into another cluster B? How can we copy/move the state of a processor from
cluster A into cluster B?

Best Regards,
*Emanuel Oliveira*



On Sat, Feb 8, 2020 at 4:59 PM Bryan Bende <bb...@gmail.com> wrote:

> Yes, with Registry, components are upgraded in place, so their IDs do not
> change.
>
> The distributed cache was the old way of storing state in 0.x, before the
> internal state manager was introduced. It is still there to migrate state
> in the event someone upgrades from an old 0.x release, but it is not used
> otherwise and should be removed in 2.0.0.
>
> The state managers are defined in state-management.xml; the local one is a
> write-ahead log and the clustered one is ZooKeeper by default. You could
> use the ZK CLI to inspect what is stored.

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Bryan Bende <bb...@gmail.com>.
Yes, with Registry, components are upgraded in place, so their IDs do not
change.

The distributed cache was the old way of storing state in 0.x, before the
internal state manager was introduced. It is still there to migrate state
in the event someone upgrades from an old 0.x release, but it is not used
otherwise and should be removed in 2.0.0.

The state managers are defined in state-management.xml; the local one is a
write-ahead log and the clustered one is ZooKeeper by default. You could
use the ZK CLI to inspect what is stored.
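
To make the state-management.xml description concrete, here is a rough
sketch that reads the provider definitions and guesses which znode to look at
with the ZK CLI. The embedded XML is a trimmed sample modelled on a default
install, not anyone's real config, and the "<Root Node>/components/<id>"
layout is my assumption about how the ZooKeeper provider arranges its nodes.
The connect string and the processor ID are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modelled on a default state-management.xml; yours may differ.
SAMPLE = """\
<stateManagement>
    <local-provider>
        <id>local-provider</id>
        <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
        <property name="Directory">./state/local</property>
    </local-provider>
    <cluster-provider>
        <id>zk-provider</id>
        <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
        <property name="Connect String">zk1.example.com:2181</property>
        <property name="Root Node">/nifi</property>
    </cluster-provider>
</stateManagement>
"""


def read_providers(xml_text):
    """Map each provider element tag to its id, class, and properties."""
    root = ET.fromstring(xml_text)
    providers = {}
    for element in root:
        providers[element.tag] = {
            "id": element.findtext("id"),
            "class": element.findtext("class"),
            "properties": {
                prop.get("name"): (prop.text or "")
                for prop in element.findall("property")
            },
        }
    return providers


def component_znode(providers, component_id):
    """Build the znode path for a component's cluster state (assumed layout)."""
    root_node = providers["cluster-provider"]["properties"]["Root Node"]
    return f"{root_node.rstrip('/')}/components/{component_id}"


if __name__ == "__main__":
    providers = read_providers(SAMPLE)
    for tag, info in providers.items():
        print(tag, "->", info["class"])
    # Hypothetical processor UUID, for illustration only.
    print(component_znode(providers, "0a1b2c3d-0170-1000-ffff-000000000000"))
```

The printed path is what you would pass to the ZK CLI to look at the node, if
the layout assumption holds for your version.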

On Sat, Feb 8, 2020 at 4:33 AM Emanuel Oliveira <em...@gmail.com> wrote:

> Yes Bryan, we developed a process to deploy from Registry using the NiFi
> REST API.
>
> I see, so state is physically tied to the processor's UUID.
> 1. When importing templates, the UUIDs change. Your suggestion hints that
> deploying the same PG from Registry, at the same or a newer version (where
> our stateful processor remains the same), via the REST API will keep the
> UUIDs across both deploys. Is that right?
>
> 2. Where do processor states get stored physically, at the cluster level
> and at the local level? I suppose processors internally use ZooKeeper to
> maintain state as well? Additionally, are plain "state" files synced
> between nodes, or are NiFi, ZooKeeper, or some other type of APIs being
> used?
>
> 3. Yesterday we had a flow using ListHDFS + FetchHDFS + PutS3, with
> ListHDFS using internal state management (that is, the property
> "Distributed Cache Service" is not set; I think this means the processor
> uses the default NiFi internal state system, which is managed/implemented
> by ZooKeeper?). Something strange happened: despite thousands of files
> being pulled and stored in S3, right-clicking ListHDFS showed an empty
> state. There were no keys/values in the list, even though the processor had
> been running for 2 days. Aren't we supposed to be able to inspect state?
> What could we do next time to troubleshoot this?
>
>
> Thanks in advance,
> Emanuel O.

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Emanuel Oliveira <em...@gmail.com>.
Yes Bryan, we developed a process to deploy from Registry using the NiFi
REST API.

I see, so state is physically tied to the processor's UUID.
1. When importing templates, the UUIDs change. Your suggestion hints that
deploying the same PG from Registry, at the same or a newer version (where
our stateful processor remains the same), via the REST API will keep the
UUIDs across both deploys. Is that right?

2. Where do processor states get stored physically, at the cluster level
and at the local level? I suppose processors internally use ZooKeeper to
maintain state as well? Additionally, are plain "state" files synced
between nodes, or are NiFi, ZooKeeper, or some other type of APIs being
used?

3. Yesterday we had a flow using ListHDFS + FetchHDFS + PutS3, with
ListHDFS using internal state management (that is, the property
"Distributed Cache Service" is not set; I think this means the processor
uses the default NiFi internal state system, which is managed/implemented
by ZooKeeper?). Something strange happened: despite thousands of files
being pulled and stored in S3, right-clicking ListHDFS showed an empty
state. There were no keys/values in the list, even though the processor had
been running for 2 days. Aren't we supposed to be able to inspect state?
What could we do next time to troubleshoot this?


Thanks in advance,
Emanuel O.


On Fri 7 Feb 2020, 21:54 Bryan Bende, <bb...@gmail.com> wrote:

> Hello,
>
> How are you upgrading the flow?
>
> If you mean using NiFi Registry and selecting Change Version to a new
> version, then yes it will retain state.
>
> Other than that, probably not because the state is tied to the UUID of
> the processor, so if you used templates or some other approach, you
> will likely get a new UUID for the processor in the new flow.
>
> Thanks,
>
> Bryan

Re: upgrade flow with running components using state (the default provided by ZooKeeper)

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

How are you upgrading the flow?

If you mean using NiFi Registry and selecting Change Version to a new
version, then yes it will retain state.

Other than that, probably not because the state is tied to the UUID of
the processor, so if you used templates or some other approach, you
will likely get a new UUID for the processor in the new flow.

Thanks,

Bryan
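
The point about state following the processor's UUID can be illustrated with
a toy model. This is not NiFi code: ToyStateStore, change_version, and
import_template are hypothetical names, and the state key and property values
are made up. It only shows why an in-place upgrade still finds the old state
while a freshly instantiated copy does not.

```python
import uuid


class ToyStateStore:
    """Stand-in for a state manager: state maps are looked up by component ID."""

    def __init__(self):
        self._state = {}

    def put(self, component_id, key, value):
        self._state.setdefault(component_id, {})[key] = value

    def get(self, component_id):
        # Unknown IDs simply have no state.
        return dict(self._state.get(component_id, {}))


def change_version(processor, new_properties):
    """In-place upgrade: properties change, the component ID is kept."""
    return {**processor, "properties": new_properties}


def import_template(processor):
    """New instantiation: same configuration, but a freshly generated ID."""
    return {**processor, "id": str(uuid.uuid4())}


if __name__ == "__main__":
    store = ToyStateStore()
    original = {"id": str(uuid.uuid4()), "properties": {"Directory": "/data/v1"}}
    store.put(original["id"], "last.listed", "2020-02-07T21:00:00Z")

    upgraded = change_version(original, {"Directory": "/data/v2"})
    copied = import_template(original)

    print("after change version:", store.get(upgraded["id"]))
    print("after template import:", store.get(copied["id"]))
```

The upgraded processor keeps its ID and therefore its stored entry, while the
copy gets a new ID and starts with an empty state map.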

On Fri, Feb 7, 2020 at 4:44 PM Emanuel Oliveira <em...@gmail.com> wrote:
>
> Hi,
>
> I wonder: is it possible to upgrade a PG flow to a new version when it contains processors using state?
> FYI, the new flow uses the exact same processors/versions; there are just minor tweaks to some properties, etc.
>
> Best Regards,
> Emanuel Oliveira
>