Posted to dev@nifi.apache.org by Andre <an...@fucs.org> on 2017/11/15 09:58:04 UTC

The mystery of ListSFTP that stops working

Folks,

Has anyone ever seen a situation where ListSFTP simply stops working?

I have observed a few occurrences of what seems to be some weird bug.

Symptoms are:

- Processor looks healthy (i.e. no bulletins)
- State is stalled (i.e. no changes to new files or timestamps)
- No signs of a stuck thread (i.e. processor doesn't display a thread count)
- UI displays processor "Task count = 0/0"
- Neither stopping > starting nor stopping > disabling > enabling >
starting processor seems to make a difference.

Has anyone seen this?

Cheers

Re: The mystery of ListSFTP that stops working

Posted by kapkbr <ka...@gmail.com>.
After some more debugging, I have observed that NiFi messes up its ZK
(ZooKeeper) nodes after many stops and starts. I pointed NiFi to a newly
created path in ZK and everything started working.
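
For anyone wanting to try the same workaround, here is a minimal Python sketch of the change. It assumes the path in question is the `nifi.zookeeper.root.node` entry in nifi.properties, which the message above does not actually confirm; the function name and the `/nifi-fresh` path are made up for illustration. If the `Root Node` property in state-management.xml is also in use, it would presumably need the same treatment and is not handled here.

```python
def set_zk_root_node(properties_text: str, new_root: str) -> str:
    """Return nifi.properties text with nifi.zookeeper.root.node set to new_root."""
    key = "nifi.zookeeper.root.node"
    out = []
    replaced = False
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Only touch the active (uncommented) assignment of the key.
        if not stripped.startswith("#") and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={new_root}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        # Key was absent: append it rather than silently doing nothing.
        out.append(f"{key}={new_root}")
    return "\n".join(out) + "\n"


# Hypothetical sample; host names and paths are placeholders.
sample = (
    "nifi.zookeeper.connect.string=zk1:2181,zk2:2181\n"
    "nifi.zookeeper.root.node=/nifi\n"
)
print(set_zk_root_node(sample, "/nifi-fresh"), end="")
```

Every node in the cluster would need the same value, and NiFi has to be restarted for it to take effect.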



--
Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/

Re: The mystery of ListSFTP that stops working

Posted by kapkbr <ka...@gmail.com>.
I am seeing a similar problem. It shows both Tasks/Duration as 0 (Screenshot 1).
It shows the status as started, but appears to have never spawned a task. The
configuration of the task is:

Timer driven
Run Schedule 0
Primary node only
Yield 1 sec
penalty 30 sec

I have tried changing the yield and other settings, but nothing seems to work.

Here are some more strange things:
1) No message or error in the log except the following message:
[StandardProcessScheduler Thread-6] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled ListFile[id=cafa1842-015f-1000-ffff-ffffee80ecf2] to run with 1 threads
2) If I change it to run on all nodes, it just works. But that's not what I
want (it will create duplicates if I do).
3) It is not only this processor; all processors configured to run on the
primary node behave the same way :(
4) I have another environment with the same configuration, and it works fine.
5) I thought this could be caused by stuck threads, but the problem remains
the same after a cluster restart.

Any idea where to look for this kind of problem? 
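
One way to check whether the scheduling agent is at least registering the processor on each node is to search every node's nifi-app.log for the message quoted in item 1. A minimal Python sketch, assuming the message format matches that line exactly (other NiFi versions may word it differently); the function name is made up for illustration.

```python
import re

# Pattern modelled on the single log line quoted in this message.
SCHEDULED = re.compile(
    r"Scheduled (?P<type>\w+)\[id=(?P<id>[0-9a-f-]+)\] "
    r"to run with (?P<threads>\d+) threads"
)


def scheduled_processors(log_text: str) -> list[tuple[str, str, int]]:
    """Return (processor type, id, thread count) for each scheduling message."""
    return [
        (m["type"], m["id"], int(m["threads"]))
        for m in SCHEDULED.finditer(log_text)
    ]


# The log line from item 1, joined back onto one line.
sample_log = (
    "[StandardProcessScheduler Thread-6] o.a.n.c.s.TimerDrivenSchedulingAgent "
    "Scheduled ListFile[id=cafa1842-015f-1000-ffff-ffffee80ecf2] "
    "to run with 1 threads\n"
)
print(scheduled_processors(sample_log))
```

If the message shows up on the primary node while the task count stays at 0, that would suggest the processor was handed to the scheduler and the problem lies somewhere after that step.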

<http://apache-nifi-developer-list.39713.n7.nabble.com/file/t841/screenshot1.png> 

-Kap





Re: The mystery of ListSFTP that stops working

Posted by Andre <an...@fucs.org>.
Hey Mark,

Changing the scheduling doesn't seem to make a difference.

I have noticed some instances where a nifi.sh dump causes a switch of the
primary node, and that frees the stuck processor, but it is still not clear
why it happens.
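
Since a primary node switch seems to free the processor, it may help to record which node holds the role before and after running the dump. A minimal Python sketch, assuming the JSON shape that `GET /nifi-api/controller/cluster` is understood to return (a `cluster.nodes` list whose entries carry `address`, `apiPort` and `roles`); this should be verified against the NiFi version in use. The function name and the sample payload are hypothetical.

```python
import json


def primary_nodes(cluster_json: str) -> list[str]:
    """Return 'address:port' for every node reporting the 'Primary Node' role."""
    nodes = json.loads(cluster_json).get("cluster", {}).get("nodes", [])
    return [
        f"{n.get('address')}:{n.get('apiPort')}"
        for n in nodes
        if "Primary Node" in (n.get("roles") or [])
    ]


# Hypothetical payload with made-up host names, trimmed to the fields used.
sample = json.dumps({
    "cluster": {
        "nodes": [
            {"address": "nifi-1", "apiPort": 8080, "roles": ["Cluster Coordinator"]},
            {"address": "nifi-2", "apiPort": 8080, "roles": ["Primary Node"]},
            {"address": "nifi-3", "apiPort": 8080, "roles": []},
        ]
    }
})
print(primary_nodes(sample))
```

Comparing the result taken just before and just after the dump would show whether the role really moved at the moment the processor resumed.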

I took a thread dump of the primary node while the thing was happening.
Will upload to gist soon.

Cheers



On Thu, Nov 16, 2017 at 2:07 AM, Mark Payne <ma...@hotmail.com> wrote:

> Andre,
>
> So I am guessing that backpressure is not an issue then if you're not
> seeing it run :)
> Have you tried reducing the scheduling period from 5 mins to something
> like 5 seconds?
> Of course, you may not want to actually be running it every 5 seconds in a
> production environment,
> but I am curious if it would cause it to start running or not...

Re: The mystery of ListSFTP that stops working

Posted by Mark Payne <ma...@hotmail.com>.
Andre,

So I am guessing that backpressure is not an issue then if you're not seeing it run :)
Have you tried reducing the scheduling period from 5 mins to something like 5 seconds?
Of course, you may not want to actually be running it every 5 seconds in a production environment,
but I am curious if it would cause it to start running or not...

> On Nov 15, 2017, at 10:02 AM, Andre <an...@fucs.org> wrote:
> 
> Mark,
> 
> Timer driven, Primary Node, 5 min
> Yield is set to 1 sec
> Backpressure = 10k flows or 1GB
> nifi.administrative.yield.duration=30 sec
> 
> Cheers


Re: The mystery of ListSFTP that stops working

Posted by Andre <an...@fucs.org>.
Mark,

Timer driven, Primary Node, 5 min
Yield is set to 1 sec
Backpressure = 10k flows or 1GB
nifi.administrative.yield.duration=30 sec

Cheers


On Thu, Nov 16, 2017 at 1:32 AM, Mark Payne <ma...@hotmail.com> wrote:

> Hey Andre,
>
> I have not seen this personally. Can you share how you have the Scheduling
> Tab configured?
> Is it set to Timer-Driven with a period of "0 secs"? Also, what is the
> Yield Duration set to?
> Is there any backpressure configured on any of the outbound connections?
> Additionally, what is the value of the "nifi.administrative.yield.duration"
> property in nifi.properties?
>
> Sorry - I know that's a barrage of questions. Hopefully it's something
> easy that's just being overlooked, though :)
>
> -Mark

Re: The mystery of ListSFTP that stops working

Posted by Mark Payne <ma...@hotmail.com>.
Hey Andre,

I have not seen this personally. Can you share how you have the Scheduling Tab configured?
Is it set to Timer-Driven with a period of "0 secs"? Also, what is the Yield Duration set to?
Is there any backpressure configured on any of the outbound connections?
Additionally, what is the value of the "nifi.administrative.yield.duration" property in nifi.properties?

Sorry - I know that's a barrage of questions. Hopefully it's something easy that's just being overlooked, though :)

-Mark

> On Nov 15, 2017, at 4:58 AM, Andre <an...@fucs.org> wrote:
> 
> Folks,
> 
> Has anyone ever seen a situation where ListSFTP simply stops working?
> 
> I have observed a few occurrences of what seems to be some weird bug.
> 
> Symptoms are:
> 
> - Processor looks healthy (i.e. no bulletins)
> - State is stalled (i.e. no changes to new files or timestamps)
> - No signs of a stuck thread (i.e. processor doesn't display a thread count)
> - UI displays processor "Task count = 0/0"
> - Neither stopping > starting nor stopping > disabling > enabling >
> starting processor seems to make a difference.
> 
> Has anyone seen this?
> 
> Cheers


Re: The mystery of ListSFTP that stops working

Posted by ru...@dbiq.nl.
Same behavior seen with a PutHBase processor.
A NiFi restart was the only fix.
I would like to see the root cause and a way to regain control of the processor without a NiFi restart.


> On 15 Nov 2017, at 10:58, Andre <an...@fucs.org> wrote:
> 
> Folks,
> 
> Has anyone ever seen a situation where ListSFTP simply stops working?
> 
> I have observed a few occurrences of what seems to be some weird bug.
> 
> Symptoms are:
> 
> - Processor looks healthy (i.e. no bulletins)
> - State is stalled (i.e. no changes to new files or timestamps)
> - No signs of a stuck thread (i.e. processor doesn't display a thread count)
> - UI displays processor "Task count = 0/0"
> - Neither stopping > starting nor stopping > disabling > enabling >
> starting processor seems to make a difference.
> 
> Has anyone seen this?
> 
> Cheers