Posted to dev@nifi.apache.org by Oleg Zhurakousky <oz...@hortonworks.com> on 2015/11/17 01:39:45 UTC

Common scheduler and ad hoc thread creation

Guys

I am noticing many modules where we have things like "new Thread(..).start()", creation of new executors and schedulers, Thread.sleep(..), etc. I am sure many would agree that taking such liberties with Threads will have consequences (not IF but WHEN).
In several threads, several of us have mentioned a "must read" for anyone who is getting into concurrent code - http://ptgmedia.pearsoncmg.com/images/9780321349606/samplepages/9780321349606.pdf - and indeed we can and should grab some best practices from this book.

At the very least, we can start with this: what’s our strategy around thread management for NAR developers? Basically, should or should not a user create Threads, Executors, Schedulers, etc.?

Cheers
Oleg

Re: Common scheduler and ad hoc thread creation

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.

The only one I've seen is the catch-22 where you need to shut down an executor from a task executed by that same executor. In that case you have no choice but to create a Thread and start it.
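
A minimal sketch of that catch-22, in plain Java rather than NiFi code. The names SelfShutdownSketch and runAndShutDown are made up for illustration. A task may call shutdown() on its own executor, but it cannot wait for termination, because the executor only terminates once that task has returned. The waiting is therefore handed to a short-lived helper thread:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not taken from the NiFi code base.
class SelfShutdownSketch {

    // Returns true if the executor terminated after one of its own tasks
    // requested the shutdown.
    static boolean runAndShutDown() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch stopperDone = new CountDownLatch(1);

        executor.execute(() -> {
            // Waiting for termination here would mean waiting for this very
            // task to finish, so the wait is delegated to a helper thread.
            Thread stopper = new Thread(() -> {
                executor.shutdown(); // stop accepting new tasks
                try {
                    executor.awaitTermination(5, TimeUnit.SECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    stopperDone.countDown();
                }
            }, "executor-stopper");
            stopper.start();
        });

        try {
            stopperDone.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executor.isTerminated();
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + runAndShutDown());
    }
}
```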

> On Nov 16, 2015, at 20:33, Matt Burgess <ma...@gmail.com> wrote:
> 
> Is there a common use case for a single extra thread? I'm asking in ignorance but if the answer is yes then I'd think there'd be best practices around that?
> 
> Regards,
> Matt

Re: Common scheduler and ad hoc thread creation

Posted by Matt Burgess <ma...@gmail.com>.

Is there a common use case for a single extra thread? I'm asking in ignorance, but if the answer is yes, then I'd think there would be best practices around that.

Regards,
Matt

> On Nov 16, 2015, at 8:22 PM, Tony Kurc <tr...@gmail.com> wrote:
> 
> so, I believe threads in a processor in nifi are much, much easier than
> general threading in many other applications. There are defined boundaries
> on when a processor is built and torn down. Pretty much any state in the
> middle is up to the processor. you know when resources need to be stood up.
> you know when they need to be torn down.
> 
> Because threads have a localized scope, I'm not sure a global pool would be
> a help. If a processor needs higher throughput or shorter latency, now, the
> problem is generally isolated and there is a nice little cream center to
> optimize. If you're blocked on a global pool of threads because some other
> processor consumed all the threads in a pool, well, suddenly, your
> performance is no longer a localized problem.
> 
> because the common case is "don't use threads" (not everyone is going to
> build a complex service, contribute to the core framework or need threads
> in their processor) I actually think code review is a good way to shake out
> some poor decisions. because optimizing the threads in a processor for a
> use case a specialized task (the processor writer knows the critical
> sections and bottlenecks), I'm not sure whether there are massive strides
> that can be made, but I could be wrong. And we'll always have a weird edge
> case of some library that wants to do threads its own way that we're trying
> to integrate.
> 
> My guess is a lot of the behavior you mention above are because at the
> moment, performance isn't needed in that part of code and it was simpler
> for the author. Or its a bug!

Re: Common scheduler and ad hoc thread creation

Posted by Joe Witt <jo...@gmail.com>.

So back in the day...

Here is the thought process behind how it works today, at a high level
and with some generalizations.  Developers of extensions, and that
primarily means processors, begin process sessions.  In a process
session a processor can access, create, or destroy zero or more flow
files and route them to relationships.  Processors do not dictate how
often they run or when they run.  The Flow Controller does that.  When
it decides to invoke them it does so by calling the appropriate method.
The thread given in that call is the thread they can use to operate on
that process session.  When they're done with that session they should
be well-behaved and give the thread back to the controller.  That is
it.  They have no control over threads because generally they don't
need them.
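
As a rough illustration of the model described above, here is a plain-Java sketch.  SketchSession, SketchComponent and ControllerSketch are hypothetical stand-ins and are NOT the NiFi interfaces; the only point is that the controller owns the pool and lends a thread to the component for the duration of one call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a process session.
interface SketchSession {
    void route(String item, String relationship);
}

// Hypothetical stand-in for a processor: it never creates threads itself.
interface SketchComponent {
    // Called by the controller on a controller-owned thread.
    void onTrigger(SketchSession session);
}

class ControllerSketch {
    // The controller, not the component, owns the threads.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Invokes the component once on a pool thread and returns what it routed.
    List<String> trigger(SketchComponent component) {
        List<String> routed = Collections.synchronizedList(new ArrayList<>());
        SketchSession session = (item, relationship) -> routed.add(relationship + ":" + item);
        Future<?> done = pool.submit(() -> component.onTrigger(session));
        try {
            // Once onTrigger returns, the thread goes back to the pool.
            done.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("component failed", e);
        }
        return routed;
    }

    void stop() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        ControllerSketch controller = new ControllerSketch();
        System.out.println(controller.trigger(s -> s.route("hello", "success")));
        controller.stop();
    }
}
```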

Now, some processors are special and they may be written by a
developer that needs greater control of their own threading model,
like web servers for instance.  That is ok but it is also outside of
what is described above.  It is really 'in addition to' what is
described above.  The framework supported path for dealing with
FlowFiles (which is what NiFi is for) is only as above.  It is 'ok'
for these special cases but so far nothing practical has risen to the
level of it needing a framework resolution.  There have been glimmers
but nothing that has really been shown to need a resolution as far as
threading goes.  We've considered having different managed thread
pools and then operators could assign a given component on the flow to
those pools.  This way they can preserve a pool for 'sources' vs
'mid-stream' vs 'delivery' processors for example.  Again, this never
reached the level of needing a framework solution.

There have also been cases where folks want to have processors operate
and they do not do *anything* with FlowFiles at all.  These are for
what is known as the 'NiFi-As-A-Fancy-Cron' tool pattern.  We don't
need to support this one.

Now I can definitely conceive of ways to build processors or flows
which will create difficulty in NiFi.  I am ok with that personally.

Thanks
Joe



Re: Common scheduler and ad hoc thread creation

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.

Tony, thanks for your input. At least we have some discussion going. See inline for the rest.

> On Nov 16, 2015, at 8:22 PM, Tony Kurc <tr...@gmail.com> wrote:
> 
> so, I believe threads in a processor in nifi are much, much easier than
> general threading in many other applications. There are defined boundaries
> on when a processor is built and torn down. Pretty much any state in the
> middle is up to the processor. you know when resources need to be stood up.
> you know when they need to be torn down.
Generally true, and I’d agree there is not much one can do to stop users from doing what they want to do, regardless of how damaging it may be to the rest of the system.
> 
> Because threads have a localized scope, I'm not sure a global pool would be
> a help. If a processor needs higher throughput or shorter latency, now, the
> problem is generally isolated and there is a nice little cream center to
> optimize. If you're blocked on a global pool of threads because some other
> processor consumed all the threads in a pool, well, suddenly, your
> performance is no longer a localized problem.
> 
This argument is argumentative ;)
1. What if I’ve saturated all my cores in my localized Processor’s thread pool with things like while (true){}? Then it really doesn’t matter what the rest of the framework does; the system is hosed. So the blockage in this case comes from, let’s just call it, a malicious processor and not from a global thread pool. So, in the end, it’s a bit of a general discipline question ;)
2. So in this case one of the best practices could be taken right from Brian Goetz’s book, which states that tasks should be as short-lived as possible. Any repeats and retries should be handled by rerunning/rescheduling a task instead of spinning in a loop inside the task. So with a global Scheduler, exposed via the context or something else that each Processor, Service, etc. sees, we can have a shared thread pool. We can even have ControllerServices act as thread pools.
Yes, that would take some serious code review and general discipline from the developers, but the benefit would be proportional as well.
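
To make point 2 concrete, here is a minimal sketch of "reschedule instead of spin", using only java.util.concurrent. RescheduleSketch and retry are hypothetical names, and the scheduler passed in merely stands in for whatever shared scheduler the framework might expose. Each attempt is a short task; on failure it schedules a new task for later and returns, so no pool thread sleeps or loops between attempts.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not an existing NiFi API.
class RescheduleSketch {

    // Completes with the attempt number that succeeded, or exceptionally
    // once maxAttempts attempts have failed.
    static CompletableFuture<Integer> retry(ScheduledExecutorService scheduler,
                                            BooleanSupplier attempt,
                                            int maxAttempts,
                                            long delayMillis) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        scheduler.execute(() -> runAttempt(scheduler, attempt, 1, maxAttempts, delayMillis, result));
        return result;
    }

    private static void runAttempt(ScheduledExecutorService scheduler,
                                   BooleanSupplier attempt,
                                   int attemptNo,
                                   int maxAttempts,
                                   long delayMillis,
                                   CompletableFuture<Integer> result) {
        try {
            if (attempt.getAsBoolean()) {
                result.complete(attemptNo);
            } else if (attemptNo >= maxAttempts) {
                result.completeExceptionally(
                        new IllegalStateException("gave up after " + attemptNo + " attempts"));
            } else {
                // The current task ends here; the retry is a new task scheduled for later.
                scheduler.schedule(
                        () -> runAttempt(scheduler, attempt, attemptNo + 1, maxAttempts, delayMillis, result),
                        delayMillis, TimeUnit.MILLISECONDS);
            }
        } catch (RuntimeException e) {
            result.completeExceptionally(e);
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        // The attempt succeeds on its third call.
        int attempts = retry(scheduler, () -> calls.incrementAndGet() == 3, 5, 10L).join();
        System.out.println("succeeded on attempt " + attempts);
        scheduler.shutdown();
    }
}
```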

> because the common case is "don't use threads" (not everyone is going to
> build a complex service, contribute to the core framework or need threads
> in their processor) I actually think code review is a good way to shake out
> some poor decisions. because optimizing the threads in a processor for a
> use case a specialized task (the processor writer knows the critical
> sections and bottlenecks), I'm not sure whether there are massive strides
> that can be made, but I could be wrong. And we'll always have a weird edge
> case of some library that wants to do threads its own way that we're trying
> to integrate.
> 
> My guess is a lot of the behavior you mention above are because at the
> moment, performance isn't needed in that part of code and it was simpler
> for the author. Or its a bug!
I would probably use the "performance isn't needed" argument too, but in a hypothetical world of thousands of processors, each creating Threads, the so-called 'simplicity' could manifest itself as a bug.

I don’t want to generalize too much at the moment, as it is much easier to discuss a concrete case (we have plenty). But I really wanted to get a discussion going on this, as I am still studying the code base.

Cheers
Oleg


Re: Common scheduler and ad hoc thread creation

Posted by Tony Kurc <tr...@gmail.com>.

So, I believe threads in a processor in NiFi are much, much easier than
general threading in many other applications. There are defined boundaries
on when a processor is built and torn down. Pretty much any state in the
middle is up to the processor. You know when resources need to be stood up.
You know when they need to be torn down.

Because threads have a localized scope, I'm not sure a global pool would be
a help. If a processor needs higher throughput or shorter latency, now, the
problem is generally isolated and there is a nice little cream center to
optimize. If you're blocked on a global pool of threads because some other
processor consumed all the threads in a pool, well, suddenly, your
performance is no longer a localized problem.
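
A small sketch of that last scenario, in plain Java with hypothetical names (SharedPoolSketch, demo). The "global" pool is deliberately given a single thread so the effect is deterministic: "processor A" holds the only thread, and the trivial task from "processor B" cannot run until A lets go.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not taken from the NiFi code base.
class SharedPoolSketch {

    // Returns { B finished while A held the thread, B finished after A released it }.
    static boolean[] demo() {
        ExecutorService shared = Executors.newFixedThreadPool(1); // tiny "global" pool
        CountDownLatch release = new CountDownLatch(1);
        try {
            // "Processor A" occupies the only thread until released.
            shared.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // "Processor B" has trivial work, but it is queued behind A.
            Future<String> b = shared.submit(() -> "B done");
            boolean doneWhileBlocked = b.isDone();

            release.countDown(); // A finishes, the thread becomes free
            boolean doneAfterRelease = "B done".equals(b.get(5, TimeUnit.SECONDS));
            return new boolean[] { doneWhileBlocked, doneAfterRelease };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            shared.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("B done while A blocked: " + r[0] + ", after release: " + r[1]);
    }
}
```

With a pool owned by B's processor, the same task would not depend on A at all, which is the localization argument made above.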

Because the common case is "don't use threads" (not everyone is going to
build a complex service, contribute to the core framework or need threads
in their processor), I actually think code review is a good way to shake out
some poor decisions. Because optimizing the threads in a processor for a
use case is a specialized task (the processor writer knows the critical
sections and bottlenecks), I'm not sure whether there are massive strides
that can be made, but I could be wrong. And we'll always have a weird edge
case of some library that wants to do threads its own way that we're trying
to integrate.

My guess is a lot of the behavior you mention above is because, at the
moment, performance isn't needed in that part of the code and it was simpler
for the author. Or it's a bug!

On Mon, Nov 16, 2015 at 8:01 PM, Oleg Zhurakousky <
ozhurakousky@hortonworks.com> wrote:

> Taking liberties - so let me throw few example. I am sure you’d agree that
> Thread creation and management is an expensive and hard and error prone,
> hence new java.util.concurrent and all the goodies in it.
> - There is a patch currently in the queue where there is a creation of new
> Thread() and then starting it. Is it necessary? Could we reuse the thread
> from the common pool?
> - We have many places where we have Thread.sleep(..) and in fact do sleep
> considerable amount of time. That thread lays dormant where it could
> actually be doing something. Is it necessary?
>
> Cheers
> Oleg

Re: Common scheduler and ad hoc thread creation

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.

Taking liberties - so let me throw out a few examples. I am sure you’d agree that Thread creation and management is expensive, hard and error-prone, hence java.util.concurrent and all the goodies in it.
- There is a patch currently in the queue that creates a new Thread() and then starts it. Is it necessary? Could we reuse a thread from the common pool?
- We have many places where we have Thread.sleep(..) and do in fact sleep for a considerable amount of time. That thread lies dormant when it could actually be doing something. Is it necessary?
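
As a minimal sketch of the second point, here is a periodic job expressed with a scheduler instead of a loop around Thread.sleep(..). PollWithoutSleepSketch and pollTimes are hypothetical names, and the scheduler argument stands in for a shared pool. Between runs the pool thread is free for other tasks instead of lying dormant.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not taken from the NiFi code base.
class PollWithoutSleepSketch {

    // Runs 'poll' with a fixed delay between runs until it has run 'times' times,
    // then cancels the schedule. Returns how many times 'poll' actually ran.
    static int pollTimes(ScheduledExecutorService scheduler, Runnable poll,
                         int times, long periodMillis) {
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch finished = new CountDownLatch(times);

        ScheduledFuture<?> handle = scheduler.scheduleWithFixedDelay(() -> {
            if (runs.get() >= times) {
                return; // a run that slipped in before the cancel below
            }
            poll.run();
            runs.incrementAndGet();
            finished.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        handle.cancel(false);
        return runs.get();
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        int runs = pollTimes(scheduler, () -> System.out.println("polling"), 3, 50L);
        System.out.println("runs: " + runs);
        scheduler.shutdown();
    }
}
```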

Cheers
Oleg


> On Nov 16, 2015, at 7:52 PM, Tony Kurc <tr...@gmail.com> wrote:
> 
> the issue with a best practices guide on this subject is it will be
> dominated by edge cases. The common case should be "don't produce any
> threads".
> 
> That being said, I commented on a jira somewhere about LinkedBlockingQueues
> used in so many producer/consumer style processors and possibly needing a
> library to have some consistency in using those queues in a consistent
> thread safe manner.
> 
> Also, I'm not quite sure of what you mean by taking liberties?

Re: Common scheduler and ad hoc thread creation

Posted by Tony Kurc <tr...@gmail.com>.

The issue with a best practices guide on this subject is that it will be
dominated by edge cases. The common case should be "don't produce any
threads".

That being said, I commented on a JIRA somewhere about LinkedBlockingQueues
being used in so many producer/consumer style processors, and about possibly
needing a library so that those queues are used in a consistent,
thread-safe manner.
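
One possible shape for such a helper, as a sketch only. BoundedHandoff is a hypothetical class, not something that exists in NiFi; it wraps a bounded LinkedBlockingQueue so that every producer/consumer pair gets the same timeout, interrupt and batching behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not an existing NiFi class.
class BoundedHandoff<T> {
    private final LinkedBlockingQueue<T> queue;

    BoundedHandoff(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Producer side: waits up to timeoutMillis for space.
    // Returns false if the item was not accepted.
    boolean offer(T item, long timeoutMillis) {
        try {
            return queue.offer(item, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Consumer side: removes up to maxItems without blocking.
    List<T> drain(int maxItems) {
        List<T> batch = new ArrayList<>(maxItems);
        queue.drainTo(batch, maxItems);
        return batch;
    }

    int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        BoundedHandoff<String> handoff = new BoundedHandoff<>(2);
        handoff.offer("a", 10);
        handoff.offer("b", 10);
        System.out.println("third accepted: " + handoff.offer("c", 10));
        System.out.println("drained: " + handoff.drain(10));
    }
}
```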

Also, I'm not quite sure what you mean by taking liberties?

On Mon, Nov 16, 2015 at 7:39 PM, Oleg Zhurakousky <
ozhurakousky@hortonworks.com> wrote:

> Guys
>
> I am noticing many modules where we have things like "new
> Thread(..).start()”, creation of new executors and schedulers,
> Thread.sleep(..)  etc.,. I am sure many would agree that taking such
> liberties with Threads will have consequences (not IF but WHEN)
> On several threads several of us mentioned a “must read” for anyone who is
> getting into concurrent code -
> http://ptgmedia.pearsoncmg.com/images/9780321349606/samplepages/9780321349606.pdf
> and indeed we can/should definitely grab some best practices from this book.
>
> At least we can start from what’s our strategy around thread management
> for NAR developers? Basically should/should not a user create Threads,
> Executors, Schedulers etc.
>
> Cheers
> Oleg
>