You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@trafficserver.apache.org by Fei Deng <du...@apache.org> on 2019/11/19 20:39:39 UTC

API Proposal: TSContDispatch

While PR#6103 (https://github.com/apache/trafficserver/pull/6103) solves
the problem of having the 60ms delay (caused by waiting in sleep), it also
creates an issue where the event loop ends up in a "dead loop" if the
scheduled event schedules itself with 0 timeout (see test code by scw00).
Here's what I have in mind that will fix the problem.


   1. New API "TSContDispatch", which will be basically the current
   "TSContSchedule" calls with 0 timeouts.
   2. When scheduling events/continuations using "TSContSchedule" API with
   0 timeouts, set flag to identify the event to be processed on the next
   event loop.
   3. New "LocalQueue" class to handle events that are supposed to be
   processed in the next event loop.

Here's the branch I'm working on right now, it shows an easy concept of the
"LocalQueue".
https://github.com/duke8253/trafficserver/tree/master-event_dispatch

Re: API Proposal: TSContDispatch

Posted by Alan Carroll <so...@verizonmedia.com.INVALID>.

Ah, missed some other stuff.

The bug we've seen is one that's not always visible, particularly on busy
systems, because the event loop never waits very long. However, in the
particular use case here, the systems are sufficiently lightly loaded and
the latency requirements sufficiently strict that ATS behavior is, from
their point of view, broken. I.e. there are large (10s of ms) delays on
systems with no load. Fei built a fix for this, which changed the behavior
of TSContSchedule, but there was blow back on that because of the behavior
change (see the PR for a long discussion). To fix the production issue and
satisfy requirements from other members of the community, Fei is proposing
splitting the API so there is a clear distinction between "immediate" and
"yield".  I had earlier suggested using a zero timeout for this, but that
got bonged. Therefore I think this is what we have to do to resolve the
issue.

On Wed, Nov 20, 2019 at 3:52 PM Alan Carroll <
solidwallofcode@verizonmedia.com> wrote:

> For event processing style programming, it's standard to have two event
> scheduling "styles" - an immediate "do as soon as possible", and a
> "scheduled" style which is guaranteed to unwind the event processing stack
> at least once. In another way, it's similar to "call" and "yield" in
> co-routines. I will say I'm not sure why TSContSchedule should need a flag
> - I'd prefer it always yields, and the new API be the immediate one. Part
> of this is to address scw00's concern that a naive plugin writer could
> schedule for 0 time in the future, presuming a yield, but in fact creating
> an infinite ("dead") loop. Separating the APIs both restores existing
> behavior before the 6103 PR and enables providing an API needed to fix a
> production problem.
>
> On Wed, Nov 20, 2019 at 10:37 AM Sudheer Vinukonda
> <su...@yahoo.com.invalid> wrote:
>
>> Hmm..Why’d the API need to differentiate the implementation details to
>> users? Alternately, why’d someone pick an API that may have a hole?
>>
>> I haven’t fully analyzed and understood the proposed changes, but having
>> two different API that only differ in how they are implemented under the
>> hood (and one of them with an apparent hole) doesn’t sound right to me.
>>
>> - Sudheer
>>
>>
>> > On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid>
>> wrote:
>> >
>> > Let me rephrase that, the new API behaves the same as the
>> TSContSchedule
>> > with 0 timeout after PR#6103, which will handle events as soon as
>> possible.
>> > While this is good for delays, it also causes the situation scw00
>> brought
>> > up (dead loop). And since there is no good way of differentiating this
>> > behavior with the original one where the event is put into the
>> > EventQueueExternal, and waiting for the next eventloop (with possible
>> 60ms
>> > delay if on the same thread). So I want to change the
>> > current TSContSchedule code to make sure that we are not suffering from
>> a
>> > dead loop, and also create a new API such that the new execute ASAP is
>> > retained.
>> >
>> >> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org>
>> wrote:
>> >>
>> >>
>> >>
>> >>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
>> >>>
>> >>> Forgot to mention that this change would restore the old behavior of
>> >>> TSContSchedule minus the delay and dead loop.
>> >>>
>> >>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org>
>> wrote:
>> >>>>
>> >>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
>> >> solves
>> >>>> the problem of having the 60ms delay (caused by waiting in sleep), it
>> >> also
>> >>>> creates an issue where the event loop ends up in a "dead loop" if the
>> >>>> scheduled event schedules itself with 0 timeout (see test code by
>> >> scw00).
>> >>>> Here's what I have in mind that will fix the problem.
>> >>>>
>> >>>>
>> >>>>  1. New API "TSContDispatch", which will be basically the current
>> >>>>  "TSContSchedule" calls with 0 timeouts.
>> >>
>> >>
>> >> Dumb question, why do we need a new API, if it behaves the same as a
>> call
>> >> to TSContSchedule with a timeout of 0?
>> >>
>> >> I’m getting pretty concerned about these changes, as I think we all
>> agree,
>> >> this is critical code, and it’s (as we are seeing) easy to break
>> things in
>> >> bad ways.
>> >>
>> >> — Leif
>> >>
>> >>>>  2. When scheduling events/continuations using "TSContSchedule" API
>> >> with
>> >>>>  0 timeouts, set flag to identify the event to be processed on the
>> next
>> >>>>  event loop.
>> >>>>  3. New "LocalQueue" class to handle events that are supposed to be
>> >>>>  processed in the next event loop.
>> >>>>
>> >>>> Here's the branch I'm working on right now, it shows an easy concept
>> of
>> >> the
>> >>>> "LocalQueue".
>> >>>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
>> >>>>
>> >>
>> >>
>>
>>

Re: API Proposal: TSContDispatch

Posted by Alan Carroll <so...@verizonmedia.com.INVALID>.

For event processing style programming, it's standard to have two event
scheduling "styles" - an immediate "do as soon as possible", and a
"scheduled" style which is guaranteed to unwind the event processing stack
at least once. In another way, it's similar to "call" and "yield" in
co-routines. I will say I'm not sure why TSContSchedule should need a flag
- I'd prefer it always yields, and the new API be the immediate one. Part
of this is to address scw00's concern that a naive plugin writer could
schedule for 0 time in the future, presuming a yield, but in fact creating
an infinite ("dead") loop. Separating the APIs both restores existing
behavior before the 6103 PR and enables providing an API needed to fix a
production problem.

On Wed, Nov 20, 2019 at 10:37 AM Sudheer Vinukonda
<su...@yahoo.com.invalid> wrote:

> Hmm..Why’d the API need to differentiate the implementation details to
> users? Alternately, why’d someone pick an API that may have a hole?
>
> I haven’t fully analyzed and understood the proposed changes, but having
> two different API that only differ in how they are implemented under the
> hood (and one of them with an apparent hole) doesn’t sound right to me.
>
> - Sudheer
>
>
> > On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid>
> wrote:
> >
> > Let me rephrase that, the new API behaves the same as the TSContSchedule
> > with 0 timeout after PR#6103, which will handle events as soon as
> possible.
> > While this is good for delays, it also causes the situation scw00 brought
> > up (dead loop). And since there is no good way of differentiating this
> > behavior with the original one where the event is put into the
> > EventQueueExternal, and waiting for the next eventloop (with possible
> 60ms
> > delay if on the same thread). So I want to change the
> > current TSContSchedule code to make sure that we are not suffering from a
> > dead loop, and also create a new API such that the new execute ASAP is
> > retained.
> >
> >> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org> wrote:
> >>
> >>
> >>
> >>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> >>>
> >>> Forgot to mention that this change would restore the old behavior of
> >>> TSContSchedule minus the delay and dead loop.
> >>>
> >>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
> >>>>
> >>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
> >> solves
> >>>> the problem of having the 60ms delay (caused by waiting in sleep), it
> >> also
> >>>> creates an issue where the event loop ends up in a "dead loop" if the
> >>>> scheduled event schedules itself with 0 timeout (see test code by
> >> scw00).
> >>>> Here's what I have in mind that will fix the problem.
> >>>>
> >>>>
> >>>>  1. New API "TSContDispatch", which will be basically the current
> >>>>  "TSContSchedule" calls with 0 timeouts.
> >>
> >>
> >> Dumb question, why do we need a new API, if it behaves the same as a
> call
> >> to TSContSchedule with a timeout of 0?
> >>
> >> I’m getting pretty concerned about these changes, as I think we all
> agree,
> >> this is critical code, and it’s (as we are seeing) easy to break things
> in
> >> bad ways.
> >>
> >> — Leif
> >>
> >>>>  2. When scheduling events/continuations using "TSContSchedule" API
> >> with
> >>>>  0 timeouts, set flag to identify the event to be processed on the
> next
> >>>>  event loop.
> >>>>  3. New "LocalQueue" class to handle events that are supposed to be
> >>>>  processed in the next event loop.
> >>>>
> >>>> Here's the branch I'm working on right now, it shows an easy concept
> of
> >> the
> >>>> "LocalQueue".
> >>>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
> >>>>
> >>
> >>
>
>

Re: API Proposal: TSContDispatch

Posted by Fei Deng <fe...@verizonmedia.com.INVALID>.

it's not a bug, it's how it was supposed to be from the beginning but
wasn't functioning correctly.

On Wed, Nov 20, 2019 at 2:27 PM Sudheer Vinukonda
<su...@yahoo.com.invalid> wrote:

>  >>> that will cause an issue for devs that are using
> schedule with 0 timeouts in their code to schedule itself over and over
> again (see test code in PR comment), which is bad programming on their part
> but it might happen and is a valid concern.
>
> So, isn't this a new bug introduced with PR# 6103?
> Either we care about the bug and fix it or if we don't (as you say, _bad
> programming_) then document it as a Gotcha/limitation, but, I still don't
> see why/how a new API is needed to fix a bug.
>
>     On Wednesday, November 20, 2019, 12:21:00 PM PST, Fei Deng
> <fe...@verizonmedia.com.invalid> wrote:
>
>  It's not a hole, it's how it should have been all along, but have not been
> working properly for a long time. schedule_imm is supposed to put things in
> queue and have the eventloop process them asap and not waiting for IO or
> other stuff, but instead it stays in queue till we wake up the thread
> either by signal from others, or waiting out the sleep time. With
> PR#6103,that's been addressed and also fixes the problem with the unwanted
> delays,
> but as scw00 pointed out, that will cause an issue for devs that are
> usingschedule with 0 timeouts in their code to schedule itself over and over
> again (see test code in PR comment), which is bad programming on their part
> but it might happen and is a valid concern. With the changes proposed
> here,the new API will handle that extreme use case where you want things to
> run
> asap without any kind of waiting, and make TSContSchedule calls with 0
> timeouts behave like it used to be (even though it isn't what it should
> be), which is put the event into a queue and wait for the next eventloop to
> handle them. And with the new changes I mocked up in the first email, it
> also addresses the part where wait time can only be efficiently calculated
> using the priority queue (EventQueue), but not the external queue
> (EventQueueExternal).
>
> On Wed, Nov 20, 2019 at 10:37 AM Sudheer Vinukonda
> <su...@yahoo.com.invalid> wrote:
>
> > Hmm..Why’d the API need to differentiate the implementation details to
> > users? Alternately, why’d someone pick an API that may have a hole?
> >
> > I haven’t fully analyzed and understood the proposed changes, but having
> > two different API that only differ in how they are implemented under the
> > hood (and one of them with an apparent hole) doesn’t sound right to me.
> >
> > - Sudheer
> >
> >
> > > On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid>
> > wrote:
> > >
> > > Let me rephrase that, the new API behaves the same as the
> TSContSchedule
> > > with 0 timeout after PR#6103, which will handle events as soon as
> > possible.
> > > While this is good for delays, it also causes the situation scw00
> brought
> > > up (dead loop). And since there is no good way of differentiating this
> > > behavior with the original one where the event is put into the
> > > EventQueueExternal, and waiting for the next eventloop (with possible
> > 60ms
> > > delay if on the same thread). So I want to change the
> > > current TSContSchedule code to make sure that we are not suffering
> from a
> > > dead loop, and also create a new API such that the new execute ASAP is
> > > retained.
> > >
> > >> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org>
> wrote:
> > >>
> > >>
> > >>
> > >>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> > >>>
> > >>> Forgot to mention that this change would restore the old behavior of
> > >>> TSContSchedule minus the delay and dead loop.
> > >>>
> > >>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org>
> wrote:
> > >>>>
> > >>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
> > >> solves
> > >>>> the problem of having the 60ms delay (caused by waiting in sleep),
> it
> > >> also
> > >>>> creates an issue where the event loop ends up in a "dead loop" if
> the
> > >>>> scheduled event schedules itself with 0 timeout (see test code by
> > >> scw00).
> > >>>> Here's what I have in mind that will fix the problem.
> > >>>>
> > >>>>
> > >>>>  1. New API "TSContDispatch", which will be basically the current
> > >>>>  "TSContSchedule" calls with 0 timeouts.
> > >>
> > >>
> > >> Dumb question, why do we need a new API, if it behaves the same as a
> > call
> > >> to TSContSchedule with a timeout of 0?
> > >>
> > >> I’m getting pretty concerned about these changes, as I think we all
> > agree,
> > >> this is critical code, and it’s (as we are seeing) easy to break
> things
> > in
> > >> bad ways.
> > >>
> > >> — Leif
> > >>
> > >>>>  2. When scheduling events/continuations using "TSContSchedule" API
> > >> with
> > >>>>  0 timeouts, set flag to identify the event to be processed on the
> > next
> > >>>>  event loop.
> > >>>>  3. New "LocalQueue" class to handle events that are supposed to be
> > >>>>  processed in the next event loop.
> > >>>>
> > >>>> Here's the branch I'm working on right now, it shows an easy concept
> > of
> > >> the
> > >>>> "LocalQueue".
> > >>>>
> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
> > >>>>
> > >>
> > >>
> >
> >

Re: API Proposal: TSContDispatch

Posted by Sudheer Vinukonda <su...@yahoo.com.INVALID>.

 >>> that will cause an issue for devs that are using
schedule with 0 timeouts in their code to schedule itself over and over
again (see test code in PR comment), which is bad programming on their part
but it might happen and is a valid concern. 

So, isn't this a new bug introduced with PR# 6103? 
Either we care about the bug and fix it or if we don't (as you say, _bad programming_) then document it as a Gotcha/limitation, but, I still don't see why/how a new API is needed to fix a bug.

    On Wednesday, November 20, 2019, 12:21:00 PM PST, Fei Deng <fe...@verizonmedia.com.invalid> wrote:  

 It's not a hole, it's how it should have been all along, but have not been
working properly for a long time. schedule_imm is supposed to put things in
queue and have the eventloop process them asap and not waiting for IO or
other stuff, but instead it stays in queue till we wake up the thread
either by signal from others, or waiting out the sleep time. With PR#6103,that's been addressed and also fixes the problem with the unwanted delays,
but as scw00 pointed out, that will cause an issue for devs that are usingschedule with 0 timeouts in their code to schedule itself over and over
again (see test code in PR comment), which is bad programming on their part
but it might happen and is a valid concern. With the changes proposed here,the new API will handle that extreme use case where you want things to run
asap without any kind of waiting, and make TSContSchedule calls with 0
timeouts behave like it used to be (even though it isn't what it should
be), which is put the event into a queue and wait for the next eventloop to
handle them. And with the new changes I mocked up in the first email, it
also addresses the part where wait time can only be efficiently calculated
using the priority queue (EventQueue), but not the external queue
(EventQueueExternal).

On Wed, Nov 20, 2019 at 10:37 AM Sudheer Vinukonda
<su...@yahoo.com.invalid> wrote:

> Hmm..Why’d the API need to differentiate the implementation details to
> users? Alternately, why’d someone pick an API that may have a hole?
>
> I haven’t fully analyzed and understood the proposed changes, but having
> two different API that only differ in how they are implemented under the
> hood (and one of them with an apparent hole) doesn’t sound right to me.
>
> - Sudheer
>
>
> > On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid>
> wrote:
> >
> > Let me rephrase that, the new API behaves the same as the TSContSchedule
> > with 0 timeout after PR#6103, which will handle events as soon as
> possible.
> > While this is good for delays, it also causes the situation scw00 brought
> > up (dead loop). And since there is no good way of differentiating this
> > behavior with the original one where the event is put into the
> > EventQueueExternal, and waiting for the next eventloop (with possible
> 60ms
> > delay if on the same thread). So I want to change the
> > current TSContSchedule code to make sure that we are not suffering from a
> > dead loop, and also create a new API such that the new execute ASAP is
> > retained.
> >
> >> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org> wrote:
> >>
> >>
> >>
> >>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> >>>
> >>> Forgot to mention that this change would restore the old behavior of
> >>> TSContSchedule minus the delay and dead loop.
> >>>
> >>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
> >>>>
> >>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
> >> solves
> >>>> the problem of having the 60ms delay (caused by waiting in sleep), it
> >> also
> >>>> creates an issue where the event loop ends up in a "dead loop" if the
> >>>> scheduled event schedules itself with 0 timeout (see test code by
> >> scw00).
> >>>> Here's what I have in mind that will fix the problem.
> >>>>
> >>>>
> >>>>  1. New API "TSContDispatch", which will be basically the current
> >>>>  "TSContSchedule" calls with 0 timeouts.
> >>
> >>
> >> Dumb question, why do we need a new API, if it behaves the same as a
> call
> >> to TSContSchedule with a timeout of 0?
> >>
> >> I’m getting pretty concerned about these changes, as I think we all
> agree,
> >> this is critical code, and it’s (as we are seeing) easy to break things
> in
> >> bad ways.
> >>
> >> — Leif
> >>
> >>>>  2. When scheduling events/continuations using "TSContSchedule" API
> >> with
> >>>>  0 timeouts, set flag to identify the event to be processed on the
> next
> >>>>  event loop.
> >>>>  3. New "LocalQueue" class to handle events that are supposed to be
> >>>>  processed in the next event loop.
> >>>>
> >>>> Here's the branch I'm working on right now, it shows an easy concept
> of
> >> the
> >>>> "LocalQueue".
> >>>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
> >>>>
> >>
> >>
>
>

Re: API Proposal: TSContDispatch

Posted by Fei Deng <fe...@verizonmedia.com.INVALID>.

It's not a hole, it's how it should have been all along, but have not been
working properly for a long time. schedule_imm is supposed to put things in
queue and have the eventloop process them asap and not waiting for IO or
other stuff, but instead it stays in queue till we wake up the thread
either by signal from others, or waiting out the sleep time. With PR#6103,
that's been addressed and also fixes the problem with the unwanted delays,
but as scw00 pointed out, that will cause an issue for devs that are using
schedule with 0 timeouts in their code to schedule itself over and over
again (see test code in PR comment), which is bad programming on their part
but it might happen and is a valid concern. With the changes proposed here,
the new API will handle that extreme use case where you want things to run
asap without any kind of waiting, and make TSContSchedule calls with 0
timeouts behave like it used to be (even though it isn't what it should
be), which is put the event into a queue and wait for the next eventloop to
handle them. And with the new changes I mocked up in the first email, it
also addresses the part where wait time can only be efficiently calculated
using the priority queue (EventQueue), but not the external queue
(EventQueueExternal).

On Wed, Nov 20, 2019 at 10:37 AM Sudheer Vinukonda
<su...@yahoo.com.invalid> wrote:

> Hmm..Why’d the API need to differentiate the implementation details to
> users? Alternately, why’d someone pick an API that may have a hole?
>
> I haven’t fully analyzed and understood the proposed changes, but having
> two different API that only differ in how they are implemented under the
> hood (and one of them with an apparent hole) doesn’t sound right to me.
>
> - Sudheer
>
>
> > On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid>
> wrote:
> >
> > Let me rephrase that, the new API behaves the same as the TSContSchedule
> > with 0 timeout after PR#6103, which will handle events as soon as
> possible.
> > While this is good for delays, it also causes the situation scw00 brought
> > up (dead loop). And since there is no good way of differentiating this
> > behavior with the original one where the event is put into the
> > EventQueueExternal, and waiting for the next eventloop (with possible
> 60ms
> > delay if on the same thread). So I want to change the
> > current TSContSchedule code to make sure that we are not suffering from a
> > dead loop, and also create a new API such that the new execute ASAP is
> > retained.
> >
> >> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org> wrote:
> >>
> >>
> >>
> >>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> >>>
> >>> Forgot to mention that this change would restore the old behavior of
> >>> TSContSchedule minus the delay and dead loop.
> >>>
> >>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
> >>>>
> >>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
> >> solves
> >>>> the problem of having the 60ms delay (caused by waiting in sleep), it
> >> also
> >>>> creates an issue where the event loop ends up in a "dead loop" if the
> >>>> scheduled event schedules itself with 0 timeout (see test code by
> >> scw00).
> >>>> Here's what I have in mind that will fix the problem.
> >>>>
> >>>>
> >>>>  1. New API "TSContDispatch", which will be basically the current
> >>>>  "TSContSchedule" calls with 0 timeouts.
> >>
> >>
> >> Dumb question, why do we need a new API, if it behaves the same as a
> call
> >> to TSContSchedule with a timeout of 0?
> >>
> >> I’m getting pretty concerned about these changes, as I think we all
> agree,
> >> this is critical code, and it’s (as we are seeing) easy to break things
> in
> >> bad ways.
> >>
> >> — Leif
> >>
> >>>>  2. When scheduling events/continuations using "TSContSchedule" API
> >> with
> >>>>  0 timeouts, set flag to identify the event to be processed on the
> next
> >>>>  event loop.
> >>>>  3. New "LocalQueue" class to handle events that are supposed to be
> >>>>  processed in the next event loop.
> >>>>
> >>>> Here's the branch I'm working on right now, it shows an easy concept
> of
> >> the
> >>>> "LocalQueue".
> >>>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
> >>>>
> >>
> >>
>
>

Re: API Proposal: TSContDispatch

Posted by Sudheer Vinukonda <su...@yahoo.com.INVALID>.

Hmm..Why’d the API need to differentiate the implementation details to users? Alternately, why’d someone pick an API that may have a hole?

I haven’t fully analyzed and understood the proposed changes, but having two different API that only differ in how they are implemented under the hood (and one of them with an apparent hole) doesn’t sound right to me.

- Sudheer


> On Nov 20, 2019, at 8:01 AM, Fei Deng <fe...@verizonmedia.com.invalid> wrote:
> 
> Let me rephrase that, the new API behaves the same as the TSContSchedule
> with 0 timeout after PR#6103, which will handle events as soon as possible.
> While this is good for delays, it also causes the situation scw00 brought
> up (dead loop). And since there is no good way of differentiating this
> behavior with the original one where the event is put into the
> EventQueueExternal, and waiting for the next eventloop (with possible 60ms
> delay if on the same thread). So I want to change the
> current TSContSchedule code to make sure that we are not suffering from a
> dead loop, and also create a new API such that the new execute ASAP is
> retained.
> 
>> On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org> wrote:
>> 
>> 
>> 
>>>> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
>>> 
>>> Forgot to mention that this change would restore the old behavior of
>>> TSContSchedule minus the delay and dead loop.
>>> 
>>>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
>>>> 
>>>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
>> solves
>>>> the problem of having the 60ms delay (caused by waiting in sleep), it
>> also
>>>> creates an issue where the event loop ends up in a "dead loop" if the
>>>> scheduled event schedules itself with 0 timeout (see test code by
>> scw00).
>>>> Here's what I have in mind that will fix the problem.
>>>> 
>>>> 
>>>>  1. New API "TSContDispatch", which will be basically the current
>>>>  "TSContSchedule" calls with 0 timeouts.
>> 
>> 
>> Dumb question, why do we need a new API, if it behaves the same as a call
>> to TSContSchedule with a timeout of 0?
>> 
>> I’m getting pretty concerned about these changes, as I think we all agree,
>> this is critical code, and it’s (as we are seeing) easy to break things in
>> bad ways.
>> 
>> — Leif
>> 
>>>>  2. When scheduling events/continuations using "TSContSchedule" API
>> with
>>>>  0 timeouts, set flag to identify the event to be processed on the next
>>>>  event loop.
>>>>  3. New "LocalQueue" class to handle events that are supposed to be
>>>>  processed in the next event loop.
>>>> 
>>>> Here's the branch I'm working on right now, it shows an easy concept of
>> the
>>>> "LocalQueue".
>>>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
>>>> 
>> 
>>

Re: API Proposal: TSContDispatch

Posted by Fei Deng <fe...@verizonmedia.com.INVALID>.

Let me rephrase that, the new API behaves the same as the TSContSchedule
with 0 timeout after PR#6103, which will handle events as soon as possible.
While this is good for delays, it also causes the situation scw00 brought
up (dead loop). And since there is no good way of differentiating this
behavior with the original one where the event is put into the
EventQueueExternal, and waiting for the next eventloop (with possible 60ms
delay if on the same thread). So I want to change the
current TSContSchedule code to make sure that we are not suffering from a
dead loop, and also create a new API such that the new execute ASAP is
retained.

On Tue, Nov 19, 2019 at 6:06 PM Leif Hedstrom <zw...@apache.org> wrote:

>
>
> > On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> >
> > Forgot to mention that this change would restore the old behavior of
> > TSContSchedule minus the delay and dead loop.
> >
> >> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
> >>
> >> While PR#6103 (https://github.com/apache/trafficserver/pull/6103)
> solves
> >> the problem of having the 60ms delay (caused by waiting in sleep), it
> also
> >> creates an issue where the event loop ends up in a "dead loop" if the
> >> scheduled event schedules itself with 0 timeout (see test code by
> scw00).
> >> Here's what I have in mind that will fix the problem.
> >>
> >>
> >>   1. New API "TSContDispatch", which will be basically the current
> >>   "TSContSchedule" calls with 0 timeouts.
>
>
> Dumb question, why do we need a new API, if it behaves the same as a call
> to TSContSchedule with a timeout of 0?
>
> I’m getting pretty concerned about these changes, as I think we all agree,
> this is critical code, and it’s (as we are seeing) easy to break things in
> bad ways.
>
> — Leif
>
> >>   2. When scheduling events/continuations using "TSContSchedule" API
> with
> >>   0 timeouts, set flag to identify the event to be processed on the next
> >>   event loop.
> >>   3. New "LocalQueue" class to handle events that are supposed to be
> >>   processed in the next event loop.
> >>
> >> Here's the branch I'm working on right now, it shows an easy concept of
> the
> >> "LocalQueue".
> >> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
> >>
>
>

Re: API Proposal: TSContDispatch

Posted by Leif Hedstrom <zw...@apache.org>.


> On Nov 20, 2019, at 05:52, Fei Deng <du...@apache.org> wrote:
> 
> Forgot to mention that this change would restore the old behavior of
> TSContSchedule minus the delay and dead loop.
> 
>> On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:
>> 
>> While PR#6103 (https://github.com/apache/trafficserver/pull/6103) solves
>> the problem of having the 60ms delay (caused by waiting in sleep), it also
>> creates an issue where the event loop ends up in a "dead loop" if the
>> scheduled event schedules itself with 0 timeout (see test code by scw00).
>> Here's what I have in mind that will fix the problem.
>> 
>> 
>>   1. New API "TSContDispatch", which will be basically the current
>>   "TSContSchedule" calls with 0 timeouts.


Dumb question, why do we need a new API, if it behaves the same as a call to TSContSchedule with a timeout of 0?

I’m getting pretty concerned about these changes, as I think we all agree, this is critical code, and it’s (as we are seeing) easy to break things in bad ways.

— Leif 

>>   2. When scheduling events/continuations using "TSContSchedule" API with
>>   0 timeouts, set flag to identify the event to be processed on the next
>>   event loop.
>>   3. New "LocalQueue" class to handle events that are supposed to be
>>   processed in the next event loop.
>> 
>> Here's the branch I'm working on right now, it shows an easy concept of the
>> "LocalQueue".
>> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
>>

Re: API Proposal: TSContDispatch

Posted by Fei Deng <du...@apache.org>.

Forgot to mention that this change would restore the old behavior of
TSContSchedule minus the delay and dead loop.

On Tue, Nov 19, 2019 at 2:39 PM Fei Deng <du...@apache.org> wrote:

> While PR#6103 (https://github.com/apache/trafficserver/pull/6103) solves
> the problem of having the 60ms delay (caused by waiting in sleep), it also
> creates an issue where the event loop ends up in a "dead loop" if the
> scheduled event schedules itself with 0 timeout (see test code by scw00).
> Here's what I have in mind that will fix the problem.
>
>
>    1. New API "TSContDispatch", which will be basically the current
>    "TSContSchedule" calls with 0 timeouts.
>    2. When scheduling events/continuations using "TSContSchedule" API with
>    0 timeouts, set flag to identify the event to be processed on the next
>    event loop.
>    3. New "LocalQueue" class to handle events that are supposed to be
>    processed in the next event loop.
>
> Here's the branch I'm working on right now, it shows an easy concept of the
> "LocalQueue".
> https://github.com/duke8253/trafficserver/tree/master-event_dispatch
>