Posted to dev@httpd.apache.org by Luca Toscano <to...@gmail.com> on 2016/05/13 11:33:03 UTC

Timers and mpm-event

Hi Apache devs,

I have some questions about how mpm-event uses timers. If I understood
the code correctly, there are two main things that the listener thread
cares about:

- timeout_queue(s), which represent connections waiting for write completion,
in keep alive, or in lingering close.
- timer_skiplist, that sorts timers for the above queues in an efficient
way (this is of course a very inaccurate and simplistic description).

My understanding is that the listener thread, using apr_pollset_poll, will
react to new connection requests plus all events related to the
timeout_queues. It also sleeps for at most 0.1s so that it can check
the skiplist and update sockets that have an expired timeout (for example,
when a connection doesn't send any new data for more than KeepAliveTimeout).
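
A minimal sketch of the loop described above, assuming plain POSIX poll()
in place of apr_pollset_poll. The names conn_t, now_ms and listener_pass
are hypothetical and do not come from event.c; the sketch only shows the
shape "poll with a 100ms cap, then check deadlines".

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <time.h>

#define LISTENER_MAX_WAIT_MS 100   /* the 0.1s cap described above */

/* Hypothetical connection record: only the fields this sketch needs. */
typedef struct {
    long long expires_at_ms;   /* absolute deadline on the monotonic clock */
    int closed;                /* set once the deadline has passed */
} conn_t;

/* Current monotonic time in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* One pass of a simplified listener loop: wait for socket events for at
 * most LISTENER_MAX_WAIT_MS, then mark every connection whose deadline
 * has passed. Returns whatever poll() returned. */
static int listener_pass(struct pollfd *fds, nfds_t nfds,
                         conn_t *conns, size_t nconns)
{
    int ready = poll(fds, nfds, LISTENER_MAX_WAIT_MS);
    long long now = now_ms();
    size_t i;

    assert(conns != NULL || nconns == 0);
    for (i = 0; i < nconns; i++) {
        if (!conns[i].closed && conns[i].expires_at_ms <= now) {
            conns[i].closed = 1;   /* stand-in for closing the socket */
        }
    }
    return ready;   /* > 0: sockets to service, 0: timed out, < 0: error */
}
```

Because the wait is capped, an expired deadline is noticed at most about
100ms late even when no socket ever becomes ready.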

Now the questions (if what I've said is vaguely true):

- What does PT_USER represent and how is it used?
- How is a new timer inserted in the skiplist? I followed the code and the
only "insert" actions that I can see are triggered by
event_get_timer_event, that is used for PT_USER events and by the function
hooked to mpm_register_timed_callback (I can see it declared in mpm_common
but no idea about how/when it runs). There are also some peek/pop actions
executed before apr_pollset_poll but no trace of inserts.

I know that those are very generic questions so even some hints would be
really appreciated. I have some goals in mind:

1) Add an overview in https://httpd.apache.org/docs/current/mod/event.html
(maybe adding a definitive answer to
https://bz.apache.org/bugzilla/show_bug.cgi?id=57399)
2) Create infographics (or even simple images) about prefork/worker/event
(and motorz?) to compare them in a "under the hood" section of the
documentation.
3) Complete http://httpd.apache.org/docs/current/misc/perf-tuning.html.

Thanks in advance!

Regards,

Luca

Re: Timers and mpm-event

Posted by Stefan Eissing <st...@greenbytes.de>.
> On 13.05.2016 at 16:11, Eric Covener <co...@gmail.com> wrote:
> 
> On Fri, May 13, 2016 at 7:02 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> That would allow HTTP/2 processing to become fully async and it would no longer need its own worker thread pool, at least with mpm_event.
>> 
>> Thoughts?
> 
> One bit I am still ignorant of is all of the beam-ish stuff (the
> problem and the solution) and how moving the main connection from
> thread-to-thread might impact that.

The bucket beams have no thread affinity. What I describe in the comments as 'red side' and 'green side' is just a name for the thread that is *currently* handling the red/green pool.

So, the red pool and green pools are fixed during the lifetime of a beam. The threads may vary. Whichever thread is the one owning operations on the red pool, I call the red thread.

The beam just manages the lifetimes of buckets and their pool affinity without unnecessary copying. The buckets being sent, the red buckets, are only ever manipulated in calls from the sending side. The buckets received, the green buckets, are separate instances, so they can be safely manipulated by the receiver (split/read/destroy).

The trick is that a green bucket may expose red data when read. That means the red bucket must stay around at least as long as the green one. So, a green bucket calls its beam when its last share gets destroyed. The beam then knows that the corresponding red bucket is no longer needed.

Those no longer needed red buckets are placed on a 'purge' list. This list gets cleared and the buckets destroyed on the next call from the sending side, the red side.

The fragile thing is the cleanup of the red pool. That must not happen while the beam has outstanding green buckets. For this, the beam has a shutdown method that may block until this is true.
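
A minimal sketch of the red/green lifetime rules above, assuming a
single-threaded model with all locking omitted. The types and function
names (red_bucket, beam, beam_receive, beam_green_destroyed, beam_purge,
beam_can_cleanup_red_pool) are hypothetical and are not the mod_http2 API.

```c
#include <assert.h>
#include <stddef.h>

/* A red (sender-side) bucket and the number of green buckets that may
 * still expose its data. */
typedef struct red_bucket {
    int green_shares;
    int destroyed;
    struct red_bucket *next_purge;
} red_bucket;

typedef struct {
    red_bucket *purge;       /* red buckets that are no longer needed */
    int outstanding_green;   /* green buckets still alive */
} beam;

/* Receiver side: a green bucket is created for a red one. */
static void beam_receive(beam *b, red_bucket *red)
{
    red->green_shares++;
    b->outstanding_green++;
}

/* Receiver side: a green share is destroyed. When it was the last one,
 * the red bucket goes on the purge list. It is NOT destroyed here,
 * because this call runs on the receiving side. */
static void beam_green_destroyed(beam *b, red_bucket *red)
{
    assert(red->green_shares > 0);
    b->outstanding_green--;
    if (--red->green_shares == 0) {
        red->next_purge = b->purge;
        b->purge = red;
    }
}

/* Sender side: called on the next send; destroys purged red buckets and
 * returns how many were destroyed. */
static int beam_purge(beam *b)
{
    int n = 0;
    while (b->purge != NULL) {
        red_bucket *r = b->purge;
        b->purge = r->next_purge;
        r->next_purge = NULL;
        r->destroyed = 1;   /* stand-in for destroying the bucket */
        n++;
    }
    return n;
}

/* The red pool may only be cleaned up once no green buckets remain. */
static int beam_can_cleanup_red_pool(const beam *b)
{
    return b->outstanding_green == 0;
}
```

The point of the purge list is that destruction of red buckets always
happens in a call made by the sending side, whichever thread that is.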

> Maybe you could have a pipe() with a writing end in each slave, and
> read by the master, that the event loop watches to re-schedule the
> master?

Hmm, I do not see the need. Those pipe()s will only generate events when another part of the process wants to. Unless we are talking about spreading HTTP/2 processing across multiple processes and using the pipes to transfer the actual data. And I am not convinced that this is a good idea.

And signaling would also need to go in the other direction: from master to slave. Which would then require 2(4?) file handles per active request, I assume?

-Stefan

Re: Timers and mpm-event

Posted by Eric Covener <co...@gmail.com>.
On Fri, May 13, 2016 at 7:02 AM, Stefan Eissing
<st...@greenbytes.de> wrote:
> That would allow HTTP/2 processing to become fully async and it would no longer need its own worker thread pool, at least with mpm_event.
>
> Thoughts?

One bit I am still ignorant of is all of the beam-ish stuff (the
problem and the solution) and how moving the main connection from
thread-to-thread might impact that.

Maybe you could have a pipe() with a writing end in each slave, and
read by the master, that the event loop watches to re-schedule the
master?
-- 
Eric Covener
covener@gmail.com

Re: Timers and mpm-event

Posted by Eric Covener <co...@gmail.com>.
On Fri, May 13, 2016 at 7:02 AM, Stefan Eissing
<st...@greenbytes.de> wrote:
> 1. Is this ever intended to work on a socket that is a main connection? Since event itself will add this socket to its pollset now and then, I see a potential conflict. But that can be resolved, since mod_http2 is only interested in the callback while *inside* process_connection. It could unregister before returning or whatever is helpful.

Yes, but the only user so far is pretty unique as it is basically a
handler that drops down into TCP forwarding mode so there are really
no HTTP or filters around anymore.

But I would think that by the time you can register such a callback,
the socket is not in the pollset -- it only sits in there when we're
waiting on it (to pop from keepalive, to become writable, or to close
out)


> 2. I assume the callback gets invoked on whatever worker thread is currently available? it probably should return rather immediately, I assume?

Yes, it gets sent down to the queue of normal event worker threads --
the same ones that would otherwise handle e.g. a new connection or a
keepalive request showing up on an old connection.  When
healthy this is meant to happen immediately.


-- 
Eric Covener
covener@gmail.com

Re: Timers and mpm-event

Posted by Stefan Eissing <st...@greenbytes.de>.
Hijacking this thread...

> On 13.05.2016 at 15:29, Eric Covener <co...@gmail.com> wrote:
> 
> On Fri, May 13, 2016 at 4:33 AM, Luca Toscano <to...@gmail.com> wrote:
>> - What does PT_USER represent and how is it used?
> 
> PT_USER is what event tracks when you call
> event_register_poll_callback().  This callback
> allows a module to run some code when either of a pair of sockets
> becomes readable or writable.
> 
> It was written to allow mod_proxy_wstunnel to not tie up a thread when
> both ends of the connection
> are idle.
> 
> Note that it is still trunk-only.

Funny that you mention that...

I made a quick and dirty attempt to use this in mod_http2 yesterday. I want to get rid of the BUSY polling using timeouts that happens when the main connection is waiting for workers to come back with responses. That can block on a conditional, however it also needs to react when new data is arriving from the client. So, in its current form, it makes a timed wait on the conditional and checks the main connection again.

So, I registered on POLLIN on the main connection and signalled the conditional in that callback. It sort of worked, but it was not very stable. Before I put in more work, there are some things I need to know too. Maybe that helps everyone with understanding this new feature.

1. Is this ever intended to work on a socket that is a main connection? Since event itself will add this socket to its pollset now and then, I see a potential conflict. But that can be resolved, since mod_http2 is only interested in the callback while *inside* process_connection. It could unregister before returning or whatever is helpful.

2. I assume the callback gets invoked on whatever worker thread is currently available? it probably should return rather immediately, I assume?


Ideally, however, mod_http2 would use another mechanism:

   a) return from process_connection immediately when there is nothing to do
   b) have outgoing data sitting in output filters for streaming out event based
   c) be called via process_connection again if signaled by someone else*)
      *) someone else would be a slave connection that has produced new data
   d) slave connections could also leave their process_connection and go to "sleep".
      They have no socket, but can be signaled by others that data is available
      or that they can "write" more output.

In this way, slave connections and master connections are both handled by the MPM. While the latter have a socket that generates POLLIN/OUT/HUP events, the slave ones get these events generated by something else. In the case of HTTP/2, this "something else" is the interworking between the h2 session and stream requests.

So, in short:
- POLLIN/POLLOUT/HUP event handling can be triggered by other threads via a new MPM API
- Slave connections can be started via a new MPM API. They will not have a socket, but take part in event handling

That would allow HTTP/2 processing to become fully async and it would no longer need its own worker thread pool, at least with mpm_event.

Thoughts?

-Stefan

Re: Timers and mpm-event

Posted by Luca Toscano <to...@gmail.com>.
[Answering myself after a bit of research; it might be useful for
newcomers like me]

2016-05-14 11:49 GMT+02:00 Luca Toscano <to...@gmail.com>:

> Hi Eric,
>
> Other trivial questions from non experts like me:
>
> 2016-05-13 15:29 GMT+02:00 Eric Covener <co...@gmail.com>:
>
>> On Fri, May 13, 2016 at 4:33 AM, Luca Toscano <to...@gmail.com>
>> wrote:
>> > - What does PT_USER represent and how is it used?
>>
>> PT_USER is what event tracks when you call
>> event_register_poll_callback().  This callback
>> allows a module to run some code when either of a pair of sockets
>> becomes readable or writable.
>>
>
> It is still a bit unclear to me who calls this callback, that should be
> executed when  ap_run_mpm_register_poll_callback is run, but I can't find
> it anywhere (as I can do for other hooks). I know that event is the only
> one supporting it reading from the code's comments, but I can't really
> figure out why. Probably this is a trivial question due to my doubts with
> httpd's core :)
>

I was approaching the problem from the wrong angle, a better one might be
starting from mod_dialup.c in the 2.4.x branch; precisely all the
invocations of ap_mpm_register_timed_callback. The function is declared in
mpm_common and does a very basic thing, namely
calling ap_run_mpm_register_timed_callback that executes the hook's
machinery. Who is registered to the hook? Check the following in event.c:

    ap_hook_mpm_register_timed_callback(event_register_timed_callback,
                                        NULL, NULL, APR_HOOK_MIDDLE);

And the callback, event_register_timed_callback, inserts a timer in the
skiplist. So the hook is basically executed by the modules interested in
adding a custom timer to event, and it works only with this MPM since it is
the only one that registers a callback for the hook.
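
A minimal sketch of that dispatch pattern, assuming a single-slot hook
instead of the real APR hook machinery. Every name here (timed_hook,
hook_register_timed, register_timed_callback, event_like_register, the
STATUS_* codes, noop_cb) is hypothetical; it only illustrates why the call
works with one MPM and fails cleanly with the others.

```c
#include <assert.h>
#include <stddef.h>

#define STATUS_OK        0
#define STATUS_ENOTIMPL (-1)   /* nobody is registered on the hook */
#define STATUS_FULL     (-2)   /* the toy timer table is full */

typedef void (timed_cb_fn)(void *baton);
typedef int (register_timed_fn)(long long delay_us, timed_cb_fn *cb,
                                void *baton);

/* The hook: at most one implementation in this sketch. */
static register_timed_fn *timed_hook = NULL;

/* Called by an MPM at startup to offer an implementation. */
static void hook_register_timed(register_timed_fn *impl)
{
    timed_hook = impl;
}

/* Called by modules: a thin wrapper that just runs the hook. */
static int register_timed_callback(long long delay_us, timed_cb_fn *cb,
                                   void *baton)
{
    if (timed_hook == NULL) {
        return STATUS_ENOTIMPL;   /* this MPM has no timer support */
    }
    return timed_hook(delay_us, cb, baton);
}

/* MPM side: a fixed array stands in for the skiplist. */
typedef struct {
    long long delay_us;
    timed_cb_fn *cb;
    void *baton;
} timer_entry;

static timer_entry timers[16];
static size_t ntimers = 0;

static int event_like_register(long long delay_us, timed_cb_fn *cb,
                               void *baton)
{
    assert(cb != NULL);
    if (ntimers == sizeof timers / sizeof timers[0]) {
        return STATUS_FULL;
    }
    timers[ntimers].delay_us = delay_us;
    timers[ntimers].cb = cb;
    timers[ntimers].baton = baton;
    ntimers++;
    return STATUS_OK;
}

/* Example module callback that does nothing. */
static void noop_cb(void *baton)
{
    (void)baton;
}
```

Before any MPM registers on the hook, the wrapper reports "not
implemented"; after hook_register_timed(event_like_register), the same
call stores a timer.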

Eric, I know that you tried to explain the same thing to me, but I was
missing some steps and I didn't get it straight away :)


>
> Other trivial question: is the skiplist used only for events/timers
> related to modules hooking to mpm_register_timed_callback? If so my
> original understanding of it is wrong, because I thought that it was used
> also to peek/pop/insert timers for the timeout queues (keep alives,
> lingering closes and write completion).
>

IIUC the answer to the skiplist question is yes, since keep alives /
lingering closes / write completion are all handled by separate (non
skiplist) queues (timeout_queue and the function process_timeout_queue).
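
A minimal sketch of why such a queue does not need a skiplist, under the
assumption that every entry in one queue shares the same timeout, so
appending at the tail keeps the entries ordered by expiry. The types and
functions (conn, timeout_queue, tq_append, tq_process) are hypothetical
and are not the structures from event.c.

```c
#include <assert.h>
#include <stddef.h>

typedef struct conn {
    long long queued_at_ms;   /* when the connection entered the queue */
    int expired;
    struct conn *next;
} conn;

/* One queue per state (e.g. keep alive), with one timeout for all
 * entries. */
typedef struct {
    long long timeout_ms;
    conn *head, *tail;
    int count;
} timeout_queue;

/* Append at the tail. Time only moves forward, so the list stays sorted
 * by expiry without any searching. */
static void tq_append(timeout_queue *q, conn *c, long long now_ms)
{
    assert(q->tail == NULL || q->tail->queued_at_ms <= now_ms);
    c->queued_at_ms = now_ms;
    c->expired = 0;
    c->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = c;
    } else {
        q->head = c;
    }
    q->tail = c;
    q->count++;
}

/* Expire entries from the head and stop at the first one that is still
 * within its timeout. Returns the number of expired entries. */
static int tq_process(timeout_queue *q, long long now_ms)
{
    int n = 0;
    while (q->head != NULL
           && q->head->queued_at_ms + q->timeout_ms <= now_ms) {
        conn *c = q->head;
        q->head = c->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        c->next = NULL;
        c->expired = 1;   /* stand-in for closing the connection */
        q->count--;
        n++;
    }
    return n;
}
```

With a fixed per-queue timeout, both operations touch only the ends of
the list, whereas arbitrary per-timer deadlines need a sorted structure
such as the skiplist.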


>
> Any idea about how to tune the 100ms sleep time? It might be a good
> improvement for event!
>
>
This is still an open question for me. Again, IIUC all the listener threads
wait (most of the time) 100ms during their apr_pollset_poll, to be able to
check frequently all the various timers that need to be honored (the ones
in the timeout_queue(s) and skiplist). While I don't see a huge waste in
resources in the listener behavior,
https://bz.apache.org/bugzilla/show_bug.cgi?id=57399 shows a good point:
event could be a bit more efficient.
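
One possible direction, sketched under assumptions and not taken from any
patch attached to that bug: derive the poll timeout from the nearest
pending deadline instead of using a fixed 100ms. The function
poll_timeout_ms and the NO_DEADLINE / POLL_FOREVER constants are
hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define NO_DEADLINE  (-1LL)   /* no timer or queue entry pending */
#define POLL_FOREVER (-1)     /* block until a socket event arrives */

/* Compute how long the listener may sleep, given the current time, the
 * earliest skiplist timer and the earliest timeout-queue expiry. All
 * times are absolute milliseconds. */
static int poll_timeout_ms(long long now_ms, long long next_timer_ms,
                           long long next_queue_expiry_ms)
{
    long long next = next_timer_ms;
    long long wait;

    assert(now_ms >= 0);
    if (next_queue_expiry_ms != NO_DEADLINE
        && (next == NO_DEADLINE || next_queue_expiry_ms < next)) {
        next = next_queue_expiry_ms;
    }
    if (next == NO_DEADLINE) {
        return POLL_FOREVER;   /* nothing can expire: no need to wake up */
    }
    if (next <= now_ms) {
        return 0;              /* already due: poll without sleeping */
    }
    wait = next - now_ms;
    return wait > INT_MAX ? INT_MAX : (int)wait;
}
```

A caveat with this idea: if the listener blocks with no timeout, a thread
that adds a new timer or queue entry needs some way to wake it up, which
the fixed 100ms sleep currently makes unnecessary.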

I really hope that Stefan's parallel thread will re-gain a bit of traction,
the idea of dropping mod_http2's thread pools to leverage completely event
looks very promising.

Sorry again for the extra email, hope that this will help people with too
many questions like me :)

Luca

Re: Timers and mpm-event

Posted by Luca Toscano <to...@gmail.com>.
Hi Eric,

Other trivial questions from non experts like me:

2016-05-13 15:29 GMT+02:00 Eric Covener <co...@gmail.com>:

> On Fri, May 13, 2016 at 4:33 AM, Luca Toscano <to...@gmail.com>
> wrote:
> > - What does PT_USER represent and how is it used?
>
> PT_USER is what event tracks when you call
> event_register_poll_callback().  This callback
> allows a module to run some code when either of a pair of sockets
> becomes readable or writable.
>

It is still a bit unclear to me who calls this callback, that should be
executed when  ap_run_mpm_register_poll_callback is run, but I can't find
it anywhere (as I can do for other hooks). I know that event is the only
one supporting it reading from the code's comments, but I can't really
figure out why. Probably this is a trivial question due to my doubts with
httpd's core :)


> It was written to allow mod_proxy_wstunnel to not tie up a thread when
> both ends of the connection
> are idle.
>
> Note that it is still trunk-only.


Got that, 2.4.x looks simpler, I'll check it.



> > - How is a new timer inserted in the skiplist? I followed the code and
> the
> > only "insert" actions that I can see are triggered by
> event_get_timer_event,
> > that is used for PT_USER events and by the function hooked to
> > mpm_register_timed_callback (I can see it declared in mpm_common but no
> idea
> > about how/when it runs). There are also some peek/pop actions executed
> > before apr_pollset_poll but no trace of inserts.
>
> It is via the insert in event_get_timer_event()
>
> PT_USER uses the timer part to implement a timeout callback on waiting for
> the
> sockets to become usable.    it was added as a proof of concept, IIUC,
> for mod_dialup to
> help demonstrate async handlers (give the thread back by returning
> SUSPENDED and run again later
> by being called back after some time)

>
> > I know that those are very generic questions so even some hints would be
> > really appreciated. I have some goals in mind:
> >
> > 1) Add an overview in
> https://httpd.apache.org/docs/current/mod/event.html
> > (maybe adding a definitive answer to
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=57399)
>
> > 2) Create infographics (or even simple images) about prefork/worker/event
> > (and motorz?) to compare them in a "under the hood" section of the
> > documentation.
>
> It might be good to separate design details from the reference manual
> so users are not overwhelmed
>

+1, my idea is only to add a little section to inform the users about this
behavior, then to explain it in detail in other separate docs. Don't want
to overwhelm but at the same time it is good to have quick references for
the admins that are curious about internals (without forcing them to read
the code that is a bit hard for whoever is not used to it).


>
> > 3) Complete http://httpd.apache.org/docs/current/misc/perf-tuning.html.
>
> There is a PR or other kind of complaint about the hard-coded 100ms
> sleep time even when no timers may be in use.  Since timers
> are not used by common modules, it should be possible to improve this.


This one right? https://bz.apache.org/bugzilla/show_bug.cgi?id=57399

Other trivial question: is the skiplist used only for events/timers related
to modules hooking to mpm_register_timed_callback? If so my original
understanding of it is wrong, because I thought that it was used also to
peek/pop/insert timers for the timeout queues (keep alives, lingering
closes and write completion).

Any idea about how to tune the 100ms sleep time? It might be a good
improvement for event!

Thanks for the patience, eventually I'll stop asking these kinds of
questions! It takes a bit of time to grasp httpd's core and I am too
curious :)

Regards,

Luca

PS: Thanks Stefan for hijacking the thread, really useful and interesting
stuff! (I hope that I'll fully understand it after this email thread)

Re: Timers and mpm-event

Posted by Eric Covener <co...@gmail.com>.
On Fri, May 13, 2016 at 4:33 AM, Luca Toscano <to...@gmail.com> wrote:
> - What does PT_USER represent and how is it used?

PT_USER is what event tracks when you call
event_register_poll_callback().  This callback
allows a module to run some code when either of a pair of sockets
becomes readable or writable.

It was written to allow mod_proxy_wstunnel to not tie up a thread when
both ends of the connection
are idle.

Note that it is still trunk-only.

> - How is a new timer inserted in the skiplist? I followed the code and the
> only "insert" actions that I can see are triggered by event_get_timer_event,
> that is used for PT_USER events and by the function hooked to
> mpm_register_timed_callback (I can see it declared in mpm_common but no idea
> about how/when it runs). There are also some peek/pop actions executed
> before apr_pollset_poll but no trace of inserts.

It is via the insert in event_get_timer_event()

PT_USER uses the timer part to implement a timeout callback while waiting
for the sockets to become usable. It was added as a proof of concept, IIUC,
for mod_dialup to help demonstrate async handlers (give the thread back by
returning SUSPENDED and run again later by being called back after some
time).
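
A minimal sketch of that "return SUSPENDED, get called back later"
pattern, assuming a toy request that sends data in fixed-size chunks. The
names (slow_request, send_chunk, on_timer, arm_timer, HANDLER_DONE,
HANDLER_SUSPENDED) are hypothetical and are not the mod_dialup code;
arm_timer only sets a flag where a real module would register a timed
callback with the MPM.

```c
#include <assert.h>
#include <stddef.h>

enum { HANDLER_DONE = 0, HANDLER_SUSPENDED = 1 };

typedef struct {
    size_t remaining;   /* bytes still to send */
    size_t chunk;       /* bytes allowed per timer tick */
    int timer_armed;    /* a callback is pending */
} slow_request;

/* Stand-in for registering a timed callback with the MPM. */
static void arm_timer(slow_request *r)
{
    r->timer_armed = 1;
}

/* Send one chunk. If data is left, arm a timer and give the worker
 * thread back by returning HANDLER_SUSPENDED. */
static int send_chunk(slow_request *r)
{
    size_t n = r->remaining < r->chunk ? r->remaining : r->chunk;

    r->remaining -= n;
    if (r->remaining == 0) {
        return HANDLER_DONE;
    }
    arm_timer(r);
    return HANDLER_SUSPENDED;
}

/* Timer callback: runs later on some worker thread and resumes the
 * request where it left off. */
static int on_timer(slow_request *r)
{
    assert(r->timer_armed);
    r->timer_armed = 0;
    return send_chunk(r);
}
```

Between two ticks no thread is tied to the request; only the small
slow_request state has to stay alive.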
>
> I know that those are very generic questions so even some hints would be
> really appreciated. I have some goals in mind:
>
> 1) Add an overview in https://httpd.apache.org/docs/current/mod/event.html
> (maybe adding a definitive answer to
> https://bz.apache.org/bugzilla/show_bug.cgi?id=57399)

> 2) Create infographics (or even simple images) about prefork/worker/event
> (and motorz?) to compare them in a "under the hood" section of the
> documentation.

It might be good to separate design details from the reference manual
so users are not overwhelmed

> 3) Complete http://httpd.apache.org/docs/current/misc/perf-tuning.html.

There is a PR or other kind of complaint about the hard-coded 100ms
sleep time even when no timers may be in use.  Since timers
are not used by common modules, it should be possible to improve this.


-- 
Eric Covener
covener@gmail.com