You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Yann Ylavic <yl...@gmail.com> on 2016/04/28 01:16:14 UTC

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

I was offline today so couldn't comment on the different messages on
the subject, so I'll try to summarize (here) my understanding, so
far...

On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
> Author: wrowe
> Date: Wed Apr 27 18:41:49 2016
> New Revision: 1741310
>
> URL: http://svn.apache.org/viewvc?rev=1741310&view=rev
> Log:
>
>   Ensure http2 follows http in the meaning of
>   status WRITE (meaning 'in the request processing
>   phase' even if still consuming the request body,
>   not literally in a 'now writing' state).

This is indeed consistent with how we report http1 state currently,
but maybe it could be more intuitive to report READ until the body is
consumed in http1 rather than changing http2?
Unless we want to minimize scoreboard updates, for performance reasons...

>
>   Ensure a number of MPMs and the h2 connection io
>   no longer clobber the request status line during
>   state-only changes.  While at it, clean up some
>   very ugly formatting and unnecessary decoration,
>   and avoid the wordy _from_conn() flavor when we
>   are not passing a connection_rec.

Why ap_update_child_status_from_conn(), given a real or NULL c, would
clobber the request line?
The request line is untouched unless a non-NULL r is passed to
update_child_status_internal(), and ap_update_child_status_from_conn()
sets r to NULL.

AIUI, what sets the request line to NULL in mpm_worker is when
ap_read_request() fails (mostly after connection closed remotely or
keep-alive timeout, actually any empty/NULL r->the_request from
read_request_line() would do it).
This may also happen in mpm_event but since the worker (scoreboard
entry) has probably changed in between, this is possibly less visible.

So how about:

Index: server/protocol.c
===================================================================
--- server/protocol.c    (revision 1741108)
+++ server/protocol.c    (working copy)
@@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
             access_status = r->status;
             r->status = HTTP_OK;
             ap_die(access_status, r);
-            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
+            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
+                                   r->the_request ? r : NULL);
             ap_run_log_transaction(r);
             r = NULL;
             apr_brigade_destroy(tmp_bb);
             goto traceout;
         case HTTP_REQUEST_TIME_OUT:
-            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
+            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, NULL);
             if (!r->connection->keepalives)
                 ap_run_log_transaction(r);
             apr_brigade_destroy(tmp_bb);
?


Regarding this commit, the goal seems more about not clobbering the
*client IP* until we read the next/real request on each connection,
but this implies that we report more the last requests handled by each
worker than the workers states themselves...
Isn't the access log more appropriate for that purpose finally,
whereas the scoreboard is more about workers' activity?

However I agree that without doing something like this (I tried a
different approach in PR 59333, not working yet... but it could with
more work IMHO), with mpm_event we may find mixed data (from different
requests/connections) in the same scoreboard entry :/

I think Stefan tried to address this (observed by testing http2
scores?) in 2.4.20 by blanking the fields before reading each request
(unfortunately that broke the usual reporting, but I think the
intention was laudable).

So now, either we choose to not report workers' activity in between
requests (there are just multi-purpose workers/request-pollers after
all), or we save some "last request" informations in each conn_rec and
restore them on the worker (score) handling the connection at each
time (this is more reporting the workers' activity than the requests
lines, which will pass from one worker to the other, even possibly
duplicating requests lines with low activity).

I prefer the latter option, but I understand the former too...

Regards,
Yann.

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by Stefan Eissing <st...@greenbytes.de>.
> Am 28.04.2016 um 10:08 schrieb Rainer Jung <ra...@kippdata.de>:
> 
> Am 28.04.2016 um 04:30 schrieb William A Rowe Jr:
>> On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <ylavic.dev@gmail.com
>> <ma...@gmail.com>> wrote:
>> 
>>    I was offline today so couldn't comment on the different messages on
>>    the subject, so I'll try to summarize (here) my understanding, so
>>    far...
>> 
>>    On Wed, Apr 27, 2016 at 8:41 PM,  <wrowe@apache.org
>>    <ma...@apache.org>> wrote:
>>    > Author: wrowe
>>    > Date: Wed Apr 27 18:41:49 2016
>>    > New Revision: 1741310
>>    >
>>    > URL:http://svn.apache.org/viewvc?rev=1741310&view=rev
>>    > Log:
>>    >
>>    >   Ensure http2 follows http in the meaning of
>>    >   status WRITE (meaning 'in the request processing
>>    >   phase' even if still consuming the request body,
>>    >   not literally in a 'now writing' state).
>> 
>>    This is indeed consistent with how we report http1 state currently,
>>    but maybe it could be more intuitive to report READ until the body is
>>    consumed in http1 rather than changing http2?
>>    Unless we want to minimize scoreboard updates, for performance
>>    reasons...
>> 
>> 
>> Well, we always want to be considerate of performance.
>> 
>> That said, we absolutely must not change the semantic meanings of
>> the server-status results on the maintenance branch.  Change those
>> meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
>> behave as 2.4.x has behaved from the start.
> 
> I agree we shouldn't change the semantics of R/W during 2.4. People might monitor the numbers of R and W and will be quite surprised by seeing a noticeable shift from W to R.
> 
> I think it would be a good change for 2.6/3.0 to make R including reading the request body and only after having read it switch to W. We can mention it in CHANGES / migration guide and personally I think those semantics would be more intuitive.

That is how I understood it to be when I made the status updates in mod_http2. But it should of course be in line how we report the http/1.x scores.

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by Rainer Jung <ra...@kippdata.de>.
Am 28.04.2016 um 04:30 schrieb William A Rowe Jr:
> On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <ylavic.dev@gmail.com
> <ma...@gmail.com>> wrote:
>
>     I was offline today so couldn't comment on the different messages on
>     the subject, so I'll try to summarize (here) my understanding, so
>     far...
>
>     On Wed, Apr 27, 2016 at 8:41 PM,  <wrowe@apache.org
>     <ma...@apache.org>> wrote:
>     > Author: wrowe
>     > Date: Wed Apr 27 18:41:49 2016
>     > New Revision: 1741310
>     >
>     > URL:http://svn.apache.org/viewvc?rev=1741310&view=rev
>     > Log:
>     >
>     >   Ensure http2 follows http in the meaning of
>     >   status WRITE (meaning 'in the request processing
>     >   phase' even if still consuming the request body,
>     >   not literally in a 'now writing' state).
>
>     This is indeed consistent with how we report http1 state currently,
>     but maybe it could be more intuitive to report READ until the body is
>     consumed in http1 rather than changing http2?
>     Unless we want to minimize scoreboard updates, for performance
>     reasons...
>
>
> Well, we always want to be considerate of performance.
>
> That said, we absolutely must not change the semantic meanings of
> the server-status results on the maintenance branch.  Change those
> meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
> behave as 2.4.x has behaved from the start.

I agree we shouldn't change the semantics of R/W during 2.4. People 
might monitor the numbers of R and W and will be quite surprised by 
seeing a noticeable shift from W to R.

I think it would be a good change for 2.6/3.0 to make R including 
reading the request body and only after having read it switch to W. We 
can mention it in CHANGES / migration guide and personally I think those 
semantics would be more intuitive.

Regards,

Rainer

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Apr 28, 2016 at 5:22 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> What I meant is that the request line *MUST* be touched - wiped
> clean - for any status_from_conn(c) request.
>
> When we say we preserve the most recent state of the worker in
> the scoreboard, that old request line is gone.  Every time we are
> doing a conn-based refresh of that worker status, the req-based
> data is no longer relevant and needs to go away.

OK, I'm fine with this, much prefered over mixing (which finally I
introduced in my incomplete revert, understood ;)

>
> See
> http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/scoreboard.c?r1=1239125&r2=1741240&diff_format=h
>
> for the delta between 2.4.1 and 2.4.broken (current state).
>
>          else if (c) {
> -            apr_cpystrn(ws->client, ap_get_remote_host(c, NULL,
> -                        REMOTE_NOLOOKUP, NULL), sizeof(ws->client));
> -            ws->request[0]='\0';
> -            ws->vhost[0]='\0';

Indeed, I missed that, too much influenced by the original report
about blanks introduced in 2.4.20, but that was only about the client
IP, not the vhost/request-line which have always been blank in
SERVER_BUSY_READ...

Your proposed backport looks good to me now, thanks for your patience :)

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Apr 28, 2016 at 4:37 AM, Yann Ylavic <yl...@gmail.com> wrote:

> On Thu, Apr 28, 2016 at 4:30 AM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> > On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <yl...@gmail.com>
> wrote:
> >>
> >> On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
> >> >   Ensure http2 follows http in the meaning of
> >> >   status WRITE (meaning 'in the request processing
> >> >   phase' even if still consuming the request body,
> >> >   not literally in a 'now writing' state).
> >>
> >> This is indeed consistent with how we report http1 state currently,
> >> but maybe it could be more intuitive to report READ until the body is
> >> consumed in http1 rather than changing http2?
> >> Unless we want to minimize scoreboard updates, for performance
> reasons...
> >
> >
> > Well, we always want to be considerate of performance.
> >
> > That said, we absolutely must not change the semantic meanings of
> > the server-status results on the maintenance branch.  Change those
> > meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
> > behave as 2.4.x has behaved from the start.
>
> Agreed, we can't change it for http1 in 2.4.x, but maybe http2 has
> other considerations WRT reading/writing on the master connection,
> reporting READ or WRITE for http2 is not really a compatibility issue
> IMHO.
>

It is, yes, the server-status has well understood semantics and changing
them on a released branch is never acceptable. The master worker already
has special semantics which are new. No reason to change the slave
workers between http/1 and http/2 as they are doing the same things.


> >> >   Ensure a number of MPMs and the h2 connection io
> >> >   no longer clobber the request status line during
> >> >   state-only changes.  While at it, clean up some
> >> >   very ugly formatting and unnecessary decoration,
> >> >   and avoid the wordy _from_conn() flavor when we
> >> >   are not passing a connection_rec.
> >>
> >> Why ap_update_child_status_from_conn(), given a real or NULL c, would
> >> clobber the request line?
> >
> >
> > Given a real conn_rec record, we have no request line.  Therefore the
> result
> > is NULL.  There is no purpose in modifying the conn_rec related fields
> once
> > correctly recorded.  The next chance they have for being modified in any
> > meaningful way is during an ssl renegotiation or the processing of the
> Host:
> > and X-F-F: headers during the read_request phase.
>
> That's more an optimization (not rewrite existing values with the same
> ones), but driven by how http1 in *2.4.x* works currently (this relies
> on the same worker handling the connection for the lifetime of the
> request).
>



> When SUSPENDED is used (trunk does it already at several places) we
> must overwrite the fields with actual values.
>

SUSPENDED within a request cannot be supported on 2.4.x as modules
are allowed to make thread storage and thread-local storage assumptions.
E.g. it breaks modules in an unexpected manner, and it is our responsibility
to inform users of this by bumping the version major.  This wasn't seen as
a big of an issue when we introduced event and launched 2.2.x, because
far more third party modules are request handlers, and are not connection-
based.  But we still bumped the version minor when we allowed the
connection to float between threads. This wasn't changed in maintained
release branch.

On trunk, it will have to become the responsibility of the MPM to restore
the right score slot state before reentering the processing loop on a given
worker thread, multiple times per connection or request, as that MPM
sends requests across threads.


> >> The request line is untouched unless a non-NULL r is passed to
> >> update_child_status_internal(), and ap_update_child_status_from_conn()
> >> sets r to NULL.
> >
> > That was incorrect - to the extend that you've changed it since 2.4.1,
> > such a change must be reverted.
>
> I meant ap_update_child_status_from_conn() gives a NULL r to
> update_child_status_internal(), so it won't change the existing
> request line (as opposed to setting it to the NULL value in the
> scoreboard).


What I meant is that the request line *MUST* be touched - wiped
clean - for any status_from_conn(c) request.

When we say we preserve the most recent state of the worker in
the scoreboard, that old request line is gone.  Every time we are
doing a conn-based refresh of that worker status, the req-based
data is no longer relevant and needs to go away.

> At the time we process a new connection (before, and again after we
> > have parsed any SNI host handshake) - there is NO REQUEST.  Any
> > lingering request value must be blanked out at that time.  Once the
> > request URI is parsed we again invoke update_child_status, this time
> > with the request_rec with a now-known request string.
>
> So there is a window where some values need to be blanked or restored
> from elsewhere, so to avoid mixing previous values of the worker with
> new ones not yet available (see below).
>

It's a straightforward flow as illustrated below, yes.


> >> AIUI, what sets the request line to NULL in mpm_worker is when
> >> ap_read_request() fails (mostly after connection closed remotely or
> >> keep-alive timeout, actually any empty/NULL r->the_request from
> >> read_request_line() would do it).
> >> This may also happen in mpm_event but since the worker (scoreboard
> >> entry) has probably changed in between, this is possibly less visible.
> >
> >
> > You don't understand it.
> >
> > Envision this sequence, which is how the scoreboard was designed;
> >
> > Initialization -> entire score record is voided
> >
> > Connection -> score entry tagged READ, what is known of the
> > connecting client and the target vhost is recorded
> >
> > Connection SNI -> score entry updated with new target vhost
> >
> > Request read -> score entry again updated with new target vhost,
> > the request identifier is updated, score entry tagged WRITE
> >
> > Request body is read, Response body is prepared
> >
> > Request complete -> score entry tagged LOGGING, and NO
> > other fields need to be updated
> >
> > Request logged, left in keepalive state -> score entry tagged
> > KEEPALIVE, NO other fields need to be updated
> >
> > Connection closed -> score entry tagged IDLE, NO OTHER
> > FIELDS SHOULD BE UPDATED until the worker is reused
> > (even in the case of event MPM).
>
> OK, so the worker scoreboard entry is up to date with the latest
> connection/request handled.
>

Right


> Now let's consider the case in KEEAPALIVE state where a new request
> arrives (possibly on another connection for mpm_event), or in
> READY/closed state (for any MPM) where a new connection+request is
> handled by the same worker.
>

When we transition to READ just before calling ap_read_request(), we
> update the client IP in the scoreboard, but keep the request line
> (from the previous request) untouched, mixed data...
>

No, no, no.  That's not what httpd did until it was broken.

See
http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/scoreboard.c?r1=1239125&r2=1741240&diff_format=h

for the delta between 2.4.1 and 2.4.broken (current state).

         else if (c) {
-            apr_cpystrn(ws->client, ap_get_remote_host(c, NULL,
-                        REMOTE_NOLOOKUP, NULL), sizeof(ws->client));
-            ws->request[0]='\0';
-            ws->vhost[0]='\0';

Please don't point me to the most recent release and state it as the
correct behavior, when the patches that got us there was ill-advised
and not thought out, as illustrated by PR 59333.


> For mpm_event this is a very transient state since the worker was
> scheduled only because the connection is POLLIN ready, so
> ap_read_request() won't block.
>

Good... ready to read and set the request in a processing (W) state.
If we are *reading* (and not polling for something to read) the request
should be flipped to the (R) state with the original logic that fills back
in the connection info and empties the request line.


> For mpm_worker though, this may last KeepAliveTimeout.
> Moreover for both MPMs, if the connection is closed by the client the
> worker score will stay as is until next (real) request is read on
> another connection, or for mpm worker if keepalive expires the request
> line will be set to the value NULL (unless patch below is applied).
>

When we enter the keepalive state, status should be set to (K) without
otherwise modifying the previously processed conn/req fields. If the
connection must be sent away, we toggle the state to (C), once it is
gone, we toggle an idle state for that slot.

When we pick back up the connection on another thread in the event
MPM, the correct behavior would be to call status_from_conn() before
handing it back to core in 2.4.x.  On trunk, the behavior should be
enhanced to call status with a request_rec instead if that request
had been SUSPENDed on another thread.

When we resume a connection on another thread, we must revert
the breakage and clear the new score slot request_rec upon the
connection_rec context, remember the prior request isn't interesting
*on this thread* and this is not the thread that serviced the prior
request.

That's what is observed in PR 59333, and I don't see where your commit
> change this.
> We need to either accept/document mixing, or blank the request line
> (but it will be very noticeable by users as it is already in 2.4.20),
> or restore the latest request line (and vhost) on the connection
> (hence also store it by connection, in conn_rec).
>
> >> So how about:
> >>
> >> Index: server/protocol.c
> >> ===================================================================
> >> --- server/protocol.c    (revision 1741108)
> >> +++ server/protocol.c    (working copy)
> >> @@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
> >>              access_status = r->status;
> >>              r->status = HTTP_OK;
> >>              ap_die(access_status, r);
> >> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> >> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
> >> +                                   r->the_request ? r : NULL);
>

No.  If the_request is blank the right answer is blank.  We don't log
a non-request.


> Or even do not update the status at all if !r->the_request (and/or
> !r->the_request[0]?) since it will pass to CLOSING very soon.
>

You don't want to know what's going on when httpd threads get
jammed in the logging phase blocking on some logging backend?

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Apr 28, 2016 at 11:37 AM, Yann Ylavic <yl...@gmail.com> wrote:
>
> OK, so the worker scoreboard entry is up to date with the latest
> connection/request handled.
> Now let's consider the case in KEEAPALIVE state where a new request
> arrives (possibly on another connection for mpm_event), or in
> READY/closed state (for any MPM) where a new connection+request is
> handled by the same worker.
>
> When we transition to READ just before calling ap_read_request(), we
> update the client IP in the scoreboard, but keep the request line
> (from the previous request) untouched, mixed data...
> For mpm_event this is a very transient state since the worker was
> scheduled only because the connection is POLLIN ready, so
> ap_read_request() won't block.
> For mpm_worker though, this may last KeepAliveTimeout.
> Moreover for both MPMs, if the connection is closed by the client the
> worker score will stay as is until next (real) request is read on
> another connection, or for mpm worker if keepalive expires the request
> line will be set to the value NULL (unless patch below is applied).
> That's what is observed in PR 59333, and I don't see where your commit
> change this.
> We need to either accept/document mixing, or blank the request line
> (but it will be very noticeable by users as it is already in 2.4.20),
> or restore the latest request line (and vhost) on the connection
> (hence also store it by connection, in conn_rec).

So maybe for 2.4.x we could add the following to you commit:

Index: modules/http/http_core.c
===================================================================
--- modules/http/http_core.c    (revision 1741396)
+++ modules/http/http_core.c    (working copy)
@@ -141,8 +141,15 @@ static int ap_process_http_async_connection(conn_r
     AP_DEBUG_ASSERT(cs->state == CONN_STATE_READ_REQUEST_LINE);

     while (cs->state == CONN_STATE_READ_REQUEST_LINE) {
-        ap_update_child_status_from_conn(c->sbh, SERVER_BUSY_READ, c);
+        worker_score *ws = ap_get_scoreboard_worker(c->sbh);

+        if (ws->client[0]) {
+            ap_update_child_status(c->sbh, SERVER_BUSY_READ, NULL);
+        }
+        else {
+            ap_update_child_status_from_conn(c->sbh, SERVER_BUSY_READ, c);
+        }
+
         if ((r = ap_read_request(c))) {

             c->keepalive = AP_CONN_UNKNOWN;
@@ -182,13 +189,20 @@ static int ap_process_http_sync_connection(conn_re
     conn_state_t *cs = c->cs;
     apr_socket_t *csd = NULL;
     int mpm_state = 0;
+    worker_score *ws = ap_get_scoreboard_worker(c->sbh);

+    if (ws->client[0]) {
+        ap_update_child_status(c->sbh, SERVER_BUSY_READ, NULL);
+    }
+    else {
+        ap_update_child_status_from_conn(c->sbh, SERVER_BUSY_READ, c);
+    }
+
     /*
      * Read and process each request found on our connection
      * until no requests are left or we decide to close.
      */

-    ap_update_child_status_from_conn(c->sbh, SERVER_BUSY_READ, c);
     while ((r = ap_read_request(c)) != NULL) {
         apr_interval_time_t keep_alive_timeout = r->server->keep_alive_timeout;

Index: server/protocol.c
===================================================================
--- server/protocol.c    (revision 1741396)
+++ server/protocol.c    (working copy)
@@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
             access_status = r->status;
             r->status = HTTP_OK;
             ap_die(access_status, r);
-            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
+            if (r->the_request && r->the_request[0]) {
+                ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
+            }
             ap_run_log_transaction(r);
             r = NULL;
             apr_brigade_destroy(tmp_bb);
             goto traceout;
         case HTTP_REQUEST_TIME_OUT:
-            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
             if (!r->connection->keepalives)
                 ap_run_log_transaction(r);
             apr_brigade_destroy(tmp_bb);
_

Such that we don't update the client IP until a request line is read.
So we'd report WRITE + new-request for a request being handled, and
READ + old-request when waiting for a new request.

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Apr 28, 2016 at 4:30 AM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>
>> On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
>> >   Ensure http2 follows http in the meaning of
>> >   status WRITE (meaning 'in the request processing
>> >   phase' even if still consuming the request body,
>> >   not literally in a 'now writing' state).
>>
>> This is indeed consistent with how we report http1 state currently,
>> but maybe it could be more intuitive to report READ until the body is
>> consumed in http1 rather than changing http2?
>> Unless we want to minimize scoreboard updates, for performance reasons...
>
>
> Well, we always want to be considerate of performance.
>
> That said, we absolutely must not change the semantic meanings of
> the server-status results on the maintenance branch.  Change those
> meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
> behave as 2.4.x has behaved from the start.

Agreed, we can't change it for http1 in 2.4.x, but maybe http2 has
other considerations WRT reading/writing on the master connection,
reporting READ or WRITE for http2 is not really a compatibility issue
IMHO.

>
>>
>> >   Ensure a number of MPMs and the h2 connection io
>> >   no longer clobber the request status line during
>> >   state-only changes.  While at it, clean up some
>> >   very ugly formatting and unnecessary decoration,
>> >   and avoid the wordy _from_conn() flavor when we
>> >   are not passing a connection_rec.
>>
>> Why ap_update_child_status_from_conn(), given a real or NULL c, would
>> clobber the request line?
>
>
> Given a real conn_rec record, we have no request line.  Therefore the result
> is NULL.  There is no purpose in modifying the conn_rec related fields once
> correctly recorded.  The next chance they have for being modified in any
> meaningful way is during an ssl renegotiation or the processing of the Host:
> and X-F-F: headers during the read_request phase.

That's more an optimization (not rewrite existing values with the same
ones), but driven by how http1 in *2.4.x* works currently (this relies
on the same worker handling the connection for the lifetime of the
request).
When SUSPENDED is used (trunk does it already at several places) we
must overwrite the fields with actual values.

>
>>
>> The request line is untouched unless a non-NULL r is passed to
>> update_child_status_internal(), and ap_update_child_status_from_conn()
>> sets r to NULL.
>
>
> That was incorrect - to the extend that you've changed it since 2.4.1,
> such a change must be reverted.

I meant ap_update_child_status_from_conn() gives a NULL r to
update_child_status_internal(), so it won't change the existing
request line (as opposed to setting it to the NULL value in the
scoreboard).

>
> At the time we process a new connection (before, and again after we
> have parsed any SNI host handshake) - there is NO REQUEST.  Any
> lingering request value must be blanked out at that time.  Once the
> request URI is parsed we again invoke update_child_status, this time
> with the request_rec with a now-known request string.

So there is a window where some values need to be blanked or restored
from elsewhere, so to avoid mixing previous values of the worker with
new ones not yet available (see below).

>
>>
>> AIUI, what sets the request line to NULL in mpm_worker is when
>> ap_read_request() fails (mostly after connection closed remotely or
>> keep-alive timeout, actually any empty/NULL r->the_request from
>> read_request_line() would do it).
>> This may also happen in mpm_event but since the worker (scoreboard
>> entry) has probably changed in between, this is possibly less visible.
>
>
> You don't understand it.
>
> Envision this sequence, which is how the scoreboard was designed;
>
> Initialization -> entire score record is voided
>
> Connection -> score entry tagged READ, what is known of the
> connecting client and the target vhost is recorded
>
> Connection SNI -> score entry updated with new target vhost
>
> Request read -> score entry again updated with new target vhost,
> the request identifier is updated, score entry tagged WRITE
>
> Request body is read, Response body is prepared
>
> Request complete -> score entry tagged LOGGING, and NO
> other fields need to be updated
>
> Request logged, left in keepalive state -> score entry tagged
> KEEPALIVE, NO other fields need to be updated
>
> Connection closed -> score entry tagged IDLE, NO OTHER
> FIELDS SHOULD BE UPDATED until the worker is reused
> (even in the case of event MPM).

OK, so the worker scoreboard entry is up to date with the latest
connection/request handled.
Now let's consider the case in KEEAPALIVE state where a new request
arrives (possibly on another connection for mpm_event), or in
READY/closed state (for any MPM) where a new connection+request is
handled by the same worker.

When we transition to READ just before calling ap_read_request(), we
update the client IP in the scoreboard, but keep the request line
(from the previous request) untouched, mixed data...
For mpm_event this is a very transient state since the worker was
scheduled only because the connection is POLLIN ready, so
ap_read_request() won't block.
For mpm_worker though, this may last KeepAliveTimeout.
Moreover for both MPMs, if the connection is closed by the client the
worker score will stay as is until next (real) request is read on
another connection, or for mpm worker if keepalive expires the request
line will be set to the value NULL (unless patch below is applied).
That's what is observed in PR 59333, and I don't see where your commit
change this.
We need to either accept/document mixing, or blank the request line
(but it will be very noticeable by users as it is already in 2.4.20),
or restore the latest request line (and vhost) on the connection
(hence also store it by connection, in conn_rec).

>
>> So how about:
>>
>> Index: server/protocol.c
>> ===================================================================
>> --- server/protocol.c    (revision 1741108)
>> +++ server/protocol.c    (working copy)
>> @@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
>>              access_status = r->status;
>>              r->status = HTTP_OK;
>>              ap_die(access_status, r);
>> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
>> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
>> +                                   r->the_request ? r : NULL);
>
>
> That could be updated, or we could equally pass NULL always, but it doesn't
> impact the correctness of the proposed backport.

Or even do not update the status at all if !r->the_request (and/or
!r->the_request[0]?) since it will pass to CLOSING very soon.

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Wed, Apr 27, 2016 at 9:30 PM, William A Rowe Jr <wr...@rowe-clan.net>
wrote:

> On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
>>
>> So now, either we choose to not report workers' activity in between
>> requests (there are just multi-purpose workers/request-pollers after
>> all), or we save some "last request" informations in each conn_rec and
>> restore them on the worker (score) handling the connection at each
>> time (this is more reporting the workers' activity than the requests
>> lines, which will pass from one worker to the other, even possibly
>> duplicating requests lines with low activity).
>>
>
> If that slot does other work, it should be recorded.  But while that slot
> sits unused, the last activity for that score slot must be preserved.
>

Just to clarify the event case... our server is not free-threaded.  Once we
are in a request, we are threadbound.  The following request on that same
worker is likely to jump threads, and it's fine, my brief testing today
seems
to confirm, but I presented the patch to the followers of PR 59333 for
further
review and testing.

This will likely change in 3.0 (no, we aren't going to make requests
free-threaded without bumping the version major - that's too drastic
a behavior change.)  Once a *request* is jumping from thread to
thread, as opposed to the underlying connection, we are going to
have to rethink how the scoreboard presents that information and
how it is refreshed.

Re: svn commit: r1741310 - in /httpd/httpd/trunk: modules/http2/ server/ server/mpm/event/ server/mpm/motorz/ server/mpm/simple/ server/mpm/winnt/ server/mpm/worker/

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <yl...@gmail.com> wrote:

> I was offline today so couldn't comment on the different messages on
> the subject, so I'll try to summarize (here) my understanding, so
> far...
>
> On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
> > Author: wrowe
> > Date: Wed Apr 27 18:41:49 2016
> > New Revision: 1741310
> >
> > URL: http://svn.apache.org/viewvc?rev=1741310&view=rev
> > Log:
> >
> >   Ensure http2 follows http in the meaning of
> >   status WRITE (meaning 'in the request processing
> >   phase' even if still consuming the request body,
> >   not literally in a 'now writing' state).
>
> This is indeed consistent with how we report http1 state currently,
> but maybe it could be more intuitive to report READ until the body is
> consumed in http1 rather than changing http2?
> Unless we want to minimize scoreboard updates, for performance reasons...
>

Well, we always want to be considerate of performance.

That said, we absolutely must not change the semantic meanings of
the server-status results on the maintenance branch.  Change those
meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
behave as 2.4.x has behaved from the start.


> >   Ensure a number of MPMs and the h2 connection io
> >   no longer clobber the request status line during
> >   state-only changes.  While at it, clean up some
> >   very ugly formatting and unnecessary decoration,
> >   and avoid the wordy _from_conn() flavor when we
> >   are not passing a connection_rec.
>
> Why ap_update_child_status_from_conn(), given a real or NULL c, would
> clobber the request line?
>

Given a real conn_rec record, we have no request line.  Therefore the result
is NULL.  There is no purpose in modifying the conn_rec related fields once
correctly recorded.  The next chance they have for being modified in any
meaningful way is during an ssl renegotiation or the processing of the
Host:
and X-F-F: headers during the read_request phase.


> The request line is untouched unless a non-NULL r is passed to
> update_child_status_internal(), and ap_update_child_status_from_conn()
> sets r to NULL.
>

That was incorrect - to the extend that you've changed it since 2.4.1,
such a change must be reverted.

At the time we process a new connection (before, and again after we
have parsed any SNI host handshake) - there is NO REQUEST.  Any
lingering request value must be blanked out at that time.  Once the
request URI is parsed we again invoke update_child_status, this time
with the request_rec with a now-known request string.


> AIUI, what sets the request line to NULL in mpm_worker is when
> ap_read_request() fails (mostly after connection closed remotely or
> keep-alive timeout, actually any empty/NULL r->the_request from
> read_request_line() would do it).
> This may also happen in mpm_event but since the worker (scoreboard
> entry) has probably changed in between, this is possibly less visible.
>

You don't understand it.

Envision this sequence, which is how the scoreboard was designed;

Initialization -> entire score record is voided

Connection -> score entry tagged READ, what is known of the
connecting client and the target vhost is recorded

Connection SNI -> score entry updated with new target vhost

Request read -> score entry again updated with new target vhost,
the request identifier is updated, score entry tagged WRITE

Request body is read, Response body is prepared

Request complete -> score entry tagged LOGGING, and NO
other fields need to be updated

Request logged, left in keepalive state -> score entry tagged
KEEPALIVE, NO other fields need to be updated

Connection closed -> score entry tagged IDLE, NO OTHER
FIELDS SHOULD BE UPDATED until the worker is reused
(even in the case of event MPM).




So how about:
>
> Index: server/protocol.c
> ===================================================================
> --- server/protocol.c    (revision 1741108)
> +++ server/protocol.c    (working copy)
> @@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
>              access_status = r->status;
>              r->status = HTTP_OK;
>              ap_die(access_status, r);
> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
> +                                   r->the_request ? r : NULL);
>

That could be updated, or we could equally pass NULL always, but it doesn't
impact the correctness of the proposed backport.


>              ap_run_log_transaction(r);
>              r = NULL;
>              apr_brigade_destroy(tmp_bb);
>              goto traceout;
>          case HTTP_REQUEST_TIME_OUT:
> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, NULL);
>              if (!r->connection->keepalives)
>                  ap_run_log_transaction(r);
>              apr_brigade_destroy(tmp_bb);
> ?
>
>
> Regarding this commit, the goal seems more about not clobbering the
> *client IP* until we read the next/real request on each connection,
> but this implies that we report more the last requests handled by each
> worker than the workers states themselves...
> Isn't the access log more appropriate for that purpose finally,
> whereas the scoreboard is more about workers' activity?
>

The goal is to avoid clobbering any of the fields, including the request.
The patch satisfies that goal, the original behavior of voiding out the old
request line during any _from_conn() update should be restored as you
just identified above.

You are missing the point, the scoreboard is not to be abused until that
worker has *new* work to do, and even the previous request on that worker
needs to persist in the keepalive case until a new request arrives on that
connection.


> However I agree that without doing something like this (I tried a
> different approach in PR 59333, not working yet... but it could with
> more work IMHO), with mpm_event we may find mixed data (from different
> requests/connections) in the same scoreboard entry :/
>

This is why it is important to only record the client/vhost/request datum
at the time they are first known, and not to keep tickling those values,
for no effect.


> I think Stefan tried to address this (observed by testing http2
> scores?) in 2.4.20 by blanking the fields before reading each request
> (unfortunately that broke the usual reporting, but I think the
> intention was laudable).
>

Not questioning anyone's motives, as Stefan points out, the design
of the scoreboard isn't blatantly obvious.  My patch would have also
corrected the unclear doxygen description, but there is not such
description as we are all aware :)

So now, either we choose to not report workers' activity in between
> requests (there are just multi-purpose workers/request-pollers after
> all), or we save some "last request" informations in each conn_rec and
> restore them on the worker (score) handling the connection at each
> time (this is more reporting the workers' activity than the requests
> lines, which will pass from one worker to the other, even possibly
> duplicating requests lines with low activity).
>

If that slot does other work, it should be recorded.  But while that slot
sits unused, the last activity for that score slot must be preserved.