Posted to dev@httpd.apache.org by Stefan Eissing <st...@greenbytes.de> on 2015/06/22 14:48:53 UTC

module configs across (pseudo) connections

Eric, thanks for the help! When enabling mod_logio it became immediately clear that mod_h2 wrongly prevented some pre_connection hooks from running. mod_logio, however, expects its allocated module config to be there when a request gets cleaned up... So, with v0.7.2 all pre_conn hooks are run again and mod_logio is part of my test setup now.

This raises the issue of how to handle module configurations in pseudo connections properly. There seem to be two approaches:
a) treat pseudo connections like real ones -> run all connection hooks 
b) treat them as "shadows" of the real connection -> copy module configs

While a) is the least dangerous, it gives a false impression about the properties of a connection. For example, mod_h2 currently copies over the mod_ssl config, so that SSL variables are available during request processing on pseudo connections. On the other hand, code is not really prepared for b), since this means that many threads may operate on the same module config.

So, mod_h2 follows a) for now (with the exception of mod_ssl). A future proposal for pseudo connections will need to reevaluate this.
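
A minimal sketch of the two approaches and of the mixed variant described above, under loud assumptions: `conn`, `N_MODULES`, `SSL_IDX`, `fresh_config_hook` and the `setup_*` functions are hypothetical names invented for illustration, not httpd or mod_h2 API. Sharing the SSL slot by pointer is also a simplification; the mail only says that the config is copied over.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N_MODULES 4   /* hypothetical: number of modules with connection configs */
#define SSL_IDX   2   /* hypothetical: slot used by the SSL module */

/* Simplified stand-in for a connection record: one config slot per module. */
typedef struct conn {
    void *conn_config[N_MODULES];
} conn;

/* Stand-in for a pre_connection hook that allocates the module's own state. */
typedef void (*pre_conn_hook)(conn *c, int slot);

void fresh_config_hook(conn *c, int slot)
{
    /* a real module would allocate from the connection pool instead */
    c->conn_config[slot] = calloc(1, sizeof(long));
}

/* (a) treat the pseudo connection like a real one: run every hook */
void setup_like_real(conn *pseudo, pre_conn_hook hooks[N_MODULES])
{
    assert(pseudo != NULL);
    for (int i = 0; i < N_MODULES; i++) {
        if (hooks[i] != NULL) hooks[i](pseudo, i);
    }
}

/* (b) treat it as a shadow: every slot points at the master's config,
 * so several threads may end up working on the same module config */
void setup_as_shadow(conn *pseudo, const conn *master)
{
    assert(pseudo != NULL && master != NULL);
    memcpy(pseudo->conn_config, master->conn_config,
           sizeof pseudo->conn_config);
}

/* (a) with one exception: the SSL slot comes from the master */
void setup_mixed(conn *pseudo, const conn *master,
                 pre_conn_hook hooks[N_MODULES])
{
    assert(pseudo != NULL && master != NULL);
    for (int i = 0; i < N_MODULES; i++) {
        if (i == SSL_IDX) {
            pseudo->conn_config[i] = master->conn_config[i];
        } else if (hooks[i] != NULL) {
            hooks[i](pseudo, i);
        }
    }
}
```

With `setup_mixed`, a module like mod_logio gets its own state per pseudo connection, while a lookup through the SSL slot sees the master's state.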

Cheers,
   Stefan

> On 22.06.2015 at 14:23, Eric Covener <co...@gmail.com> wrote:
> 
> On Mon, Jun 22, 2015 at 7:38 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> Thanks, now I see what you mean. What I do not understand:
>> - why is this EOR processed too early?
> 
> Usually it is at the end of a brigade and doesn't get cleaned up until
> all of the data is written. But the copy and delete causes the cleanup
> to run while you're iterating over the brigade to copy it in advance
> of writing.
> 
>> - what is causing the segfault in ap_run_log_transaction()?
> 
> I don't know. I would have guessed running it early would only impact
> something later.
> 
>> - and why am I seeing no errors on my system? Is this a configuration issue with logging?
> 
> Looks like you figured this out  -- must have mod_logio plus its %B or
> whatever in your LogFormat.
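
The ordering problem Eric describes for the EOR (end-of-request) bucket can be sketched roughly as follows. This is a toy model, not APR or httpd code: `bucket`, `request`, `bucket_destroy`, `write_out` and the two driver functions are hypothetical names, and the sketch only shows that the cleanup runs before the data is written. It does not explain the segfault, whose cause is left open in the exchange above.

```c
#include <assert.h>
#include <stddef.h>

enum kind { DATA, EOR };

typedef struct request {
    int bytes_written;   /* what has actually gone out */
    int logged_bytes;    /* what the log hook saw when it ran */
    int logged;          /* set once the request cleanup has run */
} request;

typedef struct bucket {
    enum kind kind;
    int len;
    request *r;
} bucket;

/* Destroying the EOR bucket ends the request; logging runs from here. */
void bucket_destroy(bucket *b)
{
    assert(b != NULL && b->r != NULL);
    if (b->kind == EOR && !b->r->logged) {
        b->r->logged = 1;
        b->r->logged_bytes = b->r->bytes_written;
    }
}

/* Stand-in for writing a brigade to the network. */
void write_out(const bucket *bb, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (bb[i].kind == DATA) bb[i].r->bytes_written += bb[i].len;
    }
}

/* Problem case: copy each bucket, delete the original, write afterwards. */
void copy_delete_then_write(bucket *src, size_t n, bucket *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        bucket_destroy(&src[i]);   /* EOR cleanup fires here, before any write */
    }
    write_out(dst, n);
}

/* Usual case: the EOR is destroyed only after the data ahead of it is written. */
void write_then_destroy(bucket *src, size_t n)
{
    write_out(src, n);
    for (size_t i = 0; i < n; i++) bucket_destroy(&src[i]);
}
```

In the first variant the log hook sees zero bytes written, which matches the remark that the failure only shows up when something like mod_logio actually inspects per-request byte counts.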

<green/>bytes GmbH
Hafenweg 16, 48155 Münster, Germany
Phone: +49 251 2807760. Amtsgericht Münster: HRB5782




Re: module configs across (pseudo) connections

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Jun 24, 2015 8:39 AM, "Eric Covener" <co...@gmail.com> wrote:
>
> On Wed, Jun 24, 2015 at 9:26 AM, Graham Leggett <mi...@sharp.fm> wrote:
> > I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).
>
> There are some basics for this in trunk from jim. I think if the slave
> connections had their own bucket_alloc it might remove some of the
> copying in h2.

This will be the best-case solution. Even 'connection-level' filters could
be applied before 'network transport' filters are invoked.

This would have simplified mod_ftp data channel connection mechanics as
well.

Re: module configs across (pseudo) connections

Posted by Stefan Eissing <st...@greenbytes.de>.
> On 24.06.2015 at 16:14, Eric Covener <co...@gmail.com> wrote:
> 
> On Wed, Jun 24, 2015 at 10:07 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> Hmm, yes, well. It's the thought that counts... ;-)
>> 
>> I think this will not be enough, though, if I understood the failures of my various attempts correctly. But it will certainly be good if more heads than one have a go at this.
>> 
>> Let Tm := main thread, Tw := worker thread, TmB() the main connection bucket brigade, TwB() the worker (request) brigade.
>> 
>> When a response from a worker starts:
>>   TmB( , , , , , , , )       TwB(b1,b2,b3, , , , )
>> so we move buckets across threads
>>   TmB(b1,b2, , , , , , )       TwB(b3, , , , , , )
>> and send b1 out and new response data arrives
>>   TmB(b2, , , , , , , )       TwB(b3,b4, , , , , )
>> we have the destruction of b1 and the creation of b4 that go against the same bucket_alloc_t instance from two threads.
>> 
>> Similar operations happen when b1 needs to be split or is a file bucket that gets read. So refraining from destroying buckets in the main thread is not enough.
>> 
>> Have I missed something here?
> 
> 
> One thing I missed was that the httpd thread did the writing vs the h2
> thread. I thought the workers wrote but were serialized by the httpd
> thread.

I see. I was worried about too many thread switches, waits and synchronization points during main connection writes. For streams of the same priority, h2 preferably wants frames from those streams to go out round-robin.

Ideally, I thought, for static file resources at least, the httpd thread would have all the file buckets in its buffers and read from them directly. With that goal in mind, I thought about moving things from h2 thread to httpd.
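
The round-robin preference for streams of equal priority can be sketched as below. This is a simplified illustration and not mod_h2's scheduler: `stream`, `frame` and `schedule_round_robin` are hypothetical names, priorities and flow-control windows are ignored, and `FRAME_MAX` uses the HTTP/2 default maximum frame size of 16384 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_MAX 16384   /* HTTP/2 default SETTINGS_MAX_FRAME_SIZE */

typedef struct stream {
    int id;
    size_t pending;   /* response bytes still waiting to be sent */
} stream;

typedef struct frame {
    int stream_id;
    size_t len;
} frame;

/* Emit at most one frame per stream per pass until nothing is pending
 * or the output array is full. Returns the number of frames produced. */
size_t schedule_round_robin(stream *streams, size_t n,
                            frame *out, size_t max_out)
{
    assert(streams != NULL && out != NULL);
    size_t count = 0;
    int progress = 1;

    while (progress && count < max_out) {
        progress = 0;
        for (size_t i = 0; i < n && count < max_out; i++) {
            if (streams[i].pending == 0) continue;
            size_t len = streams[i].pending < FRAME_MAX
                       ? streams[i].pending : FRAME_MAX;
            out[count].stream_id = streams[i].id;
            out[count].len = len;
            count++;
            streams[i].pending -= len;
            progress = 1;
        }
    }
    return count;
}
```

Doing this interleaving on the main connection thread is what keeps it from waiting on each worker in turn, which is the concern about thread switches mentioned above.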

> -- 
> Eric Covener
> covener@gmail.com





Re: module configs across (pseudo) connections

Posted by Eric Covener <co...@gmail.com>.
On Wed, Jun 24, 2015 at 10:07 AM, Stefan Eissing
<st...@greenbytes.de> wrote:
> Hmm, yes, well. It's the thought that counts... ;-)
>
> I think this will not be enough, though, if I understood the failures of my various attempts correctly. But it will certainly be good if more heads than one have a go at this.
>
> Let Tm := main thread, Tw := worker thread, TmB() the main connection bucket brigade, TwB() the worker (request) brigade.
>
> When a response from a worker starts:
>    TmB( , , , , , , , )       TwB(b1,b2,b3, , , , )
> so we move buckets across threads
>    TmB(b1,b2, , , , , , )       TwB(b3, , , , , , )
> and send b1 out and new response data arrives
>    TmB(b2, , , , , , , )       TwB(b3,b4, , , , , )
> we have the destruction of b1 and the creation of b4 that go against the same bucket_alloc_t instance from two threads.
>
> Similar operations happen when b1 needs to be split or is a file bucket that gets read. So refraining from destroying buckets in the main thread is not enough.
>
> Have I missed something here?


One thing I missed was that the httpd thread did the writing vs the h2
thread. I thought the workers wrote but were serialized by the httpd
thread.

-- 
Eric Covener
covener@gmail.com

Re: module configs across (pseudo) connections

Posted by Graham Leggett <mi...@sharp.fm>.
On 24 Jun 2015, at 4:07 PM, Stefan Eissing <st...@greenbytes.de> wrote:

> Hmm, yes, well. It's the thought that counts... ;-)
> 
> I think this will not be enough, though, if I understood the failures of my various attempts correctly. But it will certainly be good if more heads than one have a go at this.
> 
> Let Tm := main thread, Tw := worker thread, TmB() the main connection bucket brigade, TwB() the worker (request) brigade.
> 
> When a response from a worker starts:
>   TmB( , , , , , , , )       TwB(b1,b2,b3, , , , )
> so we move buckets across threads
>   TmB(b1,b2, , , , , , )       TwB(b3, , , , , , )
> and send b1 out and new response data arrives
>   TmB(b2, , , , , , , )       TwB(b3,b4, , , , , )
> we have the destruction of b1 and the creation of b4 that go against the same bucket_alloc_t instance from two threads.
> 
> Similar operations happen when b1 needs to be split or is a file bucket that gets read. So refraining from destroying buckets in the main thread is not enough.
> 
> Have I missed something here?

In theory, a bucket should only be considered by one thread at a time.

A brigade containing buckets may be worked on by a different thread, but this should never happen concurrently with another thread.

Am I right in understanding that the problem comes about when there is a transition of a bucket from the “slave” connection to the “master” connection?

In theory, we would solve any threading issues if the “slave” connection was terminated by a filter that collected buckets and set them aside in a brigade, and the “master” connection started with a filter that collected buckets from the various brigades and then sent them down the rest of the filter chain.

We would probably need some kind of mechanism for flow control, so that we could temporarily suspend a “slave” connection until the master is ready to receive more data from that connection, but this should be straightforward.
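
A rough sketch of such a hand-off point with simple flow control, assuming a fixed-capacity queue between one slave and the master. All names (`handoff`, `HANDOFF_CAP`, `handoff_push`, `handoff_drain`) are hypothetical, integers stand in for set-aside buckets, and the sketch leaves out the locking that a real cross-thread hand-off would need around both functions.

```c
#include <assert.h>
#include <stddef.h>

#define HANDOFF_CAP 4   /* hypothetical limit on set-aside items per slave */

typedef struct handoff {
    int items[HANDOFF_CAP];   /* stand-ins for set-aside buckets */
    size_t head;              /* index of the oldest item */
    size_t count;             /* number of items currently queued */
} handoff;

/* Slave side (last filter on the slave connection).
 * Returns 1 if the item was accepted, 0 if the slave should suspend
 * until the master has drained some data. */
int handoff_push(handoff *h, int item)
{
    assert(h != NULL);
    if (h->count == HANDOFF_CAP) return 0;
    h->items[(h->head + h->count) % HANDOFF_CAP] = item;
    h->count++;
    return 1;
}

/* Master side (first filter on the master connection).
 * Takes up to max items in FIFO order and returns how many it took. */
size_t handoff_drain(handoff *h, int *out, size_t max)
{
    assert(h != NULL && out != NULL);
    size_t n = 0;
    while (n < max && h->count > 0) {
        out[n++] = h->items[h->head];
        h->head = (h->head + 1) % HANDOFF_CAP;
        h->count--;
    }
    return n;
}
```

The point of the design is that each bucket is only ever touched by one side at a time: the slave until it is pushed, the master after it is drained.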

I was working on a patch last year that allowed any filter to set aside buckets and return early before all data was processed. This may help you to implement the “slave to master” filters:

http://mail-archives.apache.org/mod_mbox/httpd-dev/201409.mbox/%3C04776F50-0A53-4C1D-AA1C-9F57B60BC0AC@sharp.fm%3E

Regards,
Graham
—


Re: module configs across (pseudo) connections

Posted by Stefan Eissing <st...@greenbytes.de>.
> On 24.06.2015 at 15:50, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
> 
>> On Jun 24, 2015, at 9:39 AM, Eric Covener <co...@gmail.com> wrote:
>> 
>> On Wed, Jun 24, 2015 at 9:26 AM, Graham Leggett <mi...@sharp.fm> wrote:
>>> I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).
>> 
>> There are some basics for this in trunk from jim. I think if the slave
>> connections had their own bucket_alloc it might remove some of the
>> copying in h2.
> 
> +1... that's the thought at least ;)

Hmm, yes, well. It's the thought that counts... ;-)

I think this will not be enough, though, if I understood the failures of my various attempts correctly. But it will certainly be good if more heads than one have a go at this.

Let Tm := main thread, Tw := worker thread, TmB() the main connection bucket brigade, TwB() the worker (request) brigade.

When a response from a worker starts:
   TmB( , , , , , , , )       TwB(b1,b2,b3, , , , )
so we move buckets across threads
   TmB(b1,b2, , , , , , )       TwB(b3, , , , , , )
and send b1 out and new response data arrives
   TmB(b2, , , , , , , )       TwB(b3,b4, , , , , )
we have the destruction of b1 and the creation of b4 that go against the same bucket_alloc_t instance from two threads.

Similar operations happen when b1 needs to be split or is a file bucket that gets read. So refraining from destroying buckets in the main thread is not enough.

Have I missed something here?
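
The three steps above can be replayed in a small single-threaded model that only records which thread touches the allocator. It is a sketch under stated assumptions: `bucket_alloc`, `bucket`, `bucket_create`, `bucket_destroy` and `replay_response` are hypothetical stand-ins, not the APR functions, and the thread is passed in as a plain integer instead of running real threads.

```c
#include <assert.h>
#include <stddef.h>

enum { TM = 0, TW = 1 };   /* main thread, worker thread */

typedef struct bucket_alloc {
    int owner;         /* thread that created the allocator */
    int live;          /* buckets currently allocated from it */
    int foreign_ops;   /* alloc/free calls made by another thread */
} bucket_alloc;

typedef struct bucket {
    bucket_alloc *alloc;   /* every bucket remembers its allocator */
} bucket;

bucket bucket_create(bucket_alloc *a, int thread)
{
    assert(a != NULL);
    if (thread != a->owner) a->foreign_ops++;
    a->live++;
    bucket b = { a };
    return b;
}

void bucket_destroy(bucket *b, int thread)
{
    assert(b != NULL && b->alloc != NULL);
    if (thread != b->alloc->owner) b->alloc->foreign_ops++;
    b->alloc->live--;
    b->alloc = NULL;
}

/* Replays the steps from the mail on a worker-owned allocator. */
void replay_response(bucket_alloc *wa)
{
    /* TwB(b1,b2,b3): the worker creates the first response buckets */
    bucket b1 = bucket_create(wa, TW);
    bucket b2 = bucket_create(wa, TW);
    bucket b3 = bucket_create(wa, TW);

    /* b1 and b2 move to TmB: no allocator call, they still point at wa */

    /* the main thread sends b1 and destroys it ... */
    bucket_destroy(&b1, TM);
    /* ... while the worker creates b4 from the same allocator */
    bucket b4 = bucket_create(wa, TW);

    (void)b2; (void)b3; (void)b4;
}
```

The model counts one foreign operation on the worker's allocator. With real threads, that destroy and the creation of b4 could run at the same time on an allocator that is not built for concurrent use.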





Re: module configs across (pseudo) connections

Posted by Jim Jagielski <ji...@jaguNET.com>.
> On Jun 24, 2015, at 9:39 AM, Eric Covener <co...@gmail.com> wrote:
> 
> On Wed, Jun 24, 2015 at 9:26 AM, Graham Leggett <mi...@sharp.fm> wrote:
>> I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).
> 
> There are some basics for this in trunk from jim. I think if the slave
> connections had their own bucket_alloc it might remove some of the
> copying in h2.

+1... that's the thought at least ;)

Re: module configs across (pseudo) connections

Posted by Eric Covener <co...@gmail.com>.
On Wed, Jun 24, 2015 at 9:26 AM, Graham Leggett <mi...@sharp.fm> wrote:
> I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).

There are some basics for this in trunk from jim. I think if the slave
connections had their own bucket_alloc it might remove some of the
copying in h2.

-- 
Eric Covener
covener@gmail.com

Re: module configs across (pseudo) connections

Posted by Graham Leggett <mi...@sharp.fm>.
On 24 Jun 2015, at 3:58 PM, Stefan Eissing <st...@greenbytes.de> wrote:

> Totally agree. That is why it is not implemented like that. With the side effect that mod_logio, for example, does not aggregate data for the main connection.
> 
> The only exception in the current implementation is mod_ssl. mod_h2 copies that one to the slave connection, so that ssl_var_lookup() works.

This may break if a change was made to mod_ssl and structure sizes changed.

If there was something like c->main, we could teach mod_ssl how to find the “root” connection so that ssl_var_lookup() “just works” without copying.
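
A minimal sketch of that idea, assuming a hypothetical `main` pointer on the connection that is NULL for a real connection. The `conn` struct, `root_conn` and `lookup_ssl_protocol` are invented for illustration and are not the mod_ssl or httpd API; a single string stands in for the SSL module's per-connection state.

```c
#include <assert.h>
#include <stddef.h>

typedef struct conn {
    struct conn *main;          /* NULL for a real (root) connection */
    const char *ssl_protocol;   /* stand-in for SSL state, set on the root only */
} conn;

/* Walk up the chain of subconnections to the real connection. */
conn *root_conn(conn *c)
{
    assert(c != NULL);
    while (c->main != NULL) c = c->main;
    return c;
}

/* A lookup that "just works" on a subconnection, without copying state. */
const char *lookup_ssl_protocol(conn *c)
{
    return root_conn(c)->ssl_protocol;
}
```

Because nothing is copied, a change in the size or layout of the SSL module's structures would not affect the subconnection.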

Regards,
Graham
—


Re: module configs across (pseudo) connections

Posted by Stefan Eissing <st...@greenbytes.de>.
> On 24.06.2015 at 15:26, Graham Leggett <mi...@sharp.fm> wrote:
> 
> I would argue that the copying option above is most dangerous, as this has side effects that may not be catered for by pool cleanups.

Totally agree. That is why it is not implemented like that. With the side effect that mod_logio, for example, does not aggregate data for the main connection.

The only exception in the current implementation is mod_ssl. mod_h2 copies that one to the slave connection, so that ssl_var_lookup() works.

> I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).

Sounds good. Then mod_ssl/mod_logio can treat these as they see fit.

> If you run into any blockers while doing this I would argue those blockers would be bugs and we want to fix them.

Cheers,
  Stefan





Re: module configs across (pseudo) connections

Posted by Graham Leggett <mi...@sharp.fm>.
On 22 Jun 2015, at 2:48 PM, Stefan Eissing <st...@greenbytes.de> wrote:

> Eric, thanks for the help! When enabling mod_logio it became immediately clear that mod_h2 wrongly prevented some pre_connection hooks from running. mod_logio, however, expects its allocated module config to be there when a request gets cleaned up... So, with v0.7.2 all pre_conn hooks are run again and mod_logio is part of my test setup now.
> 
> Which adds the issue about proper handling of module configurations in pseudo connections. There seem to be two approaches:
> a) treat pseudo connections like real ones -> run all connection hooks 
> b) treat them as "shadows" of the real connection -> copy module configs
> 
> While a) is the least dangerous, it gives a false impression about the properties of a connection. For example, mod_h2 currently copies over the mod_ssl config, so that SSL variables are available during request processing on pseudo connections. On the other hand, code is not really prepared for b), since this means that many threads may operate on the same module config.
> 
> So, mod_h2 follows a) for now (with the exception of mod_ssl). A future proposal for pseudo connections will need to reevaluate this.

I would argue that the copying option above is most dangerous, as this has side effects that may not be catered for by pool cleanups.

I believe we should be treating the “pseudo” connections as real connections, and perhaps by linking a “subconnection” to a “connection” (c->main) in the same way we currently link a subrequest to a request (r->main).

If you run into any blockers while doing this I would argue those blockers would be bugs and we want to fix them.

Regards,
Graham
—


Re: module configs across (pseudo) connections

Posted by Eric Covener <co...@gmail.com>.
We should probably note somewhere that mod_logio won't log meaningful
results for h2 connections (because it tracks I/O at the connection
level).

-- 
Eric Covener
covener@gmail.com