
Posted to dev@httpd.apache.org by rb...@covalent.net on 2001/01/21 08:12:59 UTC

Apache 2.0 beta STATUS

This is a simple STATUS message for the beta.

1)  We are currently running the entire Apache.org site on Apache 2.0 on
port 8092.  I am talking to Brian about the steps required to move this to
port 80.  I fully expect this to happen soon-ish, assuming we can prove
that we are stable now.  The biggest stumbling block is that Apache 2.0
HEAD is not running on any known production server, and the last time we
turned it on live on apache.org, we took the machine down.  That
experience may take a little while for some people to get over.

2)  We have 1 patch that must be committed before we go beta.  That is the
ap_r* performance patch.  There are two patches that have been submitted,
and I asked for a vote to end tomorrow morning.  Unfortunately, nobody has
voted for either patch at this point.  I am asking that people review
these patches, and vote for one of them.  I had planned to commit one of
the patches tomorrow afternoon.  I will not commit either patch until at
least three people (other than Greg and me) vote for a patch.  Please
review them and ask any questions you might have.

3)  That's it.  We are still in feature freeze, and I suspect that there
are bugs on non-Unix platforms that still need to be worked out.  I don't
believe that there are any more _features_ that MUST be committed before
we go beta.

I would really like to move this forward now.  It would be REALLY cool to
get the beta out before the end of the month if that is at all
possible.  I am willing to do whatever is necessary to make that happen.

Ryan
_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> > > But now, we have doug's patch.  Without line-by-line pulling me out of my
> > > little filesystem world just now, would anyone care to comment on the
> > > merits of Doug's v.s. rbb's v.s. gstein's patches?  Maybe the six paragraph
> > > executive summary :-?
> > 
> > Doug's patch is not complete, and it wasn't meant to be.
> >...
> 
> Right.
> 
> Doug's patch is a *third* output mechanism. It doesn't deal with ap_r*
> performance at all.

Exactly.  When Doug wrote his patch, he was concerned about mod_perl's
performance and the fact that mod_perl was creating a lot of very small
buckets.  He can use ap_r*, assuming he is actually dealing with a handler,
but if he is writing a filter he can't.  This is one of my concerns with
the filter mechanism.  It doesn't deal with issues like mod_perl, so
mod_perl, mod_snake, php, etc, are all likely to have buffering code.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: Apache 2.0 beta STATUS

Posted by Greg Stein <gs...@lyra.org>.

On Sun, Jan 21, 2001 at 08:10:00PM -0800, rbb@covalent.net wrote:
>...
> > But now, we have doug's patch.  Without line-by-line pulling me out of my
> > little filesystem world just now, would anyone care to comment on the
> > merits of Doug's v.s. rbb's v.s. gstein's patches?  Maybe the six paragraph
> > executive summary :-?
> 
> Doug's patch is not complete, and it wasn't meant to be.
>...

Right.

Doug's patch is a *third* output mechanism. It doesn't deal with ap_r*
performance at all.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 02:17 PM 01/22/2001, Greg Stein wrote:
>On Mon, Jan 22, 2001 at 02:10:44PM -0500, Greg Marr wrote:
> > At 01:51 PM 01/22/2001, Greg Stein wrote:
> > >On Mon, Jan 22, 2001 at 01:16:23PM -0500, Greg Marr wrote:
> > > > At 06:36 AM 01/22/2001, Greg Stein wrote:
> > > >...
> > > > >An answer is to tell everybody "you must use r->bb, and never your
> > > > >own brigade."
> > > >
> > > > A better answer is to tell people to ap_rflush() before calling
> > > > functions they don't control that may generate output, and to call
> > > > ap_rflush after generating output using ap_r* if the function may be
> > > > called by functions they don't control.
> > >
> > >This would be inadvisable.  ap_rflush() will deliver to the
> > >network.  The hope is to synchronize the ordering, rather than
> > >generate network packets :-)
> >
> > Okay, so I guess that apr_brigade_flush() would be a better call to
> > use, but that does bring up the question I had in my other message.  :)
>
>Which question? It's hard to keep track, and I read all this stuff. 
>Our audience viewers could be even more confused :-)

Sorry: the question was whether to use ap_rflush() or apr_brigade_flush()
in any given circumstance.

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"

Re: the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Stein <gs...@lyra.org>.

On Mon, Jan 22, 2001 at 02:10:44PM -0500, Greg Marr wrote:
> At 01:51 PM 01/22/2001, Greg Stein wrote:
> >On Mon, Jan 22, 2001 at 01:16:23PM -0500, Greg Marr wrote:
> > > At 06:36 AM 01/22/2001, Greg Stein wrote:
> > >...
> > > >An answer is to tell everybody "you must use r->bb, and never your
> > > >own brigade."
> > >
> > > A better answer is to tell people to ap_rflush() before calling
> > > functions they don't control that may generate output, and to call
> > > ap_rflush after generating output using ap_r* if the function may be
> > > called by functions they don't control.
> >
> >This would be inadvisable.  ap_rflush() will deliver to the 
> >network.  The hope is to synchronize the ordering, rather than 
> >generate network packets :-)
> 
> Okay, so I guess that apr_brigade_flush() would be a better call to 
> use, but that does bring up the question I had in my other message.  :)

Which question? It's hard to keep track, and I read all this stuff. Our
audience viewers could be even more confused :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 01:51 PM 01/22/2001, Greg Stein wrote:
>On Mon, Jan 22, 2001 at 01:16:23PM -0500, Greg Marr wrote:
> > At 06:36 AM 01/22/2001, Greg Stein wrote:
> >...
> > >An answer is to tell everybody "you must use r->bb, and never your
> > >own brigade."
> >
> > A better answer is to tell people to ap_rflush() before calling
> > functions they don't control that may generate output, and to call
> > ap_rflush after generating output using ap_r* if the function may be
> > called by functions they don't control.
>
>This would be inadvisable.  ap_rflush() will deliver to the 
>network.  The hope is to synchronize the ordering, rather than 
>generate network packets :-)

Okay, so I guess that apr_brigade_flush() would be a better call to 
use, but that does bring up the question I had in my other message.  :)

The rest of your message did help clear a few other things up, thanks.

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"

Re: the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Stein <gs...@lyra.org>.

On Mon, Jan 22, 2001 at 01:16:23PM -0500, Greg Marr wrote:
> At 06:36 AM 01/22/2001, Greg Stein wrote:
>...
> >An answer is to tell everybody "you must use r->bb, and never your 
> >own brigade."
> 
> A better answer is to tell people to ap_rflush() before calling 
> functions they don't control that may generate output, and to call 
> ap_rflush after generating output using ap_r* if the function may be 
> called by functions they don't control.

This would be inadvisable. ap_rflush() will deliver to the network. The hope
is to synchronize the ordering, rather than generate network packets :-)

> >By using a filter in my patch, I've trapped all output to the 
> >network.  Nothing can go without passing through that 
> >filter.  Therefore, I have a perfect choke point to ensure that I 
> >can order all the output properly.
> 
> Actually, all the output better be ordered properly before it gets to 
> your filter, since your filter can be pushed down the stack by 
> another filter inserting itself higher up.

Apache output is ordered by its entry to the output filter chain. There is
no other definition. It is that chain that transmits it onto the network, so
it must be correct on entry.

[ there may be no filters other than the core_output_filter ]

So, yes: your statement is correct. But it isn't anything new. :-)

> Is there ever the possibility of a filter being inserted into the 
> stack after the first ap_r* call?  If so, and that filter gets placed 
> before your filter, won't that totally hose any data currently in the 
> output filter's buffer?

That is a similar situation to the case where the ap_r* went to the network
before the new filter was inserted.

Basically: dynamic insertion of filters, while content is being generated,
is always a tricky business. The general rule of thumb is that a filter can
only insert another filter *after* itself. But even then, we make no
guarantees about whether that new filter has seen/processed all of the
content.

So, again: nothing new here. :-)

> >My patch has *zero* requirements on module authors.  Use whatever 
> >API you have been using or want to use.  It doesn't matter, and you 
> >don't ever have to worry about what somebody else is using.  There 
> >is no possibility for synchronization problems.
> 
> That's true, but it also has the possibility that the entire thing 
> can be turned off by a filter being inserted above it in the stack, 
> in which case Apache's back at its current performance.

Correct. We can always come up with a pathological case :-). Unless somebody
goes and constructs a filter that is specifically intended to precede the
OLD_WRITE filter, we're going to see all of the performance gains.
Someone has to specifically break it. If you truly believe this is a
possibility, then we can make an allowance within the AP_FTYPE constants for
more insurance. (personally, I don't see the situation arising, but am happy
to buy more insurance :-)

> >I'm all for creating an APR solution, but the number one priority is 
> >Apache.  If our solution also happens to work for APR, then 
> >great.  But I don't believe that we can necessarily say "this works 
> >for APR" and then wedge it into Apache and make module authors need 
> >to be aware of the various output mechanisms and compensate for 
> >potential ordering problems.
> 
> It removes any possibility of output ordering problems, but also adds 
> a possibility of a filter ordering problem rendering it ineffective.

You need to work at it for that to happen :-). There are no filters that are
defined to precede AP_FTYPE_CONTENT. Again, we can make an allowance if
this is deemed a *true* risk. I'm +0 for an AP_FTYPE change to clarify.

> >*) ap_rwrite(my_buffer, 100000, r)
> 
> That one's pretty nasty.

This is what mod_perl, mod_python, etc have done to this point. Content
generators that have large content have been using ap_rwrite. We can fix
those, but the point is to keep the ap_r* APIs working so that we don't
create too high of a bar for module authors.

> >*) ap_rprintf(r, "<D:compare-report>%s</D:compare-report>", report)
> >    report == 200k string
> 
> As is this one.

hehe... Note that the above construct actually works very well in 1.3. As
the vprintf() is performed, it is flushing directly to the network. The
working set characteristics are great (a fixed-size formatting buffer).
Point being that (since it has nice perf characteristics) it is out there in
module code today.

My patch already handles the first, and can have a fixed-size working set
for the second. (both Ryan and I will need to do the vformatter stuff to
properly deal with ap_[v]rprintf; we haven't bothered with our patches since
that is a straightforward post-patch)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 06:36 AM 01/22/2001, Greg Stein wrote:
>*) assume it used ap_r*. There will be content sitting in r->bb. I construct
>    my brigade for other pieces of the MERGE response and call
>    ap_pass_brigade(). I've now misordered the response because my brigade
>    went before the stuff sitting in r->bb.
>
>*) let's say that I send some data with ap_r* and then call the DAV utility
>    function. assume it uses ap_pass_brigade(). It constructs the brigade and
>    passes it. Again: a misorder occurs because the brigade from the utility
>    went out before the r->bb brigade.
>
>An answer is to tell everybody "you must use r->bb, and never your 
>own brigade."

A better answer is to tell people to ap_rflush() before calling 
functions they don't control that may generate output, and to call 
ap_rflush after generating output using ap_r* if the function may be 
called by functions they don't control.

>By using a filter in my patch, I've trapped all output to the 
>network.  Nothing can go without passing through that 
>filter.  Therefore, I have a perfect choke point to ensure that I 
>can order all the output properly.

Actually, all the output better be ordered properly before it gets to 
your filter, since your filter can be pushed down the stack by 
another filter inserting itself higher up.

Is there ever the possibility of a filter being inserted into the 
stack after the first ap_r* call?  If so, and that filter gets placed 
before your filter, won't that totally hose any data currently in the 
output filter's buffer?

>My patch has *zero* requirements on module authors.  Use whatever 
>API you have been using or want to use.  It doesn't matter, and you 
>don't ever have to worry about what somebody else is using.  There 
>is no possibility for synchronization problems.

That's true, but it also has the possibility that the entire thing 
can be turned off by a filter being inserted above it in the stack, 
in which case Apache's back at its current performance.

>I'm all for creating an APR solution, but the number one priority is 
>Apache.  If our solution also happens to work for APR, then 
>great.  But I don't believe that we can necessarily say "this works 
>for APR" and then wedge it into Apache and make module authors need 
>to be aware of the various output mechanisms and compensate for 
>potential ordering problems.

It removes any possibility of output ordering problems, but also adds 
a possibility of a filter ordering problem rendering it ineffective.

>*) ap_rwrite(my_buffer, 100000, r)

That one's pretty nasty.

>*) ap_rprintf(r, "<D:compare-report>%s</D:compare-report>", report)
>    report == 200k string

As is this one.

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"

the ap_r* thing (was: Re: Apache 2.0 beta STATUS)

Posted by Greg Stein <gs...@lyra.org>.

On Sun, Jan 21, 2001 at 11:56:55PM -0600, William A. Rowe, Jr. wrote:
> > From: Greg Stein [mailto:gstein@lyra.org]
> > Sent: Sunday, January 21, 2001 11:26 PM
> > 
> > On Sun, Jan 21, 2001 at 09:58:20PM -0600, William A. Rowe, Jr. wrote:
> > > As I say, I'm for the module author controlling the unknowns with as little
> > > possible chance for interference from 'invisible forces'.
> > 
> > The invisible forces in this case is the requirements imposed by forcing a
> > module author and even the core Apache code to have to synchronize between
> > two totally different output mechanisms. Each coder must be aware of all the
> > other bits of code which could potentially produce output.  Even worse, they
> > need to know *how* that other code did it. Not just "did they?" but "how?".
> > That kind of invisible coupling across the code base is a recipe for bugs.
> > Insert a sync call here, insert one there. Oops, we forgot to sync the two
> > mechanism over there, too. Oy! :-)
> 
> How is this possible unless we invoke a subrequest?

You've got a DAV plugin. You are implementing the MERGE method. Within its
response definition is a way to return property information. So your code
calls dav_send_propstat().

Now, without looking, how can you tell what output mechanism was used? Was
it ap_r* or was it ap_pass_brigade()? If it doesn't match what I'm using,
then I'm going to need to do some extra work.

[ the mixing problem only occurs with Ryan's patch, so I need to use it here
  as a demonstration of the mixing problem; not to pick on the patch ]

To clarify:

*) assume it used ap_r*. There will be content sitting in r->bb. I construct
   my brigade for other pieces of the MERGE response and call
   ap_pass_brigade(). I've now misordered the response because my brigade
   went before the stuff sitting in r->bb.

*) let's say that I send some data with ap_r* and then call the DAV utility
   function. assume it uses ap_pass_brigade(). It constructs the brigade and
   passes it. Again: a misorder occurs because the brigade from the utility
   went out before the r->bb brigade.

An answer is to tell everybody "you must use r->bb, and never your own
brigade." That is taking away choices, rather than providing them. Also
recognize that anybody using r->bb must first create it, if it doesn't
exist. Just a little bit more pain.

Oh, and if people are supposed to use r->bb, then they better make sure to
call apr_brigade_flush() on it (watch out: only if it has been created!) to
make sure that if somebody *has* put data in there, that it gets
synchronized properly.

By using a filter in my patch, I've trapped all output to the network.
Nothing can go without passing through that filter. Therefore, I have a
perfect choke point to ensure that I can order all the output properly.
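
The ordering skew and the choke-point idea can be illustrated with a toy
model. This is only a sketch under loud assumptions: "pending" stands in for
r->bb, "wire" for the network, and every toy_* name is hypothetical. It is
neither patch, just the shape of the problem.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins (hypothetical): "pending" plays the role of r->bb,
 * "wire" is what has reached the network so far. */
typedef struct {
    char wire[128];
    char pending[128];
} toy_request;

/* Buffered write in the spirit of ap_r*: data waits in pending. */
static void toy_rputs(toy_request *r, const char *s)
{
    assert(strlen(r->pending) + strlen(s) < sizeof(r->pending));
    strcat(r->pending, s);
}

/* Direct pass with no synchronization: goes straight to the wire. */
static void toy_pass_unsynced(toy_request *r, const char *s)
{
    assert(strlen(r->wire) + strlen(s) < sizeof(r->wire));
    strcat(r->wire, s);
}

/* Move everything that is pending onto the wire, in order. */
static void toy_drain(toy_request *r)
{
    assert(strlen(r->wire) + strlen(r->pending) < sizeof(r->wire));
    strcat(r->wire, r->pending);
    r->pending[0] = '\0';
}

/* Choke point: anything pending goes out first, then the new data. */
static void toy_pass_synced(toy_request *r, const char *s)
{
    toy_drain(r);
    toy_pass_unsynced(r, s);
}

/* Write "A" buffered, then "B" directly; return the resulting wire. */
static const char *toy_demo(toy_request *r, int synced)
{
    memset(r, 0, sizeof(*r));
    toy_rputs(r, "A");
    if (synced)
        toy_pass_synced(r, "B");
    else
        toy_pass_unsynced(r, "B");
    toy_drain(r);               /* end of request: flush what is left */
    return r->wire;
}
```

Calling toy_demo() with synced == 0 leaves the wire as "BA" (misordered);
with synced == 1 it is "AB", because every path funnels through one place.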

>...
> I'll think on this another night.  And [sorry I already clipped the text],
> on the comment of 'fixing apache' vs. 'fixing apr', I'm very strongly
> against leaving useful mechanisms in apache when an equivalent mechanism
> exists to migrate them cleanly into apr.  If it were a case of "here's the
> Apache patch, and we'll finish moving it to apr in three weeks", I'd say
> great!  Let's roll the beta and move it afterwards!  But we aren't - again,
> I'm seeing two methods that would both solve our problem, with different
> requirements on the authors.

My patch has *zero* requirements on module authors. Use whatever API you
have been using or want to use. It doesn't matter, and you don't ever have
to worry about what somebody else is using. There is no possibility for
synchronization problems.

I'm all for creating an APR solution, but the number one priority is Apache.
If our solution also happens to work for APR, then great. But I don't
believe that we can necessarily say "this works for APR" and then wedge it
into Apache and make module authors need to be aware of the various output
mechanisms and compensate for potential ordering problems.

> Please, Greg and Ryan, document -simply- where your patch stands up and
> the other falls down, so I can grok this quickly without wading chin
> deep in the filtering mechanics!

All right. Here are a few cases:

*) ordering skew. see above.

*) ap_rwrite(my_buffer, 100000, r)

   gstein: flush the existing buffer (in the old_write filter), wrap a
           transient bucket around my_buffer, and deliver the brigade

   rbb: crash due to a memory overwrite in apr_brigade_write() -- it would
        copy the 100k my_buffer into a 9k malloc buffer.

	[ assuming we fix that bug... ]

	apr_brigade_write() has nowhere to send the data, so it allocates
	and copies the 100k into a new HEAP bucket, appends it to the
	brigade (r->bb), and then returns.

	eek! 100k memory copy!

	[ assuming that we patch up ap_rwrite to deal with "big" strings ]

	okay, now ap_rwrite() appends a 100k transient bucket to r->bb and
        passes that along.

	[ okay, now we have to fix this for ap_rputs and ap_rvputs, too.
	  darn. those are Apache specific now, huh? ]

*) ap_rprintf(r, "<D:compare-report>%s</D:compare-report>", report)
   report == 200k string

   [ assume both patches use apr_vformatter to break their 4k limit on
     ap_[v]rprintf ]

   gstein: the filter's buffer is used as the formatting buffer, so the
	   formatting begins after any existing buffered output.
	   apr_vformatter calls the flush_func after it fills the formatting
	   buffer. this flush_func is simply flush_buffer() in my patch. if
	   apr_vformatter exits without calling flush_func, then we have new
	   content sitting in the filter's output buffer.

   rbb: the flush_func is called and a new bucket is created from the
	formatting buffer and inserted into the brigade. There is no way to
	send this data to the network from within apr_brigade_vprintf().

	[ okay... restructure the assumption: apr_brigade_vprintf is torched
	  because it can hose the working set. move the vformatter call up
	  to ap_vrprintf. ]

        ap_vrprintf now uses a flush_func that can go to the network. but
	wait, we have some code in there to say "oh, but this is too small,
	so let's put this part into the brigade." The patch now has a dual
	path to decide where to send the data: in a brigade and down the
	network, or a call to apr_brigade_write() (which makes another copy)

	[ note: the above is now Apache specific ]
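
The fixed-size working set mentioned for the ap_rprintf case can be sketched
with a toy formatter. Assumptions, stated loudly: this does not use
apr_vformatter; toy_fmt, toy_flush and toy_emit are hypothetical names, and
the 16-byte buffer is arbitrary. It only shows how a flush callback lets a
large string pass through a small buffer.

```c
#include <assert.h>
#include <string.h>

#define TOY_FMT_BUF 16   /* assumed size of the formatting buffer */

typedef struct {
    char   buf[TOY_FMT_BUF];  /* the only storage the formatter needs */
    size_t used;              /* bytes currently sitting in buf */
    size_t flushes;           /* how often the flush callback ran */
    size_t sent;              /* bytes handed downstream so far */
} toy_fmt;

/* Flush callback: a server would hand buf to the output chain here.
 * The toy version only counts what would have been sent. */
static void toy_flush(toy_fmt *f)
{
    f->sent += f->used;
    f->flushes++;
    f->used = 0;
}

/* Push n bytes through the fixed buffer, flushing whenever it is full
 * and more data is still waiting. */
static void toy_emit(toy_fmt *f, const char *s, size_t n)
{
    while (n > 0) {
        if (f->used == TOY_FMT_BUF)
            toy_flush(f);
        size_t room = TOY_FMT_BUF - f->used;
        size_t take = n < room ? n : room;
        memcpy(f->buf + f->used, s, take);
        f->used += take;
        s += take;
        n -= take;
    }
}
```

However long the input is, memory use stays at TOY_FMT_BUF bytes. Whatever
remains in buf after the last toy_emit() call is the "new content sitting in
the output buffer" case: nothing has forced it downstream yet.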

What I'd really like to see is a variation of Ryan's patch that uses a
bucket at the tail which is over-allocated so that more data can be
appended. That would allow bucket/brigade users to "write a bunch of little
items", yet still work with buckets. This would also be helpful for other
APR users. However, it doesn't really solve the ap_r* problem like my patch
does (and vice versa: I don't solve the "little bitty bucket output"
problem, nor the other-APR-user problem).
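
A minimal sketch of the over-allocated tail bucket idea, assuming a toy
brigade that is not the APR one: toy_bucket, toy_brigade and
TOY_BUCKET_CAP are hypothetical. Small writes are appended into spare room
at the tail, so many little items end up in few buckets. The sketch copies
every write; a real design would presumably wrap large buffers rather than
copy them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKET_CAP 64   /* assumed over-allocation per bucket */

typedef struct toy_bucket {
    struct toy_bucket *next;
    size_t len;                     /* bytes in use */
    char   data[TOY_BUCKET_CAP];    /* spare room for later appends */
} toy_bucket;

typedef struct {
    toy_bucket *head;
    toy_bucket *tail;
    size_t      count;              /* number of buckets in the list */
} toy_brigade;

/* Append n bytes, reusing spare room in the tail bucket when possible.
 * Returns 0 on success, -1 if allocation fails. */
static int toy_brigade_write(toy_brigade *bb, const char *buf, size_t n)
{
    while (n > 0) {
        toy_bucket *t = bb->tail;
        if (t == NULL || t->len == TOY_BUCKET_CAP) {
            t = calloc(1, sizeof(*t));
            if (t == NULL)
                return -1;
            if (bb->tail != NULL)
                bb->tail->next = t;
            else
                bb->head = t;
            bb->tail = t;
            bb->count++;
        }
        size_t room = TOY_BUCKET_CAP - t->len;
        size_t take = n < room ? n : room;
        memcpy(t->data + t->len, buf, take);
        t->len += take;
        buf += take;
        n -= take;
    }
    return 0;
}

/* Free every bucket and reset the brigade to empty. */
static void toy_brigade_destroy(toy_brigade *bb)
{
    toy_bucket *b = bb->head;
    while (b != NULL) {
        toy_bucket *next = b->next;
        free(b);
        b = next;
    }
    bb->head = bb->tail = NULL;
    bb->count = 0;
}
```

Ten one-byte writes land in a single bucket here; no separate flush step is
needed to keep them in order with whatever is appended afterwards.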

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Apache 2.0 beta STATUS

Posted by Greg Stein <gs...@lyra.org>.

On Mon, Jan 22, 2001 at 01:20:57PM -0500, Greg Marr wrote:
> At 01:11 AM 01/22/2001, rbb@covalent.net wrote:
> 
> > > Please, Greg and Ryan, document -simply- where your patch stands up and
> > > the other falls down, so I can grok this quickly without wading chin
> > > deep in the filtering mechanics!
> >
> >Using the apr_brigade_flush() function, it is possible to mix the APIs
> >
> >It looks like an API that programmers are used to.  There is a buffered
> >and a non-buffered API.  The direct bucket calls are unbuffered, the
> >brigade calls are buffered.  If you want to switch from buffered to
> >unbuffered, you have to call apr_brigade_flush.
> 
> Shouldn't that be ap_rflush()?

If you intend to use r->bb for further output, then apr_brigade_flush is
fine.

If you intend to construct your own brigades, then something *like*
ap_rflush() is needed. But another new function would be necessary because
you don't want an actual ap_rflush since it shoves it out the network.
Essentially, a function such as:

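void ap_sync_output(request_rec *r)
{
    /* r->bb only exists if something has used the buffered path */
    if (r->bb) {
        apr_brigade_flush(r->bb);
        /* hand the data to the filter chain; unlike ap_rflush(), this
         * does not force it out onto the network */
        (void) ap_pass_brigade(r->output_filters, r->bb);
    }
}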

> >apr_brigade_flush is incredibly inexpensive.  If it is called and 
> >isn't needed, it is a single if and a return.
> 
> ap_rflush() is more expensive than apr_brigade_flush().

Horribly expensive due to the network implications. An API such as above is
what you're seeking.
[ again, only if you intend to not use r->bb (and note that I'd hope we
  don't ask people to use r->bb for all output) ]

> Also, wouldn't the user have to do:
> 
> if(r->bb) {
>      apr_brigade_flush(r->bb);
> }

I'd think they would use ap_sync_output(r) calls. Just flushing a brigade is
not enough. That does an internal-synchronization on the brigade, but
doesn't deal with ordering to the output chain.

[ note that apr_brigade_flush is unneeded if the buffer-to-tail-bucket
  algorithm is used; a sync between r->bb and the output would still be
  needed with that algorithm, tho ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

RE: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> >Using the apr_brigade_flush() function, it is possible to mix the APIs
> >
> >It looks like an API that programmers are used to.  There is a buffered
> >and a non-buffered API.  The direct bucket calls are unbuffered, the
> >brigade calls are buffered.  If you want to switch from buffered to
> >unbuffered, you have to call apr_brigade_flush.
> 
> Shouldn't that be ap_rflush()?

Yes and no.  ap_rflush does do a brigade flush, but it also does other
stuff.  If you go from the ap_r* functions to a direct ap_bucket
manipulation, you are much better off calling apr_brigade_flush directly.

> >apr_brigade_flush is incredibly inexpensive.  If it is called and 
> >isn't needed, it is a single if and a return.
> 
> ap_rflush() is more expensive than apr_brigade_flush().
> 
> Also, wouldn't the user have to do:
> 
> if(r->bb) {
>      apr_brigade_flush(r->bb);
> }

Most of the time that if statement won't be required.  We can even set it
up so that the brigade_flush() call is never needed.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

RE: Apache 2.0 beta STATUS

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 01:11 AM 01/22/2001, rbb@covalent.net wrote:

> > Please, Greg and Ryan, document -simply- where your patch stands up and
> > the other falls down, so I can grok this quickly without wading chin
> > deep in the filtering mechanics!
>
>Using the apr_brigade_flush() function, it is possible to mix the APIs
>
>It looks like an API that programmers are used to.  There is a buffered
>and a non-buffered API.  The direct bucket calls are unbuffered, the
>brigade calls are buffered.  If you want to switch from buffered to
>unbuffered, you have to call apr_brigade_flush.

Shouldn't that be ap_rflush()?

>apr_brigade_flush is incredibly inexpensive.  If it is called and 
>isn't needed, it is a single if and a return.

ap_rflush() is more expensive than apr_brigade_flush().

Also, wouldn't the user have to do:

if(r->bb) {
     apr_brigade_flush(r->bb);
}

-- 
Greg Marr
gregm@alum.wpi.edu
"We thought you were dead."
"I was, but I'm better now." - Sheridan, "The Summoning"

RE: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> > In its current form, it requires the module author to actually call
> > apr_brigade_flush.  This was a design choice on my part.  I have detailed
> > how this can be hidden from the module author, but I disagree that we
> > should.
> 
> Can this be done for -apache- without affecting the 'pure'
> apr implementation?  Such that apache users are allowed to
> believe that it's 'just like the old apache' only better,
> while apr folks don't have the overhead when developing
> a new app?

Because apr_brigade_flush is so cheap, yes.  All we need to do is
change:

#define APR_BRIGADE_INSERT_TAIL(b, e)                                   \
        APR_RING_INSERT_TAIL(&(b)->list, (e), apr_bucket, link)

to:

#define APR_BRIGADE_INSERT_TAIL(b, e)                                   \
        apr_brigade_flush(b);                                           \
        APR_RING_INSERT_TAIL(&(b)->list, (e), apr_bucket, link)

This adds either a function call and one if statement, or a function
call, one if statement, and a bucket creation/insertion, depending on
whether there is anything in the brigade's buffer.

We would also need to change any other macros that insert at the end of
the brigade.  From my very fast inspection, I believe that is just the
INSERT_TAIL and the CONCAT macros.

Of course, this isn't a fail-safe: people would need to use the r->bb
brigade in their handler *only* for this to work at all.  In filters, they
still use whichever brigade they were always using.  If they don't use
r->bb in their handler, they will just find that they cannot mix the two
APIs without a lot of care.

Ryan
_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

RE: Apache 2.0 beta STATUS

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

> From: rbb@covalent.net [mailto:rbb@covalent.net]
> Sent: Monday, January 22, 2001 12:11 AM
> 
> Where it fails:
> 
> In its current form, it requires the module author to actually call
> apr_brigade_flush.  This was a design choice on my part.  I have detailed
> how this can be hidden from the module author, but I disagree that we
> should.

Can this be done for -apache- without affecting the 'pure'
apr implementation?  Such that apache users are allowed to
believe that it's 'just like the old apache' only better,
while apr folks don't have the overhead when developing
a new app?

RE: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> Please, Greg and Ryan, document -simply- where your patch stands up and
> the other falls down, so I can grok this quickly without wading chin
> deep in the filtering mechanics!

Really simple:

Where my patch stands up:
	
Using the apr_brigade_* functions, we collapse small writes into one
bucket.

Using the apr_brigade_flush() function, it is possible to mix the APIs.

If we add the apr_brigade_flush call to the brigade macros, we can hide
this from the module author, although I am opposed to that.

It works both inside and outside of Apache, which also means it works for
module authors that want to append data to the end of the brigade.

It looks like an API that programmers are used to.  There is a buffered
and a non-buffered API.  The direct bucket calls are unbuffered, the
brigade calls are buffered.  If you want to switch from buffered to
unbuffered, you have to call apr_brigade_flush.

apr_brigade_flush is incredibly inexpensive.  If it is called and isn't
needed, it is a single if and a return.

A minor point, but it has consistently performed slightly better than the
filter approach.

Where it fails:

In its current form, it requires the module author to actually call
apr_brigade_flush.  This was a design choice on my part.  I have detailed
how this can be hidden from the module author, but I disagree that we
should.
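
The buffered/unbuffered split and the explicit flush can be sketched with a
toy brigade. Assumptions: the toy_* names and sizes are hypothetical and do
not match the patch or apr-util; the point is only that small writes
coalesce, that flush is one test and a return when nothing is staged, and
that the caller must flush before switching to direct bucket insertion.

```c
#include <assert.h>
#include <string.h>

#define TOY_BUF_SIZE    32   /* assumed staging buffer size */
#define TOY_MAX_BUCKETS 8    /* assumed cap, to keep the toy simple */

typedef struct {
    char   buf[TOY_BUF_SIZE];   /* staging area for small writes */
    size_t used;
    char   buckets[TOY_MAX_BUCKETS][TOY_BUF_SIZE + 1];
    int    nbuckets;
} toy_brigade;

/* Turn staged bytes into one bucket; a no-op when nothing is staged. */
static void toy_brigade_flush(toy_brigade *bb)
{
    if (bb->used == 0)
        return;
    assert(bb->nbuckets < TOY_MAX_BUCKETS);
    memcpy(bb->buckets[bb->nbuckets], bb->buf, bb->used);
    bb->buckets[bb->nbuckets][bb->used] = '\0';
    bb->nbuckets++;
    bb->used = 0;
}

/* Buffered call: small strings collect in the staging area. */
static void toy_brigade_puts(toy_brigade *bb, const char *s)
{
    size_t n = strlen(s);
    assert(n <= TOY_BUF_SIZE);
    if (bb->used + n > TOY_BUF_SIZE)
        toy_brigade_flush(bb);
    memcpy(bb->buf + bb->used, s, n);
    bb->used += n;
}

/* Unbuffered call: appends a bucket directly. The caller is expected
 * to have flushed first, or staged data will come out after it. */
static void toy_bucket_insert(toy_brigade *bb, const char *s)
{
    assert(bb->nbuckets < TOY_MAX_BUCKETS);
    assert(strlen(s) <= TOY_BUF_SIZE);
    strcpy(bb->buckets[bb->nbuckets++], s);
}
```

Three small puts followed by a flush and a direct insert give two buckets in
the right order. Dropping the flush puts the directly inserted bucket ahead
of the staged text, which is the requirement on module authors described
above.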

That's all I can see.  I can back every statement up with code, although
it might take a few hours, because I have been tweaking things
tonight.  :-)

I am going to bed now.  Please, I just want to commit one tomorrow.  I
honestly believe my patch is better for both Apache and apr-util, but I am
not willing to continue arguing.  If we don't reach a real conclusion by
Tuesday morning, I will remove my patch from consideration.  This arguing
doesn't do us any good at all, and it just slows down our progress.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

RE: Apache 2.0 beta STATUS

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Sunday, January 21, 2001 11:26 PM
> 
> On Sun, Jan 21, 2001 at 09:58:20PM -0600, William A. Rowe, Jr. wrote:
> > As I say, I'm for the module author controlling the unknowns with as little
> > possible chance for interference from 'invisible forces'.
> 
> The invisible forces in this case are the requirements imposed by forcing a
> module author and even the core Apache code to have to synchronize between
> two totally different output mechanisms. Each coder must be aware of all the
> other bits of code which could potentially produce output.  Even worse, they
> need to know *how* that other code did it. Not just "did they?" but "how?".
> That kind of invisible coupling across the code base is a recipe for bugs.
> Insert a sync call here, insert one there. Oops, we forgot to sync the two
> mechanisms over there, too. Oy! :-)

How is this possible unless we invoke a subrequest?  Subrequests are very
simple to protect within Apache, even if the author 'forgets'.  But this is
no different than using different xwrite() calls against a yopen()ed file.
Mix your metaphors and you can sink a program.

I'll think on this another night.  And [sorry I already clipped the text],
on the comment of 'fixing apache' vs. 'fixing apr', I'm very strongly
against leaving useful mechanisms in apache when an equivalent mechanism
exists to migrate them cleanly into apr.  If it were a case of "here's the
Apache patch, and we'll finish moving it to apr in three weeks", I'd say
great!  Let's roll the beta and move it afterwards!  But we aren't - again,
I'm seeing two methods that would both solve our problem, with different
requirements on the authors.

Please, Greg and Ryan, document -simply- where your patch stands up and
the other falls down, so I can grok this quickly without wading chin
deep in the filtering mechanics!

Bill

Re: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> > Filtered (gstein) carries the risk it can be bumped from the filter stack
> > by some very mean spirited filter that is VERY_VERY_FIRST (unlikely), but
> > applies only to Apache 2.0, and appears stable.
> 
> This is a mischaracterization. There is no risk. If somebody sneaks in
> front, then the server operates quite fine. Completely robust.

And back to being incredibly slow again.

> You also did not mention that my patch allows free mixing of ap_r* and
> ap_pass_brigade(). There is no possible way for output to be misordered.
> Ryan's patch does *not* have that feature, and I pointed that out a *long*
> time ago. That was the reason why I veto'd Victor's patch.

I actually showed how that feature could be achieved.  I just do not
believe that it is as big a deal as you do.

> Whose requirements? We're trying to fix Apache, not APR. And we're trying to
> provide a way for people to continue to use ap_r* without suffering a
> penalty. And we're trying to provide a way that they can transition to a
> better API for some of the critical parts.

I disagree with this.  If the functions don't work, they don't work.  A
fix in Apache doesn't help any other program that uses the bucket API.  If
we don't care about the other programs, then we really should remove the
brigades from the apr-util library, because nobody will use them with
their current performance.

Ryan

Re: Apache 2.0 beta STATUS

Posted by Greg Stein <gs...@lyra.org>.

On Sun, Jan 21, 2001 at 09:58:20PM -0600, William A. Rowe, Jr. wrote:
>...
> Filtered (gstein) carries the risk it can be bumped from the filter stack
> by some very mean spirited filter that is VERY_VERY_FIRST (unlikely), but
> applies only to Apache 2.0, and appears stable.

This is a mischaracterization. There is no risk. If somebody sneaks in
front, then the server operates quite fine. Completely robust.

You also did not mention that my patch allows free mixing of ap_r* and
ap_pass_brigade(). There is no possible way for output to be misordered.
Ryan's patch does *not* have that feature, and I pointed that out a *long*
time ago. That was the reason why I veto'd Victor's patch.

I consider mixing of the APIs critical. There is just no simple way to
ensure that output will never be mixed. Consider the handler which uses
ap_r* and then starts a subrequest which uses brigades, which returns to the
handler to use more ap_r* functions. Are we seriously going to start
sprinkling order-synchronization calls throughout Apache to ensure that
everything is ordered properly? No. We will invariably miss some. How about
that third party DAV module that uses some output functions from mod_dav?
Which module is in charge of ensuring that the order remains synchronized?
Why should they have to care? (given that I've demonstrated they don't have
to care -- mixing APIs freely is quite possible and quite efficient)
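
The ordering hazard in the handler/subrequest scenario above can be shown
with a toy model.  This is a sketch only, not either patch under discussion:
toy_request and every toy_* function are hypothetical names, with
toy_rputs() standing in for a buffered ap_r*-style call and the two
toy_pass_* functions standing in for a brigade being passed down with and
without an automatic drain of the buffered data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request state: bytes sent so far plus buffered output. */
typedef struct {
    char   wire[128];   /* what the client would receive, in order */
    size_t wire_len;
    char   pending[64]; /* output buffered by the ap_r*-style calls */
    size_t pending_len;
} toy_request;

/* Send bytes to the client side and keep the record NUL-terminated. */
static void toy_send(toy_request *r, const char *data, size_t len)
{
    assert(r->wire_len + len < sizeof(r->wire));
    memcpy(r->wire + r->wire_len, data, len);
    r->wire_len += len;
    r->wire[r->wire_len] = '\0';
}

/* Drain buffered output; returns immediately when nothing is pending. */
static void toy_drain(toy_request *r)
{
    if (r->pending_len == 0)
        return;
    toy_send(r, r->pending, r->pending_len);
    r->pending_len = 0;
}

/* Buffered write in the style of ap_rputs(). */
void toy_rputs(toy_request *r, const char *s)
{
    size_t len = strlen(s);

    assert(len <= sizeof(r->pending));
    if (r->pending_len + len > sizeof(r->pending))
        toy_drain(r);
    memcpy(r->pending + r->pending_len, s, len);
    r->pending_len += len;
}

/* Brigade-style write with no synchronization: it overtakes pending data. */
void toy_pass_unsynced(toy_request *r, const char *s)
{
    toy_send(r, s, strlen(s));
}

/* Brigade-style write that drains pending data first, so no caller has to
 * know how earlier output was produced. */
void toy_pass_synced(toy_request *r, const char *s)
{
    toy_drain(r);
    toy_send(r, s, strlen(s));
}

/* End of request: whatever is still buffered goes out last. */
void toy_finish(toy_request *r)
{
    toy_drain(r);
}
```

Writing "<h1>" with toy_rputs(), then "SUB" with toy_pass_unsynced(), then
"</h1>" with toy_rputs() leaves "SUB<h1></h1>" on the wire after
toy_finish(); the same sequence with toy_pass_synced() leaves
"<h1>SUB</h1>".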

>...
> Based on requirements alone, I'm +1 on the latter patch.  I believe it is

Whose requirements? We're trying to fix Apache, not APR. And we're trying to
provide a way for people to continue to use ap_r* without suffering a
penalty. And we're trying to provide a way that they can transition to a
better API for some of the critical parts.

For example, I would think it would be great for a handler to ap_rputs() a
couple header strings, call a few output functions which use ap_r*, and then
call some other doo-dad that puts together a complex brigade using custom
buckets. Mixing of the APIs is quite important.

[ the above scenario isn't that difficult to imagine: I have been thinking
  about how mod_dav plugins can provide some custom output. The obvious
  mechanism is to let them use brigades/buckets in some way. But the
  wrapping output structures are still easiest to produce with ap_r* ]

>...
> such nonsense.  The latter seems more stable, in that the module author
> determines the stability.

Euh... why not just start with stable in the first place. You can't get
messed up output with my patch. No matter what you do.

>...
> As I say, I'm for the module author controlling the unknowns with as little
> possible chance for interference from 'invisible forces'.

The invisible forces in this case are the requirements imposed by forcing a
module author and even the core Apache code to have to synchronize between
two totally different output mechanisms. Each coder must be aware of all the
other bits of code which could potentially produce output. Even worse, they
need to know *how* that other code did it. Not just "did they?" but "how?".
That kind of invisible coupling across the code base is a recipe for bugs.
Insert a sync call here, insert one there. Oops, we forgot to sync the two
mechanisms over there, too. Oy! :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> Based on requirements alone, I'm +1 on the latter patch.  I believe it is
> better to tell the author "Add this call and you will be safe", rather than
> chase obscure module authoring bugs that flushed the filter list or other
> such nonsense.  The latter seems more stable, in that the module author
> determines the stability.
> 
> But now, we have doug's patch.  Without line-by-line pulling me out of my
> little filesystem world just now, would anyone care to comment on the
> merits of Doug's vs. rbb's vs. gstein's patches?  Maybe the six paragraph
> executive summary :-?

Doug's patch is not complete, and it wasn't meant to be.  I simply asked
Doug to post his patch, because it was a bit faster than my patch.  I have
a feeling that by removing the extra strlen(), I have removed that
difference.

Ryan

Re: Apache 2.0 beta STATUS

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

From: "William A. Rowe, Jr." <wr...@rowe-clan.net>
Sent: Sunday, January 21, 2001 3:00 AM


> From: <rb...@covalent.net>
> Sent: Sunday, January 21, 2001 1:12 AM
> 
> > 2)  We have 1 patch that must be committed before we go beta.  That is the
> > ap_r* performance patch.  There are two patches that have been submitted,
> > and I asked for a vote to end tomorrow morning.  Unfortunately, nobody has
> > voted for either patch at this point.  I am asking that people review
> > these patches, and vote for one of them.  I had planned to commit one of
> > the patches tomorrow afternoon.  I will not commit either patch until at
> > least three people (other than Greg and I) vote for a patch.  Please
> > review them and ask any questions you might have.
> 
> We have as many patches as it takes to make apache.org stable.  Yes, we
> need to choose a patch for the ap_r* fns - I'm leaning towards the apr
> implementation over the filter implementation, for the reason that we
> have a more thorough bucket/brigade/buffering solution within apr that
> isn't restricted to the apache solution.  If there is a technical flaw
> with either patch, someone please point it out.

Ok... my understanding (with gstein's and rbb's latest updated patches):

Filtered (gstein) carries the risk it can be bumped from the filter stack
by some very mean spirited filter that is VERY_VERY_FIRST (unlikely), but
applies only to Apache 2.0, and appears stable.

Buffered apr (rbb) requires an additional call to transition between the
buffered api and buckets api.  It appears stable.

Based on requirements alone, I'm +1 on the latter patch.  I believe it is
better to tell the author "Add this call and you will be safe", rather than
chase obscure module authoring bugs that flushed the filter list or other
such nonsense.  The latter seems more stable, in that the module author
determines the stability.

But now, we have doug's patch.  Without line-by-line pulling me out of my
little filesystem world just now, would anyone care to comment on the
merits of Doug's vs. rbb's vs. gstein's patches?  Maybe the six paragraph
executive summary :-?

As I say, I'm for the module author controlling the unknowns with as little
possible chance for interference from 'invisible forces'.

Bill

Re: Apache 2.0 beta STATUS

Posted by rb...@covalent.net.

> > that we are stable now.  The biggest stumbling block, is that Apache 2.0
> > HEAD is not running on any known production server, and the last time we
> > turned it on live on apache.org, we took the machine down.  That
> > experience may take a little while for some people to get over.
> 
> We took down apache.org with an alpha release, not the beta.  Don't forget
> that, and we expect our users to remember that as well.  It is critical
> that we eat our own cooking... until we run head, we don't have a beta
> candidate.

I agree, but all I am saying is that it may take a while to remove the bad
taste that was left when we took down the machine.  :-)  

>   - backout apr_get_filename_case

Done

>   - fix the data symbols and pool problems of dav, if I can find them.

Try it again today.  Last night, I discovered that we were calling the
register_hooks function twice for every module that was in an AddModule
call.  I doubt this solved your problem, but I am curious.  :-)

> If no one has thanked you in the last few weeks for being a total PITA,
> well, thank you.  I'm far more impressed by what I've seen these last

No thanks are necessary, and yes, others have said it.  :-)  I work this
hard at Apache 2.0 because I have spent more than 2 years getting it done,
and I want to see it finished now.  I am also not the only person who
deserves thanks for this.  The entire group has had a large role in this
version of Apache, and I truly believe that I have learned far more from
this work than I have contributed.

> few days than I have been in some time.  I'm more convinced by the hour
> that this is -the- server design we want to throw into the world.  This
> is the point in development that engineers most often say damn, we blew
> it, it won't/doesn't ... I really don't have that feeling about where
> we stand today.

Glad to hear it.  If anybody has the feeling that we got it wrong, please
speak up now, let's fix it now, not in three months.  :-)

Ryan


Re: Apache 2.0 beta STATUS

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

From: <rb...@covalent.net>
Sent: Sunday, January 21, 2001 1:12 AM


> This is a simple STATUS message for the beta.
> 
> 1)  We are currently running the entire Apache.org site on Apache 2.0 on
> port 8092.  I am talking to Brian about the steps required to move this to
> port 80.  I fully expect this to happen soon-ish, assuming we can prove
> that we are stable now.  The biggest stumbling block, is that Apache 2.0
> HEAD is not running on any known production server, and the last time we
> turned it on live on apache.org, we took the machine down.  That
> experience may take a little while for some people to get over.

We took down apache.org with an alpha release, not the beta.  Don't forget
that, and we expect our users to remember that as well.  It is critical
that we eat our own cooking... until we run head, we don't have a beta
candidate.

> 2)  We have 1 patch that must be committed before we go beta.  That is the
> ap_r* performance patch.  There are two patches that have been submitted,
> and I asked for a vote to end tomorrow morning.  Unfortunately, nobody has
> voted for either patch at this point.  I am asking that people review
> these patches, and vote for one of them.  I had planned to commit one of
> the patches tomorrow afternoon.  I will not commit either patch until at
> least three people (other than Greg and I) vote for a patch.  Please
> review them and ask any questions you might have.

We have as many patches as it takes to make apache.org stable.  Yes, we
need to choose a patch for the ap_r* fns - I'm leaning towards the apr
implementation over the filter implementation, for the reason that we
have a more thorough bucket/brigade/buffering solution within apr that
isn't restricted to the apache solution.  If there is a technical flaw
with either patch, someone please point it out.

> 3)  That's it.  We are still in feature freeze, and I suspect that there
> are bugs on non-Unix platforms that still need to be worked out.  I don't
> believe that there are any more _features_ that MUST be committed before
> we go beta.

No doubt.  I closed a big hole in unix security just today.  I have the
following transition map that will be implemented, the first group prior
to beta:

  - backout apr_get_filename_case
  - implement simple finfo->fcase for unix, os2 [no, I don't know what I'm
      doing, I'll simply create the assuming case-sensitive patch]
  - implement finfo->fcase for win32 [simple, really]
  - replace FindFile in win32 canonical with the apr_stat ->fcase value.
  - pound the heck out of win32's allocations
  - fix the data symbols and pool problems of dav, if I can find them.

The changes above ensure that winnt can open any file name and that utf8
names are processed.  Before or after beta:

  - add apr_open_fileinfo [optimized under win32, someday under OS X and
      any other platform that can perform some transition magic], I'm
      accepting a better fn name to describe opening a file from an finfo.
  - allow apr_read/write on win32 files opened with APR_XTHREAD [this api
      only supports sendfile today.]
  - optimize the flags to apr_[l]stat/getfileinfo.
  - collapse multiple apr_stat's in the server today, and absorb most of
      the apr_canonical stuff as part of the apr_stat directory walk
      [hit the filesystem only once, not twice.]
  - make the canonical function do nothing more than correct the string
      [fix slashes, colons, reject streams, etc] without hitting the
      filesystem at all. 
  - for win9x - add APR_CHR results based on file name (con, aux, nul etc)
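
As an illustration of the "string only" canonical function in the list
above, here is a minimal sketch that never touches the filesystem.  It is
not the apr code: toy_canonical() is a hypothetical name, and the rules it
applies (backslashes become slashes, runs of slashes collapse, a colon is
accepted only in a drive prefix such as "C:") are assumptions chosen to
mirror the "fix slashes, colons, reject streams" note, not the actual win32
rules.  In particular it would also collapse the leading "//" of a UNC name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: normalize a path string in place without any
 * filesystem access.  Returns 0 on success and -1 if the name is rejected;
 * on rejection the buffer may be left partially rewritten. */
int toy_canonical(char *path)
{
    size_t in, out = 0;

    for (in = 0; path[in] != '\0'; in++) {
        char c = (path[in] == '\\') ? '/' : path[in];

        /* Accept a colon only as the second character after a drive
         * letter; anything else (e.g. "name:stream") is rejected. */
        if (c == ':' && !(in == 1 && isalpha((unsigned char)path[0])))
            return -1;

        /* Collapse a run of slashes into a single slash. */
        if (c == '/' && out > 0 && path[out - 1] == '/')
            continue;

        path[out++] = c;
    }
    path[out] = '\0';
    return 0;
}
```

Under these assumptions a name such as C:\\htdocs//sub\index.html is
rewritten to C:/htdocs/sub/index.html, while file.txt:stream is rejected.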

That's my map, the first few items will be done Monday.

> I would really like to move this forward now.  It would be REALLY cool to
> get the beta out before the end of the month if that is at all
> possible.  I am willing to do whatever is necessary to make that happen.

If no one has thanked you in the last few weeks for being a total PITA,
well, thank you.  I'm far more impressed by what I've seen these last
few days than I have been in some time.  I'm more convinced by the hour
that this is -the- server design we want to throw into the world.  This
is the point in development that engineers most often say damn, we blew
it, it won't/doesn't ... I really don't have that feeling about where
we stand today.