You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by "Fielding, Roy" <fi...@eBuilt.com> on 2000/07/01 10:47:01 UTC

Re: another response to Roy :-)

>I don't understand this. What are the "problems of buff" that my patch
>introduces? And I really don't understand the fragility point.

It doesn't introduce them -- they are there already.  One of the goals of
the filter concept was to fix the bird's nest of interconnected side-effect
conditions that allow buff to perform well without losing the performance.
That's why there is so much trepidation about anyone messin with 1.3.x buff.
Your patch does nothing to change that situation.  Bucket brigades are a
replacement for buff in addition to a solution for filters.

>It started
>with a simple, easy-to-use char* callback handler. People wanted more, so I
>filled in the slots I had (but didn't code first-time around) for the
bucket
>stuff. Ryan asked for an example of something else, and I demonstrated that
>it was possible (but it isn't part of the patch). I believe the ability to
>quickly add each feature demonstrates a sound framework rather than any
>fragility.

Dude, you are an excellent programmer -- I am convinced you can implement
our entire server quickly.  That doesn't mean its a sound framework.
It just means you understand your own code.  I'll be impressed by the
framework when I can understand it without asking you questions.

>Presuming the design you're referring to is the bucket-brigades... yes,
>there is a design, but no code that implements it. I believe the only
>missing, basic requirements is adding a "next" pointer to the buckets in my
>patch. There may be some other details, but I think my patch can implement
>what you are looking for. I'd love to learn where it specifically falls
down.

I used a list-queue ADT for Ada95 because I wanted the semi-transparent
features of being able to insert anywhere in the queue, but particularly
at the front and back because those are the most common.  This is separate
from the buckets themselves.  I have no code to implement the full bucket
brigades, other than the sample data structures posted months ago.  The
Ada95 stuff was GPL (not by choice).  The real psychedelic stuff happens
when you can pass metadata (tokenized header fields) as buckets and the
filters know how to pass that down the chain without treating them as data.
I didn't implement that in the Ada95 version.

>> The char * interface cannot support sendfile or caching or writing a
proxy
>> in filters.  It is useless to me, at the same cost as a design that would
>> actually work.  I'm not saying it isn't useful for someone -- I am saying
>> the design won't accomplish my needs, whereas the bucket brigades does.
>> I would veto it myself if I had time to back it up with a fully
implemented
>> alternative.  Even so, there is no doubt in my mind that adding a dumb
>> filtering design to Apache at this point is far far worse than doing
>> nothing about filtering in 2.0.
>
>Roy. Please hold for a moment here.
>
>I *do* have buckets. The char* interface is simply an *option* for that
>"someone" you refer to. If you want buckets, then apply my patch. If you
>want a list-of-buckets, then I'll add a next pointer.
>
>Specifically, take a look at ap_bucket_t in the ap_filter.h file that is
>part of my patch. A filter then implements a function that accepts one of
>those buckets.
>
>This isn't a dumb filtering interface. It is a bucket-based system with an
>option for a simplified interface.

Life doesn't work that way.  Its like implementing a sewer system where
you give every land owner the right to decide what size pipes to use on
their portion of the network.  As much as you might think people will
implement what they need, what they will do is implement the shortsighted
solution and pass that off as "the filter".  When a configuration is
created in which one filter is too small, the shit doesn't just stop in
that pipe -- it backs up into everyone's toilets.

I know you have buckets, or at least a data type that could one day
be buckets.  But you aren't using them like I did.  My buckets are passed
around in list-queues (really just lists with front and back pointers).
My buckets carry data and metadata and meta-metadata.  My buckets are
used to indicate stream-end, and the filter configuration itself is
determined by the stream content.  It probably sounds weird, but the
effects of this interface are completely different than mere content
filters.  They simplify everything.  I'm not saying that we have to
simplify everything right away, but I am saying that it is just as easy
to implement a fully-general filter using bucket brigades as it is
to implement your string interface filters -- all of the complex parts
are handled by the ADTs, and even those are no more complex than the
memory management functionality you would have to implement for strings
without buckets.

>> If you guys spent half as much time commenting your code as you did
>> arguing about the edge cases, maybe we wouldn't have to spend so much
>> time trying to communicate?  Maybe it would help clarify the intention
>> behind some of the interface issues.
>
>This is a good suggestion, but we only have so much time. Both of us hoped
>that the code was enough, and the comments and doc would come in a future
>pass.

Sorry, I am too old to believe any developer when they say they will add
comments "in a future pass."  It isn't in our nature.  In fact, as far as
I can recall, Dean and myself are the only ones who ever add comments
after the fact, and that's mostly to explain other people's code (or avoid
answering further questions about our own).  Of course, you'll probably
surprise me and add perfectly lucid explanations of everything after
the commit, but you have to admit that this is a less than optimal plan
for a code review process.

>...

>But, please... detail the features/qualities of that missing 20% or 30%. If
>you want to be completely happy with my patch, then I'll gladly code those
>bits for you. But I just don't know what your "100% goals" are... It seems
>somewhat unfair to state that one of the posted patches does not meet the
>100% goals, yet not explain what those are. How could we ever meet that?

I thought it was already in the repository somewhere.  Shucks, it may have
been deleted when the old old 2.0 became apache-nspr became apache-apr and
then apache-2.0.  I'll find it.  Basically, we had a discussion of what we
wanted out of layered-IO during the nature walk and design meetings in
San Francisco (exactly two years ago).  Ed and Alexei wrote it all down
in a few messages, Dean layed out some other issues, and I brain-dumped
on the bucket brigades design.

>> Greg, I know this is more vague rambling on my part.  Let me make a
>> specific example.  Somewhere in your patch (as noted while catching
>> up on my e-mail today) is a filter function that is making a test
>> on filter->r->connection->aborted.  That is horribly wrong.
>
>Are you referring to the ap_l* functions? The filter callbacks never do.
>
>(or are you thinking of Ryan's patches? in one of his macros, which is used
>in the filter callbacks, they refer to the aborted flag)
>
>I'm going to assume you *are* talking about the ap_l* functions... below...

Yeah, I meant the ap_l* functions.

>> Consider the case where you have an active proxy implemented in the form
>> of a filter where you have one input filter chain reading the inbound
>> response and passing it downstream toward the client.  What r?  Which
>> connection?  Why the hell does the filter need to know anything about
>> the endpoints, let alone assume they are sockets?
>
>The filter doesn't need to know (and doesn't today). The framework doesn't
>need to know (but does today). I can remove those tests quite easily. But
in
>the *current* Apache, it makes a lot of sense. It allows the filter chain
to
>short-circuit the processing when it finds out that it is no longer needed.
>
>Is that too much knowledge of the endpoint? Sure. But it isn't permanent,
>and it certainly isn't a requirement of the design. If the whole chain just
>kept going and shoved everything down to BUFF where it finally got
>discarded, it would still work fine.

It should just be checking a return value from the stream call.  The only
part that can know how to test for abort is the network end, and it can't
rely on a request-global variable for that purpose (ap_rwrite could, but
only because the ap_r* only send data on r->connection).

>> The patch may work fine, but the design doesn't.  Data flow networks are
>> a very well-defined and understood software architecture.  They have a
>> single, very important constraint: no filter is allowed to know anything
>> about the nature of its upstream or downstream neighbors beyond what
>> is defined by the filter's own interface.  That constraint is what makes
>> data flow networks highly configurable and reusable.  Those are
properties
>> that we want from our filters.
>
>Sigh. Roy: none of my example filters have used any knowledge of the end
>points. Never have the callbacks ever checked ->aborted. I believe my
filter
>design matches your requirements.

What can I say?  That is totally unclear from the code.  Yes, I have read
all of your patches, just as I have read all of Ryan's patches.  You aren't
solving the same problem that I am trying to solve, and neither of you are
writing comments in the code, so don't expect it to be understood.

>> The second problem here is that this discussion contains too many
>> messages.  Both of you seem to have a need to respond to every message
>> on a point-by-point basis,
>
>I know there are too many messages. But look at it from my point of view.
>You come along and post a relatively singular message. People that have
>tuned out of the discussion stop to read it ("ooh. roy posted. maybe it is
>time to see what is happening."). Then they read about how you don't like
my
>patch because it only provides the char* interface and does not implement
>the bucket brigades.
>
>What am I to do? Just let that impression go and sink in? Or do I try to
>educate the readers of these emails that, *yes* it *does* have buckets?

You aren't doing much educating when you say "yes it does".  Buckets are
not bucket brigades.  The presence of char * makes bucket brigades useless.
Allowing the filters to lose information is the problem.  Ryan has said
that several times, as have I, but your response is always that only
one of the two interfaces loses information.  Well, that's one too many.
That isn't a start -- its a short drive straight off the cliff.

>When Ryan says the recursive-call design will suck performance wise, yet
>that whole "register spill" thing truly doesn't exist in these scenarios?
>Let that pass? Give people the wrong impression?

I didn't let it pass.  I didn't continue harping on it for five messages
either.  There is a difference.  We all get carried away at times, but
enough is enough.

>I feel it is important that people understand that I've posted a patch that
>meets (IMO) our needs. If they know that, then they may stop and actually
>look at the thing and provide one of the needed +1 votes. But if every
>message has some comment about how it is broken here or there, or doesn't
>meet this need or that, then they will keep skipping over it. We'll just
>remain in limbo...

I think people here are perfectly capable of forming their own opinions,
at least when they have time to read their mail.  What does short-circuit
a review is when you say you can add something later.  That only works if
everyone is in agreement on the start.

>>...
>> What you don't seem
>> to be realizing is that you have generated so much text that the only
people
>> who have enough time to both read the discussion and propose alternatives
>> are the two of you.  The rest of us just can't keep up.
>
>Oh, I totally understand. That is why I've been posted things with titles
>such as "what are the issues?" or "recap" or "summary". To boil all this
>stuff down so that (hopefully) people can jump in at that point.

Sorry, they haven't.  What you write down in the summary is what you think
the status is, duplicating what you already said but without context.
I don't need that, particularly when you leave out half of the objections
simply because you think them meaningless.  What I want you to write down
is your design, as it exists in the patch you posted (not the one you
might someday post), or declare the patch dead and just describe your
future design.  That's what I want to understand.

I'm not too keen on the modified subject lines.  Dean's gonna be pissed. ;-)

>I get the feeling that you didn't see the ap_bucket_t stuff in my patch.
>Maybe you looked at the very first one? Please look at the latest,
available
>in the archive with MsgID <20...@lyra.org>. There are some
>examples of the filtering scheme in <20...@lyra.org>.
>
>If you could spend just 30 minutes looking over that. Again: it is only 850
>lines, most of it quite straight-forward. But sorry: no comments yet.

I spent some time looking at it yesterday, enough to see that you aren't
designing for my goals.  Content filtering is my least important
goal.  Completely replacing HTTP parsing with a filter is my primary goal,
followed by a better proxy, then internal memory caches, and finally
zero-copy sendfile (in order of importance, but in reverse order of
likely implementation).  Content filtering is something we get for free
using the bucket brigade interface, but we don't get anything for free
if we start with an interface that only supports content filtering.

>If that patch does not meet your bucket-brigades thoughts, or you can see
>where it can't *ever* meet them, then please explain to me. I will
>definitely listen and fix the code.

I was hoping that the header/file/footer example would do that.
Does the notion of passing metadata on the stream make sense yet?
Yes, I know that 1.3 content generators assume that the headers
have already been sent, but I don't need a new major version to
implement what we can already do in 1.3.

>I'm guessing that *something* is going to go into 2.0. I hope that you can
>at least spend a small amount of time to review the posted patch, and
>provide your thoughts. If this is dear to you, and something is going in,
>then you'll want to have a look :-)

I already did -- there just wasn't any point in replying to every line
of your patch when it is all based on an interface that I don't want.

I'll try to dig through all of the design notes and put together a
single coherent argument for why all this long-term stuff is needed.
But not tonight -- I have a desperate need to spend some time at the
beach tomorrow.

[er... today. shit, another three hours blown just composing this message,
and then I discover UCI has expired my account via some stupid automated
bounce-the-undergrads script, so now my mail and all my archives of
Apache stuff are completely hosed until support gets back from their
four-day
weekend, by which point I'll have been bounced off all the mail lists.
Crap.]

....Roy

Re: another response to Roy :-)

Posted by Greg Stein <gs...@lyra.org>.

On Sat, Jul 01, 2000 at 01:47:01AM -0700, Fielding, Roy wrote:
> ...
> ... cogent explanation of the delta between the proposed patch and
> ... bucket brigades
> ...

All right. Gotcha.

I believe that I understand what you're saying here (of course, without my
feeding it back to you, who's to say? :-). To get there from our current
code base is a long distance, unfortunately. We just don't seem to be ready
for passing metadata in that way (heck, to have an ADT for it), to radically
alter the BUFF usage, etc.

What to do? Start marching, me thinks :-)

>...
> >This is a good suggestion, but we only have so much time. Both of us hoped
> >that the code was enough, and the comments and doc would come in a future
> >pass.
> 
> Sorry, I am too old to believe any developer when they say they will add
> comments "in a future pass."

hehe... I think I'm older than you, even. I hear ya :-)

>...
> Of course, you'll probably
> surprise me and add perfectly lucid explanations of everything after
> the commit, but you have to admit that this is a less than optimal plan
> for a code review process.

Given the focus and discussions around this? You bet I'll add comments :-).
And yes, it has been less than optimal. A definite education in "skip
writing that email; go comment your code and resubmit instead." :-)

>...
> Does the notion of passing metadata on the stream make sense yet?

Absolutely. Getting from here to there is still a good chunk of work. I
believe that I also understand where you're heading with that stuff. It can
definitely be quite cool. I'm just hoping that we can get something useful
accomplished without needing to wait for the whole package. If we can get a
positive benefit, without painting ourselves into a corner, then I'm up for
it. Maybe our choices will make step 10 a bit more difficult, but I'm
willing to live with that. (if it makes it impossible, well...)

Thanks for the detailed email. It definitely clarifies your direction, some
great ideas, and what we can shoot for. I'm confident we can get there.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/