Posted to dev@httpd.apache.org by gr...@apache.org on 2004/02/04 22:39:43 UTC

[PATCH] configurable Location block speed up

I had some offline feedback for my previous Location speeder-upper which I 
thought had merit.  That patch skips the directory walk when it detects a 
SetHandler directive inside of a Location block.  The gist of the criticism was 
that some user might have a Location block with a URI that overlaps with 
something in the filesystem, despite what we say in the doc.  I quote:

-----------------------------------------------------
When to use <Location>

Use <Location> to apply directives to content that lives outside the filesystem. 
  For content that lives in the filesystem, use <Directory> and <Files>.  An 
exception is <Location / >, which is an easy way to apply a configuration to the 
entire server.
-----------------------------------------------------

If there are sites out there that ignore the piece about the content living 
outside the filesystem, my previous patch which requires no new configuration 
might create problems.  This version adds an OutsideFilesystem directive, only 
valid in <Location>, to eliminate the possibility for unpleasant surprises when 
upgrading httpd.  It causes the directory walk to be skipped if set.  btw, I'm 
thinking of this as a 2.1 thing which wouldn't be backported.

But then if I play devil's advocate, someone could see the new directive and 
turn it on when it's not appropriate and cause Bad Things to happen.  Mainly I'm 
looking for comments on whether this should be configurable or not.

Thanks,
Greg

Re: [PATCH] configurable Location block speed up

Posted by gr...@apache.org.
Joshua Slive wrote:

>>(without closer looking at the code)
>>I'd suggest rather
>>
>><Location [virtual] /uri-path>
>></Location>
>>
>>where the virtual keyword defines that the location is independent from
>>filesystem.
> 
> 
> Perhaps I'm misunderstanding the issue, but neither of these make sense to
> me.  It is not really the behavior of the <Location> block that is
> changing.  <Location> always acts outside the filesystem.  What is
> changing is that a certain directive inside a <Location> block should be
> applied differently.  So shouldn't this just be
> 
> SetVirtualHandler server-status
> or
> SetHandler virtual server-status

Joshua and André, thanks.  Both of you are better at user interfaces than I am.
But it sounds like we first need to decide if configuring/assuming <Location >
to be truly outside the filesystem is the way to go.

Greg


Re: [PATCH] configurable Location block speed up

Posted by André Malo <nd...@perlig.de>.
* Joshua Slive <jo...@slive.ca> wrote:

> 
> On Wed, 4 Feb 2004, [ISO-8859-15] André Malo wrote:
> 
> > * gregames@apache.org wrote:
> >
> > > If there are sites out there that ignore the piece about the content
> > > living outside the filesystem, my previous patch which requires no new
> > > configuration might create problems.  This version adds an
> > > OutsideFilesystem directive, only valid in <Location>, to eliminate the
> > > possibility for unpleasant surprises when upgrading httpd.  It causes
> > > the directory walk to be skipped if set.  btw, I'm thinking of this as a
> > > 2.1 thing which wouldn't be backported.
> >
> > (without closer looking at the code)
> > I'd suggest rather
> >
> > <Location [virtual] /uri-path>
> > </Location>
> >
> > where the virtual keyword defines that the location is independent from
> > filesystem.
> 
> Perhaps I'm misunderstanding the issue, but neither of these make sense to
> me.  It is not really the behavior of the <Location> block that is
> changing.  <Location> always acts outside the filesystem.  What is
> changing is that a certain directive inside a <Location> block should be
> applied differently.  So shouldn't this just be
> 
> SetVirtualHandler server-status
> or
> SetHandler virtual server-status

The aim is to skip directory and files container merging. When using
SetHandler I'd see an inconsistency in that it cannot apply e.g. to AddHandler
directives [not in core] and should be applied to ForceType as well.
<Location virtual> is completely core and would be one central interface.

I'd guess it's finally just a naming issue, isn't it? ("virtual" may not be
intuitive enough anyway)

nd

Re: [PATCH] configurable Location block speed up

Posted by Joshua Slive <jo...@slive.ca>.
On Wed, 4 Feb 2004, [ISO-8859-15] André Malo wrote:

> * gregames@apache.org wrote:
>
> > If there are sites out there that ignore the piece about the content living
> > outside the filesystem, my previous patch which requires no new
> > configuration might create problems.  This version adds an OutsideFilesystem
> > directive, only valid in <Location>, to eliminate the possibility for
> > unpleasant surprises when upgrading httpd.  It causes the directory walk to
> > be skipped if set.  btw, I'm thinking of this as a 2.1 thing which wouldn't
> > be backported.
>
> (without closer looking at the code)
> I'd suggest rather
>
> <Location [virtual] /uri-path>
> </Location>
>
> where the virtual keyword defines that the location is independent from
> filesystem.

Perhaps I'm misunderstanding the issue, but neither of these make sense to
me.  It is not really the behavior of the <Location> block that is
changing.  <Location> always acts outside the filesystem.  What is
changing is that a certain directive inside a <Location> block should be
applied differently.  So shouldn't this just be

SetVirtualHandler server-status
or
SetHandler virtual server-status

Joshua.

Re: [PATCH] configurable Location block speed up

Posted by André Malo <nd...@perlig.de>.
* gregames@apache.org wrote:

> If there are sites out there that ignore the piece about the content living 
> outside the filesystem, my previous patch which requires no new
> configuration might create problems.  This version adds an OutsideFilesystem
> directive, only valid in <Location>, to eliminate the possibility for
> unpleasant surprises when upgrading httpd.  It causes the directory walk to
> be skipped if set.  btw, I'm thinking of this as a 2.1 thing which wouldn't
> be backported.

(without closer looking at the code)
I'd suggest rather

<Location [virtual] /uri-path>
</Location>

where the virtual keyword defines that the location is independent from
filesystem.

nd

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Geoffrey Young <ge...@modperlcookbook.org>.

> Let's do this in 2.1 by splitting out the file system,
> and if the filesystem module isn't handling a request, it won't be serving
> content but also won't be invoking the directory walk or stat-ing files.

this all sounds kinda interesting, and similar to the way auth has been set
up in 2.1 - more directives and modules, but more flexibility and power.

> 
> Oh last observation - it should become (in 2.1) nearly impossible for folks
> to just bork around with the contents of r->filename and r->finfo, first by 
> stripping them from the request rec, and second by providing an API to
> the filesystem module that lets another module link into another file.
> That API would prevent module authors from bypassing the filesystem
> module's internal security. 

this has come up before, and I'd love to see an API that prevents accidental
disagreement between r->filename and r->finfo (for one).  IIRC, there was
supposed to be some discussion at the hackathon about this, but it sounds
like I didn't miss it :)

--Geoff


Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 09:22 AM 2/5/2004, gregames@apache.org wrote:
>>Effect/Issue 1:
>>Bypassing the filesystem canonicalization would be very bad on certain platforms such as windows, depending on case sensitivity, etc.  It would
>>also bypass *user configured* options such as avoiding symlinks.
>
>only for <Location >  If the admin is telling us the content is outside the filesystem, I don't think filesystem canonicalization or symlinks are relevant.

But the question is how the admin communicates that to us.  You are 
suggesting we allow <Location virtual /not/a/file> to say it's outside of the 
file system.  *However* are you preventing a file from being served?

Guess that is the gist of my concern.

>>Effect/Issue 2:
>>Bypassing the directory block walk would ignore the very <Directory >
>>sections that the user explicitly put in their config.  This too is bad.
>
>This is only for <Location > .  There is no change to the behavior of <Directory >.
>
>>to avoid these consequences, and still save ourselves
>>the pain of the filesystem walk and <Directory > block handling...
>
>I'm only changing <Location >

Wait - so you say.  But the effect is to disable <Directory > block handling,
am I confused on this point???  If you ignore valid <Directory > blocks because
somewhere else the admin says "Wait - this location is virtual!" then are you
also preventing those files from being served?

If not, this is what I suggest is a violation of the principle of least surprise.

>>=== Break out file handling as a separate component ===
>>I've proposed in the past the simple directive
>>FilesystemMount /path/to/files
>
>At first glance it looks like a key difference is that you are approaching it by saying that it's OK to have <Location > map to a real filesystem and providing config to explicitly specify the mapping.  I'm advocating going the other way and allowing <Location > to be treated as if it is really outside of the filesystem.

Here's what I'm suggesting:

1. Everything in Apache is at some location.  Location is a predictable factor
   because every URI becomes a location.  A Location block will always
   override all other configurations, location blocks are always applied 
   as least-to-most significant if they are absolute, and then pattern matching
   location blocks are applied after that.

2. Some things in Apache are in the Filesystem, some are in JK, some are
   in an application (a cgi with PATH_INFO).  These aren't always 1:1 to the 
   filesystem, except in the case of filesystem-based modules such as our
   default handler and the current cgi modules.

3. If something is in the Filesystem, then <Directory > and <Files > should
   be applied.  If something is not in the filesystem, then no, they should not, and
   that other backend (e.g. proxy) should have its own container blocks to
   deal with the resource (e.g. <Proxy > blocks.)  Overloading can become
   a very confusing and dangerous game for admins to properly configure their
   servers.

4. If something is not in the Filesystem, there should be no way that our own
   CGI handler or default file handler should be allowed to handle the request.
   If Apache 

5. Whatever resource (Filesystem or what have you) that claims to own the
   resource for config block processing should be the only available handler
   to deal with the resource on the back end.  Jumping from one resource 
   context in the configuration phase to another resource context in the
   handler phase is very dangerous.

Beyond what I've detailed above, I'm flexible about how it works/what the various
directives are, etc.
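
Points 3 to 5 above can be sketched as two small predicates.  This is only
an illustration under stated assumptions: mock_owner, mock_request,
applies_directory_sections() and handler_allowed() are hypothetical names
invented here, not httpd types or APIs, and the idea that the request
records which backend claimed it during config processing is an assumption
of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: which backend claimed the resource during config processing.
 * None of these names exist in httpd; they only illustrate the rules above. */
typedef enum { OWNER_FILESYSTEM, OWNER_PROXY, OWNER_VIRTUAL } mock_owner;

typedef struct {
    mock_owner config_owner;   /* owner recorded in the configuration phase */
} mock_request;

/* Point 3: <Directory > and <Files > apply only to filesystem resources. */
static int applies_directory_sections(const mock_request *r)
{
    assert(r != NULL);
    return r->config_owner == OWNER_FILESYSTEM;
}

/* Points 4 and 5: only the backend that owned the resource for config
 * processing may handle it; a mismatch is refused. */
static int handler_allowed(const mock_request *r, mock_owner handler_owner)
{
    assert(r != NULL);
    return r->config_owner == handler_owner;
}
```

With this shape, a request claimed by a proxy-like backend never gets
<Directory > sections applied and can never fall through to a
filesystem-based handler.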

Bill



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 09:47 AM 2/6/2004, gregames@apache.org wrote:
>William A. Rowe, Jr. wrote:
>>At 12:17 PM 2/5/2004, Joshua Slive wrote:
>>
>>>>I do, however, agree that doing a directory-walk on virtual resources is
>>>>not nice.  But my opinion is that "virtualness" is a property of the
>>>>resource, and hence should be designated when selecting the resource.
>>>>That is why I suggested changing SetHandler rather than <Location>.
>>>
>>>And perhaps I'm going way off in left field here, but why should this be
>>>user-configurable at all?  Shouldn't the (for example) server-status
>>>handler know itself that it is a virtual handler, and therefore indicate
>>>that the directory-walk should be skipped?
>>
>>For example, yes.  But on the other hand, what prevents someone from
>>removing the server-status handler in the fixups phase and tricking us into
>>serving a file.
>
>sounds like we're getting into defense mechanisms against hypothetical malicious modules...  a losing battle IMO

Malice?  Nah, defending against badly written or inherently insecure modules.
But much more to the discussion at hand...

...will administrators grok when <Location virtual "/foo"> is a safe bet?  Will
it proliferate in example configurations in the wild, in unsafe ways?  I think
we can all presume it will.  Look at our frustrations with users that have
essentially open proxy configs, which they set up by following examples
that proliferate out there. 

Just playing defense here.

Bill



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by gr...@apache.org.
William A. Rowe, Jr. wrote:
> At 12:17 PM 2/5/2004, Joshua Slive wrote:
> 
> 
>>>I do, however, agree that doing a directory-walk on virtual resources is
>>>not nice.  But my opinion is that "virtualness" is a property of the
>>>resource, and hence should be designated when selecting the resource.
>>>That is why I suggested changing SetHandler rather than <Location>.
>>
>>And perhaps I'm going way off in left field here, but why should this be
>>user-configurable at all?  Shouldn't the (for example) server-status
>>handler know itself that it is a virtual handler, and therefore indicate
>>that the directory-walk should be skipped?
> 
> 
> For example, yes.  But on the other hand, what prevents someone from
> removing the server-status handler in the fixups phase and tricking us into
> serving a file.

sounds like we're getting into defense mechanisms against hypothetical malicious 
modules...  a losing battle IMO

> I'd like to see the default handler entirely detached from the request if the
> dir_walk code has been bypassed.

if it's zero cost/dirt cheap in mainline code, I don't mind.

Greg


Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 12:17 PM 2/5/2004, Joshua Slive wrote:

>> I do, however, agree that doing a directory-walk on virtual resources is
>> not nice.  But my opinion is that "virtualness" is a property of the
>> resource, and hence should be designated when selecting the resource.
>> That is why I suggested changing SetHandler rather than <Location>.
>
>And perhaps I'm going way off in left field here, but why should this be
>user-configurable at all?  Shouldn't the (for example) server-status
>handler know itself that it is a virtual handler, and therefore indicate
>that the directory-walk should be skipped?

For example, yes.  But on the other hand, what prevents someone from
removing the server-status handler in the fixups phase and tricking us into
serving a file.

I'd like to see the default handler entirely detached from the request if the
dir_walk code has been bypassed.

Bill



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by gr...@apache.org.
William A. Rowe, Jr. wrote:
> At 02:11 PM 2/9/2004, gregames@apache.org wrote:
> 
>>William A. Rowe, Jr. wrote:

>>>Modules can do that today with some very trivial code...
>>
>>I think I see a problem.  No doubt it could be made to work with a simple tweak.
>>
>>SetHandler in the location container sets the handler field in the core "dir" conf during location walk.
> 
> 
> ???  I'm not seeing that, the dir/location walk doesn't set up r->handler at all.

right.  From merge_core_dir_configs():

     if (new->handler) {
         conf->handler = new->handler;
     }

...so currently it only lives in config land until the fixups phase if I'm 
reading the code correctly.  That's the problem.

> You can use the SetHandler directive in location or dir contexts.  I suppose
> if it is set in a location, we set r->handler in the translate_name phase.

sounds good

> Even if it's overridden in a <Directory > block, the <Location > block it
> points at would still override it.  So this change seems harmless.

I'll buy it

>>map_to_storage runs, but r->handler hasn't been updated due to the SetHandler yet.  virtual_map_location doesn't know that it owns this URI.
>>
>>An unnecessary directory walk happens.
> 
> 
> Yes - the module needs to determine ownership in the translate_name phase.
> 
> 
>>core_override_type runs during the fixups phase and sets r->handler to the handler specified in the <Location > block from the handler field in the core "dir" config.  It's too late to affect the directory walk.
>>
>>Another nit is that every module with virtual content would repeat essentially the same code, other than checking the unique handler name of course, and add to mainline path length on every request by increasing the number of map_to_storage routines.  We would depend on each module zapping r->filename.  This doesn't bother me for 2.0 if we can save a directory walk.
> 
> 
> I guess I'm really concerned, after some of the problems and confusion this
> thread has revealed, about changing the behavior at all in 2.0.  

Maybe you're right.

> If we can get
> our own core "not a file" handlers to behave nicely (by fixing r->handler back
> in the translate_name phase) I would feel better about a minor change.  That's
> a pretty restrictive change that would only affect, mod_info, mod_status etc.
> Other authors could do likewise if they understand the ramifications.

The performance of mod_info and mod_status doesn't bother me.  Seeing the system 
calls they generate in a trace does, but I'll get over it.  Isn't mod_dav 
virtual content though from the core's point of view (pls excuse my ignorance)? 
  mod_proxy is, but it already has working special case code scattered 
throughout the core.  (hmmm, might be some hints as to where the hits are for a 
more general solution)

> But it seems after reading all of the comments, we ought to make some
> appropriate design changes in 2.2 so this entire solution is robust without
> breaking existing modules and possibly introducing security issues.  After
> all, Apache 2.0 should be stable, and we should spend some time breaking
> the server in 2.1-dev with the eye to releasing a faster, more optimized 2.2.

Amen

> And yup - *every* handler author ought to make the appropriate change, 
> because only they understand what's up with their content, and if it should 
> be walked by dir and file sections, or not.  We know mod_info and mod_status
> don't need to be, so we can code that ourselves.

right

Greg


Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 02:11 PM 2/9/2004, gregames@apache.org wrote:
>William A. Rowe, Jr. wrote:
>>At 04:34 PM 2/6/2004, gregames@apache.org wrote:
>>
>>>Joshua Slive wrote:
>
>>>>And perhaps I'm going way off in left field here, but why should this be
>>>>user-configurable at all?  Shouldn't the (for example) server-status
>>>>handler know itself that it is a virtual handler, and therefore indicate
>>>>that the directory-walk should be skipped?
>>>
>>>I like it!  assuming we tweak it to be the module's responsibility to declare this property, vs. strictly the handler's (too late) 
>>Modules can do that today with some very trivial code...
>
>I think I see a problem.  No doubt it could be made to work with a simple tweak.
>
>SetHandler in the location container sets the handler field in the core "dir" conf during location walk.

???  I'm not seeing that, the dir/location walk doesn't set up r->handler at all.
You can use the SetHandler directive in location or dir contexts.  I suppose
if it is set in a location, we set r->handler in the translate_name phase.
Even if it's overridden in a <Directory > block, the <Location > block it
points at would still override it.  So this change seems harmless.

>map_to_storage runs, but r->handler hasn't been updated due to the SetHandler yet.  virtual_map_location doesn't know that it owns this URI.
>
>An unnecessary directory walk happens.

Yes - the module needs to determine ownership in the translate_name phase.

>core_override_type runs during the fixups phase and sets r->handler to the handler specified in the <Location > block from the handler field in the core "dir" config.  It's too late to affect the directory walk.
>
>Another nit is that every module with virtual content would repeat essentially the same code, other than checking the unique handler name of course, and add to mainline path length on every request by increasing the number of map_to_storage routines.  We would depend on each module zapping r->filename.  This doesn't bother me for 2.0 if we can save a directory walk.

I guess I'm really concerned, after some of the problems and confusion this
thread has revealed, about changing the behavior at all in 2.0.  If we can get
our own core "not a file" handlers to behave nicely (by fixing r->handler back
in the translate_name phase) I would feel better about a minor change.  That's
a pretty restrictive change that would only affect, mod_info, mod_status etc.
Other authors could do likewise if they understand the ramifications.

But it seems after reading all of the comments, we ought to make some
appropriate design changes in 2.2 so this entire solution is robust without
breaking existing modules and possibly introducing security issues.  After
all, Apache 2.0 should be stable, and we should spend some time breaking
the server in 2.1-dev with the eye to releasing a faster, more optimized 2.2.

And yup - *every* handler author ought to make the appropriate change, 
because only they understand what's up with their content, and if it should 
be walked by dir and file sections, or not.  We know mod_info and mod_status
don't need to be, so we can code that ourselves.

Bill



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by gr...@apache.org.
William A. Rowe, Jr. wrote:
> At 04:34 PM 2/6/2004, gregames@apache.org wrote:
> 
>>Joshua Slive wrote:

>>>And perhaps I'm going way off in left field here, but why should this be
>>>user-configurable at all?  Shouldn't the (for example) server-status
>>>handler know itself that it is a virtual handler, and therefore indicate
>>>that the directory-walk should be skipped?
>>
>>I like it!  assuming we tweak it to be the module's responsibility to declare this property, vs. strictly the handler's (too late) 
> 
> Modules can do that today with some very trivial code...

I think I see a problem.  No doubt it could be made to work with a simple tweak.

SetHandler in the location container sets the handler field in the core "dir" 
conf during location walk.

map_to_storage runs, but r->handler hasn't been updated due to the SetHandler 
yet.  virtual_map_location doesn't know that it owns this URI.

An unnecessary directory walk happens.

core_override_type runs during the fixups phase and sets r->handler to the 
handler specified in the <Location > block from the handler field in the core 
"dir" config.  It's too late to affect the directory walk.

Another nit is that every module with virtual content would repeat essentially 
the same code, other than checking the unique handler name of course, and add to 
mainline path length on every request by increasing the number of map_to_storage 
routines.  We would depend on each module zapping r->filename.  This doesn't 
bother me for 2.0 if we can save a directory walk.
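
The ordering problem described above can be reproduced with a small mock of
the phases.  This is a sketch only: the mock_* types and functions and
run_phases() are invented for illustration and are not httpd's request_rec,
core config, or hook API.  The only thing taken from the discussion is the
order (location walk, translate_name, map_to_storage, fixups) and the idea
of copying the configured handler onto the request early.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock types for illustration only; these are NOT httpd's structures. */
typedef struct {
    const char *handler;   /* set by SetHandler during the location walk */
} mock_dir_conf;

typedef struct {
    const char *handler;   /* what a map_to_storage hook would inspect */
    int dir_walk_ran;      /* 1 if the mock directory walk happened */
} mock_request;

/* Proposed early step: copy the configured handler onto the request. */
static void mock_translate_name(mock_request *r, const mock_dir_conf *conf)
{
    if (conf->handler != NULL)
        r->handler = conf->handler;
}

/* A virtual module claims the request only if r->handler names it. */
static void mock_map_to_storage(mock_request *r, const char *my_handler)
{
    if (r->handler != NULL && strcmp(r->handler, my_handler) == 0)
        return;              /* claimed: directory walk is skipped */
    r->dir_walk_ran = 1;     /* not claimed: the core walks the directories */
}

/* Fixups: where the handler is copied today, too late to affect the walk. */
static void mock_fixups(mock_request *r, const mock_dir_conf *conf)
{
    if (conf->handler != NULL)
        r->handler = conf->handler;
}

/* Run the mock phases in order; returns 1 if a directory walk happened. */
static int run_phases(const char *configured, int set_early)
{
    mock_dir_conf conf = { configured };
    mock_request r = { NULL, 0 };

    assert(configured != NULL);
    if (set_early)
        mock_translate_name(&r, &conf);
    mock_map_to_storage(&r, "server-status");
    mock_fixups(&r, &conf);
    return r.dir_walk_ran;
}
```

Calling run_phases("server-status", 0) models the current behavior, where
the walk still happens; run_phases("server-status", 1) models setting the
handler before map_to_storage, where the walk is skipped.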

Greg

> static int virtual_map_location(request_rec *r)
> {
>     /* Test that SetHandler forced to this content 
>      * otherwise skip it.  r->handler may still be NULL here.
>      */
>     if (!r->handler || strcmp(r->handler, "info-handler") != 0)
>         return DECLINED;
> 
>     /* this info-handler won't deal with the filename
>      * so null the filename to ensure no file is served.
>      */
>     r->filename = "";
> 
>     /* Don't let the core map_to_storage hook handle this,
>      * We don't need directory/file_walk.  See mod_proxy.c
>      * if our own custom <Restrictions > blocks were needed.
>      */
>     return OK;
> }
> 
> 
> static void register_hooks(apr_pool_t *p)
> {
> ...
>     /* Suppress default dir_walk behavior, our module's
>      * content is virtual.
>      */
>     ap_hook_map_to_storage(virtual_map_location, NULL, NULL, APR_HOOK_MIDDLE);
> 
> }
> 
> 
> 



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by André Malo <nd...@perlig.de>.
* "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:

> >The only issue in mod_rewrite could be the Type/Handler forcing which
> >also occurs in fixup, which is the right place for such things (to force
> >something).
> 
> Don't they result in internal redirects?  In that case there is nothing
> wrong when the internal redirect starts a new request cycle with the
> modified uri/filename.

Type or handler forcing just does ap_set_content_type or sets r->handler
respectively. (like ForceType does).

nd

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 01:37 PM 2/9/2004, André Malo wrote:
>* "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:
>
>> You hit the nail on the head - Alias (and mod_rewrite) cause us the greatest
>> grief in fixing this set of issues.  *IF* they all are parsed in the
>> translate name phase we are fine, since map_to_storage is run after
>> translate_name is done.
>> 
>> If they are not handled up front we have problems (some translations can
>> occur in the fixup phase, IIRC, which is wrong IMHO.)
>
>No. Alias translations and RewriteRules in server context are handled in the
>translate_name hook. Fixup rewrites (->internal redirects) or
>mod_alias-redirects in directory context are only triggered by per-directory
>configuration, which would be skipped in that case.

Ahhh of course.

>The only issue in mod_rewrite could be the Type/Handler forcing which also
>occurs in fixup, which is the right place for such things (to force something).

Don't they result in internal redirects?  In that case there is nothing wrong when
the internal redirect starts a new request cycle with the modified uri/filename.

Bill




Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by André Malo <nd...@perlig.de>.
* "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:

> You hit the nail on the head - Alias (and mod_rewrite) cause us the greatest
> grief in fixing this set of issues.  *IF* they all are parsed in the
> translate name phase we are fine, since map_to_storage is run after
> translate_name is done.
> 
> If they are not handled up front we have problems (some translations can
> occur in the fixup phase, IIRC, which is wrong IMHO.)

No. Alias translations and RewriteRules in server context are handled in the
translate_name hook. Fixup rewrites (->internal redirects) or
mod_alias-redirects in directory context are only triggered by per-directory
configuration, which would be skipped in that case.

The only issue in mod_rewrite could be the Type/Handler forcing which also
occurs in fixup, which is the right place for such things (to force something).

nd

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
You hit the nail on the head - Alias (and mod_rewrite) cause us the greatest
grief in fixing this set of issues.  *IF* they all are parsed in the translate name
phase we are fine, since map_to_storage is run after translate_name is done.

If they are not handled up front we have problems (some translations can
occur in the fixup phase, IIRC, which is wrong IMHO.)

The other risk, that we reset filename because we might hit the core handler,
could be much cleaner if we just failed 500 right out of the core handler any
time that dir_walk isn't run against the resource.  I recall some arguments
to that fix in 2.0 - I'd like to force the hand on that issue in 2.2.  Unsetting
the r->filename would become unnecessary in this trivial example.
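
A rough sketch of the "fail 500 out of the core handler" idea, using mock
types rather than httpd's own: mock_request, its dir_walk_ran flag, the
MOCK_HTTP_* constants and mock_default_handler() are all assumptions made
up for this illustration, and how the real core would record that dir_walk
ran is left open.

```c
#include <assert.h>
#include <stddef.h>

/* Mock status codes and request type; NOT httpd's definitions. */
#define MOCK_HTTP_OK                    200
#define MOCK_HTTP_INTERNAL_SERVER_ERROR 500

typedef struct {
    const char *filename;  /* may still point at a real file */
    int dir_walk_ran;      /* hypothetical flag: 1 if dir_walk was run */
} mock_request;

/* Refuse to serve any file whose request skipped the directory walk,
 * so a module no longer has to blank out the filename itself. */
static int mock_default_handler(const mock_request *r)
{
    assert(r != NULL);
    if (!r->dir_walk_ran)
        return MOCK_HTTP_INTERNAL_SERVER_ERROR;

    /* ... a real handler would open and send r->filename here ... */
    return MOCK_HTTP_OK;
}
```

The point of the design is that the safety check lives in one place, the
file-serving handler, instead of relying on every virtual module to reset
the filename.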

Bill

At 08:08 AM 2/9/2004, Brian Akins wrote:
>William A. Rowe, Jr. wrote:
>>At 04:34 PM 2/6/2004, gregames@apache.org wrote:
>
>>    /* this info-handler won't deal with the filename
>>     * so null the filename to ensure no file is served.
>>     */
>>    r->filename = "";
>>
>
>
>What if you have something like:
>
><Location /dynamic-stuff>
>        SetHandler virtual-dynamic-handler
></Location>
>
>Alias /normal/stuff/dyn /dynamic-stuff
>
>
>How would all this affect this situation?  IE, we would still need to know that /normal/stuff/dyn/1/2/test.html became /dynamic-stuff/1/2/test.html



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Brian Akins <ba...@web.turner.com>.
Joshua Slive wrote:
> On Mon, 9 Feb 2004, Brian Akins wrote:
> 
>><Location /dynamic-stuff>
>>	SetHandler virtual-dynamic-handler
>></Location>
>>
>>Alias /normal/stuff/dyn /dynamic-stuff
>>
>>
>>How would all this affect this situation?  IE, we would still need to
>>know that /normal/stuff/dyn/1/2/test.html became
>>/dynamic-stuff/1/2/test.html
> 
> 
> Alias maps URLs to the filesystem.  The second argument must be a
> filesystem path, and should not be mapped against any <Location>.
> 
> Joshua.
> 

OK.  So in my example, in the proposed system, I should use Rewrite? So 
the alias would become:

RewriteRule /normal/stuff/dyn(.*) /dynamic-stuff/$1

I am in agreement with the need to not do unnecessary stats, but I don't 
want to have to write my own "alias" style routine...

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Joshua Slive <jo...@slive.ca>.
On Mon, 9 Feb 2004, Brian Akins wrote:
> <Location /dynamic-stuff>
> 	SetHandler virtual-dynamic-handler
> </Location>
>
> Alias /normal/stuff/dyn /dynamic-stuff
>
>
> How would all this affect this situation?  IE, we would still need to
> know that /normal/stuff/dyn/1/2/test.html became
> /dynamic-stuff/1/2/test.html

Alias maps URLs to the filesystem.  The second argument must be a
filesystem path, and should not be mapped against any <Location>.

Joshua.

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Brian Akins <ba...@web.turner.com>.
William A. Rowe, Jr. wrote:
> At 04:34 PM 2/6/2004, gregames@apache.org wrote:

>     /* this info-handler won't deal with the filename
>      * so null the filename to ensure no file is served.
>      */
>     r->filename = "";
> 
>


What if you have something like:

<Location /dynamic-stuff>
	SetHandler virtual-dynamic-handler
</Location>

Alias /normal/stuff/dyn /dynamic-stuff


How would all this affect this situation?  IE, we would still need to 
know that /normal/stuff/dyn/1/2/test.html became 
/dynamic-stuff/1/2/test.html
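
The URI-to-URI mapping in this example is a plain prefix substitution.  As
a minimal sketch in standalone C (this is not mod_alias or mod_rewrite code;
map_uri_prefix() is a hypothetical helper, and it assumes the "from" prefix
has no trailing slash):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replace the URI prefix `from` with `to`.
 * Assumes `from` has no trailing slash.
 * Returns 1 and fills `out` on success; returns 0 if the prefix does not
 * match on a path-segment boundary or if `out` is too small. */
static int map_uri_prefix(const char *uri, const char *from, const char *to,
                          char *out, size_t outlen)
{
    size_t flen;
    int n;

    assert(uri != NULL && from != NULL && to != NULL && out != NULL);

    flen = strlen(from);
    if (strncmp(uri, from, flen) != 0)
        return 0;

    /* Require a segment boundary so "/a/dyn" does not match "/a/dynamo". */
    if (uri[flen] != '\0' && uri[flen] != '/')
        return 0;

    n = snprintf(out, outlen, "%s%s", to, uri + flen);
    return n >= 0 && (size_t)n < outlen;
}
```

Given the prefix /normal/stuff/dyn and the replacement /dynamic-stuff, the
URI /normal/stuff/dyn/1/2/test.html maps to /dynamic-stuff/1/2/test.html.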





-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 04:34 PM 2/6/2004, gregames@apache.org wrote:
>Joshua Slive wrote:
>>>I do, however, agree that doing a directory-walk on virtual resources is
>>>not nice.  But my opinion is that "virtualness" is a property of the
>>>resource, and hence should be designated when selecting the resource.
>>>That is why I suggested changing SetHandler rather than <Location>.
>>
>>And perhaps I'm going way off in left field here, but why should this be
>>user-configurable at all?  Shouldn't the (for example) server-status
>>handler know itself that it is a virtual handler, and therefore indicate
>>that the directory-walk should be skipped?
>
>I like it!  assuming we tweak it to be the module's responsibility to declare this property, vs. strictly the handler's (too late).

Modules can do that today with some very trivial code...






static int virtual_map_location(request_rec *r)
{
    /* Test that SetHandler forced this request to our content,
     * otherwise skip it.  Guard against a NULL r->handler before
     * comparing.
     */
    if (!r->handler || strcmp(r->handler, "info-handler") != 0)
        return DECLINED;

    /* this info-handler won't deal with the filename
     * so empty the filename to ensure no file is served.
     */
    r->filename = "";

    /* Don't let the core map_to_storage hook handle this,
     * We don't need directory/file_walk.  See mod_proxy.c
     * if our own custom <Restrictions > blocks were needed.
     */
    return OK;
}


static void register_hooks(apr_pool_t *p)
{
...
    /* Suppress default dir_walk behavior, our module's
     * content is virtual.
     */
    ap_hook_map_to_storage(virtual_map_location, NULL, NULL, APR_HOOK_MIDDLE);

}



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by gr...@apache.org.
Joshua Slive wrote:
>>I do, however, agree that doing a directory-walk on virtual resources is
>>not nice.  But my opinion is that "virtualness" is a property of the
>>resource, and hence should be designated when selecting the resource.
>>That is why I suggested changing SetHandler rather than <Location>.
> 
> 
> And perhaps I'm going way off in left field here, but why should this be
> user-configurable at all?  Shouldn't the (for example) server-status
> handler know itself that it is a virtual handler, and therefore indicate
> that the directory-walk should be skipped?

I like it!  assuming we tweak it to be the module's responsibility to declare 
this property, vs. strictly the handler's (too late).

Greg



Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Joshua Slive <jo...@slive.ca>.
> I do, however, agree that doing a directory-walk on virtual resources is
> not nice.  But my opinion is that "virtualness" is a property of the
> resource, and hence should be designated when selecting the resource.
> That is why I suggested changing SetHandler rather than <Location>.

And perhaps I'm going way off in left field here, but why should this be
user-configurable at all?  Shouldn't the (for example) server-status
handler know itself that it is a virtual handler, and therefore indicate
that the directory-walk should be skipped?

Joshua.

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Joshua Slive <jo...@slive.ca>.
I just want to add a couple notes here on what I see as user-expectations.

> > The problem is that you want to add layers of additional directives, which
> > would change the behavior of <Directory > or <Location >,
>
> only <Location >, in a way that IMO is consistent with the existing
> documentation, but not the existing code of course.
[...]
> A complication to treating <Location > as if it's outside the filesystem
> is the doc that states <Location /> is a quick way to apply a config to
> the whole server.  It will indeed do that today, unfortunately.  I
> believe another quick way to get the same result is to put your config
> directives outside of any container.  If that's true, I would prefer to
> see that doc for <Location /> removed.  If the change to the behavior of
> <Location > is configurable, this is not an issue.  If it's not
> configurable, then we probably should treat <Location /> as if it's in
> the filesystem - a special case for backwards compatibility.

I wrote the particular doc you quoted earlier, and I meant it more as a
recommendation (for security reasons) than a documentation of how apache
works.  The true description of how apache is supposed to function is
here:

http://httpd.apache.org/docs-2.0/sections.html

(But it is a little vague, because my knowledge of the internals of
configuration merging was not sufficient to clarify it the last time I
went through.)

Many people do, in fact, use <Location> for filesystem-related things.
See the examples in the mod_dav docs for one:
http://httpd.apache.org/docs-2.0/mod/mod_dav.html

Although I don't think this is esthetically correct, it is not
particularly dangerous.  Using <Location> to provide *extra* access is
okay, because if you circumvent the <Location> using filesystem tricks,
you just lose the access.  Using <Location> to *restrict* access is not
safe, because the restrictions may be circumvented.

The other important thing to note is that <Location> is often used for a
reason that is particular to apache configuration merging:
those sections are evaluated last.  That means that
<Location />
Order deny,allow
Deny from all
Allow from local.net
</Location>
has a very different effect than the same directives placed in <Directory
/> because it will be evaluated after all <Directory> and <Files>
sections which could potentially override directives.

I'm not saying that this is a strictly correct use of <Location>, but it
is very useful and there is no other easy way to accomplish this "apply
this directive after everything else" effect.

I do, however, agree that doing a directory-walk on virtual resources is
not nice.  But my opinion is that "virtualness" is a property of the
resource, and hence should be designated when selecting the resource.
That is why I suggested changing SetHandler rather than <Location>.

Joshua.
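
The "evaluated last" effect described above can be sketched as a small
merge loop.  This is a deliberately simplified model, not httpd's real
per-directory merge: enum access, merge_access and effective_access are
hypothetical names, and the real Order/Deny/Allow merging is more
involved.  The only point is that a later section overrides an earlier
one wherever it sets something.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-state setting: a section either leaves access
 * alone or sets it explicitly.
 */
enum access { ACCESS_UNSET, ACCESS_ALLOW, ACCESS_DENY };

/* A later section wins, but only where it actually says something. */
static enum access merge_access(enum access base, enum access over)
{
    return over != ACCESS_UNSET ? over : base;
}

/* Merge sections in evaluation order.  The caller passes them in the
 * order the message describes: <Directory>, then <Files>, then
 * <Location> last.
 */
static enum access effective_access(const enum access *sections, size_t n)
{
    enum access result = ACCESS_UNSET;
    size_t i;

    assert(sections != NULL || n == 0);

    for (i = 0; i < n; i++)
        result = merge_access(result, sections[i]);
    return result;
}
```

In this model a deny placed in the last (<Location>) slot survives any
allow set by an earlier <Directory> or <Files> section, whereas the same
deny placed in the first (<Directory>) slot can be overridden by a later
section.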

Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Greg Ames <gr...@apache.org>.
Greg Marr wrote:

>> I'm only changing <Location > ... <Directory > is unaffected.
> 
> 
> Well, that's not entirely true.  The <Directory> is affected indirectly, 
> because it no longer applies.  The behavior currently is: it applies to 
> everything it matches.  This would change it to: it applies to 
> everything it matches unless it also matches a <Location>, in which case 
> it doesn't apply.

OK, got it.  Thanks for the clear explanation.

This could possibly be handled by checking for overlaps between <Location > and 
<Directory > at config time, especially if they are sorted in least-to-most 
specific order as wrowe suggests.

But my current favorite idea is Joshua's suggestion for the module to declare 
that the content is virtual, pure and simple.  In that case, any such overlaps 
are an admin screwup IMO.  GET /server-status/index.html doesn't make any sense 
to me even if such a directory exists.

Greg





Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 10:43 AM 2/5/2004, Greg Marr wrote:
>At 10:22 AM 2/5/2004, gregames@apache.org wrote:
>>Thanks for the feedback, Will.
>>
>>William A. Rowe, Jr. wrote:
>>>At 03:39 PM 2/4/2004, gregames@apache.org wrote:
>>>
>>>>But then if I play devil's advocate, someone could see the new directive and turn it on when it's not appropriate and cause Bad Things to happen.  Mainly I'm looking for comments on whether this should be configurable or not.
>>>
>>>Yes, I'm one who will agree with your devil's position :)  I know the problem
>>>you are trying to solve, and changing the behavior of <Directory > isn't quite the solution...
>>
>>I'm only changing <Location > ... <Directory > is unaffected.
>
>Well, that's not entirely true.  The <Directory> is affected indirectly, because it no longer applies.  The behavior currently is: it applies to everything it matches.  This would change it to: it applies to everything it matches unless it also matches a <Location>, in which case it doesn't apply.

Right - I understand.

I'm saying (as you pointed out) that it's too dangerous :)  Defies the principle
of least surprise.  They configure <Dir > blocks only to have them ignored.

The only safe way to do what you are suggesting, is to yank the default file
handler from under the request anytime the user has overridden the <Dir >
checks.  Simplest way to do that is bind them together as a filesystem
module.

Oh, the same problems (of yanking <Dir > out from under the user) would be
true of the CGI handler, as well as third party modules that do things with
the filesystem.  It would be important that they could explicitly enable the
module to add <Dir > handling.

Which brings us back to square one.  I'd actually argued a long time ago
that a module that handles requests outside of the filesystem should be
responsible for bypassing the <Dir > handler itself.  For example, mod_proxy
satisfies the map_to_storage hook on its own, therefore never hitting <Dir >.

Bill




Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by Greg Marr <gr...@alum.wpi.edu>.
At 10:22 AM 2/5/2004, gregames@apache.org wrote:
>Thanks for the feedback, Will.
>
>William A. Rowe, Jr. wrote:
>>At 03:39 PM 2/4/2004, gregames@apache.org wrote:
>>
>>>But then if I play devil's advocate, someone could see the new directive 
>>>and turn it on when it's not appropriate and cause Bad Things to 
>>>happen.  Mainly I'm looking for comments on whether this should be 
>>>configurable or not.
>>
>>Yes, I'm one who will agree with your devil's position :)  I know the problem
>>you are trying to solve, and changing the behavior of <Directory > isn't 
>>quite the solution...
>
>I'm only changing <Location > ... <Directory > is unaffected.

Well, that's not entirely true.  The <Directory> is affected indirectly, 
because it no longer applies.  The behavior currently is: it applies to 
everything it matches.  This would change it to: it applies to everything 
it matches unless it also matches a <Location>, in which case it doesn't apply.


Re: FileSystem v.s. Other Resources [was configurable Location?]

Posted by gr...@apache.org.
Thanks for the feedback, Will.

William A. Rowe, Jr. wrote:
> At 03:39 PM 2/4/2004, gregames@apache.org wrote:
> 
>>But then if I play devil's advocate, someone could see the new directive and turn it on when it's not appropriate and cause Bad Things to happen.  Mainly I'm looking for comments on whether this should be configurable or not.
> 
> 
> Yes, I'm one who will agree with your devil's position :)  I know the problem
> you are trying to solve, 

great!

> and changing the behavior of <Directory > isn't quite the solution...

I'm only changing <Location > ... <Directory > is unaffected.

> 
> Issue 1:
> 
> Walking the filesystem (the stat aspect of Directory and File) for something
> that doesn't live there is stupid :)  We agree, and this must be fixed in 2.1

cool!

> Issue 2:
> 
> Matching directories when something is outside of the filesystem is simply
> wasteful, and here too we agree.
> 
> The problem is that you want to add layers of additional directives, which
> would change the behavior of <Directory > or <Location >, 

only <Location >, in a way that IMO is consistent with the existing 
documentation, but not the existing code of course.

> in ways that would
> allow the user to seriously compromise what they had explicitly configured
> for security.  This is unwise.  Let's look at the two issues above;
>
> Effect/Issue 1:
> 
> Bypassing the filesystem canonicalization would be very bad on certain 
> platforms such as windows, depending on case sensitivity, etc.  It would
> also bypass *user configured* options such as avoiding symlinks.

only for <Location >  If the admin is telling us the content is outside the 
filesystem, I don't think filesystem canonicalization or symlinks are relevant.

> Effect/Issue 2:
> 
> Bypassing the directory block walk would ignore the very <Directory >
> sections that the user explicitly put in their config.  This too is bad.

This is only for <Location > .  There is no change to the behavior of <Directory >.

> 
> There is only one way

that sounds kinda religious...in software there are generally a number of ways 
to solve a problem :-)

> to avoid these consequences, and still save ourselves
> the pain of the filesystem walk and <Directory > block handling...

I'm only changing <Location >
> 
> === Break out file handling as a separate component ===
> 
> I've proposed in the past the simple directive
> 
> FilesystemMount /path/to/files

At first glance it looks like a key difference is that you are approaching it by 
saying that it's OK to have <Location > map to a real filesystem and providing 
config to explicitly specify the mapping.  I'm advocating going the other way 
and allowing <Location > to be treated as if it is really outside of the filesystem.

A complication to treating <Location > as if it's outside the filesystem is the 
doc that states <Location /> is a quick way to apply a config to the whole 
server.  It will indeed do that today, unfortunately.  I believe another quick 
way to get the same result is to put your config directives outside of any 
container.  If that's true, I would prefer to see that doc for <Location /> 
removed.  If the change to the behavior of <Location > is configurable, this is 
not an issue.  If it's not configurable, then we probably should treat <Location 
/> as if it's in the filesystem - a special case for backwards compatibility.

I'd like to study your comments more after my coffee kicks in :-)  Obviously 
you've given a lot of thought to this issue so I want to see if I can understand 
where you're coming from.

Greg


FileSystem v.s. Other Resources [was configurable Location?]

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:39 PM 2/4/2004, gregames@apache.org wrote:
>But then if I play devil's advocate, someone could see the new directive and turn it on when it's not appropriate and cause Bad Things to happen.  Mainly I'm looking for comments on whether this should be configurable or not.

Yes, I'm one who will agree with your devil's position :)  I know the problem
you are trying to solve, and changing the behavior of <Directory > isn't quite
the solution...

Issue 1:

Walking the filesystem (the stat aspect of Directory and File) for something
that doesn't live there is stupid :)  We agree, and this must be fixed in 2.1

Issue 2:

Matching directories when something is outside of the filesystem is simply
wasteful, and here too we agree.

The problem is that you want to add layers of additional directives, which
would change the behavior of <Directory > or <Location >, in ways that would
allow the user to seriously compromise what they had explicitly configured
for security.  This is unwise.  Let's look at the two issues above;

Effect/Issue 1:

Bypassing the filesystem canonicalization would be very bad on certain 
platforms such as windows, depending on case sensitivity, etc.  It would
also bypass *user configured* options such as avoiding symlinks.

Effect/Issue 2:

Bypassing the directory block walk would ignore the very <Directory >
sections that the user explicitly put in their config.  This too is bad.

There is only one way to avoid these consequences, and still save ourselves
the pain of the filesystem walk and <Directory > block handling...

=== Break out file handling as a separate component ===

I've proposed in the past the simple directive

FilesystemMount /path/to/files

Consider the existing directive, DocumentRoot, as shorthand for;

<Location />
FilesystemMount /old/root/
</Location>

Also consider the Alias /foo /path/to/bar directive as shorthand for;

<Location /foo>
FilesystemMount /path/to/bar
</Location>

But wait, there is more :)  We have a default handler.  Simply modify it to
be the default_file_handler.  Insert it ONLY if the appropriate FilesystemMount
directive exists in the particular location block.

What happens to requests that don't have a FilesystemMount block,
and don't have another handler (e.g. SetHandler info-handler or whatever)?
These requests would error out with a 500 - there is no way for the server
to handle them.

What happens to our old default handler (inserted at all times?)  It becomes
a simple error-500 handler if nothing else intercepts the request.

Note that the exact semantics could change (my naming isn't all that great
either) - but the key detail is that mount directives within a given <Location >
are much safer and easier to follow than arbitrary Mount src dst directives
which may be order and/or length dependent, creating confusion for the user.

So in short, veto on the original proposal to give users gun to point at foot.
+1 on the concepts that we quit interrogating the filesystem when the user
wants to serve non-files, and quit walking directory blocks for things that
do not live in directories.  Let's do this in 2.1 by splitting out the file system,
and if the filesystem module isn't handling a request, it won't be serving
content but also won't be invoking the directory walk or stat-ing files.

Oh last observation - it should become (in 2.1) nearly impossible for folks
to just bork around with the contents of r->filename and r->finfo, first by 
stripping them from the request rec, and second by providing an API to
the filesystem module that lets another module link into another file.
That API would prevent module authors from bypassing the filesystem
module's internal security.  If they want different behavior, they can plug
in their own backend handler to deal with the whole request process.

Bill
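
The handler selection proposed above might look roughly like the
following sketch.  FilesystemMount is only a proposal in this thread,
so everything here is hypothetical: struct location_conf, enum chosen
and choose_handler are illustrative names, and letting an explicit
SetHandler take precedence over the file handler is an assumption the
message does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical merged per-location configuration for the proposal. */
struct location_conf {
    const char *filesystem_mount;  /* NULL if no FilesystemMount applies */
    const char *handler;           /* NULL if no SetHandler applies */
};

enum chosen {
    CHOSEN_EXPLICIT_HANDLER,  /* e.g. SetHandler info-handler */
    CHOSEN_FILE_HANDLER,      /* the proposed default_file_handler */
    CHOSEN_ERROR_500          /* nothing can serve the request */
};

static enum chosen choose_handler(const struct location_conf *conf)
{
    assert(conf != NULL);

    /* Assumption: an explicitly configured handler goes first. */
    if (conf->handler != NULL)
        return CHOSEN_EXPLICIT_HANDLER;

    /* The file handler is inserted only where a mount exists. */
    if (conf->filesystem_mount != NULL)
        return CHOSEN_FILE_HANDLER;

    /* No mount and no handler: the old default handler becomes a
     * plain error-500 handler.
     */
    return CHOSEN_ERROR_500;
}
```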



Re: [PATCH] configurable Location block speed up

Posted by Greg Ames <gr...@apache.org>.
Glenn wrote:
> On Fri, Feb 06, 2004 at 10:37:51AM -0500, gregames@apache.org wrote:
> 
>>but Joshua has excellent points about "virtualness" being a property of the 
>>handler.  Yes, the server-status handler should know that it is virtual, 
>>but the handler hook is too late to skip the directory walk.  But the 
>>mod_status maintainers (i.e. us) know that mod_status is virtual, so maybe 
>>it could tell the core earlier somehow.  An earlier hook?  preferably 
>>something at config time since the virtualness doesn't ever change?
> 
> 
> That got me thinking about a "handler capabilities" set of flags,
> say, r->handler_attr, which is initialized to zeroes for each request.
> The first single bit capability would be for, say,
> HANDLER_ATTR_VIRTUAL_PATH (as opposed to filesystem-backed).

something like that.  I don't know if there are other uses for handler 
capabilities so it might be overkill at the moment.

> In processing directives, SetHandler code could modify r->handler_attr

a nit - I think we would want to hide this type of info in something private to 
the core, like the core_request_config.

> so that mod_status could set HANDLER_ATTR_VIRTUAL_PATH, after which
> <Directory> and <Files> would be skipped.  Hopefully it could be worked
> out that <Directory> and <Files> would be handled up until the point
> where the request became virtual (/directory/path/then/virtual_path)

I'm still thinking of an all-or-nothing approach, i.e. it's in the filesystem or 
it's not, at least at first.  That's a pretty big change - look at the comments 
I'm getting and the cases I missed.

> When it comes time to run the handler hook, modules could check this
> flag in the same way that they check r->handler and r->content_type
> to see if they should go ahead or DECLINE.  Alternatively, the 'module'
> structure could be expanded to include a capabilities set and only
> modules that flagged a capability would be given a chance to handle
> the request.  Since the default handler only handles filesystem-backed
> requests, it and other filesystem-backed modules would never see the
> request.
>   e.g. if (!(r->handler_attr & HANDLER_ATTR_VIRTUAL_PATH)
> 	   || (foo_module->capabilities & HANDLER_ATTR_VIRTUAL_PATH))
> 
> Cheers,
> Glenn
> 
> 


Re: [PATCH] configurable Location block speed up

Posted by Glenn <gs...@gluelogic.com>.
On Fri, Feb 06, 2004 at 10:37:51AM -0500, gregames@apache.org wrote:
> but Joshua has excellent points about "virtualness" being a property of the 
> handler.  Yes, the server-status handler should know that it is virtual, 
> but the handler hook is too late to skip the directory walk.  But the 
> mod_status maintainers (i.e. us) know that mod_status is virtual, so maybe 
> it could tell the core earlier somehow.  An earlier hook?  preferably 
> something at config time since the virtualness doesn't ever change?

That got me thinking about a "handler capabilities" set of flags,
say, r->handler_attr, which is initialized to zeroes for each request.
The first single bit capability would be for, say,
HANDLER_ATTR_VIRTUAL_PATH (as opposed to filesystem-backed).

In processing directives, SetHandler code could modify r->handler_attr
so that mod_status could set HANDLER_ATTR_VIRTUAL_PATH, after which
<Directory> and <Files> would be skipped.  Hopefully it could be worked
out that <Directory> and <Files> would be handled up until the point
where the request became virtual (/directory/path/then/virtual_path)

When it comes time to run the handler hook, modules could check this
flag in the same way that they check r->handler and r->content_type
to see if they should go ahead or DECLINE.  Alternatively, the 'module'
structure could be expanded to include a capabilities set and only
modules that flagged a capability would be given a chance to handle
the request.  Since the default handler only handles filesystem-backed
requests, it and other filesystem-backed modules would never see the
request.
  e.g. if (!(r->handler_attr & HANDLER_ATTR_VIRTUAL_PATH)
	   || (foo_module->capabilities & HANDLER_ATTR_VIRTUAL_PATH))

Cheers,
Glenn
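
The capability check in the message above can be written out as a small
self-contained sketch.  None of this exists in httpd: handler_attr,
capabilities and HANDLER_ATTR_VIRTUAL_PATH are the names proposed in
the message, while struct sketch_request, struct sketch_module and
module_may_handle are stand-ins invented here so the example compiles
on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* First proposed capability bit: the path is virtual rather than
 * filesystem-backed.
 */
#define HANDLER_ATTR_VIRTUAL_PATH 0x01u

/* Stand-ins for request_rec and module, reduced to the proposed fields. */
struct sketch_request { unsigned handler_attr; };
struct sketch_module  { unsigned capabilities; };

/* The condition from the message: a module gets a chance at the
 * request if the request is not virtual, or if the module has flagged
 * the virtual-path capability.
 */
static bool module_may_handle(const struct sketch_request *r,
                              const struct sketch_module *m)
{
    assert(r != NULL && m != NULL);

    return !(r->handler_attr & HANDLER_ATTR_VIRTUAL_PATH)
        || (m->capabilities & HANDLER_ATTR_VIRTUAL_PATH);
}
```

Under this check a filesystem-backed module (capabilities of zero)
never sees a request marked virtual, while a virtual-capable module
still sees ordinary requests and would decline them by looking at
r->handler as it does today.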

Re: [PATCH] configurable Location block speed up

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:08 PM 2/9/2004, gregames@apache.org wrote:
>Ben Laurie wrote:
>>gregames@apache.org wrote: 
>
>>>>>>or Joshua's "virtual" keyword on <Location >, which I like better the more I think about it.
>>>>>
>>>>>ooops... s/Joshua/André/
>>>>>
>>>>>but Joshua has excellent points about "virtualness" being a property of the handler.  Yes, the server-status handler should know that it is virtual, but the handler hook is too late to skip the directory walk.  But the mod_status maintainers (i.e. us) know that mod_status is virtual, so maybe it could tell the core earlier somehow.  An earlier hook?  preferably something at config time since the virtualness doesn't ever change?
>>>>
>>>>Isn't this going to be horribly messy? The current system says "oi, you - here's a URL, do you handle it?" whereas you'd need to have "oi, you, here's a pattern that might match a URL you handle, does it?".
>>>
>>>it should be clean.  I'm thinking the module somehow communicates to the core that all its URLs are virtual, i.e., they don't map to the filesystem no way no how not ever.  Then we still need the
>>>
>>><Location /server-status >
>>>   SetHandler status_handler    # or whatever it is
>>></Location>
>>>
>>>config/processing to take care of the pattern matching of the URLs, just like today.  If both of those things line up, then we can skip the directory walk.
>>
>>Just to be clear: is the idea that you determine the handler (without a directory walk), then determine whether to do the directory walk from that?
>
>right.  The first part already happens.
>
>>If so, aren't there cases where the handler gets determined during the directory walk, by .htaccess?
>
>certainly.  But does it make sense for a given URL to have a handler set by location walk whose content is known to be virtual, and then have yet another handler set during directory walk/.htaccess processing?

Absolutely not.  In fact, if it's not a file resource, there should never be any
file-based .htaccess (it's left as an exercise to the reader to picture a zip file
handler which could extract .htaccess information relevant to the zip contents,
or an .htaccess resource in a database that corresponds to other database
records :-)

>With the config above, we already determine the handler today during the location walk.  The change I'd like to see is that if this happens and also the module declares that its content is virtual, we skip the directory walk/.htaccess.

Right on - if the *module* is virtual it should bypass these things.

>This means you couldn't have <Directory /server-status> or its .htaccess equivalent do anything if we have the config above, assuming such a directory actually exists and we declared that the status handler generates virtual content.  IMO that's goodness because it is confusing as hell to support overlapping Location and Directory blocks*.  For me, that falls into the "unpredictable results" category even though a few of us know what it will do today.
>
>* with the possible exception of <Location /> which is documented as being global.

Agreed (<Location > always wins).  What is /server-status - is it a <File >
or a <Directory > ???  Of course it's neither and flip a coin on what Apache
does with it.  (Hint, if it isn't found and the leading segments exist as
directories, then it is a file, not a directory.)

Bill



Re: [PATCH] configurable Location block speed up

Posted by gr...@apache.org.
Ben Laurie wrote:
> gregames@apache.org wrote: 

>>>>> or Joshua's "virtual" keyword on <Location >, which I like better 
>>>>> the more I think about it.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ooops... s/Joshua/André/
>>>>
>>>> but Joshua has excellent points about "virtualness" being a property 
>>>> of the handler.  Yes, the server-status handler should know that it 
>>>> is virtual, but the handler hook is too late to skip the directory 
>>>> walk.  But the mod_status maintainers (i.e. us) know that mod_status 
>>>> is virtual, so maybe it could tell the core earlier somehow.  An 
>>>> earlier hook?  preferably something at config time since the 
>>>> virtualness doesn't ever change?
>>>
>>>
>>>
>>>
>>> Isn't this going to be horribly messy? The current system says "oi, 
>>> you - here's a URL, do you handle it?" whereas you'd need to have 
>>> "oi, you, here's a pattern that might match a URL you handle, does it?".
>>
>>
>> it should be clean.  I'm thinking the module somehow communicates to 
>> the core that all its URLs are virtual, i.e., they don't map to the 
>> filesystem no way no how not ever.  Then we still need the
>>
>> <Location /server-status >
>>    SetHandler status_handler    # or whatever it is
>> </Location>
>>
>> config/processing to take care of the pattern matching of the URLs, 
>> just like today.  If both of those things line up, then we can skip 
>> the directory walk.
> 
> 
> Just to be clear: is the idea that you determine the handler (without a 
> directory walk), then determine whether to do the directory walk from that?

right.  The first part already happens.

> If so, aren't there cases where the handler gets determined during the 
> directory walk, by .htaccess?

certainly.  But does it make sense for a given URL to have a handler set by 
location walk whose content is known to be virtual, and then have yet another 
handler set during directory walk/.htaccess processing?

With the config above, we already determine the handler today during the 
location walk.  The change I'd like to see is that if this happens and also the 
module declares that its content is virtual, we skip the directory walk/.htaccess.

This means you couldn't have <Directory /server-status> or its .htaccess 
equivalent do anything if we have the config above, assuming such a directory 
actually exists and we declared that the status handler generates virtual 
content.  IMO that's goodness because it is confusing as hell to support 
overlapping Location and Directory blocks*.  For me, that falls into the 
"unpredictable results" category even though a few of us know what it will do today.

Greg

* with the possible exception of <Location /> which is documented as being global.
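
The decision described above (a handler set during the location walk,
plus a module declaration that the handler's content is virtual) can be
sketched as follows.  This is a model only: no such declaration
mechanism exists in the patches posted so far, and struct handler_decl
and skip_directory_walk are hypothetical names.  Treating an undeclared
handler as filesystem-backed is an assumption made here to keep the
sketch conservative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: a module declares once, e.g. at config
 * time, whether a handler name produces virtual content.
 */
struct handler_decl {
    const char *name;
    bool is_virtual;
};

/* location_handler is the handler chosen by the location walk, or NULL
 * if the location walk set none.
 */
static bool skip_directory_walk(const char *location_handler,
                                const struct handler_decl *decls, size_t n)
{
    size_t i;

    assert(decls != NULL || n == 0);

    if (location_handler == NULL)
        return false;   /* nothing set yet: do the normal walk */

    for (i = 0; i < n; i++) {
        if (strcmp(decls[i].name, location_handler) == 0)
            return decls[i].is_virtual;
    }

    return false;       /* undeclared handler: assume filesystem-backed */
}
```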


Re: [PATCH] configurable Location block speed up

Posted by Ben Laurie <be...@algroup.co.uk>.
gregames@apache.org wrote:

> Ben Laurie wrote:
> 
>> gregames@apache.org wrote:
>>
>>> gregames@apache.org wrote:
>>>
>>>> or Joshua's "virtual" keyword on <Location >, which I like better 
>>>> the more I think about it.
>>>
>>>
>>>
>>>
>>> ooops... s/Joshua/André/
>>>
>>> but Joshua has excellent points about "virtualness" being a property 
>>> of the handler.  Yes, the server-status handler should know that it 
>>> is virtual, but the handler hook is too late to skip the directory 
>>> walk.  But the mod_status maintainers (i.e. us) know that mod_status 
>>> is virtual, so maybe it could tell the core earlier somehow.  An 
>>> earlier hook?  preferably something at config time since the 
>>> virtualness doesn't ever change?
>>
>>
>>
>> Isn't this going to be horribly messy? The current system says "oi, 
>> you - here's a URL, do you handle it?" whereas you'd need to have "oi, 
>> you, here's a pattern that might match a URL you handle, does it?".
> 
> it should be clean.  I'm thinking the module somehow communicates to the 
> core that all its URLs are virtual, i.e., they don't map to the 
> filesystem no way no how not ever.  Then we still need the
> 
> <Location /server-status >
>    SetHandler status_handler    # or whatever it is
> </Location>
> 
> config/processing to take care of the pattern matching of the URLs, just 
> like today.  If both of those things line up, then we can skip the 
> directory walk.

Just to be clear: is the idea that you determine the handler (without a 
directory walk), then determine whether to do the directory walk from that?

If so, aren't there cases where the handler gets determined during the 
directory walk, by .htaccess?

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

Re: [PATCH] configurable Location block speed up

Posted by gr...@apache.org.
Ben Laurie wrote:
> gregames@apache.org wrote:
> 
>> gregames@apache.org wrote:
>>
>>> or Joshua's "virtual" keyword on <Location >, which I like better the 
>>> more I think about it.
>>
>>
>>
>> ooops... s/Joshua/André/
>>
>> but Joshua has excellent points about "virtualness" being a property 
>> of the handler.  Yes, the server-status handler should know that it is 
>> virtual, but the handler hook is too late to skip the directory walk.  
>> But the mod_status maintainers (i.e. us) know that mod_status is 
>> virtual, so maybe it could tell the core earlier somehow.  An earlier 
>> hook?  preferably something at config time since the virtualness 
>> doesn't ever change?
> 
> 
> Isn't this going to be horribly messy? The current system says "oi, you 
> - here's a URL, do you handle it?" whereas you'd need to have "oi, you, 
> here's a pattern that might match a URL you handle, does it?".

it should be clean.  I'm thinking the module somehow communicates to the core 
that all its URLs are virtual, i.e., they don't map to the filesystem no way no 
how not ever.  Then we still need the

<Location /server-status >
    SetHandler status_handler    # or whatever it is
</Location>

config/processing to take care of the pattern matching of the URLs, just like 
today.  If both of those things line up, then we can skip the directory walk.

btw, neither of the patches I posted do the communicating the virtualness to the 
core bit.  I just realized what Joshua was saying this morning.  The first patch 
assumes virtualness when it sees the config above, the second patch adds it as 
new config.  Having it communicated by the module is a much better idea.

Greg


Re: [PATCH] configurable Location block speed up

Posted by Ben Laurie <be...@algroup.co.uk>.
gregames@apache.org wrote:

> gregames@apache.org wrote:
> 
>> or Joshua's "virtual" keyword on <Location >, which I like better the 
>> more I think about it.
> 
> 
> ooops... s/Joshua/André/
> 
> but Joshua has excellent points about "virtualness" being a property of 
> the handler.  Yes, the server-status handler should know that it is 
> virtual, but the handler hook is too late to skip the directory walk.  
> But the mod_status maintainers (i.e. us) know that mod_status is 
> virtual, so maybe it could tell the core earlier somehow.  An earlier 
> hook?  preferably something at config time since the virtualness doesn't 
> ever change?

Isn't this going to be horribly messy? The current system says "oi, you 
- here's a URL, do you handle it?" whereas you'd need to have "oi, you, 
here's a pattern that might match a URL you handle, does it?".

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

Re: [PATCH] configurable Location block speed up

Posted by gr...@apache.org.
gregames@apache.org wrote:

> or Joshua's "virtual" keyword on <Location >, which I like better the 
> more I think about it.

ooops... s/Joshua/André/

but Joshua has excellent points about "virtualness" being a property of the 
handler.  Yes, the server-status handler should know that it is virtual, but the 
handler hook is too late to skip the directory walk.  But the mod_status 
maintainers (i.e. us) know that mod_status is virtual, so maybe it could tell 
the core earlier somehow.  An earlier hook?  preferably something at config time 
since the virtualness doesn't ever change?

Greg


Re: [PATCH] configurable Location block speed up

Posted by gr...@apache.org.
moving to dev@ ... this deserves wider attention IMO

Geoffrey Young wrote:

> one thing I was wondering about was the interaction between your patch and
> mod_alias.  just by following the code without testing, it seems like either
> mod_alias needs to unset the outside_filesystem flag for things like Alias
> and ScriptAlias, 

something like that might work

> or it needs to blow up over the conflict between those directives and your new directive.

or Joshua's "virtual" keyword on <Location >, which I like better the more I 
think about it.

But yeah, good point.  Unfortunately it adds complexity.  If the conflict could 
be detected at config time rather than run time, the added complexity doesn't 
bother me much at all.
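
As a rough illustration of what config-time detection could look like, the 
sketch below scans a list of Alias-style mappings against the <Location> 
blocks flagged as outside the filesystem and reports the first overlap.  The 
structs and the functions prefix_covers() and find_alias_conflict() are 
hypothetical and do not reflect how mod_alias or the core actually store their 
configuration; the overlap test is a simple prefix comparison on path-segment 
boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified config records (not httpd's real structures). */
struct alias_entry {
    const char *uri_prefix;   /* e.g. "/cgi-bin" */
    const char *fs_path;      /* e.g. "/usr/local/apache/cgi-bin" */
};

struct location_entry {
    const char *uri_prefix;   /* e.g. "/cgi-bin" */
    int outside_filesystem;   /* nonzero if flagged as virtual */
};

/* True when uri falls under prefix on a path-segment boundary. */
int prefix_covers(const char *prefix, const char *uri)
{
    size_t len;

    assert(prefix != NULL && uri != NULL);
    len = strlen(prefix);
    if (strncmp(prefix, uri, len) != 0)
        return 0;
    return uri[len] == '\0' || uri[len] == '/'
           || (len > 0 && prefix[len - 1] == '/');
}

/* Returns the index of the first outside-filesystem location whose URI
 * space overlaps an alias in either direction, or -1 if none does. */
int find_alias_conflict(const struct alias_entry *aliases, size_t n_aliases,
                        const struct location_entry *locs, size_t n_locs)
{
    size_t i, j;

    for (i = 0; i < n_locs; i++) {
        if (!locs[i].outside_filesystem)
            continue;   /* ordinary <Location>: nothing to check */
        for (j = 0; j < n_aliases; j++) {
            if (prefix_covers(aliases[j].uri_prefix, locs[i].uri_prefix)
                || prefix_covers(locs[i].uri_prefix, aliases[j].uri_prefix))
                return (int)i;
        }
    }
    return -1;
}
```

A startup check along these lines could either refuse the configuration or 
clear the flag for the offending block; which of the two is preferable is the 
open question in this thread.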

> that is, unless I'm reading the code wrong, or common setups like this
> 
> Alias /cgi-bin /usr/local/apache/cgi-bin
> <Location /cgi-bin>
>   SetHandler cgi-script
> ...
> 
> 
> are also ones that you believe go against the existing documentation.
> 
> anyway, not a criticism (especially if I'm wrong), 

I think you're right.  Constructive criticism is always welcome :)

> just an observation.

> --Geoff

No problem... excellent feedback.

Greg