You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by gr...@apache.org on 2004/01/16 19:02:01 UTC
[PATCH] block speed up
I'm interested in doing some scalability testing with worker on Linux to see
what the O(1) scheduler and new pthread library buys us, and what happens with
different values for ThreadsPerChild. I decided to use a simple handler that
just nanosleep()s for a variable amount of time controlled by the client. I ran
oprofile and saw that the kernel's directory lookup function was the top CPU
eater. An strace revealed that we were doing a directory walk, resulting in a
stat() and 7 lstat()s per request.
My config to enable the handler is
<Location /silly>
SetHandler s2-handler
</Location>
...which is the same way we enable mod_status and mod_info. The key thing here
is that the URIs to access a Location enabled handler do not map to the
filesystem, so the directory walk is a waste of cycles. So what can we do about
it? Maybe detect that the current URI matches such a block, and skip the
filesystem mapping if so. That's exactly what this patch does.
Throughput measured by ab increases from 3053 requests/sec to 3984 with this
patch, slightly more than 30% (dual 400MHz PII, both pegged, 256M RAM, RH 7.3).
This should make it easier to see how the OS changes etc affect our scalability.
I really don't grok how the cache works in ap_location_walk so I wouldn't be
shocked if I haven't implemented this optimally. But it seems to do the Right
Thing with the nanosleep handler, mod_status, and static files. I also
confirmed that we do get a full directory walk if I type in a garbage URL.
Comments?
Greg
Re: [PATCH] block speed up
Posted by gr...@apache.org.
Brian Akins wrote:
> How "wrong" would it be to have a map_to_storage that simply filled out
> r->finfo and returned OK (or the appropriate error) always? IE, no
> directory or file walk. From what I understand, Directory and File
> directives would be useless, but Location would still work. Is this
> correct?
When the URI represents a file, the directory walk fills in r->finfo fairly
efficiently - just one stat() typically followed by an open() from the handler.
I'd like to see that change to an open() - fstat() sequence to be even more
efficient, but that's a different topic.
What I'm trying to do here is deal with the case where the URI is known to be
completely outside the filesystem, like server-status for example. r->finfo is
meaningless in that case - there is no file as far as the core code can tell.
Greg
Re: [PATCH] block speed up
Posted by Brian Akins <ba...@web.turner.com>.
How "wrong" would it be to have a map_to_storage that simply filled out
r->finfo and returned OK (or the appropriate error) always? IE, no
directory or file walk. From what I understand, Directory and File
directives would be useless, but Location would still work. Is this
correct?
--
Brian Akins
Senior Systems Engineer
CNN Internet Technologies
Re: [PATCH] block speed up
Posted by gr...@apache.org.
gregames@apache.org wrote:
> Bill Stoddard wrote:
>
>> WHat happens if the handler DECLINES the request
>
>
> My guess is that the default handler will be called, try to open a
> non-existant file and send back a 404. I'll find out.
>
>> and do we care?
>
>
> If it seg faults, violates protocol or something similar, I care. Other
> than that I don't think so.
>
> mod_status and mod_info both are enabled via <Location > containers.
> mod_status never DECLINEs if it is the handler. mod_info DECLINEs if
> the method isn't GET. Let me see what happens if I send mod_info some
> other method.
I sent mod_info a POST and got back a 404 with a "File does not exist" in the
error log, both with and without this patch.
Greg
Re: [PATCH] block speed up
Posted by gr...@apache.org.
Geoffrey Young wrote:
> Bill Stoddard wrote:
>>My not so well formed thoughts are that if a module claims it should
>>handle a request based on a SetHandler directive in a Location
>>directive, the server should not allow that handler to DECLINE the
>>request. Putting it another way, if the handler claims the request then
>>DECLINEs, the server should prevent other handlers from attempting to
>>serve the request.
>
>
> I've brought this up before, but if you shortcut translation, how do you
> handle cases where r->handler is set after translation/map_to_storage.
I'd rather keep it simple and not change how those cases are processed.
> that is, just because at the start of a request <Location /foo> refers to a
> non-static resource, doesn't mean that assumption won't change.
<Location /foo> wasn't intended to be used for static files. See
http://httpd.apache.org/docs-2.0/mod/core.html#location. If somebody wants to
do something that strange, they can write their own map_to_storage function.
> gregames@apache.org wrote:
>
>>A map_to_storage hook needs to know when to allow the directory walk and
>>when not to.
>
>
> I think I see what you mean. so, perhaps the answer to my question above is
> that, with your patch, if modules expect to set r->handler under certain
> conditions, then they ought to install their own map_to_storage handler to
> handle those conditions as well?
yeah, that's my take. If the module is sophisticated enough to know when to set
r->handler without the core's help, I would assume it's sophisticated enough to
have a custom map_to_storage function which knows when it is appropriate to
alter normal processing.
Greg
Re: [PATCH] block speed up
Posted by Geoffrey Young <ge...@modperlcookbook.org>.
Bill Stoddard wrote:
>> mod_status and mod_info both are enabled via <Location > containers.
>> mod_status never DECLINEs if it is the handler. mod_info DECLINEs if
>> the method isn't GET. Let me see what happens if I send mod_info some
>> other method.
>
>
> My not so well formed thoughts are that if a module claims it should
> handle a request based on a SetHandler directive in a Location
> directive, the server should not allow that handler to DECLINE the
> request. Putting it another way, if the handler claims the request then
> DECLINEs, the server should prevent other handlers from attempting to
> serve the request.
I've brought this up before, but if you shortcut translation, how do you
handle cases where r->handler is set after translation/map_to_storage.
that is, just because at the start of a request <Location /foo> refers to a
non-static resource, doesn't mean that assumption won't change. see mod_dav
and mod_include, both of which set r->handler during the fixup phase.
gregames@apache.org wrote:
> A map_to_storage hook needs to know when to allow the directory walk and
> when not to.
I think I see what you mean. so, perhaps the answer to my question above is
that, with your patch, if modules expect to set r->handler under certain
conditions, then they ought to install their own map_to_storage handler to
handle those conditions as well?
--Geoff
Re: [PATCH] block speed up
Posted by gr...@apache.org.
Bill Stoddard wrote:
>>> WHat happens if the handler DECLINES the request
>>
>>
>>
>> My guess is that the default handler will be called, try to open a
>> non-existant file and send back a 404. I'll find out.
>>
> Make sure file 'silly' exists in documentroot and make sure it has
> access protections coded in a Directory </> container :-)
oh my poor head! I'm not too worried about people who do such things.
>> mod_status and mod_info both are enabled via <Location > containers.
>> mod_status never DECLINEs if it is the handler. mod_info DECLINEs if
>> the method isn't GET. Let me see what happens if I send mod_info some
>> other method.
>
>
> My not so well formed thoughts are that if a module claims it should
> handle a request based on a SetHandler directive in a Location
> directive, the server should not allow that handler to DECLINE the
> request. Putting it another way, if the handler claims the request then
> DECLINEs, the server should prevent other handlers from attempting to
> serve the request.
hmmm, that might simplify some potentially weird cases.
Greg
Re: [PATCH] block speed up
Posted by Bill Stoddard <bi...@wstoddard.com>.
gregames@apache.org wrote:
> Bill Stoddard wrote:
>
>> WHat happens if the handler DECLINES the request
>
>
> My guess is that the default handler will be called, try to open a
> non-existant file and send back a 404. I'll find out.
>
Make sure file 'silly' exists in documentroot and make sure it has access protections coded in a Directory </>
container :-)
>> and do we care?
>
>
> If it seg faults, violates protocol or something similar, I care. Other
> than that I don't think so.
>
> mod_status and mod_info both are enabled via <Location > containers.
> mod_status never DECLINEs if it is the handler. mod_info DECLINEs if
> the method isn't GET. Let me see what happens if I send mod_info some
> other method.
My not so well formed thoughts are that if a module claims it should handle a request based on a SetHandler
directive in a Location directive, the server should not allow that handler to DECLINE the request. Putting it
another way, if the handler claims the request then DECLINEs, the server should prevent other handlers from
attempting to serve the request.
Bill
Re: [PATCH] block speed up
Posted by gr...@apache.org.
Bill Stoddard wrote:
> WHat happens if the handler DECLINES the request
My guess is that the default handler will be called, try to open a non-existant
file and send back a 404. I'll find out.
> and do we care?
If it seg faults, violates protocol or something similar, I care. Other than
that I don't think so.
mod_status and mod_info both are enabled via <Location > containers. mod_status
never DECLINEs if it is the handler. mod_info DECLINEs if the method isn't GET.
Let me see what happens if I send mod_info some other method.
Greg
Re: [PATCH] block speed up
Posted by Bill Stoddard <bi...@wstoddard.com>.
WHat happens if the handler DECLINES the request and do we care?
Bill
gregames@apache.org wrote:
> I'm interested in doing some scalability testing with worker on Linux to
> see what the O(1) scheduler and new pthread library buys us, and what
> happens with different values for ThreadsPerChild. I decided to use a
> simple handler that just nanosleep()s for a variable amount of time
> controlled by the client. I ran oprofile and saw that the kernel's
> directory lookup function was the top CPU eater. An strace revealed
> that we were doing a directory walk, resulting in a stat() and 7
> lstat()s per request.
>
> My config to enable the handler is
>
> <Location /silly>
> SetHandler s2-handler
> </Location>
>
> ...which is the same way we enable mod_status and mod_info. The key
> thing here is that the URIs to access a Location enabled handler do not
> map to the filesystem, so the directory walk is a waste of cycles. So
> what can we do about it? Maybe detect that the current URI matches such
> a block, and skip the filesystem mapping if so. That's exactly what
> this patch does.
>
> Throughput measured by ab increases from 3053 requests/sec to 3984 with
> this patch, slightly more than 30% (dual 400MHz PII, both pegged, 256M
> RAM, RH 7.3). This should make it easier to see how the OS changes etc
> affect our scalability.
>
> I really don't grok how the cache works in ap_location_walk so I
> wouldn't be shocked if I haven't implemented this optimally. But it
> seems to do the Right Thing with the nanosleep handler, mod_status, and
> static files. I also confirmed that we do get a full directory walk if
> I type in a garbage URL.
>
> Comments?
> Greg
>
>
> ------------------------------------------------------------------------
>
> Index: include/http_core.h
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/include/http_core.h,v
> retrieving revision 1.77
> diff -u -d -b -r1.77 http_core.h
> --- include/http_core.h 1 Jan 2004 13:26:16 -0000 1.77
> +++ include/http_core.h 14 Jan 2004 22:48:09 -0000
> @@ -367,6 +367,8 @@
> * won't actually be delivered as the response for the non-GET method.
> */
> int deliver_script;
> + /* The URI matches a <Location > block containing a SetHandler */
> + int uri_outside_filesystem;
> } core_request_config;
>
> /* Standard entries that are guaranteed to be accessible via
> Index: server/core.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/core.c,v
> retrieving revision 1.256
> diff -u -d -b -r1.256 core.c
> --- server/core.c 8 Jan 2004 18:56:20 -0000 1.256
> +++ server/core.c 14 Jan 2004 22:48:09 -0000
> @@ -3349,6 +3349,9 @@
> static int core_map_to_storage(request_rec *r)
> {
> int access_status;
> + core_request_config *r_conf = ap_get_module_config(r->request_config,
> + &core_module);
> + if (!r_conf->uri_outside_filesystem) {
>
> if ((access_status = ap_directory_walk(r))) {
> return access_status;
> @@ -3356,6 +3359,7 @@
>
> if ((access_status = ap_file_walk(r))) {
> return access_status;
> + }
> }
>
> return OK;
> Index: server/request.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/request.c,v
> retrieving revision 1.132
> diff -u -d -b -r1.132 request.c
> --- server/request.c 1 Jan 2004 13:26:23 -0000 1.132
> +++ server/request.c 14 Jan 2004 22:48:09 -0000
> @@ -1330,9 +1330,22 @@
> * and note the end result to (potentially) skip this step next time.
> */
> if (now_merged) {
> + core_dir_config *core_dir_conf = ap_get_module_config(now_merged,
> + &core_module);
> + core_request_config *r_conf = ap_get_module_config(r->request_config,
> + &core_module);
> +
> r->per_dir_config = ap_merge_per_dir_configs(r->pool,
> r->per_dir_config,
> now_merged);
> + if (core_dir_conf->handler && strcmp(core_dir_conf->handler,
> + "none")) {
> + /* this location block has a handler. A pretty strong hint
> + * that this URI doesn't live in the filesystem
> + */
> + r_conf->uri_outside_filesystem = 1;
> + }
> +
> }
> cache->per_dir_result = r->per_dir_config;
>
Re: [PATCH] block speed up
Posted by gr...@apache.org.
Geoffrey Young wrote:
>>...which is the same way we enable mod_status and mod_info. The key
>>thing here is that the URIs to access a Location enabled handler do not
>>map to the filesystem, so the directory walk is a waste of cycles. So
>>what can we do about it?
>
>
> isn't that what map_to_storage is for?
A map_to_storage hook needs to know when to allow the directory walk and when
not to. I've seen some sophisicated modules with multiple hooks do this by
parsing URIs against their own config. But since we have several handlers that
are enabled strictly by core config directives in the same way, it seems
appropriate to use the existing core config and URI parsing to tell the core
what to do. Plus I don't think we would want to duplicate code in each module
that uses <Location >
Greg
Re: [PATCH] block speed up
Posted by Geoffrey Young <ge...@modperlcookbook.org>.
> ...which is the same way we enable mod_status and mod_info. The key
> thing here is that the URIs to access a Location enabled handler do not
> map to the filesystem, so the directory walk is a waste of cycles. So
> what can we do about it?
isn't that what map_to_storage is for?
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=106506407907008&w=2
additional issues of interest can be found here
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=106451559226339&w=2
HTH
--Geoff