Posted to dev@httpd.apache.org by "Ralf S. Engelschall" <rs...@engelschall.com> on 1997/01/09 15:39:44 UTC
Re: URI-lookahead patch w/ docs (for 2.4?)
Below is a non-trivial patch for mod_rewrite by Ian Kluft of Cisco Systems
which adds new functionality. While it will not be included in any
mod_rewrite 2.3.x version (those are for Apache 1.2), it could go into 2.4.0
in conjunction with the forthcoming custom redirect codes (mod_alias has them
already).
Because I don't want to include anything that conflicts with the Apache Group's
opinion on inclusion (mod_rewrite 2.4.0 should again be part of the core), I
would like your review of Ian's code diff.
Any comments?
Ralf S. Engelschall
rse@engelschall.com
http://www.engelschall.com/
> The following patch adds a feature that we wanted here at Cisco Systems and
> updates the mod_rewrite docs for it. Since this adds a new feature, I guess
> it's possible this could bump mod_rewrite up to 2.4.
>
> Here's what it does... It adds two tests to RewriteCond:
> -U to check if the path is a valid URI and accessible via all the server's
> currently-configured access controls for that path
> -F to check if the path is a valid file and accessible via all the server's
> currently-configured access controls for that path
>
> It also adds two variable lookups to RewriteCond:
> SUB:xxx performs a sub-request to look ahead, get xxx from the sub-request
> SUBREQ returns "true" in a subrequest or "false" in the main request
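
To make the difference between the two tests concrete, here is a minimal
sketch of the pass/fail thresholds as I read them from the patch and docs
below: -U accepts any status below 400, while -F treats redirects as failure
and accepts only statuses below 300. The function names are hypothetical and
exist only for illustration; they are not part of mod_rewrite.

```c
/* Hypothetical helper: mirrors the -U check, where any status below 400
 * (1xx, 2xx, or 3xx) counts as "exists and is accessible". */
int uri_test_passes(int status)
{
    return status < 400;
}

/* Hypothetical helper: mirrors the -F check, where a redirect (3xx) is
 * treated as failure, so only statuses below 300 pass. */
int file_test_passes(int status)
{
    return status < 300;
}
```

With these, a 302 passes the -U style check but fails the -F style check,
and a 403 fails both.
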
>
> Since three of these four features perform subrequests, the docs warn users
> to set conditions so they'll only run when the extra work is needed. It
> provides an example.
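
The idea behind that warning can be sketched in C: put the cheap test first
so that the costly lookup only runs for paths that need it. This is only an
illustration under my own assumptions. expensive_lookup() is a hypothetical
stand-in for a sub-request, not an Apache call, and the /conv-cgi-bin/ prefix
is borrowed from the docs example in the patch.

```c
#include <string.h>

/* Counts how many simulated sub-requests were issued. */
int subrequest_count = 0;

/* Hypothetical stand-in for a costly sub-request lookup; it only
 * records that it was called and reports the URI as accessible. */
int expensive_lookup(const char *uri)
{
    (void)uri;
    subrequest_count++;
    return 1;
}

/* Returns 1 when the URI matches the prefix and passes the lookup.
 * The cheap prefix test runs first, so && short-circuits and the
 * costly lookup is skipped for unrelated paths. */
int guarded_check(const char *uri, const char *prefix)
{
    return strncmp(uri, prefix, strlen(prefix)) == 0
        && expensive_lookup(uri);
}
```

Calling guarded_check("/index.html", "/conv-cgi-bin/") leaves
subrequest_count untouched, which is the effect the chained RewriteCond
lines in the docs example aim for.
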
>
> I presume you merge the *.html files into mod_rewrite.html yourself. I only
> modified the docs in mod_rewrite_about.html and mod_rewrite_config.html.
> --
> Ian Kluft KO6YQ PP-ASEL Cisco Systems, Inc.
> ikluft@cisco.com (work) ikluft@thunder.sbay.org (home) San Jose, CA
>
> ------------------------------------------------------------------------------
> *** ORIGmod_rewrite.c Fri Jan 3 18:25:09 1997
> --- mod_rewrite.c Thu Jan 9 00:43:11 1997
> ***************
> *** 1644,1649 ****
> --- 1644,1700 ----
> if (S_ISDIR(sb.st_mode))
> rc = 1;
> }
> + else if (strcmp(p->pattern, "-U") == 0) {
> + /*
> + * Do a subrequest to check if a URI exists and is accessible.
> + * (Contributed to Apache by Cisco Systems, Inc.
> + * Patch by Ian Kluft <ik...@cisco.com>)
> + */
> +
> + /* avoid infinite subrequest recursion */
> + if ( strlen(input)>0 && /* nonempty path, and */
> + ( !r->main || /* not in a subrequest, or */
> + ( r->main->uri && r->uri && /* URIs aren't NULL and */
> + strcmp(r->main->uri,r->uri)!=0))) /* sub and main URIs differ */
> + {
> + /* process a URI-based subrequest */
> + request_rec *sub_req = sub_req_lookup_uri(input,r);
> +
> + /* URI exists for any result up to 3xx, redirects allowed */
> + if ( sub_req->status < 400 ) {
> + rc = 1;
> + }
> + destroy_sub_req(sub_req);
> + }
> + }
> + else if (strcmp(p->pattern, "-F") == 0) {
> + /*
> + * Do a subrequest to check if a file exists and is accessible.
> + * This differs from -U in that no path translation is done.
> + * (Contributed to Apache by Cisco Systems, Inc.
> + * Patch by Ian Kluft <ik...@cisco.com>)
> + */
> +
> + /* avoid infinite subrequest recursion */
> + if ( /* nonempty path, and */
> + strlen(input)>0 &&
> + /* not in a subrequest, or */
> + ( !r->main ||
> + /* files aren't NULL and */
> + ( r->main->filename && r->filename &&
> + /* sub and main files differ */
> + strcmp(r->main->filename,r->filename)!=0)))
> + {
> + /* process a file-based subrequest */
> + request_rec *sub_req = sub_req_lookup_file(input,r);
> +
> + /* file exists for any result up to 2xx, no redirects */
> + if ( sub_req->status < 300 ) {
> + rc = 1;
> + }
> + destroy_sub_req(sub_req);
> + }
> + }
> else {
> /* it is really a regexp pattern, so apply it */
> #ifdef HAS_APACHE_REGEX_LIB
> ***************
> *** 2494,2499 ****
> --- 2545,2582 ----
> /* all other env-variables from the parent Apache process */
> else if (strlen(var) > 4 && strncasecmp(var, "ENV:", 4) == 0) {
> result = getenv(var+4);
> + }
> +
> + /*
> + * determine whether we're in a sub-request or not
> + * (Contributed to Apache by Cisco Systems, Inc.
> + * Patch by Ian Kluft <ik...@cisco.com>)
> + */
> + else if (strcasecmp(var, "SUBREQ") == 0) {
> + result = r->main ? "true" : "false";
> + }
> +
> + /*
> + * sub-request to look-ahead for script parameters
> + * (Contributed to Apache by Cisco Systems, Inc.
> + * Patch by Ian Kluft <ik...@cisco.com>)
> + */
> + else if (strlen(var) > 4 && strncasecmp(var, "SUB:", 4) == 0) {
> + /* avoid infinite subrequest recursion */
> + if ( !r->main || /* not in a subrequest, or */
> + ( r->main->uri && r->uri && /* URIs aren't NULL and */
> + strcmp(r->main->uri,r->uri)!=0)) /* sub and main URIs differ */
> + {
> + /* process a URI-based subrequest */
> + request_rec *sub_req = sub_req_lookup_uri(r->uri,r);
> +
> + /* copy it up to our scope before we destroy the sub_req's pool */
> + result = pstrdup(r->pool,lookup_variable(sub_req,var+4));
> + destroy_sub_req(sub_req);
> +
> + /* may as well return here rather than re-pstrdup it */
> + return result;
> + }
> }
>
> /* uptime, load average, etc. .. */
> *** doc/ORIGmod_rewrite_about.html Thu Jan 9 00:23:21 1997
> --- doc/mod_rewrite_about.html Thu Jan 9 00:46:55 1997
> ***************
> *** 188,193 ****
> --- 188,201 ----
> variables named <tt>SCRIPT_URL</tt> and <tt>SCRIPT_URI</tt> are provided
> which contain the original (i.e. previous to any rewritings!)
> <i>Web-view</i> to the current resource.
> + <p>
> + <li>The <b>RewriteCond</b> directive now allows two ways to harness Apache's
> + sub-request mechanism to do "what-if" testing on potential files or
> + URIs, and to use all of the server's access control checks on them.
> + It can also test CGI script paths or their parameters, which would not
> + be available without a sub-request to look ahead to Apache steps where
> + they're derived.
> + (This was contributed to Apache by Cisco Systems, Inc.)
> </ul>
> <p>
>
> *** doc/ORIGmod_rewrite_config.html Wed Jan 8 23:29:17 1997
> --- doc/mod_rewrite_config.html Thu Jan 9 00:21:51 1997
> ***************
> *** 468,473 ****
> --- 468,474 ----
> THE_REQUEST<br>
> REQUEST_URI<br>
> REQUEST_FILENAME<br>
> + SUBREQ<br>
> </font>
> </td>
> </tr>
> ***************
> *** 502,507 ****
> --- 503,543 ----
> <i>header</i> can be any HTTP MIME-header name. This is looked-up
> from the HTTP request. Example: <tt>%{HTTP:Proxy-Connection}</tt>
> is the value of the HTTP header ``<tt>Proxy-Connection:</tt>''.
> + <p>
> + <li>A special-case format: <tt>%{SUB:variable}</tt> where <i>variable</i>
> + can be any variable name (including other formats listed above).
> + This can be used to "look ahead" for
> + variables like SCRIPT_FILENAME, PATH_INFO, QUERY_STRING
> + or any other variables which have not yet been determined at the
> + name-translation stage where <b>mod_rewrite</b> does its work.
> + This performs an Apache "sub-request" to complete what the request would
> + look like for future stages where the script name and remaining path have
> + been recognized.
> + <b>Note:</b> A sub-request is time-consuming. This should only be done
> + following another condition which limits the cases in which it can run,
> + usually just to paths where the extra work is needed.
> + <p>The following example will prevent access to a specialized CGI
> + directory if the PATH_INFO variable (remainder of the path after the script)
> + is a file which does not exist or is not accessible due to access controls.
> + This example is used to prevent file-conversion programs from being
> + used to serve files the user would not normally be allowed to access.
> + <blockquote><pre>
> + # First, make sure we only do the expensive subrequests for these paths
> + RewriteCond %{REQUEST_URI} ^/conv-cgi-bin/
> + # We'll only restrict main requests but subrequests can see what the file is
> + RewriteCond %{SUBREQ} false
> + # But the user must be able to access the file specified in PATH_INFO
> + RewriteCond %{SUB:PATH_INFO} !-U
> + # weed out the ones that don't meet the conditions for conv-cgi-bin
> + RewriteRule ^/conv-cgi-bin/.* /cgi-bin/http-err/403?%{REQUEST_URI} [PT,L]
> + </pre></blockquote>
> + <p>
> + <li>The special variable <tt>%{SUBREQ}</tt> returns the string "true"
> + if the test is being performed within a sub-request
> + (usually to test something),
> + or "false" if it's in the main request for which the server will
> + actually return a response.
> + For an example, see the <tt>%{SUB:xxx}</tt> variable pattern above.
> </ol>
>
>
> ***************
> *** 540,545 ****
> --- 576,613 ----
> <li>'<b>-l</b>' (is symbolic <b>l</b>ink)<br>
> Treats the <i>TestString</i> as a pathname and
> tests if it exists and is a symbolic link.
> + <p>
> + <li>'<b>-U</b>' (is a URI that is accessible)<br>
> + Treats the <i>TestString</i> as a URI as if it were requested from the
> + server on its own, but only to test if it exists and is accessible.
> + All applicable access controls are checked because it evaluates the
> + URI in a sub-request.
> + A URI is considered accessible if any HTTP result in the 1xx (informational),
> + 2xx (success), or 3xx (redirect) ranges would have been produced for that
> + request.
> + <p>
> + <b>Note:</b> Sub-requests are not free. They take time so this directive
> + should only be used following at least one other <tt>RewriteCond</tt>
> + that limits the paths it will apply to, so that it's only used when the
> + extra work is needed.
> + For an example, see the <tt>%{SUB:xxx}</tt> variable pattern above.
> + <p>
> + <li>'<b>-F</b>' (is a file that is accessible via the server)<br>
> + Treats the <i>TestString</i> as a full file path as if it were requested
> + from the server on its own, but only to test if it exists and is accessible.
> + All applicable access controls are checked because it evaluates the
> + file path in a sub-request.
> + A file is considered accessible if any HTTP result in the 1xx (informational)
> + or 2xx (success) ranges would have been produced for that request.
> + <p>
> + The main difference between <b>-U</b> and <b>-F</b> is that <b>-F</b> does
> + not perform a URI-to-filename translation (which could use <b>mod_rewrite</b>
> + among others in the sub-request).
> + But both perform the server's access checks on the file.
> + <b>-U</b> succeeds for requests that would have resulted in redirection
> + but <b>-F</b> considers redirects to mean a non-existent file (failure).
> + <p>
> + <b>Note:</b> Sub-requests are not free. See the warning for <b>-U</b>.
> </ul>
> <p>
> Notice: All of these tests can also be prefixed by a not ('!') character
>
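
For reviewers who want to reason about the recursion guard in isolation, here
is a minimal sketch of the condition used by the -U and SUB: branches of the
patch above. It assumes a cut-down stand-in for request_rec with only the
uri and main fields; struct mock_request and may_issue_subrequest() are
hypothetical names of mine, not Apache API.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for Apache's request_rec; only the two fields the
 * guard reads are modeled here. */
struct mock_request {
    const char *uri;
    struct mock_request *main;   /* NULL in the main request */
};

/* Returns 1 when issuing a sub-request is allowed: either we are in
 * the main request, or both URIs are non-NULL and the sub-request's
 * URI differs from the main request's URI. */
int may_issue_subrequest(const struct mock_request *r)
{
    if (r->main == NULL)
        return 1;
    return r->main->uri != NULL && r->uri != NULL
        && strcmp(r->main->uri, r->uri) != 0;
}
```

The sketch only compares a sub-request against its immediate parent, as the
patch does, so it says nothing about deeper chains of sub-requests.
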