Posted to dev@httpd.apache.org by "Ralf S. Engelschall" <rs...@engelschall.com> on 1997/01/09 15:39:44 UTC

Re: URI-lookahead patch w/ docs (for 2.4?)

Below is a non-trivial patch for mod_rewrite by Ian Kluft of Cisco Systems
which adds new functionality. While this will not be included in any
mod_rewrite 2.3.x version (those are for Apache 1.2), it could go into 2.4.0
in conjunction with the forthcoming custom redirect codes (mod_alias has them
already).

Because I don't want to include stuff which conflicts with the Apache Group's
opinion on inclusion (mod_rewrite 2.4.0 should again be part of the core), I
would like your review of Ian's code diff.

Any comments?
                                        Ralf S. Engelschall
                                        rse@engelschall.com
                                        http://www.engelschall.com/

> The following patch adds a feature that we wanted here at Cisco Systems and
> updates the mod_rewrite docs for it.  Since this adds a new feature, I guess
> it's possible this could bump mod_rewrite up to 2.4.
> 
> Here's what it does... It adds two tests to RewriteCond:
>   -U to check if the path is a valid URI and accessible via all the server's
>      currently-configured access controls for that path
>   -F to check if the path is a valid file and accessible via all the server's
>      currently-configured access controls for that path
> 
> It also adds two variable lookups to RewriteCond:
>    SUB:xxx performs a sub-request to look ahead, get xxx from the sub-request
>    SUBREQ returns "true" in a subrequest or "false" in the main request
> 
> Since three of these four features perform subrequests, the docs warn users
> to set conditions so they'll only run when the extra work is needed.  The
> docs also provide an example.
> 
> I presume you merge the *.html files into mod_rewrite.html yourself.  I only
> modified the docs in mod_rewrite_about.html and mod_rewrite_config.html.
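
For reviewers who want the decision logic without reading the whole diff, here
is a minimal sketch of the recursion guard and the status thresholds that the
-U and -F tests in the patch below apply. It is an illustration only, not the
patch code: struct mini_request and the three function names are hypothetical
stand-ins (the real code works on request_rec), and the NULL check on the test
string is an addition the patch does not make.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two request_rec fields the guard reads. */
struct mini_request {
    const struct mini_request *main; /* NULL in the main request */
    const char *uri;
};

/*
 * Guard modeled on the patch's -U test: run the look-ahead only for a
 * nonempty test string, and only when we are in the main request or in
 * a subrequest whose URI differs from the main request's URI.
 * (The NULL check on input is an assumption added for this sketch.)
 */
static int lookahead_allowed(const struct mini_request *r, const char *input)
{
    if (input == NULL || input[0] == '\0')
        return 0;
    if (r->main == NULL)
        return 1;
    return r->main->uri != NULL && r->uri != NULL
        && strcmp(r->main->uri, r->uri) != 0;
}

/* -U: any status below 400 counts as accessible, so redirects pass. */
static int uri_status_ok(int status)
{
    return status < 400;
}

/* -F: any status below 300 counts as accessible, so redirects fail. */
static int file_status_ok(int status)
{
    return status < 300;
}
```

With these definitions a 302 result satisfies uri_status_ok() but not
file_status_ok(), which is the behavioral difference between -U and -F that
the doc changes describe. The -F guard in the patch has the same shape but
compares filenames instead of URIs.
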
> -- 
> Ian Kluft  KO6YQ PP-ASEL                                  Cisco Systems, Inc.
> ikluft@cisco.com (work)  ikluft@thunder.sbay.org (home)          San Jose, CA
> 
> ------------------------------------------------------------------------------
> *** ORIGmod_rewrite.c Fri Jan  3 18:25:09 1997
> --- mod_rewrite.c Thu Jan  9 00:43:11 1997
> ***************
> *** 1644,1649 ****
> --- 1644,1700 ----
>               if (S_ISDIR(sb.st_mode))
>                   rc = 1;
>       }
> +     else if (strcmp(p->pattern, "-U") == 0) {
> +     /*
> +      * Do a subrequest to check if a URI exists and is accessible.
> +      * (Contributed to Apache by Cisco Systems, Inc.
> +      * Patch by Ian Kluft <ik...@cisco.com>)
> +      */
> + 
> +     /* avoid infinite subrequest recursion */
> +     if ( strlen(input)>0 &&         /* nonempty path, and */
> +         ( !r->main ||           /* not in a subrequest, or */
> +         ( r->main->uri && r->uri &&     /* URIs aren't NULL and */
> +         strcmp(r->main->uri,r->uri)!=0)))   /* sub and main URIs differ */
> +     {
> +         /* process a URI-based subrequest */
> +         request_rec *sub_req = sub_req_lookup_uri(input,r);
> + 
> +         /* URI exists for any result up to 3xx, redirects allowed */
> +         if ( sub_req->status < 400 ) {
> +             rc = 1;
> +         }
> +         destroy_sub_req(sub_req);
> +     }
> +     }
> +     else if (strcmp(p->pattern, "-F") == 0) {
> +     /*
> +      * Do a subrequest to check if a file exists and is accessible.
> +      * This differs from -U in that no path translation is done.
> +      * (Contributed to Apache by Cisco Systems, Inc.
> +      * Patch by Ian Kluft <ik...@cisco.com>)
> +      */
> + 
> +     /* avoid infinite subrequest recursion */
> +     if ( /* nonempty path, and */
> +         strlen(input)>0 &&
> +         /* not in a subrequest, or */
> +         ( !r->main ||
> +         /* files aren't NULL and */
> +         ( r->main->filename && r->filename &&
> +         /* sub and main files differ */
> +         strcmp(r->main->filename,r->filename)!=0)))
> +     {
> +         /* process a file-based subrequest */
> +         request_rec *sub_req = sub_req_lookup_file(input,r);
> + 
> +         /* file exists for any result up to 2xx, no redirects */
> +         if ( sub_req->status < 300 ) {
> +             rc = 1;
> +         }
> +         destroy_sub_req(sub_req);
> +     }
> +     }
>       else {
>           /* it is really a regexp pattern, so apply it */
>   #ifdef HAS_APACHE_REGEX_LIB
> ***************
> *** 2494,2499 ****
> --- 2545,2582 ----
>       /* all other env-variables from the parent Apache process */
>       else if (strlen(var) > 4 && strncasecmp(var, "ENV:", 4) == 0) {
>           result = getenv(var+4);
> +     }
> + 
> +     /*
> +      * determine whether we're in a sub-request or not
> +      * (Contributed to Apache by Cisco Systems, Inc.
> +      * Patch by Ian Kluft <ik...@cisco.com>)
> +      */
> +     else if (strcasecmp(var, "SUBREQ") == 0) {
> +     result = r->main ? "true" : "false";
> +     }
> + 
> +     /*
> +      * sub-request to look-ahead for script parameters
> +      * (Contributed to Apache by Cisco Systems, Inc.
> +      * Patch by Ian Kluft <ik...@cisco.com>)
> +      */
> +     else if (strlen(var) > 4 && strncasecmp(var, "SUB:", 4) == 0) {
> +     /* avoid infinite subrequest recursion */
> +     if ( !r->main ||            /* not in a subrequest, or */
> +         ( r->main->uri && r->uri &&     /* URIs aren't NULL and */
> +         strcmp(r->main->uri,r->uri)!=0))    /* sub and main URIs differ */
> +     {
> +         /* process a URI-based subrequest */
> +         request_rec *sub_req = sub_req_lookup_uri(r->uri,r);
> + 
> +         /* copy it up to our scope before we destroy the sub_req's pool */
> +         result = pstrdup(r->pool,lookup_variable(sub_req,var+4));
> +         destroy_sub_req(sub_req);
> + 
> +         /* may as well return here rather than re-pstrdup it */
> +         return result;
> +     }
>       }
>   
>       /* uptime, load average, etc. .. */
> *** doc/ORIGmod_rewrite_about.html    Thu Jan  9 00:23:21 1997
> --- doc/mod_rewrite_about.html    Thu Jan  9 00:46:55 1997
> ***************
> *** 188,193 ****
> --- 188,201 ----
>       variables named <tt>SCRIPT_URL</tt> and <tt>SCRIPT_URI</tt> are provided
>       which contain the original (i.e. previous to any rewritings!)
>       <i>Web-view</i> to the current resource.
> + <p>
> + <li>The <b>RewriteCond</b> directive now allows two ways to harness Apache's
> +     sub-request mechanism to do "what-if" testing on potential files or
> +     URIs, and to use all of the server's access control checks on them.
> +     It can also test CGI script paths or their parameters, which would not
> +     be available without a sub-request to look ahead to Apache steps where
> +     they're derived.
> +     (This was contributed to Apache by Cisco Systems, Inc.)
>   </ul>
>   <p>
>   
> *** doc/ORIGmod_rewrite_config.html   Wed Jan  8 23:29:17 1997
> --- doc/mod_rewrite_config.html   Thu Jan  9 00:21:51 1997
> ***************
> *** 468,473 ****
> --- 468,474 ----
>   THE_REQUEST<br>
>   REQUEST_URI<br>
>   REQUEST_FILENAME<br>
> + SUBREQ<br>
>   </font>
>   </td>
>   </tr>
> ***************
> *** 502,507 ****
> --- 503,543 ----
>   <i>header</i> can be any HTTP MIME-header name. This is looked-up
>   from the HTTP request. Example: <tt>%{HTTP:Proxy-Connection}</tt>
>   is the value of the HTTP header ``<tt>Proxy-Connection:</tt>''.
> + <p>
> + <li>A special-case format: <tt>%{SUB:variable}</tt> where <i>variable</i>
> + can be any variable name (including other formats listed above).
> + This can be used to "look ahead" for
> + variables like SCRIPT_FILENAME, PATH_INFO, QUERY_STRING
> + or any other variables which have not yet been determined at the
> + name-translation stage where <b>mod_rewrite</b> does its work.
> + This performs an Apache "sub-request" to complete what the request would
> + look like for future stages where the script name and remaining path have
> + been recognized.
> + <b>Note:</b> A sub-request is time-consuming.  This should only be done
> + following another condition which limits the cases in which it can run,
> + usually just to paths where the extra work is needed.
> + <p>The following example will prevent access to a specialized CGI
> + directory if the PATH_INFO variable (remainder of the path after the script)
> + is a file which does not exist or is not accessible due to access controls.
> + This example is used to prevent file-conversion programs from being
> + used to serve files the user would not normally be allowed to access.
> + <blockquote><pre>
> + # First, make sure we only do the expensive subrequests for these paths
> + RewriteCond %{REQUEST_URI} ^/conv-cgi-bin/
> + # We'll only restrict main requests but subrequests can see what the file is
> + RewriteCond %{SUBREQ} false
> + # But the user must be able to access the file specified in PATH_INFO
> + RewriteCond %{SUB:PATH_INFO} !-U
> + # weed out the ones that don't meet the conditions for conv-cgi-bin
> + RewriteRule ^/conv-cgi-bin/.* /cgi-bin/http-err/403?%{REQUEST_URI} [PT,L]
> + </pre></blockquote>
> + <p>
> + <li>The special variable <tt>%{SUBREQ}</tt> returns the string "true"
> + if the test is being performed within a sub-request
> + (usually to test something),
> + or "false" if it's in the main request for which the server will
> + actually return a response.
> + For an example, see the <tt>%{SUB:xxx}</tt> variable pattern above.
>   </ol>
>   
>   
> ***************
> *** 540,545 ****
> --- 576,613 ----
>   <li>'<b>-l</b>' (is symbolic <b>l</b>ink)<br>
>   Treats the <i>TestString</i> as a pathname and
>   tests if it exists and is a symbolic link.
> + <p>
> + <li>'<b>-U</b>' (is a URI that is accessible)<br>
> + Treats the <i>TestString</i> as a URI as if it were requested from the
> + server on its own, but only to test if it exists and is accessible.
> + All applicable access controls are checked because it evaluates the
> + URI in a sub-request.
> + A URI is considered accessible if any HTTP result in the 1xx (informational),
> + 2xx (success), or 3xx (redirect) ranges would have been produced for that
> + request.
> + <p>
> + <b>Note:</b> Sub-requests are not free.  They take time so this directive
> + should only be used following at least one other <tt>RewriteCond</tt>
> + that limits the paths it will apply to, so that it's only used when the
> + extra work is needed.
> + For an example, see the <tt>%{SUB:xxx}</tt> variable pattern above.
> + <p>
> + <li>'<b>-F</b>' (is a file that is accessible via the server)<br>
> + Treats the <i>TestString</i> as a full file path as if it were requested
> + from the server on its own, but only to test if it exists and is accessible.
> + All applicable access controls are checked because it evaluates the
> + file path in a sub-request.
> + A file is considered accessible if any HTTP result in the 1xx (informational)
> + or 2xx (success) ranges would have been produced for that request.
> + <p>
> + The main difference between <b>-U</b> and <b>-F</b> is that <b>-F</b> does
> + not perform a URI-to-filename translation (which could use <b>mod_rewrite</b>
> + among others in the sub-request).
> + But both perform the server's access checks on the file.
> + <b>-U</b> succeeds for requests that would have resulted in redirection
> + but <b>-F</b> considers redirects to mean a non-existent file (failure).
> + <p>
> + <b>Note:</b> Sub-requests are not free.  See the warning for <b>-U</b>.
>   </ul>
>   <p>
>   Notice: All of these tests can also be prefixed by a not ('!') character
>