Posted to dev@httpd.apache.org by Rasmus Lerdorf <ra...@apache.org> on 2002/07/06 16:39:41 UTC

PATH_INFO in A2?

With this config:

LoadModule php4_module        modules/libphp4.so
AddType application/x-httpd-php .php

http://localhost/index.php

works fine and parses PHP pages without problems.  (using prefork - but it
shouldn't matter)

However: http://localhost/index.php/foo

Gives a 404 and says:

The requested URL /index.php/foo was not found on this server.

In Apache1 /index.php would still get passed to PHP and /foo would end up
in the PATH_INFO server var.

-Rasmus
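
To make the split being described concrete, here is a minimal sketch in C of how a request path divides into the part that maps to a file and the trailing PATH_INFO. The helper name path_info_after is hypothetical and is not an httpd API; the real server works this out while walking the filesystem, which the sketch does not attempt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an httpd API: given the request path and the
 * URL prefix known to map to a real file, return the trailing part that
 * would be exposed as PATH_INFO, or NULL if the prefix does not match on
 * a path-segment boundary. */
static const char *path_info_after(const char *request_path,
                                   const char *script_path)
{
    size_t n = strlen(script_path);

    if (strncmp(request_path, script_path, n) != 0) {
        return NULL;            /* request is for some other resource */
    }
    if (request_path[n] != '\0' && request_path[n] != '/') {
        return NULL;            /* /index.phpx is not /index.php + info */
    }
    return request_path + n;    /* empty string when nothing trails */
}
```

Calling path_info_after("/index.php/foo", "/index.php") yields "/foo", which is the value Apache 1.3 handed to PHP as PATH_INFO.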


Re: PATH_INFO in A2?

Posted by Ian Holsman <ia...@apache.org>.
Rasmus Lerdorf wrote:
> With this config:
> 
> LoadModule php4_module        modules/libphp4.so
> AddType application/x-httpd-php .php
> 
> http://localhost/index.php
> 
> works fine and parses PHP pages without problems.  (using prefork - but it
> shouldn't matter)
> 
> However: http://localhost/index.php/foo
> 
> Gives a 404 and says:
> 
> The requested URL /index.php/foo was not found on this server.
> 
> In Apache1 /index.php would still get passed to PHP and /foo would end up
> in the PATH_INFO server var.
> 
> -Rasmus
> 
Does it work with normal CGI or mod_include?

<!--#echo var='PATH_INFO' --> in an SSI file should tell you.





Re: PATH_INFO in A2?

Posted by Sebastian Bergmann <li...@sebastian-bergmann.de>.
Dale Ghent wrote:
> Make sure you have AcceptPathInfo On

  Now it works, thanks.

-- 
  Sebastian Bergmann
  http://sebastian-bergmann.de/                 http://phpOpenTracker.de/

  Did I help you? Consider a gift: http://wishlist.sebastian-bergmann.de/

Re: PATH_INFO in A2?

Posted by Aaron Bannert <aa...@clove.org>.
On Mon, Jul 08, 2002 at 12:12:50PM -0500, William A. Rowe, Jr. wrote:
> >>     /* Deal with the poor soul who is trying to force path_info to be
> >>      * accepted within the core_handler, where they will let the subreq
> >>      * address its contents.  This is toggled by the user in the very
> >>      * beginning of the fixup phase, so modules should override the user's
> >>      * discretion in their own module fixup phase.  It is tristate, if
> >>      * the user doesn't specify, the result is 2 (which the module may
> >>      * interpret to its own customary behavior.)  It won't be touched
> >>      * if the value is no longer undefined (2), so any module changing
> >>      * the value prior to the fixup phase OVERRIDES the user's choice.
> >>      */
> >>     if ((r->used_path_info == AP_REQ_DEFAULT_PATH_INFO)
> >>         && (conf->accept_path_info != 3)) {
> >>         r->used_path_info = conf->accept_path_info;
> >>     }
> >
> >Cool, I was meaning to ask you what the magic numbers meant here, so I
> >could document it in the header and then implement this properly for
> >PHP. Unfortunately I got sidetracked... :) Glad it came up again so
> >we can get it fixed.
> 
> from httpd.h:627;
> 
> /**
>  * @defgroup values_request_rec_used_path_info Possible values for request_rec.used_path_info
>  * @{
>  * Possible values for request_rec.used_path_info:
>  */
> 
> /** Accept the path_info from the request */
> #define AP_REQ_ACCEPT_PATH_INFO    0
> /** Return a 404 error if path_info was given */
> #define AP_REQ_REJECT_PATH_INFO    1
> /** Module may choose to use the given path_info */
> #define AP_REQ_DEFAULT_PATH_INFO   2
> /** @} */

I was referring to the magic numbers for accept_path_info, not used_path_info.
Are those defined anywhere?

> >So what does PHP need to do?
> 
> before fixup's...
> 
> if (r->used_path_info == AP_REQ_DEFAULT_PATH_INFO) {
>     r->used_path_info = AP_REQ_ACCEPT_PATH_INFO;
> }
> 
> after the map to storage phase... such as in the filter_init.

You make it look too easy! :) I'll implement this today unless someone
beats me to it.

-aaron

Re: PATH_INFO in A2?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 11:41 AM 7/8/2002, Aaron Bannert wrote:
>On Mon, Jul 08, 2002 at 11:03:48AM -0500, William A. Rowe, Jr. wrote:
> > At 12:02 PM 7/6/2002, Rasmus Lerdorf wrote:
> > >> > What is a dynamic page if not a PHP page?
> > >>
> > >> Like I said, Apache doesn't know if a file on disk is meant for PHP or
> > >> not.  The best way to fix this would be for mod_php to set the value if
> > >> the filter is added for the request.
> > >>
> > >> I agree, it would be cool if Apache could set this correctly based on
> > >> the filters that have been added for the request.
> > >
> > >Seems like there should be an API call where a filter can specify whether
> > >it is a dynamic one or not as opposed to specifically flipping the
> > >acceptpathinfo bit.
> >
> > There is, and they shouldn't.  See core.c:3145
> >
> >     /* Deal with the poor soul who is trying to force path_info to be
> >      * accepted within the core_handler, where they will let the subreq
> >      * address its contents.  This is toggled by the user in the very
> >      * beginning of the fixup phase, so modules should override the user's
> >      * discretion in their own module fixup phase.  It is tristate, if
> >      * the user doesn't specify, the result is 2 (which the module may
> >      * interpret to its own customary behavior.)  It won't be touched
> >      * if the value is no longer undefined (2), so any module changing
> >      * the value prior to the fixup phase OVERRIDES the user's choice.
> >      */
> >     if ((r->used_path_info == AP_REQ_DEFAULT_PATH_INFO)
> >         && (conf->accept_path_info != 3)) {
> >         r->used_path_info = conf->accept_path_info;
> >     }
>
>Cool, I was meaning to ask you what the magic numbers meant here, so I
>could document it in the header and then implement this properly for
>PHP. Unfortunately I got sidetracked... :) Glad it came up again so
>we can get it fixed.

from httpd.h:627;

/**
  * @defgroup values_request_rec_used_path_info Possible values for request_rec.used_path_info
  * @{
  * Possible values for request_rec.used_path_info:
  */

/** Accept the path_info from the request */
#define AP_REQ_ACCEPT_PATH_INFO    0
/** Return a 404 error if path_info was given */
#define AP_REQ_REJECT_PATH_INFO    1
/** Module may choose to use the given path_info */
#define AP_REQ_DEFAULT_PATH_INFO   2
/** @} */

> > Effectively, we allow any module to set r->used_path_info to flag that
> > the request is valid with path_info.
> >
> > The reason to your question in another post, 'Why not always default
> > to on???' was pretty simple.  Most filters don't process path info.
> > php, perl and includes sometimes use the path info, sometimes won't.
> > That's why it can be turned off anywhere, even if your preference is to
> > allow it.  Modules can override the -default- preference, but never an
> > explicit preference in the config.
> >
> > So why not leave it on everywhere?  That's been discussed many times,
> > the IBM WebSphere product did so by default (even static content.)  The
> > problem is recursion, you have no way of telling a robot or other scanner
> > that there are no more -new- pages here to read, so they can just pound
> > the heck out of duplicated URI space.
> >
> > We used to have this on www.apache.org/httpd.html. Folks started using
> > www.apache.org/httpd [multiviewed].  Then folks started referring to
> > www.apache.org/httpd/index.html, as if this were a directory.  It never was.
> >
> > It's only productive to allow PATH_INFO if the application (PHP, Perl, etc)
> > can provide a 404 itself if the PATH_INFO isn't interesting.  Includes can't
> > do that, which is why I provided the r->used_path_info flag to advise that
> > some filter might be consuming that bit.
>
>So what does PHP need to do?

before fixup's...

if (r->used_path_info == AP_REQ_DEFAULT_PATH_INFO) {
     r->used_path_info = AP_REQ_ACCEPT_PATH_INFO;
}

after the map to storage phase... such as in the filter_init.
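
As a self-contained illustration of the snippet above, here is a sketch that compiles without the httpd headers. The three values are the ones quoted from httpd.h in this message; request_rec_sketch and php_claim_path_info are hypothetical stand-ins, and where exactly a real module would run this (filter_init is the suggestion here) is left as an assumption.

```c
/* Values as quoted from httpd.h in this thread. */
#define AP_REQ_ACCEPT_PATH_INFO    0
#define AP_REQ_REJECT_PATH_INFO    1
#define AP_REQ_DEFAULT_PATH_INFO   2

/* Stand-in for the single request_rec field used here; the real
 * struct lives in httpd.h and has many more members. */
typedef struct {
    int used_path_info;
} request_rec_sketch;

/* Hypothetical body for a PHP filter's init step: claim the path_info
 * only when nobody has expressed a preference yet. */
static void php_claim_path_info(request_rec_sketch *r)
{
    if (r->used_path_info == AP_REQ_DEFAULT_PATH_INFO) {
        r->used_path_info = AP_REQ_ACCEPT_PATH_INFO;
    }
}
```

Because the assignment is guarded, a request already marked AP_REQ_REJECT_PATH_INFO is left untouched.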

RE: PATH_INFO in A2?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 12:02 PM 7/6/2002, Rasmus Lerdorf wrote:
> > > What is a dynamic page if not a PHP page?
> >
> > Like I said, Apache doesn't know if a file on disk is meant for PHP or
> > not.  The best way to fix this would be for mod_php to set the value if
> > the filter is added for the request.
> >
> > I agree, it would be cool if Apache could set this correctly based on
> > the filters that have been added for the request.
>
>Seems like there should be an API call where a filter can specify whether
>it is a dynamic one or not as opposed to specifically flipping the
>acceptpathinfo bit.

There is, and they shouldn't.  See core.c:3145

     /* Deal with the poor soul who is trying to force path_info to be
      * accepted within the core_handler, where they will let the subreq
      * address its contents.  This is toggled by the user in the very
      * beginning of the fixup phase, so modules should override the user's
      * discretion in their own module fixup phase.  It is tristate, if
      * the user doesn't specify, the result is 2 (which the module may
      * interpret to its own customary behavior.)  It won't be touched
      * if the value is no longer undefined (2), so any module changing
      * the value prior to the fixup phase OVERRIDES the user's choice.
      */
     if ((r->used_path_info == AP_REQ_DEFAULT_PATH_INFO)
         && (conf->accept_path_info != 3)) {
         r->used_path_info = conf->accept_path_info;
     }

Effectively, we allow any module to set r->used_path_info to flag that
the request is valid with path_info.

The answer to your question in another post, 'Why not always default
to on???' was pretty simple.  Most filters don't process path info.
php, perl and includes sometimes use the path info, sometimes won't.
That's why it can be turned off anywhere, even if your preference is to
allow it.  Modules can override the -default- preference, but never an
explicit preference in the config.

So why not leave it on everywhere?  That's been discussed many times,
the IBM WebSphere product did so by default (even static content.)  The
problem is recursion, you have no way of telling a robot or other scanner
that there are no more -new- pages here to read, so they can just pound
the heck out of duplicated URI space.

We used to have this on www.apache.org/httpd.html.  Folks started using
www.apache.org/httpd [multiviewed].  Then folks started referring to
www.apache.org/httpd/index.html, as if this were a directory.  It never was.

It's only productive to allow PATH_INFO if the application (PHP, Perl, etc)
can provide a 404 itself if the PATH_INFO isn't interesting.  Includes can't
do that, which is why I provided the r->used_path_info flag to advise that
some filter might be consuming that bit.

Bill
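
The tristate logic in the core.c excerpt can be exercised outside the server with a small sketch. Everything prefixed sketch_ or SKETCH_ is a hypothetical stand-in. Two assumptions are worth flagging: the bare 3 in the excerpt is read as "no AcceptPathInfo directive applies" (the thread never names that value), and the handler check is a guess at how a last-chance file handler would treat a request whose path_info was not accepted.

```c
#include <stddef.h>

/* Values as quoted from httpd.h in this thread. */
#define AP_REQ_ACCEPT_PATH_INFO    0
#define AP_REQ_REJECT_PATH_INFO    1
#define AP_REQ_DEFAULT_PATH_INFO   2
/* Assumption: the literal 3 in the excerpt means "not configured". */
#define SKETCH_PATH_INFO_UNSET     3

typedef struct {
    int used_path_info;         /* per-request state */
    const char *path_info;      /* trailing path, may be NULL or "" */
} sketch_request;

typedef struct {
    int accept_path_info;       /* per-directory configuration */
} sketch_dir_conf;

/* Mirrors the core.c excerpt: the configured value is applied only if
 * no module has already moved the request away from the default. */
static void sketch_core_fixup(sketch_request *r, const sketch_dir_conf *conf)
{
    if ((r->used_path_info == AP_REQ_DEFAULT_PATH_INFO)
        && (conf->accept_path_info != SKETCH_PATH_INFO_UNSET)) {
        r->used_path_info = conf->accept_path_info;
    }
}

/* Assumed handler-side check: a trailing path that was not explicitly
 * accepted produces a 404, anything else a 200. */
static int sketch_static_handler_status(const sketch_request *r)
{
    int has_path_info = (r->path_info != NULL && r->path_info[0] != '\0');

    if (has_path_info && r->used_path_info != AP_REQ_ACCEPT_PATH_INFO) {
        return 404;
    }
    return 200;
}
```

Under these assumptions, a request with path_info "/foo", no directive in effect and no module claiming it stays at the default and gets a 404, which matches the report that started this thread.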








Re: PATH_INFO in A2?

Posted by Justin Erenkrantz <je...@apache.org>.
On Sat, Jul 06, 2002 at 10:02:24AM -0700, Rasmus Lerdorf wrote:
> > > What is a dynamic page if not a PHP page?
> >
> > Like I said, Apache doesn't know if a file on disk is meant for PHP or
> > not.  The best way to fix this would be for mod_php to set the value if
> > the filter is added for the request.
> >
> > I agree, it would be cool if Apache could set this correctly based on
> > the filters that have been added for the request.
> 
> Seems like there should be an API call where a filter can specify whether
> it is a dynamic one or not as opposed to specifically flipping the
> acceptpathinfo bit.

Sounds like PHP needs to implement the filter_init function.  It
is exactly for this situation.  -- justin

Re: PATH_INFO in A2?

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
"William A. Rowe, Jr." wrote:
> 
> And don't go spilling entire web trees from every static file resource,
> resulting in infinite recursion of a resource identifier ;-)

Um, we're not talking about static files, so the above is a red herring.

> Sorry if I feel somewhat strongly here.  Just got blasted with a dozen
> hits last night for robots.txt by a.v., and I have only one dns entry
> registered to that box.

robots.txt has the SSI handler declared for it?  If it's a static file,
sure -- make the default for path-info be off.  But if the handler for
a resource is anything *except* the last-chance filesystem handler,
leave it enabled by default.  DON'T assume we know better than whoever
wrote the handler, and DON'T add confusion about when the option is
on by default, off by default, and when it needs to be set explicitly.
-- 
#ken	P-)}

Ken Coar, Sanagendamgagwedweinini  http://Golux.Com/coar/
Author, developer, opinionist      http://Apache-Server.Com/

"Millennium hand and shrimp!"

Re: PATH_INFO in A2?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:56 PM 7/11/2002, Rodent of Unusual Size wrote:
>"William A. Rowe, Jr." wrote:
> >
> > But not by default.  That would be the exception, not the rule.
> > The idea is to avoid namespace recursion, and unlike CGIs, most
> > folks don't look at PATH_INFO in their SSIs.
>
>This smacks of a 'father knows best' attitude -- which historically
>infuriates me.  If it's dynamic, it's dynamic, so give it access
>to all its data.  *DON'T* force users (and people they ask for help)
>into a guessing game about when to turn the option on.

And don't go spilling entire web trees from every static file resource,
resulting in infinite recursion of a resource identifier ;-)

This was a compromise.  Short of pre-parsing the SSI for included
CGI files or PATH_INFO references, we can't predict.  Better docs
about this?   YES!  I would agree 100% this isn't intuitive enough.

Sorry if I feel somewhat strongly here.  Just got blasted with a dozen
hits last night for robots.txt by a.v., and I have only one dns entry
registered to that box.  Sure, you -can- include anything in an SSI,
but should the majority default to recursion or no recursion?

To default in favor of a non-infinitely-recursing resource is a good thing,
IMHO.  Do others care to shed new light in this year old debate?

Bill



Re: PATH_INFO in A2?

Posted by di...@covalent.net.
On Thu, 11 Jul 2002, Rodent of Unusual Size wrote:

> "William A. Rowe, Jr." wrote:
> >
> > But not by default.  That would be the exception, not the rule.
> > The idea is to avoid namespace recursion, and unlike CGIs, most
> > folks don't look at PATH_INFO in their SSIs.
>
> This smacks of a 'father knows best' attitude -- which historically
> infuriates me.  If it's dynamic, it's dynamic, so give it access
> to all its data.  *DON'T* force users (and people they ask for help)
> into a guessing game about when to turn the option on.
>
Aye - and people who love fooling netapps/squid/caches who think they
should do stupid things when they see a '?' in the URL love using it for
constructing dynamic content with a URL which looks like it is static
(and then use PATH_INFO).


Dw
-- 
http://www.foo.com/legal/italy/disclaimer.html



Re: PATH_INFO in A2?

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
"William A. Rowe, Jr." wrote:
> 
> But not by default.  That would be the exception, not the rule.
> The idea is to avoid namespace recursion, and unlike CGIs, most
> folks don't look at PATH_INFO in their SSIs.

This smacks of a 'father knows best' attitude -- which historically
infuriates me.  If it's dynamic, it's dynamic, so give it access
to all its data.  *DON'T* force users (and people they ask for help)
into a guessing game about when to turn the option on.
-- 
#ken	P-)}

Ken Coar, Sanagendamgagwedweinini  http://Golux.Com/coar/
Author, developer, opinionist      http://Apache-Server.Com/

"Millennium hand and shrimp!"

Re: PATH_INFO in A2?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 11:49 AM 7/11/2002, you wrote:
>Justin Erenkrantz wrote:
> >
> > I don't believe that mod_include would want AcceptPathInfo on.
>
>Um, why not?  I know of SSIs that use the path-info..

But not by default.  That would be the exception, not the rule.
The idea is to avoid namespace recursion, and unlike CGIs, most
folks don't look at PATH_INFO in their SSIs.  If your CGI doesn't
look at it either, that's the CGI's problem.

Bill



Re: PATH_INFO in A2?

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Justin Erenkrantz wrote:
> 
> I don't believe that mod_include would want AcceptPathInfo on.

Um, why not?  I know of SSIs that use the path-info..
-- 
#ken	P-)}

Ken Coar, Sanagendamgagwedweinini  http://Golux.Com/coar/
Author, developer, opinionist      http://Apache-Server.Com/

"Millennium hand and shrimp!"

RE: PATH_INFO in A2?

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> 
> On Sat, Jul 06, 2002 at 03:11:20PM -0400, rbb@apache.org wrote:
> > We just added a new function for all input filters to allow this to be
> > done (Justin referenced it in his reply).  However, that function doesn't
> > solve the problem, because there should be an ap_filter_is_dynamic(r) that
> > hides the implementation details for all filters.
> 
> I don't believe that mod_include would want AcceptPathInfo on.
> Only PHP would.  So, I don't know what ap_filter_is_dynamic() would
> buy here (other than setting r->no_local_copy to 1).  -- Justin

ap_filter_is_dynamic wouldn't replace the init function; it would be a
simple function that the init functions could call to easily set up the
filters.

Ryan
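
A rough sketch of what such a helper might look like, purely for discussion. No ap_filter_is_dynamic exists in the tree as far as this thread shows; the name sketch_filter_is_dynamic, the wants_path_info parameter and the stand-in struct are all hypothetical. The parameter is there only because the thread disagrees on whether every dynamic filter (mod_include in particular) should accept path_info.

```c
/* Values as quoted from httpd.h elsewhere in this thread. */
#define AP_REQ_ACCEPT_PATH_INFO    0
#define AP_REQ_DEFAULT_PATH_INFO   2

/* Stand-in for the two request_rec fields mentioned in the thread. */
typedef struct {
    int used_path_info;
    int no_local_copy;
} sketch_request_rec;

/* Hypothetical helper that a filter's init function could call.
 * It marks the response as dynamic and, if the filter asks for it,
 * accepts path_info when no preference has been set yet. */
static void sketch_filter_is_dynamic(sketch_request_rec *r,
                                     int wants_path_info)
{
    r->no_local_copy = 1;
    if (wants_path_info
        && r->used_path_info == AP_REQ_DEFAULT_PATH_INFO) {
        r->used_path_info = AP_REQ_ACCEPT_PATH_INFO;
    }
}

/* Hypothetical init functions showing the two call styles. */
static int sketch_php_filter_init(sketch_request_rec *r)
{
    sketch_filter_is_dynamic(r, 1);
    return 0;
}

static int sketch_include_filter_init(sketch_request_rec *r)
{
    sketch_filter_is_dynamic(r, 0);
    return 0;
}
```

The init functions stay in place, and the helper only hides which request_rec fields get touched.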



Re: PATH_INFO in A2?

Posted by Rasmus Lerdorf <ra...@apache.org>.
> On Sat, Jul 06, 2002 at 03:11:20PM -0400, rbb@apache.org wrote:
> > We just added a new function for all input filters to allow this to be
> > done (Justin referenced it in his reply).  However, that function doesn't
> > solve the problem, because there should be an ap_filter_is_dynamic(r) that
> > hides the implementation details for all filters.
>
> I don't believe that mod_include would want AcceptPathInfo on.
> Only PHP would.  So, I don't know what ap_filter_is_dynamic() would
> buy here (other than setting r->no_local_copy to 1).  -- justin

Why wouldn't mod_include want AcceptPathInfo to be on?  I am sure there
are .shtml files out there that are written with the assumption that
PATH_INFO is set correctly.

-Rasmus


Re: PATH_INFO in A2?

Posted by Justin Erenkrantz <je...@apache.org>.
On Sat, Jul 06, 2002 at 03:11:20PM -0400, rbb@apache.org wrote:
> We just added a new function for all input filters to allow this to be
> done (Justin referenced it in his reply).  However, that function doesn't
> solve the problem, because there should be an ap_filter_is_dynamic(r) that
> hides the implementation details for all filters.

I don't believe that mod_include would want AcceptPathInfo on.
Only PHP would.  So, I don't know what ap_filter_is_dynamic() would
buy here (other than setting r->no_local_copy to 1).  -- justin

RE: PATH_INFO in A2?

Posted by rb...@apache.org.
On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:

> > > What is a dynamic page if not a PHP page?
> >
> > Like I said, Apache doesn't know if a file on disk is meant for PHP or
> > not.  The best way to fix this would be for mod_php to set the value if
> > the filter is added for the request.
> >
> > I agree, it would be cool if Apache could set this correctly based on
> > the filters that have been added for the request.
> 
> Seems like there should be an API call where a filter can specify whether
> it is a dynamic one or not as opposed to specifically flipping the
> acceptpathinfo bit.

We just added a new function for all input filters to allow this to be
done (Justin referenced it in his reply).  However, that function doesn't
solve the problem, because there should be an ap_filter_is_dynamic(r) that
hides the implementation details for all filters.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


RE: PATH_INFO in A2?

Posted by Rasmus Lerdorf <ra...@apache.org>.
> > What is a dynamic page if not a PHP page?
>
> Like I said, Apache doesn't know if a file on disk is meant for PHP or
> not.  The best way to fix this would be for mod_php to set the value if
> the filter is added for the request.
>
> I agree, it would be cool if Apache could set this correctly based on
> the filters that have been added for the request.

Seems like there should be an API call where a filter can specify whether
it is a dynamic one or not as opposed to specifically flipping the
acceptpathinfo bit.

-Rasmus


RE: PATH_INFO in A2?

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Rasmus Lerdorf [mailto:rasmus@apache.org]
> 
> > > > Make sure you have AcceptPathInfo On
> > >
> > > Argh!  Why the heck is that off by default?
> >
> > It's on by default for dynamic pages, but there is no way that Apache
> > can tell that a page is going to be served by PHP, so it is off for what
> > Apache thinks are static pages.
> 
> What is a dynamic page if not a PHP page?

Like I said, Apache doesn't know if a file on disk is meant for PHP or
not.  The best way to fix this would be for mod_php to set the value if
the filter is added for the request.

I agree, it would be cool if Apache could set this correctly based on
the filters that have been added for the request.

Ryan



RE: PATH_INFO in A2?

Posted by Rasmus Lerdorf <ra...@apache.org>.
> > > Make sure you have AcceptPathInfo On
> >
> > Argh!  Why the heck is that off by default?
>
> It's on by default for dynamic pages, but there is no way that Apache
> can tell that a page is going to be served by PHP, so it is off for what
> Apache thinks are static pages.

What is a dynamic page if not a PHP page?

-Rasmus


RE: PATH_INFO in A2?

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Rasmus Lerdorf [mailto:rasmus@apache.org]
> 
> On Sat, 6 Jul 2002, Dale Ghent wrote:
> 
> > On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:
> >
> > | 2.0.40-dev built on June 23rd
> >
> > Make sure you have AcceptPathInfo On
> 
> Argh!  Why the heck is that off by default?

It's on by default for dynamic pages, but there is no way that Apache
can tell that a page is going to be served by PHP, so it is off for what
Apache thinks are static pages.

Ryan



Re: PATH_INFO in A2?

Posted by Dale Ghent <da...@elemental.org>.
On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:

| On Sat, 6 Jul 2002, Dale Ghent wrote:
|
| > On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:
| >
| > | 2.0.40-dev built on June 23rd
| >
| > Make sure you have AcceptPathInfo On
|
| Argh!  Why the heck is that off by default?

Dunno. I ran into this same problem a while back when I made the switch
from AP1.3 to AP2. It has always seemed like a regression to me
(especially when there's no mention of the change in any of the AP2
upgrade docs).

There was no AcceptPathInfo directive in 1.3. It got added in AP2 with a
default of Off (obviously). I don't know the original reasoning for this. I
guess having a switch for this is cool... but its default shouldn't stray
from what has been expected all along.

/dale


Re: PATH_INFO in A2?

Posted by Rasmus Lerdorf <ra...@apache.org>.
On Sat, 6 Jul 2002, Dale Ghent wrote:

> On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:
>
> | 2.0.40-dev built on June 23rd
>
> Make sure you have AcceptPathInfo On

Argh!  Why the heck is that off by default?

-Rasmus


Re: PATH_INFO in A2?

Posted by Dale Ghent <da...@elemental.org>.
On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:

| 2.0.40-dev built on June 23rd

Make sure you have AcceptPathInfo On

/dale



Re: PATH_INFO in A2?

Posted by Rasmus Lerdorf <ra...@apache.org>.
2.0.40-dev built on June 23rd

On Sat, 6 Jul 2002, Dale Ghent wrote:

> On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:
>
> | The requested URL /index.php/foo was not found on this server.
> |
> | In Apache1 /index.php would still get passed to PHP and /foo would end up
> | in the PATH_INFO server var.
>
> What AP2 code are you using? I'm running .40-dev with a CVS update as of
> last afternoon (right after Bpane's apr_table mods) and PHP 4.2.1 (with
> the set-cookie fix)... everything is fine for me on a site that requires
> PATH_INFO ( http://playadelfuego.org/ )
>
> /dale
>


Re: PATH_INFO in A2?

Posted by Dale Ghent <da...@elemental.org>.
On Sat, 6 Jul 2002, Rasmus Lerdorf wrote:

| The requested URL /index.php/foo was not found on this server.
|
| In Apache1 /index.php would still get passed to PHP and /foo would end up
| in the PATH_INFO server var.

What AP2 code are you using? I'm running .40-dev with a CVS update as of
last afternoon (right after Bpane's apr_table mods) and PHP 4.2.1 (with
the set-cookie fix)... everything is fine for me on a site that requires
PATH_INFO ( http://playadelfuego.org/ )

/dale