Posted to mod_python-dev@quetz.apache.org by "Graham Dumpleton (JIRA)" <ji...@apache.org> on 2006/03/05 05:42:39 UTC
[jira] Closed: (MODPYTHON-22) mod_python.publisher extension handling

     [ http://issues.apache.org/jira/browse/MODPYTHON-22?page=all ]
     
Graham Dumpleton closed MODPYTHON-22:
-------------------------------------


> mod_python.publisher extension handling
> ---------------------------------------
>
>          Key: MODPYTHON-22
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-22
>      Project: mod_python
>         Type: Bug
>     Versions: 3.1.4, 3.1.3
>     Reporter: Graham Dumpleton
>     Assignee: Nicolas Lehuen
>     Priority: Minor
>      Fix For: 3.2.7

>
> The following code in mod_python.publisher doesn't appear to be correct.
>   imp_suffixes = " ".join([x[0][1:] for x in imp.get_suffixes()])
>     # get rid of the suffix
>     #   explanation: Suffixes that will get stripped off
>     #   are those that were specified as an argument to the
>     #   AddHandler directive. Everything else will be considered
>     #   a package.module rather than module.suffix
>     exts = req.get_addhandler_exts()
>     if not exts:
>         # this is SetHandler, make an exception for Python suffixes
>         exts = imp_suffixes
>     if req.extension:  # this exists if we're running in a | .ext handler
>         exts += req.extension[1:]
>     if exts:
>         suffixes = exts.strip().split()
>         exp = "\\." + "$|\\.".join(suffixes)
>         suff_matcher = re.compile(exp) # python caches these, so its fast
>         module_name = suff_matcher.sub("", module_name)
> For starters, imp.get_suffixes() returns:
>     [('.so', 'rb', 3), ('module.so', 'rb', 3), ('.py', 'U', 1), ('.pyc', 'rb', 2)]
> on a UNIX platform. Thus yielding for imp_suffixes:
>   'so odule.so py pyc'
> I.e., the "m" has been lost from "module.so". As a dynamically loaded
> C module is unlikely to be usable within publisher, this is no loss
> at this point.
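
The truncation is easy to reproduce in isolation. A minimal sketch, with the suffix list hard-coded from the report above rather than obtained from imp.get_suffixes() (the imp module is gone from current Python releases); the lstrip variant is only a hypothetical illustration, not what publisher did:

```python
# Suffix list copied from the report; hard-coded instead of calling imp.get_suffixes().
suffixes = [('.so', 'rb', 3), ('module.so', 'rb', 3), ('.py', 'U', 1), ('.pyc', 'rb', 2)]

# Same expression as in publisher: blindly drop the first character of each suffix.
imp_suffixes = " ".join([x[0][1:] for x in suffixes])
print(imp_suffixes)

# Hypothetical alternative: remove a leading dot only when one is present.
stripped = " ".join(x[0].lstrip(".") for x in suffixes)
print(stripped)
```

The first form assumes every suffix starts with a dot, which "module.so" does not.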
> Now if one were using:
>   SetHandler python-program 
>   PythonHandler mod_python.publisher | .xxx
>   PythonHandler mod_python.publisher | .py
> then req.get_addhandler_exts() returns:
>   ''
> I.e., an empty string. Thus, "exts" gets set to imp_suffixes.
>   exts = 'so odule.so py pyc'
> Now because ".py" and ".xxx" were specified on the PythonHandler lines, req.extension
> is set to ".py" or ".xxx" as appropriate for the request. When this is appended to "exts",
> no space is added, and thus the result for a ".py" request is:
>   exts = 'so odule.so py pycpy'
> For a ".py" extension this is no drama as it is already listed in exts at that
> point and thus things still work okay.
> The lack of a space does break things for ".xxx", though. If one used a URL
> with a .xxx extension, one would expect it to drop the extension and still work,
> but because exts gets rewritten as:
>   exts = 'so odule.so py pycxxx'
> it doesn't work; instead you get a not found error.
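
The three cases discussed in the report can be replayed outside Apache. A rough sketch, assuming the quoted publisher logic is lifted into a standalone function; strip_suffix and its parameters are hypothetical stand-ins for the req.get_addhandler_exts() and req.extension values:

```python
import re

def strip_suffix(module_name, addhandler_exts, extension):
    """Replay of the quoted publisher logic, with req.* values passed in."""
    imp_suffixes = 'so odule.so py pyc'   # value from the report
    exts = addhandler_exts
    if not exts:
        # SetHandler case: fall back to the Python suffixes
        exts = imp_suffixes
    if extension:
        exts += extension[1:]             # no separating space, as in the original
    if exts:
        suffixes = exts.strip().split()
        exp = "\\." + "$|\\.".join(suffixes)
        module_name = re.compile(exp).sub("", module_name)
    return module_name

print(strip_suffix("foo.py", "", ".py"))        # 'py' is already in the fallback list
print(strip_suffix("foo.xxx", "", ".xxx"))      # 'pycxxx' never matches '.xxx'
print(strip_suffix("foo.xxx", "xxx ", ".xxx"))  # trailing space from AddHandler saves it
```

Only the second call leaves the extension in place, which is the case that ends in a not found error.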
> The only way around it is to use:
>   SetHandler python-program
>   AddHandler python-program .xxx
>   PythonHandler mod_python.publisher | .xxx
>   PythonHandler mod_python.publisher | .py
> I.e., as well as SetHandler, define AddHandler for at least one extension type.
> In doing this, req.get_addhandler_exts() now returns:
>   'xxx '
> Notice how there is a space at the end of the string. That it is set means exts
> isn't set to imp_suffixes. When req.extension is appended, the existing space means
> all is okay and the result will be:
>   'xxx xxx'
> In summary, instead of:
>   exts += req.extension[1:]
> the code should be the equivalent of:
>   exts += ' '
>   exts += req.extension[1:]
> Do note that when req.get_addhandler_exts() returns something, this
> will mean spaces are doubled up. This appears to be okay, as the split() function
> treats adjoining spaces as a single field separator when splitting a string.
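
The claim about doubled spaces can be checked directly. A small sketch, assuming the values reported above; build_suffixes is a hypothetical helper wrapping the proposed two-line fix:

```python
def build_suffixes(exts, extension):
    """Apply the proposed fix: always add a space before the extension."""
    exts += ' '
    exts += extension[1:]
    # split() with no argument collapses runs of whitespace
    return exts.strip().split()

# AddHandler case: 'xxx ' already ends in a space, so the space is doubled.
with_addhandler = build_suffixes('xxx ', '.xxx')

# SetHandler-only case: the fallback list has no trailing space.
with_fallback = build_suffixes('so odule.so py pyc', '.xxx')

print(with_addhandler)
print(with_fallback)
```

In both cases "xxx" ends up as its own entry instead of being glued onto "pyc".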
> Another bit of code which is a bit loose is:
>   exp = "\\." + "$|\\.".join(suffixes)
> When this is applied to:
>    'xxx xxx'
> one gets:
>   '\\.xxx$|\\.xxx'
> The intent of the regular expression is that it removes the
> extension only from the end of the URL. I.e., "foo.xxx" yields "foo". However,
> because there is no '$' on the very last part of the pattern, any instance of
> the last extension will be removed from anywhere in the string. I.e.:
>   suff_matcher.sub("","aaa.xxxbbb.foo.xxx")
> yields:
>   'aaabbb.foo'
> instead of:
>   "aaa.xxxbbb.foo"
> At the moment this may not be an issue: given the way Apache does its matching
> and Python does its module lookup, neither would produce a valid result in the
> first place if files had a '.' in the actual name besides the one used for the
> extension. :-)
> The missing space still needs to be fixed though.
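
For the looser pattern, one possible tightening is to anchor the whole alternation rather than each piece. This is only a sketch of that idea, not the change that was committed; the names loose and tight are made up for the comparison:

```python
import re

suffixes = ['xxx', 'xxx']
url = "aaa.xxxbbb.foo.xxx"

# Pattern as built in publisher: the final alternative has no '$'.
loose = re.compile("\\." + "$|\\.".join(suffixes))

# Hypothetical tightening: escape each suffix and anchor the group as a whole.
tight = re.compile(r"\.(?:%s)$" % "|".join(re.escape(s) for s in suffixes))

loose_result = loose.sub("", url)
tight_result = tight.sub("", url)
print(loose_result)
print(tight_result)
```

Escaping the suffixes also keeps the '.' inside an entry such as "odule.so" from acting as a wildcard.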

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira