Posted to mod_python-dev@quetz.apache.org by Graham Dumpleton <gr...@dscpl.com.au> on 2006/11/19 12:37:40 UTC

Proposal for new handler to be added to mod_python 3.3.

I know that we are very close to getting mod_python 3.3 out the door,
and I know that it is me holding it up purely through the documentation
not yet being updated to at least give some basic details on new bits
in the rewritten module importer, but even at this late stage I have been
wondering whether with 3.3 we aren't missing a great opportunity to add a
new basic mod_python dispatch handler. After all, it may well be some time
before we get around to releasing the next version.

On that basis, I have quickly thrown together a new dispatch handler
which I think could be included with mod_python. After seeing so many
people going off and writing their own dispatchers with varied success I
could see that including this would save a lot of mucking around for
people.

I have attached the actual code for the handler module, but I'll explain
a few things about it as well.

First off, the intention is that this would exist as 'mod_python.dispatcher'. In
the simplest case, if you wanted to put all requests which fall within a
directory through it, you would use:

   SetHandler mod_python
   PythonHandler mod_python.dispatcher

The basic premise of the dispatcher is then that it will attempt to match
any request against a resource to a handler found in a .py file
corresponding to that specific resource.

Thus, if a request is made against '/index', the dispatcher would look for
a .py file called 'index.py' in that directory and execute the 'handler()'
function within it. If the request was made against '/index.html', the
dispatcher would instead try and execute the 'handler_html()' function.

   from mod_python import apache

   def handler_html(req):
     req.content_type = 'text/html'
     req.write('<html><body><p>index</p></body></html>')
     return apache.OK
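
The lookup just described can be pictured with a small standalone sketch.
This is only an illustration of the naming rule, not the attached module:
the helper name 'resolve' and the example paths are made up, and nothing
here touches mod_python itself.

```python
import posixpath

def resolve(filename, phase="handler"):
    """Map a request filename to (code file, handler function name)."""
    # Split off the extension: '/var/www/index.html' -> ('/var/www/index', '.html')
    stub, extn = posixpath.splitext(filename)
    target = stub + ".py"
    name = phase
    if extn:
        # An extension selects a suffixed function, e.g. handler_html.
        name = "%s_%s" % (phase, extn[1:])
    return target, name

print(resolve("/var/www/index"))
print(resolve("/var/www/index.html"))
```

The first call yields the plain 'handler' name and the second yields
'handler_html', both pointing at the same 'index.py' code file.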

So, an important aspect of the dispatcher is that it doesn't just push all
requests for a resource through the one handler function. Instead, it will
call different handlers within the one resource code file for the different
extensions.

The benefit of this is that it becomes really easy to control what extension
you want a resource accessed under. Further, it is easy to provide different
views of a resource where the format of the result is different based on
the extension.

   def handler_csv(req):
     ...

   def handler_xml(req):
     ...
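
To make the idea of per-extension views a little more concrete, here is a
sketch that can be run without Apache. The 'FakeRequest' class, the 'ROWS'
sample data and the 'OK' constant are stand-ins invented for illustration;
in a real resource file the request object comes from mod_python and the
return value would be apache.OK.

```python
OK = 0  # stand-in for apache.OK so the sketch runs without mod_python

ROWS = [("alice", 3), ("bob", 5)]  # made-up sample data

class FakeRequest:
    """Minimal stand-in for the mod_python request object."""
    def __init__(self):
        self.content_type = None
        self.body = []

    def write(self, text):
        self.body.append(text)

def handler_csv(req):
    # View of the resource for requests ending in '.csv'.
    req.content_type = "text/csv"
    for name, count in ROWS:
        req.write("%s,%d\n" % (name, count))
    return OK

def handler_xml(req):
    # View of the same data for requests ending in '.xml'.
    req.content_type = "text/xml"
    req.write("<rows>")
    for name, count in ROWS:
        req.write('<row name="%s" count="%d"/>' % (name, count))
    req.write("</rows>")
    return OK
```

Both functions live in the one resource code file and share the same data;
only the extension of the request decides which of them runs.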

Because of the way that Apache matches files, this will all work within any
subdirectories as well. Thus if the request is against '/subdir/index.html',
the dispatcher will try and execute 'handler_html()' in 'subdir/index.py'.

If the directory is shared with static files which one wants to be served
up as is, one can use the Files directive to indicate which files should
still be served up as static files and not processed by mod_python.

   SetHandler mod_python
   PythonHandler mod_python.dispatcher

   <Files *.jpg>
   SetHandler None
   </Files>

Instead of starting with SetHandler, one could instead start with AddHandler.

   AddHandler mod_python .html
   PythonHandler mod_python.dispatcher

   # Must block access to static .py files.

   <Files *.py>
   deny from all
   </Files>

In this way, only things with a .html extension will be handled by the
dispatcher. If there are individual .html files that you want served as
static files or in some other way, you can again use the Files directive to
mark specifically how they should be dealt with.

   <Files index.html>
   SetHandler default-handler
   </Files>

For the next tricky bit of this dispatcher, it can be used for other phases
besides just the response handler phase. For example, if I want to be able
to specify fixup handlers for individual resources, then I could use:

   SetHandler mod_python
   PythonFixupHandler mod_python.dispatcher
   PythonHandler mod_python.dispatcher

This would then allow me to say something like:

   from mod_python import apache

   def fixuphandler(req):
     req.used_path_info = apache.AP_REQ_ACCEPT_PATH_INFO
     return apache.OK

   def handler(req):
     ...

What is happening here is that the default behaviour of the dispatcher is to
prohibit additional path information from being provided for a request
against a resource. Accepting it could be enabled on a per resource basis
from the Apache configuration file:

   <Files resource>
   AcceptPathInfo On
   </Files>

or, as shown, a fixuphandler() function specific to that resource could be
provided within the same resource code file and set req.used_path_info
as appropriate instead.
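
The default rejection of extra path information amounts to a small check in
the response handler. The sketch below mirrors that check in isolation. The
constants are placeholders chosen so it runs without mod_python (the real
code compares against apache.AP_REQ_ACCEPT_PATH_INFO and returns
apache.HTTP_NOT_FOUND), and the function name 'check_path_info' is made up.

```python
ACCEPT_PATH_INFO = "accept"    # placeholder for apache.AP_REQ_ACCEPT_PATH_INFO
DEFAULT_PATH_INFO = "default"  # placeholder for any other setting
HTTP_NOT_FOUND = 404           # placeholder for apache.HTTP_NOT_FOUND

def check_path_info(used_path_info, path_info):
    """Return 404 if extra path info is present but not accepted, else None."""
    if used_path_info != ACCEPT_PATH_INFO and path_info:
        return HTTP_NOT_FOUND
    return None

# '/resource/extra' with the default setting is rejected ...
print(check_path_info(DEFAULT_PATH_INFO, "/extra"))
# ... but passes once a fixup handler or AcceptPathInfo has accepted it.
print(check_path_info(ACCEPT_PATH_INFO, "/extra"))
```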

Thus, any of the prior phases could be enabled on a case by case basis
if they needed to be overridden by specific resources at some point.

Alternatively, one could set up the handler so it checks resource code files
for the existence of a handler for any of the prior phases by using:

   SetHandler mod_python
   PythonHandlerModule mod_python.dispatcher

That way, one directive would allow other phases, such as the
headerparserhandler and accesshandler phases, to also be overridden
on a per resource basis.
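
For the earlier phases, the function looked up in the resource code file is
named after the phase itself. A sketch of that naming rule follows. It
assumes that req.phase holds the directive-style name, for example
'PythonFixupHandler', which is what the slicing in the attached code relies
on. The helper name 'handler_name' is made up for illustration.

```python
def handler_name(phase, extn=""):
    """Derive the per-resource function name for a phase and extension."""
    # Strip the leading 'Python' and lower-case the rest:
    # 'PythonFixupHandler' -> 'fixuphandler'
    name = phase[6:].lower()
    if extn:
        # '.html' -> 'fixuphandler_html'
        name = "%s_%s" % (name, extn[1:])
    return name

print(handler_name("PythonFixupHandler"))
print(handler_name("PythonAccessHandler", ".html"))
```

If the resource code file has no function of that name, the dispatcher
declines the phase and Apache carries on as normal.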

Although the dispatcher is targeted at allowing handlers to be specified on
a per resource basis, it is still possible to mix in other handlers which
apply across all requests.

For example:

   PythonHeaderParserHandler moddir::directoryindex

   SetHandler mod_python
   PythonHandlerModule mod_python.dispatcher

   from mod_python import apache

   # moddir.py

   def directoryindex(req):
     if req.content_type == 'httpd/unix-directory':
       req.filename = req.filename + 'index.html'
       req.finfo = apache.stat(req.filename)
       req.content_type = 'text/html'
     return apache.OK

Thus one can emulate the DirectoryIndex directive, which otherwise would
not work with this dispatcher and causes mod_python other problems anyway
because of bugs with internal fast redirects in Apache.

One final example with this. One need not even have an actual response
handler defined for a resource, but can still define a fixup handler. For
example, if one wanted to use SSI for a specific resource which existed as
a static .html file, one could do this with:

   def fixuphandler_html(req):
     req.handler = 'default-handler'
     req.add_output_filter('INCLUDES')
     req.ssi_globals = { ... }
     return apache.OK

All up, although the actual code for this new dispatch handler is quite
small, it allows for a great deal of flexibility and would probably give
users more to work with than what most people come up with for a simple
dispatcher.

For those who have been trying mod_python 3.3, it would be great to see you
give this new handler a go and see what you think. Any comments are most
welcome, especially on whether or not it is something people feel would be
worthwhile to include in mod_python 3.3. Note that for people not using
mod_python 3.3, this will not work with older versions.

BTW, to test this, just throw it into a directory in your document tree and
set up the .htaccess file to contain:

   SetHandler mod_python
   PythonHandlerModule _handlers
   PythonDebug On

   <Files *.py>
   deny from all
   </Files>

Just don't make requests against '/_handlers' as it will loop back on itself
given that it isn't meant to be in the document tree itself.

Have fun.

Graham



Re: Proposal for new handler to be added to mod_python 3.3.

Posted by Jim Gallacher <jp...@jgassociates.ca>.
David Fraser wrote:
> Jim Gallacher wrote:
>> -1 (On including in 3.3)
>>
>> We need to have some release discipline. The beta cycle for 3.2 was 
>> something like 8 months long and I don't want to see that happen 
>> again. One of the reasons for that long cycle was the feeling that "it 
>> might be a long time before we do another release, so let's make this 
>> change now", which ultimately *caused* the delay.
>>
>> There are a lot of really great things in 3.3 (new importer) and I 
>> think we should get it out as soon as possible. I'd also like to see a 
>> 3.4 release fairly soon (4-6 months?) after a thorough audit for python 
>> 2.5 64-bit support. If the new handler is accepted it may not have to 
>> wait too long. Also, if it's pure python it would be pretty painless 
>> for people to backport it if they want to use it in the mean time.
>>
>> Maybe we should adopt some sort of calendar release policy. If we aim 
>> for a minor point release every 6 months then people will always know 
>> that they won't have to wait too long for the latest and greatest 
>> features to appear. This would alleviate the urge to stuff too much 
>> into any one release. I'm not suggesting that we be a slave to the 
>> calendar - just use it as a guideline.
>>
>> Jim
> I agree with the need for release discipline, however I think we need to 
> have a compromise where either
> - the optimal minor point release cycle for mod_python is more frequent 
> than 6-monthly (6 months is a fairly large delay for a simple feature 
> like this), or

I'm all for a more frequent release cycle, just as I'm also in favour of 
peace in our time. I'm just not sure either is realistic. I don't think 
6 months is necessarily optimal, but rather based on observation of what 
is possible. I suggested 6 months so that we at least have something 
predictable.

In this particular case we are just days away from a full beta, which 
I'm pretty confident will become 3.3.0 final. There is just a small 
delay in cleaning up a couple of points in the documentation and we are 
then ready to roll. Plopping new features in this close to the wire just 
slows the process down. Without addressing the specifics of this 
particular proposal (which I'm sure is on par with Graham's usual great 
work), it is entirely possible that it could delay the 3.3.0 release by 
another 2 to 4 weeks. By then some wonderful code I've been working on 
(well hypothetically at least ;) ) should be ready, so what the heck, 
let's include it as well... just need a few more days to give it some 
polish... tick tick tick... more fiddling...  tick tick... and suddenly 
it's February '07 and a full year has passed since 3.2 went out.

> - there is a mechanism to allow certain kinds of new features more 
> frequently than a minor point release.

I don't think there is any magic in the minor point releases. It's the 
appropriate place for new features, as long as any API changes are 
backwards compatible. We just need to be able to say "OK, we've got 
enough good new stuff in the development branch, let's release it". 
Making *that* decision seems to be more difficult than writing code. Heck, 
it's almost as tough as writing documentation.

> In some ways it may make sense to have different rules for base code and 
> handlers that function on top of it.

I see your logic, but at this point most changes in the base code are 
likely to be bug fixes from which everyone would benefit. Any major 
changes in the base code, by which I mean the stuff written in C, might 
be treated differently, but it's not likely that new features would be 
introduced there first anyway.

> I've always felt that the publisher handler in mod_python is problematic 
> - it causes lots of queries and isn't really the best way to do a lot of 
> things, but because it is included in the base distro most people seem 
> to try it out first.
> This dispatcher seems like a better fit to mod_python. But perhaps both 
> of them should be broken out into a separate package - it could be 
> called modpython-utils or something like that.
> That way that package could have a faster release cycle, 

I'm not sure that would actually result in a faster release cycle, as 
we'd then have 2 branches to manage. The idea has merit, but for other 
reasons.

I hope my response doesn't sound like too much of an angry rant - it may 
be a rant, but not angry. Just assume that there is a smiley face stuck 
at the end of each sentence. :)

Jim



Re: Proposal for new handler to be added to mod_python 3.3.

Posted by David Fraser <da...@sjsoft.com>.
Jim Gallacher wrote:
> -1 (On including in 3.3)
>
> We need to have some release discipline. The beta cycle for 3.2 was 
> something like 8 months long and I don't want to see that happen 
> again. One of the reasons for that long cycle was the feeling that "it 
> might be a long time before we do another release, so let's make this 
> change now", which ultimately *caused* the delay.
>
> There are a lot of really great things in 3.3 (new importer) and I 
> think we should get it out as soon as possible. I'd also like to see a 
> 3.4 release fairly soon (4-6 months?) after a thorough audit for python 
> 2.5 64-bit support. If the new handler is accepted it may not have to 
> wait too long. Also, if it's pure python it would be pretty painless 
> for people to backport it if they want to use it in the mean time.
>
> Maybe we should adopt some sort of calendar release policy. If we aim 
> for a minor point release every 6 months then people will always know 
> that they won't have to wait too long for the latest and greatest 
> features to appear. This would alleviate the urge to stuff too much 
> into any one release. I'm not suggesting that we be a slave to the 
> calendar - just use it as a guideline.
>
> Jim
I agree with the need for release discipline, however I think we need to 
have a compromise where either
- the optimal minor point release cycle for mod_python is more frequent 
than 6-monthly (6 months is a fairly large delay for a simple feature 
like this), or
- there is a mechanism to allow certain kinds of new features more 
frequently than a minor point release.
In some ways it may make sense to have different rules for base code and 
handlers that function on top of it.
I've always felt that the publisher handler in mod_python is problematic 
- it causes lots of queries and isn't really the best way to do a lot of 
things, but because it is included in the base distro most people seem 
to try it out first.
This dispatcher seems like a better fit to mod_python. But perhaps both 
of them should be broken out into a separate package - it could be 
called modpython-utils or something like that.
That way that package could have a faster release cycle, could require 
pure python code for ease of debugging etc, and there would be more 
clarity about the different layers

David

Re: Proposal for new handler to be added to mod_python 3.3.

Posted by "Gregory (Grisha) Trubetskoy" <gr...@apache.org>.
I haven't studied the code or given the implications of the new handler a 
lot of thought, and it is for that very reason I'm with Jim on this one - 
let's not rush things out the door as part of an official release (3.3 in 
this case). If we decide to include it (+0 on that for now), for the 
people who want to check it out, there is always the SVN trunk via which 
you can try it (if/when we include it). Let's release 3.3, while "sleeping 
over" the dispatcher handler and see where that takes us.

Grisha


On Sun, 19 Nov 2006, Jim Gallacher wrote:

> -1 (On including in 3.3)
>
> We need to have some release discipline. The beta cycle for 3.2 was something 
> like 8 months long and I don't want to see that happen again. One of the 
> reasons for that long cycle was the feeling that "it might be a long time 
> before we do another release, so let's make this change now", which 
> ultimately *caused* the delay.
>
> There are a lot of really great things in 3.3 (new importer) and I think we 
> should get it out as soon as possible. I'd also like to see a 3.4 release 
> fairly soon (4-6 months?) after a thorough audit for python 2.5 64-bit support. 
> If the new handler is accepted it may not have to wait too long. Also, if 
> it's pure python it would be pretty painless for people to backport it if 
> they want to use it in the mean time.
>
> Maybe we should adopt some sort of calendar release policy. If we aim for a 
> minor point release every 6 months then people will always know that they 
> won't have to wait too long for the latest and greatest features to appear. 
> This would alleviate the urge to stuff too much into any one release. I'm not 
> suggesting that we be a slave to the calendar - just use it as a guideline.
>
> Jim
>
> Graham Dumpleton wrote:
>> I know that we are very close to getting mod_python 3.3 out the door,
>> and I know that it is me holding it up purely through the documentation
>> not yet being updated to at least give some basic details on new bits
>> in the rewritten module importer, but even at this late stage I have been
>> thinking if we aren't with 3.3 missing a great opportunity to add a new
>> basic mod_python dispatch handler. After all, it may well be some time
>> before we get around to releasing the next version.
>> 
>> On that basis, I have quickly thrown together a new dispatch handler
>> which I think could be included with mod_python. After seeing so many
>> people going off and writing their own dispatchers with varied success I
>> could see that including this would save a lot of mucking around for
>> people.
>> 
>> I have attached the actual code for the handler module, but I'll explain
>> a few things about it as well.
>> 
>> First off, intended that this would exist as 'mod_python.dispatcher'. In
>> the simplest case, if you wanted to put all requests which fall within a
>> directory through it, you would use:
>> 
>>   SetHandler mod_python
>>   PythonHandler mod_python.dispatcher
>> 
>> The basic premise of the dispatcher is then that it will attempt to match
>> any request against a resource against a handler found in a .py file
>> corresponding to that specific resource.
>> 
>> Thus, if a request is made against '/index', the dispatcher would look for
>> a .py file called 'index.py' in that directory and execute the 'handler()'
>> function within it. If the request was made against '/index.html', the
>> dispatcher would instead try and execute the 'handler_html()' function.
>> 
>>   from mod_python import apache
>> 
>>   def handler_html(req):
>>     req.content_type = 'text/html'
>>     req.write('<html><body><p>index</p></body></html>')
>>     return apache.OK
>> 
>> So, an important aspect of the dispatcher is that it doesn't just push all
>> requests for a resource through the one handler function. Instead, it will
>> call different handlers within the one resource code file for the 
>> different
>> extensions.
>> 
>> The benefit of this is that it becomes really easy to control what 
>> extension
>> you want a resource accessed under. Further, it is easy to provide 
>> different
>> views of a resource where the format of the result is different based on
>> the extension.
>> 
>>   def handler_csv(req):
>>     ...
>> 
>>   def handler_xml(req):
>>     ...
>> 
>> Because of the way that Apache matches files, this will all work within 
>> any
>> subdirectories as well. Thus if the request is against 
>> '/subdir/index.html',
>> the dispatcher will try and execute 'handler_html()' in 'subdir/index.py'.
>> 
>> If the directory is shared with static files which one wants to be served
>> up as is, one can use the Files directive to indicate which files should 
>> be
>> still served up as static files and not processed by mod_python.
>> 
>>   SetHandler mod_python
>>   PythonHandler mod_python.dispatcher
>> 
>>   <Files *.jpg>
>>   SetHandler None
>>   </Files>
>> 
>> Instead of starting with SetHandler, one could instead start with 
>> AddHandler.
>> 
>>   AddHandler mod_python .html
>>   PythonHandler mod_python.dispatcher
>> 
>>   # Must block access to static .py files.
>> 
>>   <Files *.py>
>>   deny from all
>>   </Files>
>> 
>> In this way, only things with a .html extension will be handled by the 
>> dispatcher.
>> If there are individual .html files that you want served as static files 
>> or in some
>> other way, can again use Files directive to mark specifically how they 
>> should
>> be dealt with.
>> 
>>   <Files index.html>
>>   SetHandler default-handler
>>   </Files>
>> 
>> For the next tricky bit of this dispatcher, it can be used for other 
>> phases besides
>> that for just the response handler phase. For example, if I want to be 
>> able to
>> specify fixup handlers for individual resources, then I could use:
>> 
>>   SetHandler mod_python
>>   PythonFixupHandler mod_python.dispatcher
>>   PythonHandler mod_python.dispatcher
>> 
>> This would then allow me to say something like:
>> 
>>   from mod_python import apache
>> 
>>   def fixuphandler(req):
>>     req.used_path_info = apache.AP_REQ_ACCEPT_PATH_INFO
>>     return apache.OK
>> 
>>   def handler(req):
>>     ...
>> 
>> What is happening here is that the default behaviour of the dispatcher is 
>> to
>> prohibit additional path information to be provided for a request against
>> a resource. This could be enabled on a per resource basis from the Apache
>> configuration file:
>> 
>>   <Files resource>
>>   AcceptPathInfo On
>>   </Files>
>> 
>> or, as shown, a fixuphandler() function specific to that resource could be
>> provided within the same resource code file and set req.used_path_info
>> as appropriate instead.
>> 
>> Thus, any of the prior phases could be enabled on a case by case basis
>> if they needed to be overridden by specific resources at some point.
>> 
>> Alternatively, one could setup the handler so it checks resource code 
>> files
>> for the existence of a handler for any of the prior phases by using:
>> 
>>   SetHandler mod_python
>>   PythonHandlerModule mod_python.dispatcher
>> 
>> That way, with one directive, would allow other phases such as the
>> headerparserhandler, accesshandler phases etc, to also be overridden
>> on a per resource basis.
>> 
>> Although the dispatcher is targeted at allowing handlers to be specified 
>> on
>> a per resource basis, it is still possible to mixin other handlers which 
>> apply
>> across all requests.
>> 
>> For example:
>> 
>>   PythonHeaderParserHandler moddir::directoryindex
>> 
>>   SetHandler mod_python
>>   PythonHandlerModule mod_python.dispatcher
>> 
>>   from mod_python import apache
>> 
>>   # moddir.py
>> 
>>   def directoryindex(req):
>>     if req.content_type == 'httpd/unix-directory':
>>       req.filename = req.filename + 'index.html'
>>       req.finfo = apache.stat(req.filename)
>>       req.content_type = 'text/html'
>>     return apache.OK
>> 
>> Thus one can emulate the DirectoryIndex directive, which otherwise would
>> not work with this dispatcher and causes mod_python other problems anyway
>> because of bugs with internal fast redirects in Apache.
>> 
>> One final example with this. One need not even have an actual response 
>> handler
>> defined for a resource, but still define a fixup handler. For example, if 
>> one wanted
>> to use SSI for a specific resource which existed as a static .html file. 
>> One could
>> do this with:
>> 
>>   def fixuphandler_html(req):
>>     req.handler = 'default-handler'
>>     req.add_output_filter('INCLUDES')
>>     req.ssi_globals = { ... }
>>     return apache.OK
>> 
>> All up, although the actual code for this new dispatch handler is quite 
>> little, it
>> allows for a great deal of flexibility and would probably give users more 
>> to work
>> with than what most people come up with for a simple dispatcher.
>> 
>> For those who have been trying mod_python 3.3, would be great to see you 
>> give
>> this new handler a go and see what you think. Any comments most welcome,
>> especially whether or not it is something people feel would be worthwhile 
>> to
>> include in mod_python 3.3. Note that for people not using mod_python 3.3, 
>> this
>> will not work with older versions.
>> 
>> BTW, to test this, just throw it a directory in your document tree and 
>> setup the
>> .htaccess file to contain:
>> 
>>   SetHandler mod_python
>>   PythonHandlerModule _handlers
>>   PythonDebug On
>> 
>>   <Files *.py>
>>   deny from all
>>   </Files>
>> 
>> Just don't make requests against '/_handlers' as it will loop back on 
>> itself given
>> that it isn't meant to be in the document tree itself.
>> 
>> Have fun.
>> 
>> Graham
>> 
>> 
>> 
>> ------------------------------------------------------------------------
>> 
>> from mod_python import apache
>> 
>> import os, posixpath
>> 
>> def _phasehandler(req):
>> 
>>     # Ignore requests made against the directory.
>> 
>>     if req.filename[-1] == '/':
>>         return apache.DECLINED
>> 
>>     # Requests where there is no code file are ignored.
>> 
>>     stub, extn = posixpath.splitext(req.filename)
>>     directory = os.path.dirname(req.filename)
>>     target = stub + '.py'
>> 
>>     if not os.path.exists(target):
>>         return apache.DECLINED
>> 
>>     # Import the Python code file.
>> 
>>     module = apache.import_module(target)
>> 
>>     # Ensure that there is a handler within the code file for
>>     # the appropriate phase and resource extension.
>> 
>>     name = req.phase[6:].lower()
>> 
>>     if extn:
>>         name = "%s_%s" % (name, extn[1:])
>> 
>>     if not hasattr(module, name):
>>         return apache.DECLINED
>> 
>>     # Get a reference to the handler and invoke it.
>> 
>>     return getattr(module, name)(req)
>> 
>> 
>> postreadrequesthandler = _phasehandler
>> headerparserhandler = _phasehandler
>> accesshandler = _phasehandler
>> authenhandler = _phasehandler
>> authzhandler = _phasehandler
>> typehandler = _phasehandler
>> fixuphandler = _phasehandler
>> 
>> 
>> def handler(req):
>> 
>>     # Ensure that no additional path information has been
>>     # provided if it isn't desired. The default behaviour is to
>>     # reject such additional path information. Whether
>>     # additional path information is accepted can be specified
>>     # using the AcceptPathInfo directive or by setting the
>>     # req.used_path_info attribute appropriately in an earlier
>>     # phase.
>> 
>>     if req.used_path_info != apache.AP_REQ_ACCEPT_PATH_INFO:
>>         if req.path_info:
>>             return apache.HTTP_NOT_FOUND
>> 
>>     # Requests against the directory itself are forbidden.
>> 
>>     if req.filename[-1] == '/':
>>         return apache.HTTP_FORBIDDEN
>> 
>>     # Requests where there is no code file are rejected.
>> 
>>     stub, extn = posixpath.splitext(req.filename)
>>     directory = os.path.dirname(req.filename)
>>     target = stub + '.py'
>> 
>>     if not os.path.exists(target):
>>         return apache.HTTP_NOT_FOUND
>> 
>>     # Import the Python code file.
>> 
>>     module = apache.import_module(target)
>> 
>>     # Ensure that there is a handler within the code file for
>>     # the response handler phase.
>> 
>>     if extn:
>>         name = "handler_%s" % extn[1:]
>>     else:
>>         name = 'handler'
>> 
>>     if not hasattr(module, name):
>>         return apache.HTTP_NOT_FOUND
>> 
>>     # Get a reference to the handler and invoke it.
>> 
>>     return getattr(module, name)(req)
>> 
>> 
>> ------------------------------------------------------------------------
>> 
>> 
>> 
>> 
>

Re: Proposal for new handler to be added to mod_python 3.3.

Posted by Jim Gallacher <jp...@jgassociates.ca>.
-1 (On including in 3.3)

We need to have some release discipline. The beta cycle for 3.2 was 
something like 8 months long and I don't want to see that happen again. 
One of the reasons for that long cycle was the feeling that "it might be 
a long time before we do another release, so let's make this change 
now", which ultimately *caused* the delay.

There are a lot of really great things in 3.3 (new importer) and I think 
we should get it out as soon as possible. I'd also like to see a 3.4 
release fairly soon (4-6 months?) after a thorough audit for python 2.5 
64-bit support. If the new handler is accepted it may not have to wait 
too long. Also, if it's pure python it would be pretty painless for 
people to backport it if they want to use it in the mean time.

Maybe we should adopt some sort of calendar release policy. If we aim 
for a minor point release every 6 months then people will always know 
that they won't have to wait too long for the latest and greatest 
features to appear. This would alleviate the urge to stuff too much into 
any one release. I'm not suggesting that we be a slave to the calendar - 
just use it as a guideline.

Jim

Graham Dumpleton wrote:
> I know that we are very close to getting mod_python 3.3 out the door,
> and I know that it is me holding it up purely through the documentation
> not yet being updated to at least give some basic details on new bits
> in the rewritten module importer, but even at this late stage I have been
> thinking if we aren't with 3.3 missing a great opportunity to add a new
> basic mod_python dispatch handler. After all, it may well be some time
> before we get around to releasing the next version.
> 
> On that basis, I have quickly thrown together a new dispatch handler
> which I think could be included with mod_python. After seeing so many
> people going off and writing their own dispatchers with varied success I
> could see that including this would save a lot of mucking around for
> people.
> 
> I have attached the actual code for the handler module, but I'll explain
> a few things about it as well.
> 
> First off, intended that this would exist as 'mod_python.dispatcher'. In
> the simplest case, if you wanted to put all requests which fall within a
> directory through it, you would use:
> 
>   SetHandler mod_python
>   PythonHandler mod_python.dispatcher
> 
> The basic premise of the dispatcher is then that it will attempt to match
> any request against a resource against a handler found in a .py file
> corresponding to that specific resource.
> 
> Thus, if a request is made against '/index', the dispatcher would look for
> a .py file called 'index.py' in that directory and execute the 'handler()'
> function within it. If the request was made against '/index.html', the
> dispatcher would instead try and execute the 'handler_html()' function.
> 
>   from mod_python import apache
> 
>   def handler_html(req):
>     req.content_type = 'text/html'
>     req.write('<html><body><p>index</p></body></html>')
>     return apache.OK
> 
> So, an important aspect of the dispatcher is that it doesn't just push all
> requests for a resource through the one handler function. Instead, it will
> call different handlers within the one resource code file for the different
> extensions.
> 
> The benefit of this is that it becomes really easy to control what 
> extension
> you want a resource accessed under. Further, it is easy to provide 
> different
> views of a resource where the format of the result is different based on
> the extension.
> 
>   def handler_csv(req):
>     ...
> 
>   def handler_xml(req):
>     ...
> 
> Because of the way that Apache matches files, this will all work within any
> subdirectories as well. Thus if the request is against 
> '/subdir/index.html',
> the dispatcher will try and execute 'handler_html()' in 'subdir/index.py'.
> 
> If the directory is shared with static files which one wants to be served
> up as is, one can use the Files directive to indicate which files should be
> still served up as static files and not processed by mod_python.
> 
>   SetHandler mod_python
>   PythonHandler mod_python.dispatcher
> 
>   <Files *.jpg>
>   SetHandler None
>   </Files>
> 
> Instead of starting with SetHandler, one could instead start with 
> AddHandler.
> 
>   AddHandler mod_python .html
>   PythonHandler mod_python.dispatcher
> 
>   # Must block access to static .py files.
> 
>   <Files *.py>
>   deny from all
>   </Files>
> 
> In this way, only requests with a .html extension will be handled by the
> dispatcher. If there are individual .html files that you want served as
> static files or in some other way, you can again use the Files directive
> to mark specifically how they should be dealt with.
> 
>   <Files index.html>
>   SetHandler default-handler
>   </Files>
> 
> The next, trickier feature of this dispatcher is that it can be used for
> phases other than just the response handler phase. For example, if I want
> to be able to specify fixup handlers for individual resources, then I
> could use:
> 
>   SetHandler mod_python
>   PythonFixupHandler mod_python.dispatcher
>   PythonHandler mod_python.dispatcher
> 
> This would then allow me to say something like:
> 
>   from mod_python import apache
> 
>   def fixuphandler(req):
>     req.used_path_info = apache.AP_REQ_ACCEPT_PATH_INFO
>     return apache.OK
> 
>   def handler(req):
>     ...
> 
> What is happening here is that the default behaviour of the dispatcher is
> to reject any additional path information supplied with a request against
> a resource. Accepting it could be enabled on a per resource basis from the
> Apache configuration file:
> 
>   <Files resource>
>   AcceptPathInfo On
>   </Files>
> 
> or, as shown, a fixuphandler() function specific to that resource could be
> provided within the same resource code file and set req.used_path_info
> as appropriate instead.
> 
> Thus, any of the prior phases could be enabled on a case by case basis
> if they needed to be overridden by specific resources at some point.
> 
> Alternatively, one could set up the handler so it checks resource code files
> for the existence of a handler for any of the prior phases by using:
> 
>   SetHandler mod_python
>   PythonHandlerModule mod_python.dispatcher
> 
> That way, a single directive would allow other phases, such as the
> headerparserhandler and accesshandler phases, to also be overridden
> on a per resource basis.
> 
> Although the dispatcher is targeted at allowing handlers to be specified on
> a per resource basis, it is still possible to mix in other handlers which
> apply across all requests.
> 
> For example:
> 
>   PythonHeaderParserHandler moddir::directoryindex
> 
>   SetHandler mod_python
>   PythonHandlerModule mod_python.dispatcher
> 
>   from mod_python import apache
> 
>   # moddir.py
> 
>   def directoryindex(req):
>     if req.content_type == 'httpd/unix-directory':
>       req.filename = req.filename + 'index.html'
>       req.finfo = apache.stat(req.filename)
>       req.content_type = 'text/html'
>     return apache.OK
> 
> Thus one can emulate the DirectoryIndex directive, which otherwise would
> not work with this dispatcher and causes mod_python other problems anyway
> because of bugs with internal fast redirects in Apache.
> 
> One final example with this. One need not even have an actual response
> handler defined for a resource, and can still define a fixup handler. For
> example, if one wanted to use SSI for a specific resource which existed as
> a static .html file, one could do this with:
> 
>   def fixuphandler_html(req):
>     req.handler = 'default-handler'
>     req.add_output_filter('INCLUDES')
>     req.ssi_globals = { ... }
>     return apache.OK
> 
> All up, although the actual code for this new dispatch handler is quite
> small, it allows for a great deal of flexibility and would probably give
> users more to work with than what most people come up with for a simple
> dispatcher.
> 
> For those who have been trying mod_python 3.3, it would be great to see you
> give this new handler a go and see what you think. Any comments are most
> welcome, especially on whether or not it is something people feel would be
> worthwhile to include in mod_python 3.3. Note that this will not work with
> versions of mod_python older than 3.3.
> 
> BTW, to test this, just throw it in a directory in your document tree and
> set up the .htaccess file to contain:
> 
>   SetHandler mod_python
>   PythonHandlerModule _handlers
>   PythonDebug On
> 
>   <Files *.py>
>   deny from all
>   </Files>
> 
> Just don't make requests against '/_handlers', as it will loop back on
> itself given that it isn't meant to be in the document tree itself.
> 
> Have fun.
> 
> Graham
> 
> 
> 
> ------------------------------------------------------------------------
> 
> from mod_python import apache
> 
> import os, posixpath
> 
> def _phasehandler(req):
> 
>     # Ignore requests made against the directory.
> 
>     if req.filename[-1] == '/':
>         return apache.DECLINED
> 
>     # Requests where there is no code file are ignored.
> 
>     stub, extn = posixpath.splitext(req.filename)
>     directory = os.path.dirname(req.filename)
>     target = stub + '.py'
> 
>     if not os.path.exists(target):
>         return apache.DECLINED
> 
>     # Import the Python code file.
> 
>     module = apache.import_module(target)
> 
>     # Ensure that there is a handler within the code file for
>     # the appropriate phase and resource extension.
> 
>     name = req.phase[6:].lower()
> 
>     if extn:
>         name = "%s_%s" % (name, extn[1:])
> 
>     if not hasattr(module, name):
>         return apache.DECLINED
> 
>     # Get a reference to the handler and invoke it.
> 
>     return getattr(module, name)(req)
> 
> 
> postreadrequesthandler = _phasehandler
> headerparserhandler = _phasehandler
> accesshandler = _phasehandler
> authenhandler = _phasehandler
> authzhandler = _phasehandler
> typehandler = _phasehandler
> fixuphandler = _phasehandler
> 
> 
> def handler(req):
> 
>     # Ensure that no additional path information has been
>     # provided if it isn't desired. The default behaviour is to
>     # reject such additional path information. Whether
>     # additional path information is accepted can be specified
>     # using the AcceptPathInfo directive or by setting the
>     # req.used_path_info attribute appropriately in an earlier
>     # phase.
> 
>     if req.used_path_info != apache.AP_REQ_ACCEPT_PATH_INFO:
>         if req.path_info:
>             return apache.HTTP_NOT_FOUND
> 
>     # Requests against the directory itself are forbidden.
> 
>     if req.filename[-1] == '/':
>         return apache.HTTP_FORBIDDEN
> 
>     # Requests where there is no code file are rejected.
> 
>     stub, extn = posixpath.splitext(req.filename)
>     directory = os.path.dirname(req.filename)
>     target = stub + '.py'
> 
>     if not os.path.exists(target):
>         return apache.HTTP_NOT_FOUND
> 
>     # Import the Python code file.
> 
>     module = apache.import_module(target)
> 
>     # Ensure that there is a handler within the code file for
>     # the response handler phase.
> 
>     if extn:
>         name = "handler_%s" % extn[1:]
>     else:
>         name = 'handler'
> 
>     if not hasattr(module, name):
>         return apache.HTTP_NOT_FOUND
> 
>     # Get a reference to the handler and invoke it.
> 
>     return getattr(module, name)(req)
> 
> 
> ------------------------------------------------------------------------
> 
> 
> 
> 
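The name lookup described in the quoted proposal can be tried outside Apache. What follows is only a minimal sketch: resolve_handler() is a hypothetical helper, not part of mod_python or of the attached module, and its two arguments stand in for req.filename and req.phase. It mirrors the attached _phasehandler() logic, on the assumption that req.phase carries names such as 'PythonHandler' or 'PythonFixupHandler'.

```python
import posixpath


def resolve_handler(filename, phase):
    """Return (code_file, function_name) that the dispatcher would look up.

    Hypothetical helper for illustration only: 'filename' stands in for
    req.filename and 'phase' for req.phase.
    """
    # Split '/docs/index.html' into '/docs/index' and '.html'.
    stub, extn = posixpath.splitext(filename)

    # The code file sits next to the requested resource.
    target = stub + '.py'

    # Drop the leading 'Python' and lower-case the remainder, so that
    # 'PythonFixupHandler' becomes 'fixuphandler'.
    name = phase[len('Python'):].lower()

    # A non-empty extension selects an extension-specific function.
    if extn:
        name = '%s_%s' % (name, extn[1:])

    return target, name


if __name__ == '__main__':
    for path, phase in [('/docs/index', 'PythonHandler'),
                        ('/docs/index.html', 'PythonHandler'),
                        ('/docs/subdir/index.html', 'PythonFixupHandler')]:
        print(path, phase, resolve_handler(path, phase))
```

Note that the real dispatcher also checks that the code file exists and that the function is defined, declining the request or returning an error otherwise; this sketch covers only the naming step.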


Re: Proposal for new handler to be added to mod_python 3.3.

Posted by Jeff Hinrichs - DM&T <je...@dundeemt.com>.
On 11/19/06, Graham Dumpleton <gr...@dscpl.com.au> wrote:
Graham,

Is this based on your Vampire work? If so, does it handle extra path
info like Vampire? Also, does it handle web services like Vampire
too? I've used Vampire before and I like it.

I'm updating my repos now.

Thanks,

Jeff
p.s.
+1 on the new handler