Posted to mod_python-dev@quetz.apache.org by "Gregory (Grisha) Trubetskoy" <gr...@apache.org> on 2003/04/04 21:45:33 UTC

Python Server Pages

I am delighted to announce that in a little private exchange Sterling
Hughes has agreed to work on integrating the code from mod_psp into
mod_python. (Just google for mod_psp for details).

I guess the point of this e-mail is to bring the matter to everyone's
attention and to give people an opportunity to pitch in with
ideas/opinions/etc.

Grisha


Re: Python Server Pages

Posted by David Fraser <da...@sjsoft.com>.
Sterling Hughes wrote:

>On Fri, 2003-04-04 at 14:45, Gregory (Grisha) Trubetskoy wrote:
>  
>
>>I am delighted to announce that in a little private exchange Sterling
>>Hughes has agreed to work on integrating the code from mod_psp into
>>mod_python. (Just google for mod_psp for details).
>>
>>I guess the point of this e-mail is to bring the matter to everyone's
>>attention and to give people an opportunity to pitch in with
>>ideas/opinions/etc.
>>    
>>
>
>
>Hi,
>
>I figured I'd take this opportunity to say hello, and discuss some of my
>plans for mod_python + psp.
>
Great, seems like a simple way to do simple things, I read all your 
stuff on the web about it...

>First a bit about mod_psp
>(http://www.edwardbear.org/mod_psp-0.3.tar.gz).  mod_psp was created to
>create a PHP-esque environment for python code.  It sports a syntax for
>embedding python code within HTML, an apache module that registers MIME
>types, and a little toolkit for common tasks (like accessing GET, POST
>and cookie variables):
>
><?psp
>import time
>?>
><html>
><head>
><title>Hello World</title>
></head>
><body>
><h1>Hello World</h1>
>It is now <i><?=time.strftime("%Y-%m-%d, %H:%M")?></i>
></body>
></html>
>
>
>It supports code blocks and indentation via two methods, either using
>whitespace to denote html blocks:
>
><?psp if foo: ?>
>	FOO is here
>	<?psp if bar: ?>
>		BAR is here
>	Some More FOO
>Whitespace block is over, printed regardless
>
>Or by denoting things with { and } like in traditional languages:
>
><?psp if foo: { ?>
>	FOO is here
>	<?psp if bar: ?>
>		BAR is here
>	Some More Foo
><?psp } ?>
>Block of FOO is over, printed regardless
>
>These two indentation styles can be mixed and matched.
>
I saw on your web page you asked whether people thought the {} 
indentation was a good idea or not ... don't know if you've settled on 
it or not, but I would think it takes away one of the main strengths of 
Python (block indentation) and as you said makes your parsing job more 
complex...
Just my thought...

David



Re: Copyright

Posted by Sterling Hughes <st...@bumblebury.com>.
On Wed, 2003-04-09 at 10:01, Gregory (Grisha) Trubetskoy wrote:
> I noticed that the psp_parser.l file still has "Copyright (c) 2003
> Sterling Hughes" - I don't know whether this is against any ASF policy,
> and also if it can be changed just like that, or if the change of
> copyright requires some legal paperwork - perhaps Greg can comment on
> this.
> 
> I'm also thinking that the way to give credit where it is due would be
> to change the COPYRIGHT so that it contains something like:
> 
> [ ... ]
>  * This software consists of voluntary contributions made by many
>  * individuals on behalf of the Apache Software Foundation.  For more
>  * information on the Apache Software Foundation, please see
>  * <http://www.apache.org/>.
>  *
>  * Originally developed by Gregory Trubetskoy.
>  * Python Server Pages originally developed by Sterling Hughes
>  *
> [ ... ]
> 

Ohh, didn't even notice that (must have added it really late at night
:).  Fine with me.

-Sterling

> Grisha
-- 
"First they ignore you, then they laugh at you,  
 then they fight you, then you win."  
    - Gandhi


Copyright

Posted by "Gregory (Grisha) Trubetskoy" <gr...@apache.org>.
I noticed that the psp_parser.l file still has "Copyright (c) 2003
Sterling Hughes" - I don't know whether this is against any ASF policy,
and also if it can be changed just like that, or if the change of
copyright requires some legal paperwork - perhaps Greg can comment on
this.

I'm also thinking that the way to give credit where it is due would be
to change the COPYRIGHT so that it contains something like:

[ ... ]
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 *
 * Originally developed by Gregory Trubetskoy.
 * Python Server Pages originally developed by Sterling Hughes
 *
[ ... ]

Grisha




Re: Python Server Pages

Posted by Sterling Hughes <st...@bumblebury.com>.
On Tue, 2003-04-08 at 01:02, Gregory (Grisha) Trubetskoy wrote:
> Well - I tried it and it works! :-)
> 
> I've got some comments/ideas, but I'll send another e-mail when I'm able
> to think better.
> 

Great, I look forward to it.  I should note that at this point it's
largely a (pretty reliable) reference implementation.  There are a few
issues to consider when merging the two, one of the major challenges
being how to make the lexer threadsafe.

-Sterling
-- 
"Science is like sex: sometimes something useful comes out, 
but that is not the reason we are doing it." 
    - Richard Feynman


Re: Python Server Pages

Posted by "Gregory (Grisha) Trubetskoy" <gr...@apache.org>.
Well - I tried it and it works! :-)

I've got some comments/ideas, but I'll send another e-mail when I'm able
to think better.

Thanks again, Sterling, for this enormously valuable contribution!

Grisha

On 5 Apr 2003, Sterling Hughes wrote:

> Ok, the two are now somewhat integrated..  I've attached a patch, and a
> set of files that accompany it, showing the functionality working
> together.  It turns out python's exec works fine on code objects
> (cool!), so the handler works perfectly.
>
> The Apache Handler to enable this is:
>
> <Directory /data/www/modpsp/htdocs/>
>     AddHandler python-program .psp
>     PythonHandler mptest
>     PythonDebug On
> </Directory>
>
> Once you have it installed, copy mptest.py and tst1.psp to the
> appropriate directory, and try out tst1.psp.
>
> -Sterling
>


Re: Python Server Pages

Posted by Sterling Hughes <st...@bumblebury.com>.
Ok, the two are now somewhat integrated..  I've attached a patch, and a
set of files that accompany it, showing the functionality working
together.  It turns out python's exec works fine on code objects
(cool!), so the handler works perfectly.
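
For readers following along, a minimal sketch of the exec-on-code-objects idea. It uses the function-call form exec(code, namespace), whereas the snippets in this thread use the Python 2 statement form "exec code". FakeRequest and the page source are assumptions made up for illustration; they are not part of mod_python or mod_psp.

```python
# Sketch: compile source once, then exec the resulting code object.
# FakeRequest is a hypothetical stand-in for mod_python's request object.

class FakeRequest:
    def __init__(self):
        self.chunks = []

    def write(self, text):
        # Collect output instead of sending it to a client.
        self.chunks.append(text)


source = 'req.write("<h1>Hello World</h1>")\n'

# compile() returns a code object; "exec" mode accepts whole statements.
code = compile(source, "<psp-page>", "exec")

req = FakeRequest()
# Run the code object with `req` visible as a global name.
exec(code, {"req": req})

output = "".join(req.chunks)
```

Passing the namespace explicitly is also one possible answer to how the request object could be made visible to the inlined code just prior to exec.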

The Apache Handler to enable this is:

<Directory /data/www/modpsp/htdocs/>
    AddHandler python-program .psp
    PythonHandler mptest
    PythonDebug On
</Directory>

Once you have it installed, copy mptest.py and tst1.psp to the
appropriate directory, and try out tst1.psp.

-Sterling

Re: Python Server Pages

Posted by "Gregory (Grisha) Trubetskoy" <gr...@apache.org>.
On 5 Apr 2003, Sterling Hughes wrote:

> response.write("<html>")

Ah! Here is another thing that will need to be thought about - mod_python
doesn't have a response, only a request. This is obviously because httpd
only has a request. So you use req.write(), req.read(), change
req.headers_out, req.headers_in, etc.

> My main concern with a handler is the speed issues, although it seems
> to be a good starting point.  The solution you present would require
> reparsing the code each request.

It shouldn't - _psp.so should be able to keep its own cache of code
objects. _psp.so will live as long as the httpd process lives by virtue of
Python's import mechanism.

> If we do this internally, it allows us to be more intelligent about
> caching code objects.  Currently a typical mod_psp request requires a
> hashlookup + stat, that's it.  (well, ok, realpath() causes an extra
> stat, or 5 on messed up systems, but that can be eliminated with
> apache2).
>
> Is there any way for python to evaluate a compiled PyObject * OP
> structure directly in Python code?

I *think* it will.

> If so, I think this would work well as a permanent solution too
> (although I'd still like to add a MIME handler in the code, it's just
> easier for people to install.)

I think another mime handler is not a bad idea - in any case it's only a 2
line patch to add that.

Grisha


Re: Python Server Pages

Posted by Sterling Hughes <st...@bumblebury.com>.
On Sat, 2003-04-05 at 00:17, Gregory (Grisha) Trubetskoy wrote: 
> On 4 Apr 2003, Sterling Hughes wrote:
> 
> > PSP supports
> > modifying request and response parameters, however it also provides
> > reasonable defaults.  So for example, the content-type of an application
> > starts with text/html.  Headers are automatically sent when the
> > application first produces output, etc.
> 
> If I understood it correctly, this is made available via a global request
> object - it might be a better option if this was the mod_python's request
> object, which I think has more features. I don't know much about the
> mod_psp request, but mod_python's is modeled heavily after apache's
> request_rec. The details are here:
> http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html
> 
> BTW, the latest mod_python uses Python's new class implementation, whereas
> the 2.x version uses the old "built-in" object style. The implication is
> that new objects require a Python version higher than 2.2.1 (I think).
> 

Yep, I'm thinking more and more a handler is the way to go, I just have
one concern (below.)  

I was bit by this while trying to implement primitive support for psp in
mod_python.

> > 1) mod_python gains two new MIME handlers, application/x-httpd-python
> > and application/x-httpd-psp.  These MIME handlers behave similarly to
> > how they did in mod_psp.
> 
> In mod_python I didn't use MIME types, it is mapped by use of AddHandler,
> don't know if this is good or bad. So the association of a file with
> mod_python typically looks like this:
> 
> AddHandler python-program .py
> 
> (As a side note, I think I saw that mod_perl's new handler name is
> mod_perl, so their config looks like "AddHandler mod_perl .pl", which I
> think is an idea worth borrowing).

I agree. :)

> Anyway - if someone could explain the advantages/disadvantages of using
> mime types, I'd appreciate it, since I don't seem to grasp the idea at the
> moment.

Well, from my perspective MIME types are the clear way of saying "every
file ending in .psp is a python server pages file," and "every file
ending in .py is a python file."

MIME types are catch all, whereas handlers are meant more specifically. 
At least that's my understanding.

> In any event, the handling of handlers is different between httpd 1.3 and
> 2.0. In 1.3 you passed a list of handlers/mime-types as part of module
> config, and then httpd took care of calling your handler when needed. In
> httpd 2.0 the handlers are called for every request and the responsibility
> for examining the mime type now belongs to the individual module, so
> handler code typically begins with something like:
> 
>     if (!req->handler || strcmp(req->handler, "python-program"))
>         return DECLINED;
> 
> So there is a (probably insignificant) performance hit to adding new
> types.
> 
> The mod_python request processing looks something like this:
> 
>     python_handler() in mod_python.c
>         o  check to see whether there is a mod_python handler (not to
>              be confused with apache handler - mod_python handler is
>              the name of a python module which will handle the request and
>              it is set via PythonHandler directive). It's possible to
>              specify multiple handlers, in which case they will be called
>              sequentially. There is a special C structure called hstack
>              which contains this list of handlers. Handlers can also be
>              added onto the stack dynamically from inside Python.
>         o  select subinterpreter name based on virtual host name
> 	     and possibly directory/file name
>         o  if this subinterpreter exists, switch to it, otherwise create
>              it, then switch to it. This is done in get_interpreter().
>              get_interpreter() actually imports the mod_python.apache
>              module (written in Python) and instantiates a CallBack obj.
>              (the instantiation is only done once when subinterpreter is
>              created, from there on the object is reused)
>         o  create the mod_python request object
>         o  call into Python (specifically call CallBack.HandlerDispatch())
> 	   -- from this point we're running Python code --
>            1. HandlerDispatch() looks at the hstack object to find out
>                which python handler (i.e. a normal Python module) to
>                import
>            2. The module is imported (and possibly re-imported if it
>                changed on disk since last import)
>            3. The name of the module is removed from hstack
>            4. Control is passed to a method inside this module.
>               o This is where content is generated, and written
>                   to the client using req.write()
>            5. Loop back to step 1. and repeat until hstack is empty
> 
> > 2) The Python Server Pages functionality gets exported in a traditional
> > mod_python module, and therefore you could do something like:
> >
> > from mod_python import psp
> >
> > def handler(req):
> > 	code = psp.parse(filename)
> >
> > You can also use the psp.include() function to parse and execute psp
> > code from within mod_python handlers and application/x-httpd-python
> > python scripts.  mod_psp already has this functionality.
> 
> So as a starting point, given that there'd exist a mod_python._psp module,
> (which would be _psp.so I guess) a mod_python handler could be implemented
> without any changes to mod_python and look something like:
> 
> from mod_python import apache
> from mod_python import _psp
> 
> def handler(req):
> 
>     try:
>         # the request object is passed in because somehow it would
>         # be made available to the inlined Python, or is this something
>         # that can be done just prior to exec?
>         code = _psp.parse(req, filename)
>         exec code
>     except SomeErrors:
> 
>         # do something...
>         return apache.SOME_ERROR
> 
>     return apache.OK
> 
> If the above code snippet lived in a file called psp.py, then a typical
> httpd.conf would look like:
> 
> AddHandler python-program .psp
> PythonHandler mod_python.psp
> 
> OK, now that I just typed all this I realized I don't know how mod_psp
> sends its output to the client - in mod_python it's done via
> req.write()...

response.write() in mod_psp, modeled after (*gasp* ;) asp.  A script
such as:

<html>
<head>
<title>Hello World</title>
</head>
<body>
Booh!
</body>
</html>

is translated into: 

response.write("<html>")
response.write("<head>")
response.write("<title>Hello World</title>")
response.write("</head>")
response.write("<body>")
response.write("Booh!")
response.write("</body>")
response.write("</html>")

Hacking this into req.write() is a 4 line patch.
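
To picture the difference between the two write paths, here is a small sketch of an adapter that lets generated code keep calling response.write() while the bytes actually go through req.write(). This is not the patch mentioned above; Response and FakeRequest are hypothetical names invented for the illustration.

```python
class Response:
    """Hypothetical adapter: gives generated code a response.write()
    that forwards to the request object's write()."""

    def __init__(self, req):
        self._req = req

    def write(self, text):
        self._req.write(text)


class FakeRequest:
    """Stand-in for mod_python's request; records what is written."""

    def __init__(self):
        self.sent = []

    def write(self, text):
        self.sent.append(text)


req = FakeRequest()
response = Response(req)
response.write("<html>")
response.write("Booh!")
response.write("</html>")
page = "".join(req.sent)
```

The alternative is to have the parser emit req.write(...) directly, which avoids the extra object at the cost of changing the generated code.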

> Now, if we want a special mime-type for psp, then I think the above is
> still good, we just need to insert a snippet of code at the beginning of
> python_handler() C function that adds mod_python.psp to hstack if it sees
> that particular mime type.
> 
> Feel free to comment, the above was typed straight off the top of my head
> and probably contains errors/omissions, but I think we have a starting
> point of a discussion.


My main concern with a handler is the speed issues, although it seems
to be a good starting point.  The solution you present would require
reparsing the code each request.  If we do this internally, it allows us
to be more intelligent about caching code objects.  Currently a typical
mod_psp request requires a hashlookup + stat, that's it.  (well, ok,
realpath() causes an extra stat, or 5 on messed up systems, but that can
be eliminated with apache2).

Is there any way for python to evaluate a compiled PyObject * OP
structure directly in Python code?  If so, I think this would work well
as a permanent solution too (although I'd still like to add a MIME
handler in the code, it's just easier for people to install.)

-Sterling




Re: Python Server Pages

Posted by "Gregory (Grisha) Trubetskoy" <gr...@apache.org>.
On 4 Apr 2003, Sterling Hughes wrote:

> PSP supports
> modifying request and response parameters, however it also provides
> reasonable defaults.  So for example, the content-type of an application
> starts with text/html.  Headers are automatically sent when the
> application first produces output, etc.

If I understood it correctly, this is made available via a global request
object - it might be a better option if this was the mod_python's request
object, which I think has more features. I don't know much about the
mod_psp request, but mod_python's is modeled heavily after apache's
request_rec. The details are here:

http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html

BTW, the latest mod_python uses Python's new class implementation, whereas
the 2.x version uses the old "built-in" object style. The implication is
that new objects require a Python version higher than 2.2.1 (I think).

> 1) mod_python gains two new MIME handlers, application/x-httpd-python
> and application/x-httpd-psp.  These MIME handlers behave similarly to
> how they did in mod_psp.

In mod_python I didn't use MIME types, it is mapped by use of AddHandler,
don't know if this is good or bad. So the association of a file with
mod_python typically looks like this:

AddHandler python-program .py

(As a side note, I think I saw that mod_perl's new handler name is
mod_perl, so their config looks like "AddHandler mod_perl .pl", which I
think is an idea worth borrowing).

Anyway - if someone could explain the advantages/disadvantages of using
mime types, I'd appreciate it, since I don't seem to grasp the idea at the
moment.

In any event, the handling of handlers is different between httpd 1.3 and
2.0. In 1.3 you passed a list of handlers/mime-types as part of module
config, and then httpd took care of calling your handler when needed. In
httpd 2.0 the handlers are called for every request and the responsibility
for examining the mime type now belongs to the individual module, so
handler code typically begins with something like:

    if (!req->handler || strcmp(req->handler, "python-program"))
        return DECLINED;

So there is a (probably insignificant) performance hit to adding new
types.

The mod_python request processing looks something like this:

    python_handler() in mod_python.c
        o  check to see whether there is a mod_python handler (not to
             be confused with apache handler - mod_python handler is
             the name of a python module which will handle the request and
             it is set via PythonHandler directive). It's possible to
             specify multiple handlers, in which case they will be called
             sequentially. There is a special C structure called hstack
             which contains this list of handlers. Handlers can also be
             added onto the stack dynamically from inside Python.
        o  select subinterpreter name based on virtual host name
	     and possibly directory/file name
        o  if this subinterpreter exists, switch to it, otherwise create
             it, then switch to it. This is done in get_interpreter().
             get_interpreter() actually imports the mod_python.apache
             module (written in Python) and instantiates a CallBack obj.
             (the instantiation is only done once when subinterpreter is
             created, from there on the object is reused)
        o  create the mod_python request object
        o  call into Python (specifically call CallBack.HandlerDispatch())
	   --- from this point we're running Python code --
           1. HandlerDispatch() looks at the hstack object to find out
               which python handler (i.e. a normal Python module) to
               import
           2. The module is imported (and possibly re-imported if it
               changed on disk since last import)
           3. The name of the module is removed from hstack
           4. Control is passed to a method inside this module.
              o This is where content is generated, and written
                  to the client using req.write()
           5. Loop back to step 1. and repeat until hstack is empty
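
The Python-side loop in steps 1 to 5 can be sketched roughly as follows. This is not the real CallBack.HandlerDispatch(); the MODULES table, make_module(), FakeRequest and the OK constant are stand-ins invented so the sketch runs without Apache, and the real code imports handler modules from disk rather than looking them up in a dict.

```python
import types

OK = 0  # hypothetical status constant, standing in for apache.OK


def make_module(name, source):
    """Build an in-memory module so the sketch runs without Apache."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# Hypothetical handler modules; real ones would be imported from disk.
MODULES = {
    "first": make_module(
        "first", "def handler(req):\n    req.write('one ')\n    return 0\n"),
    "second": make_module(
        "second", "def handler(req):\n    req.write('two')\n    return 0\n"),
}


class FakeRequest:
    """Stand-in for the mod_python request object."""

    def __init__(self):
        self.sent = []

    def write(self, text):
        self.sent.append(text)


def handler_dispatch(req, hstack):
    """Call each handler named in hstack, in order, until it is empty."""
    status = OK
    while hstack:                     # step 5: repeat until empty
        name = hstack.pop(0)          # steps 1 and 3: take the next name
        module = MODULES[name]        # step 2: stands in for the import
        status = module.handler(req)  # step 4: the handler writes output
    return status


req = FakeRequest()
status = handler_dispatch(req, ["first", "second"])
body = "".join(req.sent)
```

Because hstack is an ordinary list here, a handler could append another name to it while the loop runs, which mirrors the note that handlers can be added dynamically from inside Python.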

> 2) The Python Server Pages functionality gets exported in a traditional
> mod_python module, and therefore you could do something like:
>
> from mod_python import psp
>
> def handler(req):
> 	code = psp.parse(filename)
>
> You can also use the psp.include() function to parse and execute psp
> code from within mod_python handlers and application/x-httpd-python
> python scripts.  mod_psp already has this functionality.

So as a starting point, given that there'd exist a mod_python._psp module,
(which would be _psp.so I guess) a mod_python handler could be implemented
without any changes to mod_python and look something like:

from mod_python import apache
from mod_python import _psp

def handler(req):

    try:
        # the request object is passed in because somehow it would
        # be made available to the inlined Python, or is this something
        # that can be done just prior to exec?
        code = _psp.parse(req, filename)
        exec code
    except SomeErrors:

        # do something...
        return apache.SOME_ERROR

    return apache.OK

If the above code snippet lived in a file called psp.py, then a typical
httpd.conf would look like:

AddHandler python-program .psp
PythonHandler mod_python.psp

OK, now that I just typed all this I realized I don't know how mod_psp
sends its output to the client - in mod_python it's done via
req.write()...

Now, if we want a special mime-type for psp, then I think the above is
still good, we just need to insert a snippet of code at the beginning of
python_handler() C function that adds mod_python.psp to hstack if it sees
that particular mime type.

Feel free to comment, the above was typed straight off the top of my head
and probably contains errors/omissions, but I think we have a starting
point of a discussion.

Grisha


Re: Python Server Pages

Posted by Sterling Hughes <st...@bumblebury.com>.
On Fri, 2003-04-04 at 14:45, Gregory (Grisha) Trubetskoy wrote:
> I am delighted to announce that in a little private exchange Sterling
> Hughes has agreed to work on integrating the code from mod_psp into
> mod_python. (Just google for mod_psp for details).
> 
> I guess the point of this e-mail is to bring the matter to everyone's
> attention and to give people an opportunity to pitch in with
> ideas/opinions/etc.


Hi,

I figured I'd take this opportunity to say hello, and discuss some of my
plans for mod_python + psp.

First a bit about mod_psp
(http://www.edwardbear.org/mod_psp-0.3.tar.gz).  mod_psp was created to
create a PHP-esque environment for python code.  It sports a syntax for
embedding python code within HTML, an apache module that registers MIME
types, and a little toolkit for common tasks (like accessing GET, POST
and cookie variables):

<?psp
import time
?>
<html>
<head>
<title>Hello World</title>
</head>
<body>
<h1>Hello World</h1>
It is now <i><?=time.strftime("%Y-%m-%d, %H:%M")?></i>
</body>
</html>


It supports code blocks and indentation via two methods, either using
whitespace to denote html blocks:

<?psp if foo: ?>
	FOO is here
	<?psp if bar: ?>
		BAR is here
	Some More FOO
Whitespace block is over, printed regardless

Or by denoting things with { and } like in traditional languages:

<?psp if foo: { ?>
	FOO is here
	<?psp if bar: ?>
		BAR is here
	Some More Foo
<?psp } ?>
Block of FOO is over, printed regardless

These two indentation styles can be mixed and matched.  PSP supports
modifying request and response parameters, however it also provides
reasonable defaults.  So for example, the content-type of an application
starts with text/html.  Headers are automatically sent when the
application first produces output, etc.

The core idea behind PSP is to make developing web applications with
Python as simple as humanly possible.  My first (and only) web
application with PSP can be found at:
http://www.edwardbear.org/guestbook.psp.txt (it ends in .txt because IE
ignores MIME headers).

PSP is internally written completely in C.  It consists of a parser
(built with flex), which rewrites PSP files into python files (plain html
becomes response.write("html code")), code that executes the python file,
and support scaffolding that gives access to the apache api.
mod_psp registers two MIME types, application/x-httpd-python and
application/x-httpd-psp, which allow you to natively run normal python
files (no rewriting) and PSP files.
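
As a rough illustration of the rewriting step, here is a small Python sketch (the real parser is C and flex, so this is only a model of the idea). translate() and the Response class are hypothetical names; the sketch handles flat code only and ignores both the whitespace and the { } block handling described earlier.

```python
import re

# Matches <?psp ... ?> code blocks and <?= ... ?> expressions.
TAG = re.compile(r"<\?(psp|=)(.*?)\?>", re.DOTALL)


def translate(source):
    """Turn PSP-style text into Python source (flat code only)."""
    lines = []
    pos = 0
    for match in TAG.finditer(source):
        text = source[pos:match.start()]
        if text:
            # Plain html becomes a response.write() call.
            lines.append("response.write(%r)" % text)
        kind, body = match.group(1), match.group(2).strip()
        if kind == "=":
            # <?= expr ?> writes the value of the expression.
            lines.append("response.write(str(%s))" % body)
        else:
            # <?psp ... ?> statements are copied through, one per line.
            lines.extend(s.strip() for s in body.splitlines() if s.strip())
        pos = match.end()
    tail = source[pos:]
    if tail:
        lines.append("response.write(%r)" % tail)
    return "\n".join(lines) + "\n"


class Response:
    """Hypothetical output collector used to run the generated code."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)


page = "<?psp\nx = 6 * 7\n?><b><?= x ?></b>\n"
python_source = translate(page)
response = Response()
exec(compile(python_source, "<page>", "exec"), {"response": response})
rendered = "".join(response.parts)
```

Using repr() for the html text keeps quotes and newlines in the page from breaking the generated Python.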

One of the cooler things it does (don't know if mod_python does this),
is internally cache the compiled python objects.  Therefore, you only
need to compile python scripts once per apache process (it stat()s the
files each time to check for changes).

I took a little time today to look at mod_python's source code, and it
seems that mod_psp could live very happily inside of mod_python without
being too invasive.  What I think would be an ideal thing is if the
following happened:

1) mod_python gains two new MIME handlers, application/x-httpd-python
and application/x-httpd-psp.  These MIME handlers behave similarly to
how they did in mod_psp.

2) The Python Server Pages functionality gets exported in a traditional
mod_python module, and therefore you could do something like:

from mod_python import psp

def handler(req):
	code = psp.parse(filename)

You can also use the psp.include() function to parse and execute psp
code from within mod_python handlers and application/x-httpd-python
python scripts.  mod_psp already has this functionality.

Anyhow, this is how I envision the two merging.  I'm sorry to have
written such a long mail, I just prefer to be verbose. ;-)  I'm really
excited about getting the two to work together, and am interested to
hear other people's opinions as to how this should be integrated.

-Sterling