Posted to mod_python-dev@quetz.apache.org by Daniel Popowich <dp...@comcast.net> on 2005/08/09 22:27:37 UTC

_apache._global_lock theory

The recent discussion of max locks and deadlocking issues with
_apache._global_(un)?lock() is timely for me:

I'm in the middle of writing a caching module for mod_python servlets
so a developer can have the output of a servlet cached, keyed on the
hash of the URI, for future requests.  The goal, of course, is to
increase throughput for dynamic pages with long shelf life (e.g.,
content manager updates database once per day, so the HTML only needs
to be generated once per day, not per request).
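As a rough illustration of "keyed on the hash of the URI", here is a minimal sketch of turning a URI into a cache file path. The function name cache_path, the use of SHA-1, and the ".html" suffix are assumptions made for this example; the thread does not say which hash or file layout the caching module uses.

```python
import hashlib
import os


def cache_path(cache_dir, uri):
    """Map a request URI to a file path inside cache_dir.

    Hypothetical layout: one file per URI, named after the SHA-1 hex
    digest of the URI so that arbitrary URIs become safe file names.
    """
    digest = hashlib.sha1(uri.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".html")
```

The same URI always maps to the same path, so any process handling the request can find the cached output without shared in-memory state.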

I need locking.  My first gut instinct was to go to the fcntl module
(lockf), but this is only available for unix.  My second gut instinct
was: so what?  :-)

Wanting a true x-platform tool I then thought of mod_python.Session
needing locks, poked around the code, and saw how it uses
_apache._global_lock().  Further poking around showed me how psp code
caching uses the function.  About the time I began worrying about
deadlock issues I found the thread on this list discussing the same
problems.

The solution for Session and psp code caching is to explicitly use lock
id 0.  This works as long as a module does not hold the lock for the
whole request, but unlocks immediately after acquiring the needed
resource.  Fine.

So, my question: is this the recommended way for mod_python framework
developers to acquire x-platform global locks?  Explicitly use lock
id 0?  If so, is this a secret or should it be documented?

Daniel Popowich
-----------------------------------------------
http://home.comcast.net/~d.popowich/mpservlets/

Re: _apache._global_lock theory

Posted by Jim Gallacher <jg...@sympatico.ca>.
Daniel Popowich wrote:
> Jim Gallacher writes:
> 
>>Daniel Popowich wrote:
>>
>>>The recent discussion of max locks and deadlocking issues with
>>>_apache._global_(un)?lock() are timely for me:
>>>
>>>I'm in the middle of writing a caching module for mod_python servlets
>>>so a developer can have the output of a servlet cached, keyed on the
>>>hash of the URI, for future requests.  The goal, of course, is to
>>>increase throughput for dynamic pages with long shelf life (e.g.,
>>>content manager updates database once per day, so the HTML only needs
>>>to be generated once per day, not per request).
>>>
>>>I need locking.  My first gut instinct was to go to the fcntl module
>>>(lockf), but this is only available for unix.  My second gut instinct
>>>was: so what?  :-)
>>>
>>>Wanting a true x-platform tool I then thought of mod_python.Session
>>>needing locks, poked around the code, and saw how it uses
>>>_apache._global_lock().  Further poking around showed me how psp code
>>>caching uses the function.  About the time I began worrying about
>>>deadlock issues I found the thread on this list discussing the same
>>>problems.
>>>
>>>The solution for Session and psp code caching is to explicitly use lock
>>>id 0.  This works as long as a module does not hold the lock for the
>>>whole request, but unlocks immediately after acquiring the needed
>>>resource.  Fine.
>>
>>Just to be clear, sessions use the locks above index 0 for session 
>>locking. The session id is hashed to determine which index is used. 
>>DbmSession uses lock index 0 to lock the dbm file for reading and 
>>writing the persistent session data. This is independent of the session 
>>lock.
> 
> 
> Right, not sessions proper, but by the mod_python.Session module for
> dbm file reading.
> 
> 
>>>So, my question: is this the recommended way for mod_python framework
>>>developers to acquire x-platform global locks?  Explicitly use lock
>>>id 0?  If so, is this a secret or should it be documented?
>>
>>I don't know if it's recommended, but I don't see a problem as long as 
>>the lock is held briefly and you make sure you unlock it when you are 
>>done. I suspect it is undocumented because it was never documented as 
>>opposed to some larger conspiracy.
> 
> 
> I guess I was too tongue-in-cheek...my question is: Is it not
> documented on *purpose*?  Perhaps it should be documented for internal
> developers and framework developers?

No, actually I understood your cheek. I was just too lazy to put in a 
smiley after my comment. I shall correct that now. :) And a winkey for 
good measure. ;)

> 
>>I used another cross platform approach in filesession_cleanup() in
>>Session.py.  I wanted to make sure only one request at a time was
>>running the cleanup, and used the os.open() call to exclusively open
>>a guard file. (OK, not a guard file, but my brain just went
>>blank. Hopefully you get the idea.)
> 
> 
> I'm with ya...  :-)
> 
> 
>>Here is a code snippet:
> 
> 
> Thanks for the code...maybe I'll try both (your code and
> _apache._global_lock()) and benchmark my caching code with ab.
> 
> 
> Thinking out loud here...wouldn't it be good for mod_python to provide
> a facility for global locking based on some key?  By default, the lock
> is per interpreter, but optionally per server?  Given the oddities of
> python programming within an apache environment, especially a prefork
> MPM environment, it seems it would be a most valuable service.  The
> Session, psp and 3rd-party locking (e.g. mpservlets) could all share
> the same code.
> 

That discussion will have to wait for another time. Time to call it a day.

Regards,
Jim

Re: _apache._global_lock theory

Posted by Daniel Popowich <dp...@comcast.net>.
Jim Gallacher writes:
> Daniel Popowich wrote:
> > The recent discussion of max locks and deadlocking issues with
> > _apache._global_(un)?lock() are timely for me:
> > 
> > I'm in the middle of writing a caching module for mod_python servlets
> > so a developer can have the output of a servlet cached, keyed on the
> > hash of the URI, for future requests.  The goal, of course, is to
> > increase throughput for dynamic pages with long shelf life (e.g.,
> > content manager updates database once per day, so the HTML only needs
> > to be generated once per day, not per request).
> > 
> > I need locking.  My first gut instinct was to go to the fcntl module
> > (lockf), but this is only available for unix.  My second gut instinct
> > was: so what?  :-)
> > 
> > Wanting a true x-platform tool I then thought of mod_python.Session
> > needing locks, poked around the code, and saw how it uses
> > _apache._global_lock().  Further poking around showed me how psp code
> > caching uses the function.  About the time I began worrying about
> > deadlock issues I found the thread on this list discussing the same
> > problems.
> > 
> > The solution for Session and psp code caching is to explicitly use lock
> > id 0.  This works as long as a module does not hold the lock for the
> > whole request, but unlocks immediately after acquiring the needed
> > resource.  Fine.
> 
> Just to be clear, sessions use the locks above index 0 for session 
> locking. The session id is hashed to determine which index is used. 
> DbmSession uses lock index 0 to lock the dbm file for reading and 
> writing the persistent session data. This is independent of the session 
> lock.

Right: lock 0 is used not by sessions proper, but by the
mod_python.Session module to lock the dbm file.

> > So, my question: is this the recommended way for mod_python framework
> > developers to acquire x-platform global locks?  Explicitly use lock
> > id 0?  If so, is this a secret or should it be documented?
> 
> I don't know if it's recommended, but I don't see a problem as long as 
> the lock is held briefly and you make sure you unlock it when you are 
> done. I suspect it is undocumented because it was never documented as 
> opposed to some larger conspiracy.

I guess I was too tongue-in-cheek...my question is: Is it not
documented on *purpose*?  Perhaps it should be documented for internal
developers and framework developers?

> I used another cross platform approach in filesession_cleanup() in
> Session.py.  I wanted to make sure only one request at a time was
> running the cleanup, and used the os.open() call to exclusively open
> a guard file. (OK, not a guard file, but my brain just went
> blank. Hopefully you get the idea.)

I'm with ya...  :-)

> Here is a code snippet:

Thanks for the code...maybe I'll try both (your code and
_apache._global_lock()) and benchmark my caching code with ab.


Thinking out loud here...wouldn't it be good for mod_python to provide
a facility for global locking based on some key?  By default, the lock
is per interpreter, but optionally per server?  Given the oddities of
python programming within an apache environment, especially a prefork
MPM environment, it seems it would be a most valuable service.  The
Session, psp and 3rd-party locking (e.g. mpservlets) could all share
the same code.


Daniel Popowich
-----------------------------------------------
http://home.comcast.net/~d.popowich/mpservlets/




Re: _apache._global_lock theory

Posted by Jim Gallacher <jg...@sympatico.ca>.
Daniel Popowich wrote:
> The recent discussion of max locks and deadlocking issues with
> _apache._global_(un)?lock() are timely for me:
> 
> I'm in the middle of writing a caching module for mod_python servlets
> so a developer can have the output of a servlet cached, keyed on the
> hash of the URI, for future requests.  The goal, of course, is to
> increase throughput for dynamic pages with long shelf life (e.g.,
> content manager updates database once per day, so the HTML only needs
> to be generated once per day, not per request).
> 
> I need locking.  My first gut instinct was to go to the fcntl module
> (lockf), but this is only available for unix.  My second gut instinct
> was: so what?  :-)
> 
> Wanting a true x-platform tool I then thought of mod_python.Session
> needing locks, poked around the code, and saw how it uses
> _apache._global_lock().  Further poking around showed me how psp code
> caching uses the function.  About the time I began worrying about
> deadlock issues I found the thread on this list discussing the same
> problems.
> 
> The solution for Session and psp code caching is to explicitly use lock
> id 0.  This works as long as a module does not hold the lock for the
> whole request, but unlocks immediately after acquiring the needed
> resource.  Fine.

Just to be clear, sessions use the locks above index 0 for session 
locking. The session id is hashed to determine which index is used. 
DbmSession uses lock index 0 to lock the dbm file for reading and 
writing the persistent session data. This is independent of the session 
lock.
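The index scheme described above (hash a key into the locks above index 0, keeping index 0 for explicit use) might look roughly like the following. This is an illustrative sketch only: lock_index is a hypothetical name, and zlib.crc32 is a stand-in hash, since the thread does not say which hash function mod_python applies to the session id.

```python
import zlib


def lock_index(key, nlocks):
    """Map a key to a lock index in 1..nlocks-1, leaving index 0 free."""
    if nlocks < 2:
        raise ValueError("need at least two locks to reserve index 0")
    # Stand-in hash; masked so the value is non-negative everywhere.
    h = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return h % (nlocks - 1) + 1
```

Because there are far more possible keys than locks, two different keys can land on the same index. That is harmless for a lock that is held briefly, but it matters whenever one lock is held while another is being requested.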

> So, my question: is this the recommended way for mod_python framework
> developers to acquire x-platform global locks?  Explicitly use lock
> id 0?  If so, is this a secret or should it be documented?

I don't know if it's recommended, but I don't see a problem as long as 
the lock is held briefly and you make sure you unlock it when you are 
done. I suspect it is undocumented simply because it was never
documented, as opposed to some larger conspiracy.

I used another cross platform approach in filesession_cleanup() in 
Session.py. I wanted to make sure only one request at a time was running 
the cleanup, and used the os.open() call to exclusively open a guard 
file. (OK, not a guard file, but my brain just went blank. Hopefully you 
get the idea.)

Here is a code snippet:

     lockfile = os.path.join(sessdir, '.mp_sess.lck')
     try:
         # O_CREAT | O_EXCL fails if the file already exists, so only
         # one process at a time can create the lock file.
         lockfp = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY,
                          0o660)
     except OSError:
         req.log_error('FileSession cleanup: '
                       'another process is already running.',
                       apache.APLOG_NOTICE)
         return

     try:
         # ... more code ...
         try:
             os.unlink(lockfile)
         except OSError:
             pass
     finally:
         os.close(lockfp)


I got this idea from Barry Pearce in a thread from April 2005 on the 
mod_python list. You might want to check archives - lots of interesting 
stuff.
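For anyone wanting to try the exclusive-open approach outside Session.py, here is a self-contained sketch of the same idea wrapped in a context manager. The name guard_file and the True/False protocol are assumptions made for this example, not part of mod_python.

```python
import errno
import os
from contextlib import contextmanager


@contextmanager
def guard_file(path):
    """Yield True if this process created the guard file, else False."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o660)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise  # a real error, not "someone else holds the guard"
        yield False
        return
    try:
        yield True
    finally:
        os.close(fd)
        try:
            os.unlink(path)
        except OSError:
            pass
```

A caller would write "with guard_file(lockfile) as mine:" and skip the work when mine is False. Note that this never blocks, and that a process which dies before the unlink leaves the file behind, so a real implementation needs some way to detect a stale guard file.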

Regards,
Jim