Posted to modperl@perl.apache.org by md <my...@yahoo.com> on 2002/08/19 22:47:25 UTC

Apache::Session - What goes in session?

I'm using mod_perl and Apache::Session on an app that
is similar to MyYahoo. I found a few bits of info from
a previous thread, but I'm curious as to what type of
information should go in the session and what should
come from the database.

Currently I'm putting very little in the session, but
what I am putting in the session is more "global" in
nature...greeting, current page number, current page
name...data that doesn't change very often. I'm
pulling a lot of info from the database and I wonder
if my design is sound. Most of the info being pulled
from the database is features for the page. 

Now I need to add global "modules" to the page which
will show user info like which pages they have created
and which features are being emailed to the user.
These modules will display on every page unless the
user turns them off. It seems that since this info
wouldn't change very often that I should put the data
in the session...

Anyone have any general tips on session design?

Thanks.

__________________________________________________
Do You Yahoo!?
HotJobs - Search Thousands of New Jobs
http://www.hotjobs.com

Re: Apache::Session - What goes in session?

Posted by Tony Bowden <to...@kasei.com>.
On Thu, Aug 22, 2002 at 01:20:37PM -0700, md wrote:
> > Don't worry about whether it *seems* efficient. Do it right, and then
> > worry about how to speed that up - if, and only if, it's too slow.
> > Premature optimisation is the root of all evil, and
> > all that ..

> Thanks for that tidbit.
> One question though...the only thing in the cookie is
> the session id. I'm getting the user id when the user
> logs in and putting that in the session. Would you
> pull the user id from the db every time too instead of
> putting it in the session? I'm leaning towards taking
> it out.

Personally I'd hash the user id into the session id and then extract it
programmatically. But pulling it from the database is fine too.
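
One possible reading of "hash the user id into the session id", as a
minimal sketch only: the user id is carried in the session id next to a
random-ish nonce and an HMAC signature, so it can be recovered without a
database lookup. The function names, the id layout and the $SECRET value
are all assumptions made for illustration, not something described in
this thread.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical server-side secret; in practice load it from configuration.
my $SECRET = 'change-me';

# Build "<user_id>-<nonce>-<signature>" so the user id travels in the id.
# rand() is not a cryptographic source; a real system needs a better nonce.
sub make_session_id {
    my ($user_id) = @_;
    my $nonce = sprintf '%08x%08x', int(rand(2**32)), time;
    my $sig   = hmac_sha256_hex("$user_id-$nonce", $SECRET);
    return "$user_id-$nonce-$sig";
}

# Return the user id if the signature checks out, otherwise undef.
sub user_id_from_session_id {
    my ($session_id) = @_;
    my ($user_id, $nonce, $sig) =
        $session_id =~ /\A(\d+)-([0-9a-f]+)-([0-9a-f]{64})\z/
        or return;
    return unless $sig eq hmac_sha256_hex("$user_id-$nonce", $SECRET);
    return $user_id;
}
```

The signature is what stops a user from editing the numeric prefix to
impersonate someone else; without it the id would just be a guessable
user id.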

Tony

Re: Apache::Session - What goes in session?

Posted by md <my...@yahoo.com>.
--- Tony Bowden <to...@kasei.com> wrote:

> Don't worry about whether it *seems* efficient. Do
> it right, and then
> worry about how to speed that up - if, and only if,
> it's too slow.
> 
> Premature optimisation is the root of all evil, and
> all that ..


Thanks for that tidbit.

I removed almost everything from the session and I'm
now pulling that info from the DB with no noticeable
difference.

I think I can eliminate a few db calls by placing a
few things in hidden fields or the query string.

One question though...the only thing in the cookie is
the session id. I'm getting the user id when the user
logs in and putting that in the session. Would you
pull the user id from the db every time too instead of
putting it in the session? I'm leaning towards taking
it out.




Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
Peter J. Schoenster wrote:
> If I'm using Apache::DBI so I have a persistent connection to MySQL, 
> would it not be faster to simply use a table in MySQL?

Probably not, if the MySQL server is on a separate machine.  If it's on 
the same machine, it would be close.  Remember, MySQL has more work to 
do (parse SQL statement, make query plan, etc.) than a simple hash-based 
system like BerkeleyDB does.  Best thing would be to benchmark it though.

- Perrin


RE: Apache::Session - What goes in session?

Posted by Jesse Erlbaum <je...@erlbaum.net>.
Hi Peter --

> > The moral of the story:  Flat files rock!  ;-)
>
> If I'm using Apache::DBI so I have a persistent connection to MySQL,
> would it not be faster to simply use a table in MySQL?


Unlikely.  Even with cached database connections you are probably not going
to beat the performance of going to a flat text file.  Accessing files is
something the OS is optimized to do.  The process of issuing a SQL query,
having it parsed and retrieving results is probably more time-consuming than
you think.

One way to think about it is this:  MySQL stores its data in files.  There
are many layers of code between DBI and those files, each of which adds
processing time.  Going directly to files is far less code, and less code is
most often faster code.

The best way to be sure is to benchmark the difference yourself.  Try out
the Benchmark module.  Quantitative data trumps anecdotal data every time.
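
As a rough starting point, a comparison with the core Benchmark module
might look like the sketch below. The flat-file side uses Storable; the
other candidate is only a stand-in (a plain hash lookup), because a
self-contained example cannot assume a database. Swap in your real DBI
query there. The file name and the sample data are made up.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);
use File::Temp qw(tempdir);
use Storable qw(nstore retrieve);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/user_42.sto";
my %stand_in = (42 => { greeting => 'Hello', page_id => 7 });

# Write the sample record once so the benchmark only measures reads.
nstore($stand_in{42}, $file) or die "cannot write $file";

# A negative count means "run each candidate for at least that many
# CPU seconds", which gives steadier numbers than a fixed count.
my $results = timethese(-1, {
    flat_file => sub { my $data = retrieve($file) },
    # ASSUMPTION: placeholder for the code path you actually want to
    # compare, e.g. a prepared DBI statement over Apache::DBI.
    stand_in  => sub { my $data = $stand_in{42} },
}, 'none');

# Print a table of rates and relative differences.
cmpthese($results);
```

Run it on the same hardware and with the same data sizes as production,
since the outcome depends heavily on both.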


Warmest regards,

-Jesse-


--

  Jesse Erlbaum
  The Erlbaum Group
  jesse@erlbaum.net
  Phone: 212-684-6161
  Fax: 212-684-6226




Re: Apache::Session - What goes in session?

Posted by "Peter J. Schoenster" <pe...@schoenster.com>.
On 21 Aug 2002 at 2:09, Ask Bjoern Hansen wrote:

> Now using good old Fcntl to control access to simple "flat files".
> (Data serialized with pack("N*", ...); I don't think anything beats
> "pack" and "unpack" for serializing data).
> 
> The expiration went into the data and purging the cache was a simple
> cronjob to find files older than a few minutes and deleting them.
> 
> The performance?  I don't remember the exact figure, but it was at
> least several times faster than the BerkeleyDB system.  And *much*
> simpler.
> 
> 
> The moral of the story:  Flat files rock!  ;-)

If I'm using Apache::DBI so I have a persistent connection to MySQL, 
would it not be faster to simply use a table in MySQL?


Peter



---------------------------
"Reality is that which, when you stop believing in it, doesn't go
away".
                -- Philip K. Dick


Re: Apache::Session - What goes in session?

Posted by si...@siberian.org.
Thanks, you just saved us a ton of time.

Off to change course ;)

J

On Tue, 20 Aug 2002 13:12:29 -0400
  Perrin Harkins <pe...@elem.com> wrote:
>siberian@siberian.org wrote:
>>We are investigating using IPC rather than a file based
>>structure but it's purely investigation at this point.
>>
>>What are the speed diffs between an IPC cache and a
>>Berkeley DB cache. My gut instinct always screams 'Stay Off
>>The Disk' but my gut is not always right.. Ok, rarely
>>right.. ;)
>
>Most of the shared memory modules are much slower than 
>Berkeley DB.  The fastest option around is IPC::MM, but 
>data you store in that does not persist if you restart 
>the server which is a problem for some. BerkeleyDB (the 
>new one, not DB_File) is also very fast, and other 
>options like Cache::Mmap and Cache::FileCache are much 
>faster than anything based on IPC::ShareLite and the
>like.
>
>I have charts and numbers in my TPC presentation, which I 
>will be putting up soon.
>
>- Perrin
>


Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
Jie Gao wrote:
> There are cases in which it is desirable to expire an entry which
> hasn't been used for a certain period of time; authenticated sessions
> data, for example.

Okay, so you're looking for a session module rather than a cache. 
Apache::Session doesn't handle expiration, but you could add it, as many 
here have.  You could also just use one of the general-purpose storage 
modules like MLDBM::Sync, BerkeleyDB, or the storage modules in 
Cache::Cache (like Cache::FileBackend) and then add expiration.  Those 
are all generic storage modules with no cache-specific stuff in their APIs.

- Perrin



Re: Apache::Session - What goes in session?

Posted by Jie Gao <J....@isu.usyd.edu.au>.
On Tue, 20 Aug 2002, Perrin Harkins wrote:

> Jie Gao wrote:
>  > I wish some of these modules would be able to "touch" cached data so that
>  > it would expire cache entries on "last-accessed" rather than on the time
>  > the entries were created.
>
> Why?  People used to do that with caches because they had limited space
> and wanted to purge the cache with an LRU algorithm to keep size down,
> but disk space is too cheap to worry about now.
>
> If an item in the cache is okay to stay there as long as people are
> accessing it, you are essentially saying that cached items never become
> invalid.  In that case, why bother ever deleting any of them?

There are cases in which it is desirable to expire an entry which
hasn't been used for a certain period of time; authenticated sessions
data, for example. Absolute expiration is indeed needed, as well.
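
To make the two policies concrete, here is a small sketch of an entry
that expires both on idle time ("last accessed") and on absolute age. It
is not how any module named in this thread works; %store is a plain
in-memory hash standing in for whatever storage you use, and the
function names and the optional $now argument are assumptions added so
the behaviour is easy to test.

```perl
use strict;
use warnings;

my %store;    # stand-in for BerkeleyDB, MLDBM::Sync or another store

# Save a value along with its creation and last-access times.
sub session_set {
    my ($key, $value, $now) = @_;
    $now //= time;
    $store{$key} = { value => $value, created => $now, accessed => $now };
    return;
}

# Fetch a value, enforcing an idle timeout and an absolute lifetime
# (both in seconds). A successful fetch "touches" the entry, which
# restarts the idle clock but never extends the absolute lifetime.
sub session_get {
    my ($key, $max_idle, $max_age, $now) = @_;
    $now //= time;
    my $entry = $store{$key} or return;
    if (   $now - $entry->{accessed} > $max_idle
        || $now - $entry->{created}  > $max_age) {
        delete $store{$key};
        return;
    }
    $entry->{accessed} = $now;
    return $entry->{value};
}
```

With a 60 second idle limit and a 300 second lifetime, a session that is
used every 50 seconds stays alive until it is 300 seconds old and is
then dropped regardless of activity.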

Regards,



Jie


Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
Jie Gao wrote:
 > I wish some of these modules would be able to "touch" cached data so that
 > it would expire cache entries on "last-accessed" rather than on the time
 > the entries were created.

Why?  People used to do that with caches because they had limited space
and wanted to purge the cache with an LRU algorithm to keep size down,
but disk space is too cheap to worry about now.

If an item in the cache is okay to stay there as long as people are
accessing it, you are essentially saying that cached items never become
invalid.  In that case, why bother ever deleting any of them?

- Perrin



Re: Apache::Session - What goes in session?

Posted by Jie Gao <J....@isu.usyd.edu.au>.
On Tue, 20 Aug 2002, Perrin Harkins wrote:

> Date: Tue, 20 Aug 2002 13:12:29 -0400
> From: Perrin Harkins <pe...@elem.com>
> To: siberian@siberian.org
> Cc: Dave Rolsky <au...@urth.org>, modperl@perl.apache.org
> Subject: Re: Apache::Session - What goes in session?
>
> siberian@siberian.org wrote:
> > We are investigating using IPC rather than a file based structure but
> > it's purely investigation at this point.
> >
> > What are the speed diffs between an IPC cache and a Berkeley DB cache. My
> > gut instinct always screams 'Stay Off The Disk' but my gut is not always
> > right.. Ok, rarely right.. ;)
>
> Most of the shared memory modules are much slower than Berkeley DB.  The
> fastest option around is IPC::MM, but data you store in that does not
> persist if you restart the server which is a problem for some.
> BerkeleyDB (the new one, not DB_File) is also very fast, and other
> options like Cache::Mmap and Cache::FileCache are much faster than
> anything based on IPC::ShareLite and the like.

I wish some of these modules would be able to "touch" cached data so that
it would expire cache entries on "last-accessed" rather than on the time
the entries were created.

Regards,



Jie


Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
siberian@siberian.org wrote:
> We are investigating using IPC rather than a file based structure but
> it's purely investigation at this point.
>
> What are the speed diffs between an IPC cache and a Berkeley DB cache. My
> gut instinct always screams 'Stay Off The Disk' but my gut is not always 
> right.. Ok, rarely right.. ;)

Most of the shared memory modules are much slower than Berkeley DB.  The 
fastest option around is IPC::MM, but data you store in that does not 
persist if you restart the server which is a problem for some. 
BerkeleyDB (the new one, not DB_File) is also very fast, and other 
options like Cache::Mmap and Cache::FileCache are much faster than 
anything based on IPC::ShareLite and the like.

I have charts and numbers in my TPC presentation, which I will be 
putting up soon.

- Perrin


Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
Ask Bjoern Hansen wrote:
> The performance?  I don't remember the exact figure, but it was at
> least several times faster than the BerkeleyDB system.  And *much*
> simpler.

In my benchmarks, recent versions of BerkeleyDB, used with the 
BerkeleyDB module and allowed to manage their own locking, beat all 
available flat-file modules.  It may be possible to improve the 
flat-file ones, but it even beat Tie::TextDir which is about as simple 
(and therefore fast) as they come.  The only thing that did better was 
IPC::MM.

- Perrin


Re: Apache::Session - What goes in session?

Posted by Ask Bjoern Hansen <as...@develooper.com>.
On Tue, 20 Aug 2002 siberian@siberian.org wrote:

> We are investigating using IPC rather than a file based
> structure but it's purely investigation at this point.
>
> What are the speed diffs between an IPC cache and a
> Berkeley DB cache. My gut instinct always screams 'Stay Off
> The Disk' but my gut is not always right.. Ok, rarely
> right.. ;)

IPC (for many definitions of that) has all sorts of odd limitations
and isn't that fast.  Don't go there.

The disk is usually much faster than you think.  Often overlooked
for caching is a simple file based cache.

Here's a story about that:

A while ago Graham Barr and I spent some time going through a number
of iterations for a "self cleaning" cache system.  It would take
lots of writes and fewer reads.  In each cache entry a number of
integers would be stored.  Just storing the last thousand entries
would be enough.

We tried quite a few different approaches; the most noteworthy was a
system of semaphores to control access to a number of slots in a
BerkeleyDB.  That should be pretty fast, right?

It got a bit complicated as our systems didn't support that many
semaphores, so we had to come up with a system for sharing the
semaphores across multiple "slots" in the database.

Designing and writing this implementation took a few days.  It was
really cool.

Anyway, after fixing that and a few deadlocks we were benchmarking
away.  The system was so clever.  We thought it was simple and neat.
Okay, neat at least.  And it was really slow. Slow. (~200 writes a
second on a 400MHz Pentium II if I recall correctly).

First we suspected we did something wrong with the semaphores, but
further benchmarking showed that the BerkeleyDB just wasn't that
fast for writing.

30 minutes thinking and 30 minutes typing code later we had a
prototype for a simple filebased system.

Now using good old Fcntl to control access to simple "flat files".
(Data serialized with pack("N*", ...); I don't think anything beats
"pack" and "unpack" for serializing data).

The expiration went into the data and purging the cache was a simple
cronjob to find files older than a few minutes and deleting them.
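
For illustration only, a flat-file scheme along these lines could look
roughly like the sketch below. It is not the code described in the
story; the function names, the ".cache" suffix and the directory layout
are assumptions. It uses flock (constants from Fcntl) for access control,
pack("N*") for serializing integers, and a purge routine that does what
the cron job would do.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Write a list of unsigned 32-bit integers under an exclusive lock.
# The file is truncated only after the lock is held.
sub write_entry {
    my ($path, @ints) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT) or die "open $path: $!";
    flock($fh, LOCK_EX)                        or die "lock $path: $!";
    truncate($fh, 0)                           or die "truncate $path: $!";
    binmode $fh;
    print {$fh} pack('N*', @ints);
    close($fh) or die "close $path: $!";   # flushes and releases the lock
    return;
}

# Read the integers back under a shared lock; empty list if no file.
sub read_entry {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or return;
    flock($fh, LOCK_SH) or die "lock $path: $!";
    local $/;                              # slurp the whole file
    my $bytes = <$fh>;
    close $fh;
    return unpack 'N*', $bytes // '';
}

# Delete cache files whose mtime is older than $max_age seconds.
# Assumes $dir contains no whitespace, since glob splits on it.
sub purge_older_than {
    my ($dir, $max_age) = @_;
    my $cutoff  = time - $max_age;
    my $removed = 0;
    for my $path (glob "$dir/*.cache") {
        my $mtime = (stat $path)[9];
        next unless defined $mtime && $mtime < $cutoff;
        $removed++ if unlink $path;
    }
    return $removed;
}
```

Note that flock is advisory: it only protects against other processes
that also take the lock, and it may not behave reliably on network file
systems.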

The performance?  I don't remember the exact figure, but it was at
least several times faster than the BerkeleyDB system.  And *much*
simpler.


The moral of the story:  Flat files rock!  ;-)


  - ask

-- 
ask bjoern hansen, http://www.askbjoernhansen.com/ !try; do();


Re: Apache::Session - What goes in session?

Posted by si...@siberian.org.
We are investigating using IPC rather than a file based
structure but it's purely investigation at this point.

What are the speed diffs between an IPC cache and a
Berkeley DB cache? My gut instinct always screams 'Stay Off
The Disk' but my gut is not always right.. Ok, rarely 
right.. ;)

John-

On Tue, 20 Aug 2002 11:49:52 -0500 (CDT)
  Dave Rolsky <au...@urth.org> wrote:
>On Tue, 20 Aug 2002 siberian@siberian.org wrote:
>
>> Currently we are working on a 'per machine' cache so all
>> children can benefit from each child's initial database read
>> of the translated string, the differential between
>> children is annoying in the 'per child cache' strategy.
>
>Sounds like you want BerkeleyDB.pm (not DB_File), which 
>is quite fast and
>handles locking/concurrent access internally (when set up 
>properly).
>
>See the Alzabo::ObjectCache::{Store,Sync}::BerkeleyDB 
>modules for
>examples.
>
>For Alzabo, I also have a caching system that caches data 
>in a database,
>for cross-machine caching/syncing.  I haven't really 
>benchmarked it yet
>but I imagine it could be a win in some situations.  For 
>example, you
>could set up the cache as a separate machine running 
>MySQL and still pull
>your data from another machine, possibly running a 
>different RDBMS.
>
>
>-dave
>
>/*==================
>www.urth.org
>we await the New Sun
>==================*/
>


Re: Apache::Session - What goes in session?

Posted by Dave Rolsky <au...@urth.org>.
On Tue, 20 Aug 2002 siberian@siberian.org wrote:

> Currently we are working on a 'per machine' cache so all
> children can benefit from each child's initial database read
> of the translated string, the differential between
> children is annoying in the 'per child cache' strategy.

Sounds like you want BerkeleyDB.pm (not DB_File), which is quite fast and
handles locking/concurrent access internally (when set up properly).

See the Alzabo::ObjectCache::{Store,Sync}::BerkeleyDB modules for
examples.

For Alzabo, I also have a caching system that caches data in a database,
for cross-machine caching/syncing.  I haven't really benchmarked it yet
but I imagine it could be a win in some situations.  For example, you
could set up the cache as a separate machine running MySQL and still pull
your data from another machine, possibly running a different RDBMS.


-dave

/*==================
www.urth.org
we await the New Sun
==================*/


Re: Apache::Session - What goes in session?

Posted by si...@siberian.org.
We do see some slowdown on our language translation db
calls since they are so intensive. Moving to a 'per child'
cache for each string as it came out of the db sped page
loads up from 4.5 seconds to 0.6-1.0 seconds per page which
is significant.

Currently we are working on a 'per machine' cache so all
children can benefit from each child's initial database read
of the translated string; the differential between
children is annoying in the 'per child cache' strategy.
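
For readers unfamiliar with the idea, a 'per child' cache can be as
small as a file-scoped hash that lives as long as the Apache child
process does. This is a sketch of the general pattern, not our actual
code; fetch_translation_from_db is a hypothetical stand-in for the real
query, and $db_reads exists only to show how often it is called.

```perl
use strict;
use warnings;

my %translation_cache;   # lives for the lifetime of this child process
my $db_reads = 0;        # counter for demonstration only

# Hypothetical stand-in for the expensive translation query.
sub fetch_translation_from_db {
    my ($lang, $key) = @_;
    $db_reads++;
    return "[$lang] $key";
}

# Return the cached string, hitting the database only on the first
# request for a given language/key pair in this process.
sub translate {
    my ($lang, $key) = @_;
    return $translation_cache{$lang}{$key}
        //= fetch_translation_from_db($lang, $key);
}
```

Each child fills its own copy of the hash, which is exactly the
differential between children mentioned above: a string cached in one
child is still a database read in the next.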

John-

On Tue, 20 Aug 2002 16:33:07 +0100
  Tony Bowden <to...@kasei.com> wrote:
>On Mon, Aug 19, 2002 at 06:54:01PM -0700, md wrote:
>> I can definitely get it all from the db, but that 
>>doesn't
>> seem very efficient.
>
>Don't worry about whether it *seems* efficient. Do it 
>right, and then
>worry about how to speed that up - if, and only if, it's 
>too slow.
>
>Premature optimisation is the root of all evil, and all 
>that ..
>
>At BlackStar the session was just a single hashed ID and 
>all other info
>was loaded from the database every time. We thought about 
>caching some
>info a few times, but always ran into problems with 
>replication.  In the
>end we discovered that fetching everything from the 
>database on every
>request wasn't noticeably slower than anything else we
>could come up with,
>and was a lot more flexible. Throwing more memory at the 
>database servers
>was usually quicker, cheaper and more effective than 
>micro-optimising
>our session vs caching strategy...
>
>Tony


Re: Apache::Session - What goes in session?

Posted by Tony Bowden <to...@kasei.com>.
On Mon, Aug 19, 2002 at 06:54:01PM -0700, md wrote:
> I can definitely get it all from the db, but that doesn't
> seem very efficient.

Don't worry about whether it *seems* efficient. Do it right, and then
worry about how to speed that up - if, and only if, it's too slow.

Premature optimisation is the root of all evil, and all that ..

At BlackStar the session was just a single hashed ID and all other info
was loaded from the database every time. We thought about caching some
info a few times, but always ran into problems with replication.  In the
end we discovered that fetching everything from the database on every
request wasn't noticeably slower than anything else we could come up with,
and was a lot more flexible. Throwing more memory at the database servers
was usually quicker, cheaper and more effective than micro-optimising
our session vs caching strategy...

Tony

Re: Apache::Session - What goes in session?

Posted by md <my...@yahoo.com>.
Thanks...you've given me plenty to work with. Great
explanation. This is good pragmatic stuff to know!



Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
md wrote:
> I haven't looked at the cache modules docs yet...would
> it be possible to build cache on the separate
> load-balanced machines as we go along...as we do with
> template caching?

Of course.  However, if a user is sent to a random machine each time you 
won't be able to cache anything that a user is allowed to change during 
their time on the site, because they could end up on a machine that has 
an old cached value for it.  Sticky load-balancing or a cluster-wide 
cache (which you can update when data changes) deals with this problem.

> everything seems so user specific...

That doesn't mean you can't cache it.  You can do basically the same 
thing you were doing with the session: stuff a hash of user-specific 
stuff into the cache.  The next time that user sends a request, you 
check the cache for data on that user ID (you get the user ID from the 
session) and if you don't find any you just fetch it from the db.

Pseudo-code:

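sub fetch_user_data {
   my $user_id = shift;
   # Try the cache first; fall back to the database on a miss.
   my $user_data = fetch_from_cache($user_id);
   unless ($user_data) {
     $user_data = fetch_from_db($user_id);
     # store_in_cache() is a hypothetical helper, like the two fetch
     # functions; it saves the result so the next request is a hit.
     store_in_cache($user_id, $user_data);
   }
   return $user_data;
}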

> I would be curious though that if my choice is simply
> that the data is stored in the session or comes from
> the database with each request, would it still be best
> to essentially only store the session id in the
> session and pull everything else from the db? It still
> seems that something trivial like a greeting name (a
> preference) could go in the session.

Your decision about what to put in the session is not connected to your 
decision about what to pull from the db each time.  You can cache all 
the data if you want to, and still have very little in the session.

This might sound like an academic distinction, but I think it's 
important to keep the concepts separate: a session is a place to store 
transient state information that is irrelevant as soon as the user logs 
out, and a cache is a way of speeding up access to a slow resource like 
a database, and the two things should not be confused.  You can actually 
cache the session data if you need to (with a write-through cache that 
updates the backing database as well).  A cache will typically be faster 
than session storage because it doesn't need to be very reliable and 
because you can store and retrieve individual chunks of data (user's 
name, page names) when you need them instead of storing and retrieving 
everything on every request.  Separating these concepts allows you to do 
things like migrate the session storage to a transactional database some 
day, and move your cache storage to a distributed multicast cache when 
someone comes out with a module for that.
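
A write-through cache in front of session storage can be sketched as
follows. Both %db and %cache are plain hashes standing in for the real
backing database and the real cache, and the function names are
assumptions, so treat this as an outline of the idea rather than a
working storage layer.

```perl
use strict;
use warnings;

my %db;      # stand-in for the session table in the backing database
my %cache;   # stand-in for a fast, unreliable local cache

# Write-through: every save updates the database and then the cache,
# so the database always holds the authoritative copy.
sub save_session {
    my ($id, $data) = @_;
    $db{$id}    = { %$data };
    $cache{$id} = { %$data };
    return;
}

# Read from the cache when possible; on a miss, fall back to the
# database and repopulate the cache.
sub load_session {
    my ($id) = @_;
    return $cache{$id} if exists $cache{$id};
    my $row = $db{$id} or return;
    return $cache{$id} = { %$row };
}
```

Because the database is written on every save, losing the cache costs
only speed, never session data.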

> The only
> gotcha would be that the calendar would need to update
> every day, at least on the current month's pages.

The cache modules I mentioned have a concept of "timeout" so that you 
can say "cache this for 12 hours" and then when it expires you fetch it 
again and update the cache for another 12 hours.

> Even though there are some "preset" pages, the user
> can change the names and the user can also create a
> custom page with its own name.

No problem, you can cache data that's only useful for a single user, as 
I explained above.

> Not
> to mention that between the fact that the users' daily
> pages can have any number of user selected features
> per page and features themselves can have archive
> depths of anywhere from 3 to 20 years, there's a lot
> of info.

No problem, disks are cheap.  400MB of disk space will cost you about as 
much as a movie in New York these days.

- Perrin


Re: Apache::Session - What goes in session?

Posted by md <my...@yahoo.com>.
--- Perrin Harkins <pe...@elem.com> wrote:

> There are a few ways to deal with this.  The
> simplest is to use the 
> "sticky" load-balancing feature that many
> load-balancers have.  Failing 
> that, you can store to a network file system like
> NFS or CIFS, or use a 
> database.  (There are also fancier options with
> things like Spread, but 
> that's getting a little ahead of the game.)  You can
> use MySQL for 
> caching, and it will probably have similar
> performance to a networked 
> file system.  Unfortunately, the Apache::Session
> code isn't all that 
> easy to use for this, since it assumes you want to
> generate IDs for the 
> objects you store rather than passing them in.  You
> could adapt the code 
> from it to suit your needs though.  The important
> thing is to leave out 
> all of the mutually exclusive locking it implements,
> since a cache is 
> all about "get the latest as quick as you can" and
> lost updates are not 
> a problem ("last save wins" is good enough for a
> cache).

I haven't looked at the cache modules docs yet...would
it be possible to build cache on the separate
load-balanced machines as we go along...as we do with
template caching? By that I mean if an item has cached
on machine one then further requests on machine one
will come from cache where if on machine two the same
item hasn't cached, it will be pulled from the db the
first time and then cached?

If this isn't possible, I'm not sure if I'll be able
to implement any caching or not (some of the site
configuration is out of my hands) and everything seems
so user specific...I'll definitely reread your posts
and go through my app for things that should be
cached.

I would be curious though that if my choice is simply
that the data is stored in the session or comes from
the database with each request, would it still be best
to essentially only store the session id in the
session and pull everything else from the db? It still
seems that something trivial like a greeting name (a
preference) could go in the session.

> The relationships to the features and pages differ
> by user, but there 
> might be general information about the features
> themselves that is 
> stored in the database and is not user-specific. 
> That could be cached 
> separately, to save some trips to the db for each
> user.

The only thing I can think of right now is a
calendar...that should probably be cached. The only
gotcha would be that the calendar would need to update
every day, at least on the current month's pages. But
this is only on a "feature" page, not a users created
page (that is a user can click a link on their daily
page that takes them to a feature page where they can
go through archives).
 

> You can cache the names too if you want to, but
> keeping them out of the 
> session means that you won't be slowed down by
> fetching that extra data 
> and de-serializing it with Storable unless the page
> you're on actually 
> needs it.  

Even though there are some "preset" pages, the user
can change the names and the user can also create a
custom page with its own name. So there could be
thousands of unique page names, many (most) specific
to unique users (like "Jim's Sports Page", etc.). Not
to mention that between the fact that the users' daily
pages can have any number of user selected features
per page and features themselves can have archive
depths of anywhere from 3 to 20 years, there's a lot
of info.

> It's also good to separate things that
> have to be reliable 
> (like the ID of the current user, since without that
> you have to send 
> them back to log in again) from things that don't
> need to be (you could 
> always fetch the list of pages from the db if your
> cache went down).

Very good advice. I've found that occasionally
something happens to my session where the session id
is ok but some of the other data disappears (like
current page id) which really screws things up until
you log out and log back in again. This leads me to
suspect that I've answered my own question from above.
It's just whether I can cache or not.

Thanks for all your time and help.




Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
md wrote:

>We are using a load-balanced
>system; I should have mentioned that earlier. Won't
>that be an issue with caching to disk? Is it possible
>to cache to the db?
>

There are a few ways to deal with this.  The simplest is to use the 
"sticky" load-balancing feature that many load-balancers have.  Failing 
that, you can store to a network file system like NFS or CIFS, or use a 
database.  (There are also fancier options with things like Spread, but 
that's getting a little ahead of the game.)  You can use MySQL for 
caching, and it will probably have similar performance to a networked 
file system.  Unfortunately, the Apache::Session code isn't all that 
easy to use for this, since it assumes you want to generate IDs for the 
objects you store rather than passing them in.  You could adapt the code 
from it to suit your needs though.  The important thing is to leave out 
all of the mutually exclusive locking it implements, since a cache is 
all about "get the latest as quick as you can" and lost updates are not 
a problem ("last save wins" is good enough for a cache).
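
As a sketch of what "last save wins" without locking can look like, the
code below stores each item under a key the caller passes in and
replaces the file with a rename, so readers see either the old or the
new version. The function names, the ".sto" suffix and the use of
Storable with an MD5-derived file name are assumptions for illustration;
$dir could be a local directory or a shared mount.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Storable qw(nstore retrieve);

# Map a caller-supplied key to a file name that is safe on disk.
sub cache_path {
    my ($dir, $key) = @_;
    return "$dir/" . md5_hex($key) . '.sto';
}

# Store a reference under $key. No lock is taken: the data is written
# to a temporary file and renamed into place, so the last writer wins.
sub cache_set {
    my ($dir, $key, $value) = @_;
    my $path = cache_path($dir, $key);
    my $tmp  = "$path.$$.tmp";
    nstore($value, $tmp)  or die "store $tmp failed";
    rename($tmp, $path)   or die "rename $tmp: $!";
    return;
}

# Return the stored reference, or undef on a miss or unreadable file.
sub cache_get {
    my ($dir, $key) = @_;
    my $path = cache_path($dir, $key);
    return unless -e $path;
    return eval { retrieve($path) };
}
```

A failed or missing read is treated as a cache miss, which is fine here
because the caller can always fall back to the database.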

>The "modules" will consist of a "pages" module with
>the names of all the pages the user has created (with
>links) and a "emails" module which will display all
>the features that the user is getting via email. 
>These modules will be displayed on every page. 
>
>You can see that almost everything is user-specific.
>

The relationships to the features and pages differ by user, but there 
might be general information about the features themselves that is 
stored in the database and is not user-specific.  That could be cached 
separately, to save some trips to the db for each user.

>Right now I'm storing the page names/ids in a hash ref
>in the session (the emails module isn't live yet), but
>I thought that I would change that and only store the
>module id and pull the names from the db (if the user
>hasn't turned off the module) with each page call.
>

You can cache the names too if you want to, but keeping them out of the 
session means that you won't be slowed down by fetching that extra data 
and de-serializing it with Storable unless the page you're on actually 
needs it.  It's also good to separate things that have to be reliable 
(like the ID of the current user, since without that you have to send 
them back to log in again) from things that don't need to be (you could 
always fetch the list of pages from the db if your cache went down).

- Perrin


Re: Apache::Session - What goes in session?

Posted by md <my...@yahoo.com>.
--- Perrin Harkins <pe...@elem.com> wrote:

> >Current page name and id are never stored in db, so
> >different browser windows can be on different
> >pages...
> >
> 
> I thought your session was all stored in MySQL.  Why
> are you putting 
> these in the session exactly?  If these things are
> not relevant to more 
> than one request (page), they don't belong in the
> session.  They should 
> just be in ordinary variables.

You are correct, these items are in the session in the
db. I meant that they weren't kept in long term
storage in the db after the session ended like the
default page id and user name are. The current page
id/name is only relevant for an active session. Once a
session is started current page is set to whatever the
default page id is and will change as the user changes
pages. The only reason I did this (as I recall) is
that way I can get the page name once. 
 
> You should use a cache for that, rather than the
> session.  This is 
> long-term data that you just want quicker access to.

Yes, that's exactly what I want to do. My main concern
is long-term data that I want quicker access to. I can
definitely get it all from the db, but that doesn't
seem very efficient.
 
> Template Toolkit caches the compiled template code, but it
> doesn't cache your data or the output of the templates.  What
> you should do is grab a module like Cache::Cache or Cache::Mmap
> and take a look at the examples there.  You use it in a way
> that's very similar to what you're doing with Apache::Session
> for the things you referred to as global.  There are also good
> examples in the documentation for the Memoize module.

Great...exactly the kind of info I was looking for.
I'll look at those. We are using a load-balanced
system; I should have mentioned that earlier. Won't
that be an issue with caching to disk? Is it possible
to cache to the db?

> There are various reasons to use a cache rather than treating
> the session like a cache.  If you put a lot of data in the
> session, it will slow down every hit loading and saving that
> data.  In a cache, you can just keep multiple cached items
> separately and only grab them if you need them for this page.
> With a cache you can store things that come from the database
> but are not user-specific, like today's weather.

Thank you for all the excellent advice and
explanation (in this and other posts).

Most of the info I'll be pulling is *very*
user-specific...user name, which features to display
on which page, what features the user gets by email,
etc.

What happens is the user logs in and then the username
(greeting), the default page id (the user can create
many pages with different features per page) and what
features go on the default page are pulled from the
database and the default page is displayed, as well as
any "module" info.

The "modules" will consist of a "pages" module with
the names of all the pages the user has created (with
links) and an "emails" module which will display all
the features that the user is getting via email.
These modules will be displayed on every page.

You can see that almost everything is user-specific.

Right now I'm storing the page names/ids in a hash ref
in the session (the emails module isn't live yet), but
I thought that I would change that and only store the
module id and pull the names from the db (if the user
hasn't turned off the module) with each page call.

Thanks again for all the info!

Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
md wrote:

>I don't think "global" was the term I should have
>used. What I mean is data that will be seen on all or
>most pages by the same user...like "Hello Jim"
>

Okay, don't put that in the session.  It belongs in a cache.  The 
session is for transient state information that you don't want to keep 
after the user logs out.

>Current page name and id are never stored in db, so
>different browser windows can be on different
>pages...
>

I thought your session was all stored in MySQL.  Why are you putting 
these in the session exactly?  If these things are not relevant to more 
than one request (page), they don't belong in the session.  They should 
just be in ordinary variables.

>>That sounds like a "user" or "subscriptions" object
>>to me, not session data.
>>    
>>
>
>Once again, I shouldn't have used the term "global".
>This is the "subscriptions" info for a single
>user...that's why I had thought to put this in the
>session instead of pulling from the db each page call
>since the data will rarely change.
>

You should use a cache for that, rather than the session.  This is 
long-term data that you just want quicker access to.

>I am using TT caching
>for the templates, but I'm not sure how to cache the
>non-session data.
>

Template Toolkit caches the compiled template code, but it doesn't cache 
your data or the output of the templates.  What you should do is grab a 
module like Cache::Cache or Cache::Mmap and take a look at the examples 
there.  You use it in a way that's very similar to what you're doing 
with Apache::Session for the things you referred to as global.  There 
are also good examples in the documentation for the Memoize module.
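
As a rough illustration of the Memoize approach mentioned above (a sketch, not code from the thread): greeting_for is a hypothetical lookup that stands in for a database query, and $db_calls only counts how often the real work runs.

```perl
use strict;
use warnings;
use Memoize qw(memoize flush_cache);

my $db_calls = 0;

# Hypothetical slow lookup; a real version would run a database query.
sub greeting_for {
    my ($user_id) = @_;
    $db_calls++;
    my %names = (42 => 'Jim');
    return "Hello $names{$user_id}";
}

# After this call, repeated calls with the same argument are answered
# from Memoize's in-memory cache instead of re-running the function body.
memoize('greeting_for');

my $greeting;
$greeting = greeting_for(42) for 1 .. 3;
print "$greeting ($db_calls lookup)\n";

# When the user changes their greeting name, drop the cached answers.
flush_cache('greeting_for');
$greeting = greeting_for(42);
print "lookups after flush: $db_calls\n";
```

Keep in mind that this cache lives inside one process, so under mod_perl each Apache child would hold its own copy, and nothing expires until it is flushed.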

There are various reasons to use a cache rather than treating the 
session like a cache.  If you put a lot of data in the session, it will 
slow down every hit loading and saving that data.  In a cache, you can 
just keep multiple cached items separately and only grab them if you 
need them for this page.  With a cache you can store things that come 
from the database but are not user-specific, like today's weather.

>What about something like "default page id", which is
>the page that is considered your home page? This id is
>stored permanently in the db ("lasts more than the
>current browsing session") but I keep it in
>the session since it also rarely changes, so I don't
>want to keep hitting the db to get it.
>

I would have some kind of user object which has a property of 
default_page_id.  The first time the user logs in I would fetch that 
from the database, and then I would cache it so that I wouldn't need to 
go back to the database for it on future requests.
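
One way such a user object might look, as a sketch under assumptions: My::User, its methods, and the $lookup and $db_update callbacks are all hypothetical, and a per-process hash stands in for whatever cache module is actually used.

```perl
use strict;
use warnings;

package My::User;   # hypothetical class name

my %user_cache;     # user id => hashref of fields; stand-in for a real cache

# Return a user object, going to the database only on a cache miss.
# $db_lookup is a hypothetical callback returning a hashref of fields.
sub fetch {
    my ($class, $user_id, $db_lookup) = @_;
    $user_cache{$user_id} ||= $db_lookup->($user_id);
    return bless { id => $user_id }, $class;
}

sub default_page_id {
    my ($self) = @_;
    return $user_cache{ $self->{id} }{default_page_id};
}

# Write through: update the database first, then the cached copy.
sub set_default_page_id {
    my ($self, $page_id, $db_update) = @_;
    $db_update->($self->{id}, $page_id);
    $user_cache{ $self->{id} }{default_page_id} = $page_id;
    return $page_id;
}

package main;

my $db_reads = 0;
my %db       = (42 => { name => 'Jim', default_page_id => 7 });
my $lookup   = sub { $db_reads++; return { %{ $db{ $_[0] } } } };

my $user = My::User->fetch(42, $lookup);   # first request: reads the db
$user    = My::User->fetch(42, $lookup);   # later request: cache only
printf "%d after %d db read(s)\n", $user->default_page_id, $db_reads;
```

The session would then need to carry only the user ID; everything else hangs off the user object and can be rebuilt from the database if the cache is lost.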

- Perrin


Re: Apache::Session - What goes in session?

Posted by md <my...@yahoo.com>.
--- Perrin Harkins <pe...@elem.com> wrote:
> md wrote:

> That doesn't sound very global to me.  What happens when users
> open multiple browser windows on your site?  Doesn't it screw
> up the "current page" data?

I don't think "global" was the term I should have
used. What I mean is data that will be seen on all or
most pages by the same user...like "Hello Jim", where
"Jim" is pulled from the database when the session is
created and passed around in the session after that
(and updated in the db and session if user changes
their greeting name). 

Current page name and id are never stored in db, so
different browser windows can be on different
pages...I'm not sure if that's good or bad. However,
changes to the user name will be seen in both browser
windows since that's updated both in the session and
db.
 

> Optimizing database fetches or caching data is independent of
> the session issue.  Nothing that is relevant to more than one
> user should ever go in the session.

Correct. What little info I am putting in the session
corresponds directly to a single user.
 

> That sounds like a "user" or "subscriptions" object
> to me, not session data.

Once again, I shouldn't have used the term "global".
This is the "subscriptions" info for a single
user...that's why I had thought to put this in the
session instead of pulling from the db each page call
since the data will rarely change. This info will be
displayed on every page the user visits (unless they
"turn off" this module).

 
> No, that's caching.  Don't use the session for caching, use a
> cache for it.  They're not the same.  A session is often stored
> in a database so that it can be reliable.  A cache is usually
> stored on the file system so it can be fast.

The session is stored in a database
(Apache::Session::MySQL), and I am using TT caching
for the templates, but I'm not sure how to cache the
non-session data. I've seen this discussed but I
definitely need more info on this. As it stands I see
two options: get data from the session or get it from
the db...how do I bring caching into play?
 
> Things like the login status of this session, and the user ID
> that is associated with it go in the session.  Status of a
> particular page has to be passed in query args or hidden
> fields, to avoid problems with multiple browser windows.  Data
> that applies to multiple users or lasts more than the current
> browsing session never goes in the session.

What about something like "default page id", which is
the page that is considered your home page? This id is
stored permanently in the db ("lasts more than the
current browsing session") but I keep it in
the session since it also rarely changes, so I don't
want to keep hitting the db to get it.

Thanks again...




Re: Apache::Session - What goes in session?

Posted by Perrin Harkins <pe...@elem.com>.
md wrote:
> Currently I'm putting very little in the session

Good.  You should put in as little as possible.

> what I am putting in the session is more "global" in
> nature...greeting, current page number, current page
> name...

That doesn't sound very global to me.  What happens when users open 
multiple browser windows on your site?  Doesn't it screw up the "current 
page" data?

> I'm
> pulling a lot of info from the database and I wonder
> if my design is sound.

Optimizing database fetches or caching data is independent of the 
session issue.  Nothing that is relevant to more than one user should 
ever go in the session.

> Now I need to add global "modules" to the page which
> will show user info like which pages they have created
> and which features are being emailed to the user.
> These modules will display on every page unless the
> user turns them off.

That sounds like a "user" or "subscriptions" object to me, not session data.

> It seems that since this info
> wouldn't change very often that I should put the data
> in the session...

No, that's caching.  Don't use the session for caching, use a cache for 
it.  They're not the same.  A session is often stored in a database so 
that it can be reliable.  A cache is usually stored on the file system 
so it can be fast.
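
To make the file-system side concrete, here is a bare-bones sketch of a file cache built only on core modules. It is an illustration, not a replacement for Cache::Cache or Cache::Mmap; cache_set, cache_get, the key format and the temporary directory are all assumptions.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);
use File::Spec;
use Digest::MD5 qw(md5_hex);

# Throwaway directory for the example; a real cache would use a fixed path.
my $cache_dir = tempdir(CLEANUP => 1);

# Hash the key so that any string maps to a safe file name.
sub cache_path {
    my ($key) = @_;
    return File::Spec->catfile($cache_dir, md5_hex($key));
}

# Store a data structure together with its expiry time (in seconds from now).
sub cache_set {
    my ($key, $data, $ttl) = @_;
    nstore({ expires => time + $ttl, data => $data }, cache_path($key));
    return $data;
}

# Return the cached data, or nothing if it is missing or has expired.
sub cache_get {
    my ($key) = @_;
    my $path = cache_path($key);
    return unless -e $path;
    my $entry = retrieve($path);
    if ($entry->{expires} <= time) {
        unlink $path;
        return;
    }
    return $entry->{data};
}

cache_set('subscriptions:42', [ 'weather', 'stocks' ], 300);
my $subs = cache_get('subscriptions:42');
print join(', ', @$subs), "\n";
```

This sketch does no locking and does not write atomically, which the real cache modules handle for you. A cache like this is also local to one machine, so each server behind a load balancer would build its own copy.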

Things like the login status of this session, and the user ID that is 
associated with it go in the session.  Status of a particular page has 
to be passed in query args or hidden fields, to avoid problems with 
multiple browser windows.  Data that applies to multiple users or lasts 
more than the current browsing session never goes in the session.
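
A small sketch of that division, with hypothetical names throughout (page_link, current_page_id, the page_id argument): the session carries login state and the user ID, while the page being viewed travels in the query string, so two windows can sit on different pages.

```perl
use strict;
use warnings;

# One session per login; it carries only login state and the user ID.
my %session = (logged_in => 1, user_id => 42);

# Build a link that carries the page being viewed in the query string.
sub page_link {
    my ($base_url, $page_id) = @_;
    return sprintf '%s?page_id=%d', $base_url, $page_id;
}

# Read the page id back out of a query string, falling back to the
# user's default page when the argument is missing or malformed.
sub current_page_id {
    my ($query_string, $default_page_id) = @_;
    $query_string = '' unless defined $query_string;
    for my $pair (split /[&;]/, $query_string) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && defined $value;
        return $value if $name eq 'page_id' && $value =~ /\A\d+\z/;
    }
    return $default_page_id;
}

# Two browser windows share %session, but each request names its own page.
my $window_a = current_page_id('page_id=3', 7);
my $window_b = current_page_id('lang=en&page_id=12', 7);
my $no_arg   = current_page_id(undef, 7);
print "$window_a $window_b $no_arg\n";
```

Hidden form fields would work the same way for POST requests; the point is only that per-page state rides along with the request rather than living in the shared session.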

- Perrin


RE: Apache::Session - What goes in session?

Posted by Jesse Erlbaum <je...@erlbaum.net>.
Hello md --

> I'm using mod_perl and Apache::Session on an app that
> is similar to MyYahoo. I found a few bits of info from
> a previous thread, but I'm curious as to what type of
> information should go in the session and what should
> come from the database.

One thing to watch out for is the trap of using session data as a dumping
ground for global variables.  Since you are asking "what belongs in a
session", it seems you are already thinking along those lines.  I have found
that many people who are fond of sessions often use them to store data which
I would be personally inclined to store in hidden form data, in a simple
cookie, or retrieve from a database when needed.

In my systems I usually only store a single "session ID" in a cookie -- a
key which references a database row.  This allows me to have as much data as
I like but keep it all in the database.  There is one case where it might
make sense to put data into a "session" of some sort -- to cache information
which is very time-consuming to retrieve.  Minimizing time-consuming
database operations is an important thing to think about in large systems,
and a place where session data might come in handy.
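
The cookie-as-key arrangement described above could be sketched roughly like this. The cookie name session_id, the helper names, and the hash standing in for the database table are assumptions for illustration only.

```perl
use strict;
use warnings;

# Stand-in for a database table keyed by session ID (assumption).
my %session_rows = ('a1b2c3' => { user_id => 42 });

# Pull the session ID out of a raw Cookie header value.
sub session_id_from_cookie {
    my ($cookie_header) = @_;
    return unless defined $cookie_header;
    for my $pair (split /;\s*/, $cookie_header) {
        my ($name, $value) = split /=/, $pair, 2;
        return $value if defined $name && $name eq 'session_id';
    }
    return;
}

# Look up the row for this request; nothing else is kept in the cookie.
sub session_row {
    my ($cookie_header) = @_;
    my $sid = session_id_from_cookie($cookie_header);
    return unless defined $sid;
    return $session_rows{$sid};
}

my $row = session_row('theme=dark; session_id=a1b2c3');
print "user id: $row->{user_id}\n";
```

An unknown or missing ID yields no row, which a real application would treat as "not logged in".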

Warmest regards,

-Jesse-


--

  Jesse Erlbaum
  The Erlbaum Group
  jesse@erlbaum.net
  Phone: 212-684-6161
  Fax: 212-684-6226