Posted to dev@roller.apache.org by Allen Gilliland <Al...@Sun.COM> on 2005/08/09 23:33:43 UTC

Sharing some stats

I am diverging from the deployments discussion for a second because Elias's comment sparked a question.  I'm interested in anything that anyone wants to share about their Roller installation ...

how many blogs does it have?
what is your performance like?
what are your cache size settings?
how good is your caching efficiency on average?
any numbers on how much activity the site gets? hits/visits? load?
server info?  processors?  ram?  OS?  webserver?  database?
how is stability?  does the server require restarts often?

anything that can be shared would be cool.  i'd like to keep some info on who is running roller and in what kind of environments so that we can hopefully make sure we are keeping roller well suited for various situations.

blogs.sun.com currently has almost 1600 blogs on it and our stability is quite good.  i would give out server info, but i'm not allowed to.  probably our biggest performance concern is page caching, which has gotten worse and worse as more people start blogging.  i think our cache size is 4000 right now and that is plenty for the rss cache, but the page cache is still overwhelmed :/

anyways ... how about others?

-- Allen
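
One rough way to see why a 4000-entry cache can be plenty for feeds yet overwhelmed for pages is a toy LRU cache that counts hits and misses. This is only a sketch, not Roller's cache code: the class name, the key shapes, the uniform request pattern and the "20 page views per blog" figure are all assumptions for illustration. Only the cache size of 4000 and the roughly 1600 blogs come from the message above.

```python
from collections import OrderedDict
import random


class CountingLRUCache:
    """Toy LRU cache that counts hits and misses (hypothetical, not Roller code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, compute):
        if key in self.entries:
            # Mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = compute(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(distinct_keys, capacity=4000, requests=50_000, seed=1):
    """Request uniformly random keys and report the resulting hit rate."""
    rng = random.Random(seed)
    cache = CountingLRUCache(capacity)
    for _ in range(requests):
        key = rng.randrange(distinct_keys)
        cache.get(key, lambda k: f"rendered:{k}")
    return cache.hit_rate()


rss_rate = simulate(distinct_keys=1600)        # assumed: one feed per blog
page_rate = simulate(distinct_keys=1600 * 20)  # assumed: 20 page views per blog
print(f"rss hit rate:  {rss_rate:.2f}")
print(f"page hit rate: {page_rate:.2f}")
```

When the number of distinct keys fits inside the cache, almost every request after warm-up is a hit. Once the key space is several times the capacity, most requests miss, no matter how the eviction is tuned.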


On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> I'm also not an official part of the project, but I might be running
> the second or third largest Roller-based website ;-) and my opinion if
> it counts at all, is that if Dave/Allen can handle the heat in the
> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> them the big bucks.


Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
Perfectly reasonable. I'll work on a design proposal. RollerWiki, right?

Elias

On 8/22/05, Allen Gilliland <Al...@sun.com> wrote:
> On Sun, 2005-08-21 at 19:33, Elias Torres wrote:
> > I'll try to prototype this when I get a chance if you guys tell me
> > you'd be interested in incorporating it.
> 
> we are always interested in useful new features developed by anyone, but if you really want to develop a feature this substantial you should start with a design plan and post it on the list for the other developers to look over and approve before you start too much coding.
> 
> i don't want to sound like a code nazi or anything, but personally i would likely reject pretty much any new code that didn't go through some form of design proposal before being submitted.  i just think this is a necessary quality control process.
> 
> -- Allen
> 
> 
> >
> > Elias
> >
> > On 8/16/05, Lance Lavandowska <la...@gmail.com> wrote:
> > > On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > > Then we could write RewriteRules in Apache that translated these for example:
> > > >
> > > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > > http://www.jroller.com/static-content/fate/general/index.html
> > >
> > > Suggestion: write the static version to the user's resource directory:
> > > http://www.jroller.com/resources/fate/general/index.html
> > >
> > > The only problem with this is that it could interfere with the
> > > maxDirectorySize admin value (eating up space valuable to the user, so
> > > that they cannot upload a file).  Since currently that value only
> > > measures against the "base" resource directory for the user
> > > > (/resources/fate) we can get around this issue for the time being by
> > > > writing all static content to a subdirectory.  This stops working
> > > if/when we allow the user to create subdirectories.
> > >
> > > Lance
> > >
> 
>

Re: Sharing some stats

Posted by Allen Gilliland <Al...@Sun.COM>.
On Sun, 2005-08-21 at 19:33, Elias Torres wrote:
> I'll try to prototype this when I get a chance if you guys tell me
> you'd be interested in incorporating it.

we are always interested in useful new features developed by anyone, but if you really want to develop a feature this substantial you should start with a design plan and post it on the list for the other developers to look over and approve before you start too much coding.

i don't want to sound like a code nazi or anything, but personally i would likely reject pretty much any new code that didn't go through some form of design proposal before being submitted.  i just think this is a necessary quality control process.

-- Allen


> 
> Elias
> 
> On 8/16/05, Lance Lavandowska <la...@gmail.com> wrote:
> > On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > Then we could write RewriteRules in Apache that translated these for example:
> > >
> > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > http://www.jroller.com/static-content/fate/general/index.html
> > 
> > Suggestion: write the static version to the user's resource directory:
> > http://www.jroller.com/resources/fate/general/index.html
> > 
> > The only problem with this is that it could interfere with the
> > maxDirectorySize admin value (eating up space valuable to the user, so
> > that they cannot upload a file).  Since currently that value only
> > measures against the "base" resource directory for the user
> > > (/resources/fate) we can get around this issue for the time being by
> > > writing all static content to a subdirectory.  This stops working
> > if/when we allow the user to create subdirectories.
> > 
> > Lance
> >


Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
wow. I did not know about your move away from OSCache. It tells you
how long I've been away from Roller. I found some post on google
around June 2004. Anyways, I still think that memory is not the way to
go since we keep recomputing pages that might not change for a long
time.

Regarding the library, it might be worth looking into so that we don't
have to depend on Apache. Having an entire blog on disk would totally
satisfy my requirement, no matter how Roller decides to serve it. :-)

Elias

On 8/22/05, Lance Lavandowska <la...@gmail.com> wrote:
> We actually moved away from OSCache due to a memory leak (I hear it's
> since been fixed, but don't know).
> 
> The reason I suggested an alternative file location, is that then we'd
> also have a mechanism by which the user can 'archive' her posts (add a
> "zip up your files for download" option ala Pebble).
> 
> Instead of Apache mod-rewrite, how about using the URLRewrite
> (https://urlrewrite.dev.java.net/) library?
> 
> Lance
> 
> On 8/21/05, Elias Torres <el...@gmail.com> wrote:
> > Lance,
> >
> > I'm not as concerned with the location on disk as I am with the actual
> > URL. The goal is to maintain the same URL pattern on the browser but
> > use instead a static version on disk. This is where RewriteRules come
> > in, but then we would depend on Apache. I think that if we can make
> > this optional, it would be great if we can pull this off.
> > Alternatively, we could explore OSCache disk-based caching so we don't
> > have to maintain an entire user's blog on memory. Also, I repeat, I
> > rather have Apache go straight to a static file, than the J2EE
> > container being hit just to read a file from disk.
> >
> > I'll try to prototype this when I get a chance if you guys tell me
> > you'd be interested in incorporating it.
> >
> > Elias
> >
> > On 8/16/05, Lance Lavandowska <la...@gmail.com> wrote:
> > > On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > > Then we could write RewriteRules in Apache that translated these for example:
> > > >
> > > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > > http://www.jroller.com/static-content/fate/general/index.html
> > >
> > > Suggestion: write the static version to the user's resource directory:
> > > http://www.jroller.com/resources/fate/general/index.html
> > >
> > > The only problem with this is that it could interfere with the
> > > maxDirectorySize admin value (eating up space valuable to the user, so
> > > that they cannot upload a file).  Since currently that value only
> > > measures against the "base" resource directory for the user
> > > > (/resources/fate) we can get around this issue for the time being by
> > > > writing all static content to a subdirectory.  This stops working
> > > if/when we allow the user to create subdirectories.
> > >
> > > Lance
> > >
> >
>

Re: Sharing some stats

Posted by Lance Lavandowska <la...@gmail.com>.
We actually moved away from OSCache due to a memory leak (I hear it's
since been fixed, but don't know).

The reason I suggested an alternative file location is that then we'd
also have a mechanism by which the user can 'archive' her posts (add a
"zip up your files for download" option à la Pebble).

Instead of Apache mod_rewrite, how about using the URLRewrite
(https://urlrewrite.dev.java.net/) library?

Lance

On 8/21/05, Elias Torres <el...@gmail.com> wrote:
> Lance,
> 
> I'm not as concerned with the location on disk as I am with the actual
> URL. The goal is to maintain the same URL pattern on the browser but
> use instead a static version on disk. This is where RewriteRules come
> in, but then we would depend on Apache. I think that if we can make
> this optional, it would be great if we can pull this off.
> Alternatively, we could explore OSCache disk-based caching so we don't
> have to maintain an entire user's blog on memory. Also, I repeat, I
> rather have Apache go straight to a static file, than the J2EE
> container being hit just to read a file from disk.
> 
> I'll try to prototype this when I get a chance if you guys tell me
> you'd be interested in incorporating it.
> 
> Elias
> 
> On 8/16/05, Lance Lavandowska <la...@gmail.com> wrote:
> > On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > Then we could write RewriteRules in Apache that translated these for example:
> > >
> > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > http://www.jroller.com/static-content/fate/general/index.html
> >
> > Suggestion: write the static version to the user's resource directory:
> > http://www.jroller.com/resources/fate/general/index.html
> >
> > The only problem with this is that it could interfere with the
> > maxDirectorySize admin value (eating up space valuable to the user, so
> > that they cannot upload a file).  Since currently that value only
> > measures against the "base" resource directory for the user
> > > (/resources/fate) we can get around this issue for the time being by
> > > writing all static content to a subdirectory.  This stops working
> > if/when we allow the user to create subdirectories.
> >
> > Lance
> >
>

Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
Lance,

I'm not as concerned with the location on disk as I am with the actual
URL. The goal is to maintain the same URL pattern in the browser but
instead serve a static version from disk. This is where RewriteRules come
in, but then we would depend on Apache. If we can make this optional, it
would be great to pull this off.
Alternatively, we could explore OSCache disk-based caching so we don't
have to keep a user's entire blog in memory. Also, I repeat, I'd
rather have Apache go straight to a static file than have the J2EE
container hit just to read a file from disk.

I'll try to prototype this when I get a chance if you guys tell me
you'd be interested in incorporating it.

Elias
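
The URL translation discussed in this thread (keep the public URL, serve a file from disk) can be sketched as a small mapping function. In practice it would be written as Apache RewriteRules or as rules for the URLRewrite library. The Python below only illustrates the logic. The function name, the `static_root` parameter and the lower-casing of the category are assumptions taken from the single example URL pair quoted below, not from any existing Roller code.

```python
from urllib.parse import urlsplit, parse_qs
import posixpath


def static_path_for(url, static_root="/static-content"):
    """Map a dynamic Roller-style page URL to a static file path.

    Returns None when the URL does not match the one pattern handled here,
    so a caller could fall back to the dynamic page.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Expect /page/<handle>/Weblog, as in the example from the thread.
    if len(segments) != 3 or segments[0] != "page" or segments[2] != "Weblog":
        return None
    handle = segments[1]
    categories = parse_qs(parts.query).get("catname")
    if not categories:
        return None
    category = categories[0].lower()
    # Refuse anything that could escape the static root.
    if "/" in category or category in (".", "..") or handle in (".", ".."):
        return None
    return posixpath.join(static_root, handle, category, "index.html")


print(static_path_for("http://www.jroller.com/page/fate/Weblog?catname=General"))
```

Returning None for anything unrecognised keeps the static path optional: requests that do not match simply continue to the dynamic page.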

On 8/16/05, Lance Lavandowska <la...@gmail.com> wrote:
> On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > Then we could write RewriteRules in Apache that translated these for example:
> >
> > http://www.jroller.com/page/fate/Weblog?catname=General into
> > http://www.jroller.com/static-content/fate/general/index.html
> 
> Suggestion: write the static version to the user's resource directory:
> http://www.jroller.com/resources/fate/general/index.html
> 
> The only problem with this is that it could interfere with the
> maxDirectorySize admin value (eating up space valuable to the user, so
> that they cannot upload a file).  Since currently that value only
> measures against the "base" resource directory for the user
> > (/resources/fate) we can get around this issue for the time being by
> > writing all static content to a subdirectory.  This stops working
> if/when we allow the user to create subdirectories.
> 
> Lance
>

Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
On 8/11/05, Matt Raible <mr...@gmail.com> wrote:
> On 8/11/05, Elias Torres <el...@gmail.com> wrote:
> > On 8/11/05, Allen Gilliland <Al...@sun.com> wrote:
> > > excellent, thanks for the info guys.
> > >
> > > yeah ... 9000 blogs is a lot and in truth i'm not even sure any system
> > > could really handle much more than that in a truly dynamic fashion.  rss
> > > feeds are pretty easy to cache because they aren't as complex, but the
> > > pages themselves are more complicated and there are a number of possible
> > > views of the page data which makes caching even harder.
> > >
> > > i am particularly intrigued that you both commented about the use of
> > > static html pages.  i think this would be a great option for Roller and
> > > it would be very cool if we could be pretty sneaky about it and actually
> > > use the same url structure that exists now, but just map to raw html
> > > files on the backend.  we should do some investigations on this, but i
> > > certainly like the idea of having the option to use static html pages.
> > >
> >
> > I'm not sure we would want J2EE container serving files though. Also,
> > just like wordpress we don't have to match the file structure, because
> > we can have a set of URLRewrites to map to whichever structure we
> > please on the filesystem. I mean MovableType, Blogger works this way,
> > why can't we do the same for Roller. We are right now in the process
> > of deciding what our solution will be to replace our existing server,
> > but those limits we see for Roller will hurt us for a company of the
> > size of IBM. Nothing can beat static content performance. What do you
> > guys think?
> 
> I agree that static content performance is good - but we'd likely have
> to generate the static content from the HTML content each time it
> changed.  If we did this, you could literally get 1000s of pages for
> one blog, since there are many different views if you change the date.
>  I'm fine with doing this, but it'd likely result in lots of disk
> space required.
> 
> Matt

I don't think we should worry about disk space, especially when
contrasted with the performance gains we would get from static content.
I highly doubt the entire JRoller content would be more than 1GB of
generated HTML. That's nothing, if being served by Apache via static
files, compared to the machinery that it takes to run it dynamically.
Roller is growing quite rapidly and it needs to perform as more people
become interested in it.
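
The disk space question is easy to put rough numbers on. The sketch below uses the 9000-blog figure reported earlier in this thread and the 1GB guess above. The 20 KB average page size and the function names are assumptions, not measurements, so the results only show how sensitive the estimate is to the number of generated pages per blog.

```python
GIB = 1024 ** 3


def pages_per_blog(total_bytes, blogs, avg_page_bytes):
    """How many static pages each blog could have within a total budget."""
    return total_bytes // (blogs * avg_page_bytes)


def total_bytes_needed(blogs, pages, avg_page_bytes):
    """Disk space needed if every blog generates `pages` static pages."""
    return blogs * pages * avg_page_bytes


blogs = 9000                # blog count reported in this thread
avg_page_bytes = 20 * 1024  # assumed average size of one generated page

# Pages per blog that fit in a 1 GiB budget.
print(pages_per_blog(GIB, blogs, avg_page_bytes))

# GiB needed if each blog really produced 1000 pages.
print(total_bytes_needed(blogs, 1000, avg_page_bytes) / GIB)
```

Under these assumptions the two positions in the thread are far apart, so measuring the real page count and page size for a few blogs would settle it better than either guess.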

> 
> >
> > > Matt, if you are curious about your cache performance then you can turn
> > > up the debugging on the LRUCacheHandler2 class (i believe that's the
> > > right one).  This will flood your logs with lots of messages about cache
> > > hits and misses so make sure and watch your log files sizes, but it'll
> > > give you the info you need.  Another good idea is to turn on garbage
> > > collection debugging messages so you can see how much of your heap you
> > > are using.  With 9000 blogs my guess is that your caches are pretty
> > > overwhelmed, but I would also guess that if you check your the garbage
> > > collection after a Full GC that you probably have a little more room in
> > > your heap to increase the sizes.
> > >
> > > -- Allen
> > >
> > >
> > > Matthew P. Schmidt wrote:
> > >
> > > > I'll share.  We have about 9000 blogs with rapid growth.  Its running
> > > > on one dual xeon, on MySQL and Resin and uses about a 1.6G heap with 3
> > > > 3000 item caches (page, rss, last modified).  I'm not sure how much
> > > > they're actually being used.  Load is generally pretty manageable,
> > > > especially with the latest version of Roller.   As for hits, most of
> > > > it is RSS, with several million hits of that per month.  There are
> > > > also a million or more blog views per month and the server doesn't
> > > > generally have to restart that often.  Before merging our fork with
> > > > Roller 1.2, we were restarting every night due to a memory leak.  Our
> > > > biggest problem is probably the amount of referrer spam, even with a
> > > > healthy blacklist of dirty words.  I think static HTML for the pages
> > > > (which they basically are now if your cache is big enough) and a
> > > > better referrer filter would be two big helpers for us.
> > > > Matthew P. Schmidt
> > > > Vice President of Technology
> > > > Javalobby.org
> > > > Email: matt@javalobby.org
> > > > Phone: 919.678.0300
> > > >
> > > >
> > > >
> > > > Elias Torres wrote:
> > > >
> > > >> We have also around 1800 blogs and it's growing rapidly. Also, around
> > > >> 12K people make use of the system in total and this we know because we
> > > >> don't allow anonymous comments. You need to be authenticated for
> > > >> someone to comment/post.
> > > >>
> > > >> I wonder why you are not allowed to give out server info. Maybe I'll
> > > >> hold off on that too for now.
> > > >>
> > > >> I'm sure others have asked this before, but is there a plan of turning
> > > >> Roller blogs into static HTML? I'd be interested in hearing your
> > > >> thoughts on this. I'm sure this would alleviate many of the caching
> > > >> performance problems.
> > > >>
> > > >> Elias
> > > >>
> > > >> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> > > >>
> > > >>
> > > >>> I am diverging from the deployments discussion for a second because
> > > >>> Elias comment sparked a question.  I'm interested in anything that
> > > >>> anyone wants to share about their roller installation ...
> > > >>>
> > > >>> how many blogs does it have?
> > > >>> what is your performance like?
> > > >>> what are your cache size settings?
> > > >>> how good is your caching efficiency on average?
> > > >>> any numbers on how much activity the site gets? hits/visits? load?
> > > >>> server info?  processors?  ram?  OS?  webserver?  database?
> > > >>> how is stability?  does the server require restarts often?
> > > >>>
> > > >>> anything that can be shared would be cool.  i'd like to keep some
> > > >>> info on who is running roller and in what kind of environments so
> > > >>> that we can hopefully make sure we are keeping roller well suited
> > > >>> for various situations.
> > > >>>
> > > >>> blogs.sun.com currently has almost 1600 blogs on it and our
> > > >>> stability is quite good.  i would give out server info, but i'm not
> > > >>> allowed to.  probably our biggest performance concern is page
> > > >>> caching, which has gotten worse and worse as more people start
> > > >>> blogging.  i think out cache size is 4000 right now and that is
> > > >>> plenty for the rss cache, but the page cache is still overwhelmed :/
> > > >>>
> > > >>> anyways ... how about others?
> > > >>>
> > > >>> -- Allen
> > > >>>
> > > >>>
> > > >>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> > > >>>
> > > >>>
> > > >>>> I'm also not an official part of the project, but I might be running
> > > >>>> the second or third largest Roller-based website ;-) and my opinion if
> > > >>>> it counts at all, is that if Dave/Allen can handle the heat in the
> > > >>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> > > >>>> them the big bucks.
> > > >>>>
> > > >>>
> > > >>>
> > > >>
> > >
> >
>

Re: Sharing some stats

Posted by Matt Raible <mr...@gmail.com>.
On 8/11/05, Elias Torres <el...@gmail.com> wrote:
> On 8/11/05, Allen Gilliland <Al...@sun.com> wrote:
> > excellent, thanks for the info guys.
> >
> > yeah ... 9000 blogs is a lot and in truth i'm not even sure any system
> > could really handle much more than that in a truly dynamic fashion.  rss
> > feeds are pretty easy to cache because they aren't as complex, but the
> > pages themselves are more complicated and there are a number of possible
> > views of the page data which makes caching even harder.
> >
> > i am particularly intrigued that you both commented about the use of
> > static html pages.  i think this would be a great option for Roller and
> > it would be very cool if we could be pretty sneaky about it and actually
> > use the same url structure that exists now, but just map to raw html
> > files on the backend.  we should do some investigations on this, but i
> > certainly like the idea of having the option to use static html pages.
> >
> 
> I'm not sure we would want J2EE container serving files though. Also,
> just like wordpress we don't have to match the file structure, because
> we can have a set of URLRewrites to map to whichever structure we
> please on the filesystem. I mean MovableType, Blogger works this way,
> why can't we do the same for Roller. We are right now in the process
> of deciding what our solution will be to replace our existing server,
> but those limits we see for Roller will hurt us for a company of the
> size of IBM. Nothing can beat static content performance. What do you
> guys think?

I agree that static content performance is good - but we'd likely have
to generate the static content from the HTML content each time it
changed.  If we did this, you could literally get 1000s of pages for
one blog, since there are many different views if you change the date.
I'm fine with doing this, but it'd likely require lots of disk
space.

Matt

> 
> > Matt, if you are curious about your cache performance then you can turn
> > up the debugging on the LRUCacheHandler2 class (i believe that's the
> > right one).  This will flood your logs with lots of messages about cache
> > hits and misses so make sure and watch your log files sizes, but it'll
> > give you the info you need.  Another good idea is to turn on garbage
> > collection debugging messages so you can see how much of your heap you
> > are using.  With 9000 blogs my guess is that your caches are pretty
> > overwhelmed, but I would also guess that if you check your the garbage
> > collection after a Full GC that you probably have a little more room in
> > your heap to increase the sizes.
> >
> > -- Allen
> >
> >
> > Matthew P. Schmidt wrote:
> >
> > > I'll share.  We have about 9000 blogs with rapid growth.  Its running
> > > on one dual xeon, on MySQL and Resin and uses about a 1.6G heap with 3
> > > 3000 item caches (page, rss, last modified).  I'm not sure how much
> > > they're actually being used.  Load is generally pretty manageable,
> > > especially with the latest version of Roller.   As for hits, most of
> > > it is RSS, with several million hits of that per month.  There are
> > > also a million or more blog views per month and the server doesn't
> > > generally have to restart that often.  Before merging our fork with
> > > Roller 1.2, we were restarting every night due to a memory leak.  Our
> > > biggest problem is probably the amount of referrer spam, even with a
> > > healthy blacklist of dirty words.  I think static HTML for the pages
> > > (which they basically are now if your cache is big enough) and a
> > > better referrer filter would be two big helpers for us.
> > > Matthew P. Schmidt
> > > Vice President of Technology
> > > Javalobby.org
> > > Email: matt@javalobby.org
> > > Phone: 919.678.0300
> > >
> > >
> > >
> > > Elias Torres wrote:
> > >
> > >> We have also around 1800 blogs and it's growing rapidly. Also, around
> > >> 12K people make use of the system in total and this we know because we
> > >> don't allow anonymous comments. You need to be authenticated for
> > >> someone to comment/post.
> > >>
> > >> I wonder why you are not allowed to give out server info. Maybe I'll
> > >> hold off on that too for now.
> > >>
> > >> I'm sure others have asked this before, but is there a plan of turning
> > >> Roller blogs into static HTML? I'd be interested in hearing your
> > >> thoughts on this. I'm sure this would alleviate many of the caching
> > >> performance problems.
> > >>
> > >> Elias
> > >>
> > >> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> > >>
> > >>
> > >>> I am diverging from the deployments discussion for a second because
> > >>> Elias comment sparked a question.  I'm interested in anything that
> > >>> anyone wants to share about their roller installation ...
> > >>>
> > >>> how many blogs does it have?
> > >>> what is your performance like?
> > >>> what are your cache size settings?
> > >>> how good is your caching efficiency on average?
> > >>> any numbers on how much activity the site gets? hits/visits? load?
> > >>> server info?  processors?  ram?  OS?  webserver?  database?
> > >>> how is stability?  does the server require restarts often?
> > >>>
> > >>> anything that can be shared would be cool.  i'd like to keep some
> > >>> info on who is running roller and in what kind of environments so
> > >>> that we can hopefully make sure we are keeping roller well suited
> > >>> for various situations.
> > >>>
> > >>> blogs.sun.com currently has almost 1600 blogs on it and our
> > >>> stability is quite good.  i would give out server info, but i'm not
> > >>> allowed to.  probably our biggest performance concern is page
> > >>> caching, which has gotten worse and worse as more people start
> > >>> blogging.  i think out cache size is 4000 right now and that is
> > >>> plenty for the rss cache, but the page cache is still overwhelmed :/
> > >>>
> > >>> anyways ... how about others?
> > >>>
> > >>> -- Allen
> > >>>
> > >>>
> > >>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> > >>>
> > >>>
> > >>>> I'm also not an official part of the project, but I might be running
> > >>>> the second or third largest Roller-based website ;-) and my opinion if
> > >>>> it counts at all, is that if Dave/Allen can handle the heat in the
> > >>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> > >>>> them the big bucks.
> > >>>>
> > >>>
> > >>>
> > >>
> >
>

Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
Trygve,

I'm not sure if you are familiar with the way that Roller is
implemented today. Just in case, it already uses a memory/disk caching
system written by OpenSymphony called OSCache. I'm not sure what the
differences are between it and what you are suggesting. But I do know
that using a memory caching system is already not the optimal
solution, as indicated by the fellow who runs JRoller.com.

Elias

On 8/18/05, Trygve Lie <tr...@hotmail.com> wrote:
> Hi
> 
> I do not think static content is the way to go. In my experience it will
> introduce a lot of new issues.
> At the very least, if static content is introduced, it must be possible to
> turn it on or off at different levels (not just on or off for a whole
> site).
> 
> But I would much rather see an implementation with Memcached:
> http://www.danga.com/memcached/
> 
> I work in a company which maintains and develops the largest media network
> in Norway and we have a quite large scale CMS running all our media
> services. A while ago we had the same problem with caching as described
> here. Turning on static caching was not an option for us.
> For us Memcached was the definitive solution to a lot of our performance
> problems.
> Now we can just put in a blade server with a lot of memory if we need more
> cache...
> 
> In our network an implementation of Memcached for Hibernate has been
> done. This would probably work for Roller also. I can see if I can get
> this code made available to the public....
> 
> Trygve
> 
> 
> >From: Lance Lavandowska <la...@gmail.com>
> >Reply-To: roller-dev@incubator.apache.org
> >To: roller-dev@incubator.apache.org, elias@torrez.us
> >Subject: Re: Sharing some stats
> >Date: Tue, 16 Aug 2005 12:02:33 -0500
> >
> >On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > Then we could write RewriteRules in Apache that translated these for
> >example:
> > >
> > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > http://www.jroller.com/static-content/fate/general/index.html
> >
> >Suggestion: write the static version to the user's resource directory:
> >http://www.jroller.com/resources/fate/general/index.html
> >
> >The only problem with this is that it could interfere with the
> >maxDirectorySize admin value (eating up space valuable to the user, so
> >that they cannot upload a file).  Since currently that value only
> >measures against the "base" resource directory for the user
> >(/resources/fate) we can get around this issue for the time being by
> >writing all static content to a subdirectory.  This stops working
> >if/when we allow the user to create subdirectories.
> >
> >Lance
> 
> _________________________________________________________________
> MSN Search http://search.msn.no/ Raskere. Rett på sak. Mer presist.
> 
>

Re: Sharing some stats

Posted by Trygve Lie <tr...@hotmail.com>.
Hi

Sorry for the late reply.

Great to see you like it and that you are wondering about implementing it
:)

I'm very aware of Roller's use of OSCache (which I'm also familiar with) but
memcached works in another way.
Let me explain for those of you who have not looked at it. First of all it's
distributed memory, and second it's not a Java application. In other words
you can start memcached on as many servers (wherever you have free memory) as
you want and it will act as one huge pool of memory for the application.
First of all you end up with the possibility of a practically unlimited
amount of memory, and this is a lightning fast way to cache things. Another
advantage is that the cached objects are moved out of the Java VM. Java just
connects to it, and this is a huge benefit if the Java EE server process dies
(not the entire machine).
If the Java EE server dies all the cached objects will still be in memcached,
and when the Java EE server is up and running again it will just have the
cached objects there again. It does not need to pull everything from the DB
again, it's just there :)
Even if a whole machine dies, including one of the servers allocating memory
to memcached, only part of the cached objects will be lost because the cache
is distributed.
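
The "only part of the cache is lost" property can be shown with a toy model in which each key lives on exactly one of several cache servers. This is a simulation with plain dictionaries and no network. It is not memcached or any client library, and the class name, server names and hashing scheme are assumptions for illustration. Real memcached clients also choose a server per key on the client side, though usually with a more careful hashing scheme than this one.

```python
import hashlib


class ShardedCache:
    """Toy stand-in for a pool of cache servers (plain dicts, no network)."""

    def __init__(self, server_names):
        self.servers = {name: {} for name in server_names}

    def _server_for(self, key):
        # Hash the key and use the result to pick one server from the pool.
        names = sorted(self.servers)
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return names[int.from_bytes(digest[:4], "big") % len(names)]

    def set(self, key, value):
        self.servers[self._server_for(key)][key] = value

    def get(self, key):
        return self.servers[self._server_for(key)].get(key)

    def crash(self, name):
        """Simulate one cache server restarting with empty memory."""
        self.servers[name].clear()


cache = ShardedCache(["cache-a", "cache-b", "cache-c"])
keys = [f"page:{i}" for i in range(3000)]
for key in keys:
    cache.set(key, "rendered html")

cache.crash("cache-b")
survivors = sum(1 for key in keys if cache.get(key) is not None)
print(f"{survivors} of {len(keys)} entries survived")
```

Because the entries live outside the application process, restarting the application in this model loses nothing, and losing one of three cache servers loses roughly a third of the entries rather than all of them.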

I find this much more interesting in a large scale installation than a
static content function, but static content is nice for smaller sites which
do not have access to the kind of hardware that larger installations might
have.

Anyway, very nice to see you do find this interesting :)

Just to change topic: I'll be starting to look at an integration of
FCKEditor (http://www.fckeditor.net/) into Roller. I think it should be an
easy task and a nice start for me to be able to contribute. Is there
anybody else looking at this?
This does belong in a new thread...

PS: My last post ended up at the root in the Apache incubator archive. If
this one does also, what am I doing wrong? I'm just replying to the post I
want to reply to and the reply address is roller-dev@incubator.apache.org

Kind regards
Trygve



>From: Allen Gilliland <Al...@Sun.COM>
>Reply-To: roller-dev@incubator.apache.org
>To: roller-dev <ro...@incubator.apache.org>
>CC: elias@torrez.us
>Subject: Re: Sharing some stats
>Date: Wed, 24 Aug 2005 14:46:50 -0700
>
>I spent a little time playing with memcached today and I must say that I
>rather like it.  The setup was pretty easy and depending on how we would
>want to use it I think this could be a nice little feature.
>
>Rather than bother with trying to integrate this into the backend via 
>hibernate I instead decided to just apply it to the page level cache.  
>There are a number of things in the presentation layer caching setup that I 
>think can be cleaned up to make this easier, but so far it hasn't been too 
>bad.
>
>It seems to me like we probably don't need to use memcached for more than
>just the presentation level caching, which means we should be able to keep
>our implementation pretty simple.
>
>thanks for making this suggestion Trygve.
>
>-- Allen
>
>
>On Thu, 2005-08-18 at 00:52, Trygve Lie wrote:
> > Hi
> >
> > I do not think static content is the way to go. In my experience it will
> > introduce a lot of new issues.
> > At least, if the possibility of static content is introduced, it must be
> > possible to turn it on or off at different levels (not just on or off
> > for a whole site).
> >
> > But I would much rather see an implementation with Memcached:
> > http://www.danga.com/memcached/
> >
> > I work in a company which maintains and develops the largest media
> > network in Norway and we have a quite large-scale CMS running all our
> > media services. A while ago we had the same problem with caching as
> > described here. Turning on static caching was not an option for us.
> > For us Memcached was the definitive solution to a lot of our performance
> > problems.
> > Now we can just put in a blade server with a lot of memory if we need
> > more cache...
> >
> > In our network an implementation of Memcached for Hibernate has been
> > done. This would probably work for Roller also. I can see if I can get
> > this code available to the public....
> >
> > Trygve
> >
> >
> > >From: Lance Lavandowska <la...@gmail.com>
> > >Reply-To: roller-dev@incubator.apache.org
> > >To: roller-dev@incubator.apache.org, elias@torrez.us
> > >Subject: Re: Sharing some stats
> > >Date: Tue, 16 Aug 2005 12:02:33 -0500
> > >
> > >On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > > Then we could write RewriteRules in Apache that translated these for
> > >example:
> > > >
> > > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > > http://www.jroller.com/static-content/fate/general/index.html
> > >
> > >Suggestion: write the static version to the user's resource directory:
> > >http://www.jroller.com/resources/fate/general/index.html
> > >
> > >The only problem with this is that it could interfere with the
> > >maxDirectorySize admin value (eating up space valuable to the user, so
> > >that they cannot upload a file).  Since currently that value only
> > >measures against the "base" resource directory for the user
> > >(/resources/fate) we can get around this issue for the time being by
> > >writing all static content to a subdirectory.  This stops working
> > >if/when we allow the user to create subdirectories.
> > >
> > >Lance
> >
> >
>



Re: Sharing some stats

Posted by Allen Gilliland <Al...@Sun.COM>.
I spent a little time playing with memcached today and I must say that I rather like it.  The setup was pretty easy and, depending on how we would want to use it, I think this could be a nice little feature.

Rather than bother with trying to integrate this into the backend via hibernate I instead decided to just apply it to the page level cache.  There are a number of things in the presentation layer caching setup that I think can be cleaned up to make this easier, but so far it hasn't been too bad.

It seems to me like we probably don't need to use memcached for more than just the presentation level caching, which means we should be able to keep our implementation pretty simple.
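
As a rough illustration of caching at the page level only, here is a minimal Java sketch of the cache-aside pattern: look the rendered page up by URL, render it only on a miss, then store the result. The `PageStore` interface, the `MapPageStore` stand-in and the renderer function are all hypothetical names invented for this sketch; a memcached-backed store would sit behind the same interface, and none of this is Roller's actual caching code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside at the page level: render only on a miss, then store the result.
class PageCacheSketch {
    private final PageStore store;
    private final Function<String, String> renderer;

    PageCacheSketch(PageStore store, Function<String, String> renderer) {
        this.store = store;
        this.renderer = renderer;
    }

    String getPage(String url) {
        String cached = store.get(url);
        if (cached != null) {
            return cached;                 // cache hit: no rendering, no DB work
        }
        String html = renderer.apply(url); // cache miss: render the page once
        store.put(url, html);
        return html;
    }

    public static void main(String[] args) {
        PageCacheSketch cache =
                new PageCacheSketch(new MapPageStore(), url -> "<html>" + url + "</html>");
        System.out.println(cache.getPage("/page/fate")); // rendered
        System.out.println(cache.getPage("/page/fate")); // served from the store
    }
}

// Hypothetical store abstraction; a memcached-backed version would implement it.
interface PageStore {
    String get(String key);
    void put(String key, String html);
}

// In-memory stand-in so the sketch runs without a cache server.
class MapPageStore implements PageStore {
    private final Map<String, String> pages = new HashMap<>();
    public String get(String key) { return pages.get(key); }
    public void put(String key, String html) { pages.put(key, html); }
}
```

The sketch leaves out invalidation: entries would have to be dropped or overwritten whenever a post or comment changes the page.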

thanks for making this suggestion Trygve.

-- Allen


On Thu, 2005-08-18 at 00:52, Trygve Lie wrote:
> Hi
> 
> I do not think static content is the way to go. In my experience it will 
> introduce a lot of new issues.
> At least, if the possibility of static content is introduced, it must be 
> possible to turn it on or off at different levels (not just on or off for 
> a whole site).
> 
> But I would much rather see an implementation with Memcached: 
> http://www.danga.com/memcached/
> 
> I work in a company which maintains and develops the largest media network 
> in Norway and we have a quite large-scale CMS running all our media 
> services. A while ago we had the same problem with caching as described 
> here. Turning on static caching was not an option for us.
> For us Memcached was the definitive solution to a lot of our performance 
> problems.
> Now we can just put in a blade server with a lot of memory if we need more 
> cache...
> 
> In our network an implementation of Memcached for Hibernate has been 
> done. This would probably work for Roller also. I can see if I can get 
> this code available to the public....
> 
> Trygve
> 
> 
> >From: Lance Lavandowska <la...@gmail.com>
> >Reply-To: roller-dev@incubator.apache.org
> >To: roller-dev@incubator.apache.org, elias@torrez.us
> >Subject: Re: Sharing some stats
> >Date: Tue, 16 Aug 2005 12:02:33 -0500
> >
> >On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > > Then we could write RewriteRules in Apache that translated these for 
> >example:
> > >
> > > http://www.jroller.com/page/fate/Weblog?catname=General into
> > > http://www.jroller.com/static-content/fate/general/index.html
> >
> >Suggestion: write the static version to the user's resource directory:
> >http://www.jroller.com/resources/fate/general/index.html
> >
> >The only problem with this is that it could interfere with the
> >maxDirectorySize admin value (eating up space valuable to the user, so
> >that they cannot upload a file).  Since currently that value only
> >measures against the "base" resource directory for the user
> >(/resources/fate) we can get around this issue for the time being by
> >writing all static content to a subdirectory.  This stops working
> >if/when we allow the user to create subdirectories.
> >
> >Lance
> 
> 


Re: Sharing some stats

Posted by Trygve Lie <tr...@hotmail.com>.
Hi

I do not think static content is the way to go. In my experience it will 
introduce a lot of new issues.
At least, if the possibility of static content is introduced, it must be 
possible to turn it on or off at different levels (not just on or off for 
a whole site).

But I would much rather see an implementation with Memcached: 
http://www.danga.com/memcached/

I work in a company which maintains and develops the largest media network 
in Norway and we have a quite large-scale CMS running all our media 
services. A while ago we had the same problem with caching as described 
here. Turning on static caching was not an option for us.
For us Memcached was the definitive solution to a lot of our performance 
problems.
Now we can just put in a blade server with a lot of memory if we need more 
cache...

In our network an implementation of Memcached for Hibernate has been 
done. This would probably work for Roller also. I can see if I can get 
this code available to the public....

Trygve


>From: Lance Lavandowska <la...@gmail.com>
>Reply-To: roller-dev@incubator.apache.org
>To: roller-dev@incubator.apache.org, elias@torrez.us
>Subject: Re: Sharing some stats
>Date: Tue, 16 Aug 2005 12:02:33 -0500
>
>On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> > Then we could write RewriteRules in Apache that translated these for 
>example:
> >
> > http://www.jroller.com/page/fate/Weblog?catname=General into
> > http://www.jroller.com/static-content/fate/general/index.html
>
>Suggestion: write the static version to the user's resource directory:
>http://www.jroller.com/resources/fate/general/index.html
>
>The only problem with this is that it could interfere with the
>maxDirectorySize admin value (eating up space valuable to the user, so
>that they cannot upload a file).  Since currently that value only
>measures against the "base" resource directory for the user
>(/resources/fate) we can get around this issue for the time being by
>writing all static content to a subdirectory.  This stops working
>if/when we allow the user to create subdirectories.
>
>Lance



Re: Sharing some stats

Posted by Lance Lavandowska <la...@gmail.com>.
On 8/16/05, Elias Torres <el...@gmail.com> wrote:
> Then we could write RewriteRules in Apache that translated these for example:
> 
> http://www.jroller.com/page/fate/Weblog?catname=General into
> http://www.jroller.com/static-content/fate/general/index.html

Suggestion: write the static version to the user's resource directory:
http://www.jroller.com/resources/fate/general/index.html

The only problem with this is that it could interfere with the
maxDirectorySize admin value (eating up space valuable to the user, so
that they cannot upload a file).  Since currently that value only
measures against the "base" resource directory for the user
(/resources/fate) we can get around this issue for the time being by
writing all static content to a subdirectory.  This stops working
if/when we allow the user to create subdirectories.

Lance

Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
The URL rewriting that I'm talking about is used more with WordPress.
WordPress has an index.php that takes parameters for everything, like
year, monthnum, day, feed, etc. What they do is provide a
sample .htaccess file that allows the user to have nice URLs like
"/archives/2005/06/05/123" through a rewrite rule like the following:

RewriteRule ^archives/([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/([0-9]+)?$ /index.php?year=$1&monthnum=$2&day=$3&p=$4 [QSA,L]

This is practically what Roller does in the function that parses the
path info from the RollerServlet (I'm not 100% sure where this is, but I
know such a function exists). Now, my intention is not to make Apache a
requirement, but an option. Roller already has a nice URL pattern,
so this would not be too hard to map to static content. From a
quick glance at it, it would be a matter of creating a "theme" that
supported static publication. There are four types of pages: the main
weblog page, the category page, the day page and the entry page.
Every time a person makes a post/comment, we can regenerate the pages
affected by the change and put them into another directory structure.
Then, using URL rewrites, we can make them look like the normal Roller
J2EE deployment.

For example,

If we had:

/roller
+ /username
 + index.html <- This is the normal homepage
 + /archives/20050605.html <- A day's worth of entries
 + /entries/page_slug.html <- An entry
 + /categoryname/index.html <- The generated page with recent entries in that category
 + /rss/index.atom <- Atom feed
 + /rss/categoryname/index.atom <- The Java Atom feed, etc.

Then we could write RewriteRules in Apache that translated these for example:

http://www.jroller.com/page/fate/Weblog?catname=General into
http://www.jroller.com/static-content/fate/general/index.html

http://www.jroller.com/page/fate?entry=another_anniversary into
http://www.jroller.com/static-content/fate/entries/another_anniversary.html

and so on.
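
For anyone who would rather do this mapping inside the Java webapp than in Apache, the two example translations above could be sketched roughly as follows. This is only an illustration: the `StaticPathSketch` class and `toStaticPath` method are hypothetical, the patterns cover just the two URL shapes shown, and lowercasing the category name is an assumption read off the General/general example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: translate the two dynamic URL shapes from the examples above
// into paths under a static-content tree. Returns null when nothing matches.
class StaticPathSketch {
    // Names are limited to word characters and '-' so that a request
    // cannot smuggle "../" into the generated file path.
    private static final Pattern CATEGORY =
            Pattern.compile("^/page/([\\w-]+)/Weblog\\?catname=([\\w-]+)$");
    private static final Pattern ENTRY =
            Pattern.compile("^/page/([\\w-]+)\\?entry=([\\w-]+)$");

    static String toStaticPath(String pathAndQuery) {
        Matcher m = CATEGORY.matcher(pathAndQuery);
        if (m.matches()) {
            return "/static-content/" + m.group(1) + "/"
                    + m.group(2).toLowerCase(Locale.ROOT) + "/index.html";
        }
        m = ENTRY.matcher(pathAndQuery);
        if (m.matches()) {
            return "/static-content/" + m.group(1) + "/entries/" + m.group(2) + ".html";
        }
        return null; // no static equivalent: fall through to the dynamic page
    }

    public static void main(String[] args) {
        System.out.println(toStaticPath("/page/fate/Weblog?catname=General"));
        System.out.println(toStaticPath("/page/fate?entry=another_anniversary"));
    }
}
```

The same function could be called from a servlet filter that forwards to the static file when the result is not null, which keeps Apache optional.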

Of course, all the servlet endpoints would stay the same, like posting
comments, trackbacks and so forth. Most of the work would have to be on
the static content generation and checking which Velocity macros
actually need HTTPRequest information. Most of them should work,
because pages are getting cached already anyway, so it's not like
those macros depend on being called every time, and besides, the
URLs would look the same to them. All we would need to do is fake the
HTTPRequest when running the Velocity templates.

Let me know what you think,

Elias

On 8/15/05, Allen Gilliland <Al...@sun.com> wrote:
> On Thu, 2005-08-11 at 14:13, Elias Torres wrote:
> > I'm not sure we would want J2EE container serving files though. Also,
> > just like wordpress we don't have to match the file structure, because
> > we can have a set of URLRewrites to map to whichever structure we
> > please on the filesystem. I mean MovableType, Blogger works this way,
> > why can't we do the same for Roller. We are right now in the process
> > of deciding what our solution will be to replace our existing server,
> > but those limits we see for Roller will hurt us for a company of the
> > size of IBM. Nothing can beat static content performance. What do you
> > guys think?
> 
> I'd like to hear more about how you see the static content being served up.  I've used MT quite a bit, but never with any url rewriting.
> 
> Personally, I still think that by default we would want to make the whole system run in a pure j2ee container.  Remember that while some of us run very large Roller installations, most people will likely have very moderate sized sites.  In those cases a pure j2ee container should be fine.
> 
> I would also not be a fan of requiring another piece of software just to make a given feature work.  i.e. to *require* Apache to use the static content would be lame in my opinion.  I think that static content should be disabled by default, yet easily enabled via the Roller config.  Once it's enabled the site owner should have a few options on how to use it.  namely, run it in their current j2ee container with some servlets/filters doing url mapping.  or if they want, they can disable the built in url mapping and implement their own way of doing the url rewriting via apache, or anything else.
> 
> -- Allen
> 
> 
> >
> > > Matt, if you are curious about your cache performance then you can turn
> > > up the debugging on the LRUCacheHandler2 class (i believe that's the
> > > right one).  This will flood your logs with lots of messages about cache
> > > hits and misses so make sure and watch your log files sizes, but it'll
> > > give you the info you need.  Another good idea is to turn on garbage
> > > collection debugging messages so you can see how much of your heap you
> > > are using.  With 9000 blogs my guess is that your caches are pretty
> > > overwhelmed, but I would also guess that if you check your the garbage
> > > collection after a Full GC that you probably have a little more room in
> > > your heap to increase the sizes.
> > >
> > > -- Allen
> > >
> > >
> > > Matthew P. Schmidt wrote:
> > >
> > > > I'll share.  We have about 9000 blogs with rapid growth.  Its running
> > > > on one dual xeon, on MySQL and Resin and uses about a 1.6G heap with 3
> > > > 3000 item caches (page, rss, last modified).  I'm not sure how much
> > > > they're actually being used.  Load is generally pretty manageable,
> > > > especially with the latest version of Roller.   As for hits, most of
> > > > it is RSS, with several million hits of that per month.  There are
> > > > also a million or more blog views per month and the server doesn't
> > > > generally have to restart that often.  Before merging our fork with
> > > > Roller 1.2, we were restarting every night due to a memory leak.  Our
> > > > biggest problem is probably the amount of referrer spam, even with a
> > > > healthy blacklist of dirty words.  I think static HTML for the pages
> > > > (which they basically are now if your cache is big enough) and a
> > > > better referrer filter would be two big helpers for us.
> > > > Matthew P. Schmidt
> > > > Vice President of Technology
> > > > Javalobby.org
> > > > Email: matt@javalobby.org
> > > > Phone: 919.678.0300
> > > >
> > > >
> > > >
> > > > Elias Torres wrote:
> > > >
> > > >> We have also around 1800 blogs and it's growing rapidly. Also, around
> > > >> 12K people make use of the system in total and this we know because we
> > > >> don't allow anonymous comments. You need to be authenticated for
> > > >> someone to comment/post.
> > > >>
> > > >> I wonder why you are not allowed to give out server info. Maybe I'll
> > > >> hold off on that too for now.
> > > >>
> > > >> I'm sure others have asked this before, but is there a plan of turning
> > > >> Roller blogs into static HTML? I'd be interested in hearing your
> > > >> thoughts on this. I'm sure this would alleviate many of the caching
> > > >> performance problems.
> > > >>
> > > >> Elias
> > > >>
> > > >> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> > > >>
> > > >>
> > > >>> I am diverging from the deployments discussion for a second because
> > > >>> Elias comment sparked a question.  I'm interested in anything that
> > > >>> anyone wants to share about their roller installation ...
> > > >>>
> > > >>> how many blogs does it have?
> > > >>> what is your performance like?
> > > >>> what are your cache size settings?
> > > >>> how good is your caching efficiency on average?
> > > >>> any numbers on how much activity the site gets? hits/visits? load?
> > > >>> server info?  processors?  ram?  OS?  webserver?  database?
> > > >>> how is stability?  does the server require restarts often?
> > > >>>
> > > >>> anything that can be shared would be cool.  i'd like to keep some
> > > >>> info on who is running roller and in what kind of environments so
> > > >>> that we can hopefully make sure we are keeping roller well suited
> > > >>> for various situations.
> > > >>>
> > > >>> blogs.sun.com currently has almost 1600 blogs on it and our
> > > >>> stability is quite good.  i would give out server info, but i'm not
> > > >>> allowed to.  probably our biggest performance concern is page
> > > >>> caching, which has gotten worse and worse as more people start
> > > >>> blogging.  i think out cache size is 4000 right now and that is
> > > >>> plenty for the rss cache, but the page cache is still overwhelmed :/
> > > >>>
> > > >>> anyways ... how about others?
> > > >>>
> > > >>> -- Allen
> > > >>>
> > > >>>
> > > >>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> > > >>>
> > > >>>
> > > >>>> I'm also not an official part of the project, but I might be running
> > > >>>> the second or third largest Roller-based website ;-) and my opinion if
> > > >>>> it counts at all, is that if Dave/Allen can handle the heat in the
> > > >>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> > > >>>> them the big bucks.
> > > >>>>
> > > >>>
> > > >>>
> > > >>
> > >
> 
>

Re: Sharing some stats

Posted by Allen Gilliland <Al...@Sun.COM>.
On Thu, 2005-08-11 at 14:13, Elias Torres wrote:
> I'm not sure we would want J2EE container serving files though. Also,
> just like wordpress we don't have to match the file structure, because
> we can have a set of URLRewrites to map to whichever structure we
> please on the filesystem. I mean MovableType, Blogger works this way,
> why can't we do the same for Roller. We are right now in the process
> of deciding what our solution will be to replace our existing server,
> but those limits we see for Roller will hurt us for a company of the
> size of IBM. Nothing can beat static content performance. What do you
> guys think?

I'd like to hear more about how you see the static content being served up.  I've used MT quite a bit, but never with any url rewriting.

Personally, I still think that by default we would want to make the whole system run in a pure j2ee container.  Remember that while some of us run very large Roller installations, most people will likely have very moderate sized sites.  In those cases a pure j2ee container should be fine.

I would also not be a fan of requiring another piece of software just to make a given feature work.  i.e. to *require* Apache to use the static content would be lame in my opinion.  I think that static content should be disabled by default, yet easily enabled via the Roller config.  Once it's enabled the site owner should have a few options on how to use it.  namely, run it in their current j2ee container with some servlets/filters doing url mapping.  or if they want, they can disable the built in url mapping and implement their own way of doing the url rewriting via apache, or anything else.

-- Allen


> 
> > Matt, if you are curious about your cache performance then you can turn
> > up the debugging on the LRUCacheHandler2 class (i believe that's the
> > right one).  This will flood your logs with lots of messages about cache
> > hits and misses so make sure and watch your log files sizes, but it'll
> > give you the info you need.  Another good idea is to turn on garbage
> > collection debugging messages so you can see how much of your heap you
> > are using.  With 9000 blogs my guess is that your caches are pretty
> > overwhelmed, but I would also guess that if you check your the garbage
> > collection after a Full GC that you probably have a little more room in
> > your heap to increase the sizes.
> > 
> > -- Allen
> > 
> > 
> > Matthew P. Schmidt wrote:
> > 
> > > I'll share.  We have about 9000 blogs with rapid growth.  Its running
> > > on one dual xeon, on MySQL and Resin and uses about a 1.6G heap with 3
> > > 3000 item caches (page, rss, last modified).  I'm not sure how much
> > > they're actually being used.  Load is generally pretty manageable,
> > > especially with the latest version of Roller.   As for hits, most of
> > > it is RSS, with several million hits of that per month.  There are
> > > also a million or more blog views per month and the server doesn't
> > > generally have to restart that often.  Before merging our fork with
> > > Roller 1.2, we were restarting every night due to a memory leak.  Our
> > > biggest problem is probably the amount of referrer spam, even with a
> > > healthy blacklist of dirty words.  I think static HTML for the pages
> > > (which they basically are now if your cache is big enough) and a
> > > better referrer filter would be two big helpers for us.
> > > Matthew P. Schmidt
> > > Vice President of Technology
> > > Javalobby.org
> > > Email: matt@javalobby.org
> > > Phone: 919.678.0300
> > >
> > >
> > >
> > > Elias Torres wrote:
> > >
> > >> We have also around 1800 blogs and it's growing rapidly. Also, around
> > >> 12K people make use of the system in total and this we know because we
> > >> don't allow anonymous comments. You need to be authenticated for
> > >> someone to comment/post.
> > >>
> > >> I wonder why you are not allowed to give out server info. Maybe I'll
> > >> hold off on that too for now.
> > >>
> > >> I'm sure others have asked this before, but is there a plan of turning
> > >> Roller blogs into static HTML? I'd be interested in hearing your
> > >> thoughts on this. I'm sure this would alleviate many of the caching
> > >> performance problems.
> > >>
> > >> Elias
> > >>
> > >> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> > >>
> > >>
> > >>> I am diverging from the deployments discussion for a second because
> > >>> Elias comment sparked a question.  I'm interested in anything that
> > >>> anyone wants to share about their roller installation ...
> > >>>
> > >>> how many blogs does it have?
> > >>> what is your performance like?
> > >>> what are your cache size settings?
> > >>> how good is your caching efficiency on average?
> > >>> any numbers on how much activity the site gets? hits/visits? load?
> > >>> server info?  processors?  ram?  OS?  webserver?  database?
> > >>> how is stability?  does the server require restarts often?
> > >>>
> > >>> anything that can be shared would be cool.  i'd like to keep some
> > >>> info on who is running roller and in what kind of environments so
> > >>> that we can hopefully make sure we are keeping roller well suited
> > >>> for various situations.
> > >>>
> > >>> blogs.sun.com currently has almost 1600 blogs on it and our
> > >>> stability is quite good.  i would give out server info, but i'm not
> > >>> allowed to.  probably our biggest performance concern is page
> > >>> caching, which has gotten worse and worse as more people start
> > >>> blogging.  i think out cache size is 4000 right now and that is
> > >>> plenty for the rss cache, but the page cache is still overwhelmed :/
> > >>>
> > >>> anyways ... how about others?
> > >>>
> > >>> -- Allen
> > >>>
> > >>>
> > >>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> > >>>
> > >>>
> > >>>> I'm also not an official part of the project, but I might be running
> > >>>> the second or third largest Roller-based website ;-) and my opinion if
> > >>>> it counts at all, is that if Dave/Allen can handle the heat in the
> > >>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> > >>>> them the big bucks.
> > >>>>
> > >>>
> > >>>
> > >>
> >


Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
On 8/11/05, Allen Gilliland <Al...@sun.com> wrote:
> excellent, thanks for the info guys.
> 
> yeah ... 9000 blogs is a lot and in truth i'm not even sure any system
> could really handle much more than that in a truly dynamic fashion.  rss
> feeds are pretty easy to cache because they aren't as complex, but the
> pages themselves are more complicated and there are a number of possible
> views of the page data which makes caching even harder.
> 
> i am particularly intrigued that you both commented about the use of
> static html pages.  i think this would be a great option for Roller and
> it would be very cool if we could be pretty sneaky about it and actually
> use the same url structure that exists now, but just map to raw html
> files on the backend.  we should do some investigations on this, but i
> certainly like the idea of having the option to use static html pages.
> 

I'm not sure we would want a J2EE container serving files though. Also,
just like WordPress, we don't have to match the file structure, because
we can have a set of URL rewrites to map to whichever structure we
please on the filesystem. I mean, MovableType and Blogger work this way,
so why can't we do the same for Roller? We are right now in the process
of deciding what our solution will be to replace our existing server,
but those limits we see for Roller will hurt us for a company of the
size of IBM. Nothing can beat static content performance. What do you
guys think?

> Matt, if you are curious about your cache performance then you can turn
> up the debugging on the LRUCacheHandler2 class (i believe that's the
> right one).  This will flood your logs with lots of messages about cache
> hits and misses so make sure and watch your log files sizes, but it'll
> give you the info you need.  Another good idea is to turn on garbage
> collection debugging messages so you can see how much of your heap you
> are using.  With 9000 blogs my guess is that your caches are pretty
> overwhelmed, but I would also guess that if you check your the garbage
> collection after a Full GC that you probably have a little more room in
> your heap to increase the sizes.
> 
> -- Allen
> 
> 
> Matthew P. Schmidt wrote:
> 
> > I'll share.  We have about 9000 blogs with rapid growth.  Its running
> > on one dual xeon, on MySQL and Resin and uses about a 1.6G heap with 3
> > 3000 item caches (page, rss, last modified).  I'm not sure how much
> > they're actually being used.  Load is generally pretty manageable,
> > especially with the latest version of Roller.   As for hits, most of
> > it is RSS, with several million hits of that per month.  There are
> > also a million or more blog views per month and the server doesn't
> > generally have to restart that often.  Before merging our fork with
> > Roller 1.2, we were restarting every night due to a memory leak.  Our
> > biggest problem is probably the amount of referrer spam, even with a
> > healthy blacklist of dirty words.  I think static HTML for the pages
> > (which they basically are now if your cache is big enough) and a
> > better referrer filter would be two big helpers for us.
> > Matthew P. Schmidt
> > Vice President of Technology
> > Javalobby.org
> > Email: matt@javalobby.org
> > Phone: 919.678.0300
> >
> >
> >
> > Elias Torres wrote:
> >
> >> We have also around 1800 blogs and it's growing rapidly. Also, around
> >> 12K people make use of the system in total and this we know because we
> >> don't allow anonymous comments. You need to be authenticated for
> >> someone to comment/post.
> >>
> >> I wonder why you are not allowed to give out server info. Maybe I'll
> >> hold off on that too for now.
> >>
> >> I'm sure others have asked this before, but is there a plan of turning
> >> Roller blogs into static HTML? I'd be interested in hearing your
> >> thoughts on this. I'm sure this would alleviate many of the caching
> >> performance problems.
> >>
> >> Elias
> >>
> >> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> >>
> >>
> >>> I am diverging from the deployments discussion for a second because
> >>> Elias comment sparked a question.  I'm interested in anything that
> >>> anyone wants to share about their roller installation ...
> >>>
> >>> how many blogs does it have?
> >>> what is your performance like?
> >>> what are your cache size settings?
> >>> how good is your caching efficiency on average?
> >>> any numbers on how much activity the site gets? hits/visits? load?
> >>> server info?  processors?  ram?  OS?  webserver?  database?
> >>> how is stability?  does the server require restarts often?
> >>>
> >>> anything that can be shared would be cool.  i'd like to keep some
> >>> info on who is running roller and in what kind of environments so
> >>> that we can hopefully make sure we are keeping roller well suited
> >>> for various situations.
> >>>
> >>> blogs.sun.com currently has almost 1600 blogs on it and our
> >>> stability is quite good.  i would give out server info, but i'm not
> >>> allowed to.  probably our biggest performance concern is page
> >>> caching, which has gotten worse and worse as more people start
> >>> blogging.  i think out cache size is 4000 right now and that is
> >>> plenty for the rss cache, but the page cache is still overwhelmed :/
> >>>
> >>> anyways ... how about others?
> >>>
> >>> -- Allen
> >>>
> >>>
> >>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> >>>
> >>>
> >>>> I'm also not an official part of the project, but I might be running
> >>>> the second or third largest Roller-based website ;-) and my opinion if
> >>>> it counts at all, is that if Dave/Allen can handle the heat in the
> >>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
> >>>> them the big bucks.
> >>>>
> >>>
> >>>
> >>
>

Re: Sharing some stats

Posted by Allen Gilliland <Al...@Sun.COM>.
excellent, thanks for the info guys.

yeah ... 9000 blogs is a lot and in truth i'm not even sure any system 
could really handle much more than that in a truly dynamic fashion.  rss 
feeds are pretty easy to cache because they aren't as complex, but the 
pages themselves are more complicated and there are a number of possible 
views of the page data which makes caching even harder.

i am particularly intrigued that you both commented about the use of 
static html pages.  i think this would be a great option for Roller and 
it would be very cool if we could be pretty sneaky about it and actually 
use the same url structure that exists now, but just map to raw html 
files on the backend.  we should do some investigations on this, but i 
certainly like the idea of having the option to use static html pages.

Matt, if you are curious about your cache performance then you can turn 
up the debugging on the LRUCacheHandler2 class (i believe that's the 
right one).  This will flood your logs with lots of messages about cache 
hits and misses, so make sure to watch your log file sizes, but it'll 
give you the info you need.  Another good idea is to turn on garbage 
collection debugging messages so you can see how much of your heap you 
are using.  With 9000 blogs my guess is that your caches are pretty 
overwhelmed, but I would also guess that if you check the garbage 
collection output after a Full GC you probably have a little more room in 
your heap to increase the sizes.
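
For a feel of what hit/miss numbers from an LRU cache look like without flooding the logs, here is a small self-contained Java sketch. It is not the LRUCacheHandler2 code; `CountingLruCache` is a hypothetical class built on the standard `LinkedHashMap` in access order, with counters added so a hit ratio can be read out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded LRU cache that counts hits and misses.
class CountingLruCache<K, V> {
    private final Map<K, V> entries;
    private long hits;
    private long misses;

    CountingLruCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first,
        // and removeEldestEntry evicts it once the cache is over capacity.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    // Fraction of lookups served from the cache; 0.0 before any lookup.
    synchronized double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        CountingLruCache<String, String> cache = new CountingLruCache<>(2);
        cache.put("a", "A");
        cache.put("b", "B");
        cache.get("a");      // hit, and "a" becomes most recently used
        cache.put("c", "C"); // over capacity: evicts "b"
        cache.get("b");      // miss
        System.out.println("hit ratio: " + cache.hitRatio()); // prints "hit ratio: 0.5"
    }
}
```

A ratio that stays low while the cache is full suggests the working set is larger than the configured size, which is the situation described above for the page cache.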

-- Allen


Matthew P. Schmidt wrote:

> I'll share.  We have about 9000 blogs with rapid growth.  It's running 
> on one dual Xeon, on MySQL and Resin, and uses about a 1.6G heap with three 
> 3000-item caches (page, rss, last modified).  I'm not sure how much 
> they're actually being used.  Load is generally pretty manageable, 
> especially with the latest version of Roller.   As for hits, most of 
> it is RSS, with several million hits of that per month.  There are 
> also a million or more blog views per month and the server doesn't 
> generally have to restart that often.  Before merging our fork with 
> Roller 1.2, we were restarting every night due to a memory leak.  Our 
> biggest problem is probably the amount of referrer spam, even with a 
> healthy blacklist of dirty words.  I think static HTML for the pages 
> (which they basically are now if your cache is big enough) and a 
> better referrer filter would be two big helpers for us. 
> Matthew P. Schmidt
> Vice President of Technology
> Javalobby.org
> Email: matt@javalobby.org
> Phone: 919.678.0300
>
>
>
> Elias Torres wrote:
>
>> We have also around 1800 blogs and it's growing rapidly. Also, around
>> 12K people make use of the system in total and this we know because we
>> don't allow anonymous comments. You need to be authenticated for
>> someone to comment/post.
>>
>> I wonder why you are not allowed to give out server info. Maybe I'll
>> hold off on that too for now.
>>
>> I'm sure others have asked this before, but is there a plan of turning
>> Roller blogs into static HTML? I'd be interested in hearing your
>> thoughts on this. I'm sure this would alleviate many of the caching
>> performance problems.
>>
>> Elias
>>
>> On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
>>  
>>
>>> I am diverging from the deployments discussion for a second because 
>>> Elias comment sparked a question.  I'm interested in anything that 
>>> anyone wants to share about their roller installation ...
>>>
>>> how many blogs does it have?
>>> what is your performance like?
>>> what are your cache size settings?
>>> how good is your caching efficiency on average?
>>> any numbers on how much activity the site gets? hits/visits? load?
>>> server info?  processors?  ram?  OS?  webserver?  database?
>>> how is stability?  does the server require restarts often?
>>>
>>> anything that can be shared would be cool.  i'd like to keep some 
>>> info on who is running roller and in what kind of environments so 
>>> that we can hopefully make sure we are keeping roller well suited 
>>> for various situations.
>>>
>>> blogs.sun.com currently has almost 1600 blogs on it and our 
>>> stability is quite good.  i would give out server info, but i'm not 
>>> allowed to.  probably our biggest performance concern is page 
>>> caching, which has gotten worse and worse as more people start 
>>> blogging.  i think out cache size is 4000 right now and that is 
>>> plenty for the rss cache, but the page cache is still overwhelmed :/
>>>
>>> anyways ... how about others?
>>>
>>> -- Allen
>>>
>>>
>>> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
>>>   
>>>
>>>> I'm also not an official part of the project, but I might be running
>>>> the second or third largest Roller-based website ;-) and my opinion if
>>>> it counts at all, is that if Dave/Allen can handle the heat in the
>>>> kitchen, let them stay in the kitchen. I'm sure that's why they pay
>>>> them the big bucks.

Re: Sharing some stats

Posted by "Matthew P. Schmidt" <ma...@javalobby.org>.
I'll share.  We have about 9000 blogs with rapid growth.  It's running on 
one dual Xeon, on MySQL and Resin, and uses about a 1.6G heap with three 
3000-item caches (page, rss, last modified).  I'm not sure how much 
they're actually being used.  Load is generally pretty manageable, 
especially with the latest version of Roller.  As for hits, most of them 
are RSS, with several million of those per month.  There are also a 
million or more blog views per month, and the server doesn't generally 
have to restart that often.  Before merging our fork with Roller 1.2, we 
were restarting every night due to a memory leak.  Our biggest problem is 
probably the amount of referrer spam, even with a healthy blacklist of 
dirty words.  I think static HTML for the pages (which they basically 
are now if your cache is big enough) and a better referrer filter would 
be two big helpers for us.
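
One way to find out how much a fixed-size cache is actually being used is to count hits and misses. The following is a minimal sketch, not Roller's LRUCacheHandler2: the class name CountingLruCache is hypothetical, and it builds an LRU cache on java.util.LinkedHashMap in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache that tracks hits and misses so the hit ratio
// can be inspected. Null values are treated as misses.
final class CountingLruCache<K, V> {
    private final Map<K, V> entries;
    private long hits;
    private long misses;

    CountingLruCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts in LRU order.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    /** Returns hits / (hits + misses), or 0.0 before any lookup. */
    synchronized double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

A hit ratio that stays low while the cache is full suggests the cache is too small for the working set, which is the situation a 3000-item page cache in front of 9000 blogs could plausibly be in.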

Matthew P. Schmidt
Vice President of Technology
Javalobby.org
Email: matt@javalobby.org
Phone: 919.678.0300



Elias Torres wrote:

>We have also around 1800 blogs and it's growing rapidly. Also, around
>12K people make use of the system in total and this we know because we
>don't allow anonymous comments. You need to be authenticated for
>someone to comment/post.
>
>I wonder why you are not allowed to give out server info. Maybe I'll
>hold off on that too for now.
>
>I'm sure others have asked this before, but is there a plan of turning
>Roller blogs into static HTML? I'd be interested in hearing your
>thoughts on this. I'm sure this would alleviate many of the caching
>performance problems.
>
>Elias
>
>On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
>  
>
>>I am diverging from the deployments discussion for a second because Elias comment sparked a question.  I'm interested in anything that anyone wants to share about their roller installation ...
>>
>>how many blogs does it have?
>>what is your performance like?
>>what are your cache size settings?
>>how good is your caching efficiency on average?
>>any numbers on how much activity the site gets? hits/visits? load?
>>server info?  processors?  ram?  OS?  webserver?  database?
>>how is stability?  does the server require restarts often?
>>
>>anything that can be shared would be cool.  i'd like to keep some info on who is running roller and in what kind of environments so that we can hopefully make sure we are keeping roller well suited for various situations.
>>
>>blogs.sun.com currently has almost 1600 blogs on it and our stability is quite good.  i would give out server info, but i'm not allowed to.  probably our biggest performance concern is page caching, which has gotten worse and worse as more people start blogging.  i think out cache size is 4000 right now and that is plenty for the rss cache, but the page cache is still overwhelmed :/
>>
>>anyways ... how about others?
>>
>>-- Allen
>>
>>
>>On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
>>    
>>
>>>I'm also not an official part of the project, but I might be running
>>>the second or third largest Roller-based website ;-) and my opinion if
>>>it counts at all, is that if Dave/Allen can handle the heat in the
>>>kitchen, let them stay in the kitchen. I'm sure that's why they pay
>>>them the big bucks.

Re: Sharing some stats

Posted by Elias Torres <el...@gmail.com>.
We also have around 1800 blogs and the number is growing rapidly. Also,
around 12K people use the system in total, which we know because we
don't allow anonymous comments. You need to be authenticated to
comment or post.

I wonder why you are not allowed to give out server info. Maybe I'll
hold off on that too for now.

I'm sure others have asked this before, but is there a plan of turning
Roller blogs into static HTML? I'd be interested in hearing your
thoughts on this. I'm sure this would alleviate many of the caching
performance problems.

Elias

On 8/9/05, Allen Gilliland <Al...@sun.com> wrote:
> I am diverging from the deployments discussion for a second because Elias comment sparked a question.  I'm interested in anything that anyone wants to share about their roller installation ...
> 
> how many blogs does it have?
> what is your performance like?
> what are your cache size settings?
> how good is your caching efficiency on average?
> any numbers on how much activity the site gets? hits/visits? load?
> server info?  processors?  ram?  OS?  webserver?  database?
> how is stability?  does the server require restarts often?
> 
> anything that can be shared would be cool.  i'd like to keep some info on who is running roller and in what kind of environments so that we can hopefully make sure we are keeping roller well suited for various situations.
> 
> blogs.sun.com currently has almost 1600 blogs on it and our stability is quite good.  i would give out server info, but i'm not allowed to.  probably our biggest performance concern is page caching, which has gotten worse and worse as more people start blogging.  i think out cache size is 4000 right now and that is plenty for the rss cache, but the page cache is still overwhelmed :/
> 
> anyways ... how about others?
> 
> -- Allen
> 
> 
> On Tue, 2005-08-09 at 13:59, Elias Torres wrote:
> > I'm also not an official part of the project, but I might be running
> > the second or third largest Roller-based website ;-) and my opinion if
> > it counts at all, is that if Dave/Allen can handle the heat in the
> > kitchen, let them stay in the kitchen. I'm sure that's why they pay
> > them the big bucks.
> 
>