Posted to olio-user@incubator.apache.org by Vasileios Kontorinis <bk...@gmail.com> on 2010/03/09 20:47:50 UTC

Modeling Olio Performance

Akara and Shanti hi,
    Now that Olio works I have a ton more questions... :-)
1) Are there any plans to support dynamic addition/removal of servers
(web, db, filestore) during steady state? For example, if I see that the
web server is the performance bottleneck because it cannot keep up with the
requests, one way to fix this would be to add more web servers on the fly.
2) Is there any way to figure out where the actual bottleneck is
(besides http://blogs.sun.com/shanti/entry/olio_on_nehalem), other than
high-level tools such as vmstat, mpstat, and sar? In my setup I believe
I/O is the limiting factor. As I scale the number of users, the percentage
of CPU time the web servers spend waiting for I/O increases until it
saturates at around 40-60% on average at 1000 concurrent users (sys and
user time increase as well, but at a much slower pace).
However, when I monitor the network traffic it does not look like the Gb
Ethernet is saturated: I am reading/writing ~10-15 MB/s, while the single
Ethernet interface that all the virtual machines share should support up
to 125 MB/s.
As for the hard drives, the filestores see write throughput of around
2-4 MB/s and reads of around 0.1-0.5 MB/s, while the web servers and the
db see much less (web servers write ~0.5-1 MB/s with negligible reads;
the db writes 0.1-0.3 MB/s with negligible reads -- with memcached), and
it feels like there should be more bandwidth to spare there.
Still, I get failing results for EventDetail.
Is there any way to monitor the internals of Olio, something like a
breakdown of the end-to-end response time? (I know this is tough, but
even an approximate breakdown would be a nice feature.) Any knowledge
beyond just the average response time / throughput would be helpful.
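To make concrete what I mean by a breakdown, here is a toy Python sketch
(not Olio code -- the phase names and the handler functions in the usage
comment are made up) of the kind of per-phase timing I have in mind:

# Not Olio code -- a toy sketch of per-phase timing for one request type.
# The phase names and the functions in the usage comment are hypothetical.
import time
from collections import defaultdict
from contextlib import contextmanager

_totals = defaultdict(float)
_counts = defaultdict(int)

@contextmanager
def timed(phase):
    """Accumulate wall-clock time spent in one phase of a request."""
    start = time.time()
    try:
        yield
    finally:
        _totals[phase] += time.time() - start
        _counts[phase] += 1

def report():
    """Print the average time per phase seen so far."""
    for phase in sorted(_totals):
        avg_ms = 1000.0 * _totals[phase] / _counts[phase]
        print("%-10s avg %6.1f ms over %d calls" % (phase, avg_ms, _counts[phase]))

# Hypothetical usage inside an EventDetail-like handler:
#   with timed("db"):        rows = run_queries(event_id)
#   with timed("filestore"): image = fetch_image(rows)
#   with timed("render"):    html = render_page(rows, image)
# ...and report() at the end of the run.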
3) If someone wanted to create a model of Olio performance (I know this
defeats the purpose of creating a benchmark to measure things :-), how
would they model the interaction between the different tiers?
There have been suggestions in the literature to apply queueing theory for
such purposes (especially for RUBiS-like benchmarks). Each Olio operation
requires a number of web server accesses (6 on average) to generate the
HTML content and db queries (10 on average) for the data [*Deploying Web
2.0 Applications on Sun Servers and the OpenSolaris™ Operating System*].
If we had a way to monitor the average delay of each web server/db
request, as well as of the geocoder requests, then a model would be
possible, right? (A rough sketch of what I mean is below.)
Have you ever done something similar for debugging purposes?
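
For what it's worth, here is a small Python sketch of the kind of model I
have in mind: exact Mean Value Analysis (MVA) for a single-class closed
queueing network, treating each tier as one single-server queueing station
and the user think time as a delay. All the demand and think-time numbers
are guesses for illustration, not measured Olio values.

# A minimal sketch: exact MVA for a single-class closed queueing network,
# one single-server station per tier plus a think time. The demands and
# think time below are guesses, not measured Olio numbers.

def mva(demands, users, think_time):
    """demands: station name -> service demand per operation (seconds).
    Returns (throughput in ops/s, response time in s, queue lengths)."""
    q = dict((k, 0.0) for k in demands)            # queue lengths at 0 users
    for n in range(1, users + 1):
        # residence time: an arrival queues behind the customers already there
        r = dict((k, d * (1.0 + q[k])) for k, d in demands.items())
        r_total = sum(r.values())
        x = n / (r_total + think_time)             # system throughput (ops/s)
        q = dict((k, x * r[k]) for k in r)         # Little's law per station
    return x, r_total, q

# Guessed per-operation demands: 6 web accesses * 10 ms, 10 db queries * 3 ms,
# 5 ms at the filestore. With these guesses the web tier caps throughput at
# about 1/0.06 ~ 17 ops/s, so at 1000 users the model predicts saturation.
demands = {"web": 6 * 0.010, "db": 10 * 0.003, "filestore": 0.005}
x, r, q = mva(demands, users=1000, think_time=5.0)
print("X = %.1f ops/s, R = %.2f s, queue lengths = %s" % (x, r, q))

With measured per-tier demands (from access logs, slow-query logs, etc.)
plugged in, this kind of model could at least predict where the knee is.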

It would also be nice to have this info per server, e.g. "this particular
web server is over-committed" or "this db is under-committed, let's send
some more requests there."

Any insights are more than welcome.

Thanks
-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
San Diego, CA 92122
Cell. phone: (858) 717 6899
bkontorinis@gmail.com, vkontori@ucsd.edu
-------------------------------------------------------------------

Re: Modeling Olio Performance

Posted by Ak...@Sun.COM.
Let me try to answer this question...

On 03/09/10 12:47, Vasileios Kontorinis wrote:
> Akara and Shanti hi,
>     Now that Olio works I have a ton more questions... :-)
> 1) Are there any plans to support dynamic addition/removal of servers
> (web, db, filestore) during steady state? For example, if I see that the
> web server is the performance bottleneck because it cannot keep up with
> the requests, one way to fix this would be to add more web servers on the fly.

Olio is an open source project and I'm really glad to hear feedback.
This is certainly a desired feature, but there is not yet a concrete plan
to deliver such functionality. I really encourage you to file a Jira on
this. If you want it very soon, I'd also encourage you to contribute it
to Olio.

> 2) Is there any way to figure out where the actual bottleneck is
> (besides http://blogs.sun.com/shanti/entry/olio_on_nehalem), other than
> high-level tools such as vmstat, mpstat, and sar? In my setup I believe
> I/O is the limiting factor.

iostat certainly provides you plenty of info on I/O bottlenecks. If on 
Solaris, there are plenty of things you can do based on dtrace. 
Unfortunately, the same does not apply to Linux.
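
If you are on Linux and want to log disk throughput yourself alongside a
run, here is a rough, illustration-only Python sketch that samples
/proc/diskstats (what iostat reads under the hood); the 5-second interval
is arbitrary.

# Illustration only (not iostat itself): sample /proc/diskstats twice and
# print per-device read/write MB/s over the interval.
# Sector counts in /proc/diskstats are in 512-byte units.
import time

def sample():
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if len(fields) < 10:        # skip short partition lines on old kernels
                continue
            dev = fields[2]
            stats[dev] = (int(fields[5]), int(fields[9]))  # sectors read, written
    return stats

interval = 5.0
before = sample()
time.sleep(interval)
after = sample()
for dev in sorted(after):
    if dev not in before:
        continue
    rd = (after[dev][0] - before[dev][0]) * 512 / interval / 1e6
    wr = (after[dev][1] - before[dev][1]) * 512 / interval / 1e6
    if rd or wr:
        print("%-8s read %6.2f MB/s  write %6.2f MB/s" % (dev, rd, wr))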

> As I scale the number of users, the percentage of CPU time the web
> servers spend waiting for I/O increases until it saturates at around
> 40-60% on average at 1000 concurrent users (sys and user time increase
> as well, but at a much slower pace).
> However, when I monitor the network traffic it does not look like the Gb
> Ethernet is saturated: I am reading/writing ~10-15 MB/s, while the single
> Ethernet interface that all the virtual machines share should support up
> to 125 MB/s.

Small request/response sizes hamper throughput -- with mostly small
transfers, per-request overhead dominates and you tend to hit CPU,
connection-handling, or disk limits well before the link's bandwidth limit.
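
To put rough numbers on it (all of them assumed, not measured):

# Back-of-the-envelope, with assumed numbers only:
ops_per_sec = 300           # assumed aggregate operation rate at 1000 users
bytes_per_op = 40 * 1024    # assumed average request+response size on the wire
mb_per_sec = ops_per_sec * bytes_per_op / 1e6
print("aggregate ~ %.1f MB/s of a ~125 MB/s GbE link" % mb_per_sec)  # ~12.3 MB/s

Numbers in that range are consistent with the 10-15 MB/s you observe.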

> As for the hard drives, the filestores see write throughput of around
> 2-4 MB/s and reads of around 0.1-0.5 MB/s, while the web servers and the
> db see much less (web servers write ~0.5-1 MB/s with negligible reads;
> the db writes 0.1-0.3 MB/s with negligible reads -- with memcached), and
> it feels like there should be more bandwidth to spare there.
> Still, I get failing results for EventDetail.
> Is there any way to monitor the internals of Olio, something like a
> breakdown of the end-to-end response time? (I know this is tough, but
> even an approximate breakdown would be a nice feature.) Any knowledge
> beyond just the average response time / throughput would be helpful.

Not as part of Olio itself, no. DTrace can help a lot here, though.

> 3) If someone wanted to create a model of Olio performance (I know this
> defeats the purpose of creating a benchmark to measure things :-), how
> would they model the interaction between the different tiers?
> There have been suggestions in the literature to apply queueing theory
> for such purposes (especially for RUBiS-like benchmarks). Each Olio
> operation requires a number of web server accesses (6 on average) to
> generate the HTML content and db queries (10 on average) for the data
> [*Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris™
> Operating System*]. If we had a way to monitor the average delay of each
> web server/db request, as well as of the geocoder requests, then a model
> would be possible, right?
> Have you ever done something similar for debugging purposes?

We have looked into this a bit more. Since you mentioned OpenSolaris, 
you can certainly use dtrace to find out such latencies.

> 
> It would also be nice to have this info per server, e.g. "this particular
> web server is over-committed" or "this db is under-committed, let's send
> some more requests there."

Yes, although this is not automated -- you need to know the probes for
each layer. There are certainly MySQL probes you can use (google for them).

Thanks,
-Akara