Posted to olio-user@incubator.apache.org by Vasileios Kontorinis <bk...@gmail.com> on 2010/02/08 09:18:55 UTC

Olio Scaling

Akara and Shanti,
  I managed to fix a very subtle issue with Xen. There was a problem with the
checksum that reduced the network throughput from 1 Gbps to 1 Mbps.
Once that was fixed, I managed to scale to 1800 concurrent users.
However, the only metric failing now is the

Average images loaded per Home Page 2.65   >=3       FAILED

Actually I managed to get a passing result for 25 users.

The logs seem clean.
I only get
[Mon Feb 08 07:56:13 2010] [error] [client 10.17.255.250] index.php waiting
for cache

and

[Thu Feb 04 00:10:51 2010] [error] [client 10.17.255.250]
olio-local-web:21355 obtained HomeUpdateLock
[Thu Feb 04 00:10:51 2010] [error] [client 10.17.255.250]
olio-local-web:21355 released HomeUpdateLock

Any idea how to debug this failing metric?

I also have a question regarding Memcached. In the MemcachedStats output
log I get:

Server          Time      items  cache_MB  conns  sets/s  gets/s  get_hits/s  get_misses/s  evicts/s  rB/s  wB/s
--------------  --------  -----  --------  -----  ------  ------  ----------  ------------  --------  ----  ------
olio-mem:11211  04:20:47      3      0.05     34    1.70   13.10       10.30          2.80         0  5709  216402
olio-mem:11211  04:20:47      3      0.05     34    0.00    0.00        0.00          0.00         0     0      48
olio-mem:11211  04:20:47      3      0.05     34    0.00    0.00        0.00          0.00         0     0      48


Does this mean that I am only using 0.05 MB of the memcached memory?
I am pretty sure that the memcached command has -m 256, which means that I
should get close to 256 MB when running with a high number of users.
Is cache_MB something different?
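
One way to cross-check the cache_MB column is to look at the raw counters
memcached itself returns for the text-protocol "stats" command: "bytes"
(bytes currently used to store items) and "limit_maxbytes" (the cap set
with -m). The Python sketch below parses such a reply and derives the usage
and the get hit ratio. It assumes, without confirmation from the Faban
source, that cache_MB is derived from "bytes"; parse_stats and summarize are
hypothetical helper names, and the sample reply is made up to resemble the
first sample above.

```python
def parse_stats(reply: str) -> dict:
    """Parse the text reply of memcached's `stats` command into a dict."""
    stats = {}
    for line in reply.splitlines():
        parts = line.strip().split(" ", 2)
        # Data lines look like "STAT <name> <value>"; the reply ends with END.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def summarize(stats: dict) -> dict:
    """Derive cache usage in MB and the get hit ratio from raw counters."""
    used_mb = int(stats["bytes"]) / (1024 * 1024)
    limit_mb = int(stats["limit_maxbytes"]) / (1024 * 1024)
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    gets = hits + misses
    return {
        "used_mb": used_mb,
        "limit_mb": limit_mb,
        "fill_ratio": used_mb / limit_mb,
        "hit_ratio": hits / gets if gets else 0.0,
    }


# Illustrative reply (made-up numbers, not captured from a real server).
SAMPLE = """STAT curr_items 3
STAT bytes 52429
STAT limit_maxbytes 268435456
STAT get_hits 103
STAT get_misses 28
END"""

if __name__ == "__main__":
    print(summarize(parse_stats(SAMPLE)))
```

Since -m only sets an upper bound, a small used value alongside an item
count of 3 would not by itself point to a misconfigured memory limit; it
would rather suggest that very little is being stored in the cache.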

Thanks again
-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
San Diego, CA 92122
Cell. phone: (858) 717 6899
bkontorinis@gmail.com, vkontori@ucsd.edu
-------------------------------------------------------------------


2010/1/27 Shanti Subramanyam <sh...@gmail.com>

> Yes - these are problems that I'm already aware of.
> The best solution to the filestore issue is to change ownership of the
> directory to the same user/group as the apache process. We could have the
> fileloader.sh change write access I guess, but since that's a big security
> hole, we may not want to do that automatically without letting the user know
> about it.
>
> The fact that your response times are so high indicates that you're running
> a far larger load than the system can handle and/or you still need some
> tuning.
> I suggest you start over from say 100 users and see at what point your
> response times start getting really large. The apache error log should be
> pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>
> Shanti
>
>
> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
> bkontorinis@gmail.com> wrote:
>
>> Shanti hi again,
>>    I checked my apache logs and there were a bunch of errors.
>> It looks like there are some issues with
>> webapp/php/trunk/classes/ImageUtil.php in the latest release of olio. (I
>> downloaded
>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>> 1) There is a line that needs to be commented out. PHP complains ("1.5. Must
>> be greater than zero.").
>> 2) Then, it was complaining that it cannot find the function
>> fastimagecopyresampled. To work around that, I moved the function
>> fastimagecopyresampled above createThumb (this might not be required) and
>> declared it static.
>>     Finally, I call the function from createThumb with
>> self::fastimagecopyresampled.
>> 3) Then, it started complaining because it could not write to the
>> filestore. The problem is that it wants to write the new images as www-data
>> from apache, while the filestore does not have write permission for
>> others. Manually
>>     giving access solves the problem (chmod -R o+w <path>/filestore), but
>> since the directories in filestore are generated automatically, maybe the
>> chmod command should be added to fileloader.sh.
>>
>> Funnily enough, after fixing those issues, I still cannot pass the:
>> Average images loaded per Home Page 2.65   >=3       FAILED
>>
>> and on top of that I also have:
>> Response Times (secs)
>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>
>> Think times for AddPerson and AddEvent fail as well.
>>
>> Any insights are welcome .... :-(
>>
>>
>>
>> 2010/1/26 Shanti Subramanyam <sh...@gmail.com>
>>
>>> Yes - 0.2 requires a lot more disk space as we changed the ratio of
>>> concurrent users to registered users to 1:100. If you haven't already,
>>> please check out our published Blueprints for detailed performance
>>> characteristics of the workload:
>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>> Operating System<http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>> If you run for long enough, you should get passing runs. Have you
>>> verified that there are no errors in the run logs when you see the 'Avg.
>>> images loaded per home page' fail ?
>>>
>>> On to your open files error  - you may have to tune your networking tier
>>> and/or #open file descriptors. I don't believe we have ever seen as many
>>> files open as you are seeing. Can you determine whether these are from the
>>> file store or network ? We also typically run the filestore on a different
>>> system and nfs-mount it on the webserver box.
>>> You will have to tune your system to ensure good performance since you
>>> will need memory for both apache and files.
>>>
>>> Shanti
>>>
>>>
>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>> bkontorinis@gmail.com> wrote:
>>>
>>>> Akara and Shanti hi,
>>>>    I did migrate to Olio 0.2. With the latest version of Olio I came
>>>> across some new interesting things.
>>>>
>>>> Scaling issues:
>>>>  - I am still getting:
>>>> Average images loaded per Home Page 2.55   >=3       FAILED
>>>>  - Additionally, when I scale the concurrent users to 800 I run out of
>>>> disk space, since my filestore occupies more than 62GB.
>>>> Actually, for 600 users it occupies 50GB. I was curious whether that makes
>>>> sense. How much space will I need to reach 1000 users?
>>>> php_setup.html suggests that we will need 50GB, but apparently
>>>> we need way more for a large number of users.
>>>>
>>>>  - Finally and most importantly, for 600 users many of the operations
>>>> fail with the exception:
>>>> Message: java.net.SocketException: Too many open files
>>>> Stack Trace:
>>>>  Class                                          Method             Line
>>>>  java.net.PlainSocketImpl                       socketAccept
>>>>  java.net.PlainSocketImpl                       accept             390
>>>>  java.net.ServerSocket                          implAccept         453
>>>>  java.net.ServerSocket                          accept             421
>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  executeAcceptLoop  369
>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  run                341
>>>>  java.lang.Thread                               run                619
>>>> or
>>>>
>>>> java.net.SocketException: Too many open files
>>>> Stack Trace:
>>>>  Class                                                          Method            Line
>>>>  java.net.Socket                                                createImpl        394
>>>>  java.net.Socket                                                getImpl           457
>>>>  java.net.Socket                                                bind              571
>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory  createSocket      60
>>>>  org.apache.commons.httpclient.HttpConnection                   open              707
>>>>  org.apache.commons.httpclient.HttpMethodDirector               executeWithRetry  387
>>>>  org.apache.commons.httpclient.HttpMethodDirector               executeMethod     171
>>>>  org.apache.commons.httpclient.HttpClient                       executeMethod     397
>>>>  org.apache.commons.httpclient.HttpClient                       executeMethod     323
>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport          readURL           274
>>>>  org.apache.olio.workload.driver.UIDriver                       doLogin           398
>>>>  org.apache.olio.workload.driver.UIDriver                       doLogin           424
>>>>  sun.reflect.GeneratedMethodAccessor8                           invoke
>>>>  sun.reflect.DelegatingMethodAccessorImpl                       invoke            25
>>>>  java.lang.reflect.Method                                       invoke            597
>>>>  com.sun.faban.driver.engine.TimeThread                         doRun             169
>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>
>>>> I am monitoring the number of open files on the web server with `watch
>>>> "lsof | wc"`, and olio starts failing when around 65,000-70,000 files are
>>>> open. lsof shows that for each apache2 thread there are around 100 files
>>>> open. Therefore there are around 650-700 different apache2 threads that
>>>> create the bulk of those open file descriptors.
>>>> The soft and hard limit is set to 403238, which means that there should
>>>> be many more open files before it starts failing.
>>>> (Actually, I verified the limit by opening a bunch of files with a
>>>> python script, and it does reach the limit of 403238.)
>>>> Any insights? Is there any chance that file descriptors take more
>>>> time than usual to be reclaimed after being closed in the xen vm I use for
>>>> my web server? Does it make sense in the first place for olio to have so
>>>> many files open at the same time?
>>>>
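
Both stack traces above originate in Java code (sun.rmi.transport and
com.sun.faban.driver), and "Too many open files" is normally a per-process
condition, so the limit that matters may be the one the failing JVM runs
under rather than the lsof total across all processes on the web server.
The Python sketch below shows how a process can inspect its own limit and
descriptor count. It is Linux-only because it reads /proc, and the helper
names are hypothetical; the same idea would have to be applied to the Faban
processes themselves, for example by checking the limit in the shell that
starts them.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count the open descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} open={open_fd_count()}")
```

Limits are inherited when a process is started, so a value raised in one
login session does not necessarily apply to a daemon launched elsewhere.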
>>>> Thanks again.
>>>>
>>>>
>>>>
>>>>
>>>> 2010/1/16 Shanti Subramanyam <sh...@gmail.com>
>>>>
>>>>> I would really recommend that you migrate to Olio 0.2. In addition to
>>>>> bug fixes, there are some major feature changes in it. See Olio
>>>>> 0.2 released<http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>
>>>>>
>>>>> Shanti
>>>>>
>>>>>
>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>> bkontorinis@gmail.com> wrote:
>>>>>
>>>>>> Akara hi again,
>>>>>>    Below I have comments on your suggestions and at the end some bonus
>>>>>> questions... Thanks again.
>>>>>>
>>>>>> 2010/1/13 Akara Sucharitakul <Ak...@sun.com>
>>>>>>
>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>> below for answers/comments:
>>>>>>>
>>>>>> Sure. I cc'ed the olio user alias. I am not sure which is the right faban
>>>>>> list.
>>>>>>
>>>>>>
>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>
>>>>>>>> Akara hi,
>>>>>>>>   I am a grad student at UCSD and I use Olio for a research project
>>>>>>>> where we want to measure olio performance under live virtual machine
>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>> I have checked out the latest version of olio from the online svn repository
>>>>>>>> and downloaded the latest version of faban (faban-kit-101509.tar.gz <
>>>>>>>> http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>
>>>>>>>
>>>>>>> 101509 is fairly recent. But the latest on the web site is 111109
>>>>>>> (Faban 1.0). There were just bug fixes between those releases.
>>>>>>
>>>>>>
>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the release
>>>>>> of 2.0 was announced, will switch to it if I run into bugs that have been
>>>>>> fixed)
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> So far, I employed a bunch of hacks to get most of it to work and I
>>>>>>>> am almost there. In the process I got a bunch of questions.
>>>>>>>>
>>>>>>>> Questions (some of them might be just faban related, not olio so
>>>>>>>> bear with me):
>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the command
>>>>>>>> line? Firefox through ssh forwarding is dead slow and I'd rather avoid it if I
>>>>>>>> can.
>>>>>>>>
>>>>>>>
>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy itself.
>>>>>>> This is documented at
>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html under "Alternate Deployment Methods."
>>>>>>>
>>>>>>>
>>>>>>>> 2) Should the services ApacheHttpdService, MemcachedService, MySQLService
>>>>>>>> that come with Faban be deployed before running Olio?
>>>>>>>>    I was getting some very weird errors, e.g.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>
>>>>>> Done
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating benchmark
>>>>>>>> run
>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully terminated.
>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>    at com.sun.faban.driver.transport.util.TimedInputStream.read
>>>>>>>> (139)
>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine (78)
>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine (1116)
>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine
>>>>>>>> (1973)
>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readResponse
>>>>>>>> (1735)
>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute (1098)
>>>>>>>>    at
>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod
>>>>>>>> (171)
>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod (397)
>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod (323)
>>>>>>>>    at com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL
>>>>>>>> (529)
>>>>>>>>    at com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL
>>>>>>>> (552)
>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage (355)
>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>
>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>> to kill the benchmark.
>>>>>>>>
>>>>>>>
>>>>>>> These threads are hanging reading the server responses, that never
>>>>>>> came.
>>>>>>>
>>>>>>>
>>>>>> Building the services from Faban probably fixes it.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> In the Olio log there are WARNINGS  complaining about not deploying
>>>>>>>> those. After building those and manually copying them to /faban/services
>>>>>>>> (ant deploy did not place them there... :-(  )
>>>>>>>>
>>>>>>>
>>>>>>> Yes. But ant deploy should get them there. If not, can you please let
>>>>>>> me know the ant messages?
>>>>>>
>>>>>>
>>>>>> Ant was deploying them indeed. I had a mistake in build.properties.
>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of  faban.url=
>>>>>> http://localhost:9980/
>>>>>> After I changed that it started working...
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> it worked. (mostly worked)
>>>>>>>>
>>>>>>>> 3) I still have warnings like:
>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms. Attempting
>>>>>>>> to set clock.
>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms. Attempting
>>>>>>>> to set clock.
>>>>>>>>
>>>>>>>
>>>>>>> These two are OK. Just trying to do a clock sync between the systems.
>>>>>>>
>>>>>>>
>>>>>>>> 01:38:08:WARNING:olio-web wakeup-before time reached 700ms limit.
>>>>>>>> System is too busy. Giving up.
>>>>>>>>
>>>>>>>
>>>>>>> This is one of Faban's clock-setting calibrations. If the system is
>>>>>>> too busy or you run on some virtualization architectures, the lag time
>>>>>>> between an intended end of sleep and the actual time when the thread really
>>>>>>> wakes up (gets scheduled/executed) is too high, calibrations will fail.
>>>>>>>
>>>>>>>
>>>>>>>> 01:38:08:INFO:Time difference to host olio-mem is 262 ms. Attempting
>>>>>>>> to set clock.
>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms limit.
>>>>>>>> System is too busy. Giving up.
>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>> stderr:
>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>> stderr:
>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]" command
>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>> stderr:
>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>> stderr:
>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>
>>>>>>>> Letting Faban change the VM clock sounds like a bad idea from the
>>>>>>>> start.
>>>>>>>>
>>>>>>>
>>>>>>> OK. So it is xen. Yes, this is what Faban is trying to solve. You can
>>>>>>> certainly turn it off. Please see:
>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>
>>>>>>>
>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file (btw,
>>>>>> the link above has a mistake: it shows <fh:timeSync>false<fh:timeSync>,
>>>>>> but the second <fh:timeSync> needs to be a closing tag; the "/" is
>>>>>> missing).
>>>>>> That made the warnings go away.
>>>>>>
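
The point about the missing "/" can be checked mechanically before a run.xml
is handed to the harness. The Python sketch below wraps a fragment in a
dummy root element and tests whether it parses. The namespace URI bound to
the fh: prefix is a placeholder chosen only so that the prefix resolves; it
is not claimed to be the one Faban uses, and is_well_formed is a
hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, declared only so the fh: prefix resolves.
WRAPPER = '<root xmlns:fh="urn:example:faban-harness">{}</root>'


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a wrapper element."""
    try:
        ET.fromstring(WRAPPER.format(fragment))
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Properly closed element.
    print(is_well_formed("<fh:timeSync>false</fh:timeSync>"))
    # Second tag lacks the "/" and therefore opens a new element instead.
    print(is_well_formed("<fh:timeSync>false<fh:timeSync>"))
```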
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> Unfortunately, xen is really bad at maintaining an accurate clock.
>>>>>>>> As a result there is usually a time difference between the different virtual
>>>>>>>> machines
>>>>>>>> of more than 10ms. I went over the setTime function in the Faban source
>>>>>>>> (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java), it's big and ugly
>>>>>>>> (very ugly)
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for the compliments! I think you mean CmdService.setClockTask.
>>>>>>> Time sensitive code ain't pretty. It is the complexities dealing with the
>>>>>>> clock and trying to achieve good accuracy. If you think you can simplify
>>>>>>> this, I'm listening (without losing the accuracy, of course). In
>>>>>>> comparison, CmdAgentImpl has nothing.
>>>>>>>
>>>>>>>
>>>>>> Yes, you're right, it is CmdService.setClockTask. The previous email was
>>>>>> composed at 3am ... :-)
>>>>>> I am still a little confused. The setClockTask is used to set the
>>>>>> clock so that all the machines are synchronized with the master. From what you
>>>>>> mentioned, the physical clock sync is only used for the logs.
>>>>>> Why do we need to do that, since 1) it requires root privileges (which
>>>>>> might not always be available) and 2) I could imagine an alternative that uses
>>>>>> deltas from the actual physical clock without having to set it?
>>>>>> (I am probably missing something... :-)
>>>>>>
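
The "deltas" alternative mentioned above can be sketched as a round-trip
offset estimate: read the local clock, ask the remote side for its time,
read the local clock again, and take the midpoint as the moment of the
remote reading. This is a generic technique and not how Faban is known to
work; read_remote_clock is a hypothetical callable standing in for whatever
remote call would return the master's time, and the estimate assumes a
roughly symmetric round trip.

```python
import time


def estimate_offset(read_remote_clock, local_clock=time.time):
    """Estimate the remote-minus-local clock offset without setting a clock.

    Returns (offset, error_bound), both in seconds. The error bound is
    half the round-trip time of the remote read.
    """
    t0 = local_clock()
    remote = read_remote_clock()
    t1 = local_clock()
    midpoint = (t0 + t1) / 2.0
    return remote - midpoint, (t1 - t0) / 2.0


def to_master_time(local_timestamp, offset):
    """Translate a local timestamp onto the remote (master) timeline."""
    return local_timestamp + offset


if __name__ == "__main__":
    # Demo with a fake remote clock that runs 5 seconds ahead.
    offset, bound = estimate_offset(lambda: time.time() + 5.0)
    print(f"offset={offset:.3f}s +/- {bound:.6f}s")
```

Log timestamps could then be shifted with to_master_time when they are
collected, which needs no root privileges; the cost is that the accuracy
depends on the round-trip time at the moment of measurement.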
>>>>>>
>>>>>>
>>>>>>>> Why is there this strict requirement for a 10ms difference? Any ideas?
>>>>>>>>
>>>>>>>
>>>>>>> It is easily achievable in most cases. May not be true for VMs.
>>>>>>>
>>>>>>> On some VM architectures, the OS however does not get scheduled till
>>>>>>> way after that, thus causing problems. You may be able to measure
>>>>>>> performance on those VMs. But you don't want to use such VMs to be a driver.
>>>>>>> Your response time measurements will be way off.
>>>>>>>
>>>>>>> The physical clock sync is not really rigorous. And you can turn it
>>>>>>> off. It is more to keep the systems in good time sync. If your VM stands in
>>>>>>> the way, just turn it off. The driver's virtual clock sync is much more
>>>>>>> picky in comparison. This is because the start time for the steady state
>>>>>>> should be the same (with a very small tolerance) no matter how many drivers
>>>>>>> are driving. Otherwise the measurement period won't be the same when viewed
>>>>>>> from different drivers and the results won't be reliable.
>>>>>>>
>>>>>>>
>>>>>>>> Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>
>>>>>>>
>>>>>>> That's why we don't use ntp ;-)
>>>>>>
>>>>>>
>>>>>> Just out of curiosity, the physical clocks are set only once at the
>>>>>> beginning (right?), so for long runs the 10ms difference will not be
>>>>>> guaranteed, will it? Especially under VMs I've seen significant clock
>>>>>> differences within a few minutes.
>>>>>> At least ntp can periodically resync (of course doing so might screw
>>>>>> up the logs with time going backwards etc.)
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> I am thinking of modifying this function to always return that the
>>>>>>>> time difference is less than 10ms (so that I do not have to wait all the
>>>>>>>> time for the timeouts.)
>>>>>>>>
>>>>>>>
>>>>>>> Why bother. Don't like it, just turn it off. It has good use in most
>>>>>>> configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>
>>>>>>>
>>>>>>>> Will this break anything in Olio?
>>>>>>>>
>>>>>>>
>>>>>>> Nope. Except the times in your logs will appear out of sequence. They
>>>>>>> rely on the local time on the originating systems.
>>>>>>>
>>>>>>>
>>>>>>>> 4) Warning like:
>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg <
>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg> size
>>>>>>>> of 249 bytes is too small. Image may not exist
>>>>>>>> can be ignored, right?
>>>>>>>>
>>>>>>>
>>>>>>> Well, something is wrong. We don't have images that small. Check
>>>>>>> whether e168t.jpg is really that small. That's why we have that warning.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> It's kind of funny: my problem was that I had the olio webkit version
>>>>>> installed and then I downloaded the version from the online svn repository.
>>>>>> I built the driver but forgot to update the web pages on my apache server,
>>>>>> which,
>>>>>> as expected, was the source of many of my issues.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> 5) Last and most important.
>>>>>>>> I can run the benchmark and all the operations succeed except for login.
>>>>>>>> I get a bunch of:
>>>>>>>>
>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt at
>>>>>>>> index 2926, Login as at786o08x, 2178 failed.
>>>>>>>> Note: Error not counted in result.
>>>>>>>> Either transaction start or end time is not within steady state.
>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926, Login
>>>>>>>> as at786o08x, 2178 failed.
>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>
>>>>>>>> Any ideas? I do get
>>>>>>>>
>>>>>>>
>>>>>>> You likely have cookie issues. It can't seem to hold on to a session.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Well, there was a permission issue with the http_session dir. I could
>>>>>> not write to it. chmod 777 fixed this.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> (I've found online:
>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html which is similar, but when I added
>>>>>>>>
>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>  in build.properties
>>>>>>>> I did not see any cookie related warnings. Those should appear in
>>>>>>>> the olio run log or the apache log, right? Am I just looking in the wrong
>>>>>>>> place? )
>>>>>>>>
>>>>>>>
>>>>>>> Yes, that's applicable only to the Sun Http Transport. The version of
>>>>>>> Olio you're using is based on the Apache Http Transport (Apache HttpClient
>>>>>>> 3.1). The ThreadCookieHandler is not used for the Apache transport and
>>>>>>> that's why you don't see any logs. Try upgrading to Faban 1.0 before looking
>>>>>>> at other things.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> It's a long email I know. Your feedback would be most appreciated.
>>>>>>>>
>>>>>>>> -Regards
>>>>>>>> -------------------------------------------------------------------
>>>>>>>> Kontorinis Vasileios
>>>>>>>> Phd student, University of California San Diego
>>>>>>>> San Diego, CA 92122
>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>>> -------------------------------------------------------------------
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for all the questions/comments.
>>>>>>>
>>>>>>> -Akara
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> And now some more questions/ comments:
>>>>>> 1) I get the following error:
>>>>>>
>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>> /usr/data/olio-db.err
>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does not
>>>>>> exist.
>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0 (790)
>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run (649)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask (885)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>     at java.lang.Thread.run (619)
>>>>>>     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer
>>>>>> (255)
>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get (null)
>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs (200)
>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs (642)
>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start (323)
>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>     at java.lang.Thread.run (619)
>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>
>>>>>> Apparently something is misconfigured in my db-server. Any ideas?
>>>>>>
>>>>>> 2) I get the following error:
>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi, process,
>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>> stderr:
>>>>>> Error in executing perl
>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>> Error in executing perl
>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>
>>>>>> Actually, I traced this one back. The problem is the difference in
>>>>>> output format between Sun's mpstat and the default GNU mpstat.
>>>>>> This is the output of my mpstat:
>>>>>>
>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$ mpstat 1
>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>
>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>
>>>>>> The first line, as well as the time at the beginning of each entry,
>>>>>> messes up the parsing in mpstat.pl (the fields are also different).
>>>>>>   Any plans to support this?
>>>>>>
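
To illustrate what a parser has to cope with in this mpstat format, the
sketch below skips the "Linux ..." banner, uses the header line to name the
columns, and strips the leading timestamp from each sample. It is written
in Python for illustration only and is not a patch to Faban's mpstat.pl;
parse_mpstat is a hypothetical function, and only the layout shown in the
output above is handled.

```python
import re

# A data or header line starts with a wall-clock time such as 16:25:07.
TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}$")


def parse_mpstat(text: str) -> list:
    """Parse mpstat output in the layout shown above into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Linux ..." banner line.
        if len(fields) < 2 or not TIME_RE.match(fields[0]):
            continue
        if fields[1] == "CPU":
            header = fields[1:]  # column names, timestamp dropped
        elif header is not None:
            row = dict(zip(header, fields[1:]))
            row["time"] = fields[0]
            rows.append(row)
    return rows


SAMPLE = """Linux 2.6.18.8-xen (olio-client00)     01/16/10

16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
"""

if __name__ == "__main__":
    for row in parse_mpstat(SAMPLE):
        print(row["time"], row["CPU"], row["%idle"], row["intr/s"])
```

Keying each row by the header names rather than by position keeps the
parser independent of the exact column order.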
>>>>>> 3) Scaling questions.
>>>>>> - So far I did not have a single experiment passing. Some are pretty
>>>>>> close with only one metric check failing.
>>>>>>
>>>>>> Average images loaded per Home Page 2.79   >=3       FAILED
>>>>>> Any ideas? Is it the case that the disk is not fast enough? I am just
>>>>>> using the local filesystem for the filestore.
>>>>>>
>>>>>> - As I double the number of concurrent users I observe linear scaling
>>>>>> in the throughput.
>>>>>> Con Users    Throughput
>>>>>>   25             4.967
>>>>>>   50            10.06
>>>>>>  100            19.375
>>>>>>  200            40.21
>>>>>>  400            75.818
>>>>>>  800             0.383
>>>>>> 1000             0.483
>>>>>>
>>>>>> The linear scaling stops for 400 concurrent users ( only one agent).
>>>>>> Actually it would be exactly linear (value of ~80) but almost half of the
>>>>>> login operations failed. I am looking into it.
>>>>>> Any insights on what might be the first thing failing?
>>>>>>
>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>> logged. It looks like those are being discarded... (?)
>>>>>>
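
The scaling table above can be checked numerically by dividing each
throughput by its user count and comparing it with the 25-user baseline.
The sketch below does that in Python with the numbers copied from the
table; the function names and the tolerance values are illustrative
assumptions, not thresholds used by Faban or Olio.

```python
# (concurrent users, throughput) pairs as reported in the table above.
RUNS = [(25, 4.967), (50, 10.06), (100, 19.375), (200, 40.21),
        (400, 75.818), (800, 0.383), (1000, 0.483)]


def per_user_throughput(runs):
    """Return (users, throughput / users) for each run."""
    return [(users, tput / users) for users, tput in runs]


def first_nonlinear(runs, tolerance=0.10):
    """Return the first user count whose per-user throughput falls more
    than `tolerance` (a fraction) below that of the smallest run."""
    baseline = runs[0][1] / runs[0][0]
    for users, tput in runs[1:]:
        if tput / users < baseline * (1.0 - tolerance):
            return users
    return None


if __name__ == "__main__":
    for users, ratio in per_user_throughput(RUNS):
        print(f"{users:5d} users: {ratio:.4f} per user")
    print("first break at 10% tolerance:", first_nonlinear(RUNS))
```

With a 10% tolerance the 400-user run still counts as linear and the break
shows up at 800 users; tightening the tolerance to 4% flags the 400-user
run, which matches the observation that it falls a little short of ~80.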
>>>>>> Bonus question:
>>>>>> In the runtime statistics
>>>>>> <runtimeStats enabled="true">
>>>>>>          <interval>30</interval>
>>>>>>  </runtimeStats>
>>>>>>
>>>>>> only the 90% response time is reported. Is there an easy way to also
>>>>>> report the 99%? (Or do I need to add code for that?)
>>>>>>
>>>>>>
>>>>>> Thanks a lot again in advance.
>>>>>> -VK
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
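
On the mpstat parsing question quoted above: below is a minimal Python sketch (not a patch to mpstat.pl) of one way to cope with a banner line and a leading timestamp on every entry. It reads the column names from the header row instead of hard-coding them. The banner and header lines in SAMPLE are assumptions, since the quoted excerpt does not show them; the two data rows are taken from the quoted output with their wrapped halves rejoined.

```python
import re

# A leading HH:MM:SS token marks header and data rows; anything else
# (the "Linux ..." banner, blank lines, "Average:" rows) is skipped.
TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}$")


def parse_mpstat(text):
    """Return one dict per data row, keyed by the header's column names."""
    columns = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not TIME_RE.match(tokens[0]):
            continue
        timestamp, rest = tokens[0], tokens[1:]
        # Some locales print a 12-hour clock with a separate AM/PM token.
        if rest and rest[0] in ("AM", "PM"):
            timestamp += " " + rest[0]
            rest = rest[1:]
        if rest and rest[0] == "CPU":
            columns = rest[1:]  # header row: remember the field names
            continue
        if columns is None or len(rest) != len(columns) + 1:
            continue  # no header seen yet, or an unexpected field count
        row = {"time": timestamp, "cpu": rest[0]}
        row.update((name, float(value)) for name, value in zip(columns, rest[1:]))
        rows.append(row)
    return rows


# Banner and header are assumed (hypothetical host and kernel); data rows are from the post.
SAMPLE = """\
Linux 2.6.18 (example-host)  02/08/10

16:25:07     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
"""

if __name__ == "__main__":
    for row in parse_mpstat(SAMPLE):
        print(row["time"], row["cpu"], row["%idle"], row["intr/s"])
```

Taking the field names from the header row means a different mpstat version with a different set of columns would not need a code change, only a consumer that looks up the names it cares about.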

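On the scaling numbers quoted above: a small Python sketch that compares each run with a straight-line extrapolation from the 25-user run. The throughput figures are the ones from the quoted table; the 10% tolerance and the function name are arbitrary choices for illustration, not anything defined by the benchmark.

```python
# (concurrent users, throughput) pairs copied from the quoted table.
RUNS = [
    (25, 4.967),
    (50, 10.06),
    (100, 19.375),
    (200, 40.21),
    (400, 75.818),
    (800, 0.383),
    (1000, 0.483),
]


def scaling_report(runs, tolerance=0.10):
    """Compare each run against linear scaling from the first run.

    Returns tuples of (users, throughput, expected, efficiency, ok), where
    efficiency is throughput / expected and ok means it is within tolerance.
    """
    base_users, base_throughput = runs[0]
    per_user = base_throughput / base_users
    report = []
    for users, throughput in runs:
        expected = per_user * users
        efficiency = throughput / expected
        report.append((users, throughput, expected, efficiency, efficiency >= 1.0 - tolerance))
    return report


if __name__ == "__main__":
    for users, throughput, expected, efficiency, ok in scaling_report(RUNS):
        status = "ok" if ok else "BELOW LINEAR"
        print(f"{users:5d}  {throughput:8.3f}  {expected:8.3f}  {efficiency:6.1%}  {status}")
```

With these inputs the 400-user run comes out at roughly 95% of the linear extrapolation (about 79.5), which matches the "~80" estimate in the post, while the 800- and 1000-user runs fall to well under 1%. That looks more like a collapse than a gradual saturation.
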
Re: Glassfish v3 Jersey errors

Posted by Kim LiChong <Ki...@Sun.COM>.
Hi James,

Can you file a bug on this? Please also include which build of
GFv3 you are using.

thanks,

Kim
> Hi,
>  
> I have been running Glassfish v2.1.1 without issues.  I am now 
> trying to get the Java version running on Glassfish v3.  I recompiled 
> the webapp for v3.  During Glassfish startup I am seeing errors 
> relating to Jersey in my log.  The app seems to work OK despite this.  
> Is there anything special that needs to be done for Glassfish v3?
>  
> Here is the relevant portion of the server.log:
>
> [#|2010-02-12T10:13:04.535-0800|INFO|glassfishv3.0|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=11;_ThreadName=Thread-1;|Initiating 
> Jersey application, version 'Jersey: 1.1.4.1 11/24/2009 01:37 AM'|#]
>
> [#|2010-02-12T10:13:04.587-0800|INFO|glassfishv3.0|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=11;_ThreadName=Thread-1;|Adding 
> the following classes declared in 
> META-INF/services/jersey-server-components to the resource configuration:
>
> class com.sun.jersey.multipart.impl.FormDataMultiPartDispatchProvider
>
> class com.sun.jersey.multipart.impl.MultiPartConfigProvider
>
> class com.sun.jersey.multipart.impl.MultiPartReader
>
> class com.sun.jersey.multipart.impl.MultiPartWriter|#]
>
> [#|2010-02-12T10:13:05.162-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The 
> provider class, class 
> com.sun.jersey.json.impl.provider.entity.JSONArrayProvider, could not 
> be instantiated. Processing will continue but the class will not be 
> utilized
>
> java.lang.IllegalAccessException: Class 
> com.sun.jersey.core.spi.component.ComponentConstructor can not access 
> a member of class 
> com.sun.jersey.json.impl.provider.entity.JSONArrayProvider with 
> modifiers ""
>
> at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
>
> at java.lang.Class.newInstance0(Class.java:349)
>
> at java.lang.Class.newInstance(Class.java:308)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:153)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:145)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)
>
> at javax.servlet.GenericServlet.init(GenericServlet.java:242)
>
> at 
> org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)
>
> at 
> org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)
>
> at 
> org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)
>
> at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)
>
> at com.sun.enterprise.web.WebModule.start(WebModule.java:499)
>
> at 
> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)
>
> at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)
>
> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)
>
> at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)
>
> at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)
>
> at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)
>
> at 
> org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)
>
> at 
> com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)
>
> at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at 
> com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)
>
> at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)
>
> at 
> com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)
>
> at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)
>
> at 
> com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)
>
> at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)
>
> at org.jvnet.hk2.osgimain.Main.start(Main.java:140)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)
>
> at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
>
> at java.lang.Thread.run(Thread.java:619)
>
> |#]
>
> [#|2010-02-12T10:13:05.177-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The 
> provider class, class 
> com.sun.jersey.json.impl.provider.entity.JSONObjectProvider, could not 
> be instantiated. Processing will continue but the class will not be 
> utilized
>
> java.lang.IllegalAccessException: Class 
> com.sun.jersey.core.spi.component.ComponentConstructor can not access 
> a member of class 
> com.sun.jersey.json.impl.provider.entity.JSONObjectProvider with 
> modifiers ""
>
> at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
>
> at java.lang.Class.newInstance0(Class.java:349)
>
> at java.lang.Class.newInstance(Class.java:308)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:153)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:145)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)
>
> at javax.servlet.GenericServlet.init(GenericServlet.java:242)
>
> at 
> org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)
>
> at 
> org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)
>
> at 
> org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)
>
> at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)
>
> at com.sun.enterprise.web.WebModule.start(WebModule.java:499)
>
> at 
> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)
>
> at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)
>
> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)
>
> at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)
>
> at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)
>
> at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)
>
> at 
> org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)
>
> at 
> com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)
>
> at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at 
> com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)
>
> at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)
>
> at 
> com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)
>
> at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)
>
> at 
> com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)
>
> at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)
>
> at org.jvnet.hk2.osgimain.Main.start(Main.java:140)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)
>
> at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
>
> at java.lang.Thread.run(Thread.java:619)
>
> |#]
>
> [#|2010-02-12T10:13:05.446-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The 
> provider class, class 
> com.sun.jersey.json.impl.provider.entity.JSONArrayProvider, could not 
> be instantiated. Processing will continue but the class will not be 
> utilized
>
> java.lang.IllegalAccessException: Class 
> com.sun.jersey.core.spi.component.ComponentConstructor can not access 
> a member of class 
> com.sun.jersey.json.impl.provider.entity.JSONArrayProvider with 
> modifiers ""
>
> at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
>
> at java.lang.Class.newInstance0(Class.java:349)
>
> at java.lang.Class.newInstance(Class.java:308)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.initWriters(MessageBodyFactory.java:171)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:146)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)
>
> at javax.servlet.GenericServlet.init(GenericServlet.java:242)
>
> at 
> org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)
>
> at 
> org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)
>
> at 
> org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)
>
> at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)
>
> at com.sun.enterprise.web.WebModule.start(WebModule.java:499)
>
> at 
> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)
>
> at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)
>
> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)
>
> at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)
>
> at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)
>
> at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)
>
> at 
> org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)
>
> at 
> com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)
>
> at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at 
> com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)
>
> at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)
>
> at 
> com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)
>
> at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)
>
> at 
> com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)
>
> at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)
>
> at org.jvnet.hk2.osgimain.Main.start(Main.java:140)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)
>
> at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
>
> at java.lang.Thread.run(Thread.java:619)
>
> |#]
>
> [#|2010-02-12T10:13:05.449-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The 
> provider class, class 
> com.sun.jersey.json.impl.provider.entity.JSONObjectProvider, could not 
> be instantiated. Processing will continue but the class will not be 
> utilized
>
> java.lang.IllegalAccessException: Class 
> com.sun.jersey.core.spi.component.ComponentConstructor can not access 
> a member of class 
> com.sun.jersey.json.impl.provider.entity.JSONObjectProvider with 
> modifiers ""
>
> at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
>
> at java.lang.Class.newInstance0(Class.java:349)
>
> at java.lang.Class.newInstance(Class.java:308)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)
>
> at 
> com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)
>
> at 
> com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)
>
> at 
> com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.initWriters(MessageBodyFactory.java:171)
>
> at 
> com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:146)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)
>
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)
>
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)
>
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)
>
> at javax.servlet.GenericServlet.init(GenericServlet.java:242)
>
> at 
> org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)
>
> at 
> org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)
>
> at 
> org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)
>
> at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)
>
> at com.sun.enterprise.web.WebModule.start(WebModule.java:499)
>
> at 
> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)
>
> at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)
>
> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)
>
> at 
> com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)
>
> at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)
>
> at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)
>
> at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)
>
> at 
> org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)
>
> at 
> com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)
>
> at 
> com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)
>
> at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at 
> com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)
>
> at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)
>
> at 
> com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)
>
> at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)
>
> at 
> com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)
>
> at 
> com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)
>
> at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)
>
> at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)
>
> at org.jvnet.hk2.osgimain.Main.start(Main.java:140)
>
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)
>
> at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)
>
> at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)
>
> at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)
>
> at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
>
> at java.lang.Thread.run(Thread.java:619)
>
> |#]
>
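
For anyone attaching a log like this to a bug report, a short summary of which providers failed may be easier to read than the full traces. Below is a minimal Python sketch that splits text in the `[#|timestamp|level|product|logger|context|message|#]` layout seen in the quoted server.log and counts the provider classes named in SEVERE records. The field names and helper names are made up for illustration, and SAMPLE is abridged from the quoted log (all stack frames after the first are omitted).

```python
import re
from collections import Counter

# One record: six '|'-separated fields between "[#|" and "|#]".
# DOTALL lets the message field span lines (it contains the stack trace).
RECORD_RE = re.compile(
    r"\[#\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|(.*?)\|#\]", re.DOTALL
)
PROVIDER_RE = re.compile(r"The provider class, class\s+([\w.$]+),\s+could not")


def parse_records(text):
    """Yield one dict per log record found in text."""
    for match in RECORD_RE.finditer(text):
        timestamp, level, product, logger, context, message = match.groups()
        yield {
            "timestamp": timestamp,
            "level": level,
            "product": product,
            "logger": logger,
            "context": context,
            "message": message.strip(),
        }


def failed_providers(text):
    """Count provider classes reported as not instantiable in SEVERE records."""
    counts = Counter()
    for record in parse_records(text):
        if record["level"] != "SEVERE":
            continue
        match = PROVIDER_RE.search(record["message"])
        if match:
            counts[match.group(1)] += 1
    return counts


# Abridged from the quoted log; remaining stack frames omitted.
SAMPLE = (
    "[#|2010-02-12T10:13:04.535-0800|INFO|glassfishv3.0|"
    "com.sun.jersey.server.impl.application.WebApplicationImpl|"
    "_ThreadID=11;_ThreadName=Thread-1;|Initiating Jersey application, "
    "version 'Jersey: 1.1.4.1 11/24/2009 01:37 AM'|#]\n"
    "\n"
    "[#|2010-02-12T10:13:05.162-0800|SEVERE|glassfishv3.0|"
    "com.sun.jersey.core.spi.component.ProviderFactory|"
    "_ThreadID=11;_ThreadName=Thread-1;|The provider class, class "
    "com.sun.jersey.json.impl.provider.entity.JSONArrayProvider, could not "
    "be instantiated. Processing will continue but the class will not be utilized\n"
    "java.lang.IllegalAccessException: Class "
    "com.sun.jersey.core.spi.component.ComponentConstructor can not access "
    "a member of class "
    "com.sun.jersey.json.impl.provider.entity.JSONArrayProvider with modifiers \"\"\n"
    "at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)\n"
    "|#]\n"
)

if __name__ == "__main__":
    for provider, count in failed_providers(SAMPLE).items():
        print(count, provider)
```

Run against the full log, this should list JSONArrayProvider and JSONObjectProvider twice each, since the quoted traces show each failing once under initReaders and once under initWriters.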


Glassfish v3 Jersey errors

Posted by James Zubb <jz...@vmware.com>.
Hi,

I have been running Glassfish v2.1.1 without issues.  I am now trying to get the Java version running on Glassfish v3.  I recompiled the webapp for v3.  During Glassfish startup I am seeing errors relating to Jersey in my log.  The app seems to work OK despite this.  Is there anything special that needs to be done for Glassfish v3?

Here is the relevant portion of the server.log:

[#|2010-02-12T10:13:04.535-0800|INFO|glassfishv3.0|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=11;_ThreadName=Thread-1;|Initiating Jersey application, version 'Jersey: 1.1.4.1 11/24/2009 01:37 AM'|#]

[#|2010-02-12T10:13:04.587-0800|INFO|glassfishv3.0|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=11;_ThreadName=Thread-1;|Adding the following classes declared in META-INF/services/jersey-server-components to the resource configuration:

class com.sun.jersey.multipart.impl.FormDataMultiPartDispatchProvider

class com.sun.jersey.multipart.impl.MultiPartConfigProvider

class com.sun.jersey.multipart.impl.MultiPartReader

class com.sun.jersey.multipart.impl.MultiPartWriter|#]

[#|2010-02-12T10:13:05.162-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The provider class, class com.sun.jersey.json.impl.provider.entity.JSONArrayProvider, could not be instantiated. Processing will continue but the class will not be utilized

java.lang.IllegalAccessException: Class com.sun.jersey.core.spi.component.ComponentConstructor can not access a member of class com.sun.jersey.json.impl.provider.entity.JSONArrayProvider with modifiers ""

at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)

at java.lang.Class.newInstance0(Class.java:349)

at java.lang.Class.newInstance(Class.java:308)

at com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)

at com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)

at com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)

at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)

at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)

at com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:153)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:145)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)

at com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)

at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)

at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)

at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)

at javax.servlet.GenericServlet.init(GenericServlet.java:242)

at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)

at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)

at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)

at org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)

at com.sun.enterprise.web.WebModule.start(WebModule.java:499)

at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)

at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)

at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)

at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)

at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)

at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)

at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)

at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)

at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)

at com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)

at com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)

at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)

at java.security.AccessController.doPrivileged(Native Method)

at com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)

at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)

at com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)

at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)

at com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)

at com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)

at com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)

at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)

at org.jvnet.hk2.osgimain.Main.start(Main.java:140)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)

at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)

at java.lang.Thread.run(Thread.java:619)

|#]

[#|2010-02-12T10:13:05.177-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The provider class, class com.sun.jersey.json.impl.provider.entity.JSONObjectProvider, could not be instantiated. Processing will continue but the class will not be utilized

java.lang.IllegalAccessException: Class com.sun.jersey.core.spi.component.ComponentConstructor can not access a member of class com.sun.jersey.json.impl.provider.entity.JSONObjectProvider with modifiers ""

at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)

at java.lang.Class.newInstance0(Class.java:349)

at java.lang.Class.newInstance(Class.java:308)

at com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)

at com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)

at com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)

at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)

at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)

at com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:153)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:145)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)

at com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)

at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)

at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)

at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)

at javax.servlet.GenericServlet.init(GenericServlet.java:242)

at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)

at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)

at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)

at org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)

at com.sun.enterprise.web.WebModule.start(WebModule.java:499)

at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)

at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)

at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)

at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)

at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)

at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)

at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)

at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)

at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)

at com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)

at com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)

at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)

at java.security.AccessController.doPrivileged(Native Method)

at com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)

at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)

at com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)

at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)

at com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)

at com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)

at com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)

at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)

at org.jvnet.hk2.osgimain.Main.start(Main.java:140)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)

at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)

at java.lang.Thread.run(Thread.java:619)

|#]

[#|2010-02-12T10:13:05.446-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The provider class, class com.sun.jersey.json.impl.provider.entity.JSONArrayProvider, could not be instantiated. Processing will continue but the class will not be utilized

java.lang.IllegalAccessException: Class com.sun.jersey.core.spi.component.ComponentConstructor can not access a member of class com.sun.jersey.json.impl.provider.entity.JSONArrayProvider with modifiers ""

at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)

at java.lang.Class.newInstance0(Class.java:349)

at java.lang.Class.newInstance(Class.java:308)

at com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)

at com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)

at com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)

at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)

at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)

at com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.initWriters(MessageBodyFactory.java:171)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:146)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)

at com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)

at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)

at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)

at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)

at javax.servlet.GenericServlet.init(GenericServlet.java:242)

at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)

at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)

at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)

at org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)

at com.sun.enterprise.web.WebModule.start(WebModule.java:499)

at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)

at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)

at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)

at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)

at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)

at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)

at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)

at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)

at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)

at com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)

at com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)

at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)

at java.security.AccessController.doPrivileged(Native Method)

at com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)

at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)

at com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)

at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)

at com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)

at com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)

at com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)

at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)

at org.jvnet.hk2.osgimain.Main.start(Main.java:140)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)

at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)

at java.lang.Thread.run(Thread.java:619)

|#]

[#|2010-02-12T10:13:05.449-0800|SEVERE|glassfishv3.0|com.sun.jersey.core.spi.component.ProviderFactory|_ThreadID=11;_ThreadName=Thread-1;|The provider class, class com.sun.jersey.json.impl.provider.entity.JSONObjectProvider, could not be instantiated. Processing will continue but the class will not be utilized

java.lang.IllegalAccessException: Class com.sun.jersey.core.spi.component.ComponentConstructor can not access a member of class com.sun.jersey.json.impl.provider.entity.JSONObjectProvider with modifiers ""

at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)

at java.lang.Class.newInstance0(Class.java:349)

at java.lang.Class.newInstance(Class.java:308)

at com.sun.jersey.core.spi.component.ComponentConstructor._getInstance(ComponentConstructor.java:153)

at com.sun.jersey.core.spi.component.ComponentConstructor.getInstance(ComponentConstructor.java:141)

at com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProvider(ProviderFactory.java:163)

at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:134)

at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:233)

at com.sun.jersey.core.spi.component.ProviderServices.getProvidersAndServices(ProviderServices.java:150)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.initWriters(MessageBodyFactory.java:171)

at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:146)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:664)

at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:449)

at com.sun.jersey.spi.container.servlet.ServletContainer.initiate(ServletContainer.java:404)

at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:253)

at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:521)

at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:199)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:308)

at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:471)

at javax.servlet.GenericServlet.init(GenericServlet.java:242)

at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1428)

at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1230)

at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4934)

at org.apache.catalina.core.StandardContext.start(StandardContext.java:5207)

at com.sun.enterprise.web.WebModule.start(WebModule.java:499)

at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:928)

at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:912)

at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:694)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1933)

at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1605)

at com.sun.enterprise.web.WebApplication.start(WebApplication.java:90)

at org.glassfish.internal.data.EngineRef.start(EngineRef.java:126)

at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:241)

at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:236)

at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:339)

at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:340)

at com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:163)

at com.sun.hk2.component.AbstractWombImpl.inject(AbstractWombImpl.java:174)

at com.sun.hk2.component.ConstructorWomb$1.run(ConstructorWomb.java:87)

at java.security.AccessController.doPrivileged(Native Method)

at com.sun.hk2.component.ConstructorWomb.initialize(ConstructorWomb.java:84)

at com.sun.hk2.component.AbstractWombImpl.get(AbstractWombImpl.java:77)

at com.sun.hk2.component.SingletonInhabitant.get(SingletonInhabitant.java:58)

at com.sun.hk2.component.LazyInhabitant.get(LazyInhabitant.java:107)

at com.sun.hk2.component.AbstractInhabitantImpl.get(AbstractInhabitantImpl.java:60)

at com.sun.enterprise.v3.server.AppServerStartup.run(AppServerStartup.java:236)

at com.sun.enterprise.v3.server.AppServerStartup.start(AppServerStartup.java:128)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:457)

at com.sun.enterprise.module.bootstrap.Main.launch(Main.java:401)

at org.jvnet.hk2.osgiadapter.HK2Main.start(HK2Main.java:125)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:915)

at org.jvnet.hk2.osgimain.Main.start(Main.java:140)

at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:640)

at org.apache.felix.framework.Felix.activateBundle(Felix.java:1700)

at org.apache.felix.framework.Felix.startBundle(Felix.java:1622)

at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1077)

at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)

at java.lang.Thread.run(Thread.java:619)

|#]

Fwd: Olio Scaling

Posted by Vasileios Kontorinis <bk...@gmail.com>.
Shanti hi again,
   Sorry for not submitting the JIRA on time; I have been extremely busy lately.

I have a quick question regarding the way the webserver interacts with the
filestore. I ran some scaling studies with one, two and three different
servers while having only one filestore (I specify this in the run.xml
configuration file, under webServer and dataStorage).
The filestore is a local folder on one of the server machines. However, in
oliophp/etc/config.php I also specify on each server:

$olioconfig['fileSystem'] = 'LocalFS';
$olioconfig['localfsRoot'] = '/home/gdhiman/filestore';

As a result, I get WARNINGS for missing files on the webservers that do
not host the filestore. What is the right configuration for
oliophp/etc/config.php? Can I somehow detach the filestore from the
webserver so that it requests files remotely?
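
For reference, here is a rough sketch (not part of Olio) of how the
localfsRoot path could be checked on each webserver. It relies on the setup
Shanti describes in the 2010/1/26 message quoted further down, where the
filestore lives on a different system and is nfs-mounted on the webserver
box, so the same path would resolve on every webserver. The function name
filestore_status is made up for illustration; the path is the one from the
config.php lines above. The script only reports what it sees on the host
where it runs.

```python
import os


def filestore_status(root):
    """Report whether `root` looks usable as a filestore directory on this host."""
    exists = os.path.isdir(root)
    return {
        # The directory named by localfsRoot is present on this host.
        "exists": exists,
        # The current user can list and traverse it.
        "readable": exists and os.access(root, os.R_OK | os.X_OK),
        # True only when `root` itself is a mount point (for example an NFS
        # mount). A plain local directory, a directory below a mount point,
        # or a missing path all give False.
        "mount_point": exists and os.path.ismount(root),
    }


if __name__ == "__main__":
    # Path taken from the config.php lines above.
    print(filestore_status("/home/gdhiman/filestore"))
```

Running this as the apache user on every webserver would show which hosts
cannot see the filestore at the configured path.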


Thanks again.
-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
San Diego, CA 92122
Cell. phone: (858) 717 6899
bkontorinis@gmail.com, vkontori@ucsd.edu
-------------------------------------------------------------------


2010/2/8 Shanti Subramanyam <sh...@gmail.com>


>
> On Mon, Feb 8, 2010 at 3:53 PM, Vasileios Kontorinis <
> bkontorinis@gmail.com> wrote:
>
>>
>>
>>> We need to look into this issue - I suspect that something subtle has
>>> changed in 0.2 which hasn't been accounted for in the expected #images
>>> loaded. Can I please request that you file a JIRA on this?
>>>
>>
>> How do I do this? Pointers?
>>
>
> http://issues.apache.org
>
>
>> I tried runs of 20mins to verify that longer runs will not make it better
>> and it's still failing for just 50 users.
>>
>
> What worries me is that you're saying it  fails for 1800 users too - I can
> understand it may fail for 50 users, but if it fails for larger #users, then
> it is a bug.
>
>>
>>
>
>> and I do get the repetitive patterns you mentioned. However, the cache_MB
>> though never exceeds 0.05...
>> I would expect that memcache size is really important for the application
>> scaling. What is the point of having a separate memcache server if we are
>> only using less than 50KB(?) of memory for caching?
>>
>>
> Try running without memcached - it can be easily configured in the app's
> etc/config.php. Then you will see what difference the cache makes. The
> reduction in db traffic is dramatic resulting in the response times you see.
> The reason the size is small is because we are currently only caching the
> home page which is shared. We have not bothered to implement any additional
> caching as this level of caching is sufficient to reduce the db load.
>
> Regards
>> -VK
>>
>>  Shanti
>
>>
>>
>>> Shanti
>>>
>>>
>>>> Thanks again
>>>> -------------------------------------------------------------------
>>>> Kontorinis Vasileios
>>>> Phd student, University of California San Diego
>>>> San Diego, CA 92122
>>>> Cell. phone: (858) 717 6899
>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>> -------------------------------------------------------------------
>>>>
>>>>
>>>> 2010/1/27 Shanti Subramanyam <sh...@gmail.com>
>>>>
>>>>> Yes - these are problems that I'm already aware of.
>>>>> The best solution to the filestore issue is to change ownership of the
>>>>> directory to the same user/group as the apache process. We could have the
>>>>> fileloader.sh change write access I guess, but since that's a big security
>>>>> hole, we may not want to do that automatically without letting the user know
>>>>> about it.
>>>>>
>>>>> The fact that your response times are so high indicates that you're
>>>>> running a far larger load than the system can handle and/or you still need
>>>>> some tuning.
>>>>> I suggest you start over from say 100 users and see at what point your
>>>>> response times start getting really large. The apache error log should be
>>>>> pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>>>>>
>>>>> Shanti
>>>>>
>>>>>
>>>>> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
>>>>> bkontorinis@gmail.com> wrote:
>>>>>
>>>>>> Shanti hi again,
>>>>>>    I checked my apache logs and there were a bunch of errors.
>>>>>> It looks like there are some issues with the
>>>>>> webapp/php/trunk/classes/ImageUtil.php in the last release of olio. (I
>>>>>> downloaded
>>>>>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>>>>>> 1) There is a line that needs to be commented out. PHP complains ("1.5.
>>>>>> Must be greater than zero.").
>>>>>> 2) Then, it was complaining that it could not find the function
>>>>>> fastimagecopyresampled. To work around that, I moved the function
>>>>>> fastimagecopyresampled above createThumb (this might not be required) and
>>>>>> declared it static.
>>>>>>     Finally, I call the function from createThumb with
>>>>>> self::fastimagecopyresampled.
>>>>>> 3) Then, it started complaining because it could not write to the
>>>>>> filestore. The problem is that it wants to write the new images as www-data
>>>>>> from apache, while the filestore does not have write permission for
>>>>>> others. Manually,
>>>>>>     giving access solves the problem (chmod -R o+w <path>/filestore)
>>>>>> but since the directories in filestore are generated automatically, maybe
>>>>>> the chmod command should be added in fileloader.sh
>>>>>>
>>>>>> Funnily enough, after fixing those issues, I still cannot pass the:
>>>>>> Average images loaded per Home Page 2.65   >=3       FAILED
>>>>>>
>>>>>> and on top of that I also have:
>>>>>> Response Times (secs)
>>>>>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>>>>>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>>>>>
>>>>>> Think times for AddPerson and AddEvent fail as well.
>>>>>>
>>>>>> Any insights are welcome .... :-(
>>>>>>
>>>>>> -------------------------------------------------------------------
>>>>>> Kontorinis Vasileios
>>>>>> Phd student, University of California San Diego
>>>>>> San Diego, CA 92122
>>>>>> Cell. phone: (858) 717 6899
>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>> -------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> 2010/1/26 Shanti Subramanyam <sh...@gmail.com>
>>>>>>
>>>>>>> Yes - 0.2 requires a lot more disk space as we changed the ratio of
>>>>>>> concurrent users to registered users to 1:100. If you haven't already,
>>>>>>> please check out our published Blueprints for detailed performance
>>>>>>> characteristics of the workload:
>>>>>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>>>>>> Operating System<http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>>>>>> If you run for long enough, you should get passing runs. Have you
>>>>>>> verified that there are no errors in the run logs when you see the 'Avg.
>>>>>>> images loaded per home page' fail ?
>>>>>>>
>>>>>>> On to your open files error  - you may have to tune your networking
>>>>>>> tier and/or #open file descriptors. I don't believe we have ever seen as
>>>>>>> many files open as you are seeing. Can you determine whether these are from
>>>>>>> the file store or network ? We also typically run the filestore on a
>>>>>>> different system and nfs-mount it on the webserver box.
>>>>>>> You will have to tune your system to ensure good performance since
>>>>>>> you will need memory for both apache and files.
>>>>>>>
>>>>>>> Shanti
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>
>>>>>>>> Akara and Shanti hi,
>>>>>>>>    I did migrate to Olio 0.2. With the last version of Olio I came
>>>>>>>> across some new interesting things.
>>>>>>>>
>>>>>>>> Scaling issues:
>>>>>>>>   - I am still getting the:
>>>>>>>> Average images loaded per Home Page 2.55   >=3       FAILED
>>>>>>>>  - additionally, when I scale the concurrent users to 800 I run out
>>>>>>>> of diskspace since my filestore occupies more than 62GB.
>>>>>>>> Actually for 600 users it occupies 50GB. I was curious if that makes
>>>>>>>> sense. How much space will I need to reach 1000 users?
>>>>>>>> php_setup.html suggests that we will need 50GB but
>>>>>>>> apparently we need way more for a large number of users.
>>>>>>>>
>>>>>>>>  - Finally and most importantly, for 600 users many of the
>>>>>>>> operations fail with the exception:
>>>>>>>> Message: java.net.SocketException: Too many open files
>>>>>>>> Stack Trace:
>>>>>>>>  Class                                          Method             Line
>>>>>>>>  java.net.PlainSocketImpl                       socketAccept
>>>>>>>>  java.net.PlainSocketImpl                       accept             390
>>>>>>>>  java.net.ServerSocket                          implAccept         453
>>>>>>>>  java.net.ServerSocket                          accept             421
>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  executeAcceptLoop  369
>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  run                341
>>>>>>>>  java.lang.Thread                               run                619
>>>>>>>> or
>>>>>>>>
>>>>>>>> java.net.SocketException: Too many open files
>>>>>>>> Stack Trace:
>>>>>>>>  Class                                                          Method            Line
>>>>>>>>  java.net.Socket                                                createImpl        394
>>>>>>>>  java.net.Socket                                                getImpl           457
>>>>>>>>  java.net.Socket                                                bind              571
>>>>>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory  createSocket      60
>>>>>>>>  org.apache.commons.httpclient.HttpConnection                   open              707
>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector               executeWithRetry  387
>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector               executeMethod     171
>>>>>>>>  org.apache.commons.httpclient.HttpClient                       executeMethod     397
>>>>>>>>  org.apache.commons.httpclient.HttpClient                       executeMethod     323
>>>>>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport          readURL           274
>>>>>>>>  org.apache.olio.workload.driver.UIDriver                       doLogin           398
>>>>>>>>  org.apache.olio.workload.driver.UIDriver                       doLogin           424
>>>>>>>>  sun.reflect.GeneratedMethodAccessor8                           invoke
>>>>>>>>  sun.reflect.DelegatingMethodAccessorImpl                       invoke            25
>>>>>>>>  java.lang.reflect.Method                                       invoke            597
>>>>>>>>  com.sun.faban.driver.engine.TimeThread                         doRun             169
>>>>>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>>>>>
>>>>>>>> I am monitoring the number of open files on the web server with
>>>>>>>> `watch "lsof | wc"`, and olio starts failing when around 65,000-70,000
>>>>>>>> files are open. lsof shows that for each apache2 thread there are around 100
>>>>>>>> files open. Therefore there are around 650-700 different apache2 threads
>>>>>>>> that create the bulk of those open file descriptors.
>>>>>>>> The soft and hard limit is set to 403238, which means that there
>>>>>>>> should be many more open files before it starts failing.
>>>>>>>> (Actually, I verified the limit by opening a bunch of files with a
>>>>>>>> python script and it does reach the limit of 403238.)
>>>>>>>> Any insights?  Is there any chance that the file descriptors take
>>>>>>>> more time than usual to be reclaimed after being closed in the xen vm I use
>>>>>>>> for my web-server? Does it make sense for olio in the first place to have so
>>>>>>>> many files open at the same time?
>>>>>>>>
>>>>>>>> Thanks again.
>>>>>>>>
>>>>>>>>
>>>>>>>> -------------------------------------------------------------------
>>>>>>>> Kontorinis Vasileios
>>>>>>>> Phd student, University of California San Diego
>>>>>>>> San Diego, CA 92122
>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>>> -------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>> 2010/1/16 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>>
>>>>>>>>  I would really recommend that you migrate to Olio 0.2. In addition
>>>>>>>>> to bug fixes, there are some major feature changes in it. See Olio
>>>>>>>>> 0.2 released<http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Shanti
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Akara hi again,
>>>>>>>>>>    Below I have comments on your suggestions and at the end some
>>>>>>>>>> bonus questions... Thanks again.
>>>>>>>>>>
>>>>>>>>>> 2010/1/13 Akara Sucharitakul <Ak...@sun.com>
>>>>>>>>>>
>>>>>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>>>>>> below for answers/comments:
>>>>>>>>>>>
>>>>>>>>>>> Sure. I cced olio user alias. I am not sure which is the right
>>>>>>>>>> faban list.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Akara hi,
>>>>>>>>>>>>   I am a grad student at UCSD and I use Olio for a research
>>>>>>>>>>>> project where we want to measure olio performance under live virtual machine
>>>>>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>>>>>> I have checked out the last version of olio from the online svn
>>>>>>>>>>>> repository and downloaded the last version of faban (faban-kit-101509.tar.gz
>>>>>>>>>>>> <http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 101509 is fairly recent. But the latest on the web site is 111109
>>>>>>>>>>> (Faban 1.0). There were just bug fixes between those releases.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the
>>>>>>>>>> release of 2.0 was announced, will switch to it if I run into bugs that have
>>>>>>>>>> been fixed)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> So far, I employed a bunch of hacks to get most of it to work
>>>>>>>>>>>> and I am almost there. In the process I got a bunch of questions.
>>>>>>>>>>>>
>>>>>>>>>>>> Questions (some of them might be just faban related, not olio so
>>>>>>>>>>>> bear with me):
>>>>>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the command
>>>>>>>>>>>> line? Firefox through ssh forwarding is dead slow and I'd rather avoid it if I
>>>>>>>>>>>> can.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy
>>>>>>>>>>> itself. This is documented at
>>>>>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html
>>>>>>>>>>> under "Alternate Deployment Methods."
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  2) The services ApacheHttpdService, MemcachedService,
>>>>>>>>>>>> MySQLService that come with Faban should be deployed before running Olio?
>>>>>>>>>>>>    I was getting some very weird errors. e.g.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>>>>>
>>>>>>>>>>> Done
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating
>>>>>>>>>>>> benchmark run
>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully
>>>>>>>>>>>> terminated.
>>>>>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>>>>>    at com.sun.faban.driver.transport.util.TimedInputStream.read
>>>>>>>>>>>> (139)
>>>>>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine (78)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine
>>>>>>>>>>>> (1116)
>>>>>>>>>>>>    at
>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodBase.readStatusLine (1973)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readResponse
>>>>>>>>>>>> (1735)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute
>>>>>>>>>>>> (1098)
>>>>>>>>>>>>    at
>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>>>>>    at
>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod (171)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>> (397)
>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>> (323)
>>>>>>>>>>>>    at
>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (529)
>>>>>>>>>>>>    at
>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (552)
>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage (355)
>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>
>>>>>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>>>>>> to kill the benchmark.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> These threads are hanging reading the server responses, that
>>>>>>>>>>> never came.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Building the services from Faban probably fixes it.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> In the Olio log there are WARNINGS  complaining about not
>>>>>>>>>>>> deploying those. After building those and manually copying them to
>>>>>>>>>>>> /faban/services (ant deploy did not place them there... :-(  )
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes. But ant deploy should get them there. If not, can you please
>>>>>>>>>>> let me know the ant messages?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ant was deploying them indeed. I had a mistake in
>>>>>>>>>> build.properties.
>>>>>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of
>>>>>>>>>> faban.url=http://localhost:9980/
>>>>>>>>>> After I changed that it started working...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  it worked. (mostly worked)
>>>>>>>>>>>>
>>>>>>>>>>>> 3) I still have warnings like:
>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms.
>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms.
>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> These two are OK. Just trying to do a clock sync between the
>>>>>>>>>>> systems.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  01:38:08:WARNING:olio-web wakeup-before time reached 700ms
>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is one of Faban's clock-setting calibrations. If the system
>>>>>>>>>>> is too busy, or you run on some virtualization architectures where the lag
>>>>>>>>>>> time between an intended end of sleep and the actual time when the thread really
>>>>>>>>>>> wakes up (gets scheduled/executed) is too high, calibrations will fail.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  01:38:08:INFO:Time difference to host olio-mem is 262 ms.
>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms limit.
>>>>>>>>>>>> System is too busy. Giving up.
>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>> stderr:
>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>> stderr:
>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]" command
>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>> stderr:
>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>> stderr:
>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>
>>>>>>>>>>>> Letting Faban change the VM clock sounds like a bad idea from the
>>>>>>>>>>>> start.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> OK. So it is xen. Yes, this is what Faban is trying to solve. You
>>>>>>>>>>> can certainly turn it off. Please see:
>>>>>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file
>>>>>>>>>> (btw, there is a mistake in the link above: it shows
>>>>>>>>>> <fh:timeSync>false<fh:timeSync>, but the second <fh:timeSync> needs to be
>>>>>>>>>> a closing tag, </fh:timeSync>; the "/" is missing)
>>>>>>>>>> and that made the warnings go away.
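
A missing "/" like the one described above can be caught before a run by checking that the fragment parses. The following is a minimal sketch, not part of Faban: the namespace URI bound to the fh: prefix is an assumption and should be checked against the declaration in your own run.xml, and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the fh: prefix; verify it against your run.xml.
FH_NS = "http://faban.sunsource.net/ns/fabanharness"

def is_well_formed(fragment: str) -> bool:
    """Wrap the fragment in a root element that declares fh: and try to parse it."""
    doc = f'<root xmlns:fh="{FH_NS}">{fragment}</root>'
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# The variant without the "/" never closes the element; the corrected one does.
broken = "<fh:timeSync>false<fh:timeSync>"
fixed = "<fh:timeSync>false</fh:timeSync>"

if __name__ == "__main__":
    print("broken parses:", is_well_formed(broken))
    print("fixed parses: ", is_well_formed(fixed))
```

The same check can be pointed at the whole run.xml with ET.parse(path) instead of a wrapped fragment.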
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  Unfortunately, xen is really bad at maintaining an accurate
>>>>>>>>>>>> clock. As a result there is usually a time difference between the different
>>>>>>>>>>>> virtual machines
>>>>>>>>>>>> of more than 10ms. I went over the setTime function in Faban
>>>>>>>>>>>> source (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java), it's big and
>>>>>>>>>>>> ugly (very ugly)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the compliments! I think you mean
>>>>>>>>>>> CmdService.setClockTask. Time sensitive code ain't pretty. It is the
>>>>>>>>>>> complexities dealing with the clock and trying to achieve good accuracy. If
>>>>>>>>>>> you think you can simplify this, I'm listening (without losing the
>>>>>>>>>>> accuracy, of course). In comparison, CmdAgentImpl has nothing.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Yes, you are right, it is CmdService.setClockTask. The previous email
>>>>>>>>>> was composed at 3am ... :-)
>>>>>>>>>> I am still a little confused. setClockTask is used to set the
>>>>>>>>>> clock so that all the machines are synchronized with the master. From what you
>>>>>>>>>> mentioned, the physical clock sync is only used for the logs.
>>>>>>>>>> Why do we need to do that, since 1) it requires root privileges
>>>>>>>>>> (which might not always be available) and 2) I could imagine an alternative that
>>>>>>>>>> uses deltas from the actual physical clock without having to set it?
>>>>>>>>>> (I am probably missing something... :-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>  Why is there this strict requirement for a 10ms difference? Any
>>>>>>>>>>>> ideas?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It is easily achievable in most cases. May not be true for VMs.
>>>>>>>>>>>
>>>>>>>>>>> On some VM architectures, the OS however does not get scheduled
>>>>>>>>>>> till way after that, thus causing problems. You may be able to measure
>>>>>>>>>>> performance on those VMs. But you don't want to use such VMs to be a driver.
>>>>>>>>>>> Your response time measurements will be way off.
>>>>>>>>>>>
>>>>>>>>>>> The physical clock sync is not really rigorous. And you can turn
>>>>>>>>>>> it off. It is more to keep the systems in good time sync. If your VM stands
>>>>>>>>>>> in the way, just turn it off. The driver's virtual clock sync is much more
>>>>>>>>>>> picky in comparison. This is because the start time for the steady state
>>>>>>>>>>> should be the same (with a very small tolerance) no matter how many drivers
>>>>>>>>>>> are driving. Otherwise the measurement period won't be the same when viewed
>>>>>>>>>>> from different drivers and the results won't be reliable.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> That's why we don't use ntp ;-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just out of curiosity, the physical clocks are set only once at
>>>>>>>>>> the beginning (right?), so for long runs the 10ms difference will not
>>>>>>>>>> be guaranteed, will it? Especially under VMs I've seen significant clock
>>>>>>>>>> differences within a few minutes.
>>>>>>>>>> At least ntp can periodically resync (of course doing so might
>>>>>>>>>> screw up the logs with time going backwards etc.)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  I am thinking of modifying this function to always return that
>>>>>>>>>>>> the time difference is less than 10ms (so that I do not have to wait all the
>>>>>>>>>>>> time for the timeouts.)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Why bother. Don't like it, just turn it off. It has good use in
>>>>>>>>>>> most configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  Will this break anything in Olio?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Nope. Except the times in your logs will appear out of sequence.
>>>>>>>>>>> They rely on the local time on the originating systems.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> 4) Warning like:
>>>>>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg
>>>>>>>>>>>> size of 249 bytes is too small. Image may not exist
>>>>>>>>>>>> can be ignored, right?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Well, something is wrong. We don't have images that small. Check
>>>>>>>>>>> whether e168t.jpg is really that small. That's why we have that warning.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It's kind of funny: my problem was that I had the olio webkit version
>>>>>>>>>> installed and then I downloaded the version from the online svn repository.
>>>>>>>>>> I built the driver but forgot to update the web pages on my apache server,
>>>>>>>>>> which, as expected, was the source of many of my issues.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> 5) Last and most important.
>>>>>>>>>>>> I can run the benchmark and all the operations succeed except
>>>>>>>>>>>> login.
>>>>>>>>>>>> I get a bunch of:
>>>>>>>>>>>>
>>>>>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt
>>>>>>>>>>>> at index 2926, Login as at786o08x, 2178 failed.
>>>>>>>>>>>> Note: Error not counted in result.
>>>>>>>>>>>> Either transaction start or end time is not within steady state.
>>>>>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926,
>>>>>>>>>>>> Login as at786o08x, 2178 failed.
>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas? I do get
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You likely have cookie issues. It can't seem to hold on to a
>>>>>>>>>>> session.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Well, there was a permission issue with the http_session dir. I
>>>>>>>>>> could not write to it. Running chmod 777 on it fixed this.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> (I've found online:
>>>>>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html which is similar, but when I added
>>>>>>>>>>>>
>>>>>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>>>>>  in build.properties
>>>>>>>>>>>> I did not see any cookie-related warnings. Those should appear
>>>>>>>>>>>> in the olio run log or the apache log, right? Am I just looking in the wrong
>>>>>>>>>>>> place? )
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, that's applicable only to the Sun Http Transport. The
>>>>>>>>>>> version of Olio you're using is based on the Apache Http Transport (Apache
>>>>>>>>>>> HttpClient 3.1). The ThreadCookieHandler is not used for the Apache
>>>>>>>>>>> transport and that's why you don't see any logs. Try upgrading to Faban 1.0
>>>>>>>>>>> before looking at other things.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It's a long email I know. Your feedback would be most
>>>>>>>>>>>> appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>> -Regards
>>>>>>>>>>>>
>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>> Kontorinis Vasileios
>>>>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>>>>> San Diego, CA 92122
>>>>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>>>>> bkontorinis@gmail.com <ma...@gmail.com>,
>>>>>>>>>>>> vkontori@ucsd.edu <ma...@ucsd.edu>
>>>>>>>>>>>>
>>>>>>>>>>>> -------------------------------------------------------------------\
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks for all the questions/comments.
>>>>>>>>>>>
>>>>>>>>>>> -Akara
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> And now some more questions/ comments:
>>>>>>>>>> 1) I get the following error:
>>>>>>>>>>
>>>>>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>>>>>> /usr/data/olio-db.err
>>>>>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does not
>>>>>>>>>> exist.
>>>>>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0
>>>>>>>>>> (790)
>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run
>>>>>>>>>> (649)
>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
>>>>>>>>>> (885)
>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>     at
>>>>>>>>>> sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer (255)
>>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get (null)
>>>>>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs (200)
>>>>>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs (642)
>>>>>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start (323)
>>>>>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>>>>>
>>>>>>>>>> Apparently something is misconfigured in my db-server. Any ideas?
>>>>>>>>>>
>>>>>>>>>> 2) I get the following error:
>>>>>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi,
>>>>>>>>>> process, /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>>>>>> stderr:
>>>>>>>>>> Error in executing perl
>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>> Error in executing perl
>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>
>>>>>>>>>> Actually I traced this one back. The problem is the difference in
>>>>>>>>>> output format between Sun's mpstat and the default GNU mpstat.
>>>>>>>>>> This is the output of my mpstat:
>>>>>>>>>>
>>>>>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$
>>>>>>>>>> mpstat 1
>>>>>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>>>>>
>>>>>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>>>>>
>>>>>>>>>> The first line, as well as the time at the beginning of each entry,
>>>>>>>>>> messes up the parsing in mpstat.pl (the fields are also
>>>>>>>>>> different). Any plans to support this?
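
To illustrate the two differences named above (the banner line and the leading timestamp), here is a rough Python sketch of a parser for the sysstat-style output quoted in this message. It is not Fenxi's mpstat.pl and makes no claim about how Fenxi should handle this; parse_mpstat and SAMPLE are hypothetical names, and the sketch assumes 24-hour timestamps with no AM/PM column.

```python
import re

# Two sample rows copied from the mpstat output quoted above, one record per line.
SAMPLE = """\
Linux 2.6.18.8-xen (olio-client00)     01/16/10

16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
"""

TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}$")

def parse_mpstat(text):
    """Return one dict per sample row, keyed by the column names in the header."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Linux ..." banner: neither starts with a timestamp.
        if len(fields) < 2 or not TIME_RE.match(fields[0]):
            continue
        if fields[1] == "CPU":
            header = fields[1:]  # column names with the timestamp dropped
            continue
        if header is None or len(fields) - 1 != len(header):
            continue  # data row that does not match the last header seen
        row = {"time": fields[0], "CPU": fields[1]}
        for name, value in zip(header[1:], fields[2:]):
            row[name] = float(value)
        rows.append(row)
    return rows

if __name__ == "__main__":
    for row in parse_mpstat(SAMPLE):
        print(row["time"], row["CPU"], row["%idle"], row["intr/s"])
```

Reading the column names from the header row, rather than hard-coding positions, is what lets the same loop cope with the differing field sets.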
>>>>>>>>>>
>>>>>>>>>> 3) Scaling questions.
>>>>>>>>>> - So far I have not had a single experiment pass. Some are
>>>>>>>>>> pretty close, with only one metric check failing:
>>>>>>>>>>
>>>>>>>>>> Average images loaded per Home Page   2.79   >= 3   FAILED
>>>>>>>>>>
>>>>>>>>>> Any ideas? Could it be that the disk is not fast enough? I am
>>>>>>>>>> just using the local filesystem for the filestore.
>>>>>>>>>>
>>>>>>>>>> - As I double the number of concurrent users I observe linear
>>>>>>>>>> scaling in the throughput.
>>>>>>>>>> Con Users    Throughput
>>>>>>>>>>        25         4.967
>>>>>>>>>>        50        10.06
>>>>>>>>>>       100        19.375
>>>>>>>>>>       200        40.21
>>>>>>>>>>       400        75.818
>>>>>>>>>>       800         0.383
>>>>>>>>>>      1000         0.483
>>>>>>>>>>
>>>>>>>>>> The linear scaling stops for 400 concurrent users ( only one
>>>>>>>>>> agent). Actually it would be exactly linear (value of ~80) but almost half
>>>>>>>>>> of the login operations failed. I am looking into it.
>>>>>>>>>> Any insights on what might be the first thing failing?
>>>>>>>>>>
>>>>>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>>>>>> logged. It looks like those are being discarded... (?)
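
The "~80" figure and the collapse at 800 users can be checked with a little arithmetic on the table above. The sketch below is only an illustration: it takes the 25-user run as the linear baseline (an assumption) and reports each run as a fraction of what that baseline predicts. RUNS and efficiency are hypothetical names.

```python
# (concurrent users, throughput) pairs copied from the table above.
RUNS = [(25, 4.967), (50, 10.06), (100, 19.375), (200, 40.21),
        (400, 75.818), (800, 0.383), (1000, 0.483)]

def efficiency(runs):
    """Map users -> measured throughput divided by the linear prediction.

    The prediction assumes every user contributes the same throughput
    as in the first (smallest) run.
    """
    base_users, base_tput = runs[0]
    per_user = base_tput / base_users
    return {users: tput / (per_user * users) for users, tput in runs}

if __name__ == "__main__":
    per_user = RUNS[0][1] / RUNS[0][0]
    print(f"baseline: {per_user:.4f} ops/s per user")
    for users, eff in efficiency(RUNS).items():
        print(f"{users:5d} users: expected {per_user * users:8.3f}, "
              f"measured at {eff:7.2%} of linear")
```

On these numbers the runs up to 400 users stay close to the linear prediction, while the 800 and 1000 user runs fall to well under one percent of it, which looks more like a failed run than gradual saturation.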
>>>>>>>>>>
>>>>>>>>>> Bonus question:
>>>>>>>>>> In the runtime statistics
>>>>>>>>>> <runtimeStats enabled="true">
>>>>>>>>>>          <interval>30</interval>
>>>>>>>>>>  </runtimeStats>
>>>>>>>>>>
>>>>>>>>>> only the 90% response time is reported. Is there an easy way to
>>>>>>>>>> also report the 99%? (Or do I need to add code for that?)
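
For reference, this is how a 90th or 99th percentile can be derived offline from raw response-time samples. It is a generic nearest-rank sketch and says nothing about how Faban computes or exposes its runtime statistics; percentile and samples_ms are hypothetical names.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile.

    Returns the smallest sample such that at least pct percent of all
    samples are less than or equal to it. pct is an integer in 1..100.
    """
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

if __name__ == "__main__":
    # Hypothetical response times in milliseconds, for illustration only.
    samples_ms = list(range(1, 201))
    print("90th:", percentile(samples_ms, 90))
    print("99th:", percentile(samples_ms, 99))
```

The 99th percentile needs far more samples per interval than the 90th to be stable, so a 30-second reporting interval may give noisy values at low user counts.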
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks a lot again in advance.
>>>>>>>>>> -VK
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Olio Scaling

Posted by Shanti Subramanyam <sh...@gmail.com>.
Sure. As long as the filestore is accessible at /home/<usr>/filestore from
all webservers, it will work.

Shanti

On Wed, Feb 24, 2010 at 3:15 PM, Vasileios Kontorinis <bkontorinis@gmail.com
> wrote:

> Shanti hi,
> It appears that the IO is the big bottleneck in my setup. This is why I
> wanted to have multiple filestores to test if I can get better throughput.
> In Olio, is it possible to have multiple filestores?
> I want to isolate the IO bottleneck ( Disk, network etc ) and for that I
> need  to shuffle things around.
> In order  to have multiple filestores I expose them through NFS to the
> webservers (so it is still LocalFS) but I specify only one value in
> oliophp/etc/config.php . ($olioconfig['localfsRoot'] =
> '/home/<user>/filestore'; )
>
> Ideas?
>
> Thanks
>
> -------------------------------------------------------------------
> Kontorinis Vasileios
> Phd student, University of California San Diego
> San Diego, CA 92122
> Cell. phone: (858) 717 6899
> bkontorinis@gmail.com, vkontori@ucsd.edu
> -------------------------------------------------------------------
>
>
> 2010/2/12 Shanti Subramanyam <sh...@gmail.com>
>
> If you want to run multiple webservers on different systems, you must have
>> access to the filestore from all of them. The easiest way to do this is to
>> nfs-mount the filestore from the server it resides on so it is accessible to
>> the other machines as well.
>>
>> Shanti
>>
>>
>> On Thu, Feb 11, 2010 at 9:42 PM, Vasileios Kontorinis <
>> bkontorinis@gmail.com> wrote:
>>
>>> Shanti hi again,
>>>    Sorry for not submitting the JIRA on time, I am extremely busy lately.
>>>
>>>
>>> I have a quick question regarding the way the webserver interacts with the
>>> filestore. I ran some scaling studies with one, two and three different
>>> servers while having only one filestore (I do specify that in the run.xml
>>> configuration file, webServer and dataStorage).
>>> The filestore is a local folder on one of the server machines. However,
>>> in the oliophp/etc/config.php I also specify on each server
>>>
>>> $olioconfig['fileSystem'] = 'LocalFS';
>>> $olioconfig['localfsRoot'] = '/home/gdhiman/filestore';
>>>
>>> As a result, I do get WARNINGS for missing files on the webservers that do
>>> not host a filestore. What is the right configuration for
>>> oliophp/etc/config.php? Can I somehow detach the filestore from the
>>> webserver so that it requests files remotely?
>>>
>>>
>>> Thanks again.
>>> -------------------------------------------------------------------
>>> Kontorinis Vasileios
>>> Phd student, University of California San Diego
>>> San Diego, CA 92122
>>> Cell. phone: (858) 717 6899
>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>> -------------------------------------------------------------------
>>>
>>>
>>> 2010/2/8 Shanti Subramanyam <sh...@gmail.com>
>>>
>>>
>>>>
>>>> On Mon, Feb 8, 2010 at 3:53 PM, Vasileios Kontorinis <
>>>> bkontorinis@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>> We need to look into this issue  - I suspect that something subtle has
>>>>>> changed in 0.2 which hasn't got accounted for in the expected #images
>>>>>> loaded. Can I please request that you file a JIRA on this ?
>>>>>>
>>>>>
>>>>> How do I do this? Pointers?
>>>>>
>>>>
>>>> http://issues.apache.org
>>>>
>>>>
>>>>> I tried runs of 20mins to verify that longer runs will not make it
>>>>> better and it's still failing for just 50 users.
>>>>>
>>>>
>>>> What worries me is that you're saying it  fails for 1800 users too - I
>>>> can understand it may fail for 50 users, but if it fails for larger #users,
>>>> then it is a bug.
>>>>
>>>>>
>>>>>
>>>>
>>>>> and I do get the repetitive patterns you mentioned. However,
>>>>> cache_MB never exceeds 0.05...
>>>>> I would expect that memcache size is really important for the
>>>>> application scaling. What is the point of having a separate memcache server
>>>>> if we are only using less than 50KB(?) of memory for caching?
>>>>>
>>>>>
>>>> Try running without memcached - it can be easily configured in the app's
>>>> etc/config.php. Then you will see what difference the cache makes. The
>>>> reduction in db traffic is dramatic resulting in the response times you see.
>>>> The reason the size is small is because we are currently only caching the
>>>> home page which is shared. We have not bothered to implement any additional
>>>> caching as this level of caching is sufficient to reduce the db load.
>>>>
>>>> Regards
>>>>> -VK
>>>>>
>>>>>  Shanti
>>>>
>>>>>
>>>>>
>>>>>> Shanti
>>>>>>
>>>>>>
>>>>>>> Thanks again
>>>>>>> -------------------------------------------------------------------
>>>>>>> Kontorinis Vasileios
>>>>>>> Phd student, University of California San Diego
>>>>>>> San Diego, CA 92122
>>>>>>> Cell. phone: (858) 717 6899
>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>> -------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>>>>> 2010/1/27 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>
>>>>>>>> Yes - these are problems that I'm already aware of.
>>>>>>>> The best solution to the filestore issue is to change ownership of
>>>>>>>> the directory to the same user/group as the apache process. We could have
>>>>>>>> the fileloader.sh change write access I guess, but since that's a big
>>>>>>>> security hole, we may not want to do that automatically without letting the
>>>>>>>> user know about it.
>>>>>>>>
>>>>>>>> The fact that your response times are so high indicates that you're
>>>>>>>> running a far larger load than the system can handle and/or you still need
>>>>>>>> some tuning.
>>>>>>>> I suggest you start over from say 100 users and see at what point
>>>>>>>> your response times start getting really large. The apache error log should
>>>>>>>> be pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>>>>>>>>
>>>>>>>> Shanti
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Shanti hi again,
>>>>>>>>>    I checked my apache logs and there were a bunch of errors.
>>>>>>>>> It looks like there are some issues with
>>>>>>>>> webapp/php/trunk/classes/ImageUtil.php in the last release of olio. (I
>>>>>>>>> downloaded
>>>>>>>>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>>>>>>>>> 1) There is a line that needs to be commented out. php complains ("1.5.
>>>>>>>>> Must be greater than zero.").
>>>>>>>>> 2) Then, it was complaining that it cannot find the function
>>>>>>>>> fastimagecopyresampled. To work around that I moved the function
>>>>>>>>> fastimagecopyresampled above createThumb (this might not be required) and
>>>>>>>>> declared it static.
>>>>>>>>>     Finally, I call the function from createThumb with
>>>>>>>>> self::fastimagecopyresampled.
>>>>>>>>> 3) Then, it started complaining because it could not write to the
>>>>>>>>> filestore. The problem is that apache wants to write the new images as
>>>>>>>>> www-data, while the filestore does not have write permission for
>>>>>>>>> others. Manually
>>>>>>>>>     giving access solves the problem (chmod -R o+w
>>>>>>>>> <path>/filestore) but since the directories in filestore are generated
>>>>>>>>> automatically, maybe the chmod command should be added to fileloader.sh
>>>>>>>>> Funnily enough, after fixing those issues, I still cannot pass the:
>>>>>>>>> Average images loaded per Home Page 2.65   >=3       FAILED
>>>>>>>>>
>>>>>>>>> and on top of that I also have:
>>>>>>>>> Response Times (secs)
>>>>>>>>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>>>>>>>>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>>>>>>>>
>>>>>>>>> Think times for AddPerson and AddEvent fail as well.
>>>>>>>>>
>>>>>>>>> Any insights are welcome .... :-(
>>>>>>>>>
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>> Kontorinis Vasileios
>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>> San Diego, CA 92122
>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/1/26 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>>>
>>>>>>>>>> Yes - 0.2 requires a lot more disk space as we changed the ratio
>>>>>>>>>> of concurrent users to registered users to 1:100. If you haven't already,
>>>>>>>>>> please check out our published Blueprints for detailed performance
>>>>>>>>>> characteristics of the workload:
>>>>>>>>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>>>>>>>>> Operating System<http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>>>>>>>>> If you run for long enough, you should get passing runs. Have you
>>>>>>>>>> verified that there are no errors in the run logs when you see the 'Avg.
>>>>>>>>>> images loaded per home page' fail ?
>>>>>>>>>>
>>>>>>>>>> On to your open files error  - you may have to tune your
>>>>>>>>>> networking tier and/or #open file descriptors. I don't believe we have ever
>>>>>>>>>> seen as many files open as you are seeing. Can you determine whether these
>>>>>>>>>> are from the file store or network ? We also typically run the filestore on
>>>>>>>>>> a different system and nfs-mount it on the webserver box.
>>>>>>>>>> You will have to tune your system to ensure good performance since
>>>>>>>>>> you will need memory for both apache and files.
>>>>>>>>>>
>>>>>>>>>> Shanti
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Akara and Shanti hi,
>>>>>>>>>>>    I did migrate to Olio 0.2. With the last version of Olio I
>>>>>>>>>>> came across some new interesting things.
>>>>>>>>>>>
>>>>>>>>>>> Scaling issues:
>>>>>>>>>>>   - I am still getting the:
>>>>>>>>>>> Average images loaded per Home Page   2.55   >= 3   FAILED
>>>>>>>>>>>  - additionally, when I scale the concurrent users to 800 I run
>>>>>>>>>>> out of disk space since my filestore occupies more than 62GB.
>>>>>>>>>>> Actually for 600 users it occupies 50GB. I was curious if that
>>>>>>>>>>> makes sense. How much space will I need to reach 1000 users?
>>>>>>>>>>> The php_setup.html suggests that we will need 50GB but
>>>>>>>>>>> apparently we need way more for a large number of users.
>>>>>>>>>>>
>>>>>>>>>>>  - Finally and most importantly, for 600 users many of the
>>>>>>>>>>> operations fail with the exception:
>>>>>>>>>>> Message: java.net.SocketException: Too many open files
>>>>>>>>>>> Stack Trace:
>>>>>>>>>>>  Class  Method  Line
>>>>>>>>>>>  java.net.PlainSocketImpl  socketAccept
>>>>>>>>>>>  java.net.PlainSocketImpl  accept  390
>>>>>>>>>>>  java.net.ServerSocket  implAccept  453
>>>>>>>>>>>  java.net.ServerSocket  accept  421
>>>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  executeAcceptLoop  369
>>>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  run  341
>>>>>>>>>>>  java.lang.Thread  run  619
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> java.net.SocketException: Too many open files
>>>>>>>>>>> Stack Trace:
>>>>>>>>>>>  Class  Method  Line
>>>>>>>>>>>  java.net.Socket  createImpl  394
>>>>>>>>>>>  java.net.Socket  getImpl  457
>>>>>>>>>>>  java.net.Socket  bind  571
>>>>>>>>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory  createSocket  60
>>>>>>>>>>>  org.apache.commons.httpclient.HttpConnection  open  707
>>>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector  executeWithRetry  387
>>>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector  executeMethod  171
>>>>>>>>>>>  org.apache.commons.httpclient.HttpClient  executeMethod  397
>>>>>>>>>>>  org.apache.commons.httpclient.HttpClient  executeMethod  323
>>>>>>>>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport  readURL  274
>>>>>>>>>>>  org.apache.olio.workload.driver.UIDriver  doLogin  398
>>>>>>>>>>>  org.apache.olio.workload.driver.UIDriver  doLogin  424
>>>>>>>>>>>  sun.reflect.GeneratedMethodAccessor8  invoke
>>>>>>>>>>>  sun.reflect.DelegatingMethodAccessorImpl  invoke  25
>>>>>>>>>>>  java.lang.reflect.Method  invoke  597
>>>>>>>>>>>  com.sun.faban.driver.engine.TimeThread  doRun  169
>>>>>>>>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>>>>>>>>
>>>>>>>>>>> I am monitoring the number of open files on the web server with
>>>>>>>>>>> `watch "lsof | wc"`, and Olio starts failing when around 65,000-70,000
>>>>>>>>>>> files are open. lsof shows around 100 open files for each apache2 thread,
>>>>>>>>>>> so there are around 650-700 apache2 threads that create the bulk of
>>>>>>>>>>> those open file descriptors.
>>>>>>>>>>> The soft and hard limits are both set to 403238, which means that
>>>>>>>>>>> many more files should be open before it starts failing.
>>>>>>>>>>> (Actually, I verified the limit by opening a bunch of files with
>>>>>>>>>>> a python script, and it does reach the limit of 403238.)
>>>>>>>>>>> Any insights? Is there any chance that file descriptors take
>>>>>>>>>>> more time than usual to be reclaimed after being closed in the xen vm I use
>>>>>>>>>>> for my web server? Does it make sense in the first place for olio to have so
>>>>>>>>>>> many files open at the same time?
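One thing that may be worth checking: the `Too many open files` exception in the trace above is raised from `java.net.Socket` inside the Faban driver, so the limit being hit could be the per-process descriptor limit of the driver JVM rather than anything on the web server. Note also that `lsof | wc` counts more than file descriptors (it lists memory-mapped files, working directories and so on), so it tends to overstate the number. The following is a minimal Python sketch, Linux-only because it reads /proc, that prints the limit and the descriptor count of a single process. The helper names are made up for illustration.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count descriptors currently open in a process via /proc (Linux only).

    Returns None if /proc/<pid>/fd cannot be read (non-Linux system or
    insufficient permission to inspect that process).
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open in this process: {open_fd_count()}")
```

Limits are inherited per process, so this is only meaningful when run as the same user and from the same shell that starts the Faban agent. Passing the pid of the agent JVM to `open_fd_count` counts that process's descriptors, provided you have permission to read its /proc entry.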
>>>>>>>>>>>
>>>>>>>>>>> Thanks again.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>> Kontorinis Vasileios
>>>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>>>> San Diego, CA 92122
>>>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>>>>>>
>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2010/1/16 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>>  I would really recommend that you migrate to Olio 0.2. In
>>>>>>>>>>>> addition to bug fixes, there are some major feature changes in it. See Olio
>>>>>>>>>>>> 0.2 released <http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Shanti
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Akara hi again,
>>>>>>>>>>>>>    Below I have comments on your suggestions and at the end
>>>>>>>>>>>>> some bonus questions... Thanks again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2010/1/13 Akara Sucharitakul <Ak...@sun.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>>>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>>>>>>>>> below for answers/comments:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sure. I cced olio user alias. I am not sure which is the right
>>>>>>>>>>>>> faban list.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Akara hi,
>>>>>>>>>>>>>>>   I am a grad student at UCSD and I use Olio for a research
>>>>>>>>>>>>>>> project where we want to measure olio performance under live virtual machine
>>>>>>>>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>>>>>>>>> I have checked out the latest version of olio from the online svn
>>>>>>>>>>>>>>> repository and downloaded the latest version of faban
>>>>>>>>>>>>>>> (faban-kit-101509.tar.gz,
>>>>>>>>>>>>>>> <http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 101509 is fairly recent. But the latest on the web site is
>>>>>>>>>>>>>> 111109 (Faban 1.0). There were just bug fixes between those releases.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the
>>>>>>>>>>>>> release of 2.0 was announced, will switch to it if I run into bugs that have
>>>>>>>>>>>>> been fixed)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So far, I employed a bunch of hacks to get most of it to work
>>>>>>>>>>>>>>> and I am almost there. In the process I got a bunch of questions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Questions (some of them might be just faban related, not olio
>>>>>>>>>>>>>>> so bear with me):
>>>>>>>>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the
>>>>>>>>>>>>>>> command line? Firefox through ssh forwarding is dead slow and I'd rather
>>>>>>>>>>>>>>> avoid it if I can.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy
>>>>>>>>>>>>>> itself. This is documented at
>>>>>>>>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html
>>>>>>>>>>>>>> under "Alternate Deployment Methods."
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  2) Should the services ApacheHttpdService, MemcachedService, and
>>>>>>>>>>>>>>> MySQLService that come with Faban be deployed before running Olio?
>>>>>>>>>>>>>>>    I was getting some very weird errors, e.g.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Done
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating
>>>>>>>>>>>>>>> benchmark run
>>>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully
>>>>>>>>>>>>>>> terminated.
>>>>>>>>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> com.sun.faban.driver.transport.util.TimedInputStream.read (139)
>>>>>>>>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine
>>>>>>>>>>>>>>> (78)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine
>>>>>>>>>>>>>>> (1116)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodBase.readStatusLine (1973)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodBase.readResponse (1735)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute
>>>>>>>>>>>>>>> (1098)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod (171)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>>>> (397)
>>>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>>>> (323)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (529)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (552)
>>>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage
>>>>>>>>>>>>>>> (355)
>>>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>>>>>>>>> to kill the benchmark.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These threads are hanging reading the server responses, that
>>>>>>>>>>>>>> never came.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Building the services from Faban probably fixes it.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the Olio log there are WARNINGS  complaining about not
>>>>>>>>>>>>>>> deploying those. After building those and manually copying them to
>>>>>>>>>>>>>>> /faban/services (ant deploy did not place them there... :-(  )
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes. But ant deploy should get them there. If not, can you
>>>>>>>>>>>>>> please let me know the ant messages?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ant was deploying them indeed. I had a mistake in
>>>>>>>>>>>>> build.properties.
>>>>>>>>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of
>>>>>>>>>>>>> faban.url=http://localhost:9980/
>>>>>>>>>>>>> After I changed that it started working...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  it worked. (mostly worked)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3) I still have warnings like:
>>>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms.
>>>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms.
>>>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These two are OK. Just trying to do a clock sync between the
>>>>>>>>>>>>>> systems.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  01:38:08:WARNING:olio-web wakeup-before time reached 700ms
>>>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is one of Faban's clock-setting calibrations. If the
>>>>>>>>>>>>>> system is too busy or you run on some virtualization architectures, the lag
>>>>>>>>>>>>>> time between an intended end of sleep and the actual time when the thread
>>>>>>>>>>>>>> really wakes up (gets scheduled/executed) is too high, calibrations will
>>>>>>>>>>>>>> fail.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  01:38:08:INFO:Time difference to host olio-mem is 262 ms.
>>>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms
>>>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]"
>>>>>>>>>>>>>>> command trying to set the date. Exit value: 1
>>>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]"
>>>>>>>>>>>>>>> command trying to set the date. Exit value: 1
>>>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]"
>>>>>>>>>>>>>>> command trying to set the date. Exit value: 1
>>>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Letting Faban change the VM clock sounds like a bad idea from the
>>>>>>>>>>>>>>> beginning.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> OK. So it is xen. Yes, this is what Faban is trying to solve.
>>>>>>>>>>>>>> You can certainly turn it off. Please see:
>>>>>>>>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file, and
>>>>>>>>>>>>> that made the warnings go away. (Btw, there is a mistake in the link
>>>>>>>>>>>>> above: it shows <fh:timeSync>false<fh:timeSync>, but the second
>>>>>>>>>>>>> <fh:timeSync> needs to be a closing tag; the "/" is missing.)
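A quick way to catch this kind of slip before starting a run is to check that the fragment parses at all. The sketch below uses Python's standard XML parser and wraps the fragment in a dummy root element. The namespace URI is a placeholder chosen for this example, since a real run.xml declares the `fh` prefix itself.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption for this sketch only);
# the real run.xml carries its own declaration for the fh prefix.
WRAPPER = '<root xmlns:fh="urn:example:fh">{}</root>'


def is_well_formed(fragment):
    """Return True if the fragment parses as XML once the fh prefix is declared."""
    try:
        ET.fromstring(WRAPPER.format(fragment))
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for snippet in ("<fh:timeSync>false</fh:timeSync>",
                    "<fh:timeSync>false<fh:timeSync>"):
        status = "ok" if is_well_formed(snippet) else "malformed"
        print(snippet, "->", status)
```

The same idea applies to the whole file: parsing run.xml with `ET.parse` before submitting it would flag an unclosed element early.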
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Unfortunately, xen is really bad at maintaining an accurate
>>>>>>>>>>>>>>> clock. As a result, there is usually a time difference of more than 10ms
>>>>>>>>>>>>>>> between the different virtual machines.
>>>>>>>>>>>>>>> I went over the setTime function in the Faban
>>>>>>>>>>>>>>> source (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java), it's big and
>>>>>>>>>>>>>>> ugly (very ugly)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the compliments! I think you mean
>>>>>>>>>>>>>> CmdService.setClockTask. Time-sensitive code ain't pretty. The complexity
>>>>>>>>>>>>>> comes from dealing with the clock and trying to achieve good accuracy. If
>>>>>>>>>>>>>> you think you can simplify this, I'm listening (without losing the
>>>>>>>>>>>>>> accuracy, of course). In comparison, CmdAgentImpl has nothing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, you're right, it is CmdService.setClockTask. The previous
>>>>>>>>>>>>> email was composed at 3am ... :-)
>>>>>>>>>>>>> I am still a little confused. The setClockTask is used to set
>>>>>>>>>>>>> the clock so that all the machines are synchronized with the master. From what
>>>>>>>>>>>>> you mentioned, the physical clock sync is only used for the logs.
>>>>>>>>>>>>> Why do we need to do that, since 1) it requires root privileges
>>>>>>>>>>>>> (which might not always be available) and 2) I could imagine an alternative that
>>>>>>>>>>>>> uses deltas from the actual physical clock without having to set it?
>>>>>>>>>>>>> (I am probably missing something... :-)
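The delta-based alternative mentioned above could look roughly like the following Python sketch. It is not how Faban works. It only illustrates the general idea of estimating a clock offset from one request/response round trip and then translating timestamps instead of setting the system clock. The function names are hypothetical, and how the remote clock is actually read is left open.

```python
import time


def estimate_offset(read_remote_clock, local_clock=time.time):
    """Estimate (offset, round_trip) of a remote clock relative to the local one.

    read_remote_clock is any callable that asks the master for its current
    time in seconds. The remote reading is assumed to have been taken halfway
    through the round trip, so the error is at most half the round-trip time.
    """
    t_send = local_clock()
    remote_now = read_remote_clock()
    t_recv = local_clock()
    midpoint = (t_send + t_recv) / 2.0
    return remote_now - midpoint, t_recv - t_send


def to_master_time(local_timestamp, offset):
    """Translate a local timestamp into the master's timeline without
    touching the system clock (so no root privileges are needed)."""
    return local_timestamp + offset
```

Because the error bound depends on the round-trip time, a slow or jittery link (or a VM that is descheduled in the middle of the exchange) degrades the estimate; repeating the measurement and keeping the sample with the shortest round trip is a common way to reduce that.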
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Why is there this strict requirement for a 10ms difference? Any
>>>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is easily achievable in most cases. May not be true for
>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On some VM architectures, however, the OS does not get
>>>>>>>>>>>>>> scheduled till way after that, thus causing problems. You may be able to
>>>>>>>>>>>>>> measure performance on those VMs. But you don't want to use such VMs as a
>>>>>>>>>>>>>> driver. Your response time measurements will be way off.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The physical clock sync is not really rigorous. And you can
>>>>>>>>>>>>>> turn it off. It is more to keep the systems in good time sync. If your VM
>>>>>>>>>>>>>> stands in the way, just turn it off. The driver's virtual clock sync is much
>>>>>>>>>>>>>> more picky in comparison. This is because the start time for the steady
>>>>>>>>>>>>>> state should be the same (with a very small tolerance) no matter how many
>>>>>>>>>>>>>> drivers are driving. Otherwise the measurement period won't be the same when
>>>>>>>>>>>>>> viewed from different drivers and the results won't be reliable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's why we don't use ntp ;-)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just out of curiosity, the physical clocks are set only once at
>>>>>>>>>>>>> the beginning (right?), so for long runs the 10ms difference will not
>>>>>>>>>>>>> be guaranteed, no? Especially under VMs, I've seen significant clock
>>>>>>>>>>>>> differences within a few minutes.
>>>>>>>>>>>>> At least ntp can periodically resync (of course, doing so might
>>>>>>>>>>>>> screw up the logs with time going backwards etc.)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  I am thinking of modifying this function to always return
>>>>>>>>>>>>>>> that the time difference is less than 10ms (so that I do not have to wait
>>>>>>>>>>>>>>> all the time for the timeouts.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why bother. Don't like it, just turn it off. It has good use
>>>>>>>>>>>>>> in most configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Will this break anything in Olio?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nope. Except the times in your logs will appear out of
>>>>>>>>>>>>>> sequence. They rely on the local time on the originating systems.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 4) Warnings like:
>>>>>>>>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg
>>>>>>>>>>>>>>> size of 249 bytes is too small. Image may not exist
>>>>>>>>>>>>>>> can be ignored, right?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, something is wrong. We don't have images that small.
>>>>>>>>>>>>>> Check whether e168t.jpg is really that small. That's why we have that
>>>>>>>>>>>>>> warning.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's kind of funny: my problem was that I had the olio webkit
>>>>>>>>>>>>> version installed and then I downloaded the version from the online svn
>>>>>>>>>>>>> repository. I built the driver but forgot to update the web pages on my
>>>>>>>>>>>>> apache server, which,
>>>>>>>>>>>>> as expected, was the source of many of my issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 5) Last and most important.
>>>>>>>>>>>>>>> I can run the benchmark and all the operations succeed except for
>>>>>>>>>>>>>>> login.
>>>>>>>>>>>>>>> I get a bunch of:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login
>>>>>>>>>>>>>>> prompt at index 2926, Login as at786o08x, 2178 failed.
>>>>>>>>>>>>>>> Note: Error not counted in result.
>>>>>>>>>>>>>>> Either transaction start or end time is not within steady
>>>>>>>>>>>>>>> state.
>>>>>>>>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926,
>>>>>>>>>>>>>>> Login as at786o08x, 2178 failed.
>>>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any ideas? I do get
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You likely have cookie issues. It can't seem to hold on to a
>>>>>>>>>>>>>> session.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, there was a permission issue with the http_session dir. I
>>>>>>>>>>>>> could not write to it. A chmod 777 on it fixed this.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (I've found online:
>>>>>>>>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html
>>>>>>>>>>>>>>> which is similar, but when I added
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>>>>>>>>  in build.properties
>>>>>>>>>>>>>>> I did not see any cookie related warnings. Those should
>>>>>>>>>>>>>>> appear in the olio run log or the apache log, right? Am I just looking in
>>>>>>>>>>>>>>> the wrong place? )
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, that's applicable only to the Sun Http Transport. The
>>>>>>>>>>>>>> version of Olio you're using is based on the Apache Http Transport (Apache
>>>>>>>>>>>>>> HttpClient 3.1). The ThreadCookieHandler is not used for the Apache
>>>>>>>>>>>>>> transport and that's why you don't see any logs. Try upgrading to Faban 1.0
>>>>>>>>>>>>>> before looking at other things.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's a long email I know. Your feedback would be most
>>>>>>>>>>>>>>> appreciated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Regards
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>>>>> Kontorinis Vasileios
>>>>>>>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>>>>>>>> San Diego, CA 92122
>>>>>>>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>>>>>>>> bkontorinis@gmail.com <ma...@gmail.com>,
>>>>>>>>>>>>>>> vkontori@ucsd.edu <ma...@ucsd.edu>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for all the questions/comments.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Akara
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> And now some more questions/ comments:
>>>>>>>>>>>>> 1) I get the following error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>>>>>>>>> /usr/data/olio-db.err
>>>>>>>>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does
>>>>>>>>>>>>> not exist.
>>>>>>>>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>>>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>>>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>>>>>>>>     at
>>>>>>>>>>>>> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0 (790)
>>>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run
>>>>>>>>>>>>> (649)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
>>>>>>>>>>>>> (885)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>>>>     at
>>>>>>>>>>>>> sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer (255)
>>>>>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>>>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get
>>>>>>>>>>>>> (null)
>>>>>>>>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>>>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>>>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs
>>>>>>>>>>>>> (200)
>>>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs
>>>>>>>>>>>>> (642)
>>>>>>>>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start
>>>>>>>>>>>>> (323)
>>>>>>>>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>>>>>>>>
>>>>>>>>>>>>> Apparently something is misconfigured in my db-server. Any
>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) I get the following error:
>>>>>>>>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi,
>>>>>>>>>>>>> process, /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> Error in executing perl
>>>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>>>> Error in executing perl
>>>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually I traced this one back. The problem is the difference
>>>>>>>>>>>>> between the output format of Sun's mpstat and that of the default GNU mpstat.
>>>>>>>>>>>>> This is the output of my mpstat:
>>>>>>>>>>>>>
>>>>>>>>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$
>>>>>>>>>>>>> mpstat 1
>>>>>>>>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>>>>>>>>
>>>>>>>>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>>>>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>>>>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>>>>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>>>>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>>>>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>>>>>>>>
>>>>>>>>>>>>> The first line, as well as the time at the beginning of each
>>>>>>>>>>>>> entry, is messing up the parsing in mpstat.pl (the fields are also
>>>>>>>>>>>>> different). Any plans to support this?
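To make the format difference concrete, here is a rough Python sketch of what a parser for the Linux output has to do: skip the `Linux ...` banner, take the column names from the header row, and strip the leading timestamp from each sample. It is not a patch for mpstat.pl, whose internals are not shown in this thread. The function name is illustrative, the sample input is taken from the output quoted above, and 12-hour locales (where the timestamp is followed by AM/PM) are not handled.

```python
def parse_sysstat_mpstat(text):
    """Parse Linux (sysstat) mpstat output into a list of dicts.

    Assumes the layout quoted above: a banner line, then a header row
    "<time> CPU %user ...", then one "<time> <cpu> <values...>" row per sample.
    """
    columns = None
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Linux":
            continue  # blank line or banner
        if len(fields) > 1 and fields[1] == "CPU":
            columns = fields[2:]  # header row: drop the timestamp and "CPU"
            continue
        if columns is None or len(fields) != len(columns) + 2:
            continue  # not a sample row in the expected layout
        timestamp, cpu = fields[0], fields[1]
        values = [float(v) for v in fields[2:]]
        samples.append({"time": timestamp, "cpu": cpu, **dict(zip(columns, values))})
    return samples


SAMPLE = """\
Linux 2.6.18.8-xen (olio-client00)     01/16/10

16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
"""

if __name__ == "__main__":
    for row in parse_sysstat_mpstat(SAMPLE):
        print(row["time"], row["cpu"], row["%idle"], row["intr/s"])
```

Reading the column names from the header row, rather than hard-coding positions, is what lets the same loop cope with the extra fields (%steal, intr/s and so on) that differ between mpstat versions.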
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) Scaling questions.
>>>>>>>>>>>>> - So far I have not had a single experiment pass. Some are
>>>>>>>>>>>>> pretty close, with only one metric check failing:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Average images loaded per Home Page   2.79   >= 3   FAILED
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas? Is it the case that the disk is not fast enough? I
>>>>>>>>>>>>> am just using the local filesystem for the filestore.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - As I double the number of concurrent users I observe linear
>>>>>>>>>>>>> scaling in the throughput.
>>>>>>>>>>>>> Con Users   Throughput
>>>>>>>>>>>>>        25        4.967
>>>>>>>>>>>>>        50       10.06
>>>>>>>>>>>>>       100       19.375
>>>>>>>>>>>>>       200       40.21
>>>>>>>>>>>>>       400       75.818
>>>>>>>>>>>>>       800        0.383
>>>>>>>>>>>>>      1000        0.483
>>>>>>>>>>>>>
>>>>>>>>>>>>> The linear scaling stops at 400 concurrent users (only one
>>>>>>>>>>>>> agent). Actually, it would be exactly linear (a value of ~80), but almost half
>>>>>>>>>>>>> of the login operations failed. I am looking into it.
>>>>>>>>>>>>> Any insights on what might be the first thing failing?
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>>>>>>>>> logged. It looks like those are being discarded... (?)
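A quick way to see where the scaling breaks is to divide throughput by the number of concurrent users and compare each run against the 25-user baseline. The Python sketch below does just that with the numbers copied from the table above. The 0.9 threshold is an arbitrary choice for illustration.

```python
# (concurrent users, throughput) pairs copied from the table above
RUNS = [
    (25, 4.967),
    (50, 10.06),
    (100, 19.375),
    (200, 40.21),
    (400, 75.818),
    (800, 0.383),
    (1000, 0.483),
]


def scaling_efficiency(runs):
    """Return (users, efficiency) pairs, where efficiency is the per-user
    throughput relative to the first (baseline) run; 1.0 is perfectly linear."""
    base_users, base_tput = runs[0]
    baseline = base_tput / base_users
    return [(users, (tput / users) / baseline) for users, tput in runs]


def first_breakdown(runs, threshold=0.9):
    """Return the first user count whose efficiency drops below threshold,
    or None if every run stays at or above it."""
    for users, eff in scaling_efficiency(runs):
        if eff < threshold:
            return users
    return None


if __name__ == "__main__":
    for users, eff in scaling_efficiency(RUNS):
        print(f"{users:5d} users: {eff:6.3f} of baseline per-user throughput")
    print("first run below threshold:", first_breakdown(RUNS))
```

With these numbers the 400-user run is still within about 5% of linear, and the collapse happens somewhere between 400 and 800 users, which suggests rerunning at a few points in that interval to narrow down where it starts.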
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bonus question:
>>>>>>>>>>>>> In the runtime statistics
>>>>>>>>>>>>> <runtimeStats enabled="true">
>>>>>>>>>>>>>          <interval>30</interval>
>>>>>>>>>>>>>  </runtimeStats>
>>>>>>>>>>>>>
>>>>>>>>>>>>> only the 90% response time is reported. Is there an easy way to
>>>>>>>>>>>>> also report the 99%? (Or do I need to add code for that?)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks a lot again in advance.
>>>>>>>>>>>>> -VK
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Olio Scaling

Posted by Shanti Subramanyam <sh...@gmail.com>.
If you want to run multiple webservers on different systems, you must have
access to the filestore from all of them. The easiest way to do this is to
nfs-mount the filestore from the server it resides on so it is accessible to
the other machines as well.

Shanti

On Thu, Feb 11, 2010 at 9:42 PM, Vasileios Kontorinis <bkontorinis@gmail.com
> wrote:

> Shanti hi again,
>    Sorry for not submitting the JIRA on time, I am extremely busy lately.
>
> I have a quick question regarding the way the webserver interacts with the
> filestore. I ran some scaling studies with one, two and three different
> servers while having only one filestore (I do specify that in the run.xml
> configuration file, webServer and dataStorage).
> The filestore is a local folder on one of the server machines. However, in
> the oliophp/etc/config.php I also specify on each server
>
> $olioconfig['fileSystem'] = 'LocalFS';
> $olioconfig['localfsRoot'] = '/home/gdhiman/filestore';
>
> As a result, I do get WARNINGS for missing files on the webservers that do
> not host the filestore. What is the right configuration for
> oliophp/etc/config.php? Can I somehow detach the filestore from the
> webserver so that it requests files remotely?
>
>
> Thanks again.
> -------------------------------------------------------------------
> Kontorinis Vasileios
> Phd student, University of California San Diego
> San Diego, CA 92122
> Cell. phone: (858) 717 6899
> bkontorinis@gmail.com, vkontori@ucsd.edu
> -------------------------------------------------------------------
>
>
> 2010/2/8 Shanti Subramanyam <sh...@gmail.com>
>
>
>>
>> On Mon, Feb 8, 2010 at 3:53 PM, Vasileios Kontorinis <
>> bkontorinis@gmail.com> wrote:
>>
>>>
>>>
>>>> We need to look into this issue  - I suspect that something subtle has
>>>> changed in 0.2 which hasn't got accounted for in the expected #images
>>>> loaded. Can I please request that you file a JIRA on this ?
>>>>
>>>
>>> How do I do this? Pointers?
>>>
>>
>> http://issues.apache.org
>>
>>
>>> I tried runs of 20mins to verify that longer runs will not make it better
>>> and it's still failing for just 50 users.
>>>
>>
>> What worries me is that you're saying it  fails for 1800 users too - I can
>> understand it may fail for 50 users, but if it fails for larger #users, then
>> it is a bug.
>>
>>>
>>>
>>
>>> and I do get the repetitive patterns you mentioned. However, the cache_MB
>>> never exceeds 0.05...
>>> I would expect that memcache size is really important for the application
>>> scaling. What is the point of having a separate memcache server if we are
>>> only using less than 50KB(?) of memory for caching?
>>>
>>>
>> Try running without memcached - it can be easily configured in the app's
>> etc/config.php. Then you will see what difference the cache makes. The
>> reduction in db traffic is dramatic, resulting in the response times you see.
>> The reason the size is small is that we are currently only caching the
>> home page, which is shared. We have not bothered to implement any additional
>> caching as this level of caching is sufficient to reduce the db load.
>>
>> Regards
>>> -VK
>>>
>>>  Shanti
>>
>>>
>>>
>>>> Shanti
>>>>
>>>>
>>>>> Thanks again
>>>>> -------------------------------------------------------------------
>>>>> Kontorinis Vasileios
>>>>> Phd student, University of California San Diego
>>>>> San Diego, CA 92122
>>>>> Cell. phone: (858) 717 6899
>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>> -------------------------------------------------------------------
>>>>>
>>>>>
>>>>> 2010/1/27 Shanti Subramanyam <sh...@gmail.com>
>>>>>
>>>>>> Yes - these are problems that I'm already aware of.
>>>>>> The best solution to the filestore issue is to change ownership of the
>>>>>> directory to the same user/group as the apache process. We could have the
>>>>>> fileloader.sh change write access I guess, but since that's a big security
>>>>>> hole, we may not want to do that automatically without letting the user know
>>>>>> about it.
>>>>>>
>>>>>> The fact that your response times are so high indicates that you're
>>>>>> running a far larger load than the system can handle and/or you still need
>>>>>> some tuning.
>>>>>> I suggest you start over from say 100 users and see at what point your
>>>>>> response times start getting really large. The apache error log should be
>>>>>> pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>>>>>>
>>>>>> Shanti
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>
>>>>>>> Shanti hi again,
>>>>>>>    I checked my apache logs and there were a bunch of errors.
>>>>>>> It looks like there are some issues with
>>>>>>> webapp/php/trunk/classes/ImageUtil.php in the last release of olio. (I
>>>>>>> downloaded
>>>>>>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>>>>>>> 1) There is a line that needs to be commented out. php complains ("1.5.
>>>>>>> Must be greater than zero.").
>>>>>>> 2) Then, it was complaining that it cannot find the function
>>>>>>> fastimagecopyresampled. To work around that, I moved the function
>>>>>>> fastimagecopyresampled above createThumb (this might not be required) and
>>>>>>> declared it static.
>>>>>>>     Finally, I call the function from createThumb with
>>>>>>> self::fastimagecopyresampled.
>>>>>>> 3) Then, it started complaining because it could not write to the
>>>>>>> filestore. The problem is that it wants to write the new images as www-data
>>>>>>> from apache, while the filestore does not have write permission for
>>>>>>> others. Manually
>>>>>>>     giving access solves the problem (chmod -R o+w <path>/filestore),
>>>>>>> but since the directories in filestore are generated automatically, maybe
>>>>>>> the chmod command should be added to fileloader.sh.
>>>>>>>
>>>>>>> Funnily enough, after fixing those issues, I still cannot pass the:
>>>>>>> Average images loaded per Home Page 2.65   >=3       FAILED
>>>>>>>
>>>>>>> and on top of that I also have:
>>>>>>> Response Times (secs)
>>>>>>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>>>>>>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>>>>>>
>>>>>>> Think times for AddPerson and AddEvent fail as well.
>>>>>>>
>>>>>>> Any insights are welcome .... :-(
>>>>>>>
>>>>>>> -------------------------------------------------------------------
>>>>>>> Kontorinis Vasileios
>>>>>>> Phd student, University of California San Diego
>>>>>>> San Diego, CA 92122
>>>>>>> Cell. phone: (858) 717 6899
>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>> -------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>>>>> 2010/1/26 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>
>>>>>>>> Yes - 0.2 requires a lot more disk space as we changed the ratio of
>>>>>>>> concurrent users to registered users to 1:100. If you haven't already,
>>>>>>>> please check out our published Blueprints for detailed performance
>>>>>>>> characteristics of the workload:
>>>>>>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>>>>>>> Operating System
>>>>>>>> <http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>>>>>>> If you run for long enough, you should get passing runs. Have you
>>>>>>>> verified that there are no errors in the run logs when you see the 'Avg.
>>>>>>>> images loaded per home page' fail ?
>>>>>>>>
>>>>>>>> On to your open files error  - you may have to tune your networking
>>>>>>>> tier and/or #open file descriptors. I don't believe we have ever seen as
>>>>>>>> many files open as you are seeing. Can you determine whether these are from
>>>>>>>> the file store or network ? We also typically run the filestore on a
>>>>>>>> different system and nfs-mount it on the webserver box.
>>>>>>>> You will have to tune your system to ensure good performance since
>>>>>>>> you will need memory for both apache and files.
>>>>>>>>
>>>>>>>> Shanti
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Akara and Shanti hi,
>>>>>>>>>    I did migrate to Olio 0.2. With the last version of Olio I came
>>>>>>>>> across some new interesting things.
>>>>>>>>>
>>>>>>>>> Scaling issues:
>>>>>>>>>   - I am still getting the:
>>>>>>>>> Average images loaded per Home Page 2.55   >=3       FAILED
>>>>>>>>>  - additionally, when I scale the concurrent users to 800 I run out
>>>>>>>>> of disk space since my filestore occupies more than 62GB.
>>>>>>>>> Actually for 600 users it occupies 50GB. I was curious if that
>>>>>>>>> makes sense. How much space will I need to reach 1000 users?
>>>>>>>>> The php_setup.html suggests that we will need 50GB but
>>>>>>>>> apparently we need way more for a large number of users.
>>>>>>>>>
>>>>>>>>>  - Finally and most importantly, for 600 users many of the
>>>>>>>>> operations fail with the exception:
>>>>>>>>> Message: java.net.SocketException: Too many open files
>>>>>>>>> Stack Trace:
>>>>>>>>>  Class Method Line
>>>>>>>>>  java.net.PlainSocketImpl socketAccept
>>>>>>>>>  java.net.PlainSocketImpl accept 390
>>>>>>>>>  java.net.ServerSocket implAccept 453
>>>>>>>>>  java.net.ServerSocket accept 421
>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop 369
>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop run 341
>>>>>>>>>  java.lang.Thread run 619
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> java.net.SocketException: Too many open files
>>>>>>>>> Stack Trace:
>>>>>>>>>  Class Method Line
>>>>>>>>>  java.net.Socket createImpl 394
>>>>>>>>>  java.net.Socket getImpl 457
>>>>>>>>>  java.net.Socket bind 571
>>>>>>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory createSocket 60
>>>>>>>>>  org.apache.commons.httpclient.HttpConnection open 707
>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector executeWithRetry 387
>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector executeMethod 171
>>>>>>>>>  org.apache.commons.httpclient.HttpClient executeMethod 397
>>>>>>>>>  org.apache.commons.httpclient.HttpClient executeMethod 323
>>>>>>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport readURL 274
>>>>>>>>>  org.apache.olio.workload.driver.UIDriver doLogin 398
>>>>>>>>>  org.apache.olio.workload.driver.UIDriver doLogin 424
>>>>>>>>>  sun.reflect.GeneratedMethodAccessor8 invoke
>>>>>>>>>  sun.reflect.DelegatingMethodAccessorImpl invoke 25
>>>>>>>>>  java.lang.reflect.Method invoke 597
>>>>>>>>>  com.sun.faban.driver.engine.TimeThread doRun 169
>>>>>>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>>>>>>
>>>>>>>>> I am monitoring the number of open files on the web server with
>>>>>>>>> `watch "lsof | wc"` and olio starts failing when around 65,000-70,000
>>>>>>>>> files are open. lsof shows that for each apache2 thread there are around 100
>>>>>>>>> files open. Therefore there are around 650-700 different apache2 threads
>>>>>>>>> that create the bulk of those open file descriptors.
>>>>>>>>> The soft and hard limits are set to 403238, which means that there
>>>>>>>>> should be many more open files before it starts failing.
>>>>>>>>> (Actually, I verified the limit by opening a bunch of files with a
>>>>>>>>> python script and it does reach the limit of 403238.)
>>>>>>>>> Any insights? Is there any chance that the file descriptors take
>>>>>>>>> more time than usual to be reclaimed after being closed in the xen vm I use
>>>>>>>>> for my web server? Does it make sense for olio in the first place to have so
>>>>>>>>> many files open at the same time?
>>>>>>>>>
>>>>>>>>> Thanks again.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>> Kontorinis Vasileios
>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>> San Diego, CA 92122
>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/1/16 Shanti Subramanyam <sh...@gmail.com>
>>>>>>>>>
>>>>>>>>>> I would really recommend that you migrate to Olio 0.2. In addition
>>>>>>>>>> to bug fixes, there are some major feature changes in it. See Olio
>>>>>>>>>> 0.2 released <http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Shanti
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Akara hi again,
>>>>>>>>>>>    Below I have comments on your suggestions and at the end some
>>>>>>>>>>> bonus questions... Thanks again.
>>>>>>>>>>>
>>>>>>>>>>> 2010/1/13 Akara Sucharitakul <Ak...@sun.com>
>>>>>>>>>>>
>>>>>>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>>>>>>> below for answers/comments:
>>>>>>>>>>>>
>>>>>>>>>>>> Sure. I cced olio user alias. I am not sure which is the right
>>>>>>>>>>> faban list.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Akara hi,
>>>>>>>>>>>>>   I am a grad student at UCSD and I use Olio for a research
>>>>>>>>>>>>> project where we want to measure olio performance under live virtual machine
>>>>>>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>>>>>>> I have checked out the latest version of olio from the online svn
>>>>>>>>>>>>> repository and downloaded the latest version of faban (faban-kit-101509.tar.gz
>>>>>>>>>>>>> <http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 101509 is fairly recent. But the latest on the web site is
>>>>>>>>>>>> 111109 (Faban 1.0). There were just bug fixes between those releases.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the
>>>>>>>>>>> release of 2.0 was announced, will switch to it if I run into bugs that have
>>>>>>>>>>> been fixed)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> So far, I employed a bunch of hacks to get most of it to work
>>>>>>>>>>>>> and I am almost there. In the process I got a bunch of questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Questions (some of them might be just faban related, not olio
>>>>>>>>>>>>> so bear with me):
>>>>>>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the
>>>>>>>>>>>>> command line? Firefox through ssh forwarding is dead slow and I'd rather
>>>>>>>>>>>>> avoid it if I can.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy
>>>>>>>>>>>> itself. This is documented at
>>>>>>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html
>>>>>>>>>>>> under "Alternate Deployment Methods."
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 2) Should the services ApacheHttpdService, MemcachedService, and
>>>>>>>>>>>>> MySQLService that come with Faban be deployed before running Olio?
>>>>>>>>>>>>>    I was getting some very weird errors, e.g.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>>>>>>
>>>>>>>>>>>> Done
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating
>>>>>>>>>>>>> benchmark run
>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully
>>>>>>>>>>>>> terminated.
>>>>>>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>>>>>>    at com.sun.faban.driver.transport.util.TimedInputStream.read
>>>>>>>>>>>>> (139)
>>>>>>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine (78)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine
>>>>>>>>>>>>> (1116)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodBase.readStatusLine (1973)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readResponse
>>>>>>>>>>>>> (1735)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute
>>>>>>>>>>>>> (1098)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod (171)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>> (397)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>> (323)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (529)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (552)
>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage (355)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>
>>>>>>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>>>>>>> to kill the benchmark.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These threads are hanging reading the server responses, that
>>>>>>>>>>>> never came.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Building the services from Faban probably fixes it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> In the Olio log there are WARNINGS  complaining about not
>>>>>>>>>>>>> deploying those. After building those and manually copying them to
>>>>>>>>>>>>> /faban/services (ant deploy did not place them there... :-(  )
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes. But ant deploy should get them there. If not, can you
>>>>>>>>>>>> please let me know the ant messages?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ant was deploying them indeed. I had a mistake in
>>>>>>>>>>> building.properties.
>>>>>>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of
>>>>>>>>>>> faban.url=http://localhost:9980/
>>>>>>>>>>> After I changed that it started working...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  it worked. (mostly worked)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) I still have warnings like:
>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These two are OK. Just trying to do a clock sync between the
>>>>>>>>>>>> systems.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  01:38:08:WARNING:olio-web wakeup-before time reached 700ms
>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is one of Faban's clock-setting calibrations. If the system
>>>>>>>>>>>> is too busy or you run on some virtualization architectures, the lag time
>>>>>>>>>>>> between an intended end of sleep and the actual time when the thread really
>>>>>>>>>>>> wakes up (gets scheduled/executed) is too high, calibrations will fail.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  01:38:08:INFO:Time difference to host olio-mem is 262 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms
>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>
>>>>>>>>>>>>> Letting faban change the vm clock sounds like a bad idea from
>>>>>>>>>>>>> the beginning.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> OK. So it is xen. Yes, this is what Faban is trying to solve.
>>>>>>>>>>>> You can certainly turn it off. Please see:
>>>>>>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file
>>>>>>>>>>> (btw, the link above has a mistake: it shows
>>>>>>>>>>> <fh:timeSync>false<fh:timeSync>, but the second <fh:timeSync> needs to be a
>>>>>>>>>>> closing tag; the "/" is missing)
>>>>>>>>>>> and that made the warnings go away.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Unfortunately, xen is really bad in maintaining an accurate
>>>>>>>>>>>>> clock. As a result there is usually time difference between the different
>>>>>>>>>>>>> virtual machines
>>>>>>>>>>>>> of more than 10ms. I went over the setTime function in Faban
>>>>>>>>>>>>> source (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java), it's big and
>>>>>>>>>>>>> ugly (very ugly)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the compliments! I think you mean
>>>>>>>>>>>> CmdService.setClockTask. Time sensitive code ain't pretty. It is the
>>>>>>>>>>>> complexities dealing with the clock and trying to achieve good accuracy. If
>>>>>>>>>>>> you think you can simplify this, I'm listening (without losing the
>>>>>>>>>>>> accuracy, of course). In comparison, CmdAgentImpl has nothing.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Yes, you're right, it is CmdService.setClockTask. The previous
>>>>>>>>>>> email was composed at 3am ... :-)
>>>>>>>>>>> I am still a little confused. The setClockTask is used to set
>>>>>>>>>>> the clock so that all the machines are synchronized with the master. From what
>>>>>>>>>>> you mentioned, the physical clock sync is only used for the logs.
>>>>>>>>>>> Why do we need to do that, since 1) it requires root privileges
>>>>>>>>>>> (which might not always be available) and 2) I could imagine an alternative that
>>>>>>>>>>> uses deltas from the actual physical clock without having to set it?
>>>>>>>>>>> (I am probably missing something... :-)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> Why is there this strict requirement for a 10ms difference? Any
>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It is easily achievable in most cases. May not be true for VMs.
>>>>>>>>>>>>
>>>>>>>>>>>> On some VM architectures, the OS however does not get scheduled
>>>>>>>>>>>> till way after that, thus causing problems. You may be able to measure
>>>>>>>>>>>> performance on those VMs. But you don't want to use such VMs to be a driver.
>>>>>>>>>>>> Your response time measurements will be way off.
>>>>>>>>>>>>
>>>>>>>>>>>> The physical clock sync is not really rigorous. And you can turn
>>>>>>>>>>>> it off. It is more to keep the systems in good time sync. If your VM stands
>>>>>>>>>>>> in the way, just turn it off. The driver's virtual clock sync is much more
>>>>>>>>>>>> picky in comparison. This is because the start time for the steady state
>>>>>>>>>>>> should be the same (with a very small tolerance) no matter how many drivers
>>>>>>>>>>>> are driving. Otherwise the measurement period won't be the same when viewed
>>>>>>>>>>>> from different drivers and the results won't be reliable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That's why we don't use ntp ;-)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Just out of curiosity, the physical clocks are set only once at
>>>>>>>>>>> the beginning (right?), therefore for long runs the 10ms difference will not
>>>>>>>>>>> be guaranteed, no? Especially under VMs I've seen significant clock
>>>>>>>>>>> differences within a few minutes.
>>>>>>>>>>> At least ntp can periodically resync (of course doing so might
>>>>>>>>>>> screw up the logs with time going backwards etc.)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  I am thinking of modifying this function to always return that
>>>>>>>>>>>>> the time difference is less than 10ms (so that I do not have to wait all the
>>>>>>>>>>>>> time for the timeouts.)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Why bother. Don't like it, just turn it off. It has good use in
>>>>>>>>>>>> most configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Will this break anything in Olio?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Nope. Except the times in your logs will appear out of sequence.
>>>>>>>>>>>> They rely on the local time on the originating systems.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 4) Warnings like:
>>>>>>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg
>>>>>>>>>>>>> size of 249 bytes is too small. Image may not exist
>>>>>>>>>>>>> can be ignored, right?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, something is wrong. We don't have images that small. Check
>>>>>>>>>>>> whether e168t.jpg is really that small. That's why we have that warning.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It's kind of funny: my problem was that I had the olio webkit version
>>>>>>>>>>> installed and then I downloaded the version from the online svn repository.
>>>>>>>>>>> I built the driver but forgot to update the web pages on my apache server,
>>>>>>>>>>> which, as expected, was the source of many of my issues.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 5) Last and most important.
>>>>>>>>>>>>> I can run the benchmark and all the operations succeed except for
>>>>>>>>>>>>> login.
>>>>>>>>>>>>> I get a bunch of:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt
>>>>>>>>>>>>> at index 2926, Login as at786o08x, 2178 failed.
>>>>>>>>>>>>> Note: Error not counted in result.
>>>>>>>>>>>>> Either transaction start or end time is not within steady
>>>>>>>>>>>>> state.
>>>>>>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926,
>>>>>>>>>>>>> Login as at786o08x, 2178 failed.
>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas? I do get
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You likely have cookie issues. It can't seem to hold on to a
>>>>>>>>>>>> session.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Well, there was a permission issue with the http_session dir. I
>>>>>>>>>>> could not write to it. Running chmod 777 on it fixed this.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> (I've found online:
>>>>>>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html
>>>>>>>>>>>>> which is similar, but when I added
>>>>>>>>>>>>>
>>>>>>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>>>>>>  in build.properties
>>>>>>>>>>>>> I did not see any cookie related warnings. Those should appear
>>>>>>>>>>>>> in the olio run log or the apache log, right? Am I just looking in the wrong
>>>>>>>>>>>>> place? )
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, that's applicable only to the Sun Http Transport. The
>>>>>>>>>>>> version of Olio you're using is based on the Apache Http Transport (Apache
>>>>>>>>>>>> HttpClient 3.1). The ThreadCookieHandler is not used for the Apache
>>>>>>>>>>>> transport and that's why you don't see any logs. Try upgrading to Faban 1.0
>>>>>>>>>>>> before looking at other things.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's a long email I know. Your feedback would be most
>>>>>>>>>>>>> appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Regards
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>>> Kontorinis Vasileios
>>>>>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>>>>>> San Diego, CA 92122
>>>>>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>>>>>> bkontorinis@gmail.com <ma...@gmail.com>,
>>>>>>>>>>>>> vkontori@ucsd.edu <ma...@ucsd.edu>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for all the questions/comments.
>>>>>>>>>>>>
>>>>>>>>>>>> -Akara
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And now some more questions/ comments:
>>>>>>>>>>> 1) I get the following error:
>>>>>>>>>>>
>>>>>>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>>>>>>> /usr/data/olio-db.err
>>>>>>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does
>>>>>>>>>>> not exist.
>>>>>>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0
>>>>>>>>>>> (790)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run
>>>>>>>>>>> (649)
>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
>>>>>>>>>>> (885)
>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>>     at
>>>>>>>>>>> sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer (255)
>>>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get (null)
>>>>>>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs
>>>>>>>>>>> (200)
>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs
>>>>>>>>>>> (642)
>>>>>>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start (323)
>>>>>>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>>>>>>
>>>>>>>>>>> Apparently something is misconfigured in my db-server. Any ideas?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2) I get the following error:
>>>>>>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi,
>>>>>>>>>>> process, /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>>>>>>> stderr:
>>>>>>>>>>> Error in executing perl
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>> Error in executing perl
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>>
>>>>>>>>>>> Actually I traced this one back. The problem is the difference in
>>>>>>>>>>> output format between Sun's mpstat and the default GNU mpstat.
>>>>>>>>>>> This is the output of my mpstat:
>>>>>>>>>>>
>>>>>>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$ mpstat 1
>>>>>>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>>>>>>
>>>>>>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>>>>>>
>>>>>>>>>>> The first line, as well as the time at the beginning of each entry,
>>>>>>>>>>> messes up the parsing in mpstat.pl (the fields are also
>>>>>>>>>>> different). Any plans to support this?
>>>>>>>>>>>
>>>>>>>>>>> 3) Scaling questions.
>>>>>>>>>>> - So far I did not have a single experiment passing. Some are
>>>>>>>>>>> pretty close with only one metric check failing.
>>>>>>>>>>>
>>>>>>>>>>> Average images loaded per Home Page 2.79   >=3       FAILED
>>>>>>>>>>> Any ideas? Is it the case that the disc is not fast enough? I am
>>>>>>>>>>> just using the local filesystem for the filestore.
>>>>>>>>>>>
>>>>>>>>>>> - As I double the number of concurrent users I observe linear
>>>>>>>>>>> scaling in the throughput.
>>>>>>>>>>> Con Users         Throughput
>>>>>>>>>>>  25                        4.967
>>>>>>>>>>>  50                       10.06
>>>>>>>>>>> 100                      19.375
>>>>>>>>>>> 200                      40.21
>>>>>>>>>>> 400                      75.818
>>>>>>>>>>> 800                       0.383
>>>>>>>>>>> 1000                     0.483
>>>>>>>>>>>
>>>>>>>>>>> The linear scaling stops for 400 concurrent users ( only one
>>>>>>>>>>> agent). Actually it would be exactly linear (value of ~80) but almost half
>>>>>>>>>>> of the login operations failed. I am looking into it.
>>>>>>>>>>> Any insights on what might be the first thing failing?
>>>>>>>>>>>
>>>>>>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>>>>>>> logged. It looks like those are being discarded... (?)
>>>>>>>>>>>
>>>>>>>>>>> Bonus question:
>>>>>>>>>>> In the runtime statistics
>>>>>>>>>>> <runtimeStats enabled="true">
>>>>>>>>>>>          <interval>30</interval>
>>>>>>>>>>>  </runtimeStats>
>>>>>>>>>>>
>>>>>>>>>>> only the 90% response time is reported. Is there an easy way to
>>>>>>>>>>> also report the 99%? (Or do I need to add code for that?)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot again in advance.
>>>>>>>>>>> -VK
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Olio Scaling

Posted by Shanti Subramanyam <sh...@gmail.com>.
On Mon, Feb 8, 2010 at 1:18 AM, Vasileios Kontorinis
<bk...@gmail.com> wrote:

> Akara and Shanti,
>   I managed to fix a very subtle issue with xen. There was an issue with
> the checksum that reduces the throughput of the network from 1Gbs to 1Mbs.
>

Wow !  Is this a generic issue with Xen ?

> When that was fixed I managed to scale to 1800 concurrent users.
> However, the only metric failing now is the
>
> Average images loaded per Home Page 2.65   >=3       FAILED
>
> Actually I managed to get a passing result for 25 users.
>
>
We need to look into this issue  - I suspect that something subtle has
changed in 0.2 which hasn't got accounted for in the expected #images
loaded. Can I please request that you file a JIRA on this ?

> I also had some questions regarding Memcached. In the MemcachedStats output
> log I get:
>
> Server              Time  items  cache_MB  conns  sets/s  gets/s  get_hits/s  get_misses/s  evicts/s  rB/s    wB/s
> --------------  --------  -----  --------  -----  ------  ------  ----------  ------------  --------  ----  ------
> olio-mem:11211  04:20:47      3      0.05     34    1.70   13.10       10.30          2.80         0  5709  216402
>
>
> Server              Time  items  cache_MB  conns  sets/s  gets/s  get_hits/s  get_misses/s  evicts/s  rB/s  wB/s
> --------------  --------  -----  --------  -----  ------  ------  ----------  ------------  --------  ----  ----
> olio-mem:11211  04:20:47      3      0.05     34    0.00    0.00        0.00          0.00         0     0    48
>
>
> Server              Time  items  cache_MB  conns  sets/s  gets/s  get_hits/s  get_misses/s  evicts/s  rB/s  wB/s
> --------------  --------  -----  --------  -----  ------  ------  ----------  ------------  --------  ----  ----
> olio-mem:11211  04:20:47      3      0.05     34    0.00    0.00        0.00          0.00         0     0    48
>
>
> Does this mean that I only use 0.05 MB of the memcached memory?
> I am pretty sure that the memcached command has -m 256, which means that
> I should get close to 256 MB when running with a high number of users.
> Is cache_MB something different?
>


Your cache_MB size is correct; we actually cache very little in memcached.
However, the number of 'conns' you are seeing is worrisome. I have typically
seen at least as many connections as concurrent users (so you should see
around 1800). Your first entry looks good for the other stats, but you
should see similar numbers (rB/s, wB/s, get_hits/s, etc.) in the other
entries as well. Depending on how frequently you sample, you will see some
entries with all zeros (like the ones you have).
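
As a rough way to check these numbers, the sketch below parses one row of the
MemcachedStats output quoted above and derives the hit ratio and the number of
connections per concurrent user. The function names and the dictionary layout
are assumptions made for illustration; they are not part of Faban or Olio. The
1800 is the concurrent-user count reported in this thread.

```python
# Column names as printed in the MemcachedStats header quoted in this thread.
COLUMNS = ["server", "time", "items", "cache_MB", "conns", "sets/s", "gets/s",
           "get_hits/s", "get_misses/s", "evicts/s", "rB/s", "wB/s"]


def parse_stats_row(row):
    """Split one whitespace-separated data row into a dict keyed by column."""
    fields = row.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    parsed = dict(zip(COLUMNS, fields))
    # Everything after the server name and the timestamp is numeric.
    for name in COLUMNS[2:]:
        parsed[name] = float(parsed[name])
    return parsed


def hit_ratio(stats):
    """Fraction of gets served from cache; None for an idle (all-zero) sample."""
    gets = stats["gets/s"]
    return stats["get_hits/s"] / gets if gets > 0 else None


row = ("olio-mem:11211  04:20:47      3      0.05     34    1.70   13.10"
       "       10.30          2.80         0  5709  216402")
stats = parse_stats_row(row)
concurrent_users = 1800  # load level reported in this thread

print("hit ratio: %.2f" % hit_ratio(stats))
print("conns per concurrent user: %.3f" % (stats["conns"] / concurrent_users))
```

A connections-per-user value far below 1, as here, is the symptom described
above: most simulated users do not appear to hold a memcached connection.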

Shanti


> Thanks again
> -------------------------------------------------------------------
> Kontorinis Vasileios
> Phd student, University of California San Diego
> San Diego, CA 92122
> Cell. phone: (858) 717 6899
> bkontorinis@gmail.com, vkontori@ucsd.edu
> -------------------------------------------------------------------
>
>
> 2010/1/27 Shanti Subramanyam <sh...@gmail.com>
>
>> Yes - these are problems that I'm already aware of.
>> The best solution to the filestore issue is to change ownership of the
>> directory to the same user/group as the apache process. We could have the
>> fileloader.sh change write access I guess, but since that's a big security
>> hole, we may not want to do that automatically without letting the user know
>> about it.
>>
>> The fact that your response times are so high indicates that you're running
>> a far larger load than the system can handle and/or you still need some
>> tuning.
>> I suggest you start over from say 100 users and see at what point your
>> response times start getting really large. The apache error log should be
>> pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>>
>> Shanti
>>
>>
>> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
>> bkontorinis@gmail.com> wrote:
>>
>>> Shanti hi again,
>>>    I checked my apache logs and there were a bunch of errors.
>>> It looks like there are some issues with
>>> webapp/php/trunk/classes/ImageUtil.php in the latest release of Olio. (I
>>> downloaded
>>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>>> 1) There is a line that needs to be commented out. PHP complains ("1.5. Must
>>> be greater than zero.").
>>> 2) Then it complained that it could not find the function
>>> fastimagecopyresampled. To work around that, I moved
>>> fastimagecopyresampled above createThumb (this might not be required) and
>>> declared it static.
>>>     Finally, I call the function from createThumb with
>>> self::fastimagecopyresampled.
>>> 3) Then it started complaining because it could not write to the
>>> filestore. The problem is that Apache wants to write the new images as
>>> www-data, while the filestore does not have write permission for
>>> others. Manually
>>>     giving access solves the problem (chmod -R o+w <path>/filestore), but
>>> since the directories in filestore are generated automatically, maybe the
>>> chmod command should be added to fileloader.sh.
>>>
>>> Funnily enough, after fixing those issues, I still cannot pass the:
>>> Average images loaded per Home Page 2.65   >=3       FAILED
>>>
>>> and on top of that I also have:
>>> Response Times (secs)
>>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>>
>>> Think times for AddPerson and AddEvent fail as well.
>>>
>>> Any insights are welcome .... :-(
>>>
>>> -------------------------------------------------------------------
>>> Kontorinis Vasileios
>>> Phd student, University of California San Diego
>>> San Diego, CA 92122
>>> Cell. phone: (858) 717 6899
>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>> -------------------------------------------------------------------
>>>
>>>
>>> 2010/1/26 Shanti Subramanyam <sh...@gmail.com>
>>>
>>>> Yes - 0.2 requires a lot more disk space as we changed the ratio of
>>>> concurrent users to registered users to 1:100. If you haven't already,
>>>> please check out our published Blueprints for detailed performance
>>>> characteristics of the workload:
>>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>>> Operating System<http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>>> If you run for long enough, you should get passing runs. Have you
>>>> verified that there are no errors in the run logs when you see the 'Avg.
>>>> images loaded per home page' fail ?
>>>>
>>>> On to your open files error  - you may have to tune your networking tier
>>>> and/or #open file descriptors. I don't believe we have ever seen as many
>>>> files open as you are seeing. Can you determine whether these are from the
>>>> file store or network ? We also typically run the filestore on a different
>>>> system and nfs-mount it on the webserver box.
>>>> You will have to tune your system to ensure good performance since you
>>>> will need memory for both apache and files.
>>>>
>>>> Shanti
>>>>
>>>>
>>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>>> bkontorinis@gmail.com> wrote:
>>>>
>>>>> Akara and Shanti hi,
>>>>>    I did migrate to Olio 0.2. With the last version of Olio I came
>>>>> across some new interesting things.
>>>>>
>>>>> Scaling issues:
>>>>>   - I am still getting the:
>>>>> Average images loaded per Home Page 2.55   >=3       FAILED
>>>>>  - Additionally, when I scale the concurrent users to 800 I run out of
>>>>> disk space, since my filestore occupies more than 62GB.
>>>>> Actually, for 600 users it occupies 50GB. I was curious whether that makes
>>>>> sense. How much space will I need to reach 1000 users?
>>>>> php_setup.html suggests that we will need 50GB, but apparently
>>>>> we need far more for a large number of users.
>>>>>
>>>>>  - Finally and most importantly, for 600 users many of the operations
>>>>> fail with the exception:
>>>>> Message: java.net.SocketException: Too many open files
>>>>> Stack Trace:
>>>>>  Class Method Line
>>>>>  java.net.PlainSocketImpl socketAccept
>>>>>  java.net.PlainSocketImpl accept 390
>>>>>  java.net.ServerSocket implAccept 453
>>>>>  java.net.ServerSocket accept 421
>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop 369
>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop run 341
>>>>>  java.lang.Thread run 619
>>>>> or
>>>>>
>>>>> java.net.SocketException: Too many open files
>>>>> Stack Trace:
>>>>>  Class Method Line
>>>>>  java.net.Socket createImpl 394
>>>>>  java.net.Socket getImpl 457
>>>>>  java.net.Socket bind 571
>>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory createSocket 60
>>>>>  org.apache.commons.httpclient.HttpConnection open 707
>>>>>  org.apache.commons.httpclient.HttpMethodDirector executeWithRetry 387
>>>>>  org.apache.commons.httpclient.HttpMethodDirector executeMethod 171
>>>>>  org.apache.commons.httpclient.HttpClient executeMethod 397
>>>>>  org.apache.commons.httpclient.HttpClient executeMethod 323
>>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport readURL 274
>>>>>  org.apache.olio.workload.driver.UIDriver doLogin 398
>>>>>  org.apache.olio.workload.driver.UIDriver doLogin 424
>>>>>  sun.reflect.GeneratedMethodAccessor8 invoke
>>>>>  sun.reflect.DelegatingMethodAccessorImpl invoke 25
>>>>>  java.lang.reflect.Method invoke 597
>>>>>  com.sun.faban.driver.engine.TimeThread doRun 169
>>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>>
>>>>> I am monitoring the number of open files on the web server with
>>>>> `watch "lsof | wc"`, and Olio starts failing when around 65,000-70,000
>>>>> files are open. lsof shows that for each apache2 thread there are around 100
>>>>> files open. Therefore there are around 650-700 different apache2 threads
>>>>> that create the bulk of those open file descriptors.
>>>>> The soft and hard limits are set to 403238, which means that there should
>>>>> be many more open files before it starts failing.
>>>>> (Actually, I verified the limit by opening a bunch of files with a
>>>>> python script, and it does reach the limit of 403238.)
>>>>> Any insights? Is there any chance that file descriptors take more
>>>>> time than usual to be reclaimed after being closed in the Xen VM I use for
>>>>> my web server? Does it make sense in the first place for Olio to have so
>>>>> many files open at the same time?
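
One thing worth ruling out: "Too many open files" is reported when the calling
process reaches its own descriptor limit, and both stack traces quoted above
are Java traces (an RMI accept loop and the Faban driver's doLogin), not
apache2 ones. A system-wide `lsof | wc` also lists entries that are not file
descriptors (for example memory-mapped libraries), so it can overstate
descriptor usage. The sketch below, which assumes a Linux /proc filesystem,
reports the limit and the current descriptor count for the process it runs in.
The function names are illustrative only.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count descriptors held by a process by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir("/proc/%s/fd" % pid))


soft, hard = fd_limits()
print("soft limit:", soft, "hard limit:", hard)
if os.path.isdir("/proc/self/fd"):
    print("descriptors open in this process:", open_fd_count())
```

Passing the pid of another process (for instance a driver agent JVM) to
open_fd_count works the same way, provided the caller has permission to read
that process's /proc entry. Limits are inherited from the shell or service
that started the process, so the value seen in an interactive shell may differ
from the one the JVM actually runs with.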
>>>>>
>>>>> Thanks again.
>>>>>
>>>>>
>>>>> -------------------------------------------------------------------
>>>>> Kontorinis Vasileios
>>>>> Phd student, University of California San Diego
>>>>> San Diego, CA 92122
>>>>> Cell. phone: (858) 717 6899
>>>>> bkontorinis@gmail.com, vkontori@ucsd.edu
>>>>> -------------------------------------------------------------------
>>>>>
>>>>>
>>>>> 2010/1/16 Shanti Subramanyam <sh...@gmail.com>
>>>>>
>>>>>> I would really recommend that you migrate to Olio 0.2. In addition to
>>>>>> bug fixes, there are some major feature changes in it. See Olio
>>>>>> 0.2 released<http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>>
>>>>>>
>>>>>> Shanti
>>>>>>
>>>>>>
>>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>
>>>>>>> Akara hi again,
>>>>>>>    Below I have comments on your suggestions and at the end some
>>>>>>> bonus questions... Thanks again.
>>>>>>>
>>>>>>> 2010/1/13 Akara Sucharitakul <Ak...@sun.com>
>>>>>>>
>>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>>> below for answers/comments:
>>>>>>>>
>>>>>>>> Sure. I cced olio user alias. I am not sure which is the right faban
>>>>>>> list.
>>>>>>>
>>>>>>>
>>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>>
>>>>>>>>> Akara hi,
>>>>>>>>>   I am a grad student at UCSD and I use Olio for a research project
>>>>>>>>> where we want to measure olio performance under live virtual machine
>>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>>> I have checked out the latest version of Olio from the online svn
>>>>>>>>> repository and downloaded the last version of faban (faban-kit-101509.tar.gz
>>>>>>>>> <http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>>
>>>>>>>>
>>>>>>>> 101509 is fairly recent. But the latest on the web site is 111109
>>>>>>>> (Faban 1.0). There were just bug fixes between those releases.
>>>>>>>
>>>>>>>
>>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the
>>>>>>> release of 2.0 was announced, will switch to it if I run into bugs that have
>>>>>>> been fixed)
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> So far, I employed a bunch of hacks to get most of it to work and I
>>>>>>>>> am almost there. In the process I got a bunch of questions.
>>>>>>>>>
>>>>>>>>> Questions (some of them might be just faban related, not olio so
>>>>>>>>> bear with me):
>>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the command
>>>>>>>>> line? Firefox through ssh forwarding is dead slow and I'd rather avoid it if I
>>>>>>>>> can.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy itself.
>>>>>>>> This is documented at
>>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html under "Alternate Deployment Methods."
>>>>>>>>
>>>>>>>>
>>>>>>>>  2) Should the services ApacheHttpdService, MemcachedService, MySQLService
>>>>>>>>> that come with Faban be deployed before running Olio?
>>>>>>>>>    I was getting some very weird errors. e.g.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>>
>>>>>>>> Done
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating benchmark
>>>>>>>>> run
>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully
>>>>>>>>> terminated.
>>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>>    at com.sun.faban.driver.transport.util.TimedInputStream.read
>>>>>>>>> (139)
>>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine (78)
>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine (1116)
>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine
>>>>>>>>> (1973)
>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readResponse
>>>>>>>>> (1735)
>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute (1098)
>>>>>>>>>    at
>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>>    at
>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod (171)
>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod (397)
>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod (323)
>>>>>>>>>    at
>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (529)
>>>>>>>>>    at
>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (552)
>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage (355)
>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>
>>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>>> to kill the benchmark.
>>>>>>>>>
>>>>>>>>
>>>>>>>> These threads are hanging reading the server responses, that never
>>>>>>>> came.
>>>>>>>>
>>>>>>>>
>>>>>>> Building the services from Faban probably fixes it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> In the Olio log there are WARNINGS  complaining about not deploying
>>>>>>>>> those. After building those and manually copying them to /faban/services
>>>>>>>>> (ant deploy did not place them there... :-(  )
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. But ant deploy should get them there. If not, can you please
>>>>>>>> let me know the ant messages?
>>>>>>>
>>>>>>>
>>>>>>> Ant was deploying them indeed. I had a mistake in
>>>>>>> building.properties.
>>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of  faban.url=
>>>>>>> http://localhost:9980/
>>>>>>> After I changed that it started working...
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>  it worked. (mostly worked)
>>>>>>>>>
>>>>>>>>> 3) I still have warnings like:
>>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms.
>>>>>>>>> Attempting to set clock.
>>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms. Attempting
>>>>>>>>> to set clock.
>>>>>>>>>
>>>>>>>>
>>>>>>>> These two are OK. Just trying to do a clock sync between the
>>>>>>>> systems.
>>>>>>>>
>>>>>>>>
>>>>>>>>  01:38:08:WARNING:olio-web wakeup-before time reached 700ms limit.
>>>>>>>>> System is too busy. Giving up.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is one of Faban's clock-setting calibrations. If the system is
>>>>>>>> too busy or you run on some virtualization architectures, the lag time
>>>>>>>> between an intended end of sleep and the actual time when the thread really
>>>>>>>> wakes up (gets scheduled/executed) is too high, calibrations will fail.
>>>>>>>>
>>>>>>>>
>>>>>>>>  01:38:08:INFO:Time difference to host olio-mem is 262 ms.
>>>>>>>>> Attempting to set clock.
>>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms limit.
>>>>>>>>> System is too busy. Giving up.
>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>> stderr:
>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>> stderr:
>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]" command
>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>> stderr:
>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>> stderr:
>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>
>>>>>>>>> Letting Faban change the VM clock sounds like a bad idea from the
>>>>>>>>> start.
>>>>>>>>>
>>>>>>>>
>>>>>>>> OK. So it is Xen. Yes, this is what Faban is trying to solve. You
>>>>>>>> can certainly turn it off. Please see:
>>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>>
>>>>>>>>
>>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file (btw,
>>>>>>> the page linked above has a mistake: it shows
>>>>>>> <fh:timeSync>false<fh:timeSync>; the second tag needs to be a closing
>>>>>>> tag, the "/" is missing).
>>>>>>> That made the warnings go away.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>  Unfortunately, Xen is really bad at maintaining an accurate clock.
>>>>>>>>> As a result there is usually a time difference of more than 10ms between
>>>>>>>>> the different virtual machines.
>>>>>>>>> I went over the setTime function in the Faban source
>>>>>>>>> (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java); it's big and ugly
>>>>>>>>> (very ugly)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the compliments! I think you mean
>>>>>>>> CmdService.setClockTask. Time-sensitive code ain't pretty. The complexity
>>>>>>>> comes from dealing with the clock and trying to achieve good accuracy. If
>>>>>>>> you think you can simplify this, I'm listening (without losing the
>>>>>>>> accuracy, of course). In comparison, CmdAgentImpl has nothing.
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, you're right, it is CmdService.setClockTask. The previous email
>>>>>>> was composed at 3am ... :-)
>>>>>>> I am still a little confused. The setClockTask is used to set the
>>>>>>> clock so that all the machines are synchronized with the master. From what you
>>>>>>> mentioned, the physical clock sync is only used for the logs.
>>>>>>> Why do we need to do that, since 1) it requires root privileges (which
>>>>>>> might not always be available) and 2) I could imagine an alternative that uses
>>>>>>> deltas from the actual physical clock without having to set it?
>>>>>>> (I am probably missing something... :-)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>  Why is there this strict requirement for a 10ms difference? Any
>>>>>>>>> ideas?
>>>>>>>>>
>>>>>>>>
>>>>>>>> It is easily achievable in most cases. May not be true for VMs.
>>>>>>>>
>>>>>>>> On some VM architectures, the OS however does not get scheduled till
>>>>>>>> way after that, thus causing problems. You may be able to measure
>>>>>>>> performance on those VMs. But you don't want to use such VMs to be a driver.
>>>>>>>> Your response time measurements will be way off.
>>>>>>>>
>>>>>>>> The physical clock sync is not really rigorous. And you can turn it
>>>>>>>> off. It is more to keep the systems in good time sync. If your VM stands in
>>>>>>>> the way, just turn it off. The driver's virtual clock sync is much more
>>>>>>>> picky in comparison. This is because the start time for the steady state
>>>>>>>> should be the same (with a very small tolerance) no matter how many drivers
>>>>>>>> are driving. Otherwise the measurement period won't be the same when viewed
>>>>>>>> from different drivers and the results won't be reliable.
>>>>>>>>
>>>>>>>>
>>>>>>>>  Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>>
>>>>>>>>
>>>>>>>> That's why we don't use ntp ;-)
>>>>>>>
>>>>>>>
>>>>>>> Just out of curiosity, the physical clocks are set only once at the
>>>>>>> beginning (right?), so for long runs the 10ms difference will not be
>>>>>>> guaranteed, will it? Especially under VMs I've seen significant clock
>>>>>>> differences within a few minutes.
>>>>>>> At least ntp can periodically resync (of course, doing so might screw
>>>>>>> up the logs with time going backwards etc.)
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  I am thinking of modifying this function to always return that the
>>>>>>>>> time difference is less than 10ms (so that I do not have to wait all the
>>>>>>>>> time for the timeouts.)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Why bother. Don't like it, just turn it off. It has good use in most
>>>>>>>> configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>>
>>>>>>>>
>>>>>>>>  Will this break anything in Olio?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Nope. Except the times in your logs will appear out of sequence.
>>>>>>>> They rely on the local time on the originating systems.
>>>>>>>>
>>>>>>>>
>>>>>>>>> 4) Warnings like:
>>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg
>>>>>>>>> size of 249 bytes is too small. Image may not exist
>>>>>>>>> can be ignored, right?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Well, something is wrong. We don't have images that small. Check
>>>>>>>> whether e168t.jpg is really that small. That's why we have that warning.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> It's kind of funny: my problem was that I had the Olio webkit version
>>>>>>> installed and then I downloaded the version from the online svn repository.
>>>>>>> I built the driver but forgot to update the web application on my apache
>>>>>>> server, which, as expected, was the source of many of my issues.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> 5) Last and most important.
>>>>>>>>> I can run the benchmark and all the operations succeed except
>>>>>>>>> login.
>>>>>>>>> I get a bunch of:
>>>>>>>>>
>>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt at
>>>>>>>>> index 2926, Login as at786o08x, 2178 failed.
>>>>>>>>> Note: Error not counted in result.
>>>>>>>>> Either transaction start or end time is not within steady state.
>>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926, Login
>>>>>>>>> as at786o08x, 2178 failed.
>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>
>>>>>>>>> Any ideas? I do get
>>>>>>>>>
>>>>>>>>
>>>>>>>> You likely have cookie issues. It can't seem to hold on to a
>>>>>>>> session.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Well, there was a permission issue with the http_session dir. I could
>>>>>>> not write to it. chmod 777 on it fixed this.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> (I've found online:
>>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html which is similar, but when I added
>>>>>>>>>
>>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>>  in build.properties
>>>>>>>>> I did not see any cookie-related warnings. Those should appear in
>>>>>>>>> the Olio run log or the apache log, right? Am I just looking in the wrong
>>>>>>>>> place?)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, that's applicable only to the Sun Http Transport. The version
>>>>>>>> of Olio you're using is based on the Apache Http Transport (Apache
>>>>>>>> HttpClient 3.1). The ThreadCookieHandler is not used for the Apache
>>>>>>>> transport and that's why you don't see any logs. Try upgrading to Faban 1.0
>>>>>>>> before looking at other things.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's a long email I know. Your feedback would be most appreciated.
>>>>>>>>>
>>>>>>>>> -Regards
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>> Kontorinis Vasileios
>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>> San Diego, CA 92122
>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>> bkontorinis@gmail.com <ma...@gmail.com>,
>>>>>>>>> vkontori@ucsd.edu <ma...@ucsd.edu>
>>>>>>>>>
>>>>>>>>> -------------------------------------------------------------------\
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for all the questions/comments.
>>>>>>>>
>>>>>>>> -Akara
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> And now some more questions/ comments:
>>>>>>> 1) I get the following error:
>>>>>>>
>>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>>> /usr/data/olio-db.err
>>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does not
>>>>>>> exist.
>>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0
>>>>>>> (790)
>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run (649)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask (885)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer
>>>>>>> (255)
>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get (null)
>>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs (200)
>>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs (642)
>>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start (323)
>>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>>     at java.lang.Thread.run (619)
>>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>>
>>>>>>> Apparently something is misconfigured in my db-server. Any ideas?
>>>>>>>
>>>>>>> 2) I get the following error:
>>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi, process,
>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>>> stderr:
>>>>>>> Error in executing perl
>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>> Error in executing perl
>>>>>>> /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>
>>>>>>> Actually, I traced this one back. The problem is the difference in
>>>>>>> output format between Sun's mpstat and the default GNU mpstat.
>>>>>>> This is the output of my mpstat:
>>>>>>>
>>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$ mpstat 1
>>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>>
>>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>>
>>>>>>> The first line, as well as the time at the beginning of each entry,
>>>>>>> messes up the parsing in mpstat.pl (the fields are different too).
>>>>>>>   Any plans to support this?
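
To illustrate the format difference, here is a minimal Python sketch (it is
not the Perl that FenXi's mpstat.pl uses) that reads the sysstat-style output
shown above. It skips the "Linux ..." banner, takes the column names from the
header row, and drops the leading timestamp from each sample. The function
name and the returned structure are assumptions made for illustration.

```python
def parse_sysstat_mpstat(text):
    """Parse `mpstat <interval>` output from Linux sysstat into a list of dicts."""
    samples = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Linux <kernel> (<host>) <date>" banner.
        if not fields or fields[0] == "Linux":
            continue
        # Header row: a timestamp, then "CPU", then the metric names.
        if len(fields) > 1 and fields[1] == "CPU":
            header = fields[1:]
            continue
        if header is None or len(fields) != len(header) + 1:
            continue  # not a data row in the expected shape
        row = dict(zip(header, fields[1:]))  # fields[0] is the timestamp
        cpu = row.pop("CPU")
        samples.append({"time": fields[0], "cpu": cpu,
                        "metrics": {k: float(v) for k, v in row.items()}})
    return samples


sample = """Linux 2.6.18.8-xen (olio-client00)     01/16/10

16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
"""

for s in parse_sysstat_mpstat(sample):
    print(s["time"], s["cpu"], s["metrics"]["%idle"], s["metrics"]["intr/s"])
```

The sketch assumes 24-hour timestamps as in the output above. Locales that
print an extra AM/PM field would add a column and need separate handling.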
>>>>>>>
>>>>>>> 3) Scaling questions.
>>>>>>> - So far I did not have a single experiment passing. Some are pretty
>>>>>>> close with only one metric check failing.
>>>>>>>
>>>>>>> Average images loaded per Home Page 2.79   >=3       FAILED
>>>>>>> Any ideas? Is it the case that the disc is not fast enough? I am just
>>>>>>> using the local filesystem for the filestore.
>>>>>>>
>>>>>>> - As I double the number of concurrent users I observe linear scaling
>>>>>>> in the throughput.
>>>>>>> Con Users         Throughput
>>>>>>>  25                        4.967
>>>>>>>  50                       10.06
>>>>>>> 100                      19.375
>>>>>>> 200                      40.21
>>>>>>> 400                      75.818
>>>>>>> 800                       0.383
>>>>>>> 1000                     0.483
>>>>>>>
>>>>>>> The linear scaling stops for 400 concurrent users ( only one agent).
>>>>>>> Actually it would be exactly linear (value of ~80) but almost half of the
>>>>>>> login operations failed. I am looking into it.
>>>>>>> Any insights on what might be the first thing failing?
>>>>>>>
>>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>>> logged. It looks like those are being discarded... (?)
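
The break in scaling is easier to see as throughput per concurrent user. The
sketch below uses the figures from the table above. The 20% tolerance and the
function name are arbitrary assumptions chosen for illustration.

```python
# (concurrent users, throughput) pairs copied from the table above.
RUNS = [(25, 4.967), (50, 10.06), (100, 19.375), (200, 40.21),
        (400, 75.818), (800, 0.383), (1000, 0.483)]


def scaling_breaks(runs, tolerance=0.20):
    """Return the user counts whose per-user throughput falls more than
    `tolerance` below the per-user throughput of the first (smallest) run."""
    base_users, base_tput = runs[0]
    baseline = base_tput / base_users
    return [users for users, tput in runs
            if tput / users < baseline * (1.0 - tolerance)]


for users, tput in RUNS:
    print("%5d users: %.4f throughput per user" % (users, tput / users))
print("scaling breaks at:", scaling_breaks(RUNS))
```

With these numbers the 400-user run stays within the tolerance (it is only a
few percent below the baseline), while the 800- and 1000-user runs collapse
by more than two orders of magnitude, which points to a hard failure rather
than gradual saturation.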
>>>>>>>
>>>>>>> Bonus question:
>>>>>>> In the runtime statistics
>>>>>>> <runtimeStats enabled="true">
>>>>>>>          <interval>30</interval>
>>>>>>>  </runtimeStats>
>>>>>>>
>>>>>>> only the 90% response time is reported. Is there an easy way to also
>>>>>>> report the 99%? (Or do I need to add code for that?)
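
This does not answer whether Faban has a switch for it, but if raw
response-time samples are available, a 99th percentile can be computed
offline. The following is a generic nearest-rank sketch, not Faban code, and
the sample values are made up purely for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100.0)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in seconds, for illustration only.
response_times = [0.12, 0.15, 0.11, 0.90, 0.14, 0.13, 0.16, 0.18, 0.17, 2.50]

print("90th percentile:", percentile(response_times, 90))
print("99th percentile:", percentile(response_times, 99))
```

With few samples the 99th percentile is simply the maximum, so it only
becomes informative once each interval contains well over a hundred
operations.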
>>>>>>>
>>>>>>>
>>>>>>> Thanks a lot again in advance.
>>>>>>> -VK
>>>>>>>