Posted to dev@geronimo.apache.org by Kevan Miller <ke...@gmail.com> on 2008/11/20 18:13:36 UTC
Monitoring TCK machines
This is definitely better than what we have now...
Anybody else know of any Xen monitoring tools?
--kevan
On Oct 17, 2008, at 6:58 PM, Jay D. McHugh wrote:
> Hey Kevan,
>
> Regarding monitoring...
>
> I managed to run into xenmon.py.
>
> It appears to log the system utilization for the whole box, as well as for
> each VM, to log files in 'your' home directory if you specify the '-n' flag.
>
> Here is the help page for xenmon.py:
> jaydm@phoebe:~$ sudo python /usr/sbin/xenmon.py -h
> Usage: xenmon.py [options]
>
> Options:
>   -h, --help            show this help message and exit
>   -l, --live            show the ncurses live monitoring frontend (default)
>   -n, --notlive         write to file instead of live monitoring
>   -p PREFIX, --prefix=PREFIX
>                         prefix to use for output files
>   -t DURATION, --time=DURATION
>                         stop logging to file after this much time has elapsed
>                         (in seconds). set to 0 to keep logging indefinitely
>   -i INTERVAL, --interval=INTERVAL
>                         interval for logging (in ms)
>   --ms_per_sample=MSPERSAMPLE
>                         determines how many ms worth of data goes in a sample
>   --cpu=CPU             specifies which cpu to display data for
>   --allocated           Display allocated time for each domain
>   --noallocated         Don't display allocated time for each domain
>   --blocked             Display blocked time for each domain
>   --noblocked           Don't display blocked time for each domain
>   --waited              Display waiting time for each domain
>   --nowaited            Don't display waiting time for each domain
>   --excount             Display execution count for each domain
>   --noexcount           Don't display execution count for each domain
>   --iocount             Display I/O count for each domain
>   --noiocount           Don't display I/O count for each domain
>
> And here is some sample output:
>
> jaydm@phoebe:~$ cat log-dom0.log
> # passed cpu dom cpu(tot) cpu(%) cpu/ex allocated/ex blocked(tot) blocked(%) blocked/io waited(tot) waited(%) waited/ex ex/s io(tot) io/ex
> 0.000 0 0 2.086 0.000 38863.798 30000000.000 154.177 0.000 0.000 0.504 0.000 9383.278 0.000 0.000 0.000
> 2.750 1 0 2.512 0.000 53804.925 30000000.000 153.217 0.000 0.000 0.316 0.000 6774.813 0.000 0.000 0.000
> 4.063 2 0 2.625 0.000 59959.942 30000000.000 153.886 0.000 0.000 0.173 0.000 3939.987 0.000 0.000 0.000
> 5.203 3 0 3.020 0.000 47522.430 30000000.000 171.834 0.000 0.000 0.701 0.000 11031.759 0.000 0.000 0.000
> 6.403 4 0 2.130 0.000 39256.871 30000000.000 171.870 0.000 0.000 0.617 0.000 11378.014 0.000 0.000 0.000
> 9.230 6 0 0.836 0.000 53962.875 30000000.000 57.287 0.000 0.000 0.038 0.000 2450.488 0.000 0.000 0.000
> 10.305 7 0 2.171 0.000 46119.247 30000000.000 154.008 0.000 0.000 0.367 0.000 7804.444 0.000 0.000 0.000
> 11.518 0 0 15931680.822 1.593 54019.023 30000000.000 889706824.191 88.971 0.000 2630292.436 0.263 8918.446 294.927 0.000 0.000
> 1009.216 1 0 7687035.544 0.769 53822.548 30000000.000 473101345.004 47.310 0.000 864964.568 0.086 6056.248 142.822 0.000 0.000
> 1010.199 2 0 20502235.224 2.050 61655.293 30000000.000 979188763.754 97.919 0.000 4279443600.516 427.944 12869345.608 332.530 0.000 0.000
> 1011.239 3 0 13634865.766 1.363 45934.870 30000000.000 985479796.363 98.548 0.000 1593248.596 0.159 5367.538 296.830 0.000 0.000
> 1012.312 4 0 18228049.181 1.823 61242.790 30000000.000 979822521.396 97.982 0.000 2593364.560 0.259 8713.213 297.636 0.000 0.000
> 1013.338 5 0 9891757.872 0.989 65386.046 30000000.000 571275802.794 57.128 0.000 357431.539 0.036 2362.678 151.282 0.000 0.000
>
> We could probably add a cron job to grab a single sample every X minutes
> and append the samples together to build up a utilization history (rather
> than simply running it all of the time).
>
> I just tried to get a single sample, and the shortest run I could get was
> about three seconds, with four samples taken.
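
If samples were appended into one file as described, they would still need to be read back by column. The following is only a minimal sketch of that step, assuming the real log file keeps each row on a single line (the wrapping seen in this message looks like a mail artifact) and that every data field is numeric. The names parse_xenmon_log, mean_by_cpu and SAMPLE are made up for illustration and are not part of xenmon.py.

```python
def parse_xenmon_log(text):
    """Parse xenmon-style log text into a list of dicts keyed by column name."""
    columns = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line, e.g. "# passed cpu dom cpu(tot) ..."
            columns = line.lstrip("# ").split()
            continue
        values = line.split()
        if columns is None or len(values) != len(columns):
            continue  # no header seen yet, or a truncated row: skip it
        rows.append(dict(zip(columns, map(float, values))))
    return rows


def mean_by_cpu(rows, column):
    """Average one column per physical CPU across all parsed rows."""
    grouped = {}
    for row in rows:
        grouped.setdefault(int(row["cpu"]), []).append(row[column])
    return {cpu: sum(vals) / len(vals) for cpu, vals in grouped.items()}


# Two rows copied from the sample output above (dom0 on cpu 0 and cpu 1).
SAMPLE = """\
# passed cpu dom cpu(tot) cpu(%) cpu/ex allocated/ex blocked(tot) blocked(%) blocked/io waited(tot) waited(%) waited/ex ex/s io(tot) io/ex
11.518 0 0 15931680.822 1.593 54019.023 30000000.000 889706824.191 88.971 0.000 2630292.436 0.263 8918.446 294.927 0.000 0.000
1009.216 1 0 7687035.544 0.769 53822.548 30000000.000 473101345.004 47.310 0.000 864964.568 0.086 6056.248 142.822 0.000 0.000
"""

if __name__ == "__main__":
    for cpu, pct in sorted(mean_by_cpu(parse_xenmon_log(SAMPLE), "cpu(%)").items()):
        print("cpu %d: %.3f%% busy for dom0" % (cpu, pct))
```

Because the header line is re-read whenever it appears, several runs concatenated into one file should parse as long as each run starts with its own "#" line.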
>
> Or, I also tried xentop in batch mode:
>
> jaydm@phoebe:~$ sudo xentop -b -i 1
>       NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR       SSID
>   Domain-0 -----r     430567    0.0    3939328   23.5   no limit       n/a     8    4        0        0    0        0        0        0 2149631536
>      tck01 --b---     750449    0.0    3145728   18.8    3145728      18.8     2    1   483054  1855493    1       15   655667  8445829 2149631536
>      tck02 --b---    1101273    0.0    3145728   18.8    3145728      18.8     2    1   367792  1773407    1       83  1131709  9030663 2149631536
>      tck03 -----r     144552    0.0    3145728   18.8    3145728      18.8     2    1   188115  2370069    1        6   370431  1290683 2149631536
>      tck04 --b---     103742    0.0    3145728   18.8    3145728      18.8     2    1   286936  2341941    1        7   381523  1484476 2149631536
>
> It looks to me like having a cron job that periodically runs xentop and
> builds up a history would be the best option (without digging through
> a ton of different specialized monitor packages).
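
The cron-plus-xentop idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not something taken from the thread: the script is assumed to run as root (xentop needed sudo above), the output file name xentop-history.csv is hypothetical, and the helper names parse_xentop, append_snapshot and main are invented here. The only xentop invocation used is the one shown in the message, `xentop -b -i 1`.

```python
import csv
import shutil
import subprocess
import time


def parse_xentop(output):
    """Turn `xentop -b -i 1` text into a list of rows (lists of strings)."""
    rows = []
    for line in output.splitlines():
        # MAXMEM(k) prints as the two words "no limit" for Domain-0;
        # join them so that every row splits into the same 17 fields.
        fields = line.replace("no limit", "no-limit").split()
        if not fields or fields[0] == "NAME":
            continue  # skip blank lines and the header
        rows.append(fields)
    return rows


def append_snapshot(path, rows, timestamp=None):
    """Append one timestamped CSV line per domain to the history file."""
    stamp = timestamp or time.strftime("%Y-%m-%dT%H:%M:%S")
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow([stamp] + row)


def main(path="xentop-history.csv"):  # hypothetical file name
    """Take one xentop batch sample and add it to the history."""
    result = subprocess.run(
        ["xentop", "-b", "-i", "1"],
        check=True, capture_output=True, text=True,
    )
    append_snapshot(path, parse_xentop(result.stdout))


if __name__ == "__main__":
    if shutil.which("xentop"):
        main()
    else:
        print("xentop not found on PATH; nothing recorded")
```

Two caveats. In the sample above CPU(%) is 0.0 for every domain, possibly because a single iteration has no earlier sample to compare against, so the cumulative CPU(sec) column differenced between snapshots may be the more useful figure. Also, cron's default PATH may not include the directory that holds xentop, so the crontab entry might need to set PATH explicitly.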
Re: Monitoring TCK machines
Posted by Kevan Miller <ke...@gmail.com>.
On Nov 20, 2008, at 12:32 PM, Jason Dillon wrote:
> Just install snmpd on each domain and then use cacti (http://www.cacti.net/)
> for all your monitoring needs.
Done, but looks like I may need a configuration assist. May ping you...
--kevan
Re: Monitoring TCK machines
Posted by Jason Dillon <ja...@gmail.com>.
Just install snmpd on each domain and then use cacti (http://www.cacti.net/)
for all your monitoring needs.
--jason