Posted to dev@geronimo.apache.org by Kevan Miller <ke...@gmail.com> on 2008/11/20 18:13:36 UTC

Monitoring TCK machines

This is definitely better than what we have now...

Anybody else know of any xen monitoring tools?

--kevan

On Oct 17, 2008, at 6:58 PM, Jay D. McHugh wrote:
> Hey Kevan,
>
> Regarding monitoring...
>
> I managed to run into xenmon.py.
>
> It appears to log the system utilization for the whole box as well as
> each VM to log files in 'your' home directory if you specify the '-n'
> flag.
>
> Here is the help page for xenmon.py:
> jaydm@phoebe:~$ sudo python /usr/sbin/xenmon.py -h
> Usage: xenmon.py [options]
>
> Options:
>  -h, --help            show this help message and exit
>  -l, --live            show the ncurses live monitoring frontend (default)
>  -n, --notlive         write to file instead of live monitoring
>  -p PREFIX, --prefix=PREFIX
>                        prefix to use for output files
>  -t DURATION, --time=DURATION
>                        stop logging to file after this much time has elapsed
>                        (in seconds). set to 0 to keep logging indefinitely
>  -i INTERVAL, --interval=INTERVAL
>                        interval for logging (in ms)
>  --ms_per_sample=MSPERSAMPLE
>                        determines how many ms worth of data goes in a sample
>  --cpu=CPU             specifies which cpu to display data for
>  --allocated           Display allocated time for each domain
>  --noallocated         Don't display allocated time for each domain
>  --blocked             Display blocked time for each domain
>  --noblocked           Don't display blocked time for each domain
>  --waited              Display waiting time for each domain
>  --nowaited            Don't display waiting time for each domain
>  --excount             Display execution count for each domain
>  --noexcount           Don't display execution count for each domain
>  --iocount             Display I/O count for each domain
>  --noiocount           Don't display I/O count for each domain
>
> And here is some sample output:
>
> jaydm@phoebe:~$ cat log-dom0.log
> # passed cpu dom cpu(tot) cpu(%) cpu/ex allocated/ex blocked(tot) blocked(%) blocked/io waited(tot) waited(%) waited/ex ex/s io(tot) io/ex
> 0.000 0 0 2.086 0.000 38863.798 30000000.000 154.177 0.000 0.000 0.504 0.000 9383.278 0.000 0.000 0.000
> 2.750 1 0 2.512 0.000 53804.925 30000000.000 153.217 0.000 0.000 0.316 0.000 6774.813 0.000 0.000 0.000
> 4.063 2 0 2.625 0.000 59959.942 30000000.000 153.886 0.000 0.000 0.173 0.000 3939.987 0.000 0.000 0.000
> 5.203 3 0 3.020 0.000 47522.430 30000000.000 171.834 0.000 0.000 0.701 0.000 11031.759 0.000 0.000 0.000
> 6.403 4 0 2.130 0.000 39256.871 30000000.000 171.870 0.000 0.000 0.617 0.000 11378.014 0.000 0.000 0.000
> 9.230 6 0 0.836 0.000 53962.875 30000000.000 57.287 0.000 0.000 0.038 0.000 2450.488 0.000 0.000 0.000
> 10.305 7 0 2.171 0.000 46119.247 30000000.000 154.008 0.000 0.000 0.367 0.000 7804.444 0.000 0.000 0.000
> 11.518 0 0 15931680.822 1.593 54019.023 30000000.000 889706824.191 88.971 0.000 2630292.436 0.263 8918.446 294.927 0.000 0.000
> 1009.216 1 0 7687035.544 0.769 53822.548 30000000.000 473101345.004 47.310 0.000 864964.568 0.086 6056.248 142.822 0.000 0.000
> 1010.199 2 0 20502235.224 2.050 61655.293 30000000.000 979188763.754 97.919 0.000 4279443600.516 427.944 12869345.608 332.530 0.000 0.000
> 1011.239 3 0 13634865.766 1.363 45934.870 30000000.000 985479796.363 98.548 0.000 1593248.596 0.159 5367.538 296.830 0.000 0.000
> 1012.312 4 0 18228049.181 1.823 61242.790 30000000.000 979822521.396 97.982 0.000 2593364.560 0.259 8713.213 297.636 0.000 0.000
> 1013.338 5 0 9891757.872 0.989 65386.046 30000000.000 571275802.794 57.128 0.000 357431.539 0.036 2362.678 151.282 0.000 0.000
>
> We could probably add a cron job to grab a single sample every X minutes
> and append them together to build up a utilization history (rather than
> simply running it all of the time).
>
> I just tried to get a single sample and the smallest run I could get was
> about three seconds with four samples taken.
>
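
One way to turn these log files into a utilization history is to read each
row against the '#' header line. A minimal Python sketch, assuming the
whitespace-separated layout shown in the sample above; parse_xenmon_log and
mean_of are made-up names for illustration, not part of xenmon.py:

```python
def parse_xenmon_log(lines):
    """Parse xenmon.py log lines into dicts keyed by the header's column names.

    Assumes the layout in the sample above: one '#' header line followed by
    whitespace-separated numeric rows. Every value is stored as a float,
    including the cpu and dom ids.
    """
    columns = None
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: drop the leading '#' and keep the column names.
            columns = line.lstrip("#").split()
            continue
        if columns is None:
            raise ValueError("data row seen before the '#' header line")
        values = line.split()
        if len(values) != len(columns):
            continue  # skip truncated or partially written rows
        rows.append(dict(zip(columns, (float(v) for v in values))))
    return rows


def mean_of(rows, column):
    """Average one column over all parsed rows (0.0 if there are none)."""
    values = [row[column] for row in rows]
    return sum(values) / len(values) if values else 0.0
```

For example, mean_of(parse_xenmon_log(open("log-dom0.log")), "cpu(%)") would
average the cpu(%) column over every sample in the file.
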
> Or, I also tried xentop in batch mode:
>
> jaydm@phoebe:~$ sudo xentop -b -i 1
>       NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR SSID
>   Domain-0 -----r     430567    0.0    3939328   23.5   no limit       n/a     8    4        0        0    0        0        0        0 2149631536
>      tck01 --b---     750449    0.0    3145728   18.8    3145728      18.8     2    1   483054  1855493    1       15   655667  8445829 2149631536
>      tck02 --b---    1101273    0.0    3145728   18.8    3145728      18.8     2    1   367792  1773407    1       83  1131709  9030663 2149631536
>      tck03 -----r     144552    0.0    3145728   18.8    3145728      18.8     2    1   188115  2370069    1        6   370431  1290683 2149631536
>      tck04 --b---     103742    0.0    3145728   18.8    3145728      18.8     2    1   286936  2341941    1        7   381523  1484476 2149631536
>
> It looks to me like having a cron job that periodically runs xentop and
> builds up a history would be the best option (without digging through
> a ton of different specialized monitor packages).
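
The cron-plus-xentop idea could look roughly like the Python sketch below. It
runs the same "xentop -b -i 1" command as above, parses the table, and appends
one timestamped CSV row per domain. The function names, the script name and
the history file path are assumptions for illustration only. Since CPU(%)
reads 0.0 for every domain in the single-iteration sample above, the sketch
keeps the cumulative CPU(sec) column instead; utilization can then be derived
by differencing consecutive samples.

```python
import csv
import os
import subprocess
import sys
import time

# Columns kept in the history file; names match the xentop header above.
KEEP = ["NAME", "STATE", "CPU(sec)", "MEM(k)", "NETTX(k)", "NETRX(k)"]


def parse_xentop(text):
    """Turn `xentop -b` output into a list of dicts, one per domain."""
    rows, columns = [], None
    for line in text.splitlines():
        # "no limit" is the only value containing a space; join it first.
        fields = line.replace("no limit", "no-limit").split()
        if not fields:
            continue
        if fields[0] == "NAME":
            columns = fields  # header row
        elif columns and len(fields) == len(columns):
            rows.append(dict(zip(columns, fields)))
    return rows


def append_history(rows, path, timestamp):
    """Append one timestamped CSV line per domain to the history file."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        if new_file:
            writer.writerow(["timestamp"] + KEEP)
        for row in rows:
            writer.writerow([timestamp] + [row[key] for key in KEEP])


def sample_once(path):
    """Run xentop for a single iteration and record the result."""
    # Assumes the calling user may run xentop (e.g. from root's crontab).
    result = subprocess.run(["xentop", "-b", "-i", "1"], check=True,
                            capture_output=True, text=True)
    append_history(parse_xentop(result.stdout), path, int(time.time()))


if __name__ == "__main__" and len(sys.argv) > 1:
    sample_once(sys.argv[1])  # history file path given on the command line
```

Saved as, say, xentop_history.py (a hypothetical name), a crontab entry such
as "*/5 * * * * python3 /path/to/xentop_history.py /path/to/history.csv"
would take a sample every five minutes.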

Re: Monitoring TCK machines

Posted by Kevan Miller <ke...@gmail.com>.
On Nov 20, 2008, at 12:32 PM, Jason Dillon wrote:

> Just install snmpd on each domain and then use cacti
> (http://www.cacti.net/) for all your monitoring needs.

Done, but looks like I may need a configuration assist. May ping you...

--kevan

Re: Monitoring TCK machines

Posted by Jason Dillon <ja...@gmail.com>.
Just install snmpd on each domain and then use cacti
(http://www.cacti.net/) for all your monitoring needs.

--jason


On Nov 21, 2008, at 12:13 AM, Kevan Miller wrote:

> This is definitely better than what we have now...
>
> Anybody else know of any xen monitoring tools?
>
> --kevan