Posted to common-user@hadoop.apache.org by Shirley Cohen <sc...@cs.utexas.edu> on 2008/09/30 01:42:36 UTC

dfs i/o stats

Hi,

I would like to measure the disk i/o performance of our hadoop  
cluster. However, running iostat on 16 nodes is rather cumbersome.  
Does dfs keep track of any stats like the number of blocks or bytes  
read and written? From scanning the api, I found a class called  
"org.apache.hadoop.fs.FileSystem.Statistics" that could be relevant.  
Does anyone know if this is what I'm looking for?

Thanks,

Shirley

Re: dfs i/o stats

Posted by Shirley Cohen <sc...@cs.utexas.edu>.
Thanks for the helpful pointers!

Shirley

On Sep 29, 2008, at 8:02 PM, Konstantin Shvachko wrote:

> We use TestDFSIO for measuring IO performance on our clusters.
> It is called a test, but in fact it's a benchmark.
> It runs a map-reduce job, which either writes to or reads from files
> and collects statistics.
>
> Hadoop also collects metrics automatically, such as the number of
> creates, deletes, and listings (ls).
> Here are some links:
> http://wiki.apache.org/hadoop/GangliaMetrics
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/dfs/NameNodeMetrics.html
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/dfs/FSNamesystemMetrics.html
>
> Hope this is helpful.
> --Konstantin


Re: dfs i/o stats

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
We use TestDFSIO for measuring IO performance on our clusters.
It is called a test, but in fact it's a benchmark.
It runs a map-reduce job, which either writes to or reads from files
and collects statistics.
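
The write-then-read pattern described above can be tried on a single machine. The following is a minimal sketch in Python against the local filesystem, not TestDFSIO itself and not a map-reduce job. The function names (write_task, read_task, summarize, run_benchmark) are hypothetical, and the two aggregates (total throughput and mean per-file rate) are one reasonable choice rather than a confirmed copy of what TestDFSIO reports. Reads of files that were just written will usually be served from the OS page cache, so the read numbers say little about disk speed.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def write_task(path, size_bytes, chunk=64 * 1024):
    """Write size_bytes of zeros to path; return (bytes_written, seconds)."""
    buf = b"\0" * chunk
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk, size_bytes - written)
            f.write(buf[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before stopping the clock
    return written, time.perf_counter() - start


def read_task(path, chunk=64 * 1024):
    """Read path to the end; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total, time.perf_counter() - start


def summarize(results):
    """Aggregate a list of per-file (bytes, seconds) pairs."""
    total_bytes = sum(b for b, _ in results)
    total_secs = sum(s for _, s in results)
    rates = [b / MB / s for b, s in results if s > 0]
    return {
        "files": len(results),
        "total_mb": total_bytes / MB,
        # All bytes divided by the summed task time.
        "throughput_mb_s": total_bytes / MB / total_secs if total_secs > 0 else float("inf"),
        # Mean of the individual per-file rates.
        "avg_io_rate_mb_s": sum(rates) / len(rates) if rates else float("inf"),
    }


def run_benchmark(n_files=4, size_bytes=MB):
    """Write n_files files, read them back, and return (write_stats, read_stats)."""
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, "part-%d" % i) for i in range(n_files)]
        writes = [write_task(p, size_bytes) for p in paths]
        reads = [read_task(p) for p in paths]
    return summarize(writes), summarize(reads)


if __name__ == "__main__":
    write_stats, read_stats = run_benchmark()
    print("write:", write_stats)
    print("read: ", read_stats)
```

On a real cluster each task would run on a different node and go through DFS, which is what the benchmark job takes care of; the sketch only shows the bookkeeping.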

Hadoop also collects metrics automatically, such as the number of
creates, deletes, and listings (ls).
Here are some links:
http://wiki.apache.org/hadoop/GangliaMetrics
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/dfs/NameNodeMetrics.html
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/dfs/FSNamesystemMetrics.html

Hope this is helpful.
--Konstantin


Re: dfs i/o stats

Posted by Shirley Cohen <sc...@cs.utexas.edu>.
Great!

Thanks very much,

Shirley

On Sep 29, 2008, at 7:37 PM, Elia Mazzawi wrote:

> You can see those stats for each job in the job tracker web interface:
> http://yourdfsmaster.com:50030/jobtracker.jsp
> Click on the job link to get the stats.
>
> Since it's in the web interface, there is probably a command to get it.


Re: dfs i/o stats

Posted by Elia Mazzawi <el...@casalemedia.com>.
You can see those stats for each job in the job tracker web interface:
http://yourdfsmaster.com:50030/jobtracker.jsp
Click on the job link to get the stats.

Since it's in the web interface, there is probably a command to get it.
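
Per-job figures like these are, in effect, per-task byte counts added up. As a rough illustration of that roll-up, here is a small Python sketch that sums per-task counters into cluster-wide and per-node totals. The counter names ("bytes_read", "bytes_written"), the node names, and the numbers are all made up for the example; they are not Hadoop's actual counter names or real measurements.

```python
from collections import Counter, defaultdict


def aggregate(task_reports):
    """Sum counters from (node, counters) pairs.

    Returns (cluster_totals, per_node_totals) as plain dicts.
    """
    cluster = Counter()
    per_node = defaultdict(Counter)
    for node, counters in task_reports:
        cluster.update(counters)         # Counter.update adds values
        per_node[node].update(counters)
    return dict(cluster), {n: dict(c) for n, c in per_node.items()}


if __name__ == "__main__":
    # Hypothetical task reports: (node name, counters for one task).
    reports = [
        ("node01", {"bytes_read": 100, "bytes_written": 40}),
        ("node02", {"bytes_read": 250}),
        ("node01", {"bytes_read": 50, "bytes_written": 10}),
    ]
    totals, by_node = aggregate(reports)
    print(totals)
    print(by_node)
```

Grouping by node is included because the original question was about avoiding a separate iostat run on each of 16 machines; the same totals give one view per node and one for the whole cluster.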
