Posted to common-user@hadoop.apache.org by tienduc_dinh <ti...@yahoo.com> on 2009/01/06 16:04:57 UTC

TestDFSIO delivers bad values of "throughput" and "average IO rate"

Hello,

I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master and 4
slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because
the values of "throughput" and "average IO rate" are similar, I'm only posting
the "throughput" values from running the same command 3 times:

- > hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048
-nrFiles 1

+ with "dfs.replication = 1" => 33,60 / 31,48 / 30,95

+ with "dfs.replication = 2" => 26,40 / 20,99 / 21,70

I found something strange while reading the source code.

- The value of mapred.reduce.tasks is always set to 1:

job.setNumReduceTasks(1) in the function runIOTest(), and reduceFile = new
Path(WRITE_DIR, "part-00000") in analyzeResult().

So I think, if we instead had mapred.reduce.tasks = 2, we would have 2 paths
on the file system, "part-00000" and "part-00001", e.g.
/benchmarks/TestDFSIO/io_write/part-00000

- And I don't understand the line with "double med = rate / 1000 / tasks".
Shouldn't it be "double med = rate * tasks / 1000"?
-- 
View this message in context: http://www.nabble.com/TestDFSIO-delivers-bad-values-of-%22throughput%22-and-%22average-IO-rate%22-tp21312088p21312088.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by Bharath Mundlapudi <bh...@yahoo.com>.
What is the fileSize you are specifying for TestDFSIO?

I think if you haven't specified this value, the default is 1 MB. This might cause a client-side bottleneck.


-Bharath



________________________________
From: sendy.seftiandy <sn...@yahoo.com>
To: core-user@hadoop.apache.org
Sent: Saturday, May 21, 2011 11:35 PM
Subject: Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"


After reading this thread, I'm still confused. In my cluster, the throughput
with 1 master + 3 slaves, nrFiles=3 and total MBytes processed = 1000 is
equal to the throughput with 1 master + 6 slaves, nrFiles=6 and total MBytes
processed = 1000.

Can I multiply the throughput from the first try by the number of slaves (3)
and compare that with the throughput from the second try multiplied by its
number of slaves?

I really need the answer, thank you very much

Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
In TestDFSIO we want each task to create only one file.
It is a one-to-one mapping from files to map tasks.
And splits are defined so that each map gets only
one file name, which it creates or reads.

--Konstantin

tienduc_dinh wrote:
> I don't understand why the parameter -nrFiles of TestDFSIO should override
> mapred.map.tasks.
> nrFiles is the number of files that will be created, and
> mapred.map.tasks is the number of splits that will be made from the input
> file.
> 
> Thanks
> 

Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by tienduc_dinh <ti...@yahoo.com>.
I don't understand why the parameter -nrFiles of TestDFSIO should override
mapred.map.tasks.
nrFiles is the number of files that will be created, and
mapred.map.tasks is the number of splits that will be made from the input
file.

Thanks




Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by tienduc_dinh <ti...@yahoo.com>.
hi Konstantin,

I think I got it; I forgot one thing in your last post:

time = time(0) + ... + time(N-1).

So it must be the throughput per client, and I'm happy now that Hadoop
scales very well on my cluster.

Thank you so much, and I wish you all the best in the new year 2009 :handshake:

Tien Duc Dinh


Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by tienduc_dinh <ti...@yahoo.com>.
hi Konstantin,

sorry for my mistake, it was not 5012, it was 512.

Of course, it is great if the throughput is MB/sec per client like you
said. In this case we would have circa 120 MB/sec :clap:

But I'm not sure that's really the case. Please follow my example and
calculation of the throughput:

> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 512 -nrFiles 4 

The value of throughput = 30,16

The information in /benchmarks/TestDFSIO/io_write/part-00000 is :

f:rate	121631.625
f:sqrate	3726004.8
l:size	2147483648
l:tasks	4
l:time	67900

To calculate throughput in source code:
throughput = size * 1000.0 / (time * MEGA),

So in this case we have throughput = 2147483648 * 1000 / (67900 * MEGA) =
2048 * 1000 / 67900 = 30,16.

Because the value of "size" is 2048 MB in total and not 512 MB, I'm not
sure about it.
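That arithmetic can be checked directly; a quick Python sketch, with MEGA = 1024 * 1024 as in the throughput formula quoted above:

```python
# Values taken from part-00000 above
size = 2147483648   # l:size -- total bytes written (4 files x 512 MB)
time = 67900        # l:time -- map task times in milliseconds, summed
MEGA = 1024 * 1024  # bytes per MB

# throughput = size * 1000.0 / (time * MEGA), as quoted from the source
throughput = size * 1000.0 / (time * MEGA)
print(round(throughput, 2))  # -> 30.16
```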

Can you give me a hint again, thanks lots.

Tien Duc Dinh



Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.

tienduc_dinh wrote:
> Hi Konstantin,
> 
> thanks so much for your help. I was a little bit confused about why, despite
> my setting mapred.map.tasks = 10 in hadoop-site.xml, hadoop didn't map
> anything. So your answer with
> 
>> In case of TestDFSIO it will be overridden by "-nrFiles".
> 
> is the key.
> 
> I now need your confirmation that I've understood this right.

That is correct.

> + If I want to write 2 GB with 1 map task, I should use the following
> command.
> 
>> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles
>> 1 
> 
> The values of throughput are, e.g. 33,60 / 31,48 / 30,95. 
> 
> + If I want to write 2 GB with 4 map tasks, I should use the following
> command.
> 
>> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 5012 -nrFiles
>> 4

You are writing 20GB not 2GB.
Should be 512 instead of 5012.

> The values of throughput are, e.g. 31,50 / 32,09 / 30,56. 
> 
> Can you please explain why the values in case 2 are not much better? I have
> 1 master and 4 slaves, and if I calculate it right, they should be about 4
> times higher, right?

Throughput is MB/sec per client.
It is great that you get the same numbers for 1 write and 4 parallel writes.
This means that Hadoop on your cluster scales well! :-)
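Under that per-client reading, an aggregate figure would be the reported value times the number of concurrent writers — a hypothetical back-of-the-envelope sketch, not something TestDFSIO itself prints:

```python
# Hypothetical illustration: aggregate cluster write bandwidth, assuming
# the reported "throughput" is MB/sec per client
per_client_mb_per_s = 30.16  # reported throughput for -nrFiles 4
clients = 4                  # one map task (client) per file

aggregate = per_client_mb_per_s * clients
print(aggregate)  # -> 120.64 MB/sec across the cluster
```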

> Sorry for my poor english skill and thanks very much for your help.
> 
> Tien Duc Dinh
> 

Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by tienduc_dinh <ti...@yahoo.com>.
Hi Konstantin,

thanks so much for your help. I was a little bit confused about why, despite
my setting mapred.map.tasks = 10 in hadoop-site.xml, hadoop didn't map
anything. So your answer with

> In case of TestDFSIO it will be overridden by "-nrFiles".

is the key.

I now need your confirmation that I've understood this right.

+ If I want to write 2 GB with 1 map task, I should use the following
command.

> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles
> 1 

The values of throughput are, e.g. 33,60 / 31,48 / 30,95. 

+ If I want to write 2 GB with 4 map tasks, I should use the following
command.

> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 5012 -nrFiles
> 4

The values of throughput are, e.g. 31,50 / 32,09 / 30,56. 

Can you please explain why the values in case 2 are not much better? I have
1 master and 4 slaves, and if I calculate it right, they should be about 4
times higher, right?

Sorry for my poor english skill and thanks very much for your help.

Tien Duc Dinh




Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Hi tienduc_dinh,

Just a bit of a background, which should help to answer your questions.
TestDFSIO mappers perform one operation (read or write) each, measure
the time taken by the operation and output the following three values:
(I am intentionally omitting some other output stuff.)
- size(i)
- time(i)
- rate(i) = size(i) / time(i)
i is the index of the map task 0 <= i < N, and N is the "-nrFiles" value,
which equals the number of maps.

Then the reduce sums those values and writes them into "part-00000".
That is you get three fields in it
size = size(0) + ... + size(N-1)
time = time(0) + ... + time(N-1)
rate = rate(0) + ... + rate(N-1)

Then we calculate
throughput = size / time
averageIORate = rate / N

So answering your questions
- There should be only one reduce task, otherwise you will have to
manually sum corresponding values in "part-00000" and "part-00001".
- The value of the ":rate" after the reduce equals the sum of the individual
rates of each operation. So if you want an average you should
divide it by the number of tasks rather than multiply.

Now, in your case you create only one file "-nrFiles 1", which means
you run only one map task.
Setting "mapred.map.tasks" to 10 in hadoop-site.xml defines the default
number of tasks per job. See here
http://hadoop.apache.org/core/docs/current/hadoop-default.html#mapred.map.tasks
In case of TestDFSIO it will be overridden by "-nrFiles".

Hope this answers your questions.
Thanks,
--Konstantin
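The reduce-and-analyze step described above can be sketched as follows. The per-map sizes and times are made-up illustration values, and the MB/sec conversion (* 1000 / MEGA) follows the throughput formula from analyzeResult() quoted earlier in the thread:

```python
MEGA = 1024 * 1024  # bytes per MB

# Hypothetical per-map measurements: (bytes written, milliseconds taken)
maps = [(536870912, 17000), (536870912, 16500),
        (536870912, 17400), (536870912, 17000)]

# The single reducer sums size(i), time(i), rate(i) into part-00000
size = sum(s for s, t in maps)                        # total bytes
time = sum(t for s, t in maps)                        # total ms
rate = sum(1000.0 * s / (MEGA * t) for s, t in maps)  # sum of MB/sec rates

# Then analyzeResult() computes:
N = len(maps)                               # == the -nrFiles value
throughput = size * 1000.0 / (time * MEGA)  # MB/sec
average_io_rate = rate / N                  # MB/sec
```

With these numbers both metrics come out near 30 MB/sec; they differ only because throughput divides total size by total time while average_io_rate averages the per-map rates.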


