Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/04/21 19:29:06 UTC
[jira] Resolved: (HADOOP-72) hadoop doesn't take advantage of
distributed computing in TestDFSIO
[ http://issues.apache.org/jira/browse/HADOOP-72?page=all ]
Doug Cutting resolved HADOOP-72:
--------------------------------
Fix Version: 0.2
Resolution: Won't Fix
This was caused by a misunderstanding.
> hadoop doesn't take advantage of distributed computing in TestDFSIO
> -------------------------------------------------------------------
>
> Key: HADOOP-72
> URL: http://issues.apache.org/jira/browse/HADOOP-72
> Project: Hadoop
> Type: Test
> Components: dfs, fs, mapred
> Environment: 200 node cluster
> Reporter: Konstantin Shvachko
> Fix For: 0.2
> Attachments: TestDFSIO.java, TestDFSIO_results.log, TestDFSIO_results_200_node_cluster.log, TestDFSIO_results_sequential.log
>
> TestDFSIO runs N map tasks, each either writing to or reading from a separate file of the same size,
> and collects statistics on their performance.
> The reducer then computes the overall statistics across all maps.
> It outputs the following data:
> - read or write test
> - date and time the test finished
> - number of files
> - total number of bytes processed
> - overall throughput in MB/sec
> - average IO rate in MB/sec per file
> __Results__
> I ran 7 iterations of the test, one after another, on a cluster of ~200 nodes.
> The file size is the same in all cases: 320 MB.
> The numbers of files tried were 1, 2, 4, 8, 16, 32, and 64.
> The log file with statistics is attached.
> It looks like we don't have any distributed computing here at all.
> The total execution time increases proportionally to the total size of data both for writes and reads.
> Another thing is that the IO rate for reads is only marginally higher than the write rate.
> For comparison I attach timings for the same IO operations performed on the same cluster, but sequentially in a simple loop.
> This is the summary:
> Files   map/red time   sequential time
>     1             49                34
>     2             86                69
>     4            158               131
>     8            299               266
>    16            569               532
>    32           1131
>    64           2218
> This doesn't look good, unless there is something wrong with my test (attached) or the cluster settings.
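
The scaling claim in the quoted report can be checked with a little arithmetic on the summary table. The following is a minimal sketch, not part of TestDFSIO: the function and variable names are illustrative, the numbers are copied from the table, and the time unit is left unspecified because the report does not state it. If the files were processed fully in parallel, doubling the file count would leave the elapsed time roughly flat (growth factor near 1) and the time per file would fall quickly; a growth factor near 2 matches the proportional increase the report describes.

```python
# Timings copied from the summary table in the report.
# The report does not state the unit, so none is assumed here.
files = [1, 2, 4, 8, 16, 32, 64]
mapred_time = [49, 86, 158, 299, 569, 1131, 2218]
sequential_time = [34, 69, 131, 266, 532]  # not reported for 32 and 64 files


def time_per_file(times, counts):
    """Elapsed time divided by the number of files, for each run."""
    # zip stops at the shorter list, so missing runs are simply skipped.
    return [t / n for t, n in zip(times, counts)]


def growth_factors(times):
    """Ratio of each run's time to the previous run's time."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


mapred_per_file = time_per_file(mapred_time, files)
sequential_per_file = time_per_file(sequential_time, files)
mapred_growth = growth_factors(mapred_time)

for n, per_file in zip(files, mapred_per_file):
    print(f"{n:>3} files: {per_file:.1f} per file (map/red)")

for n, factor in zip(files[1:], mapred_growth):
    print(f"doubling to {n:>3} files multiplies map/red time by {factor:.2f}")
```

This only restates the measurements in a different form; it says nothing about why the times scale this way, which the resolution above attributes to a misunderstanding.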
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira