Posted to mapreduce-issues@hadoop.apache.org by "mingleizhang (JIRA)" <ji...@apache.org> on 2016/07/07 01:26:11 UTC

[jira] [Commented] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365456#comment-15365456 ] 

mingleizhang commented on MAPREDUCE-6729:
-----------------------------------------

Does anyone want to take this JIRA? If not, I will start working on it soon.

> Hitting performance and error when lots of files to write or read
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: benchmarks, performance, test
>            Reporter: mingleizhang
>            Priority: Minor
>              Labels: performance, test
>
> When TestDFSIO is used as a distributed I/O benchmark and a large number of files are written to or read from disk, the reported results can be imprecise. The problem is that the current implementation deletes the files left over from earlier runs before it starts the job, and this deletion happens inside the timed section. The deletion adds extra time, so when there are many files it degrades performance and leads to inaccurate timing statistics and imprecise throughput. We should replace or improve this approach so that cleanup no longer affects the measurements.
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   // writeTest() calls fs.delete(...) before it runs the job,
>   // so the cleanup time is included in execTime.
>   bench.writeTest(fs);
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
>
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   // Recursively remove the output of previous runs.
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code} 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
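
One possible direction, sketched here under stated assumptions rather than as the actual TestDFSIO change: run the cleanup before the timer starts, so that only the I/O workload is measured. To keep the sketch self-contained it uses the local file system through java.nio.file instead of the Hadoop FileSystem API. The class TimedWriteSketch and its methods deleteRecursively, writeFiles and timedWrite are hypothetical names chosen for illustration and do not exist in Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical illustration only: not part of Hadoop or TestDFSIO.
public class TimedWriteSketch {

    /** Recursively deletes dir if it exists (local stand-in for fs.delete(path, true)). */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Writes fileCount small files into dir (local stand-in for runIOTest). */
    static void writeFiles(Path dir, int fileCount) throws IOException {
        Files.createDirectories(dir);
        for (int i = 0; i < fileCount; i++) {
            Files.write(dir.resolve("file_" + i), new byte[1024]);
        }
    }

    /** Cleans up first, then times only the write phase; returns elapsed milliseconds. */
    static long timedWrite(Path dir, int fileCount) throws IOException {
        deleteRecursively(dir);              // cleanup happens before the timer starts
        long tStart = System.nanoTime();
        writeFiles(dir, fileCount);          // only the workload is timed
        return (System.nanoTime() - tStart) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dfsio-sketch").resolve("io_write");
        writeFiles(dir, 200);                // simulate leftovers from a previous run
        long execTime = timedWrite(dir, 50);
        System.out.println("write phase took " + execTime + " ms");
        deleteRecursively(dir.getParent());  // remove the temporary directory
    }
}
```

In TestDFSIO itself the equivalent change would presumably mean moving the fs.delete(...) calls out of the section bracketed by tStart and execTime. Whether that is the right fix for this issue is still open.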


