Posted to mapreduce-issues@hadoop.apache.org by "Hong Tang (JIRA)" <ji...@apache.org> on 2010/08/20 11:29:17 UTC
[jira] Commented: (MAPREDUCE-2023) TestDFSIO read test may not read specified bytes.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900652#action_12900652 ]
Hong Tang commented on MAPREDUCE-2023:
--------------------------------------
The problem is caused by the following code segment:
{code}
public static class ReadMapper extends IOStatMapper<Long> {

  public ReadMapper() {
  }

  public Long doIO(Reporter reporter,
                   String name,
                   long totalSize // in bytes
                   ) throws IOException {
    // open file
    DataInputStream in = fs.open(new Path(getDataDir(getConf()), name));
    long actualSize = 0;
    try {
      for(int curSize = bufferSize;
          curSize == bufferSize && actualSize < totalSize;) { // <-- HERE
        curSize = in.read(buffer, 0, bufferSize);
        if(curSize < 0) break;
        actualSize += curSize;
        reporter.setStatus("reading " + name + "@" +
                           actualSize + "/" + totalSize
                           + " ::host = " + hostName);
      }
    } finally {
      in.close();
    }
    return Long.valueOf(actualSize);
  }
}
{code}
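The problem is that the for-loop exits as soon as a read returns fewer bytes than the buffer size, even though read() may legitimately return a short count before reaching the end of the file. The fix is simple: drop the curSize == bufferSize test from the loop condition and let the existing curSize < 0 check handle end of file: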
{code}
for(int curSize = bufferSize; actualSize < totalSize;) {
{code}
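To illustrate the difference, here is a minimal standalone sketch, not the TestDFSIO code itself. It assumes a hypothetical ShortReadStream wrapper that caps each read at a fixed chunk size to simulate short reads, and the class and method names (ReadLoopSketch, readOriginal, readFixed) are made up for this example. It compares the original loop condition with the proposed one:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoopSketch {

  /** Hypothetical wrapper: returns at most maxChunk bytes per read call. */
  static class ShortReadStream extends InputStream {
    private final InputStream in;
    private final int maxChunk;

    ShortReadStream(InputStream in, int maxChunk) {
      this.in = in;
      this.maxChunk = maxChunk;
    }

    @Override
    public int read() throws IOException {
      return in.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      // Deliberately hand back fewer bytes than requested.
      return in.read(b, off, Math.min(len, maxChunk));
    }
  }

  /** Original loop condition: stops after the first short read. */
  static long readOriginal(InputStream in, byte[] buffer, long totalSize)
      throws IOException {
    int bufferSize = buffer.length;
    long actualSize = 0;
    for (int curSize = bufferSize;
         curSize == bufferSize && actualSize < totalSize;) {
      curSize = in.read(buffer, 0, bufferSize);
      if (curSize < 0) break;          // end of stream
      actualSize += curSize;
    }
    return actualSize;
  }

  /** Proposed loop condition: keeps reading until totalSize or end of stream. */
  static long readFixed(InputStream in, byte[] buffer, long totalSize)
      throws IOException {
    long actualSize = 0;
    while (actualSize < totalSize) {
      int curSize = in.read(buffer, 0, buffer.length);
      if (curSize < 0) break;          // end of stream
      actualSize += curSize;
    }
    return actualSize;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[10_000];
    byte[] buffer = new byte[1_000];
    long a = readOriginal(
        new ShortReadStream(new ByteArrayInputStream(data), 300), buffer, data.length);
    long b = readFixed(
        new ShortReadStream(new ByteArrayInputStream(data), 300), buffer, data.length);
    System.out.println("original loop read " + a + " bytes"); // 300
    System.out.println("fixed loop read " + b + " bytes");    // 10000
  }
}
```

With a 1,000-byte buffer and reads capped at 300 bytes, the original condition gives up after the first read, while the fixed condition consumes all 10,000 bytes.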
> TestDFSIO read test may not read specified bytes.
> -------------------------------------------------
>
> Key: MAPREDUCE-2023
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2023
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: benchmarks
> Reporter: Hong Tang
>
> TestDFSIO's read test may read fewer bytes than specified when reading large files.