Posted to mapreduce-issues@hadoop.apache.org by "BELUGA BEHR (JIRA)" <ji...@apache.org> on 2017/08/03 22:09:00 UTC
[jira] [Commented] (MAPREDUCE-1821) IFile.Reader should check
whether data crc has checked before it stop reading.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113585#comment-16113585 ]
BELUGA BEHR commented on MAPREDUCE-1821:
----------------------------------------
Well, I don't think this is a big deal. There is only one checksum for the entire file, so you can't trust the values returned by {{nextRawKey}} until you reach the end of the file anyway. At that point, a call to the {{close}} method will cause the remaining bytes to be read and checksummed.
{code}
ArrayList<Map.Entry<K,V>> results = new ArrayList<>();
IFile.Reader reader = new IFile.Reader(...);
try {
  // file holds records 1,2,3,EOF,5,6,7,EOF (a spurious EOF marker after record 3)
  while (reader.nextRawKey(buf)) {
    // deserialize buffer into keyValue
    reader.nextRawValue(buf);
    // deserialize buffer into mapValue
    // Map.Entry is an interface, so use java.util.AbstractMap.SimpleEntry
    results.add(new AbstractMap.SimpleEntry<K,V>(keyValue, mapValue));
  }
} finally {
  try {
    // reads rest of file and validates checksum
    reader.close();
  } catch (ChecksumException cse) {
    // results holds only records 1,2,3 at this point, so discard them
    results.clear();
  }
}
return results;
{code}
Now, I don't know if this is being done anywhere, but the facilities are there.
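To make the pattern concrete without depending on Hadoop, here is a rough, self-contained sketch. It is not the real {{IFile.Reader}}: the class {{TrailingChecksumSketch}}, its toy layout (4-byte length, payload, a -1 end marker, then a 4-byte CRC32 trailer), and {{BadChecksumException}} are all made up for illustration. It only shows the idea that {{close}} drains the unread bytes and validates the single trailing checksum, so records read before a spurious end marker can be thrown away.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Toy stand-in for an IFile-style stream; NOT Hadoop's IFile.Reader.
// Assumed layout: ([int length][payload])* [int -1 end marker] [int CRC32 of all preceding bytes]
class TrailingChecksumSketch {
    static final int END_MARKER = -1;

    /** Hypothetical exception standing in for Hadoop's ChecksumException. */
    static class BadChecksumException extends Exception {
        BadChecksumException(String msg) { super(msg); }
    }

    /** Writes the records, the end marker, and the CRC32 trailer. */
    static byte[] write(List<String> records) {
        int size = 4 + 4; // end marker + CRC trailer
        for (String r : records) size += 4 + r.getBytes(StandardCharsets.UTF_8).length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String r : records) {
            byte[] payload = r.getBytes(StandardCharsets.UTF_8);
            buf.putInt(payload.length).put(payload);
        }
        buf.putInt(END_MARKER);
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 0, buf.position());
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    static class Reader {
        private final ByteBuffer body;   // everything except the 4-byte trailer
        private final int expectedCrc;   // trailer value
        private final CRC32 crc = new CRC32();

        Reader(byte[] file) {
            int bodyLen = file.length - 4;
            this.body = ByteBuffer.wrap(file, 0, bodyLen);
            this.expectedCrc = ByteBuffer.wrap(file, bodyLen, 4).getInt();
        }

        /** Returns the next payload, or null once an end marker is read. */
        byte[] next() {
            int start = body.position();
            int len = body.getInt();
            byte[] payload = null;
            if (len != END_MARKER) {
                payload = new byte[len];
                body.get(payload);
            }
            // checksum only the bytes consumed by this call
            crc.update(body.array(), start, body.position() - start);
            return payload;
        }

        /** Checksums whatever was left unread, then compares against the trailer. */
        void close() throws BadChecksumException {
            crc.update(body.array(), body.position(), body.remaining());
            body.position(body.limit());
            if ((int) crc.getValue() != expectedCrc) {
                throw new BadChecksumException("checksum mismatch");
            }
        }
    }

    /** Same shape as the loop above: read until the end marker, validate on close. */
    static List<String> readAll(byte[] file) {
        List<String> results = new ArrayList<>();
        Reader reader = new Reader(file);
        try {
            byte[] payload;
            while ((payload = reader.next()) != null) {
                results.add(new String(payload, StandardCharsets.UTF_8));
            }
        } finally {
            try {
                reader.close();
            } catch (BadChecksumException e) {
                results.clear(); // the records read so far cannot be trusted
            }
        }
        return results;
    }

    public static void main(String[] args) {
        byte[] good = write(Arrays.asList("1", "2", "3", "5", "6", "7"));
        System.out.println(readAll(good));
        byte[] bad = good.clone();
        // overwrite the 4th record's length field with -1 to fake an early end marker
        Arrays.fill(bad, 15, 19, (byte) 0xFF);
        System.out.println(readAll(bad));
    }
}
```

In this sketch the intact file yields all six records, while the corrupted copy stops after record 3, fails the check in {{close}}, and comes back empty. Whether any real caller of {{IFile.Reader}} handles the exception this way is a separate question.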
> IFile.Reader should check whether data crc has checked before it stop reading.
> ------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1821
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1821
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task
> Reporter: ZhuGuanyin
> Assignee: ZhuGuanyin
>
> Currently, IFile data has its CRC checked in IFileInputStream (the doRead method).
> Normally an IFile ends with 2 bytes of -1, the EOF_MARKER for keylength and valuelength, followed by a 4-byte CRC checksum.
> IFileInputStream checksumIn checks the CRC before IFile.Reader gets the EOF_MARKER.
> IFile.Reader stops reading when positionToNextRecord() reads EOF_MARKER (-1) for keylength and EOF_MARKER (-1) for valuelength.
> But if something goes wrong (the IFile is corrupted) and IFile.Reader reads -1, -1 somewhere other than the end of the IFile, the data may never be checked!
> The Reader then thinks it has read all the data and closes......the task may falsely report success without any WARNing.