Posted to common-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2013/06/24 18:38:21 UTC
[jira] [Created] (HADOOP-9667) SequenceFile: Reset keys and values when syncing to a place before the header
Colin Patrick McCabe created HADOOP-9667:
--------------------------------------------
Summary: SequenceFile: Reset keys and values when syncing to a place before the header
Key: HADOOP-9667
URL: https://issues.apache.org/jira/browse/HADOOP-9667
Project: Hadoop Common
Issue Type: Bug
Reporter: Colin Patrick McCabe
Priority: Minor
There seems to be a bug in the {{SequenceFile#sync}} function. Thanks to Christopher Ng for this report:
{code}
/** Seek to the next sync mark past a given position.*/
public synchronized void sync(long position) throws IOException {
  if (position+SYNC_SIZE >= end) {
    seek(end);
    return;
  }
  if (position < headerEnd) {
    // seek directly to first record
    in.seek(headerEnd);  // <==== should this not call seek (i.e. this.seek) instead?
    // note the sync marker "seen" in the header
    syncSeen = true;
    return;
  }
{code}
The problem is that when you sync to the start of a compressed file,
{{noBufferedKeys}} and {{valuesDecompressed}} are not reset, so a block read
is not triggered. A subsequent call to next() may then return keys from the
buffer, which still holds keys from the previous position in the file.
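
To make the failure mode concrete, here is a minimal, self-contained sketch of the stale-buffer behaviour described above. It is a toy model and not Hadoop's SequenceFile code: the class ToyBlockReader and its methods rawSeek and seek are hypothetical names, standing in for the raw stream call (in.seek) and the reader's own seek (this.seek) respectively, and the buffered-keys queue stands in for the state tracked by noBufferedKeys.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Toy model of a block-buffered reader (hypothetical; not Hadoop code). */
class ToyBlockReader {
    private final List<List<String>> blocks;   // each block holds several keys
    private int nextBlock = 0;                 // stands in for the raw stream position
    private final Deque<String> bufferedKeys = new ArrayDeque<>();

    ToyBlockReader(List<List<String>> blocks) {
        this.blocks = blocks;
    }

    /** Returns the next key; a block is read only when the buffer is empty. */
    String next() {
        if (bufferedKeys.isEmpty()) {
            if (nextBlock >= blocks.size()) {
                return null;                   // end of data
            }
            bufferedKeys.addAll(blocks.get(nextBlock++));
        }
        return bufferedKeys.poll();
    }

    /** Analogue of in.seek: moves the position but keeps the buffered keys. */
    void rawSeek(int block) {
        nextBlock = block;
    }

    /** Analogue of this.seek: moves the position and discards buffered keys. */
    void seek(int block) {
        nextBlock = block;
        bufferedKeys.clear();
    }

    public static void main(String[] args) {
        List<List<String>> data = List.of(List.of("a1", "a2"), List.of("b1", "b2"));

        ToyBlockReader buggy = new ToyBlockReader(data);
        buggy.next(); buggy.next(); buggy.next();   // a1, a2, b1 (b2 stays buffered)
        buggy.rawSeek(0);                           // back to the start, buffer untouched
        System.out.println(buggy.next());           // prints "b2" (stale key)

        ToyBlockReader fixed = new ToyBlockReader(data);
        fixed.next(); fixed.next(); fixed.next();
        fixed.seek(0);                              // back to the start, buffer cleared
        System.out.println(fixed.next());           // prints "a1"
    }
}
```

In this model, clearing the buffer on seek is what forces the next call to next() to read a fresh block, which is the effect the report expects from calling the reader's own seek instead of in.seek.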