Posted to dev@storm.apache.org by Sachin Pasalkar <Sa...@symantec.com> on 2015/09/16 10:19:45 UTC

Possible issue with SequenceFileBolt

When the compressionType is set to BLOCK, the bolt does not rotate the file at the specified file size.

On further investigation I found that the block writer syncs only when the buffered data reaches compressionBlockSize.
The relevant code is in org.apache.hadoop.io.SequenceFile.BlockCompressWriter.append(Object, Object):

    int currentBlockSize = keyBuffer.getLength() + valBuffer.getLength();
    if (currentBlockSize >= compressionBlockSize)
        sync();
The default value of compressionBlockSize is 1000000, which makes
offset = this.writer.getLength(); (in the SequenceFileBolt class) return a stale value for the current data size until the next sync.

In the case of org.apache.hadoop.io.SequenceFile.RecordCompressWriter it works fine, because every append goes through the sync API (checkAndWriteSync()).

Either we can sync on every record as it comes, or we can include the buffered size (keyBuffer.getLength() + valBuffer.getLength()) in the reported length so the rotation check sees the current size.
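To make the behavior concrete, here is a minimal sketch of the buffering described above. MockBlockWriter is a hypothetical stand-in for BlockCompressWriter (not real Hadoop code): getLength() reports only synced bytes, so the rotation check in the bolt sees a stale size, while getEffectiveLength() shows the proposed fix of adding the buffered bytes.

```java
// Hypothetical model of BlockCompressWriter's buffering; names are
// illustrative, not taken from Hadoop.
public class MockBlockWriter {
    private final int compressionBlockSize;
    private long syncedLength = 0;   // bytes actually written out by sync()
    private int bufferedBytes = 0;   // bytes still sitting in key/value buffers

    public MockBlockWriter(int compressionBlockSize) {
        this.compressionBlockSize = compressionBlockSize;
    }

    // Mirrors append(): data is buffered, and sync() runs only once the
    // buffered block reaches compressionBlockSize.
    public void append(int keyLen, int valLen) {
        bufferedBytes += keyLen + valLen;
        if (bufferedBytes >= compressionBlockSize) {
            sync();
        }
    }

    private void sync() {
        syncedLength += bufferedBytes;
        bufferedBytes = 0;
    }

    // Like writer.getLength(): lags behind until the next sync.
    public long getLength() {
        return syncedLength;
    }

    // Proposed fix: include the buffered bytes in the reported size.
    public long getEffectiveLength() {
        return syncedLength + bufferedBytes;
    }

    public static void main(String[] args) {
        MockBlockWriter w = new MockBlockWriter(1000000);
        w.append(100, 900);   // 1,000 bytes buffered, well below the block size
        System.out.println(w.getLength());          // prints 0 -> rotation never triggers
        System.out.println(w.getEffectiveLength()); // prints 1000
    }
}
```

With a 1 MB compressionBlockSize and small tuples, getLength() stays at 0 until a full block accumulates, which is why a size-based rotation policy keyed off it does not fire on time.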

Thanks,
Sachin