Posted to user@hbase.apache.org by Saptarshi Guha <sa...@gmail.com> on 2012/11/12 19:26:12 UTC

Calling sync for every record in SequenceFile.Writer

Hello,

For a given part file (e.g. part-m-0000), I would like to record the
position of each key written to this file.

To get this position, I wrote something like this:
// out.sync();
currentposition = out.getLength();
record_current_position(key, currentposition);
out.append(key, value);

where out is a SequenceFile.Writer.

Now, if I leave the first line commented out, then for small files
getLength() does not change from key to key.
If I call sync for every key, it changes to accurately reflect the
position.
Is there some other function I can use to get the current position (like a
file's 'tell' function)?

Calling sync for every record would presumably be costly, though.

How costly? (I don't expect an answer to that last question.)
If it makes a difference, I have block compression turned on.
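
A toy model of the behaviour described above, assuming that a
block-compressed writer buffers records in memory and writes them to the
underlying stream only when a block fills up or sync() is called. The class
ToyBlockWriter and its 1000-byte block size are hypothetical and use only
the Java standard library. This is not Hadoop's implementation: it does no
compression and writes no sync markers.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Toy stand-in for a block-buffered writer (hypothetical, not Hadoop code).
class ToyBlockWriter {
    // Bytes that have actually reached the "file".
    private final ByteArrayOutputStream file = new ByteArrayOutputStream();
    // Records waiting in the current, not yet written, block.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int blockSize;

    ToyBlockWriter(int blockSize) {
        this.blockSize = blockSize;
    }

    // Records go into the in-memory block; the file grows only on a flush.
    void append(String key, String value) {
        byte[] rec = (key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8);
        buffer.write(rec, 0, rec.length);
        if (buffer.size() >= blockSize) {
            flushBlock();
        }
    }

    // Forces the pending block out, as calling sync before each record would.
    void sync() {
        flushBlock();
    }

    // Reports only bytes already written to the file, not buffered ones.
    long getLength() {
        return file.size();
    }

    private void flushBlock() {
        if (buffer.size() == 0) {
            return;
        }
        file.write(buffer.toByteArray(), 0, buffer.size());
        buffer.reset();
    }
}

class ToyBlockWriterDemo {
    public static void main(String[] args) {
        ToyBlockWriter out = new ToyBlockWriter(1000);
        for (int i = 0; i < 3; i++) {
            // out.sync();  // uncomment to make the position advance per key
            System.out.println("key" + i + " at " + out.getLength());
            out.append("key" + i, "value" + i);
        }
    }
}
```

With the sync() call commented out, all three keys in this toy are reported
at position 0, because nothing leaves the buffer before the block fills.
With it enabled, every key gets its own position, at the price of one flush
per record.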

I noticed that MapFile.Writer does something similar (it calls getLength) and
would reduce to the above operation, i.e. call getLength for every key-value
pair, if I set the index interval to 1. So would this affect MapFile.Writer
as well?

Cheers
Sapsi