Posted to commits@lucene.apache.org by jp...@apache.org on 2013/01/26 14:24:45 UTC
svn commit: r1438892 - in
/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed:
BlockPackedWriter.java MonotonicBlockPackedWriter.java
Author: jpountz
Date: Sat Jan 26 13:24:44 2013
New Revision: 1438892
URL: http://svn.apache.org/viewvc?rev=1438892&view=rev
Log:
More format docs.
Modified:
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/BlockPackedWriter.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.java
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/BlockPackedWriter.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/BlockPackedWriter.java?rev=1438892&r1=1438891&r2=1438892&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/BlockPackedWriter.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/BlockPackedWriter.java Sat Jan 26 13:24:44 2013
@@ -29,7 +29,30 @@ import org.apache.lucene.store.DataOutpu
* using as few bits as possible. Memory usage of this class is proportional to
* the block size. Each block has an overhead between 1 and 10 bytes to store
* the minimum value and the number of bits per value of the block.
+ * <p>
+ * Format:
+ * <ul>
+ * <li><Block><sup>BlockCount</sup>
+ * <li>BlockCount: ⌈ ValueCount / BlockSize ⌉
+ * <li>Block: <Header, (Ints)>
+ * <li>Header: <Token, (MinValue)>
+ * <li>Token: a {@link DataOutput#writeByte(byte) byte}, first 7 bits are the
+ * number of bits per value (<tt>bitsPerValue</tt>). If the 8th bit is 1,
+ * then MinValue (see next) is <tt>0</tt>, otherwise MinValue is present and
+ * needs to be decoded
+ * <li>MinValue: a
+ * <a href="https://developers.google.com/protocol-buffers/docs/encoding#types">zigzag-encoded</a>
+ * {@link DataOutput#writeVLong(long) variable-length long} whose value
+ * should be added to every int from the block to restore the original
+ * values
+ * <li>Ints: If the number of bits per value is <tt>0</tt>, then there is
+ * nothing to decode and all ints are equal to MinValue. Otherwise: BlockSize
+ * {@link PackedInts packed ints} encoded on exactly <tt>bitsPerValue</tt>
+ * bits per value. Each packed int is the original value minus
+ * MinValue
+ * </ul>
* @see BlockPackedReaderIterator
+ * @see BlockPackedReader
* @lucene.internal
*/
public final class BlockPackedWriter extends AbstractBlockPackedWriter {
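
A minimal sketch of the block format documented above, not the code from this commit. It only illustrates the arithmetic: subtracting MinValue, counting the bits needed for the largest delta, zigzag-encoding MinValue, and building the token. The class name BlockFormatSketch and all of its methods are hypothetical, and the token layout (flag in the lowest bit, bitsPerValue in the remaining 7 bits) is an assumption, since the Javadoc does not say which bits come first.

```java
final class BlockFormatSketch {

  // Zigzag encoding: maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ...
  // so that values of small magnitude need few bytes as a variable-length long.
  static long zigZagEncode(long v) {
    return (v >> 63) ^ (v << 1);
  }

  static long zigZagDecode(long v) {
    return (v >>> 1) ^ -(v & 1);
  }

  // Number of bits needed to store every delta in [0, maxDelta]; 0 when all values are equal.
  static int bitsRequired(long maxDelta) {
    return maxDelta == 0 ? 0 : 64 - Long.numberOfLeadingZeros(maxDelta);
  }

  // ASSUMED layout: lowest bit flags "MinValue is 0", the other 7 bits hold bitsPerValue.
  static int token(int bitsPerValue, boolean minIsZero) {
    return (bitsPerValue << 1) | (minIsZero ? 1 : 0);
  }

  static long min(long[] values) {
    long min = values[0];
    for (long v : values) {
      min = Math.min(min, v);
    }
    return min;
  }

  // The ints that would be packed: original value minus MinValue.
  static long[] deltas(long[] values, long min) {
    long[] out = new long[values.length];
    for (int i = 0; i < values.length; i++) {
      out[i] = values[i] - min;
    }
    return out;
  }

  // Decoding side: add MinValue back to every packed int.
  static long[] restore(long[] deltas, long min) {
    long[] out = new long[deltas.length];
    for (int i = 0; i < deltas.length; i++) {
      out[i] = deltas[i] + min;
    }
    return out;
  }

  public static void main(String[] args) {
    long[] block = {100, 103, 101, 107};
    long min = min(block);
    long[] deltas = deltas(block, min);
    long maxDelta = 0;
    for (long d : deltas) {
      maxDelta = Math.max(maxDelta, d);
    }
    int bpv = bitsRequired(maxDelta);
    System.out.println("min=" + min + " bitsPerValue=" + bpv
        + " token=" + token(bpv, min == 0)
        + " zigzag(min)=" + zigZagEncode(min));
  }
}
```

For the block {100, 103, 101, 107} the deltas are {0, 3, 1, 7}, so 3 bits per value suffice instead of the 7 needed for the raw values.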
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.java?rev=1438892&r1=1438891&r2=1438892&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.java Sat Jan 26 13:24:44 2013
@@ -24,10 +24,30 @@ import org.apache.lucene.store.DataOutpu
/**
* A writer for large monotonically increasing sequences of positive longs.
* <p>
- * The sequence is divided into fixed-size blocks and for each block, the
- * average value per ord is computed, followed by the delta from the expected
- * value for every ord, using as few bits as possible. Each block has an
- * overhead between 6 and 14 bytes.
+ * The sequence is divided into fixed-size blocks and for each block, values
+ * are modeled after a linear function f: x → A × x + B. The block
+ * encodes deltas from the expected values computed from this function using as
+ * few bits as possible. Each block has an overhead between 6 and 14 bytes.
+ * <p>
+ * Format:
+ * <ul>
+ * <li><Block><sup>BlockCount</sup>
+ * <li>BlockCount: ⌈ ValueCount / BlockSize ⌉
+ * <li>Block: <Header, (Ints)>
+ * <li>Header: <B, A, BitsPerValue>
+ * <li>B: the B from f: x → A × x + B encoded using a
+ * {@link DataOutput#writeVLong(long) variable-length long}
+ * <li>A: the A from f: x → A × x + B encoded using
+ * {@link Float#floatToIntBits(float)} on
+ * {@link DataOutput#writeInt(int) 4 bytes}
+ * <li>BitsPerValue: a {@link DataOutput#writeVInt(int) variable-length int}
+ * <li>Ints: if BitsPerValue is <tt>0</tt>, then there is nothing to read and
+ * all values perfectly match the result of the function. Otherwise, these
+ * are the
+ * <a href="https://developers.google.com/protocol-buffers/docs/encoding#types">zigzag-encoded</a>
+ * {@link PackedInts packed} deltas from the expected value (computed from
+ * the function) using exactly BitsPerValue bits per value
+ * </ul>
* @see MonotonicBlockPackedReader
* @lucene.internal
*/
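
The monotonic format above can be illustrated with a small sketch, again not the committed code. It models a block with f: x → A × x + B, zigzag-encodes the deltas from the expected values, and restores the original values. The class MonotonicBlockSketch and its methods are hypothetical. Two choices here are assumptions, because the Javadoc does not state them: A is taken as the average slope of the block and B as its first value, and the expected value is computed as B + (long) (A × x).

```java
import java.util.Arrays;

final class MonotonicBlockSketch {

  static long zigZagEncode(long v) {
    return (v >> 63) ^ (v << 1);
  }

  static long zigZagDecode(long v) {
    return (v >>> 1) ^ -(v & 1);
  }

  // ASSUMPTION: A is the average increase per position within the block.
  static float slope(long[] values) {
    return values.length == 1
        ? 0f
        : (float) (values[values.length - 1] - values[0]) / (values.length - 1);
  }

  // Expected value at position x. Writer and reader must use the same expression
  // so that float rounding cancels out.
  static long expected(long b, float a, int x) {
    return b + (long) (a * x);
  }

  // Zigzag-encoded deltas between the actual and the expected values.
  static long[] encodeDeltas(long[] values, long b, float a) {
    long[] out = new long[values.length];
    for (int x = 0; x < values.length; x++) {
      out[x] = zigZagEncode(values[x] - expected(b, a, x));
    }
    return out;
  }

  static long[] decode(long[] encoded, long b, float a) {
    long[] out = new long[encoded.length];
    for (int x = 0; x < encoded.length; x++) {
      out[x] = expected(b, a, x) + zigZagDecode(encoded[x]);
    }
    return out;
  }

  // Bits needed for the largest encoded delta; 0 when the block is perfectly linear.
  static int bitsRequired(long[] encoded) {
    long or = 0;
    for (long e : encoded) {
      or |= e;
    }
    return or == 0 ? 0 : 64 - Long.numberOfLeadingZeros(or);
  }

  public static void main(String[] args) {
    long[] block = {10, 20, 31, 40};
    long b = block[0];
    float a = slope(block);
    // A occupies 4 bytes in the header; the bit pattern survives the round trip.
    int aBits = Float.floatToIntBits(a);
    long[] encoded = encodeDeltas(block, b, Float.intBitsToFloat(aBits));
    System.out.println("A=" + a + " B=" + b
        + " bitsPerValue=" + bitsRequired(encoded)
        + " deltas=" + Arrays.toString(encoded)
        + " restored=" + Arrays.toString(decode(encoded, b, a)));
  }
}
```

With the block {10, 20, 31, 40}, only the third value deviates from the line (by 1), so the encoded deltas fit in 2 bits each. A perfectly linear block such as {5, 8, 11, 14} needs 0 bits and stores nothing beyond the header.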