Posted to commits@lucene.apache.org by rm...@apache.org on 2013/01/22 01:01:39 UTC
svn commit: r1436696 - in
/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene:
codecs/ codecs/lucene40/ codecs/lucene41/ codecs/lucene42/ codecs/perfield/
document/ index/ search/ search/similarities/
Author: rmuir
Date: Tue Jan 22 00:01:38 2013
New Revision: 1436696
URL: http://svn.apache.org/viewvc?rev=1436696&view=rev
Log:
javadocs
Modified:
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene40/package.html
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene41/package.html
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene42/package.html
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/document/FieldType.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/AtomicReader.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/NumericDocValues.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/SortedDocValues.java
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/package.html
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/package.html
lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/similarities/package.html
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java Tue Jan 22 00:01:38 2013
@@ -37,10 +37,7 @@ import org.apache.lucene.util.PriorityQu
// prototype streaming DV api
public abstract class DocValuesConsumer implements Closeable {
- // TODO: are any of these params too "infringing" on codec?
- // we want codec to get necessary stuff from IW, but trading off against merge complexity.
- // nocommit should we pass SegmentWriteState...?
public abstract void addNumericField(FieldInfo field, Iterable<Number> values) throws IOException;
public abstract void addBinaryField(FieldInfo field, Iterable<BytesRef> values) throws IOException;
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene40/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene40/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene40/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene40/package.html Tue Jan 22 00:01:38 2013
@@ -363,11 +363,11 @@ file, previously they were stored in tex
frequencies.</li>
<li>In version 4.0, the format of the inverted index became extensible via
the {@link org.apache.lucene.codecs.Codec Codec} api. Fast per-document storage
-({@link org.apache.lucene.index.DocValues DocValues}) was introduced. Normalization
-factors need no longer be a single byte, they can be any DocValues
-{@link org.apache.lucene.index.DocValues.Type type}. Terms need not be unicode
-strings, they can be any byte sequence. Term offsets can optionally be indexed
-into the postings lists. Payloads can be stored in the term vectors.</li>
+({@code DocValues}) was introduced. Normalization factors need no longer be a
+single byte, they can be any {@link org.apache.lucene.index.NumericDocValues NumericDocValues}.
+Terms need not be unicode strings, they can be any byte sequence. Term offsets
+can optionally be indexed into the postings lists. Payloads can be stored in the
+term vectors.</li>
</ul>
<a name="Limitations" id="Limitations"></a>
<h2>Limitations</h2>
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene41/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene41/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene41/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene41/package.html Tue Jan 22 00:01:38 2013
@@ -368,11 +368,11 @@ file, previously they were stored in tex
frequencies.</li>
<li>In version 4.0, the format of the inverted index became extensible via
the {@link org.apache.lucene.codecs.Codec Codec} api. Fast per-document storage
-({@link org.apache.lucene.index.DocValues DocValues}) was introduced. Normalization
-factors need no longer be a single byte, they can be any DocValues
-{@link org.apache.lucene.index.DocValues.Type type}. Terms need not be unicode
-strings, they can be any byte sequence. Term offsets can optionally be indexed
-into the postings lists. Payloads can be stored in the term vectors.</li>
+({@code DocValues}) was introduced. Normalization factors need no longer be a
+single byte, they can be any {@link org.apache.lucene.index.NumericDocValues NumericDocValues}.
+Terms need not be unicode strings, they can be any byte sequence. Term offsets
+can optionally be indexed into the postings lists. Payloads can be stored in the
+term vectors.</li>
<li>In version 4.1, the format of the postings list changed to use either
of FOR compression or variable-byte encoding, depending upon the frequency
of the term.</li>
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene42/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene42/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene42/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/lucene42/package.html Tue Jan 22 00:01:38 2013
@@ -368,11 +368,11 @@ file, previously they were stored in tex
frequencies.</li>
<li>In version 4.0, the format of the inverted index became extensible via
the {@link org.apache.lucene.codecs.Codec Codec} api. Fast per-document storage
-({@link org.apache.lucene.index.DocValues DocValues}) was introduced. Normalization
-factors need no longer be a single byte, they can be any DocValues
-{@link org.apache.lucene.index.DocValues.Type type}. Terms need not be unicode
-strings, they can be any byte sequence. Term offsets can optionally be indexed
-into the postings lists. Payloads can be stored in the term vectors.</li>
+({@code DocValues}) was introduced. Normalization factors need no longer be a
+single byte, they can be any {@link org.apache.lucene.index.NumericDocValues NumericDocValues}.
+Terms need not be unicode strings, they can be any byte sequence. Term offsets
+can optionally be indexed into the postings lists. Payloads can be stored in the
+term vectors.</li>
<li>In version 4.1, the format of the postings list changed to use either
of FOR compression or variable-byte encoding, depending upon the frequency
of the term.</li>
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java Tue Jan 22 00:01:38 2013
@@ -182,9 +182,6 @@ public abstract class PerFieldDocValuesF
}
}
- // nocommit what if SimpleNormsFormat wants to use this
- // ...? we have a "boolean isNorms" issue...? I guess we
- // just need to make a PerFieldNormsFormat?
private class FieldsReader extends DocValuesProducer {
private final Map<String,DocValuesProducer> fields = new TreeMap<String,DocValuesProducer>();
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/document/FieldType.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/document/FieldType.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/document/FieldType.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/document/FieldType.java Tue Jan 22 00:01:38 2013
@@ -416,7 +416,7 @@ public class FieldType implements Indexa
* {@inheritDoc}
* <p>
* The default is <code>null</code> (no docValues)
- * @see #setDocValueType(DocValuesType)
+ * @see #setDocValueType(org.apache.lucene.index.FieldInfo.DocValuesType)
*/
@Override
public DocValuesType docValueType() {
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/AtomicReader.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/AtomicReader.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/AtomicReader.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/AtomicReader.java Tue Jan 22 00:01:38 2013
@@ -175,10 +175,10 @@ public abstract class AtomicReader exten
* used by a single thread. */
public abstract SortedDocValues getSortedDocValues(String field) throws IOException;
- // nocommit document that these are thread-private:
/** Returns {@link NumericDocValues} representing norms
* for this field, or null if no {@link NumericDocValues}
- * were indexed. */
+ * were indexed. The returned instance should only be
+ * used by a single thread. */
public abstract NumericDocValues getNormValues(String field) throws IOException;
/**
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java Tue Jan 22 00:01:38 2013
@@ -19,6 +19,9 @@ package org.apache.lucene.index;
import org.apache.lucene.util.BytesRef;
+/**
+ * A per-document byte[]
+ */
public abstract class BinaryDocValues {
/** Lookup the value for document.
@@ -29,8 +32,12 @@ public abstract class BinaryDocValues {
* "private" instance should be used for each source. */
public abstract void get(int docID, BytesRef result);
+ /**
+ * Indicates the value was missing for the document.
+ */
public static final byte[] MISSING = new byte[0];
+ /** An empty BinaryDocValues which returns empty bytes for every document */
public static final BinaryDocValues EMPTY = new BinaryDocValues() {
@Override
public void get(int docID, BytesRef result) {
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/NumericDocValues.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/NumericDocValues.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/NumericDocValues.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/NumericDocValues.java Tue Jan 22 00:01:38 2013
@@ -17,9 +17,18 @@ package org.apache.lucene.index;
* limitations under the License.
*/
+/**
+ * A per-document numeric value.
+ */
public abstract class NumericDocValues {
+ /**
+ * Returns the numeric value for the specified document ID.
+ * @param docID document ID to lookup
+ * @return numeric value
+ */
public abstract long get(int docID);
+ /** An empty NumericDocValues which returns zero for every document */
public static final NumericDocValues EMPTY = new NumericDocValues() {
@Override
public long get(int docID) {
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/SortedDocValues.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/SortedDocValues.java?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/SortedDocValues.java (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/SortedDocValues.java Tue Jan 22 00:01:38 2013
@@ -19,11 +19,35 @@ package org.apache.lucene.index;
import org.apache.lucene.util.BytesRef;
+/**
+ * A per-document byte[] with presorted values.
+ * <p>
+ * Per-Document values in a SortedDocValues are deduplicated, dereferenced,
+ * and sorted into a dictionary of unique values. A pointer to the
+ * dictionary value (ordinal) can be retrieved for each document. Ordinals
+ * are dense and in increasing sorted order.
+ */
public abstract class SortedDocValues extends BinaryDocValues {
+ /**
+ * Returns the ordinal for the specified docID.
+ * @param docID document ID to lookup
+ * @return ordinal for the document: this is dense, starts at 0, then
+ * increments by 1 for the next value in sorted order.
+ */
public abstract int getOrd(int docID);
+ /** Retrieves the value for the specified ordinal.
+ * @param ord ordinal to lookup
+ * @param result will be populated with the ordinal's value
+ * @see #getOrd(int)
+ */
public abstract void lookupOrd(int ord, BytesRef result);
+ /**
+ * Returns the number of unique values.
+ * @return number of unique values in this SortedDocValues. This is
+ * also equivalent to one plus the maximum ordinal.
+ */
public abstract int getValueCount();
@Override
@@ -37,6 +61,7 @@ public abstract class SortedDocValues ex
}
}
+ /** An empty SortedDocValues which returns empty bytes for every document */
public static final SortedDocValues EMPTY = new SortedDocValues() {
@Override
public int getOrd(int docID) {
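Note on the SortedDocValues javadoc above: the ordinal scheme it describes (values deduplicated into a sorted dictionary, one dense ordinal per document, value count equal to one plus the maximum ordinal) can be illustrated with a small in-memory sketch. This is not Lucene code. The class name SortedValuesSketch and its build method are hypothetical. It uses String where the real API fills a BytesRef out-parameter, and it sorts by String natural order rather than by bytes.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Toy in-memory illustration of a sorted value dictionary; hypothetical, not Lucene code. */
final class SortedValuesSketch {
    private final String[] dictionary; // unique values in sorted order
    private final int[] ords;          // ords[docID] = position of that doc's value in dictionary

    private SortedValuesSketch(String[] dictionary, int[] ords) {
        this.dictionary = dictionary;
        this.ords = ords;
    }

    /** Builds the dictionary and per-document ordinals from one value per document. */
    static SortedValuesSketch build(String[] perDocValues) {
        // A TreeSet deduplicates and sorts in one step.
        TreeSet<String> unique = new TreeSet<>(Arrays.asList(perDocValues));
        String[] dictionary = unique.toArray(new String[0]);
        int[] ords = new int[perDocValues.length];
        for (int docID = 0; docID < perDocValues.length; docID++) {
            // Every value is present in the dictionary, so the search always succeeds.
            ords[docID] = Arrays.binarySearch(dictionary, perDocValues[docID]);
        }
        return new SortedValuesSketch(dictionary, ords);
    }

    /** Ordinal for the document: dense, starting at 0, increasing in sorted order. */
    int getOrd(int docID) {
        return ords[docID];
    }

    /** Value for the given ordinal. */
    String lookupOrd(int ord) {
        return dictionary[ord];
    }

    /** Number of unique values, i.e. one plus the maximum ordinal. */
    int getValueCount() {
        return dictionary.length;
    }

    /** Per-document value, resolved through the ordinal. */
    String get(int docID) {
        return lookupOrd(getOrd(docID));
    }

    public static void main(String[] args) {
        SortedValuesSketch dv = build(new String[] {"pear", "apple", "pear", "fig"});
        for (int docID = 0; docID < 4; docID++) {
            System.out.println(docID + " -> ord " + dv.getOrd(docID) + " (" + dv.get(docID) + ")");
        }
    }
}
```

Because ordinals follow sorted order, comparing two documents' values reduces to comparing two ints, which is presumably why the javadoc stresses that ordinals are dense and increasing.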
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/index/package.html Tue Jan 22 00:01:38 2013
@@ -254,7 +254,7 @@ its {@link org.apache.lucene.search.simi
</p>
<p>
Additional user-supplied statistics can be added to the document as DocValues fields and
-accessed via {@link org.apache.lucene.index.AtomicReader#docValues}.
+accessed via {@link org.apache.lucene.index.AtomicReader#getNumericDocValues}.
</p>
<p>
</body>
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/package.html Tue Jan 22 00:01:38 2013
@@ -338,7 +338,7 @@ extend by plugging in a different compon
Finally, you can extend the low level {@link org.apache.lucene.search.similarities.Similarity Similarity} directly
to implement a new retrieval model, or to use external scoring factors particular to your application. For example,
a custom Similarity can access per-document values via {@link org.apache.lucene.search.FieldCache FieldCache} or
-{@link org.apache.lucene.index.DocValues} and integrate them into the score.
+{@link org.apache.lucene.index.NumericDocValues} and integrate them into the score.
</p>
<p>
See the {@link org.apache.lucene.search.similarities} package documentation for information
Modified: lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/similarities/package.html
URL: http://svn.apache.org/viewvc/lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/similarities/package.html?rev=1436696&r1=1436695&r2=1436696&view=diff
==============================================================================
--- lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/similarities/package.html (original)
+++ lucene/dev/branches/lucene4547/lucene/core/src/java/org/apache/lucene/search/similarities/package.html Tue Jan 22 00:01:38 2013
@@ -132,7 +132,7 @@ subclassing the Similarity, one can simp
matching term occurs. In these
cases people have overridden Similarity to return 1 from the tf() method.</p></li>
<li><p>Changing Length Normalization — By overriding
- {@link org.apache.lucene.search.similarities.Similarity#computeNorm(FieldInvertState state, Norm)},
+ {@link org.apache.lucene.search.similarities.Similarity#computeNorm(FieldInvertState state)},
it is possible to discount how the length of a field contributes
to a score. In {@link org.apache.lucene.search.similarities.DefaultSimilarity},
lengthNorm = 1 / (numTerms in field)^0.5, but if one changes this to be