Posted to java-commits@lucene.apache.org by gs...@apache.org on 2006/06/21 02:29:32 UTC

svn commit: r415851 - /lucene/java/trunk/xdocs/fileformats.xml

Author: gsingers
Date: Tue Jun 20 17:29:32 2006
New Revision: 415851

URL: http://svn.apache.org/viewvc?rev=415851&view=rev
Log:
Updated the 1.9 reference at the top of the file and added in some cross references to the API.

Modified:
    lucene/java/trunk/xdocs/fileformats.xml

Modified: lucene/java/trunk/xdocs/fileformats.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/xdocs/fileformats.xml?rev=415851&r1=415850&r2=415851&view=diff
==============================================================================
--- lucene/java/trunk/xdocs/fileformats.xml (original)
+++ lucene/java/trunk/xdocs/fileformats.xml Tue Jun 20 17:29:32 2006
@@ -14,7 +14,7 @@
 
             <p>
                 This document defines the index file formats used
-                in Lucene version 1.9.  If you are using a different
+                in Lucene version 2.0.  If you are using a different
 		version of Lucene, please consult the copy of
 		<code>docs/fileformats.html</code> that was distributed
 		with the version you are using.
@@ -107,7 +107,7 @@
                     tokenized, but sometimes it is useful for certain identifier fields
                     to be indexed literally.
                 </p>
-
+                <p>See the <a href="http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html">Field</a> java docs for more information on Fields.</p>
             </subsection>
 
             <subsection name="Segments">
@@ -230,8 +230,9 @@
                     </p>
                 </li>
                 <li><p>Term Vectors.  For each field in each document, the term vector
-                       (sometimes called document vector) is stored.  A term vector consists
-                       of term text and term frequency.
+                       (sometimes called document vector) may be stored.  A term vector consists
+                       of term text and term frequency.  To add Term Vectors to your index see the
+                    <a href="http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html">Field</a> constructors
                     </p>
                 </li>              
                 <li><p>Deleted documents.
@@ -249,7 +250,8 @@
             <p>
                 All files belonging to a segment have the same name with varying
                 extensions.  The extensions correspond to the different file formats
-                described below.
+                described below. When using the Compound File format (default in 1.4 and greater) these files are
+                collapsed into a single .cfs file (see below for details)
             </p>
 
             <p>
@@ -814,6 +816,7 @@
             	<p>FileName --&gt; String</p>
 
             	<p>FileData --&gt; raw file data</p>
+                <p>The raw file data is the data from the individual files named above.</p>
             	
             </subsection>
 
@@ -1096,7 +1099,10 @@
                             particular, it is the difference between the position of this term's
                             entry in that file and the position of the previous term's entry.
                         </p>
-                        <p>TODO: document skipInterval information</p>
+                        <p>SkipInterval is the fraction of TermDocs stored in skip tables. It is used to accelerate TermDocs.skipTo(int).
+                            Larger values result in smaller indexes, greater acceleration, but fewer accelerable cases, while
+                            smaller values result in bigger indexes, less acceleration and more
+                            accelerable cases.</p>
                     </li>
                 </ol>
             </subsection>
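
The SkipInterval trade-off described in the last hunk can be illustrated with a minimal sketch. It assumes a simplified in-memory posting list (a sorted int array of document numbers) rather than Lucene's on-disk layout, and it assumes one skip entry is kept for every skipInterval-th posting. The names SkipIntervalSketch, buildSkipEntries and skipTo are hypothetical and are not Lucene APIs; only the idea of accelerating TermDocs.skipTo(int) comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
final class SkipIntervalSketch {

    // Assumption: one skip entry per skipInterval postings.
    // Each entry is an index into docs (the sorted posting list).
    static int[] buildSkipEntries(int[] docs, int skipInterval) {
        List<Integer> entries = new ArrayList<>();
        for (int i = skipInterval; i < docs.length; i += skipInterval) {
            entries.add(i);
        }
        int[] out = new int[entries.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = entries.get(i);
        }
        return out;
    }

    // Finds the first document number >= target.
    // Returns {doc, postingsScanned}; doc is -1 when no such document exists.
    // postingsScanned counts postings read, not skip entries consulted.
    static int[] skipTo(int[] docs, int[] skips, int target) {
        int pos = 0;
        // Jump to the last skip point whose document is still below target.
        for (int s : skips) {
            if (docs[s] < target) {
                pos = s;
            } else {
                break;
            }
        }
        // Finish with a linear scan from the chosen starting position.
        int scanned = 0;
        while (pos < docs.length) {
            scanned++;
            if (docs[pos] >= target) {
                return new int[] {docs[pos], scanned};
            }
            pos++;
        }
        return new int[] {-1, scanned};
    }

    public static void main(String[] args) {
        int[] docs = new int[1000];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = 2 * i; // even document numbers 0..1998
        }
        for (int interval : new int[] {4, 16, 64}) {
            int[] skips = buildSkipEntries(docs, interval);
            int[] r = skipTo(docs, skips, 1501);
            System.out.println("skipInterval=" + interval
                    + " skipEntries=" + skips.length
                    + " doc=" + r[0]
                    + " postingsScanned=" + r[1]);
        }
    }
}
```

In this sketch a larger skipInterval stores fewer skip entries, and each entry jumps over more postings, but a skip only helps when the target lies at least one full interval ahead. A smaller skipInterval stores more entries and helps in more cases, at the cost of a larger skip table.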