Posted to java-commits@lucene.apache.org by mi...@apache.org on 2006/11/18 00:18:49 UTC

svn commit: r476359 [1/2] - in /lucene/java/trunk: ./ docs/ src/java/org/apache/lucene/index/ src/java/org/apache/lucene/store/ src/test/org/apache/lucene/index/ src/test/org/apache/lucene/store/ xdocs/

Author: mikemccand
Date: Fri Nov 17 15:18:47 2006
New Revision: 476359

URL: http://svn.apache.org/viewvc?view=rev&rev=476359
Log:
Lockless commits

Added:
    lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileDeleter.java
    lucene/java/trunk/src/test/org/apache/lucene/index/index.prelockless.cfs.zip   (with props)
    lucene/java/trunk/src/test/org/apache/lucene/index/index.prelockless.nocfs.zip   (with props)
Modified:
    lucene/java/trunk/CHANGES.txt
    lucene/java/trunk/docs/fileformats.html
    lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNameFilter.java
    lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNames.java
    lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java
    lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java
    lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java
    lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfo.java
    lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java
    lucene/java/trunk/src/java/org/apache/lucene/index/SegmentReader.java
    lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java
    lucene/java/trunk/src/java/org/apache/lucene/store/RAMDirectory.java
    lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexReader.java
    lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexWriter.java
    lucene/java/trunk/src/test/org/apache/lucene/index/TestMultiReader.java
    lucene/java/trunk/src/test/org/apache/lucene/store/TestLockFactory.java
    lucene/java/trunk/xdocs/fileformats.xml

Modified: lucene/java/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/CHANGES.txt (original)
+++ lucene/java/trunk/CHANGES.txt Fri Nov 17 15:18:47 2006
@@ -104,6 +104,15 @@
  9. LUCENE-657: Made FuzzyQuery non-final and inner ScoreTerm protected.
     (Steven Parkes via Otis Gospodnetic)
 
+10. LUCENE-701: Lockless commits: a commit lock is no longer required
+    when a writer commits and a reader opens the index.  This includes
+    a change to the index file format (see docs/fileformats.html for
+    details).  It also removes all APIs associated with the commit
+    lock & its timeout.  Readers are now truly read-only and do not
+    block one another on startup.  This is the first step to getting
+    Lucene to work correctly over NFS (second step is
+    LUCENE-710). (Mike McCandless)
+
 Bug fixes
 
  1. Fixed the web application demo (built with "ant war-demo") which

Modified: lucene/java/trunk/docs/fileformats.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/docs/fileformats.html?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/docs/fileformats.html (original)
+++ lucene/java/trunk/docs/fileformats.html Fri Nov 17 15:18:47 2006
@@ -118,7 +118,7 @@
         <blockquote>
                                     <p>
                 This document defines the index file formats used
-                in Lucene version 2.0.  If you are using a different
+                in Lucene version 2.1.  If you are using a different
 		version of Lucene, please consult the copy of
 		<code>docs/fileformats.html</code> that was distributed
 		with the version you are using.
@@ -143,6 +143,17 @@
                 Compatibility notes are provided in this document,
                 describing how file formats have changed from prior versions.
             </p>
+                                                <p>
+	        In version 2.1, the file format was changed to allow
+	        lock-less commits (ie, no more commit lock).  The
+	        change is fully backwards compatible: you can open a
+	        pre-2.1 index for searching or adding/deleting of
+	        docs.  When the new segments file is saved
+	        (committed), it will be written in the new file format
+	        (meaning no specific "upgrade" process is needed).
+	        But note that once a commit has occurred, pre-2.1
+	        Lucene will not be able to read the index.
+	    </p>
                             </blockquote>
         </p>
       </td></tr>
@@ -404,6 +415,17 @@
                 in an index are stored in a single directory, although this is not
                 required.
             </p>
+                                                <p>
+	        As of version 2.1 (lock-less commits), file names are
+	        never re-used (there is one exception, "segments.gen",
+	        see below).  That is, when any file is saved to the
+	        Directory it is given a never before used filename.
+	        This is achieved using a simple generations approach.
+	        For example, the first segments file is segments_1,
+	        then segments_2, etc.  The generation is a sequential
+	        long integer represented in alpha-numeric (base 36)
+	        form.
+            </p>
                             </blockquote>
         </p>
       </td></tr>
@@ -1080,25 +1102,53 @@
         <blockquote>
                                     <p>
                     The active segments in the index are stored in the
-                    segment info file.  An index only has
-                    a single file in this format, and it is named "segments".
-                    This lists each segment by name, and also contains the size of each
-                    segment.
-                </p>
+                    segment info file, <tt>segments_N</tt>.  There may
+                    be one or more <tt>segments_N</tt> files in the
+                    index; however, the one with the largest
+                    generation is the active one (when older
+                    segments_N files are present it's because they
+                    temporarily cannot be deleted, or, a writer is in
+                    the process of committing). This file lists each
+                    segment by name, has details about the separate
+                    norms and deletion files, and also contains the
+                    size of each segment.
+                </p>
+                                                <p>
+		    As of 2.1, there is also a file
+		    <tt>segments.gen</tt>.  This file contains the
+		    current generation (the <tt>_N</tt> in
+		    <tt>segments_N</tt>) of the index.  This is
+		    used only as a fallback in case the current
+		    generation cannot be accurately determined by
+		    directory listing alone (as is the case for some
+		    NFS clients with time-based directory cache
+		    expiration).  This file simply contains an Int32
+		    version header (SegmentInfos.FORMAT_LOCKLESS =
+		    -2), followed by the generation recorded as Int64,
+		    written twice.
+		</p>
                                                 <p>
+		<b>Pre-2.1:</b>
                     Segments    --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;<sup>SegCount</sup>
                 </p>
                                                 <p>
-                    Format, NameCounter, SegCount, SegSize    --&gt; UInt32
+		<b>2.1 and above:</b>
+                    Segments    --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> &gt;<sup>SegCount</sup>, IsCompoundFile
                 </p>
                                                 <p>
-                    Version --&gt; UInt64
+                    Format, NameCounter, SegCount, SegSize, NumField    --&gt; Int32
+                </p>
+                                                <p>
+                    Version, DelGen, NormGen --&gt; Int64
                 </p>
                                                 <p>
                     SegName    --&gt; String
                 </p>
                                                 <p>
-                    Format is -1 in Lucene 1.4.
+                    IsCompoundFile    --&gt; Int8
+                </p>
+                                                <p>
+                    Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
                 </p>
                                                 <p>
                     Version counts how often the index has been
@@ -1114,6 +1164,35 @@
                                                 <p>
                     SegSize is the number of documents contained in the segment index.
                 </p>
+                                                <p>
+                    DelGen is the generation count of the separate
+                    deletes file.  If this is -1, there are no
+                    separate deletes.  If it is 0, this is a pre-2.1
+                    segment and you must check filesystem for the
+                    existence of _X.del.  Anything above zero means
+                    there are separate deletes (_X_N.del).
+                </p>
+                                                <p>
+                    NumField is the size of the array for NormGen, or
+                    -1 if there are no NormGens stored.
+                </p>
+                                                <p>
+                    NormGen records the generation of the separate
+                    norms files.  If NumField is -1, there are no
+                    normGens stored and they are all assumed to be 0
+                    when the segment file was written pre-2.1 and all
+                    assumed to be -1 when the segments file is 2.1 or
+                    above.  The generation then has the same meaning
+                    as delGen (above).
+                </p>
+                                                <p>
+                    IsCompoundFile records whether the segment is
+                    written as a compound file or not.  If this is -1,
+                    the segment is not a compound file.  If it is 1,
+                    the segment is a compound file.  Else it is 0,
+                    which means we check filesystem to see if _X.cfs
+                    exists.
+                </p>
                             </blockquote>
       </td></tr>
       <tr><td><br/></td></tr>
@@ -1121,42 +1200,31 @@
                                                     <table border="0" cellspacing="0" cellpadding="2" width="100%">
       <tr><td bgcolor="#828DA6">
         <font color="#ffffff" face="arial,helvetica,sanserif">
-          <a name="Lock Files"><strong>Lock Files</strong></a>
+          <a name="Lock File"><strong>Lock File</strong></a>
         </font>
       </td></tr>
       <tr><td>
         <blockquote>
                                     <p>
-                    Several files are used to indicate that another
-                    process is using an index.  Note that these files are not
+                    A write lock is used to indicate that another
+                    process is writing to the index.  Note that this file is not
                     stored in the index directory itself, but rather in the
                     system's temporary directory, as indicated in the Java
                     system property "java.io.tmpdir".
                 </p>
-                                                <ul>
-                    <li>
-                        <p>
-                            When a file named "commit.lock"
-                            is present, a process is currently re-writing the "segments"
-                            file and deleting outdated segment index files, or a process is
-                            reading the "segments"
-                            file and opening the files of the segments it names.  This lock file
-                            prevents files from being deleted by another process after a process
-                            has read the "segments"
-                            file but before it has managed to open all of the files of the
-                            segments named therein.
-                        </p>
-                    </li>
-
-                    <li>
-                        <p>
-                            When a file named "write.lock"
-                            is present, a process is currently adding documents to an index, or
-                            removing files from that index.  This lock file prevents several
-                            processes from attempting to modify an index at the same time.
-                        </p>
-                    </li>
-                </ul>
+                                                <p>
+                    The write lock is named "XXXX-write.lock" where
+                    XXXX is typically a unique prefix computed by the
+                    directory path to the index.  When this file is
+                    present, a process is currently adding documents
+                    to an index, or removing files from that index.
+                    This lock file prevents several processes from
+                    attempting to modify an index at the same time.
+                </p>
+                                                <p>
+                    Note that prior to version 2.1, Lucene also used a
+                    commit lock.  This was removed in 2.1.
+		</p>
                             </blockquote>
       </td></tr>
       <tr><td><br/></td></tr>
@@ -1170,20 +1238,11 @@
       <tr><td>
         <blockquote>
                                     <p>
-                    A file named "deletable"
-                    contains the names of files that are no longer used by the index, but
-                    which could not be deleted.  This is only used on Win32, where a
-                    file may not be deleted while it is still open. On other platforms
-                    the file contains only null bytes.
-                </p>
-                                                <p>
-                    Deletable    --&gt; DeletableCount,
-                    &lt;DelableName&gt;<sup>DeletableCount</sup>
-                </p>
-                                                <p>DeletableCount    --&gt; UInt32
-                </p>
-                                                <p>DeletableName    --&gt;
-                    String
+                    Prior to Lucene 2.1 there was a file "deletable"
+                    that contained details about files that need to be
+                    deleted.  As of 2.1, a writer dynamically computes
+                    the files that are deletable, instead, so no file
+                    is written.
                 </p>
                             </blockquote>
       </td></tr>
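
The segments.gen layout described in the fileformats.html change above (an Int32 header equal to -2, then the generation as an Int64, written twice) can be sketched as follows. This is only an illustration under stated assumptions, not the code from this commit: the class SegmentsGenSketch and its encode/decode helpers are hypothetical, big-endian byte order is assumed, java.nio.ByteBuffer stands in for Lucene's own output classes, and returning -1 for untrusted content is a convention chosen for the example. Treating two differing copies as "cannot be trusted" is likewise an assumption about why the value is written twice.

```java
import java.nio.ByteBuffer;

public class SegmentsGenSketch {
  // Value the text above gives for SegmentInfos.FORMAT_LOCKLESS.
  static final int FORMAT_LOCKLESS = -2;
  // Int32 header + Int64 generation + Int64 generation (second copy).
  static final int LENGTH = 4 + 8 + 8;

  /** Hypothetical helper: header, then the generation written twice. */
  static byte[] encode(long generation) {
    return ByteBuffer.allocate(LENGTH)   // ByteBuffer is big-endian by default
        .putInt(FORMAT_LOCKLESS)
        .putLong(generation)
        .putLong(generation)
        .array();
  }

  /** Hypothetical helper: returns the generation, or -1 if the content looks unusable. */
  static long decode(byte[] content) {
    if (content.length < LENGTH) {
      return -1;                         // truncated file
    }
    ByteBuffer in = ByteBuffer.wrap(content);
    if (in.getInt() != FORMAT_LOCKLESS) {
      return -1;                         // unknown format header
    }
    long first = in.getLong();
    long second = in.getLong();
    // Assumption: copies that differ indicate a partially written file.
    return first == second ? first : -1;
  }

  public static void main(String[] args) {
    byte[] content = encode(37L);
    System.out.println(content.length + " bytes, generation " + decode(content));
    content[LENGTH - 1] ^= 1;            // damage the second copy
    System.out.println("after damage: " + decode(content));
  }
}
```

A reader that gets -1 back would presumably rely on the directory listing alone, since the text describes segments.gen only as a fallback.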

Added: lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileDeleter.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileDeleter.java?view=auto&rev=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileDeleter.java (added)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileDeleter.java Fri Nov 17 15:18:47 2006
@@ -0,0 +1,219 @@
+package org.apache.lucene.index;
+
+import org.apache.lucene.index.IndexFileNames;
+import org.apache.lucene.index.IndexFileNameFilter;
+import org.apache.lucene.index.SegmentInfos;
+import org.apache.lucene.store.Directory;
+
+import java.io.IOException;
+import java.io.PrintStream;
+import java.util.Vector;
+import java.util.HashMap;
+
+/**
+ * A utility class (used by both IndexReader and
+ * IndexWriter) to keep track of files that need to be
+ * deleted because they are no longer referenced by the
+ * index.
+ */
+public class IndexFileDeleter {
+  private Vector deletable;
+  private Vector pending;
+  private Directory directory;
+  private SegmentInfos segmentInfos;
+  private PrintStream infoStream;
+
+  public IndexFileDeleter(SegmentInfos segmentInfos, Directory directory)
+    throws IOException {
+    this.segmentInfos = segmentInfos;
+    this.directory = directory;
+  }
+
+  void setInfoStream(PrintStream infoStream) {
+    this.infoStream = infoStream;
+  }
+
+  /** Determine index files that are no longer referenced
+   * and therefore should be deleted.  This is called once
+   * (by the writer), and then subsequently we add onto
+   * deletable any files that are no longer needed at the
+   * point that we create the unused file (eg when merging
+   * segments), and we only remove from deletable when a
+   * file is successfully deleted.
+   */
+
+  public void findDeletableFiles() throws IOException {
+
+    // Gather all "current" segments:
+    HashMap current = new HashMap();
+    for(int j=0;j<segmentInfos.size();j++) {
+      SegmentInfo segmentInfo = (SegmentInfo) segmentInfos.elementAt(j);
+      current.put(segmentInfo.name, segmentInfo);
+    }
+
+    // Then go through all files in the Directory that are
+    // Lucene index files, and add to deletable if they are
+    // not referenced by the current segments info:
+
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    IndexFileNameFilter filter = IndexFileNameFilter.getFilter();
+
+    String[] files = directory.list();
+
+    for (int i = 0; i < files.length; i++) {
+
+      if (filter.accept(null, files[i]) && !files[i].equals(segmentsInfosFileName) && !files[i].equals(IndexFileNames.SEGMENTS_GEN)) {
+
+        String segmentName;
+        String extension;
+
+        // First remove any extension:
+        int loc = files[i].indexOf('.');
+        if (loc != -1) {
+          extension = files[i].substring(1+loc);
+          segmentName = files[i].substring(0, loc);
+        } else {
+          extension = null;
+          segmentName = files[i];
+        }
+
+        // Then, remove any generation count:
+        loc = segmentName.indexOf('_', 1);
+        if (loc != -1) {
+          segmentName = segmentName.substring(0, loc);
+        }
+
+        // Delete this file if it's not a "current" segment,
+        // or, it is a single index file but there is now a
+        // corresponding compound file:
+        boolean doDelete = false;
+
+        if (!current.containsKey(segmentName)) {
+          // Delete if segment is not referenced:
+          doDelete = true;
+        } else {
+          // OK, segment is referenced, but file may still
+          // be orphan'd:
+          SegmentInfo info = (SegmentInfo) current.get(segmentName);
+
+          if (filter.isCFSFile(files[i]) && info.getUseCompoundFile()) {
+            // This file is in fact stored in a CFS file for
+            // this segment:
+            doDelete = true;
+          } else {
+            
+            if ("del".equals(extension)) {
+              // This is a _segmentName_N.del file:
+              if (!files[i].equals(info.getDelFileName())) {
+                // If this is a separate .del file, but it
+                // doesn't match the current del filename for
+                // this segment, then delete it:
+                doDelete = true;
+              }
+            } else if (extension != null && extension.startsWith("s") && extension.matches("s\\d+")) {
+              int field = Integer.parseInt(extension.substring(1));
+              // This is a _segmentName_N.sX file:
+              if (!files[i].equals(info.getNormFileName(field))) {
+                // This is an orphan'd separate norms file:
+                doDelete = true;
+              }
+            }
+          }
+        }
+
+        if (doDelete) {
+          addDeletableFile(files[i]);
+          if (infoStream != null) {
+            infoStream.println("IndexFileDeleter: file \"" + files[i] + "\" is unreferenced in index and will be deleted on next commit");
+          }
+        }
+      }
+    }
+  }
+
+  /*
+   * Some operating systems (e.g. Windows) don't permit a file to be deleted
+   * while it is opened for read (e.g. by another process or thread). So we
+   * assume that when a delete fails it is because the file is open in another
+   * process, and queue the file for subsequent deletion.
+   */
+
+  public final void deleteSegments(Vector segments) throws IOException {
+
+    deleteFiles();                                // try to delete files that we couldn't before
+
+    for (int i = 0; i < segments.size(); i++) {
+      SegmentReader reader = (SegmentReader)segments.elementAt(i);
+      if (reader.directory() == this.directory)
+        deleteFiles(reader.files()); // try to delete our files
+      else
+        deleteFiles(reader.files(), reader.directory()); // delete other files
+    }
+  }
+  
+  public final void deleteFiles(Vector files, Directory directory)
+       throws IOException {
+    for (int i = 0; i < files.size(); i++)
+      directory.deleteFile((String)files.elementAt(i));
+  }
+
+  public final void deleteFiles(Vector files)
+       throws IOException {
+    deleteFiles();                                // try to delete files that we couldn't before
+    for (int i = 0; i < files.size(); i++) {
+      deleteFile((String) files.elementAt(i));
+    }
+  }
+
+  public final void deleteFile(String file)
+       throws IOException {
+    try {
+      directory.deleteFile(file);		  // try to delete each file
+    } catch (IOException e) {			  // if delete fails
+      if (directory.fileExists(file)) {
+        if (infoStream != null)
+          infoStream.println("IndexFileDeleter: unable to remove file \"" + file + "\": " + e.toString() + "; Will re-try later.");
+        addDeletableFile(file);                  // add to deletable
+      }
+    }
+  }
+
+  final void clearPendingFiles() {
+    pending = null;
+  }
+
+  final void addPendingFile(String fileName) {
+    if (pending == null) {
+      pending = new Vector();
+    }
+    pending.addElement(fileName);
+  }
+
+  final void commitPendingFiles() {
+    if (pending != null) {
+      if (deletable == null) {
+        deletable = pending;
+        pending = null;
+      } else {
+        deletable.addAll(pending);
+        pending = null;
+      }
+    }
+  }
+
+  public final void addDeletableFile(String fileName) {
+    if (deletable == null) {
+      deletable = new Vector();
+    }
+    deletable.addElement(fileName);
+  }
+
+  public final void deleteFiles()
+    throws IOException {
+    if (deletable != null) {
+      Vector oldDeletable = deletable;
+      deletable = null;
+      deleteFiles(oldDeletable); // try to delete deletable
+    }
+  }
+}
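
The file-name handling inside findDeletableFiles above (strip the extension at the first '.', then strip any generation suffix at the first '_' after position 0) can be tried in isolation. The sketch below copies just those two steps into a hypothetical class, DeleterNameParsing, with a hypothetical split helper; it is meant only to show which segment name a given file maps to, and it does not touch a Directory.

```java
public class DeleterNameParsing {

  /**
   * Hypothetical helper mirroring the two parsing steps in
   * findDeletableFiles.  Returns {segmentName, extension};
   * extension is null when the name contains no '.'.
   */
  static String[] split(String fileName) {
    String segmentName;
    String extension;

    // First remove any extension:
    int loc = fileName.indexOf('.');
    if (loc != -1) {
      extension = fileName.substring(1 + loc);
      segmentName = fileName.substring(0, loc);
    } else {
      extension = null;
      segmentName = fileName;
    }

    // Then remove any generation count.  The search starts at
    // index 1 so a leading '_' in the segment name is skipped:
    loc = segmentName.indexOf('_', 1);
    if (loc != -1) {
      segmentName = segmentName.substring(0, loc);
    }
    return new String[] { segmentName, extension };
  }

  public static void main(String[] args) {
    String[] names = { "_a.cfs", "_a_2.del", "_a_1.s3", "segments_4" };
    for (int i = 0; i < names.length; i++) {
      String[] parts = split(names[i]);
      System.out.println(names[i] + " -> segment=" + parts[0] + ", extension=" + parts[1]);
    }
  }
}
```

Note that a name such as segments_4 reduces to "segments". Assuming no live segment is ever named "segments", such a file fails the containsKey check and is queued for deletion unless it is the current segments file, which the loop skips before parsing.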

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNameFilter.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNameFilter.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNameFilter.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNameFilter.java Fri Nov 17 15:18:47 2006
@@ -19,6 +19,7 @@
 
 import java.io.File;
 import java.io.FilenameFilter;
+import java.util.HashSet;
 
 /**
  * Filename filter that accept filenames and extensions only created by Lucene.
@@ -28,18 +29,64 @@
  */
 public class IndexFileNameFilter implements FilenameFilter {
 
+  static IndexFileNameFilter singleton = new IndexFileNameFilter();
+  private HashSet extensions;
+
+  public IndexFileNameFilter() {
+    extensions = new HashSet();
+    for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
+      extensions.add(IndexFileNames.INDEX_EXTENSIONS[i]);
+    }
+  }
+
   /* (non-Javadoc)
    * @see java.io.FilenameFilter#accept(java.io.File, java.lang.String)
    */
   public boolean accept(File dir, String name) {
-    for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
-      if (name.endsWith("."+IndexFileNames.INDEX_EXTENSIONS[i]))
+    int i = name.lastIndexOf('.');
+    if (i != -1) {
+      String extension = name.substring(1+i);
+      if (extensions.contains(extension)) {
+        return true;
+      } else if (extension.startsWith("f") &&
+                 extension.matches("f\\d+")) {
+        return true;
+      } else if (extension.startsWith("s") &&
+                 extension.matches("s\\d+")) {
         return true;
+      }
+    } else {
+      if (name.equals(IndexFileNames.DELETABLE)) return true;
+      else if (name.startsWith(IndexFileNames.SEGMENTS)) return true;
     }
-    if (name.equals(IndexFileNames.DELETABLE)) return true;
-    else if (name.equals(IndexFileNames.SEGMENTS)) return true;
-    else if (name.matches(".+\\.f\\d+")) return true;
     return false;
   }
 
+  /**
+   * Returns true if this is a file that would be contained
+   * in a CFS file.  This function should only be called on
+   * files that pass the above "accept" (ie, are already
+   * known to be a Lucene index file).
+   */
+  public boolean isCFSFile(String name) {
+    int i = name.lastIndexOf('.');
+    if (i != -1) {
+      String extension = name.substring(1+i);
+      if (extensions.contains(extension) &&
+           !extension.equals("del") &&
+           !extension.equals("gen") &&
+          !extension.equals("cfs")) {
+        return true;
+      }
+      if (extension.startsWith("f") &&
+          extension.matches("f\\d+")) {
+        return true;
+      }
+    }
+    return false;
+  }
+
+  public static IndexFileNameFilter getFilter() {
+    return singleton;
+  }
 }
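
To see which names the revised accept and the new isCFSFile let through, the rules from the hunk above can be restated in a small standalone class. This is a sketch, not the committed class: FilterSketch is a hypothetical name, the extension list is copied from IndexFileNames.INDEX_EXTENSIONS as modified in this commit, the literals "deletable" and "segments" replace the IndexFileNames constants, and generics are used for brevity although the committed code does not use them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FilterSketch {
  // Copied from INDEX_EXTENSIONS as changed in this commit.
  static final Set<String> EXTENSIONS = new HashSet<String>(Arrays.asList(
      "cfs", "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del",
      "tvx", "tvd", "tvf", "tvp", "gen"));

  /** Same decision as accept(File, String) above; the directory argument is unused there. */
  static boolean accept(String name) {
    int i = name.lastIndexOf('.');
    if (i != -1) {
      String extension = name.substring(1 + i);
      return EXTENSIONS.contains(extension)
          || extension.matches("f\\d+")    // per-field norms, .fN
          || extension.matches("s\\d+");   // separate norms, .sN
    }
    // No extension: the "deletable" file or any segments / segments_N file.
    return name.equals("deletable") || name.startsWith("segments");
  }

  /** Same decision as isCFSFile above; only meaningful for accepted names. */
  static boolean isCFSFile(String name) {
    int i = name.lastIndexOf('.');
    if (i == -1) {
      return false;
    }
    String extension = name.substring(1 + i);
    if (EXTENSIONS.contains(extension)
        && !extension.equals("del")
        && !extension.equals("gen")
        && !extension.equals("cfs")) {
      return true;
    }
    return extension.matches("f\\d+");
  }

  public static void main(String[] args) {
    String[] names = { "segments_2", "segments.gen", "_a.fnm", "_a_1.s2", "_a_1.del", "notes.txt" };
    for (int i = 0; i < names.length; i++) {
      System.out.println(names[i] + ": accept=" + accept(names[i]) + ", isCFSFile=" + isCFSFile(names[i]));
    }
  }
}
```

In this restatement, separate norms (.sN) and deletes (.del) are accepted as index files but never reported as CFS content, which matches how IndexFileDeleter handles them in separate branches.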

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNames.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNames.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNames.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexFileNames.java Fri Nov 17 15:18:47 2006
@@ -27,19 +27,25 @@
 
   /** Name of the index segment file */
   static final String SEGMENTS = "segments";
+
+  /** Name of the generation reference file */
+  static final String SEGMENTS_GEN = "segments.gen";
   
-  /** Name of the index deletable file */
+  /** Name of the index deletable file (only used in
+   * pre-lockless indices) */
   static final String DELETABLE = "deletable";
-  
+   
   /**
-   * This array contains all filename extensions used by Lucene's index files, with
-   * one exception, namely the extension made up from <code>.f</code> + a number.
-   * Also note that two of Lucene's files (<code>deletable</code> and
-   * <code>segments</code>) don't have any filename extension.
+   * This array contains all filename extensions used by
+   * Lucene's index files, with two exceptions, namely the
+   * extension made up from <code>.f</code> + a number and
+   * from <code>.s</code> + a number.  Also note that
+   * Lucene's <code>segments_N</code> files do not have any
+   * filename extension.
    */
   static final String INDEX_EXTENSIONS[] = new String[] {
       "cfs", "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del",
-      "tvx", "tvd", "tvf", "tvp" };
+      "tvx", "tvd", "tvf", "tvp", "gen"};
   
   /** File extensions of old-style index files */
   static final String COMPOUND_EXTENSIONS[] = new String[] {
@@ -50,5 +56,24 @@
   static final String VECTOR_EXTENSIONS[] = new String[] {
     "tvx", "tvd", "tvf"
   };
-  
+
+  /**
+   * Computes the full file name from base, extension and
+   * generation.  If the generation is -1, the file name is
+   * null.  If it's 0, the file name is <base><extension>.
+   * If it's > 0, the file name is <base>_<generation><extension>.
+   *
+   * @param base -- main part of the file name
+   * @param extension -- extension of the filename (including .)
+   * @param gen -- generation
+   */
+  public static final String fileNameFromGeneration(String base, String extension, long gen) {
+    if (gen == -1) {
+      return null;
+    } else if (gen == 0) {
+      return base + extension;
+    } else {
+      return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
+    }
+  }
 }
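
The three cases of fileNameFromGeneration above (-1, 0, and a positive generation rendered in base 36) are easy to check with a standalone copy. In the sketch below the method body is the one from this hunk; the enclosing class GenerationNames and the inverse helper parseGeneration are hypothetical and exist only for the example.

```java
public class GenerationNames {

  /** Copy of the method added to IndexFileNames in this commit. */
  static String fileNameFromGeneration(String base, String extension, long gen) {
    if (gen == -1) {
      return null;
    } else if (gen == 0) {
      return base + extension;
    } else {
      return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
    }
  }

  /**
   * Hypothetical inverse for segments files: "segments" maps to 0,
   * "segments_N" maps to N parsed as base 36.
   */
  static long parseGeneration(String segmentsFileName) {
    if (segmentsFileName.equals("segments")) {
      return 0;
    }
    String prefix = "segments_";
    if (!segmentsFileName.startsWith(prefix)) {
      throw new IllegalArgumentException("not a segments file: " + segmentsFileName);
    }
    return Long.parseLong(segmentsFileName.substring(prefix.length()), Character.MAX_RADIX);
  }

  public static void main(String[] args) {
    System.out.println(fileNameFromGeneration("_a", ".del", -1));
    System.out.println(fileNameFromGeneration("_a", ".del", 0));
    System.out.println(fileNameFromGeneration("_a", ".del", 36));
    System.out.println(fileNameFromGeneration("segments", "", 71));
    System.out.println(parseGeneration("segments_1z"));
  }
}
```

Because Character.MAX_RADIX is 36, generation 36 is rendered as "10" and generation 71 as "1z", so file names stay short even after many commits.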

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java Fri Nov 17 15:18:47 2006
@@ -113,6 +113,7 @@
   private Directory directory;
   private boolean directoryOwner;
   private boolean closeDirectory;
+  protected IndexFileDeleter deleter;
 
   private SegmentInfos segmentInfos;
   private Lock writeLock;
@@ -138,24 +139,40 @@
   }
 
   private static IndexReader open(final Directory directory, final boolean closeDirectory) throws IOException {
-    synchronized (directory) {			  // in- & inter-process sync
-      return (IndexReader)new Lock.With(
-          directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-          IndexWriter.COMMIT_LOCK_TIMEOUT) {
-          public Object doBody() throws IOException {
-            SegmentInfos infos = new SegmentInfos();
-            infos.read(directory);
-            if (infos.size() == 1) {		  // index is optimized
-              return SegmentReader.get(infos, infos.info(0), closeDirectory);
-            }
-            IndexReader[] readers = new IndexReader[infos.size()];
-            for (int i = 0; i < infos.size(); i++)
-              readers[i] = SegmentReader.get(infos.info(i));
-            return new MultiReader(directory, infos, closeDirectory, readers);
 
+    return (IndexReader) new SegmentInfos.FindSegmentsFile(directory) {
+
+      public Object doBody(String segmentFileName) throws IOException {
+
+        SegmentInfos infos = new SegmentInfos();
+        infos.read(directory, segmentFileName);
+
+        if (infos.size() == 1) {		  // index is optimized
+          return SegmentReader.get(infos, infos.info(0), closeDirectory);
+        } else {
+
+          // To reduce the chance of hitting FileNotFound
+          // (and having to retry), we open segments in
+          // reverse because IndexWriter merges & deletes
+          // the newest segments first.
+
+          IndexReader[] readers = new IndexReader[infos.size()];
+          for (int i = infos.size()-1; i >= 0; i--) {
+            try {
+              readers[i] = SegmentReader.get(infos.info(i));
+            } catch (IOException e) {
+              // Close all readers we had opened:
+              for(i++;i<infos.size();i++) {
+                readers[i].close();
+              }
+              throw e;
+            }
           }
-        }.run();
-    }
+
+          return new MultiReader(directory, infos, closeDirectory, readers);
+        }
+      }
+    }.run();
   }
 
   /** Returns the directory this index resides in. */
@@ -175,8 +192,12 @@
    * Do not use this to check whether the reader is still up-to-date, use
    * {@link #isCurrent()} instead. 
    */
-  public static long lastModified(File directory) throws IOException {
-    return FSDirectory.fileModified(directory, IndexFileNames.SEGMENTS);
+  public static long lastModified(final File fileDirectory) throws IOException {
+    return ((Long) new SegmentInfos.FindSegmentsFile(fileDirectory) {
+        public Object doBody(String segmentFileName) {
+          return new Long(FSDirectory.fileModified(fileDirectory, segmentFileName));
+        }
+      }.run()).longValue();
   }
 
   /**
@@ -184,8 +205,12 @@
    * Do not use this to check whether the reader is still up-to-date, use
    * {@link #isCurrent()} instead. 
    */
-  public static long lastModified(Directory directory) throws IOException {
-    return directory.fileModified(IndexFileNames.SEGMENTS);
+  public static long lastModified(final Directory directory2) throws IOException {
+    return ((Long) new SegmentInfos.FindSegmentsFile(directory2) {
+        public Object doBody(String segmentFileName) throws IOException {
+          return new Long(directory2.fileModified(segmentFileName));
+        }
+      }.run()).longValue();
   }
 
   /**
@@ -227,21 +252,7 @@
    * @throws IOException if segments file cannot be read.
    */
   public static long getCurrentVersion(Directory directory) throws IOException {
-    synchronized (directory) {                 // in- & inter-process sync
-      Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
-
-      boolean locked=false;
-
-      try {
-         locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
-
-         return SegmentInfos.readCurrentVersion(directory);
-      } finally {
-        if (locked) {
-          commitLock.release();
-        }
-      }
-    }
+    return SegmentInfos.readCurrentVersion(directory);
   }
 
   /**
@@ -259,21 +270,7 @@
    * @throws IOException
    */
   public boolean isCurrent() throws IOException {
-    synchronized (directory) {                 // in- & inter-process sync
-      Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
-
-      boolean locked=false;
-
-      try {
-         locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
-
-         return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
-      } finally {
-        if (locked) {
-          commitLock.release();
-        }
-      }
-    }
+    return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
   }
 
   /**
@@ -319,7 +316,7 @@
    * @return <code>true</code> if an index exists; <code>false</code> otherwise
    */
   public static boolean indexExists(String directory) {
-    return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+    return indexExists(new File(directory));
   }
 
   /**
@@ -328,8 +325,9 @@
    * @param  directory the directory to check for an index
    * @return <code>true</code> if an index exists; <code>false</code> otherwise
    */
+
   public static boolean indexExists(File directory) {
-    return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+    return SegmentInfos.getCurrentSegmentGeneration(directory.list()) != -1;
   }
 
   /**
@@ -340,7 +338,7 @@
    * @throws IOException if there is a problem with accessing the index
    */
   public static boolean indexExists(Directory directory) throws IOException {
-    return directory.fileExists(IndexFileNames.SEGMENTS);
+    return SegmentInfos.getCurrentSegmentGeneration(directory) != -1;
   }
 
   /** Returns the number of documents in this index. */
@@ -592,17 +590,22 @@
    */
   protected final synchronized void commit() throws IOException{
     if(hasChanges){
+      if (deleter == null) {
+        // In the MultiReader case, we share this deleter
+        // across all SegmentReaders:
+        setDeleter(new IndexFileDeleter(segmentInfos, directory));
+        deleter.deleteFiles();
+      }
       if(directoryOwner){
-        synchronized (directory) {      // in- & inter-process sync
-           new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-                   IndexWriter.COMMIT_LOCK_TIMEOUT) {
-             public Object doBody() throws IOException {
-               doCommit();
-               segmentInfos.write(directory);
-               return null;
-             }
-           }.run();
-         }
+        deleter.clearPendingFiles();
+        doCommit();
+        String oldInfoFileName = segmentInfos.getCurrentSegmentFileName();
+        segmentInfos.write(directory);
+        // Attempt to delete all files we just obsoleted:
+
+        deleter.deleteFile(oldInfoFileName);
+        deleter.commitPendingFiles();
+        deleter.deleteFiles();
         if (writeLock != null) {
           writeLock.release();  // release write lock
           writeLock = null;
@@ -614,6 +617,13 @@
     hasChanges = false;
   }
 
+  protected void setDeleter(IndexFileDeleter deleter) {
+    this.deleter = deleter;
+  }
+  protected IndexFileDeleter getDeleter() {
+    return deleter;
+  }
+
   /** Implements commit. */
   protected abstract void doCommit() throws IOException;
 
@@ -658,8 +668,7 @@
    */
   public static boolean isLocked(Directory directory) throws IOException {
     return
-            directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked() ||
-            directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).isLocked();
+      directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked();
   }
 
   /**
@@ -684,7 +693,6 @@
    */
   public static void unlock(Directory directory) throws IOException {
     directory.makeLock(IndexWriter.WRITE_LOCK_NAME).release();
-    directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).release();
   }
 
   /**
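
Editor's sketch (not part of the commit): the new IndexReader.open body above opens segments from last to first and, if one open fails, closes the readers it had already opened before rethrowing. The stand-alone Java below illustrates only that cleanup pattern. The names ReverseOpenSketch, Opener and openAll are hypothetical, and Opener merely stands in for SegmentReader.get(infos.info(i)).

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ReverseOpenSketch {

  /** Hypothetical factory standing in for SegmentReader.get(infos.info(i)). */
  interface Opener {
    Closeable open(int index) throws IOException;
  }

  /** Opens slots n-1..0; on failure closes the slots already opened and rethrows. */
  static Closeable[] openAll(int n, Opener opener) throws IOException {
    Closeable[] readers = new Closeable[n];
    for (int i = n - 1; i >= 0; i--) {
      try {
        readers[i] = opener.open(i);
      } catch (IOException e) {
        // Slots i+1..n-1 were filled on earlier iterations; release them.
        for (int j = i + 1; j < n; j++) {
          readers[j].close();
        }
        throw e;
      }
    }
    return readers;
  }

  public static void main(String[] args) {
    final List<String> log = new ArrayList<String>();
    try {
      openAll(4, new Opener() {
        public Closeable open(final int index) throws IOException {
          if (index == 1) {
            throw new IOException("segment " + index + " vanished");
          }
          log.add("open " + index);
          return new Closeable() {
            public void close() {
              log.add("close " + index);
            }
          };
        }
      });
    } catch (IOException e) {
      log.add("failed: " + e.getMessage());
    }
    System.out.println(log);
  }
}
```

Running main prints [open 3, open 2, close 2, close 3, failed: segment 1 vanished]. In the commit itself the rethrown exception presumably reaches FindSegmentsFile, which is not shown in this part of the diff.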

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java Fri Nov 17 15:18:47 2006
@@ -67,16 +67,7 @@
 
   private long writeLockTimeout = WRITE_LOCK_TIMEOUT;
 
-  /**
-   * Default value for the commit lock timeout (10,000).
-   * @see #setDefaultCommitLockTimeout
-   */
-  public static long COMMIT_LOCK_TIMEOUT = 10000;
-
-  private long commitLockTimeout = COMMIT_LOCK_TIMEOUT;
-
   public static final String WRITE_LOCK_NAME = "write.lock";
-  public static final String COMMIT_LOCK_NAME = "commit.lock";
 
   /**
    * Default value is 10. Change using {@link #setMergeFactor(int)}.
@@ -111,6 +102,7 @@
   private SegmentInfos segmentInfos = new SegmentInfos(); // the segments
   private SegmentInfos ramSegmentInfos = new SegmentInfos(); // the segments in ramDirectory
   private final Directory ramDirectory = new RAMDirectory(); // for temp segs
+  private IndexFileDeleter deleter;
 
   private Lock writeLock;
 
@@ -260,19 +252,30 @@
       this.writeLock = writeLock;                   // save it
 
       try {
-        synchronized (directory) {        // in- & inter-process sync
-          new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME), commitLockTimeout) {
-              public Object doBody() throws IOException {
-                if (create)
-                  segmentInfos.write(directory);
-                else
-                  segmentInfos.read(directory);
-                return null;
-              }
-            }.run();
+        if (create) {
+          // Try to read first.  This is to allow create
+          // against an index that's currently open for
+          // searching.  In this case we write the next
+          // segments_N file with no segments:
+          try {
+            segmentInfos.read(directory);
+            segmentInfos.clear();
+          } catch (IOException e) {
+            // Likely this means it's a fresh directory
+          }
+          segmentInfos.write(directory);
+        } else {
+          segmentInfos.read(directory);
         }
+
+        // Create a deleter to keep track of which files can
+        // be deleted:
+        deleter = new IndexFileDeleter(segmentInfos, directory);
+        deleter.setInfoStream(infoStream);
+        deleter.findDeletableFiles();
+        deleter.deleteFiles();
+
       } catch (IOException e) {
-        // the doBody method failed
         this.writeLock.release();
         this.writeLock = null;
         throw e;
@@ -381,35 +384,6 @@
   }
 
   /**
-   * Sets the maximum time to wait for a commit lock (in milliseconds) for this instance of IndexWriter.  @see
-   * @see #setDefaultCommitLockTimeout to change the default value for all instances of IndexWriter.
-   */
-  public void setCommitLockTimeout(long commitLockTimeout) {
-    this.commitLockTimeout = commitLockTimeout;
-  }
-
-  /**
-   * @see #setCommitLockTimeout
-   */
-  public long getCommitLockTimeout() {
-    return commitLockTimeout;
-  }
-
-  /**
-   * Sets the default (for any instance of IndexWriter) maximum time to wait for a commit lock (in milliseconds)
-   */
-  public static void setDefaultCommitLockTimeout(long commitLockTimeout) {
-    IndexWriter.COMMIT_LOCK_TIMEOUT = commitLockTimeout;
-  }
-
-  /**
-   * @see #setDefaultCommitLockTimeout
-   */
-  public static long getDefaultCommitLockTimeout() {
-    return IndexWriter.COMMIT_LOCK_TIMEOUT;
-  }
-
-  /**
    * Sets the maximum time to wait for a write lock (in milliseconds) for this instance of IndexWriter.  @see
    * @see #setDefaultWriteLockTimeout to change the default value for all instances of IndexWriter.
    */
@@ -517,7 +491,7 @@
     String segmentName = newRAMSegmentName();
     dw.addDocument(segmentName, doc);
     synchronized (this) {
-      ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory));
+      ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory, false));
       maybeFlushRamSegments();
     }
   }
@@ -790,36 +764,26 @@
     int docCount = merger.merge();                // merge 'em
 
     segmentInfos.setSize(0);                      // pop old infos & add new
-    segmentInfos.addElement(new SegmentInfo(mergedName, docCount, directory));
+    SegmentInfo info = new SegmentInfo(mergedName, docCount, directory, false);
+    segmentInfos.addElement(info);
 
     if(sReader != null)
         sReader.close();
 
-    synchronized (directory) {			  // in- & inter-process sync
-      new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-	  public Object doBody() throws IOException {
-	    segmentInfos.write(directory);	  // commit changes
-	    return null;
-	  }
-	}.run();
-    }
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    segmentInfos.write(directory);         // commit changes
 
-    deleteSegments(segmentsToDelete);  // delete now-unused segments
+    deleter.deleteFile(segmentsInfosFileName);    // delete old segments_N file
+    deleter.deleteSegments(segmentsToDelete);     // delete now-unused segments
 
     if (useCompoundFile) {
-      final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-      synchronized (directory) { // in- & inter-process sync
-        new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-          public Object doBody() throws IOException {
-            // make compound file visible for SegmentReaders
-            directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-            return null;
-          }
-        }.run();
-      }
+      Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
+      segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+      info.setUseCompoundFile(true);
+      segmentInfos.write(directory);     // commit again so readers know we've switched this segment to a compound file
 
-      // delete now unused files of segment
-      deleteFiles(filesToDelete);
+      deleter.deleteFile(segmentsInfosFileName);  // delete old segments_N file
+      deleter.deleteFiles(filesToDelete); // delete now unused files of segment 
     }
   }
 
@@ -937,10 +901,11 @@
    */
   private final int mergeSegments(SegmentInfos sourceSegments, int minSegment, int end)
     throws IOException {
+
     final String mergedName = newSegmentName();
     if (infoStream != null) infoStream.print("merging segments");
     SegmentMerger merger = new SegmentMerger(this, mergedName);
-
+    
     final Vector segmentsToDelete = new Vector();
     for (int i = minSegment; i < end; i++) {
       SegmentInfo si = sourceSegments.info(i);
@@ -960,7 +925,7 @@
     }
 
     SegmentInfo newSegment = new SegmentInfo(mergedName, mergedDocCount,
-        directory);
+                                             directory, false);
     if (sourceSegments == ramSegmentInfos) {
       sourceSegments.removeAllElements();
       segmentInfos.addElement(newSegment);
@@ -973,113 +938,24 @@
     // close readers before we attempt to delete now-obsolete segments
     merger.closeReaders();
 
-    synchronized (directory) {                 // in- & inter-process sync
-      new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-          public Object doBody() throws IOException {
-            segmentInfos.write(directory);     // commit before deleting
-            return null;
-          }
-        }.run();
-    }
-    
-    deleteSegments(segmentsToDelete);  // delete now-unused segments
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    segmentInfos.write(directory);     // commit before deleting
 
-    if (useCompoundFile) {
-      final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-      synchronized (directory) { // in- & inter-process sync
-        new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-          public Object doBody() throws IOException {
-            // make compound file visible for SegmentReaders
-            directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-            return null;
-          }
-        }.run();
-      }
-
-      // delete now unused files of segment 
-      deleteFiles(filesToDelete);   
-    }
-
-    return mergedDocCount;
-  }
-
-  /*
-   * Some operating systems (e.g. Windows) don't permit a file to be deleted
-   * while it is opened for read (e.g. by another process or thread). So we
-   * assume that when a delete fails it is because the file is open in another
-   * process, and queue the file for subsequent deletion.
-   */
-
-  private final void deleteSegments(Vector segments) throws IOException {
-    Vector deletable = new Vector();
-
-    deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-
-    for (int i = 0; i < segments.size(); i++) {
-      SegmentReader reader = (SegmentReader)segments.elementAt(i);
-      if (reader.directory() == this.directory)
-        deleteFiles(reader.files(), deletable);	  // try to delete our files
-      else
-        deleteFiles(reader.files(), reader.directory()); // delete other files
-    }
-
-    writeDeleteableFiles(deletable);		  // note files we can't delete
-  }
-  
-  private final void deleteFiles(Vector files) throws IOException {
-    Vector deletable = new Vector();
-    deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-    deleteFiles(files, deletable);     // try to delete our files
-    writeDeleteableFiles(deletable);        // note files we can't delete
-  }
-
-  private final void deleteFiles(Vector files, Directory directory)
-       throws IOException {
-    for (int i = 0; i < files.size(); i++)
-      directory.deleteFile((String)files.elementAt(i));
-  }
+    deleter.deleteFile(segmentsInfosFileName);    // delete old segments_N file
+    deleter.deleteSegments(segmentsToDelete);     // delete now-unused segments
 
-  private final void deleteFiles(Vector files, Vector deletable)
-       throws IOException {
-    for (int i = 0; i < files.size(); i++) {
-      String file = (String)files.elementAt(i);
-      try {
-        directory.deleteFile(file);		  // try to delete each file
-      } catch (IOException e) {			  // if delete fails
-        if (directory.fileExists(file)) {
-          if (infoStream != null)
-            infoStream.println(e.toString() + "; Will re-try later.");
-          deletable.addElement(file);		  // add to deletable
-        }
-      }
-    }
-  }
+    if (useCompoundFile) {
+      Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
 
-  private final Vector readDeleteableFiles() throws IOException {
-    Vector result = new Vector();
-    if (!directory.fileExists(IndexFileNames.DELETABLE))
-      return result;
+      segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+      newSegment.setUseCompoundFile(true);
+      segmentInfos.write(directory);     // commit again so readers know we've switched this segment to a compound file
 
-    IndexInput input = directory.openInput(IndexFileNames.DELETABLE);
-    try {
-      for (int i = input.readInt(); i > 0; i--)	  // read file names
-        result.addElement(input.readString());
-    } finally {
-      input.close();
+      deleter.deleteFile(segmentsInfosFileName);  // delete old segments_N file
+      deleter.deleteFiles(filesToDelete);  // delete now unused files of segment
     }
-    return result;
-  }
 
-  private final void writeDeleteableFiles(Vector files) throws IOException {
-    IndexOutput output = directory.createOutput("deleteable.new");
-    try {
-      output.writeInt(files.size());
-      for (int i = 0; i < files.size(); i++)
-        output.writeString((String)files.elementAt(i));
-    } finally {
-      output.close();
-    }
-    directory.renameFile("deleteable.new", IndexFileNames.DELETABLE);
+    return mergedDocCount;
   }
 
   private final boolean checkNonDecreasingLevels(int start) {
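
Editor's sketch (not part of the commit): each commit in the IndexWriter changes above remembers the current segments_N name, writes the next segments file, and only then deletes the old one. The Java below mimics that sequence on a temporary directory using java.nio.file. WriteOnceCommitSketch and its methods are hypothetical. The name mapping (a bare "segments" for generation 0, otherwise "segments_" plus the generation in radix 36) is an assumption inferred from the parsing in SegmentInfos.getCurrentSegmentGeneration, not a copy of IndexFileNames.fileNameFromGeneration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteOnceCommitSketch {
  static final String SEGMENTS = "segments";

  /** Highest generation among the names; 0 for a bare "segments"; -1 if none. */
  static long currentGeneration(String[] files) {
    long max = -1;
    if (files == null) {
      return max;
    }
    for (String file : files) {
      if (file.equals(SEGMENTS)) {
        max = Math.max(max, 0);  // pre-lockless name
      } else if (file.startsWith(SEGMENTS + "_")) {
        String suffix = file.substring(SEGMENTS.length() + 1);
        max = Math.max(max, Long.parseLong(suffix, Character.MAX_RADIX));
      }
    }
    return max;
  }

  /** Assumed naming scheme; see the note above. */
  static String fileName(long gen) {
    return gen == 0 ? SEGMENTS : SEGMENTS + "_" + Long.toString(gen, Character.MAX_RADIX);
  }

  /** Writes the next generation first, then removes the one it replaces. */
  static long commit(Path dir, String payload) throws IOException {
    long old = currentGeneration(dir.toFile().list());
    long next = Math.max(old, 0) + 1;
    Files.write(dir.resolve(fileName(next)), payload.getBytes(StandardCharsets.UTF_8));
    if (old >= 0) {
      // A reader that still holds the old file open may block this on some
      // platforms; the commit hands that case to IndexFileDeleter.
      Files.deleteIfExists(dir.resolve(fileName(old)));
    }
    return next;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("write-once");
    commit(dir, "first");
    commit(dir, "second");
    System.out.println(currentGeneration(dir.toFile().list()));
  }
}
```

Because no file name is ever rewritten, a reader that lists the directory between the write and the delete sees either the old or the new generation in full. That is the property that lets the commit lock go away.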

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java Fri Nov 17 15:18:47 2006
@@ -218,6 +218,13 @@
     return new MultiTermPositions(subReaders, starts);
   }
 
+  protected void setDeleter(IndexFileDeleter deleter) {
+    // Share deleter with our SegmentReaders:
+    this.deleter = deleter;
+    for (int i = 0; i < subReaders.length; i++)
+      subReaders[i].setDeleter(deleter);
+  }
+
   protected void doCommit() throws IOException {
     for (int i = 0; i < subReaders.length; i++)
       subReaders[i].commit();

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfo.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfo.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfo.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfo.java Fri Nov 17 15:18:47 2006
@@ -18,15 +18,302 @@
  */
 
 import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IndexOutput;
+import org.apache.lucene.store.IndexInput;
+import java.io.IOException;
 
 final class SegmentInfo {
   public String name;				  // unique name in dir
   public int docCount;				  // number of docs in seg
   public Directory dir;				  // where segment resides
 
+  private boolean preLockless;                    // true if this is a segments file written before
+                                                  // lock-less commits (XXX)
+
+  private long delGen;                            // current generation of del file; -1 if there
+                                                  // are no deletes; 0 if it's a pre-XXX segment
+                                                  // (and we must check filesystem); 1 or higher if
+                                                  // there are deletes at generation N
+   
+  private long[] normGen;                         // current generations of each field's norm file.
+                                                  // If this array is null, we must check filesystem
+                                                  // when preLockless is true.  Else,
+                                                  // there are no separate norms
+
+  private byte isCompoundFile;                    // -1 if it is not; 1 if it is; 0 if it's
+                                                  // pre-XXX (ie, must check file system to see
+                                                  // if <name>.cfs exists)         
+
   public SegmentInfo(String name, int docCount, Directory dir) {
     this.name = name;
     this.docCount = docCount;
     this.dir = dir;
+    delGen = -1;
+    isCompoundFile = 0;
+    preLockless = true;
+  }
+  public SegmentInfo(String name, int docCount, Directory dir, boolean isCompoundFile) {
+    this(name, docCount, dir);
+    if (isCompoundFile) {
+      this.isCompoundFile = 1;
+    } else {
+      this.isCompoundFile = -1;
+    }
+    preLockless = false;
+  }
+
+
+  /**
+   * Construct a new SegmentInfo instance by reading a
+   * previously saved SegmentInfo from input.
+   *
+   * @param dir directory to load from
+   * @param format format of the segments info file
+   * @param input input handle to read segment info from
+   */
+  public SegmentInfo(Directory dir, int format, IndexInput input) throws IOException {
+    this.dir = dir;
+    name = input.readString();
+    docCount = input.readInt();
+    if (format <= SegmentInfos.FORMAT_LOCKLESS) {
+      delGen = input.readLong();
+      int numNormGen = input.readInt();
+      if (numNormGen == -1) {
+        normGen = null;
+      } else {
+        normGen = new long[numNormGen];
+        for(int j=0;j<numNormGen;j++) {
+          normGen[j] = input.readLong();
+        }
+      }
+      isCompoundFile = input.readByte();
+      preLockless = isCompoundFile == 0;
+    } else {
+      delGen = 0;
+      normGen = null;
+      isCompoundFile = 0;
+      preLockless = true;
+    }
+  }
+  
+  void setNumField(int numField) {
+    if (normGen == null) {
+      // normGen is null if we loaded a pre-XXX segment
+      // file, or, if this segments file hasn't had any
+      // norms set against it yet:
+      normGen = new long[numField];
+
+      if (!preLockless) {
+        // This is a FORMAT_LOCKLESS segment, which means
+        // there are no norms:
+        for(int i=0;i<numField;i++) {
+          normGen[i] = -1;
+        }
+      }
+    }
+  }
+
+  boolean hasDeletions()
+    throws IOException {
+    // Cases:
+    //
+    //   delGen == -1: this means this segment was written
+    //     by the LOCKLESS code and for certain does not have
+    //     deletions yet
+    //
+    //   delGen == 0: this means this segment was written by
+    //     pre-LOCKLESS code which means we must check
+    //     directory to see if .del file exists
+    //
+    //   delGen > 0: this means this segment was written by
+    //     the LOCKLESS code and for certain has
+    //     deletions
+    //
+    if (delGen == -1) {
+      return false;
+    } else if (delGen > 0) {
+      return true;
+    } else {
+      return dir.fileExists(getDelFileName());
+    }
+  }
+
+  void advanceDelGen() {
+    // delGen 0 is reserved for pre-LOCKLESS format
+    if (delGen == -1) {
+      delGen = 1;
+    } else {
+      delGen++;
+    }
+  }
+
+  void clearDelGen() {
+    delGen = -1;
+  }
+
+  String getDelFileName() {
+    if (delGen == -1) {
+      // In this case we know there is no deletion filename
+      // against this segment
+      return null;
+    } else {
+      // If delGen is 0, it's the pre-lockless-commit file format
+      return IndexFileNames.fileNameFromGeneration(name, ".del", delGen);
+    }
+  }
+
+  /**
+   * Returns true if this field for this segment has saved a separate norms file (_<segment>_N.sX).
+   *
+   * @param fieldNumber the field index to check
+   */
+  boolean hasSeparateNorms(int fieldNumber)
+    throws IOException {
+    if ((normGen == null && preLockless) || (normGen != null && normGen[fieldNumber] == 0)) {
+      // Must fallback to directory file exists check:
+      String fileName = name + ".s" + fieldNumber;
+      return dir.fileExists(fileName);
+    } else if (normGen == null || normGen[fieldNumber] == -1) {
+      return false;
+    } else {
+      return true;
+    }
+  }
+
+  /**
+   * Returns true if any fields in this segment have separate norms.
+   */
+  boolean hasSeparateNorms()
+    throws IOException {
+    if (normGen == null) {
+      if (!preLockless) {
+        // This means we were created w/ LOCKLESS code and no
+        // norms are written yet:
+        return false;
+      } else {
+        // This means this segment was saved with pre-LOCKLESS
+        // code.  So we must fallback to the original
+        // directory list check:
+        String[] result = dir.list();
+        String pattern;
+        pattern = name + ".s";
+        int patternLength = pattern.length();
+        for(int i = 0; i < result.length; i++){
+          if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
+            return true;
+        }
+        return false;
+      }
+    } else {
+      // This means this segment was saved with LOCKLESS
+      // code so we first check whether any normGen's are >
+      // 0 (meaning they definitely have separate norms):
+      for(int i=0;i<normGen.length;i++) {
+        if (normGen[i] > 0) {
+          return true;
+        }
+      }
+      // Next we look for any == 0.  These cases were
+      // pre-LOCKLESS and must be checked in directory:
+      for(int i=0;i<normGen.length;i++) {
+        if (normGen[i] == 0) {
+          if (dir.fileExists(getNormFileName(i))) {
+            return true;
+          }
+        }
+      }
+    }
+
+    return false;
+  }
+
+  /**
+   * Increment the generation count for the norms file for
+   * this field.
+   *
+   * @param fieldIndex field whose norm file will be rewritten
+   */
+  void advanceNormGen(int fieldIndex) {
+    if (normGen[fieldIndex] == -1) {
+      normGen[fieldIndex] = 1;
+    } else {
+      normGen[fieldIndex]++;
+    }
+  }
+
+  /**
+   * Get the file name for the norms file for this field.
+   *
+   * @param number field index
+   */
+  String getNormFileName(int number) throws IOException {
+    String prefix;
+
+    long gen;
+    if (normGen == null) {
+      gen = 0;
+    } else {
+      gen = normGen[number];
+    }
+    
+    if (hasSeparateNorms(number)) {
+      prefix = ".s";
+      return IndexFileNames.fileNameFromGeneration(name, prefix + number, gen);
+    } else {
+      prefix = ".f";
+      return IndexFileNames.fileNameFromGeneration(name, prefix + number, 0);
+    }
+  }
+
+  /**
+   * Mark whether this segment is stored as a compound file.
+   *
+   * @param isCompoundFile true if this is a compound file;
+   * else, false
+   */
+  void setUseCompoundFile(boolean isCompoundFile) {
+    if (isCompoundFile) {
+      this.isCompoundFile = 1;
+    } else {
+      this.isCompoundFile = -1;
+    }
+  }
+
+  /**
+   * Returns true if this segment is stored as a compound
+   * file; else, false.
+   *
+   * Note that the directory is consulted only when the
+   * segment was written before version XXX (at which
+   * point compound file or not became stored in the
+   * segments info file).
+   */
+  boolean getUseCompoundFile() throws IOException {
+    if (isCompoundFile == -1) {
+      return false;
+    } else if (isCompoundFile == 1) {
+      return true;
+    } else {
+      return dir.fileExists(name + ".cfs");
+    }
+  }
+
+  /**
+   * Save this segment's info.
+   */
+  void write(IndexOutput output)
+    throws IOException {
+    output.writeString(name);
+    output.writeInt(docCount);
+    output.writeLong(delGen);
+    if (normGen == null) {
+      output.writeInt(-1);
+    } else {
+      output.writeInt(normGen.length);
+      for(int j=0;j<normGen.length;j++) {
+        output.writeLong(normGen[j]);
+      }
+    }
+    output.writeByte(isCompoundFile);
   }
 }
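
Editor's sketch (not part of the commit): the delGen field above has three states: -1 (written by lockless code, no deletions), 0 (pre-lockless segment, so the directory must be checked) and N > 0 (deletions exist at generation N). The Java below restates that decision table in isolation. DelGenSketch is hypothetical, a Set of file names stands in for Directory.fileExists, and the pre-lockless deletions file is assumed to be named <segment>.del.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DelGenSketch {
  static final long NO_DELETES = -1;  // lockless segment, certainly no deletions
  static final long CHECK_DIR = 0;    // pre-lockless segment, ask the directory

  /** Mirrors the three cases documented in SegmentInfo.hasDeletions. */
  static boolean hasDeletions(long delGen, String segment, Set<String> filesInDir) {
    if (delGen == NO_DELETES) {
      return false;
    } else if (delGen > 0) {
      return true;
    } else {
      // Assumed pre-lockless name: <segment>.del
      return filesInDir.contains(segment + ".del");
    }
  }

  /** Generation 0 is reserved, so the first lockless deletion lands on 1. */
  static long advance(long delGen) {
    return delGen == NO_DELETES ? 1 : delGen + 1;
  }

  public static void main(String[] args) {
    Set<String> dir = new HashSet<String>(Arrays.asList("_a.cfs", "_a.del"));
    System.out.println(hasDeletions(NO_DELETES, "_a", dir));
    System.out.println(hasDeletions(CHECK_DIR, "_a", dir));
    System.out.println(hasDeletions(CHECK_DIR, "_b", dir));
    System.out.println(advance(NO_DELETES));
  }
}
```

The normGen entries and the isCompoundFile byte follow the same convention: -1 means "known absent", 0 means "pre-lockless, check the file system", and a positive value means "known present".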

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java?view=diff&rev=476359&r1=476358&r2=476359
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java Fri Nov 17 15:18:47 2006
@@ -19,36 +19,151 @@
 
 import java.util.Vector;
 import java.io.IOException;
+import java.io.PrintStream;
+import java.io.File;
+import java.io.FileNotFoundException;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.IndexInput;
 import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Constants;
 
-final class SegmentInfos extends Vector {
+public final class SegmentInfos extends Vector {
   
   /** The file format version, a negative number. */
   /* Works since counter, the old 1st entry, is always >= 0 */
   public static final int FORMAT = -1;
-  
+
+  /** This is the current file format written.  It differs
+   * slightly from the previous format in that file names
+   * are never re-used (write once).  Instead, each file is
+   * written to the next generation.  For example,
+   * segments_1, segments_2, etc.  This allows us to not use
+   * a commit lock.  See <a
+   * href="http://lucene.apache.org/java/docs/fileformats.html">file
+   * formats</a> for details.
+   */
+  public static final int FORMAT_LOCKLESS = -2;
+
   public int counter = 0;    // used to name new segments
   /**
    * counts how often the index has been changed by adding or deleting docs.
    * starting with the current time in milliseconds forces to create unique version numbers.
    */
   private long version = System.currentTimeMillis();
+  private long generation = 0;             // generation of the "segments_N" file we read
+
+  /**
+   * If non-null, information about loading segments_N files
+   * will be printed here.  @see #setInfoStream.
+   */
+  private static PrintStream infoStream;
 
   public final SegmentInfo info(int i) {
     return (SegmentInfo) elementAt(i);
   }
 
-  public final void read(Directory directory) throws IOException {
-    
-    IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
+  /**
+   * Get the generation (N) of the current segments_N file
+   * from a list of files.
+   *
+   * @param files -- array of file names to check
+   */
+  public static long getCurrentSegmentGeneration(String[] files) {
+    if (files == null) {
+      return -1;
+    }
+    long max = -1;
+    int prefixLen = IndexFileNames.SEGMENTS.length()+1;
+    for (int i = 0; i < files.length; i++) {
+      String file = files[i];
+      if (file.startsWith(IndexFileNames.SEGMENTS) && !file.equals(IndexFileNames.SEGMENTS_GEN)) {
+        if (file.equals(IndexFileNames.SEGMENTS)) {
+          // Pre lock-less commits:
+          if (max == -1) {
+            max = 0;
+          }
+        } else {
+          long v = Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
+          if (v > max) {
+            max = v;
+          }
+        }
+      }
+    }
+    return max;
+  }
+
+  /**
+   * Get the generation (N) of the current segments_N file
+   * in the directory.
+   *
+   * @param directory -- directory to search for the latest segments_N file
+   */
+  public static long getCurrentSegmentGeneration(Directory directory) throws IOException {
+    String[] files = directory.list();
+    if (files == null)
+      throw new IOException("Cannot read directory " + directory);
+    return getCurrentSegmentGeneration(files);
+  }
+
+  /**
+   * Get the filename of the current segments_N file
+   * from a list of files.
+   *
+   * @param files -- array of file names to check
+   */
+
+  public static String getCurrentSegmentFileName(String[] files) throws IOException {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 getCurrentSegmentGeneration(files));
+  }
+
+  /**
+   * Get the filename of the current segments_N file
+   * in the directory.
+   *
+   * @param directory -- directory to search for the latest segments_N file
+   */
+  public static String getCurrentSegmentFileName(Directory directory) throws IOException {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 getCurrentSegmentGeneration(directory));
+  }
+
+  /**
+   * Get the segments_N filename in use by this SegmentInfos.
+   */
+  public String getCurrentSegmentFileName() {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 generation);
+  }
+
+  /**
+   * Read a particular segments file.  Note that this may
+   * throw an IOException if a commit is in progress.
+   *
+   * @param directory -- directory containing the segments file
+   * @param segmentFileName -- segment file to load
+   */
+  public final void read(Directory directory, String segmentFileName) throws IOException {
+    boolean success = false;
+
+    IndexInput input = directory.openInput(segmentFileName);
+
+    if (segmentFileName.equals(IndexFileNames.SEGMENTS)) {
+      generation = 0;
+    } else {
+      generation = Long.parseLong(segmentFileName.substring(1+IndexFileNames.SEGMENTS.length()),
+                                  Character.MAX_RADIX);
+    }
+
     try {
       int format = input.readInt();
       if(format < 0){     // file contains explicit format info
         // check that it is a format we can understand
-        if (format < FORMAT)
+        if (format < FORMAT_LOCKLESS)
           throw new IOException("Unknown format version: " + format);
         version = input.readLong(); // read version
         counter = input.readInt(); // read counter
@@ -58,9 +173,7 @@
       }
       
       for (int i = input.readInt(); i > 0; i--) { // read segmentInfos
-        SegmentInfo si =
-          new SegmentInfo(input.readString(), input.readInt(), directory);
-        addElement(si);
+        addElement(new SegmentInfo(directory, format, input));
       }
       
       if(format >= 0){    // in old format the version number may be at the end of the file
@@ -69,31 +182,71 @@
         else
           version = input.readLong(); // read version
       }
+      success = true;
     }
     finally {
       input.close();
+      if (!success) {
+        // Clear any segment infos we had loaded so we
+        // have a clean slate on retry:
+        clear();
+      }
     }
   }
+  /**
+   * This version of read uses the retry logic (for lock-less
+   * commits) to find the right segments file to load.
+   */
+  public final void read(Directory directory) throws IOException {
+
+    generation = -1;
+
+    new FindSegmentsFile(directory) {
+
+      public Object doBody(String segmentFileName) throws IOException {
+        read(directory, segmentFileName);
+        return null;
+      }
+    }.run();
+  }
 
   public final void write(Directory directory) throws IOException {
-    IndexOutput output = directory.createOutput("segments.new");
+
+    // Always advance the generation on write:
+    if (generation == -1) {
+      generation = 1;
+    } else {
+      generation++;
+    }
+
+    String segmentFileName = getCurrentSegmentFileName();
+    IndexOutput output = directory.createOutput(segmentFileName);
+
     try {
-      output.writeInt(FORMAT); // write FORMAT
-      output.writeLong(++version); // every write changes the index
+      output.writeInt(FORMAT_LOCKLESS); // write FORMAT
+      output.writeLong(++version); // every write changes
+                                   // the index
       output.writeInt(counter); // write counter
       output.writeInt(size()); // write infos
       for (int i = 0; i < size(); i++) {
         SegmentInfo si = info(i);
-        output.writeString(si.name);
-        output.writeInt(si.docCount);
+        si.write(output);
       }         
     }
     finally {
       output.close();
     }
 
-    // install new segment info
-    directory.renameFile("segments.new", IndexFileNames.SEGMENTS);
+    try {
+      output = directory.createOutput(IndexFileNames.SEGMENTS_GEN);
+      output.writeInt(FORMAT_LOCKLESS);
+      output.writeLong(generation);
+      output.writeLong(generation);
+      output.close();
+    } catch (IOException e) {
+      // It's OK if we fail to write this file since it's
+      // used only as one of the retry fallbacks.
+    }
   }
 
   /**
@@ -108,30 +261,322 @@
    */
   public static long readCurrentVersion(Directory directory)
     throws IOException {
-      
-    IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
-    int format = 0;
-    long version = 0;
-    try {
-      format = input.readInt();
-      if(format < 0){
-        if (format < FORMAT)
-          throw new IOException("Unknown format version: " + format);
-        version = input.readLong(); // read version
-      }
-    }
-    finally {
-      input.close();
-    }
+
+    return ((Long) new FindSegmentsFile(directory) {
+        public Object doBody(String segmentFileName) throws IOException {
+
+          IndexInput input = directory.openInput(segmentFileName);
+
+          int format = 0;
+          long version = 0;
+          try {
+            format = input.readInt();
+            if(format < 0){
+              if (format < FORMAT_LOCKLESS)
+                throw new IOException("Unknown format version: " + format);
+              version = input.readLong(); // read version
+            }
+          }
+          finally {
+            input.close();
+          }
      
-    if(format < 0)
-      return version;
+          if(format < 0)
+            return new Long(version);
 
-    // We cannot be sure about the format of the file.
-    // Therefore we have to read the whole file and cannot simply seek to the version entry.
+          // We cannot be sure about the format of the file.
+          // Therefore we have to read the whole file and cannot simply seek to the version entry.
+          SegmentInfos sis = new SegmentInfos();
+          sis.read(directory, segmentFileName);
+          return new Long(sis.getVersion());
+        }
+      }.run()).longValue();
+  }
 
-    SegmentInfos sis = new SegmentInfos();
-    sis.read(directory);
-    return sis.getVersion();
+  /** If non-null, information about retries when loading
+   * the segments file will be printed to this stream.
+   */
+  public static void setInfoStream(PrintStream infoStream) {
+    SegmentInfos.infoStream = infoStream;
   }
+
+  /* Advanced configuration of retry logic in loading
+     segments_N file */
+  private static int defaultGenFileRetryCount = 10;
+  private static int defaultGenFileRetryPauseMsec = 50;
+  private static int defaultGenLookaheadCount = 10;
+
+  /**
+   * Advanced: set how many times to try loading the
+   * segments.gen file contents to determine current segment
+   * generation.  This file is only referenced when the
+   * primary method (listing the directory) fails.
+   */
+  public static void setDefaultGenFileRetryCount(int count) {
+    defaultGenFileRetryCount = count;
+  }
+
+  /**
+   * @see #setDefaultGenFileRetryCount
+   */
+  public static int getDefaultGenFileRetryCount() {
+    return defaultGenFileRetryCount;
+  }
+
+  /**
+   * Advanced: set how many milliseconds to pause in between
+   * attempts to load the segments.gen file.
+   */
+  public static void setDefaultGenFileRetryPauseMsec(int msec) {
+    defaultGenFileRetryPauseMsec = msec;
+  }
+
+  /**
+   * @see #setDefaultGenFileRetryPauseMsec
+   */
+  public static int getDefaultGenFileRetryPauseMsec() {
+    return defaultGenFileRetryPauseMsec;
+  }
+
+  /**
+   * Advanced: set how many times to try incrementing the
+   * gen when loading the segments file.  This only runs if
+   * the primary (listing directory) and secondary (opening
+   * segments.gen file) methods fail to find the segments
+   * file.
+   */
+  public static void setDefaultGenLookaheadCount(int count) {
+    defaultGenLookaheadCount = count;
+  }
+  /**
+   * @see #setDefaultGenLookaheadCount
+   */
+  public static int getDefaultGenLookahedCount() {
+    return defaultGenLookaheadCount;
+  }
+
+  /**
+   * @see #setInfoStream
+   */
+  public static PrintStream getInfoStream() {
+    return infoStream;
+  }
+
+  private static void message(String message) {
+    if (infoStream != null) {
+      infoStream.println(Thread.currentThread().getName() + ": " + message);
+    }
+  }
+
+  /**
+   * Utility class for executing code that needs to do
+   * something with the current segments file.  This is
+   * necessary with lock-less commits because from the time
+   * you locate the current segments file name, until you
+   * actually open it, read its contents, or check modified
+   * time, etc., it could have been deleted due to a writer
+   * commit finishing.
+   */
+  public abstract static class FindSegmentsFile {
+    
+    File fileDirectory;
+    Directory directory;
+
+    public FindSegmentsFile(File directory) {
+      this.fileDirectory = directory;
+    }
+
+    public FindSegmentsFile(Directory directory) {
+      this.directory = directory;
+    }
+
+    public Object run() throws IOException {
+      String segmentFileName = null;
+      long lastGen = -1;
+      long gen = 0;
+      int genLookaheadCount = 0;
+      IOException exc = null;
+      boolean retry = false;
+
+      int method = 0;
+
+      // Loop until we succeed in calling doBody() without
+      // hitting an IOException.  An IOException most likely
+      // means a commit was in process and has finished, in
+      // the time it took us to load the now-old infos files
+      // (and segments files).  It's also possible it's a
+      // true error (corrupt index).  To distinguish these,
+      // on each retry we must see "forward progress" on
+      // which generation we are trying to load.  If we
+      // don't, then the original error is real and we throw
+      // it.
+      
+      // We have three methods for determining the current
+      // generation.  We try each in sequence.
+
+      while(true) {
+
+        // Method 1: list the directory and use the highest
+        // segments_N file.  This method works well as long
+        // as there is no stale caching on the directory
+        // contents:
+        String[] files = null;
+
+        if (0 == method) {
+          if (directory != null) {
+            files = directory.list();
+          } else {
+            files = fileDirectory.list();
+          }
+
+          gen = getCurrentSegmentGeneration(files);
+
+          if (gen == -1) {
+            String s = "";
+            for(int i=0;files != null && i<files.length;i++) {
+              s += " " + files[i];
+            }
+            throw new FileNotFoundException("no segments* file found: files:" + s);
+          }
+        }
+
+        // Method 2 (fallback if Method 1 isn't reliable):
+        // if the directory listing seems to be stale, then
+        // try loading the "segments.gen" file.
+        if (1 == method || (0 == method && lastGen == gen && retry)) {
+
+          method = 1;
+            
+          for(int i=0;i<defaultGenFileRetryCount;i++) {
+            IndexInput genInput = null;
+            try {
+              genInput = directory.openInput(IndexFileNames.SEGMENTS_GEN);
+            } catch (IOException e) {
+              message("segments.gen open: IOException " + e);
+            }
+            if (genInput != null) {
+
+              try {
+                int version = genInput.readInt();
+                if (version == FORMAT_LOCKLESS) {
+                  long gen0 = genInput.readLong();
+                  long gen1 = genInput.readLong();
+                  message("fallback check: " + gen0 + "; " + gen1);
+                  if (gen0 == gen1) {
+                    // The file is consistent.
+                    if (gen0 > gen) {
+                      message("fallback to '" + IndexFileNames.SEGMENTS_GEN + "' check: now try generation " + gen0 + " > " + gen);
+                      gen = gen0;
+                    }
+                    break;
+                  }
+                }
+              } catch (IOException err2) {
+                // will retry
+              } finally {
+                genInput.close();
+              }
+            }
+            try {
+              Thread.sleep(defaultGenFileRetryPauseMsec);
+            } catch (InterruptedException e) {
+              // will retry
+            }
+          }
+        }
+
+        // Method 3 (fallback if Methods 1 & 2 are not
+        // reliable): since both directory cache and file
+        // contents cache seem to be stale, just advance the
+        // generation.
+        if (2 == method || (1 == method && lastGen == gen && retry)) {
+
+          method = 2;
+
+          if (genLookaheadCount < defaultGenLookaheadCount) {
+            gen++;
+            genLookaheadCount++;
+            message("look ahead increment gen to " + gen);
+          }
+        }
+
+        if (lastGen == gen) {
+
+          // This means we're about to try the same
+          // segments_N last tried.  This is allowed,
+          // exactly once, because writer could have been in
+          // the process of writing segments_N last time.
+
+          if (retry) {
+            // OK, we've tried the same segments_N file
+            // twice in a row, so this must be a real
+            // error.  We throw the original exception we
+            // got.
+            throw exc;
+          } else {
+            retry = true;
+          }
+
+        } else {
+          // Segment file has advanced since our last loop, so
+          // reset retry:
+          retry = false;
+        }
+
+        lastGen = gen;
+
+        segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                                "",
+                                                                gen);
+
+        try {
+          Object v = doBody(segmentFileName);
+          if (exc != null) {
+            message("success on " + segmentFileName);
+          }
+          return v;
+        } catch (IOException err) {
+
+          // Save the original root cause:
+          if (exc == null) {
+            exc = err;
+          }
+
+          message("primary Exception on '" + segmentFileName + "': " + err + "; will retry: retry=" + retry + "; gen = " + gen);
+
+          if (!retry && gen > 1) {
+
+            // This is our first time trying this segments
+            // file (because retry is false), and, there is
+            // possibly a segments_(N-1) (because gen > 1).
+            // So, check if the segments_(N-1) exists and
+            // try it if so:
+            String prevSegmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                                               "",
+                                                                               gen-1);
+            
+            if (directory.fileExists(prevSegmentFileName)) {
+              message("fallback to prior segment file '" + prevSegmentFileName + "'");
+              try {
+                Object v = doBody(prevSegmentFileName);
+                if (exc != null) {
+                  message("success on fallback " + prevSegmentFileName);
+                }
+                return v;
+              } catch (IOException err2) {
+                message("secondary Exception on '" + prevSegmentFileName + "': " + err2 + "; will retry");
+              }
+            }
+          }
+        }
+      }
+    }
+
+    /**
+     * Subclass must implement this.  The assumption is an
+     * IOException will be thrown if something goes wrong
+     * during the processing that could have been caused by
+     * a writer committing.
+     */
+    protected abstract Object doBody(String segmentFileName) throws IOException;}
 }
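
Note on the generation naming used above: the following is a minimal standalone sketch, not part of this commit, of how a generation maps to and from a segments_N file name. It mirrors the parsing in getCurrentSegmentGeneration(String[]). The class name SegmentsGenerationSketch and the helper fileNameFromGeneration(long) are hypothetical; the helper only assumes that IndexFileNames.fileNameFromGeneration (not shown in this section) returns the bare "segments" name for generation 0 and appends "_" plus the base-36 generation otherwise.

```java
public class SegmentsGenerationSketch {

  // Local stand-ins for IndexFileNames.SEGMENTS and IndexFileNames.SEGMENTS_GEN.
  static final String SEGMENTS = "segments";
  static final String SEGMENTS_GEN = "segments.gen";

  // Same rule as getCurrentSegmentGeneration(String[]) in the diff:
  // the highest N among segments_N wins, a bare "segments" file counts
  // as generation 0, and segments.gen is ignored.
  static long currentGeneration(String[] files) {
    if (files == null) {
      return -1;
    }
    long max = -1;
    int prefixLen = SEGMENTS.length() + 1; // skip "segments_"
    for (String file : files) {
      if (!file.startsWith(SEGMENTS) || file.equals(SEGMENTS_GEN)) {
        continue;
      }
      long gen = file.equals(SEGMENTS)
          ? 0
          : Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
      if (gen > max) {
        max = gen;
      }
    }
    return max;
  }

  // Hypothetical helper; assumed naming rule, see the note above.
  static String fileNameFromGeneration(long gen) {
    if (gen == -1) {
      return null; // no segments file found
    }
    if (gen == 0) {
      return SEGMENTS; // pre-lockless index
    }
    return SEGMENTS + "_" + Long.toString(gen, Character.MAX_RADIX);
  }

  public static void main(String[] args) {
    String[] files = {"segments", "segments_a", "segments_z", "segments.gen", "_1.cfs"};
    long gen = currentGeneration(files);                  // "z" in base 36 is 35
    System.out.println(gen);                              // prints 35
    System.out.println(fileNameFromGeneration(gen + 1));  // prints segments_10
  }
}
```

Because generations are written in radix 36 (Character.MAX_RADIX), file names do not sort lexicographically by generation ("segments_10" follows "segments_z"), which is why the code parses every candidate name instead of sorting the listing.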