You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@lucene.apache.org by si...@apache.org on 2011/04/29 12:20:06 UTC

svn commit: r1097759 - /lucene/dev/branches/realtime_search/lucene/CHANGES.txt

Author: simonw
Date: Fri Apr 29 10:20:06 2011
New Revision: 1097759

URL: http://svn.apache.org/viewvc?rev=1097759&view=rev
Log:
LUCENE-3023: added changes.txt entry for DWPT  LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555

Modified:
    lucene/dev/branches/realtime_search/lucene/CHANGES.txt

Modified: lucene/dev/branches/realtime_search/lucene/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/dev/branches/realtime_search/lucene/CHANGES.txt?rev=1097759&r1=1097758&r2=1097759&view=diff
==============================================================================
--- lucene/dev/branches/realtime_search/lucene/CHANGES.txt (original)
+++ lucene/dev/branches/realtime_search/lucene/CHANGES.txt Fri Apr 29 10:20:06 2011
@@ -141,6 +141,7 @@ Changes in backwards compatibility polic
 * LUCENE-2315: AttributeSource's methods for accessing attributes are now final,
   else its easy to corrupt the internal states.  (Uwe Schindler)
 
+
 Changes in Runtime Behavior
 
 * LUCENE-2846: omitNorms now behaves like omitTermFrequencyAndPositions, if you
@@ -168,6 +169,70 @@ Changes in Runtime Behavior
   globally across IndexWriter sessions and persisted into a X.fnx file on
   successful commit. The corresponding file format changes are backwards-
   compatible. (Michael Busch, Simon Willnauer)
+  
+* LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from 
+  DocumentsWriterPerThread:
+
+  - IndexWriter now uses a DocumentsWriter per thread when indexing documents.
+    Each DocumentsWriterPerThread indexes documents in its own private segment,
+    and the in memory segments are no longer merged on flush.  Instead, each
+    segment is separately flushed to disk and subsequently merged with normal
+    segment merging.
+
+  - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a
+    FlushPolicy.  When a DWPT is flushed, a fresh DWPT is swapped in so that
+    indexing may continue concurrently with flushing.  The selected
+    DWPT flushes all its RAM resident documents do disk.  Note: Segment flushes
+    don't flush all RAM resident documents but only the documents private to
+    the DWPT selected for flushing. 
+  
+  - Flushing is now controlled by FlushPolicy that is called for every add,
+    update or delete on IndexWriter. By default DWPTs are flushed either on
+    maxBufferedDocs per DWPT or the global active used memory. Once the active
+    memory exceeds ramBufferSizeMB only the largest DWPT is selected for
+    flushing and the memory used by this DWPT is substracted from the active
+    memory and added to a flushing memory pool, which can lead to temporarily
+    higher memory usage due to ongoing indexing.
+    
+  - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can address
+    up to 2048 MB memory such that the ramBufferSize is now bounded by the max
+    number of DWPT avaliable in the used DocumentsWriterPerThreadPool.
+    IndexWriters net memory consumption can grow far beyond the 2048 MB limit if
+    the applicatoin can use all available DWPTs. To prevent a DWPT from
+    exhausting its address space IndexWriter will forcefully flush a DWPT if its
+    hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be controlled
+    via IndexWriterConfig and defaults to 1945 MB. 
+    Since IndexWriter flushes DWPT concurrently not all memory is released
+    immediately. Applications should still use a ramBufferSize significantly
+    lower than the JVMs avaliable heap memory since under high load multiple
+    flushing DWPT can consume substantial transient memory when IO performance
+    is slow relative to indexing rate.
+    
+  - IndexWriter#commit now doesn't block concurrent indexing while flushing all
+    'currently' RAM resident documents to disk. Yet, flushes that occur while a
+    a full flush is running are queued and will happen after all DWPT involved
+    in the full flush are done flushing. Applications using multiple threads
+    during indexing and trigger a full flush (eg call commmit() or open a new
+    NRT reader) can use significantly more transient memory.
+    
+  - IndexWriter#addDocument and IndexWriter.updateDocument can block indexing
+    threads if the number of active + number of flushing DWPT exceed a
+    safety limit. By default this happens if 2 * max number available thread
+    states (DWPTPool) is exceeded. This safety limit prevents applications from
+    exhausting their available memory if flushing can't keep up with
+    concurrently indexing threads.  
+    
+  - IndexWriter only applies and flushes deletes if the maxBufferedDelTerms
+    limit is reached during indexing. No segment flushes will be triggered
+    due to this setting.
+    
+  - IndexWriter#flush(boolean, boolean) doesn't synchronized on IndexWriter
+    anymore. A dedicated flushLock has been introduced to prevent multiple full-
+    flushes happening concurrently. 
+    
+  - DocumentsWriter doesn't write shared doc stores anymore. 
+  
+  (Mike McCandless, Michael Busch, Simon Willnauer)
 
 API Changes