Posted to commits@lucene.apache.org by bu...@apache.org on 2012/07/03 14:33:11 UTC
svn commit: r824226 [2/5] - in /websites/staging/lucene/trunk/content: ./ core/ openrelevance/ pylucene/ solr/

Modified: websites/staging/lucene/trunk/content/mainnews.html
==============================================================================
--- websites/staging/lucene/trunk/content/mainnews.html (original)
+++ websites/staging/lucene/trunk/content/mainnews.html Tue Jul  3 12:33:05 2012
@@ -191,509 +191,123 @@
       <h1 class="title"></h1>
 
       <div><h1 id="news">News</h1>
-<h2 id="12-april-2012-lucene-core-360-and-solr-360-available">12 April 2012 - Lucene Core 3.6.0 and Solr 3.6.0 Available</h2>
+<h2 id="3-july-2012-lucene-core-40-alpha-and-solr-40-alpha-available">3 July 2012 - Lucene Core 4.0-ALPHA and Solr 4.0-ALPHA Available</h2>
 <p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.6.0 and Apache Solr 3.6.0</p>
+of Apache Lucene 4.0-ALPHA and Apache Solr 4.0-ALPHA</p>
 <p>Lucene can be downloaded from <a href="http://lucene.apache.org/core/mirrors-core-latest-redir.html">http://lucene.apache.org/core/mirrors-core-latest-redir.html</a> and Solr can be downloaded from <a href="http://lucene.apache.org/solr/mirrors-solr-latest-redir.html">http://lucene.apache.org/solr/mirrors-solr-latest-redir.html</a></p>
 <p>Highlights of the Lucene release include:</p>
 <ul>
 <li>
-<p>In addition to Java 5 and Java 6, this release has now
-    full Java 7 support (minimum JDK 7u1 required).</p>
+<p>The index formats for terms, postings lists, stored fields, term vectors, etc
+  are pluggable via the Codec api. You can select from the provided
+  implementations or customize the index format with your own Codec to meet your needs.</p>
 </li>
 <li>
-<p>TypeTokenFilter filters tokens based on their TypeAttribute.</p>
+<p>Similarity has been decoupled from the vector space model (TF/IDF). Additional models
+  such as BM25, Divergence from Randomness, Language Models, and Information-based models
+  are provided (see http://www.lucidimagination.com/blog/2011/09/12/flexible-ranking-in-lucene-4).</p>
 </li>
 <li>
-<p>Fixed offset bugs in a number of CharFilters, Tokenizers and TokenFilters
-    that could lead to exceptions during highlighting.</p>
+<p>Added support for per-document values (DocValues). DocValues can be used for custom
+  scoring factors (accessible via Similarity), for pre-sorted Sort values, and more.</p>
 </li>
 <li>
-<p>Added phonetic encoders: Metaphone, Soundex, Caverphone,
-    Beider-Morse, etc.</p>
+<p>When indexing via multiple threads, each IndexWriter thread now flushes its own segment
+  to disk concurrently, resulting in substantial performance improvements
+  (see http://blog.mikemccandless.com/2011/05/265-indexing-speedup-with-lucenes.html).</p>
 </li>
 <li>
-<p>CJKBigramFilter and CJKWidthFilter replace CJKTokenizer.</p>
+<p>Per-document normalization factors ("norms") are no longer limited to a single byte.
+  Similarity implementations can use any DocValues type to store norms.</p>
 </li>
 <li>
-<p>Kuromoji morphological analyzer tokenizes Japanese text, producing
-    both compound words and their segmentation.</p>
+<p>Added index statistics such as the number of tokens for a term or field, number of postings
+  for a field, and number of documents with a posting for a field: these support additional
+  scoring models (see
+  http://blog.mikemccandless.com/2012/03/new-index-statistics-in-lucene-40.html).</p>
 </li>
 <li>
-<p>Static index pruning (Carmel pruning) removes postings with low
-    within-document term frequency.</p>
+<p>Implemented a new default term dictionary/index (BlockTree) that indexes shared prefixes
+  instead of every n'th term. This is not only more time- and space-efficient, but can
+  also sometimes avoid going to disk at all for terms that do not exist. Alternative term
+  dictionary implementations are provided and pluggable via the Codec api.</p>
 </li>
 <li>
-<p>QueryParser now interprets '*' as an open end for range
-    queries.</p>
+<p>Indexed terms are no longer UTF-16 char sequences, instead terms can be any binary
+  value encoded as byte arrays. By default, text terms are now encoded as UTF-8
+  bytes. Sort order of terms is now defined by their binary value, which is identical
+  to UTF-8 sort order.</p>
 </li>
 <li>
-<p>FieldValueFilter excludes documents missing the specified field.</p>
+<p>Substantially faster performance when using a Filter during searching.</p>
 </li>
 <li>
-<p>CheckIndex and IndexUpgrader allow you to specify the
-    specific FSDirectory implementation to use with the new -dir-impl
-    command-line option.</p>
+<p>File-system based directories can rate-limit the IO (MB/sec) of merge
+  threads, to reduce IO contention between merging and searching threads.</p>
 </li>
 <li>
-<p>FSTs can now do reverse lookup (by output) in certain cases and
-    can be packed to reduce their size.  There is now a method to
-    retrieve top N shortest paths from a start node in an FST.</p>
+<p>Added a number of alternative Codecs and components for different use-cases: "Appending"
+  works with append-only filesystems (such as Hadoop DFS), "Memory" writes the entire
+  terms+postings as an FST read into RAM (see
+  http://blog.mikemccandless.com/2011/06/primary-key-lookups-are-28x-faster-with.html),
+  "Pulsing" inlines the postings for low-frequency terms into the term dictionary (see
+  http://blog.mikemccandless.com/2010/06/lucenes-pulsingcodec-on-primary-key.html),
+  "SimpleText" writes all files in plain-text for easy debugging/transparency (see
+  http://blog.mikemccandless.com/2010/10/lucenes-simpletext-codec.html), among others.</p>
 </li>
 <li>
-<p>New WFSTCompletionLookup suggester supports finer-grained
-    ranking for suggestions.</p>
+<p>Term offsets can be optionally encoded into the postings lists and can be retrieved
+  per-position.</p>
 </li>
 <li>
-<p>FST based suggesters now use an offline (disk-based) sort, instead
-    of in-memory sort, when pre-sorting the suggestions.</p>
+<p>A new AutomatonQuery returns all documents containing any term matching a provided
+  finite-state automaton (see http://www.slideshare.net/otisg/finite-state-queries-in-lucene).</p>
 </li>
 <li>
-<p>ToChildBlockJoinQuery joins in the opposite direction (parent down
-    to child documents).</p>
+<p>FuzzyQuery is 100-200 times faster than in past releases (see
+  http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html).</p>
 </li>
 <li>
-<p>New query-time joining is more flexible (but less performant) than
-    index-time joins.</p>
+<p>A new spell checker, DirectSpellChecker, finds possible corrections directly against the
+  main search index without requiring a separate index.</p>
 </li>
 <li>
-<p>Added HTMLStripCharFilter to strip HTML markup.</p>
+<p>Various in-memory data structures such as the term dictionary and FieldCache are represented
+  more efficiently with less object overhead (see http://blog.mikemccandless.com/2010/07/lucenes-ram-usage-for-searching.html).</p>
 </li>
 <li>
-<p>Security fix: Better prevention of virtual machine SIGSEGVs when
-    using MMapDirectory: Code using cloned IndexInputs of already
-    closed indexes could possibly crash VM, allowing DoS attacks to
-    your application.</p>
+<p>All search logic is now required to work per segment, IndexReader was therefore refactored to
+  differentiate between atomic and composite readers
+  (see http://blog.thetaphi.de/2012/02/is-your-indexreader-atomic-major.html).</p>
 </li>
 <li>
-<p>Many bug fixes.</p>
+<p>Lucene 4.0 provides a modular API, consolidating components such as Analyzers and Queries
+  that were previously scattered across Lucene core, contrib, and Solr. These modules also
+  include additional functionality such as UIMA analyzer integration and a completely reworked
+  spatial search implementation.</p>
 </li>
 </ul>
 <p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>New SolrJ client connector using Apache Http Components http client
-   (SOLR-2020)</p>
-</li>
-<li>
-<p>Many analyzer factories are now "multi term query aware" allowing for things
-   like field type aware lowercasing when building prefix &amp; wildcard queries.
-   (SOLR-2438)</p>
-</li>
-<li>
-<p>New Kuromoji morphological analyzer tokenizes Japanese text, producing
-   both compound words and their segmentation. (SOLR-3056)</p>
-</li>
-<li>
-<p>Range Faceting (Dates &amp; Numbers) is now supported in distributed search
-   (SOLR-1709)</p>
-</li>
-<li>
-<p>HTMLStripCharFilter has been completely re-implemented, fixing many bugs
-   and greatly improving the performance (LUCENE-3690)</p>
-</li>
-<li>
-<p>StreamingUpdateSolrServer now supports the javabin format (SOLR-1565)</p>
-</li>
-<li>
-<p>New LFU Cache option for use in Solr's internal caches. (SOLR-2906)</p>
-</li>
-<li>
-<p>Memory performance improvements to all FST based suggesters (SOLR-2888)</p>
-</li>
-<li>
-<p>New WFSTLookupFactory suggester supports finer-grained ranking for
-   suggestions. (LUCENE-3714)</p>
-</li>
-<li>
-<p>New options for configuring the amount of concurrency used in distributed
-   searches (SOLR-3221)</p>
-</li>
-<li>
-<p>Many bug fixes</p>
-</li>
-</ul>
-<h2 id="27-november-2011-lucene-core-350-and-solr-350-available">27 November 2011 - Lucene Core 3.5.0 and Solr 3.5.0 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.5.0 and Apache Solr 3.5.0.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a></p>
-<p>Highlights of the Lucene release include:</p>
-<ul>
-<li>
-<p>Added a very substantial (3-5X) RAM reduction required to hold the
-terms index on opening an IndexReader. (<a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a>)</p>
-</li>
-<li>
-<p>Added IndexSearcher.searchAfter which returns results after a 
-specified ScoreDoc (e.g. last document on the previous page) to 
-support deep paging use cases. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-2215">LUCENE-2215</a>)</p>
-</li>
-<li>
-<p>Added SearcherManager to manage sharing and reopening IndexSearchers
-across multiple search threads. Underlying IndexReader instances are 
-safely closed if not referenced anymore. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3445">LUCENE-3445</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-3558">LUCENE-3558</a>)</p>
-</li>
-<li>
-<p>Added SearcherLifetimeManager which safely provides a consistent 
-view of the index across multiple requests (e.g. paging/drilldown). 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3558">LUCENE-3558</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-3486">LUCENE-3486</a>)</p>
-</li>
-<li>
-<p>Renamed IndexWriter.optimize to forceMerge to discourage use of 
-this method since it is horribly costly and rarely justified anymore. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3454">LUCENE-3454</a>)</p>
-</li>
-<li>
-<p>Added NGramPhraseQuery that speeds up phrase queries 30-50% when
- n-gram analysis is used. (<a href="https://issues.apache.org/jira/browse/LUCENE-3426">LUCENE-3426</a>)</p>
-</li>
-<li>
-<p>Added a new reopen API (IndexReader.openIfChanged) that returns 
-null instead of the old reader if there are no changes in the index. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3464">LUCENE-3464</a>)</p>
-</li>
-<li>
-<p>Improvements to vector highlighting: support for more queries 
-such as wildcards and boundary analysis for generated snippets. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-1824">LUCENE-1824</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-1889">LUCENE-1889</a>)</p>
-</li>
-<li>
-<p>IndexSearcher and IndexReader now perform additional checks to 
-throw AlreadyClosedExceptions if searches are performed on a 
-closed IndexReader. Performing searches on already closed reader can 
-cause JVM crashes when invalid memory mapped files are referenced. </p>
-</li>
-<li>
-<p>Several bugfixes, including a bug where closing an NRT reader 
-after the writer was closed was incorrectly invoking the 
-DeletionPolicy. See CHANGES.txt entries for full details.</p>
-</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>Bug fixes and improvements from Apache Lucene 3.5.0, including a
-    very substantial (3-5X) RAM reduction required to hold the terms
-    index on opening an <code  class="inline">IndexReader</code>.
-    (<a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a>)</p>
-</li>
-<li>
-<p>Added support for distributed result grouping.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2066">SOLR-2066</a>,
-    <a href="https://issues.apache.org/jira/browse/SOLR-2776">SOLR-2776</a>)</p>
-</li>
-<li>
-<p>Added support for Hunspell stemmer <code  class="inline">TokenFilter</code> supporting
-    stemming for 99 languages.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2769">SOLR-2769</a>)</p>
-</li>
-<li>
-<p>A new contrib module "langid" adds language identification
-    capabilities as an Update Processor, using Tika's
-    <code  class="inline">LanguageIdentifier</code> or Cybozu language-detection library
-    (<a href="https://issues.apache.org/jira/browse/SOLR-1979">SOLR-1979</a>)</p>
-</li>
-<li>
-<p>Numeric types including Trie and date types now support
-    <code  class="inline">sortMissingFirst</code>/<code  class="inline">Last</code>.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2881">SOLR-2881</a>)</p>
-</li>
-<li>
-<p>Added <code  class="inline">hl.q</code> parameter. It is optional and if it is specified,
-    it overrides <code  class="inline">q</code> parameter in <code  class="inline">Highlighter</code>.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-1926">SOLR-1926</a>)</p>
-</li>
-<li>
-<p>Several minor bugfixes like date parsing for years from 0001-1000, ignored
-    configurations when using <code  class="inline">QueryAnalyzer</code> with
-    <code  class="inline">SpellCheckComponent</code> and many more.
-    See CHANGES.txt entries for full details.</p>
-</li>
-</ul>
-<h2 id="26-october-2011-java-7u1-fixes-index-corruption-and-crash-bugs-in-apache-lucene-core-and-apache-solrwzxhzdk8">26 October 2011 - Java 7u1 fixes index corruption and crash bugs in Apache Lucene Core and Apache Solr</title></h2>
-<p>Oracle released <a href="http://www.oracle.com/technetwork/java/javase/7u1-relnotes-507962.html">Java 7u1</a> on October 19.
-  According to the release notes and tests done by the Lucene committers, all bugs reported on July 28 are fixed in this release,
-  so code using Porter stemmer no longer crashes with <code  class="inline">SIGSEGV</code>. We were not able to experience any index corruption anymore,
-  so it is safe to use Java 7u1 with Lucene Core and Solr.
-  On the same day, Oracle released <a href="http://www.oracle.com/technetwork/java/javase/6u29-relnotes-507960.html">Java 6u29</a>
-  fixing the same problems occurring with Java 6, if the JVM switches <code  class="inline">-XX:+AggressiveOpts</code>
-  or <code  class="inline">-XX:+OptimizeStringConcat</code> were used. Of course, you should <strong>not</strong> use experimental JVM options like
-  <code  class="inline">-XX:+AggressiveOpts</code> in production environments! We recommend everybody to upgrade to this latest version 6u29.
-  In case you upgrade to Java 7, remember that you may have to reindex, as the unicode
-  version shipped with Java 7 changed and tokenization behaves differently
-  (e.g. lowercasing). For more information, read <code  class="inline">JRE_VERSION_MIGRATION.txt</code>
-  in your distribution package!</p>
-<h2 id="14-september-2011-lucene-core-340-and-solr-340-availablewzxhzdk25">14 September 2011 - Lucene Core 3.4.0 and Solr 3.4.0 Available</title></h2>
-<p>The Lucene PMC is pleased to announce the availability
- of Apache Lucene 3.4.0 and Apache Solr 3.4.0.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a></p>
-<p>If you are already using Apache Lucene 3.1, 3.2 or 3.3, we strongly recommend you upgrade to 3.4.0 because of the index corruption bug on OS or computer crash or power loss (<a href="https://issues.apache.org/jira/browse/LUCENE-3418">LUCENE-3418</a>), now fixed in 3.4.0.</p>
-<p>Highlights of the Lucene release include:</p>
-<ul>
-<li>
-<p>Fixed a major bug (<a href="https://issues.apache.org/jira/browse/LUCENE-3418">LUCENE-3418</a>) whereby a Lucene index could
-easily become corrupted if the OS or computer crashed or lost
-power.</p>
-</li>
-<li>
-<p>Added a new faceting module (<code  class="inline">contrib/facet</code>) for computing facet
-counts (both hierarchical and non-hierarchical) at search
-time (<a href="https://issues.apache.org/jira/browse/LUCENE-3079">LUCENE-3079</a>).</p>
-</li>
-<li>
-<p>Added a new join module (<code  class="inline">contrib/join</code>), enabling indexing and
-searching of nested (parent/child) documents using
-<code  class="inline">BlockJoinQuery</code>/<code  class="inline">Collector</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3171">LUCENE-3171</a>).</p>
-</li>
-<li>
-<p>It is now possible to index documents with term frequencies
-included but without positions (<a href="https://issues.apache.org/jira/browse/LUCENE-2048">LUCENE-2048</a>); previously
-<code  class="inline">omitTermFreqAndPositions</code> always omitted both.</p>
-</li>
-<li>
-<p>The modular <code  class="inline">QueryParser</code> (<code  class="inline">contrib/queryparser</code>) can now create
-<code  class="inline">NumericRangeQuery</code>.</p>
-</li>
-<li>
-<p>Added SynonymFilter, in <code  class="inline">contrib/analyzers</code>, to apply multi-word
-synonyms during indexing or querying, including parsers to read
-the wordnet and solr synonym formats (<a href="https://issues.apache.org/jira/browse/LUCENE-3233">LUCENE-3233</a>).</p>
-</li>
-<li>
-<p>You can now control how documents that don't have a value on the
-sort field should sort (<a href="https://issues.apache.org/jira/browse/LUCENE-3390">LUCENE-3390</a>), using <code  class="inline">SortField.setMissingValue</code>.</p>
-</li>
-<li>
-<p>Fixed a case where term vectors could be silently deleted from the
-index after <code  class="inline">addIndexes</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3402">LUCENE-3402</a>).</p>
-</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>SolrJ client can now parse grouped and range facets results
-(<a href="https://issues.apache.org/jira/browse/SOLR-2523">SOLR-2523</a>).</p>
-</li>
-<li>
-<p>A new <code  class="inline">XsltUpdateRequestHandler</code> allows posting XML that's
-transformed by a provided XSLT into a valid Solr document
-(<a href="https://issues.apache.org/jira/browse/SOLR-2630">SOLR-2630</a>).</p>
-</li>
-<li>
-<p>Post-group faceting option (<code  class="inline">group.truncate</code>) can now compute
-facet counts for only the highest ranking documents per-group.
-(<a href="https://issues.apache.org/jira/browse/SOLR-2665">SOLR-2665</a>).</p>
-</li>
-<li>
-<p>Add <code  class="inline">commitWithin</code> update request parameter to all update handlers
-that were previously missing it.  This tells Solr to commit the
-change within the specified amount of time (<a href="https://issues.apache.org/jira/browse/SOLR-2540">SOLR-2540</a>).</p>
-</li>
-<li>
-<p>You can now specify <code  class="inline">NIOFSDirectory</code> (<a href="https://issues.apache.org/jira/browse/SOLR-2670">SOLR-2670</a>).</p>
-</li>
-<li>
-<p>New parameter <code  class="inline">hl.phraseLimit</code> speeds up <code  class="inline">FastVectorHighlighter</code>
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3234">LUCENE-3234</a>).</p>
-</li>
-<li>
-<p>The query cache and filter cache can now be disabled per request.
-See <a href="http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters">this wiki page</a>
-(<a href="https://issues.apache.org/jira/browse/SOLR-2429">SOLR-2429</a>).</p>
-</li>
-<li>
-<p>Improved memory usage, build time, and performance of 
-<code  class="inline">SynonymFilterFactory</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3233">LUCENE-3233</a>).</p>
-</li>
-<li>
-<p>Added <code  class="inline">omitPositions</code> to the schema, so you can omit position
-information while still indexing term frequencies (<a href="https://issues.apache.org/jira/browse/LUCENE-2048">LUCENE-2048</a>).</p>
-</li>
-<li>
-<p>Various fixes for multi-threaded <code  class="inline">DataImportHandler</code>.</p>
-</li>
-</ul>
-<h2 id="28-july-2011-warning-index-corruption-and-crashes-in-apache-lucene-core-apache-solr-with-java-7">28 July 2011 - WARNING: Index corruption and crashes in Apache Lucene Core / Apache Solr with Java 7</h2>
-<p>Oracle released <a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Java 7</a> today.
-Unfortunately it contains hotspot compiler optimizations, which miscompile some loops.
-This can affect code of several Apache projects. Sometimes JVMs only crash, but in several cases,
-results calculated can be incorrect, leading to bugs in applications
-(see Hotspot bugs <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134">7070134</a>,
-<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7044738">7044738</a>,
-<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7068051">7068051</a>).
-<strong>Apache Lucene Core</strong> and <strong>Apache Solr</strong> are two Apache projects,
-which are affected by these bugs, namely all versions released until today.
-Solr users with the default configuration will have
-Java crashing with <code  class="inline">SIGSEGV</code> as soon as they start to index documents, as one
-affected part is the well-known Porter stemmer
-(see <a href="https://issues.apache.org/jira/browse/LUCENE-3335">LUCENE-3335</a>).
-Other loops in Lucene may be miscompiled, too, leading to index corruption
-(especially on Lucene trunk with pulsing codec; other loops may be
-affected, too - <a href="https://issues.apache.org/jira/browse/LUCENE-3346">LUCENE-3346</a>).
-These problems were detected only 5 days before the official Java 7 release,
-so Oracle had no time to fix those bugs, affecting also many more applications.
-In response to our questions, they proposed to include the fixes into service
-release u2 (eventually into service release u1, see
-<a href="http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2011-July/005971.html">this mail</a>).
-<strong>This means you cannot use Apache Lucene/Solr with Java 7 releases before Update 2!</strong>
-If you do, please don't open bug reports, it is not the committers' fault!
-At least disable loop optimizations using the <code  class="inline">-XX:-UseLoopPredicate</code> JVM option
-to not risk index corruptions.
-<em>Please note:</em> Also Java 6 users are affected, if they use one of those
-JVM options, which are <strong>not</strong> enabled by default: <code  class="inline">-XX:+OptimizeStringConcat</code>
-or <code  class="inline">-XX:+AggressiveOpts</code>.
-It is strongly recommended not to use any hotspot optimization switches in any Java
-version without extensive testing!
-In case you upgrade to Java 7, remember that you may have to reindex, as the unicode
-version shipped with Java 7 changed and tokenization behaves differently
-(e.g. lowercasing). For more information, read <code  class="inline">JRE_VERSION_MIGRATION.txt</code>
-in your distribution package!</p>
-<h2 id="1-july-2011-lucene-core-33-and-solr-33-available">1 July 2011 - Lucene Core 3.3 and Solr 3.3 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.3 and Apache Solr 3.3.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>The spellchecker module now includes suggest/auto-complete functionality,
-       with three implementations: Jaspell, Ternary Trie, and Finite State.</li>
-<li>Support for merging results from multiple shards, for both "normal"
-       search results (TopDocs.merge) as well as grouped results using the
-       grouping module (SearchGroup.merge, TopGroups.merge).</li>
-<li>An optimized implementation of KStem, a less aggressive stemmer
-       for English</li>
-<li>Single-pass grouping implementation based on block document indexing.</li>
-<li>Improvements to MMapDirectory (now also the default implementation
-       returned by FSDirectory.open on 64-bit Linux).</li>
-<li>NRTManager simplifies handling near-real-time search with multiple
-       search threads, allowing the application to control which indexing
-       changes must be visible to which search requests.</li>
-<li>TwoPhaseCommitTool facilitates performing a multi-resource
-       two-phased commit, including IndexWriter.</li>
-<li>The default merge policy, TieredMergePolicy, has a new method
-       (set/getReclaimDeletesWeight) to control how aggressively it
-       targets segments with deletions, and is now more aggressive than
-       before by default.</li>
-<li>PKIndexSplitter tool splits an index by a mid-point term.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Grouping / Field Collapsing</li>
-<li>A new, automaton-based suggest/autocomplete implementation offering an
-       order of magnitude smaller RAM consumption.</li>
-<li>KStemFilterFactory, an optimized implementation of a less aggressive
-       stemmer for English.</li>
-<li>Solr defaults to a new, more efficient merge policy (TieredMergePolicy).
-       See <a href="http://s.apache.org/merging">http://s.apache.org/merging</a> for more information.</li>
-<li>Important bugfixes, including extremely high RAM usage in spellchecking.</li>
-<li>Bugfixes and improvements from Apache Lucene 3.3</li>
-</ul>
-<h2 id="4-june-2011-lucene-core-32-and-solr-32-available">4 June 2011 - Lucene Core 3.2 and Solr 3.2 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability of Apache Lucene 3.2 and Apache Solr 3.2.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>A new grouping module, under lucene/contrib/grouping, enables
-       search results to be grouped by a single-valued indexed field</li>
-<li>A new IndexUpgrader tool fully converts an old index to the
-       current format.</li>
-<li>A new Directory implementation, NRTCachingDirectory, caches small
-       segments in RAM, to reduce the I/O load for applications with fast
-       NRT reopen rates.</li>
-<li>A new Collector implementation, CachingCollector, is able to
-       gather search hits (document IDs and optionally also scores) and
-       then replay them.  This is useful for Collectors that require two
-       or more passes to produce results.</li>
-<li>Index a document block using IndexWriter's new addDocuments or
-       updateDocuments methods.  These experimental APIs ensure that the
-       block of documents will forever remain contiguous in the index,
-       enabling interesting future features like grouping and joins.</li>
-<li>A new default merge policy, TieredMergePolicy, which is more
-       efficient due to being able to merge non-contiguous segments.
-       See <a href="http://s.apache.org/merging">http://s.apache.org/merging</a> for details.</li>
-<li>NumericField is now returned correctly when you load a stored
-       document (previously you received a normal Field back, with the
-       numeric value converted string).</li>
-<li>Deleted terms are now applied during flushing to the newly flushed
-       segment, which is more efficient than having to later initialize a
-       reader for that segment.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Ability to specify overwrite and commitWithin as request parameters when 
-       using the JSON update format.</li>
-<li>TermQParserPlugin, useful when generating filter queries from terms 
-       returned from field faceting or the terms component.</li>
-<li>DebugComponent now supports using a NamedList to model Explanation objects 
-       in its responses instead of Explanation.toString.</li>
-<li>Improvements to the UIMA and Carrot2 integrations.</li>
-<li>Highlighting performance improvements. </li>
-<li>A test-framework jar for easy testing of Solr extensions.</li>
-<li>Bugfixes and improvements from Apache Lucene 3.2. </li>
-</ul>
-<h2 id="31-march-2011-lucene-core-31-and-solr-31-available">31 March 2011 - Lucene Core 3.1 and Solr 3.1 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability of Apache Lucene 3.1 and Apache Solr 3.1.
-The version number for Solr 3.1 was chosen to reflect the merge of
-development with Lucene, which is currently also on 3.1.  Going
-forward, we expect the Solr version to be the same as the Lucene
-version.  Solr 3.1 contains Lucene 3.1 and is the release after Solr 1.4.1.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>Numerous performance improvements: faster exact PhraseQuery; merging
-favors segments with deletions; primary key lookup is faster;
-IndexWriter.addIndexes(Directory[]) uses file copy instead of
-merging; various Directory performance improvements; compound file
-is dynamically turned off for large segments; fully deleted segments
-are dropped on commit; faster snowball analyzers (in contrib);
-ConcurrentMergeScheduler is more careful about setting priority of
-merge threads.</li>
-<li>ReusableAnalyzerBase makes it easier to reuse TokenStreams
-correctly.</li>
-<li>Improved Analysis capabilities: Improved Unicode support, including
-Unicode 4, more friendly term handling (CharTermAttribute), easier
-object reuse and better support for protected words in lossy token
-filters (e.g. stemmers).</li>
-<li>ConstantScoreQuery now allows directly wrapping a Query.</li>
-<li>IndexWriter is now configured with a new separate builder API,
-IndexWriterConfig.  You can now control IndexWriter's previously
-fixed internal thread limit by calling setMaxThreadStates.</li>
-<li>IndexWriter.getReader is replaced by IndexReader.open(IndexWriter).
-In addition you can now specify whether deletes should be resolved
-when you open an NRT reader.</li>
-<li>MultiSearcher is deprecated; ParallelMultiSearcher has been
-absorbed directly into IndexSearcher.</li>
-<li>On 64bit Windows and Solaris JVMs, MMapDirectory is now the
-default implementation (returned by FSDirectory.open).
-MMapDirectory also enables unmapping if the JVM supports it.</li>
-<li>New TotalHitCountCollector just counts total number of hits.</li>
-<li>ReaderFinishedListener API enables external caches to evict entries
-once a segment is finished.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Numeric range facets (similar to date faceting).</li>
-<li>New spatial search, including spatial filtering, boosting and sorting capabilities.</li>
-<li>Example Velocity driven search UI at http://localhost:8983/solr/browse</li>
-<li>A new termvector-based highlighter</li>
-<li>Extended dismax (edismax) query parser which addresses some
-missing features in the dismax query parser along with some
-extensions.</li>
-<li>Several more components now support distributed mode:
-TermsComponent, SpellCheckComponent.</li>
-<li>A new Auto Suggest component.</li>
-<li>Ability to sort by functions.</li>
-<li>JSON document indexing.</li>
-<li>CSV response format.</li>
-<li>Apache UIMA integration for metadata extraction.</li>
-<li>Leverages Lucene 3.1 and its inherent optimizations and bug fixes
-as well as new analysis capabilities.</li>
-<li>Numerous improvements, bug fixes, and optimizations.</li>
-</ul></div>
+<p>The largest set of features goes by the development code-name "Solr Cloud" and involves bringing easy scalability to Solr.  See http://wiki.apache.org/solr/SolrCloud for more details.
+ - Distributed indexing designed from the ground up for near real-time (NRT) and NoSQL features such as realtime-get, optimistic locking, and durable updates.
+ - High availability with no single points of failure.
+ - Apache Zookeeper integration for distributed coordination and cluster metadata and configuration storage.
+ - Immunity to split-brain issues due to Zookeeper's Paxos distributed consensus protocols.
+ - Updates sent to any node in the cluster are automatically forwarded to the correct shard and replicated to multiple nodes for redundancy.
+ - Queries sent to any node automatically perform a full distributed search across the cluster with load balancing and fail-over.</p>
+<p>Solr 4.0-alpha includes more NoSQL features for those using Solr as a primary data store:
+ - Update durability - A transaction log ensures that even uncommitted documents are never lost.
+ - Real-time Get - The ability to quickly retrieve the latest version of a document, without the need to commit or open a new searcher
+ - Versioning and Optimistic Locking - combined with real-time get, this allows read-update-write functionality that ensures no conflicting changes were made concurrently by other clients.
+ - Atomic updates -  the ability to add, remove, change, and increment fields of an existing document without having to send in the complete document again.</p>
+<p>There are many other features coming in Solr 4, such as
+ - Pivot Faceting - Multi-level or hierarchical faceting where the top constraints for one field are found for each top constraint of a different field.
+ - Pseudo-fields - The ability to alias fields, or to add metadata along with returned documents, such as function query values and results of spatial distance calculations.
+ - A spell checker implementation that can work directly from the main index instead of creating a sidecar index.
+ - Pseudo-Join functionality - The ability to select a set of documents based on their relationship to a second set of documents.
+ - Function query enhancements including conditional function queries and relevancy functions.
+ - New update processors to facilitate modifying documents prior to indexing.
+ - A brand new web admin interface, including support for SolrCloud.</p></div>
       
 
 
@@ -705,15 +319,15 @@ as well as new analysis capabilities.</l
       <p><div class="button-green">
     <a href="core/mirrors-core-latest-redir.html">Download</a>
     <div class="flap top">Click to begin</div>
-    <div class="flap bottom">of Apache Lucene 3.6</div>
+    <div class="flap bottom">of Apache Lucene 4.0-ALPHA</div>
   </div>
-  <div class="download-desc">Apache Lucene 3.6</div>
+  <div class="download-desc">Apache Lucene 4.0-ALPHA</div>
   <div class="button-orange">
     <a href="solr/mirrors-solr-latest-redir.html">Download</a>
     <div class="flap top">Click to begin</div>
-    <div class="flap bottom">of Apache Solr 3.6</div>
+    <div class="flap bottom">of Apache Solr 4.0-ALPHA</div>
   </div>
-  <div class="download-desc">Apache Solr 3.6</div></p>
+  <div class="download-desc">Apache Solr 4.0-ALPHA</div></p>
 <h1 id="projects">Projects</h1>
 <ul>
 <li><a href="./core/">Lucene Core (Java)</a></li>

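Editorial note on the Solr 4.0-ALPHA text in the mainnews.html change above: the "Versioning and Optimistic Locking" item describes a read-update-write cycle in which a write is rejected if another client changed the document in the meantime. The following is a minimal sketch of that idea only, assuming a toy in-memory store. The names `OptimisticLockingSketch`, `VersionedStore`, `Versioned` and `putIfVersion` are hypothetical and are not Solr classes or APIs; Solr's actual mechanism is not shown in this commit.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy store, not Solr code. All names are hypothetical.
class OptimisticLockingSketch {

    /** A stored value together with the version assigned at its last write. */
    static final class Versioned {
        final long version;
        final String value;

        Versioned(long version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    /** In-memory store where every successful write gets a new, larger version. */
    static final class VersionedStore {
        private final Map<String, Versioned> docs = new HashMap<>();
        private long clock = 0;

        /** Returns the latest stored value and version, or null if absent. */
        synchronized Versioned get(String id) {
            return docs.get(id);
        }

        /**
         * Writes the value only if the document's current version equals
         * expectedVersion (0 is used here to mean "must not exist yet").
         * Returns the new version, or -1 if another write got there first.
         */
        synchronized long putIfVersion(String id, String value, long expectedVersion) {
            Versioned current = docs.get(id);
            long currentVersion = (current == null) ? 0 : current.version;
            if (currentVersion != expectedVersion) {
                return -1; // conflict: caller should re-read and retry
            }
            long next = ++clock;
            docs.put(id, new Versioned(next, value));
            return next;
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.putIfVersion("doc1", "price=10", 0);

        // Two clients read the same version of the document.
        Versioned seenByA = store.get("doc1");
        Versioned seenByB = store.get("doc1");

        // Client A writes first; client B's write is then based on a stale version.
        long resultA = store.putIfVersion("doc1", "price=11", seenByA.version);
        long resultB = store.putIfVersion("doc1", "price=12", seenByB.version);

        System.out.println("A: " + resultA + ", B: " + resultB
                + ", stored: " + store.get("doc1").value);
    }
}
```

In this sketch the losing client receives -1 and is expected to re-read the document and retry. The version check and the write happen under one lock, so no update is silently overwritten.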
Modified: websites/staging/lucene/trunk/content/openrelevance/mailing-lists.html
==============================================================================
--- websites/staging/lucene/trunk/content/openrelevance/mailing-lists.html (original)
+++ websites/staging/lucene/trunk/content/openrelevance/mailing-lists.html Tue Jul  3 12:33:05 2012
@@ -143,509 +143,123 @@ to how to consume ORP content or how to 
 -<a href="http://mail-archives.apache.org/mod_mbox/lucene-openrelevance-dev/">Archive</a></p></div>
       
         <div><h1 id="news">News</h1>
-<h2 id="12-april-2012-lucene-core-360-and-solr-360-available">12 April 2012 - Lucene Core 3.6.0 and Solr 3.6.0 Available</h2>
+<h2 id="3-july-2012-lucene-core-40-alpha-and-solr-40-alpha-available">3 July 2012 - Lucene Core 4.0-ALPHA and Solr 4.0-ALPHA Available</h2>
 <p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.6.0 and Apache Solr 3.6.0</p>
+of Apache Lucene 4.0-ALPHA and Apache Solr 4.0-ALPHA</p>
 <p>Lucene can be downloaded from <a href="http://lucene.apache.org/core/mirrors-core-latest-redir.html">http://lucene.apache.org/core/mirrors-core-latest-redir.html</a> and Solr can be downloaded from <a href="http://lucene.apache.org/solr/mirrors-solr-latest-redir.html">http://lucene.apache.org/solr/mirrors-solr-latest-redir.html</a></p>
 <p>Highlights of the Lucene release include:</p>
 <ul>
 <li>
-<p>In addition to Java 5 and Java 6, this release has now
-    full Java 7 support (minimum JDK 7u1 required).</p>
+<p>The index formats for terms, postings lists, stored fields, term vectors, etc
+  are pluggable via the Codec api. You can select from the provided
+  implementations or customize the index format with your own Codec to meet your needs.</p>
 </li>
 <li>
-<p>TypeTokenFilter filters tokens based on their TypeAttribute.</p>
+<p>Similarity has been decoupled from the vector space model (TF/IDF). Additional models
+  such as BM25, Divergence from Randomness, Language Models, and Information-based models
+  are provided (see http://www.lucidimagination.com/blog/2011/09/12/flexible-ranking-in-lucene-4).</p>
 </li>
 <li>
-<p>Fixed offset bugs in a number of CharFilters, Tokenizers and TokenFilters
-    that could lead to exceptions during highlighting.</p>
+<p>Added support for per-document values (DocValues). DocValues can be used for custom
+  scoring factors (accessible via Similarity), for pre-sorted Sort values, and more.</p>
 </li>
 <li>
-<p>Added phonetic encoders: Metaphone, Soundex, Caverphone,
-    Beider-Morse, etc.</p>
+<p>When indexing via multiple threads, each IndexWriter thread now flushes its own segment
+  to disk concurrently, resulting in substantial performance improvements
+  (see http://blog.mikemccandless.com/2011/05/265-indexing-speedup-with-lucenes.html).</p>
 </li>
 <li>
-<p>CJKBigramFilter and CJKWidthFilter replace CJKTokenizer.</p>
+<p>Per-document normalization factors ("norms") are no longer limited to a single byte.
+  Similarity implementations can use any DocValues type to store norms.</p>
 </li>
 <li>
-<p>Kuromoji morphological analyzer tokenizes Japanese text, producing
-    both compound words and their segmentation.</p>
+<p>Added index statistics such as the number of tokens for a term or field, number of postings
+  for a field, and number of documents with a posting for a field: these support additional
+  scoring models (see
+  http://blog.mikemccandless.com/2012/03/new-index-statistics-in-lucene-40.html).</p>
 </li>
 <li>
-<p>Static index pruning (Carmel pruning) removes postings with low
-    within-document term frequency.</p>
+<p>Implemented a new default term dictionary/index (BlockTree) that indexes shared prefixes
+  instead of every n'th term. This is not only more time- and space- efficient, but can
+  also sometimes avoid going to disk at all for terms that do not exist. Alternative term
+  dictionary implementations are provided and pluggable via the Codec api.</p>
 </li>
 <li>
-<p>QueryParser now interprets '*' as an open end for range
-    queries.</p>
+<p>Indexed terms are no longer UTF-16 char sequences; instead, terms can be any binary
+  value encoded as byte arrays. By default, text terms are now encoded as UTF-8
+  bytes. Sort order of terms is now defined by their binary value, which is identical
+  to UTF-8 sort order.</p>
 </li>
 <li>
-<p>FieldValueFilter excludes documents missing the specified field.</p>
+<p>Substantially faster performance when using a Filter during searching.</p>
 </li>
 <li>
-<p>CheckIndex and IndexUpgrader allow you to specify the
-    specific FSDirectory implementation to use with the new -dir-impl
-    command-line option.</p>
+<p>File-system based directories can rate-limit the IO (MB/sec) of merge
+  threads, to reduce IO contention between merging and searching threads.</p>
 </li>
 <li>
-<p>FSTs can now do reverse lookup (by output) in certain cases and
-    can be packed to reduce their size.  There is now a method to
-    retrieve top N shortest paths from a start node in an FST.</p>
+<p>Added a number of alternative Codecs and components for different use-cases: "Appending"
+  works with append-only filesystems (such as Hadoop DFS), "Memory" writes the entire
+  terms+postings as an FST read into RAM (see
+  http://blog.mikemccandless.com/2011/06/primary-key-lookups-are-28x-faster-with.html),
+  "Pulsing" inlines the postings for low-frequency terms into the term dictionary (see
+  http://blog.mikemccandless.com/2010/06/lucenes-pulsingcodec-on-primary-key.html),
+  "SimpleText" writes all files in plain-text for easy debugging/transparency (see
+  http://blog.mikemccandless.com/2010/10/lucenes-simpletext-codec.html), among others.</p>
 </li>
 <li>
-<p>New WFSTCompletionLookup suggester supports finer-grained
-    ranking for suggestions.</p>
+<p>Term offsets can be optionally encoded into the postings lists and can be retrieved
+  per-position.</p>
 </li>
 <li>
-<p>FST based suggesters now use an offline (disk-based) sort, instead
-    of in-memory sort, when pre-sorting the suggestions.</p>
+<p>A new AutomatonQuery returns all documents containing any term matching a provided
+  finite-state automaton (see http://www.slideshare.net/otisg/finite-state-queries-in-lucene).</p>
 </li>
 <li>
-<p>ToChildBlockJoinQuery joins in the opposite direction (parent down
-    to child documents).</p>
+<p>FuzzyQuery is 100-200 times faster than in past releases (see
+  http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html).</p>
 </li>
 <li>
-<p>New query-time joining is more flexible (but less performant) than
-    index-time joins.</p>
+<p>A new spell checker, DirectSpellChecker, finds possible corrections directly against the
+  main search index without requiring a separate index.</p>
 </li>
 <li>
-<p>Added HTMLStripCharFilter to strip HTML markup.</p>
+<p>Various in-memory data structures such as the term dictionary and FieldCache are represented
+  more efficiently with less object overhead (see http://blog.mikemccandless.com/2010/07/lucenes-ram-usage-for-searching.html).</p>
 </li>
 <li>
-<p>Security fix: Better prevention of virtual machine SIGSEGVs when
-    using MMapDirectory: Code using cloned IndexInputs of already
-    closed indexes could possibly crash VM, allowing DoS attacks to
-    your application.</p>
+<p>All search logic is now required to work per segment; IndexReader was therefore refactored to
+  differentiate between atomic and composite readers
+  (see http://blog.thetaphi.de/2012/02/is-your-indexreader-atomic-major.html).</p>
 </li>
 <li>
-<p>Many bug fixes.</p>
+<p>Lucene 4.0 provides a modular API, consolidating components such as Analyzers and Queries
+  that were previously scattered across Lucene core, contrib, and Solr. These modules also
+  include additional functionality such as UIMA analyzer integration and a completely reworked
+  spatial search implementation.</p>
 </li>
 </ul>
 <p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>New SolrJ client connector using Apache Http Components http client
-   (SOLR-2020)</p>
-</li>
-<li>
-<p>Many analyzer factories are now "multi term query aware" allowing for things
-   like field type aware lowercasing when building prefix &amp; wildcard queries.
-   (SOLR-2438)</p>
-</li>
-<li>
-<p>New Kuromoji morphological analyzer tokenizes Japanese text, producing
-   both compound words and their segmentation. (SOLR-3056)</p>
-</li>
-<li>
-<p>Range Faceting (Dates &amp; Numbers) is now supported in distributed search
-   (SOLR-1709)</p>
-</li>
-<li>
-<p>HTMLStripCharFilter has been completely re-implemented, fixing many bugs
-   and greatly improving the performance (LUCENE-3690)</p>
-</li>
-<li>
-<p>StreamingUpdateSolrServer now supports the javabin format (SOLR-1565)</p>
-</li>
-<li>
-<p>New LFU Cache option for use in Solr's internal caches. (SOLR-2906)</p>
-</li>
-<li>
-<p>Memory performance improvements to all FST based suggesters (SOLR-2888)</p>
-</li>
-<li>
-<p>New WFSTLookupFactory suggester supports finer-grained ranking for
-   suggestions. (LUCENE-3714)</p>
-</li>
-<li>
-<p>New options for configuring the amount of concurrency used in distributed
-   searches (SOLR-3221)</p>
-</li>
-<li>
-<p>Many bug fixes</p>
-</li>
-</ul>
-<h2 id="27-november-2011-lucene-core-350-and-solr-350-available">27 November 2011 - Lucene Core 3.5.0 and Solr 3.5.0 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.5.0 and Apache Solr 3.5.0.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a></p>
-<p>Highlights of the Lucene release include:</p>
-<ul>
-<li>
-<p>Added a very substantial (3-5X) RAM reduction required to hold the
-terms index on opening an IndexReader. (<a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a>)</p>
-</li>
-<li>
-<p>Added IndexSearcher.searchAfter which returns results after a 
-specified ScoreDoc (e.g. last document on the previous page) to 
-support deep paging use cases. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-2215">LUCENE-2215</a>)</p>
-</li>
-<li>
-<p>Added SearcherManager to manage sharing and reopening IndexSearchers
-across multiple search threads. Underlying IndexReader instances are 
-safely closed if not referenced anymore. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3445">LUCENE-3445</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-3558">LUCENE-3558</a>)</p>
-</li>
-<li>
-<p>Added SearcherLifetimeManager which safely provides a consistent 
-view of the index across multiple requests (e.g. paging/drilldown). 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3558">LUCENE-3558</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-3486">LUCENE-3486</a>)</p>
-</li>
-<li>
-<p>Renamed IndexWriter.optimize to forceMerge to discourage use of 
-this method since it is horribly costly and rarely justified anymore. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3454">LUCENE-3454</a>)</p>
-</li>
-<li>
-<p>Added NGramPhraseQuery that speeds up phrase queries 30-50% when
- n-gram analysis is used. (<a href="https://issues.apache.org/jira/browse/LUCENE-3426">LUCENE-3426</a>)</p>
-</li>
-<li>
-<p>Added a new reopen API (IndexReader.openIfChanged) that returns 
-null instead of the old reader if there are no changes in the index. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3464">LUCENE-3464</a>)</p>
-</li>
-<li>
-<p>Improvements to vector highlighting: support for more queries 
-such as wildcards and boundary analysis for generated snippets. 
-(<a href="https://issues.apache.org/jira/browse/LUCENE-1824">LUCENE-1824</a>, 
-<a href="https://issues.apache.org/jira/browse/LUCENE-1889">LUCENE-1889</a>)</p>
-</li>
-<li>
-<p>IndexSearcher and IndexReader now perform additional checks to 
-throw AlreadyClosedExceptions if searches are performed on a 
-closed IndexReader. Performing searches on already closed reader can 
-cause JVM crashes when invalid memory mapped files are referenced. </p>
-</li>
-<li>
-<p>Several bugfixes, including a bug where closing an NRT reader 
-after the writer was closed was incorrectly invoking the 
-DeletionPolicy. See CHANGES.txt entries for full details.</p>
-</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>Bug fixes and improvements from Apache Lucene 3.5.0, including a
-    very substantial (3-5X) RAM reduction required to hold the terms
-    index on opening an <code  class="inline">IndexReader</code>.
-    (<a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a>)</p>
-</li>
-<li>
-<p>Added support for distributed result grouping.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2066">SOLR-2066</a>,
-    <a href="https://issues.apache.org/jira/browse/SOLR-2776">SOLR-2776</a>)</p>
-</li>
-<li>
-<p>Added support for Hunspell stemmer <code  class="inline">TokenFilter</code> supporting
-    stemming for 99 languages.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2769">SOLR-2769</a>)</p>
-</li>
-<li>
-<p>A new contrib module "langid" adds language identification
-    capabilities as an Update Processor, using Tika's
-    <code  class="inline">LanguageIdentifier</code> or Cybozu language-detection library
-    (<a href="https://issues.apache.org/jira/browse/SOLR-1979">SOLR-1979</a>)</p>
-</li>
-<li>
-<p>Numeric types including Trie and date types now support
-    <code  class="inline">sortMissingFirst</code>/<code  class="inline">Last</code>.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-2881">SOLR-2881</a>)</p>
-</li>
-<li>
-<p>Added <code  class="inline">hl.q</code> parameter. It is optional and if it is specified,
-    it overrides <code  class="inline">q</code> parameter in <code  class="inline">Highlighter</code>.
-    (<a href="https://issues.apache.org/jira/browse/SOLR-1926">SOLR-1926</a>)</p>
-</li>
-<li>
-<p>Several minor bugfixes like date parsing for years from 0001-1000, ignored
-    configurations when using <code  class="inline">QueryAnalyzer</code> with
-    <code  class="inline">SpellCheckComponent</code> and many more.
-    See CHANGES.txt entries for full details.</p>
-</li>
-</ul>
-<h2 id="26-october-2011-java-7u1-fixes-index-corruption-and-crash-bugs-in-apache-lucene-core-and-apache-solrwzxhzdk8">26 October 2011 - Java 7u1 fixes index corruption and crash bugs in Apache Lucene Core and Apache Solr</h2>
-<p>Oracle released <a href="http://www.oracle.com/technetwork/java/javase/7u1-relnotes-507962.html">Java 7u1</a> on October 19.
-  According to the release notes and tests done by the Lucene committers, all bugs reported on July 28 are fixed in this release,
-  so code using Porter stemmer no longer crashes with <code  class="inline">SIGSEGV</code>. We were not able to experience any index corruption anymore,
-  so it is safe to use Java 7u1 with Lucene Core and Solr.
-  On the same day, Oracle released <a href="http://www.oracle.com/technetwork/java/javase/6u29-relnotes-507960.html">Java 6u29</a>
-  fixing the same problems occurring with Java 6, if the JVM switches <code  class="inline">-XX:+AggressiveOpts</code>
-  or <code  class="inline">-XX:+OptimizeStringConcat</code> were used. Of course, you should <strong>not</strong> use experimental JVM options like
-  <code  class="inline">-XX:+AggressiveOpts</code> in production environments! We recommend everybody to upgrade to this latest version 6u29.
-  In case you upgrade to Java 7, remember that you may have to reindex, as the unicode
-  version shipped with Java 7 changed and tokenization behaves differently
-  (e.g. lowercasing). For more information, read <code  class="inline">JRE_VERSION_MIGRATION.txt</code>
-  in your distribution package!</p>
-<h2 id="14-september-2011-lucene-core-340-and-solr-340-availablewzxhzdk25">14 September 2011 - Lucene Core 3.4.0 and Solr 3.4.0 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability
- of Apache Lucene 3.4.0 and Apache Solr 3.4.0.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a></p>
-<p>If you are already using Apache Lucene 3.1, 3.2 or 3.3, we strongly recommend you upgrade to 3.4.0 because of the index corruption bug on OS or computer crash or power loss (<a href="https://issues.apache.org/jira/browse/LUCENE-3418">LUCENE-3418</a>), now fixed in 3.4.0.</p>
-<p>Highlights of the Lucene release include:</p>
-<ul>
-<li>
-<p>Fixed a major bug (<a href="https://issues.apache.org/jira/browse/LUCENE-3418">LUCENE-3418</a>) whereby a Lucene index could
-easily become corrupted if the OS or computer crashed or lost
-power.</p>
-</li>
-<li>
-<p>Added a new faceting module (<code  class="inline">contrib/facet</code>) for computing facet
-counts (both hierarchical and non-hierarchical) at search
-time (<a href="https://issues.apache.org/jira/browse/LUCENE-3079">LUCENE-3079</a>).</p>
-</li>
-<li>
-<p>Added a new join module (<code  class="inline">contrib/join</code>), enabling indexing and
-searching of nested (parent/child) documents using
-<code  class="inline">BlockJoinQuery</code>/<code  class="inline">Collector</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3171">LUCENE-3171</a>).</p>
-</li>
-<li>
-<p>It is now possible to index documents with term frequencies
-included but without positions (<a href="https://issues.apache.org/jira/browse/LUCENE-2048">LUCENE-2048</a>); previously
-<code  class="inline">omitTermFreqAndPositions</code> always omitted both.</p>
-</li>
-<li>
-<p>The modular <code  class="inline">QueryParser</code> (<code  class="inline">contrib/queryparser</code>) can now create
-<code  class="inline">NumericRangeQuery</code>.</p>
-</li>
-<li>
-<p>Added SynonymFilter, in <code  class="inline">contrib/analyzers</code>, to apply multi-word
-synonyms during indexing or querying, including parsers to read
-the wordnet and solr synonym formats (<a href="https://issues.apache.org/jira/browse/LUCENE-3233">LUCENE-3233</a>).</p>
-</li>
-<li>
-<p>You can now control how documents that don't have a value on the
-sort field should sort (<a href="https://issues.apache.org/jira/browse/LUCENE-3390">LUCENE-3390</a>), using <code  class="inline">SortField.setMissingValue</code>.</p>
-</li>
-<li>
-<p>Fixed a case where term vectors could be silently deleted from the
-index after <code  class="inline">addIndexes</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3402">LUCENE-3402</a>).</p>
-</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>
-<p>SolrJ client can now parse grouped and range facets results
-(<a href="https://issues.apache.org/jira/browse/SOLR-2523">SOLR-2523</a>).</p>
-</li>
-<li>
-<p>A new <code  class="inline">XsltUpdateRequestHandler</code> allows posting XML that's
-transformed by a provided XSLT into a valid Solr document
-(<a href="https://issues.apache.org/jira/browse/SOLR-2630">SOLR-2630</a>).</p>
-</li>
-<li>
-<p>Post-group faceting option (<code  class="inline">group.truncate</code>) can now compute
-facet counts for only the highest ranking documents per-group.
-(<a href="https://issues.apache.org/jira/browse/SOLR-2665">SOLR-2665</a>).</p>
-</li>
-<li>
-<p>Add <code  class="inline">commitWithin</code> update request parameter to all update handlers
-that were previously missing it.  This tells Solr to commit the
-change within the specified amount of time (<a href="https://issues.apache.org/jira/browse/SOLR-2540">SOLR-2540</a>).</p>
-</li>
-<li>
-<p>You can now specify <code  class="inline">NIOFSDirectory</code> (<a href="https://issues.apache.org/jira/browse/SOLR-2670">SOLR-2670</a>).</p>
-</li>
-<li>
-<p>New parameter <code  class="inline">hl.phraseLimit</code> speeds up <code  class="inline">FastVectorHighlighter</code>
-(<a href="https://issues.apache.org/jira/browse/LUCENE-3234">LUCENE-3234</a>).</p>
-</li>
-<li>
-<p>The query cache and filter cache can now be disabled per request.
-See <a href="http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters">this wiki page</a>
-(<a href="https://issues.apache.org/jira/browse/SOLR-2429">SOLR-2429</a>).</p>
-</li>
-<li>
-<p>Improved memory usage, build time, and performance of 
-<code  class="inline">SynonymFilterFactory</code> (<a href="https://issues.apache.org/jira/browse/LUCENE-3233">LUCENE-3233</a>).</p>
-</li>
-<li>
-<p>Added <code  class="inline">omitPositions</code> to the schema, so you can omit position
-information while still indexing term frequencies (<a href="https://issues.apache.org/jira/browse/LUCENE-2048">LUCENE-2048</a>).</p>
-</li>
-<li>
-<p>Various fixes for multi-threaded <code  class="inline">DataImportHandler</code>.</p>
-</li>
-</ul>
-<h2 id="28-july-2011-warning-index-corruption-and-crashes-in-apache-lucene-core-apache-solr-with-java-7">28 July 2011 - WARNING: Index corruption and crashes in Apache Lucene Core / Apache Solr with Java 7</h2>
-<p>Oracle released <a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Java 7</a> today.
-Unfortunately it contains hotspot compiler optimizations, which miscompile some loops.
-This can affect code of several Apache projects. Sometimes JVMs only crash, but in several cases,
-results calculated can be incorrect, leading to bugs in applications
-(see Hotspot bugs <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134">7070134</a>,
-<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7044738">7044738</a>,
-<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7068051">7068051</a>).
-<strong>Apache Lucene Core</strong> and <strong>Apache Solr</strong> are two Apache projects,
-which are affected by these bugs, namely all versions released until today.
-Solr users with the default configuration will have
-Java crashing with <code  class="inline">SIGSEGV</code> as soon as they start to index documents, as one
-affected part is the well-known Porter stemmer
-(see <a href="https://issues.apache.org/jira/browse/LUCENE-3335">LUCENE-3335</a>).
-Other loops in Lucene may be miscompiled, too, leading to index corruption
-(especially on Lucene trunk with pulsing codec; other loops may be
-affected, too - <a href="https://issues.apache.org/jira/browse/LUCENE-3346">LUCENE-3346</a>).
-These problems were detected only 5 days before the official Java 7 release,
-so Oracle had no time to fix those bugs, affecting also many more applications.
-In response to our questions, they proposed to include the fixes into service
-release u2 (eventually into service release u1, see
-<a href="http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2011-July/005971.html">this mail</a>).
-<strong>This means you cannot use Apache Lucene/Solr with Java 7 releases before Update 2!</strong>
-If you do, please don't open bug reports, it is not the committers' fault!
-At least disable loop optimizations using the <code  class="inline">-XX:-UseLoopPredicate</code> JVM option
-to not risk index corruptions.
-<em>Please note:</em> Also Java 6 users are affected, if they use one of those
-JVM options, which are <strong>not</strong> enabled by default: <code  class="inline">-XX:+OptimizeStringConcat</code>
-or <code  class="inline">-XX:+AggressiveOpts</code>.
-It is strongly recommended not to use any hotspot optimization switches in any Java
-version without extensive testing!
-In case you upgrade to Java 7, remember that you may have to reindex, as the unicode
-version shipped with Java 7 changed and tokenization behaves differently
-(e.g. lowercasing). For more information, read <code  class="inline">JRE_VERSION_MIGRATION.txt</code>
-in your distribution package!</p>
-<h2 id="1-july-2011-lucene-core-33-and-solr-33-available">1 July 2011 - Lucene Core 3.3 and Solr 3.3 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability
-of Apache Lucene 3.3 and Apache Solr 3.3.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>The spellchecker module now includes suggest/auto-complete functionality,
-       with three implementations: Jaspell, Ternary Trie, and Finite State.</li>
-<li>Support for merging results from multiple shards, for both "normal"
-       search results (TopDocs.merge) as well as grouped results using the
-       grouping module (SearchGroup.merge, TopGroups.merge).</li>
-<li>An optimized implementation of KStem, a less aggressive stemmer
-       for English</li>
-<li>Single-pass grouping implementation based on block document indexing.</li>
-<li>Improvements to MMapDirectory (now also the default implementation
-       returned by FSDirectory.open on 64-bit Linux).</li>
-<li>NRTManager simplifies handling near-real-time search with multiple
-       search threads, allowing the application to control which indexing
-       changes must be visible to which search requests.</li>
-<li>TwoPhaseCommitTool facilitates performing a multi-resource
-       two-phased commit, including IndexWriter.</li>
-<li>The default merge policy, TieredMergePolicy, has a new method
-       (set/getReclaimDeletesWeight) to control how aggressively it
-       targets segments with deletions, and is now more aggressive than
-       before by default.</li>
-<li>PKIndexSplitter tool splits an index by a mid-point term.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Grouping / Field Collapsing</li>
-<li>A new, automaton-based suggest/autocomplete implementation offering an
-       order of magnitude smaller RAM consumption.</li>
-<li>KStemFilterFactory, an optimized implementation of a less aggressive
-       stemmer for English.</li>
-<li>Solr defaults to a new, more efficient merge policy (TieredMergePolicy).
-       See <a href="http://s.apache.org/merging">http://s.apache.org/merging</a> for more information.</li>
-<li>Important bugfixes, including extremely high RAM usage in spellchecking.</li>
-<li>Bugfixes and improvements from Apache Lucene 3.3</li>
-</ul>
-<h2 id="4-june-2011-lucene-core-32-and-solr-32-available">4 June 2011 - Lucene Core 3.2 and Solr 3.2 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability of Apache Lucene 3.2 and Apache Solr 3.2.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>A new grouping module, under lucene/contrib/grouping, enables
-       search results to be grouped by a single-valued indexed field</li>
-<li>A new IndexUpgrader tool fully converts an old index to the
-       current format.</li>
-<li>A new Directory implementation, NRTCachingDirectory, caches small
-       segments in RAM, to reduce the I/O load for applications with fast
-       NRT reopen rates.</li>
-<li>A new Collector implementation, CachingCollector, is able to
-       gather search hits (document IDs and optionally also scores) and
-       then replay them.  This is useful for Collectors that require two
-       or more passes to produce results.</li>
-<li>Index a document block using IndexWriter's new addDocuments or
-       updateDocuments methods.  These experimental APIs ensure that the
-       block of documents will forever remain contiguous in the index,
-       enabling interesting future features like grouping and joins.</li>
-<li>A new default merge policy, TieredMergePolicy, which is more
-       efficient due to being able to merge non-contiguous segments.
-       See <a href="http://s.apache.org/merging">http://s.apache.org/merging</a> for details.</li>
-<li>NumericField is now returned correctly when you load a stored
-       document (previously you received a normal Field back, with the
-       numeric value converted to a string).</li>
-<li>Deleted terms are now applied during flushing to the newly flushed
-       segment, which is more efficient than having to later initialize a
-       reader for that segment.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Ability to specify overwrite and commitWithin as request parameters when 
-       using the JSON update format.</li>
-<li>TermQParserPlugin, useful when generating filter queries from terms 
-       returned from field faceting or the terms component.</li>
-<li>DebugComponent now supports using a NamedList to model Explanation objects 
-       in its responses instead of Explanation.toString.</li>
-<li>Improvements to the UIMA and Carrot2 integrations.</li>
-<li>Highlighting performance improvements. </li>
-<li>A test-framework jar for easy testing of Solr extensions.</li>
-<li>Bugfixes and improvements from Apache Lucene 3.2. </li>
-</ul>
-<h2 id="31-march-2011-lucene-core-31-and-solr-31-available">31 March 2011 - Lucene Core 3.1 and Solr 3.1 Available</h2>
-<p>The Lucene PMC is pleased to announce the availability of Apache Lucene 3.1 and Apache Solr 3.1.
-The version number for Solr 3.1 was chosen to reflect the merge of
-development with Lucene, which is currently also on 3.1.  Going
-forward, we expect the Solr version to be the same as the Lucene
-version.  Solr 3.1 contains Lucene 3.1 and is the release after Solr 1.4.1.</p>
-<p>Lucene can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">http://www.apache.org/dyn/closer.cgi/lucene/java/</a> and Solr can be downloaded from <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/">http://www.apache.org/dyn/closer.cgi/lucene/solr/</a>
-Highlights of the Lucene release include:</p>
-<ul>
-<li>Numerous performance improvements: faster exact PhraseQuery; merging
-favors segments with deletions; primary key lookup is faster;
-IndexWriter.addIndexes(Directory[]) uses file copy instead of
-merging; various Directory performance improvements; compound file
-is dynamically turned off for large segments; fully deleted segments
-are dropped on commit; faster snowball analyzers (in contrib);
-ConcurrentMergeScheduler is more careful about setting priority of
-merge threads.</li>
-<li>ReusableAnalyzerBase makes it easier to reuse TokenStreams
-correctly.</li>
-<li>Improved Analysis capabilities: Improved Unicode support, including
-Unicode 4, more friendly term handling (CharTermAttribute), easier
-object reuse and better support for protected words in lossy token
-filters (e.g. stemmers).</li>
-<li>ConstantScoreQuery now allows directly wrapping a Query.</li>
-<li>IndexWriter is now configured with a new separate builder API,
-IndexWriterConfig.  You can now control IndexWriter's previously
-fixed internal thread limit by calling setMaxThreadStates.</li>
-<li>IndexWriter.getReader is replaced by IndexReader.open(IndexWriter).
-In addition you can now specify whether deletes should be resolved
-when you open an NRT reader.</li>
-<li>MultiSearcher is deprecated; ParallelMultiSearcher has been
-absorbed directly into IndexSearcher.</li>
-<li>On 64bit Windows and Solaris JVMs, MMapDirectory is now the
-default implementation (returned by FSDirectory.open).
-MMapDirectory also enables unmapping if the JVM supports it.</li>
-<li>New TotalHitCountCollector just counts total number of hits.</li>
-<li>ReaderFinishedListener API enables external caches to evict entries
-once a segment is finished.</li>
-</ul>
-<p>Highlights of the Solr release include:</p>
-<ul>
-<li>Numeric range facets (similar to date faceting).</li>
-<li>New spatial search, including spatial filtering, boosting and sorting capabilities.</li>
-<li>Example Velocity driven search UI at http://localhost:8983/solr/browse</li>
-<li>A new termvector-based highlighter</li>
-<li>Extended dismax (edismax) query parser which addresses some
-missing features in the dismax query parser along with some
-extensions.</li>
-<li>Several more components now support distributed mode:
-TermsComponent, SpellCheckComponent.</li>
-<li>A new Auto Suggest component.</li>
-<li>Ability to sort by functions.</li>
-<li>JSON document indexing.</li>
-<li>CSV response format.</li>
-<li>Apache UIMA integration for metadata extraction.</li>
-<li>Leverages Lucene 3.1 and its inherent optimizations and bug fixes
-as well as new analysis capabilities.</li>
-<li>Numerous improvements, bug fixes, and optimizations.</li>
-</ul></div>
+<p>The largest set of features goes by the development code-name "Solr Cloud" and involves bringing easy scalability to Solr.  See http://wiki.apache.org/solr/SolrCloud for more details.
+ - Distributed indexing designed from the ground up for near real-time (NRT) and NoSQL features such as realtime-get, optimistic locking, and durable updates.
+ - High availability with no single points of failure.
+ - Apache Zookeeper integration for distributed coordination and cluster metadata and configuration storage.
+ - Immunity to split-brain issues due to Zookeeper's Paxos distributed consensus protocols.
+ - Updates sent to any node in the cluster are automatically forwarded to the correct shard and replicated to multiple nodes for redundancy.
+ - Queries sent to any node automatically perform a full distributed search across the cluster with load balancing and fail-over.</p>
+<p>Solr 4.0-alpha includes more NoSQL features for those using Solr as a primary data store:
+ - Update durability - A transaction log ensures that even uncommitted documents are never lost.
+ - Real-time Get - The ability to quickly retrieve the latest version of a document, without the need to commit or open a new searcher.
+ - Versioning and Optimistic Locking - combined with real-time get, this allows read-update-write functionality that ensures no conflicting changes were made concurrently by other clients.
+ - Atomic updates - the ability to add, remove, change, and increment fields of an existing document without having to send in the complete document again.</p>
+<p>There are many other features coming in Solr 4, such as
+ - Pivot Faceting - Multi-level or hierarchical faceting where the top constraints for one field are found for each top constraint of a different field.
+ - Pseudo-fields - The ability to alias fields, or to add metadata along with returned documents, such as function query values and results of spatial distance calculations.
+ - A spell checker implementation that can work directly from the main index instead of creating a sidecar index.
+ - Pseudo-Join functionality - The ability to select a set of documents based on their relationship to a second set of documents.
+ - Function query enhancements including conditional function queries and relevancy functions.
+ - New update processors to facilitate modifying documents prior to indexing.
+ - A brand new web admin interface, including support for SolrCloud.</p></div>