Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2006/02/11 08:51:58 UTC

[Solr Wiki] Update of "SolrConfigXml" by HossMan

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by HossMan:
http://wiki.apache.org/solr/SolrConfigXml

The comment on the change is:
initial import from CNET's PI/Solarconfigxmltemplate wiki page

New page:
= solrconfig.xml =

solrconfig.xml is the file that contains most of the parameters for configuring SOLR itself. 

/!\ Package names need to be updated

/!\ Should link to a sample file.

[[TableOfContents]]

/!\ Need section mentioning indexDir and indexDefaults ... existing section in CNET wiki was jacked up somehow.


== mainIndex Parameters ==

The values in this section control the merging of multiple index segments. (Disregard the `indexDefaults` section above.) See the `mergeFactor` Considerations section on the SolrPerformanceFactors doc for more details.
{{{
  <mainIndex>
    <!-- lucene options specific to the main on-disk lucene index -->
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
  </mainIndex>
}}}
The `maxMergeDocs` parameter tells Lucene not to allow any segment to contain more docs than the stipulated value, but to create a new segment instead.
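
As a rough sketch of the effect, lowering `maxMergeDocs` caps the size of any single segment. The value 100000 below is purely illustrative and not a recommendation; the other settings are copied unchanged from the example above.

{{{
  <mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
    <!-- illustrative value only: no segment may hold more than
         100000 docs; a new segment is created instead -->
    <maxMergeDocs>100000</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
  </mainIndex>
}}}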

== Update Handler Section ==

You can list multiple event listeners in the updateHandler, but do not change
`updateHandler class="solar.DirectUpdateHandler2"`.

{{{
  <updateHandler class="solar.DirectUpdateHandler2">

    <!-- autocommit pending docs if certain criteria are met -->
    <autocommit>  <!-- NOTE: autocommit not implemented yet -->
      <maxDocs>10000</maxDocs>
      <maxSec>3600</maxSec>
    </autocommit>

    <!-- represents a lower bound on the frequency that commits 
         may occur (in seconds). NOTE: not yet implemented
    -->
    <commitIntervalLowerBound>0</commitIntervalLowerBound>

    <!-- The RunExecutableListener executes an external command.
         exe  - the name of the executable to run
         dir  -  dir to use as the current working directory. default="."
         wait - the calling thread waits until the executable returns. 
                default="true"
         args - the arguments to pass to the program.  default=nothing
         env  - environment variables to set.  default=nothing
      -->
    <!-- A postCommit event is fired after every commit
      -->
    <listener event="postCommit" class="solar.RunExecutableListener">
      <str name="exe">/var/opt/resin3/__PORT__/scripts/solar/snapshooter</str>
      <str name="dir">/var/opt/resin3/__PORT__</str>
      <bool name="wait">true</bool>
      <!--
      <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
      <arr name="env"> <str>MYVAR=val1</str> </arr>
        -->
    </listener>
  </updateHandler>
}}}
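
Since the updateHandler can hold more than one listener, a second one might look like the sketch below, with the optional `args` and `env` entries enabled. The executable path, arguments, and environment variable are hypothetical placeholders and do not come from a real deployment; the element would sit inside `<updateHandler>` next to the listener shown above.

{{{
    <!-- hypothetical example: exe, args and env values are placeholders -->
    <listener event="postCommit" class="solar.RunExecutableListener">
      <str name="exe">/path/to/my_script</str>
      <str name="dir">.</str>
      <!-- do not make the calling thread wait for the script to return -->
      <bool name="wait">false</bool>
      <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
      <arr name="env"> <str>MYVAR=val1</str> </arr>
    </listener>
}}}

Setting `wait` to false here is only one possible choice. The default of true is safer when later steps depend on the script having finished.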
 
== The Query Section ==

Controls everything query-related.

{{{
  <query>
    <!-- Maximum number of clauses in a boolean query... can affect range 
         or wildcard queries that expand to big boolean queries.  
         An exception is thrown if exceeded.
    -->
    <maxBooleanClauses>1024</maxBooleanClauses>
}}}
 
=== Caching Section ===

You can change these caching parameters as your index grows and changes. See the SolrCaching page for details on configuring the caches.

{{{
    <!-- Internal cache used by SolarIndexSearcher for filters (DocSets),
         unordered sets of *all* documents that match a query.
         When a new searcher is opened, its caches may be prepopulated
         or "autowarmed" using data from caches in the old searcher.
         autowarmCount is the number of items to prepopulate.  For LRUCache,
         the prepopulated items will be the most recently accessed items.
      -->
    <filterCache
      class="solar.search.LRUCache"
      size="16384"
      initialSize="4096"
      autowarmCount="4096"/>

    <!-- queryResultCache caches results of searches - ordered lists of
         document ids (DocList) based on a query, a sort, and the range
         of documents requested.
      -->
    <queryResultCache
      class="solar.search.LRUCache"
      size="16384"
      initialSize="4096"
      autowarmCount="1024"/>

   <!-- documentCache caches Lucene Document objects (the stored fields for
        each document).
    -->
    <documentCache
      class="solar.search.LRUCache"
      size="16384"
      initialSize="16384"/>

 <!-- Example of a generic cache.  These caches may be accessed by name 
      through SolarIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). 
      The purpose is to enable easy caching of user/application level data. 
      The regenerator argument should be specified as an implementation of
      solar.search.CacheRegenerator if autowarming is desired.
   -->
    <!--
    <cache name="dfllNode"
      class="solar.search.LRUCache"
      size="4096"
      initialSize="2048"
      autowarmCount="4096"
      regenerator="cnwk.dfll.plugin.DfllSolarRequestHandler$Regenerator"/>
    -->

    <!-- An optimization that attempts to use a filter to satisfy a search.
         If the requested sort does not include a score, then the filterCache 
         will be checked for a filter matching the query.  If found, the filter 
         will be used as the source of document ids, and then the sort will be 
         applied to that.
      -->
    <useFilterForSortedQuery>true</useFilterForSortedQuery>

    <!-- An optimization for use with the queryResultCache.  When a search
         is requested, a superset of the requested number of document ids
         is collected.  For example, if a search for a particular query
         requests matching documents 10 through 19, and queryResultWindowSize
         is 50, then documents 0 through 49 will be collected and cached. Any
         further requests in that range can be satisfied via the cache.
    -->
    <queryResultWindowSize>50</queryResultWindowSize>

    <!-- This entry enables an int hash representation for filters (DocSets) 
         when the number of items in the set is less than maxSize. For smaller 
         sets, this representation is more memory efficient, more efficient to 
         iterate over, and faster to take intersections.
     -->
    <HashDocSet maxSize="3000" loadFactor="0.75"/>


    <!-- boolToFilterOptimizer converts boolean clauses with zero boost
         into cached filters if the number of docs selected by the clause
         exceeds the threshold (represented as a fraction of the total index)
    -->
    <boolTofilterOptimizer enabled="true" cacheSize="32" threshold=".05"/>
}}}
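
To make the window behaviour concrete, here is a sketch with a smaller window. The value 20 is purely illustrative. Going by the description in the comment above, a request for matching documents 10 through 19 would then collect and cache the first 20 matching documents, so a follow-up request for documents 0 through 9 could be answered from the queryResultCache.

{{{
    <!-- illustrative value only: collect results in windows of 20 -->
    <queryResultWindowSize>20</queryResultWindowSize>
}}}

A larger window trades memory per cached entry for fewer repeated searches when users page through results.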

== Searcher Section ==

Use this section to define listeners for particular searcher events. These '''listener events''' let you fire off special code, such as code that invokes some common queries to warm up the caches. You can have a great number of individual queries in both Searcher sections (for example, queries for objects that you know will ALWAYS be requested) so that they are warmed in every new searcher.

=== New Searcher ===

Use this section to define a listener for one particular event: the New Searcher event. A new Searcher is opened when a (current) Searcher already exists. In the example below, the listener is of the class !QuerySenderListener, which takes lists of queries and sends them to the new searcher being opened, thereby warming it.

{{{
    <!-- a newSearcher event is fired whenever a new searcher is being
         prepared and there is a current searcher handling requests 
         (aka registered). 
     -->
    <!-- QuerySenderListener takes an array of NamedList and 
         executes a local query request for each NamedList in sequence.
     -->
    <!--
    <listener event="newSearcher" class="solar.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">solar</str> 
              <str name="start">0</str>
              <str name="rows">10</str> 
        </lst>
        <lst> <str name="q">rocks</str> 
              <str name="start">0</str>
              <str name="rows">10</str> 
        </lst>
      </arr>
    </listener>
    -->
}}}

=== First Searcher ===

Use this section to define a listener for the First Searcher event. A First Searcher is opened when there is _no_ existing (current) Searcher. In the example below, the listener is of the class QuerySenderListener, which takes lists of queries and sends them to the new searcher being opened, thereby warming it. (If there is no Searcher, you cannot use auto-warming, because auto-warming requires an existing Searcher.)

{{{
    <!-- a firstSearcher event is fired whenever a new searcher is being
         prepared but there is no current registered searcher to handle 
         requests or to gain prewarming data from.
     -->
    <!--
    <listener event="firstSearcher" class="solar.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">fast_warm</str> 
              <str name="start">0</str>
              <str name="rows">10</str>
        </lst>
      </arr>
    </listener>
    -->
  </query>
}}}

== Request Handler Plug-in Section ==

This is where multiple request handlers can be registered.

{{{
<!-- requestHandler plugins... incoming queries will be dispatched to 
     the correct handler based on the qt (query type) param matching the 
     name of registered handlers. The "standard" request handler is the 
     default and will be used if qt is not specified in the request.
  -->
  <requestHandler name="standard" class="solar.StandardRequestHandler" />
  <requestHandler name="old" class="solar.tst.OldRequestHandler" />
  <requestHandler name="test" class="solar.tst.TestRequestHandler" />  
}}}
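
A sketch of registering one more handler alongside those above. The name `custom` and the class `com.example.MyRequestHandler` are hypothetical and stand in for whatever handler you provide. Per the comment above, a request whose `qt` parameter is `custom` would be dispatched to it, while requests without `qt` would still go to `standard`.

{{{
  <!-- hypothetical handler: name and class are placeholders -->
  <requestHandler name="custom" class="com.example.MyRequestHandler" />
}}}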
 
== The GUI Section ==

This section handles the administration web page.

Defines "gettable" files, that is, files that can be accessed through the web interface. Also specifies the default search to be filled in on the admin form, and what the "ping" query should be for monitoring the health of the index.

{{{
  <admin>
    <defaultQuery>solar</defaultQuery>
    <gettableFiles>
         solrconfig.xml conf/solar/WEB-INF/web.external.xml
         conf/resin.conf start
    </gettableFiles> 
    <pingQuery>q=solr&amp;version=2.0&amp;start=0&amp;rows=0</pingQuery>
  </admin>
}}}