Posted to commits@lucene.apache.org by "Cassandra Targett (Confluence)" <co...@apache.org> on 2013/09/18 20:35:00 UTC

[CONF] Apache Solr Reference Guide > Solr Glossary

Space: Apache Solr Reference Guide (https://cwiki.apache.org/confluence/display/solr)
Page: Solr Glossary (https://cwiki.apache.org/confluence/display/solr/Solr+Glossary)


Edited by Cassandra Targett:
---------------------------------------------------------------------
Where possible, terms are linked to relevant parts of the Solr Reference Guide for more information.

----
*Jump to a letter:*

[A|#A] [B|#B] [C|#C] [D|#D] [E|#E] [F|#F] G H [I|#I] J K [L|#L] [M|#M] [N|#N] [O|#O] P [Q|#Q] [R|#R] [S|#S] [T|#T] U V [W|#W] X Y [Z|#Z]


h2. A

h6. [Atomic updates|Updating Parts of Documents#Atomic Updates]
An approach to updating one or more fields of a document without reindexing the entire document.
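As a rough illustration, the sketch below builds the kind of JSON body an atomic update might use, where each changed field maps to a modifier and a value. The field names ({{price}}, {{copies_sold}}), the document id, and the helper {{atomic_update}} are hypothetical; the {{set}} and {{inc}} modifiers are assumed to be the ones described in the linked section, and nothing is sent to a server here.

```python
import json

def atomic_update(doc_id, **changes):
    """Build an update document where each field maps to {modifier: value}.

    Hypothetical helper for illustration; it only builds the payload.
    """
    doc = {"id": doc_id}  # assumes the unique key field is named "id"
    for field, (modifier, value) in changes.items():
        doc[field] = {modifier: value}
    return doc

# Replace one field and increment another; other fields are not mentioned.
update = atomic_update("book-1", price=("set", 12.5), copies_sold=("inc", 1))
payload = json.dumps([update])
print(payload)
```

Only the fields named in the update appear in the payload, which is the point of the approach.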

h2. B

h6. Boolean operators
These control the inclusion or exclusion of keywords in a query by using operators such as AND, OR, and NOT.
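The effect of the three operators can be sketched with set operations over a toy mapping from terms to the documents that contain them. The data and variable names are made up for illustration; this is not how a query parser is implemented.

```python
# Toy postings: term -> set of document ids containing it (made-up data).
postings = {
    "solr": {1, 2, 3},
    "lucene": {2, 3, 4},
    "zookeeper": {3, 5},
}

both = postings["solr"] & postings["lucene"]        # solr AND lucene
either = postings["solr"] | postings["zookeeper"]   # solr OR zookeeper
without = postings["solr"] - postings["zookeeper"]  # solr NOT zookeeper

print(sorted(both), sorted(either), sorted(without))
```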


h2. C

h6. Cluster
In Solr, a cluster is a set of Solr nodes managed as a unit. A cluster may contain many cores, collections, shards, and/or replicas. See also [#SolrCloud].

h6. Collection
In Solr, one or more documents grouped together in a single logical index. A collection must have a single schema, but can be spread across multiple cores.

In [#ZooKeeper], a group of cores managed together as part of a SolrCloud installation. 

h6. Commit
To make document changes permanent in the index. Added documents become searchable after a _commit_.

h6. Core
An individual Solr instance (represents a logical index). Multiple cores can run on a single node. See also [#SolrCloud].

h6. Core reload
To re-initialize Solr after changes to {{schema.xml}}, {{solrconfig.xml}} or other configuration files.

h2. D

h6. Distributed search
A search in which queries are processed across more than one [shard|#Shard].

h6. Document
One or more fields and their values that are considered related for indexing. See also [Field|#Field].

h2. E

h6. Ensemble
A [#ZooKeeper] term to indicate multiple ZooKeeper instances running simultaneously.

h2. F

h6. Facet
The arrangement of search results into categories based on indexed terms.
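A minimal sketch of the idea, assuming a hypothetical {{category}} field and a made-up result list: the results are grouped by the value of one field and each group is counted.

```python
from collections import Counter

# Made-up search results; "category" is a hypothetical field name.
results = [
    {"id": 1, "category": "book"},
    {"id": 2, "category": "music"},
    {"id": 3, "category": "book"},
    {"id": 4, "category": "film"},
]

# Count how many matching documents fall into each category value.
facet_counts = Counter(doc["category"] for doc in results)
print(facet_counts.most_common())
```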

h6. Field
The content to be indexed/searched along with metadata defining how the content should be processed by Solr.


h2. I

h6. Inverse document frequency (IDF)
A measure of the general importance of a term. In its basic form, it is calculated as the total number of documents in the collection divided by the number of documents in which a particular term occurs. See [http://en.wikipedia.org/wiki/Tf-idf] and [http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/scoring.html] for more information on TF-IDF based scoring and Lucene scoring in particular. See also [#Term frequency].
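The ratio described above can be sketched as follows. The function names are hypothetical, and the logarithmic variant is shown only because it is a common convention in TF-IDF descriptions; it is not claimed to be the exact formula Lucene uses.

```python
import math

def idf_ratio(total_docs, docs_with_term):
    """Plain ratio: total documents / documents containing the term."""
    return total_docs / docs_with_term

def idf_log(total_docs, docs_with_term):
    """Log-damped variant, a common convention (an assumption here)."""
    return math.log(total_docs / docs_with_term)

# A rare term scores higher than a common one.
rare = idf_ratio(1000, 10)     # term appears in 10 of 1000 documents
common = idf_ratio(1000, 500)  # term appears in 500 of 1000 documents
print(rare, common)
```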

h6. Inverted index
A way of creating a searchable index that lists every word and the documents that contain those words, similar to an index in the back of a book which lists words and the pages on which they can be found.  When performing keyword searches, this method is considered more efficient than the alternative, which would be to create a list of documents paired with every word used in each document. Since users search using terms they expect to be in documents, finding the term before the document saves processing resources and time.
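The structure can be sketched in a few lines. This is a toy version under simple assumptions (lowercasing and whitespace splitting, made-up documents) and not a description of how Lucene stores its index.

```python
from collections import defaultdict

# Made-up documents keyed by id.
docs = {
    1: "solr is a search server",
    2: "lucene is a search library",
    3: "solr uses lucene",
}

def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

index = build_inverted_index(docs)
print(sorted(index["solr"]))
print(sorted(index["search"]))
```

A keyword lookup is then a single dictionary access on the word, rather than a scan over every document.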

h2. L

h6. Leader
The replica of each shard that takes charge of routing document adds, updates, or deletes to the other replicas of that shard. See also [#SolrCloud].

h2. M

h6. Metadata
Literally, _data about data_. Metadata is information about a document, such as its title, author, or location.

h2. N

h6. Natural language query
A search that is entered as a user would normally speak or write, as in, "What is aspirin?"

h6. Node
A JVM instance running Solr. Also known as a Solr server.

h2. O

h6. [Optimistic concurrency|Updating Parts of Documents#Optimistic Concurrency]
Also known as "optimistic locking", this is an approach that allows for updates to documents currently in the index while retaining locking or version control.
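The general pattern can be sketched with a toy in-memory store. The {{Store}} class, its methods, and the {{VersionConflict}} exception are hypothetical and are not Solr's API; the sketch only shows the idea that a write carries the version the writer last saw and is rejected if the document has changed since.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""

class Store:
    """Toy in-memory document store with per-document version numbers."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, fields)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, fields, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller's view of the document is stale.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, fields)
        return new_version

store = Store()
v1 = store.put("a", {"title": "first"})
v2 = store.put("a", {"title": "second"}, expected_version=v1)

try:
    # A second writer still holding v1 is rejected instead of overwriting.
    store.put("a", {"title": "stale"}, expected_version=v1)
    conflict = False
except VersionConflict:
    conflict = True
print(v1, v2, conflict)
```

No lock is held between the read and the write; a conflict is detected only at write time, and the caller is expected to re-read and retry.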

h6. Overseer
The name of the SolrCloud process that coordinates the cluster. It keeps track of existing nodes and shards, and assigns shards to nodes. See also [#SolrCloud].

h2. Q

h6. Query parser
A query parser processes the terms entered by a user.


h2. R

h6. Recall
The ability of a search engine to retrieve _all_ of the possible matches to a user's query.

h6. Relevance
The appropriateness of a document to the search conducted by the user.

h6. Replica
A copy of a shard or single logical index, for use in failover or load balancing. 

h6. [Replication|solr:Index Replication]
A method of copying a master index from one server to one or more "slave" or "child" servers. 

h6. [RequestHandler|solr:RequestHandlers and SearchComponents in SolrConfig]
Logic and configuration parameters that tell Solr how to handle incoming "requests", whether the requests are to return search results, to index documents, or to handle other custom situations.

h2. S

h6. [SearchComponent|solr:RequestHandlers and SearchComponents in SolrConfig]
Logic and configuration parameters used by request handlers to process query requests. Examples of search components include faceting, highlighting, and "more like this" functionality.

h6. Shard
In SolrCloud, a logical section of a single collection. This may be spread across multiple nodes. See also [#SolrCloud].

h6. [solr:SolrCloud]
Umbrella term for a suite of functionality in Solr which allows managing a cluster of Solr servers for scalability, fault tolerance, and high availability.

h6. Solr Schema (schema.xml)
The Apache Solr index schema. The schema defines the fields to be indexed and the type for each field (text, integers, etc.). The schema is stored in schema.xml and is located in the Solr home conf directory.

h6. SolrConfig (solrconfig.xml)
The Apache Solr configuration file. Defines indexing options, RequestHandlers, highlighting, spellchecking and various other configurations. The file, solrconfig.xml, is located in the Solr home conf directory.

h6. Spell Check
The ability to suggest alternative spellings of search terms to a user, as a check against spelling errors causing few or zero results. 

h6. Stopwords
Generally, words that have little meaning to a user's search but which may have been entered as part of a [natural language query|#Natural language query]. Stopwords are generally very small pronouns, conjunctions and prepositions (such as "the", "with", or "and").
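A minimal sketch of stopword removal, assuming a small hand-picked stopword set and simple whitespace tokenization. The set and the {{remove_stopwords}} helper are hypothetical and do not reflect any particular Solr configuration.

```python
# Hand-picked stopword set for illustration only.
STOPWORDS = {"the", "with", "and", "is", "what"}

def remove_stopwords(query):
    """Lowercase the query, strip basic punctuation, and drop stopwords."""
    words = (w.strip("?.,!") for w in query.lower().split())
    return [w for w in words if w and w not in STOPWORDS]

kept = remove_stopwords("What is aspirin?")
print(kept)
```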

h6. [solr:Suggester]
Functionality in Solr that provides the ability to suggest possible query terms to users as they type.

h6. Synonyms
Synonyms generally are terms which are near to each other in meaning and may substitute for one another. In a search engine implementation, synonyms may be abbreviations as well as words, or terms that are not consistently hyphenated. Examples of synonyms in this context would be "Inc." and "Incorporated" or "iPod" and "i-pod".

h2. T

h6. Term frequency
The number of times a word occurs in a given document. See [http://en.wikipedia.org/wiki/Tf-idf] and [http://lucene.apache.org/java/2_3_2/scoring.html] for more information on TF-IDF based scoring and Lucene scoring in particular.
See also [#Inverse document frequency (IDF)].
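Counting term frequency for a single document can be sketched as below, assuming lowercasing and whitespace tokenization; the {{term_frequencies}} helper is hypothetical.

```python
from collections import Counter

def term_frequencies(text):
    """Count how many times each word occurs in one document."""
    return Counter(text.lower().split())

tf = term_frequencies("to be or not to be")
print(tf["to"], tf["or"], tf["missing"])
```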

h6. Transaction log
An append-only log of write operations maintained by each node. This log is only required with SolrCloud implementations and is created and managed automatically by Solr.

h2. W

h6. Wildcard
A wildcard allows a substitution of one or more letters of a word to account for possible variations in spelling or tenses.
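Wildcard matching against a list of indexed terms can be sketched with Python's {{fnmatch}} module, assuming {{*}} stands for zero or more characters and {{?}} for exactly one. The term list and the {{expand}} helper are made up, and {{fnmatch}} also gives square brackets a special meaning that a search engine's wildcard syntax may not share.

```python
from fnmatch import fnmatchcase

# Made-up vocabulary of indexed terms.
terms = ["test", "tests", "tester", "text", "toast"]

def expand(pattern, vocabulary):
    """Return the terms that match a wildcard pattern, in vocabulary order."""
    return [t for t in vocabulary if fnmatchcase(t, pattern)]

trailing = expand("test*", terms)  # any suffix, including none
single = expand("te?t", terms)     # exactly one character in the middle
print(trailing, single)
```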

h2. Z

h6. ZooKeeper
Also known as [Apache ZooKeeper|http://zookeeper.apache.org/]. The system used by SolrCloud to keep track of configuration files and node names for a cluster. A ZooKeeper cluster is used as the central configuration store for the cluster, a coordinator for operations requiring distributed synchronization, and the system of record for cluster topology. See also [#SolrCloud].

{scrollbar}

