Posted to commits@jackrabbit.apache.org by Apache Wiki <wi...@apache.org> on 2007/05/07 16:39:46 UTC

[Jackrabbit Wiki] Update of "Clustering" by DominiquePfister

The following page has been changed by DominiquePfister:
http://wiki.apache.org/jackrabbit/Clustering

New page:
= Clustering =

Clustering support was added in Jackrabbit 1.2.1. It works as follows: content is shared between all cluster nodes. Every change made by one cluster node is recorded in a journal, which can be either file based or stored in a database.

== Prerequisites ==

In order to cluster some repository nodes, the following prerequisites must be met:

 * The persistence managers must store their data in the same, globally accessible location
 * Every cluster node must be assigned a unique ID
 * A journal type must be chosen, either based on files or stored in a database

Let's look at these prerequisites in more detail:

=== Persistence manager configuration ===

For performance reasons, only information identifying the modified items is stored in the journal. This implies that all cluster nodes must have access to the items' actual content. Since file system based persistence managers are not transactional, the cluster must use persistence managers that store their data in a database running standalone. The following sample shows a workspace's persistence manager configuration using an Oracle database:

{{{
<PersistenceManager class="org.apache.jackrabbit.core.persistence.db.OraclePersistenceManager">
  <param name="url" value="jdbc:oracle:thin:@myhost:1521:mydb" />
  <param name="user" value="scott"/>
  <param name="password" value="tiger"/>
  <param name="schemaObjectPrefix" value="${wsp.name}_"/>
  <param name="externalBLOBs" value="false"/>
</PersistenceManager>
}}}

Since the file system BLOB store uses a repository local directory and is not transactional, one should set the parameter '''externalBLOBs''' to ''false'' in order to store BLOBs in the database as well.

=== Unique cluster node ID ===

Every cluster node needs a unique ID that identifies which node made a given change. This ID can be specified either as the '''id''' attribute in the cluster configuration or as the value of the system property '''org.apache.jackrabbit.core.cluster.node_id'''. When copying repository configurations, do not forget to adapt the cluster node IDs if they are hardcoded. See below for some sample cluster configurations.
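
As a minimal sketch of the second option, the '''id''' attribute can be left out of the cluster configuration and the ID supplied when the JVM is started. The journal settings and the ID value ''node2'' below are placeholders chosen for illustration, not recommended values:

{{{
<!-- No id attribute here: the cluster node ID is taken from the system property,
     e.g. by starting the JVM with -Dorg.apache.jackrabbit.core.cluster.node_id=node2 -->
<Cluster>
  <Journal class="org.apache.jackrabbit.core.journal.FileJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="directory" value="/nfs/myserver/myjournal" />
  </Journal>
</Cluster>
}}}

With this approach the same configuration file can, in principle, be copied to every cluster node unchanged, since the ID is no longer hardcoded in it.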

=== Journal type ===

The cluster nodes store information identifying the items they modified in a journal. Like the persistence managers' data, this journal must be globally available to all nodes in the cluster. It can be either a folder in the file system or a database running standalone.

==== File journal ====

The file journal is configured through the following properties:

 * '''revision''': location of the cluster node's revision file
 * '''directory''': location of the journal folder

==== Database journal ====

The database journal is configured through the following properties:

 * '''revision''': location of the cluster node's revision file
 * '''driver''': JDBC driver class name
 * '''url''': JDBC URL
 * '''user''': user name
 * '''password''': password

== Sample cluster configuration ==

This section contains some sample cluster configurations. The first uses a file based journal implementation, with the journal files created in a share exported by NFS:

{{{
<Cluster id="node1">
  <Journal class="org.apache.jackrabbit.core.journal.FileJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="directory" value="/nfs/myserver/myjournal" />
  </Journal>
</Cluster>
}}}
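
A second cluster node sharing this journal would, as a sketch, use the same '''directory''' but its own '''id'''. The ID ''node2'' is an assumed example value. The revision file stays under the node's own repository home:

{{{
<Cluster id="node2">
  <Journal class="org.apache.jackrabbit.core.journal.FileJournal">
    <!-- per-node revision file, kept in this node's repository home -->
    <param name="revision" value="${rep.home}/revision.log" />
    <!-- same shared journal folder as node1 -->
    <param name="directory" value="/nfs/myserver/myjournal" />
  </Journal>
</Cluster>
}}}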

In the next configuration, the journal is stored in an Oracle database:

{{{
<Cluster id="node1">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="driver" value="oracle.jdbc.driver.OracleDriver" />
    <param name="url" value="jdbc:oracle:thin:@myhost:1521:mydb" />
    <param name="user" value="scott"/>
    <param name="password" value="tiger"/>
  </Journal>
</Cluster>
}}}

'''Note''': the journal implementation classes were refactored in Jackrabbit 1.3. In earlier versions, journal implementations resided in the package ''org.apache.jackrabbit.core.cluster''.
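
For a Jackrabbit version before 1.3, the file journal sample above would then presumably reference the class in the old package. This is only a sketch: it assumes the class kept the simple name ''FileJournal'' and accepts the same parameters, which should be checked against the release in use:

{{{
<Cluster id="node1">
  <!-- assumed pre-1.3 class name: old package, same simple class name -->
  <Journal class="org.apache.jackrabbit.core.cluster.FileJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="directory" value="/nfs/myserver/myjournal" />
  </Journal>
</Cluster>
}}}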