Posted to users@jackrabbit.apache.org by AdamR <ad...@runbox.com> on 2009/08/25 13:01:04 UTC

Clustering with database replication

Hi all,

I've been playing with an unusual clustering model and would like your
feedback. It seems to me that my solution may be applicable to a wide range
of situations - assuming there are no glaring problems that I have not
thought of.

Constraints:
1) There is only one read/write cluster node (master) and many read-only
nodes (slaves)
2) It is not practical for the cluster nodes to share a common database or
file system

The second constraint is the more significant one, as Jackrabbit clustering
requires all nodes to share a common DB or file system for persistence
manager storage and the cluster journal. This is not desirable for my
application. At worst it introduces a single point of failure; at best it
adds unwanted complexity in configuring a database/SAN cluster to guarantee
that the shared storage will always be available.

My solution is to store as much as possible - the PM storage, datastore,
cluster journal etc - in a database, and use database replication to keep
the Jackrabbit slaves up-to-date with the Jackrabbit master.

Every node in the cluster is configured to look only in its local database;
it has no dependencies on any other node or any shared storage. Each slave
Jackrabbit has an exact copy of the master's cluster journal, allowing it to
keep its indexes up-to-date whenever changes are made.

I've been testing this using MySQL and it appears to work well. My only
worry is around the management of the cluster journal. On the master, the
local_revisions table contains one row for itself, which obviously always
contains the latest revision ID. The table does not contain any data for the
other cluster nodes (the MySQL replication only works in one direction),
which it has no knowledge about. As far as I can tell this table is only
used to determine which items from the journal need to be processed on each
node, therefore so long as each node's copy of the local_revisions table has
a row for itself, it should be fine?
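
To make the assumption above concrete, here is a minimal toy model in Python.
It is not Jackrabbit code: NodeDatabase, pending_entries and sync are
hypothetical names, and the rule "process every journal entry newer than this
node's own row in local_revisions" is the behaviour assumed in this message,
not something confirmed against the Jackrabbit source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NodeDatabase:
    """Hypothetical stand-in for the database local to one cluster node."""
    # revision id -> change record (models the cluster journal table)
    journal: Dict[int, str] = field(default_factory=dict)
    # node id -> last revision that node has processed (models local_revisions)
    local_revisions: Dict[str, int] = field(default_factory=dict)


def pending_entries(db: NodeDatabase, node_id: str) -> List[Tuple[int, str]]:
    """Return journal entries newer than this node's own local revision."""
    last_seen = db.local_revisions.get(node_id, 0)
    return [(rev, db.journal[rev]) for rev in sorted(db.journal) if rev > last_seen]


def sync(db: NodeDatabase, node_id: str) -> List[str]:
    """Apply pending entries in order and record progress in the node's own row."""
    applied = []
    for rev, change in pending_entries(db, node_id):
        applied.append(change)              # e.g. update the local search index
        db.local_revisions[node_id] = rev   # only this node's row is written
    return applied


# The slave's database holds the replicated journal and the master's row,
# but no row for the slave until the slave writes one locally.
slave_db = NodeDatabase(
    journal={1: "add /a", 2: "add /b", 3: "remove /a"},
    local_revisions={"master": 3},
)
applied = sync(slave_db, "slave1")
```

In this model the slave never needs the rows of other nodes: it reads only the
replicated journal and its own row, which it writes locally.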

So what do people think? I realise this is probably an unusual way of
deploying a Jackrabbit cluster - but I think it works.

Cheers,
Adam

--
View this message in context: http://www.nabble.com/Clustering-with-database-replication-tp25132305p25132305.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.

Re: Clustering with database replication

Posted by AdamR <ad...@runbox.com>.


AdamR wrote:
> 
> On the master, the local_revisions table contains one row for itself,
> which obviously always contains the latest revision ID. The table does not
> contain any data for the other cluster nodes (the MySQL replication only
> works in one direction), which it has no knowledge about. As far as I can
> tell this table is only used to determine which items from the journal
> need to be processed on each node, therefore so long as each node's copy of
> the local_revisions table has a row for itself, it should be fine?
> 

Hmm, I just discovered the RevisionTableJanitor. This would be problematic
as the cluster journal would constantly get cleaned-up before the slave
nodes have a chance to update themselves. However, I think everything will
be fine if I keep this turned off and run periodic manual clean-ups instead.
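
A rough Python sketch of why the janitor is a problem in this setup. It
assumes, as a simplification, that the janitor deletes every journal entry
below the lowest revision found in local_revisions; janitor_cleanup and the
sample data are hypothetical and only model that rule.

```python
from typing import Dict


def janitor_cleanup(journal: Dict[int, str], local_revisions: Dict[str, int]) -> Dict[int, str]:
    """Drop journal entries older than the lowest revision in local_revisions."""
    if not local_revisions:
        return dict(journal)
    cutoff = min(local_revisions.values())
    return {rev: change for rev, change in journal.items() if rev >= cutoff}


# On the master, local_revisions holds only the master's own row, so the
# cutoff is always the latest revision.
master_journal = {1: "add /a", 2: "add /b", 3: "remove /a"}
master_local_revisions = {"master": 3}

cleaned = janitor_cleanup(master_journal, master_local_revisions)

# A slave that has only processed revision 1 still needs revisions 2 and 3.
# Once the deletions replicate, revision 2 is no longer available to it.
slave_last_seen = 1
missing = [rev for rev in range(slave_last_seen + 1, 4) if rev not in cleaned]
```

With the janitor turned off, nothing is deleted automatically, and a manual
clean-up can pick a cutoff that every slave is known to have passed.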