Posted to dev@jackrabbit.apache.org by William Ribeiro <wr...@tricode.nl> on 2010/09/29 13:42:39 UTC

In depth Jackrabbit clustering question

Hello everybody,

I have a question about clustering Jackrabbit (v1.6.2). I'm trying to figure
out how the clustering mechanism works. I have a custom in-memory database
that I integrated with Jackrabbit: I already have custom PersistenceManager
and FileSystem implementations plus some other required classes. They are
working fine, since my test repositories run with all functionality.

The new feature that I want to support is Clustering. I started implementing
my custom Journal class, extending from AbstractJournal, but there are some
points that I really don't get.

I start 2 repositories/cluster nodes on the same machine but with different
configurations. Each repo has its own unique ID and I think they comply
with the configuration rules stated here
(http://wiki.apache.org/jackrabbit/Clustering), since both repos start
without any problems.

I try to add content to one of the repos like this:
========================== JAVA CODE =====================================
Workspace ws = session.getWorkspace();
List<String> prefixes =
    Arrays.asList(ws.getNamespaceRegistry().getPrefixes());
if (prefixes == null || !prefixes.contains("wiki")) {
  ws.getNamespaceRegistry().registerNamespace("wiki",
      "http://www.barik.net/wiki/1.0");
}

Node rootNode = keepAliveSession.getRootNode();
Node encyclopedia = null;
if (!rootNode.hasNode("wiki:encyclopedia")) {
  encyclopedia = rootNode.addNode("wiki:encyclopedia");
} else {
  encyclopedia = rootNode.getNode("wiki:encyclopedia");
}
========================== END OF JAVA CODE ===============================

At this point, even before saving the session, a new Record is created and
stored in the shared Journal, thus creating a new Revision. On the
synchronization thread of the other repo, I can see that it retrieves this
new Revision record (NamespaceRecord) from the Journal and its
NamespaceRegistryImpl updates two files: ns_reg.properties and
ns_idx.properties. Both repos are now synced to the Journal (revision 1, at
this point).

In the same session I continue to add some more dummy content:
========================== JAVA CODE =====================================
for (int i = 0; i < iterations; i++) {
  Node p = encyclopedia.addNode("wiki:entry");
  p.setProperty("wiki:title", new StringValue("Rose " + i));
  p.setProperty("wiki:content", new StringValue("A rose is a flowering shrub. " + i));
  p.setProperty(JcrConstants.JCR_LASTMODIFIED, System.currentTimeMillis());
}
session.save();
========================== END OF JAVA CODE ===============================

Now this is where things start to get fuzzy. First of all, the new nodes are
only "flushed" when the session is saved, which is not the case with the
Namespace. Okay, after the session is saved, the synchronization thread of
the other repo gets a ChangeLogRecord from the Journal with a bunch of
changes. The SharedItemStateManager comes into play and persists the new
changes ... or at least it was supposed to. I cannot track/find where it
actually does it! I really don't understand anything beyond this point. The
LockManagerImpl and SearchManager also consume the events, but I don't know
what they do exactly.
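
The difference between the two kinds of change can be pictured with a small,
self-contained Java sketch. It does not use the Jackrabbit API: RecordLog and
BufferedSession are hypothetical names invented for illustration, and the
only behaviour assumed is the one observed above, namely that a namespace
registration is journaled at once while node changes are buffered and written
as a single record on save().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical journal: an append-only list of record descriptions. */
class RecordLog {
    final List<String> records = new ArrayList<>();

    void append(String record) {
        records.add(record);
    }
}

/** Hypothetical session that buffers item changes until save(). */
class BufferedSession {
    private final RecordLog log;
    private final List<String> pending = new ArrayList<>();

    BufferedSession(RecordLog log) {
        this.log = log;
    }

    /** Workspace-level operation: journaled immediately, no save() needed. */
    void registerNamespace(String prefix, String uri) {
        log.append("NamespaceRecord " + prefix + "=" + uri);
    }

    /** Transient change: only buffered in the session. */
    void addNode(String path) {
        pending.add(path);
    }

    /** Writes all buffered changes as one change-log record. */
    void save() {
        if (pending.isEmpty()) {
            return;
        }
        log.append("ChangeLogRecord " + pending);
        pending.clear();
    }
}

class BufferedSessionDemo {
    public static void main(String[] args) {
        RecordLog log = new RecordLog();
        BufferedSession session = new BufferedSession(log);

        session.registerNamespace("wiki", "http://www.barik.net/wiki/1.0");
        System.out.println("after registerNamespace: " + log.records.size());

        session.addNode("/wiki:encyclopedia/wiki:entry");
        session.addNode("/wiki:encyclopedia/wiki:entry[2]");
        System.out.println("after addNode calls:     " + log.records.size());

        session.save();
        System.out.println("after save:              " + log.records.size());
    }
}
```

In this sketch the record count only grows on registerNamespace() and on
save(), which matches the revisions seen in the shared Journal.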

Anyway ... I can't understand how this mechanism works. I need to find the
method that invokes the PersistenceManager for persisting the new items
found in the Journal's records.

Can anybody shed some light on this issue? I just found this, which might
help me: http://jackrabbit.apache.org/jackrabbit-architecture.html

Thanks a lot.
Cheers

-- 
Met vriendelijke groet,
Tricode Professional Services BV

William R. J. Ribeiro
Developer

T:   +31 (0)318 55 92 10
F:   +31 (0)318 65 09 09
E:   wribeiro@tricode.nl
W:  www.tricode.nl

De Schutterij 12, 3905 PL Veenendaal, The Netherlands | KVK 30183142

Re: In depth Jackrabbit clustering question

Posted by William Ribeiro <wr...@tricode.nl>.

Hey guys,

thanks for the replies and hints. I finally realized that the problem was
indeed in the PersistenceManager. The way it was handling the bundles was
wrong even when multiple PMs were pointing to the same persistence location
(db url).

Keep up the good work!

Cheers.


Re: In depth Jackrabbit clustering question

Posted by Alexander Klimetschek <ak...@day.com>.

On Wed, Sep 29, 2010 at 15:18, William Ribeiro <wr...@tricode.nl> wrote:
> Okay ... this what I got so far! What am I missing??? I didn't see the new
> changes being stored by the PersistenceManager. I'm stuck with this for
> almost 1 week now!

Again: the cluster node that receives an update through the journal
only has to invalidate its caches. The actual node/property data
clustering must happen in the persistence manager, so that the next
time the application on the other cluster node reads such an updated
node, it will get the latest version from the persistence manager.

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

Re: In depth Jackrabbit clustering question

Posted by William Ribeiro <wr...@tricode.nl>.

Thanks Jukka and Alexander,

I'm sorry for spamming. I'm not familiar with Nabble/MarkMail/Gmane yet. I
couldn't figure out whether my message was successfully delivered to the
lists; that's why I re-sent it. This won't happen again.

Ok, this is my stack trace:
Daemon Thread [ClusterNode-embeded-1] (Suspended)
    GigaSpacesBundlePersistenceManager(AbstractBundlePersistenceManager).onExternalUpdate(ChangeLog) line: 292
    SharedItemStateManager.doExternalUpdate(ChangeLog) line: 1216
    SharedItemStateManager.externalUpdate(ChangeLog, EventStateCollection) line: 1177
    RepositoryImpl$WorkspaceInfo.externalUpdate(ChangeLog, List, long, String) line: 2193
    ClusterNode.process(ChangeLogRecord) line: 869
    ChangeLogRecord.process(ClusterRecordProcessor) line: 507
    ClusterNode.consume(Record) line: 815
    GigaJournal.doSync(long) line: 290
    GigaJournal(AbstractJournal).sync() line: 188
    ClusterNode.sync() line: 329
    ClusterNode.run() line: 295
    Thread.run() line: 619

And this is the toString() of the ChangeLog object at this point:
{#addedStates=34, #modifiedStates=1, #deletedStates=0, #modifiedRefs=0}
So far so good, right?!

Also, the 'bundles' and 'missing' objects are empty, which is expected since
I added content to the other repository. Also, in method 'onExternalUpdate'
I only see 'remove' operations on these objects.
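
To make the "only remove operations" observation concrete, here is a
simplified, self-contained Java sketch of a cache-eviction step of that
shape. It is not the Jackrabbit source: SketchChangeLog and
SketchBundleCaches are hypothetical stand-ins, item ids are plain strings,
and the mapping of added/modified/deleted states to the two caches is an
assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified change log: just lists of item ids. */
class SketchChangeLog {
    final List<String> added;
    final List<String> modified;
    final List<String> deleted;

    SketchChangeLog(List<String> added, List<String> modified, List<String> deleted) {
        this.added = added;
        this.modified = modified;
        this.deleted = deleted;
    }
}

/** Hypothetical caches, loosely modelled on the 'bundles' and 'missing' objects. */
class SketchBundleCaches {
    final Map<String, String> bundles = new HashMap<>(); // id -> cached bundle data
    final Set<String> missing = new HashSet<>();         // ids cached as "does not exist"

    /**
     * Reacts to a change made by another cluster node. Nothing is written:
     * stale entries are evicted so the next read goes back to the store.
     */
    void onExternalUpdate(SketchChangeLog changes) {
        for (String id : changes.modified) {
            bundles.remove(id);
        }
        for (String id : changes.deleted) {
            bundles.remove(id);
        }
        for (String id : changes.added) {
            missing.remove(id); // the item exists now, so forget the negative entry
        }
    }
}

class BundleCacheDemo {
    public static void main(String[] args) {
        SketchBundleCaches caches = new SketchBundleCaches();
        caches.bundles.put("n1", "stale copy of n1");
        caches.missing.add("n2");

        SketchChangeLog log = new SketchChangeLog(
                Arrays.asList("n2"),               // added on the other node
                Arrays.asList("n1"),               // modified on the other node
                Collections.<String>emptyList());  // nothing deleted
        caches.onExternalUpdate(log);

        System.out.println("bundles cached: " + caches.bundles.size());
        System.out.println("missing cached: " + caches.missing.size());
    }
}
```

If both caches are already empty, as in the debugging session above, a step
like this has nothing to evict and leaves no visible trace.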

Later the events are dispatched again by the ObservationDispatcher which has
3 synchronous consumers: SearchManager, LockManagerImpl and another
SearchManager.

When the first SearchManager consumes the events it prints messages like
this:
[ClusterNode-embeded-1]  INFO SearchManager:458 - Node no longer available b6602d73-4ddd-4c8c-b95f-4def87a5b5be, skipped.

Then it comes back to the doSync() method, where it sets the cluster node to
the revision although it didn't get the changes.

Okay ... this is what I have so far! What am I missing??? I didn't see the
new changes being stored by the PersistenceManager. I've been stuck on this
for almost 1 week now!

Thanks for your help.

Regards,

Re: In depth Jackrabbit clustering question

Posted by Alexander Klimetschek <ak...@day.com>.

On Wed, Sep 29, 2010 at 13:42, William Ribeiro <wr...@tricode.nl> wrote:
> Anyway ... I can't understand how this mechanism works. I need to find the
> method that invokes the PersistenceManager for persisting the new items
> found in the Journal's records.

Why? The writing cluster node will a) store the new node in the
persistence manager and then b) notify the journal of the change. This
propagates to the other nodes, which can then invalidate their cache.
The persistence manager on those other nodes must be able to retrieve
the new/changed node then (immediately).
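
As a rough illustration of this flow, the following self-contained Java
sketch models two cluster nodes in plain collections. It is not Jackrabbit
code: SketchJournal and SketchClusterNode are hypothetical names, a Map
stands in for the persistence manager's backend, and the journal carries
only item ids. The last part of the demo assumes a node whose store is
private rather than shared, to show what a reader would see in that case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical shared journal: an append-only list of changed item ids. */
class SketchJournal {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the new revision (1-based). */
    synchronized long append(String changedId) {
        records.add(changedId);
        return records.size();
    }

    /** Returns the ids recorded after the given revision. */
    synchronized List<String> since(long revision) {
        return new ArrayList<>(records.subList((int) revision, records.size()));
    }
}

/** Hypothetical cluster node: a cache in front of a persistence store. */
class SketchClusterNode {
    private final Map<String, String> store; // stands in for the PM backend
    private final Map<String, String> cache = new HashMap<>();
    private final SketchJournal journal;
    private long revision = 0;

    SketchClusterNode(Map<String, String> store, SketchJournal journal) {
        this.store = store;
        this.journal = journal;
    }

    /** a) store the item, then b) notify the journal of the change. */
    void write(String id, String value) {
        sync(); // catch up first so no foreign record is skipped
        store.put(id, value);
        cache.put(id, value);
        revision = journal.append(id);
    }

    /** Consumes new records: evicts cached entries, never writes to the store. */
    void sync() {
        List<String> fresh = journal.since(revision);
        for (String id : fresh) {
            cache.remove(id);
        }
        revision += fresh.size();
    }

    /** Reads through the cache; a miss falls back to the store. */
    String read(String id) {
        String value = cache.get(id);
        if (value == null) {
            value = store.get(id);
            if (value != null) {
                cache.put(id, value);
            }
        }
        return value;
    }

    long revision() {
        return revision;
    }
}

class ClusterSketchDemo {
    public static void main(String[] args) {
        SketchJournal journal = new SketchJournal();
        Map<String, String> shared = new HashMap<>();
        SketchClusterNode a = new SketchClusterNode(shared, journal);
        SketchClusterNode b = new SketchClusterNode(shared, journal);

        a.write("wiki:entry", "Rose 0");
        b.sync();
        System.out.println("shared store:  " + b.read("wiki:entry"));

        // Assumed failure mode: a node whose store is not actually shared.
        SketchClusterNode c = new SketchClusterNode(new HashMap<>(), journal);
        c.sync();
        System.out.println("private store: revision " + c.revision()
                + ", value " + c.read("wiki:entry"));
    }
}
```

In the sketch, sync() advances the local revision without copying any data,
so a node only sees the new item if its store really is the same one the
writer used.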

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

Re: In depth Jackrabbit clustering question

Posted by Jukka Zitting <ju...@gmail.com>.

Hi William,

Both I and Alexander already answered your question, see
http://markmail.org/message/azydtktkgewdornh and
http://markmail.org/message/wzuk2izcik5gbqvg.

Was there something you didn't understand or did we miss some part of
your question? Simply repeating the question, now for the third time,
won't help.

BR,

Jukka Zitting