Posted to users@jackrabbit.apache.org by 小川 貴裕 <og...@brainsellers.com> on 2009/02/13 06:22:27 UTC

File based clustering of jackrabbit

Dear all,
I'm a test engineer in Japan.

We use Jackrabbit in our application to store PDF files
(with ObjectPersistenceManager as the PM),
and I am researching how to run Jackrabbit on clustered servers.

Then I have a question about clustering...

[Q.] Is it possible to operate jackrabbit with file based repository?

I've read <http://jackrabbit.apache.org/jackrabbit-configuration.html> and
<http://wiki.apache.org/jackrabbit/Clustering>.

In the last paragraph of the latter page, I found "Since file system based
persistence managers are not transactional, one has to use persistence
managers storing their data in a database running standalone."

I understand that a DB-based repository is stable, better, and strongly
recommended, but I am not sure whether a file-based one is possible at all.
(The nuance of "have to" is not clear to me...)

# We would rather not introduce another DBMS for our customers, if possible,
# to avoid having to support DBMS settings.

I think Jackrabbit has its own locking components, so it does not
depend on the locking system of the lower layer.
Assuming no network problems, HDD crashes, or sudden shutdowns,
Jackrabbit could be operated on clustered servers without a DBMS.
(If such an accident occurs, the user will restore the repository from a backup
and will lose some newly registered nodes (PDFs).
It is the user's choice whether to accept this risk.)

Is that right?

Any suggestions are appreciated.
Thank you for reading this long question.
----
Takahiro OGAWA <ogawa(at)brainsellers.com>
<http://www.brainsellers.com/>


Re: File based clustering of jackrabbit

Posted by Thomas Müller <th...@day.com>.
Hi,

> [Q.] Is it possible to operate jackrabbit with file based repository?

No, not with the file based persistence managers that are included in
Jackrabbit. I have updated the wiki page at
http://wiki.apache.org/jackrabbit/Clustering :

"The persistence manager needs to be transactional, and needs to support
concurrent access from multiple processes. When using Jackrabbit, one option
is to use a database persistence manager, and use a database that does
support concurrent access. The file system based persistence managers in
Jackrabbit are not transactional and don't support concurrent access; Apache
Derby doesn't support concurrent access in the embedded mode."

> # We would rather not introduce another DBMS for our customers, if possible,
> # to avoid having to support DBMS settings.

I understand this. When using Jackrabbit, it is not possible at this time.
There is an option however when using the (commercial) Day CRX. If you are
interested, please see http://www.day.com/crx (sorry, I have to say that).

Regards,
Thomas

Re: File based clustering of jackrabbit

Posted by Takahiro OGAWA <og...@brainsellers.com>.
Hi, Alexander and Thomas

Thank you for your replies.

Alexander Klimetschek wrote:
> Your options are (see also [1]):
> - local derby database with bundle db PM but no clustering (default
> jackrabbit config, fast + stable)
> - bundle file system PM but no clustering (fast, but not so stable)
> - transactional database with bundle db PM gives clustering (you can
> use postgresql or mysql for popular open-source dbs)

Now I understand clearly, I think.

> have to = must (no nuance at all ;-))

Ouch! I learned "have to" 20 years ago...

But that wiki page was a bit confusing for me, since
the file-based journal setting appears first.

Thomas Müller wrote:
> No, not with the file based persistence managers that are included in
> Jackrabbit. I have updated the wiki page at
> http://wiki.apache.org/jackrabbit/Clustering :

It is very helpful for us. Thank you.
Maybe "impossible" or "is not available" would be clearer
for non-native English speakers and non-Western people.
(I know there are many Indian engineers... probably
 something is wrong with English education in Japan.)

Thanks.
----
Takahiro OGAWA <ogawa(at)brainsellers.com>
<http://www.brainsellers.com/>



Re: File based clustering of jackrabbit

Posted by Alexander Klimetschek <ak...@day.com>.
On Fri, Feb 13, 2009 at 6:22 AM, 小川 貴裕 <og...@brainsellers.com> wrote:
> Then I have a question about clustering...
>
> [Q.] Is it possible to operate jackrabbit with file based repository?

No, there is no file based persistence manager in Jackrabbit that
allows clustering.

Shared read/write access to the data needs proper synchronization,
which is why the simple file system persistence manager cannot be used
for clustering. Since the file system PM is also slow and not very robust
against crashes etc., we strongly recommend using a database bundle
persistence manager.

> I understand that a DB-based repository is stable, better, and strongly
> recommended, but I am not sure whether a file-based one is possible at all.
> (The nuance of "have to" is not clear to me...)

have to = must (no nuance at all ;-))

> # We would rather not introduce another DBMS for our customers, if possible,
> # to avoid having to support DBMS settings.
>
> I think Jackrabbit has its own locking components, so it does not
> depend on the locking system of the lower layer.
> Assuming no network problems, HDD crashes, or sudden shutdowns,
> Jackrabbit could be operated on clustered servers without a DBMS.
> (If such an accident occurs, the user will restore the repository from a backup
> and will lose some newly registered nodes (PDFs).
> It is the user's choice whether to accept this risk.)

No, if you cluster a file system PM, it will simply not work -
problems will occur during normal operations, since there is no
synchronization between multiple Jackrabbit nodes (i.e. their PMs)
accessing the same files. This is what the transactional database
would ensure.
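
To make the point above concrete, here is a minimal toy sketch in Java. It
is not Jackrabbit code: the names LostUpdateDemo, ToyNode, load and
addOneAndStore are invented for illustration, and a single counter in a
temp file stands in for the repository data. Two "nodes" each cache the
counter in memory and write it back. When both read before either writes,
one update is silently lost even though nothing crashed; when the
read-modify-write steps are serialized (which is roughly the guarantee a
transactional database provides), both updates survive.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy model only: NOT Jackrabbit code. All names here are hypothetical.
public class LostUpdateDemo {

    // Stand-in for a cluster node whose PM caches state in memory.
    static final class ToyNode {
        private final Path sharedFile;
        private int cachedCount;

        ToyNode(Path sharedFile) {
            this.sharedFile = sharedFile;
        }

        // Read the counter from the shared file into this node's private cache.
        void load() throws IOException {
            cachedCount = readCount(sharedFile);
        }

        // Increment the cached value and overwrite the shared file with it.
        void addOneAndStore() throws IOException {
            cachedCount++;
            Files.write(sharedFile,
                    Integer.toString(cachedCount).getBytes(StandardCharsets.UTF_8));
        }
    }

    static int readCount(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8).trim());
    }

    // Runs two nodes against one file and returns the final stored counter.
    public static int run(boolean serialized) {
        try {
            Path file = Files.createTempFile("toy-repo", ".txt");
            try {
                Files.write(file, "0".getBytes(StandardCharsets.UTF_8));
                ToyNode a = new ToyNode(file);
                ToyNode b = new ToyNode(file);
                if (serialized) {
                    // Each node finishes its read-modify-write before the next starts.
                    a.load();
                    a.addOneAndStore();
                    b.load();
                    b.addOneAndStore();
                } else {
                    // Both nodes read the old value first, then both write.
                    a.load();
                    b.load();
                    a.addOneAndStore();
                    b.addOneAndStore();
                }
                return readCount(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unsynchronized: " + run(false)); // prints "unsynchronized: 1"
        System.out.println("serialized: " + run(true));      // prints "serialized: 2"
    }
}
```

The interleaving is fixed by hand here so the result is repeatable; in a
real cluster the order would depend on timing, which makes such losses
intermittent and hard to notice.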

Your options are (see also [1]):
- local derby database with bundle db PM but no clustering (default
jackrabbit config, fast + stable)
- bundle file system PM but no clustering (fast, but not so stable)
- transactional database with bundle db PM gives clustering (you can
use postgresql or mysql for popular open-source dbs)

[1] http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com