Posted to users@jackrabbit.apache.org by Shaul Dar <sh...@gmail.com> on 2009/02/12 20:12:16 UTC

Q: Using JackRabbit as a distributed file store

Hi,
I hope you can help me with the following. I apologize for the length of
this message...

My organization (non-profit) has a need for what I think of as a
distributed, virtual file system. The idea is to abstract logical files and
folders from their physical location. I.e. logical file = handle, physical
file = data.

There are 2 main user categories: (1) Normal users, i.e. content creators (e.g.
of video, audio and docs), need to deposit files into logical folders (i.e.
create "logical files"), and give them properties (e.g. file content type,
intended audience). Based on predefined rules, each file should
automatically be moved to 1 or more physical servers in the 3 worldwide
locations we have, and stored there using the local file system (Windows,
Linux etc.). (2) Administrators need to control file distribution, i.e. the
mapping of logical files to physical replicas (e.g. delete/add replicas).

We need a Web GUI for users and admins. It should support the logical folder
system (create space, create/delete/move folders etc.), search (by file name,
size, date created or last modified, and possibly hash value), and a
coarse-grained permissions system (e.g. user vs. admin). The back-end should
perform the necessary file transfers, e.g. add/fetch a replica (reliably),
preferably over HTTP (i.e. OS agnostic). In between should be a mapping
layer that maps logical files to physical files. All metadata should be kept
in a DB (MySQL).
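
To make the mapping layer described above concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: the class ReplicaMap, its method names, the example.org hosts and the rule format (content type to list of hosts) are all hypothetical, and an in-memory map stands in for the MySQL metadata store. It is not part of Jackrabbit or of any existing product.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical mapping layer: logical path -> physical replica URLs.
class ReplicaMap {
    // Placement rules (assumed format): content type -> hosts that should hold a replica.
    private final Map<String, Set<String>> rules = new HashMap<>();
    // In-memory stand-in for the metadata table that would live in MySQL.
    private final Map<String, Set<String>> replicas = new TreeMap<>();

    public void addRule(String contentType, String... hosts) {
        rules.put(contentType, new TreeSet<>(Arrays.asList(hosts)));
    }

    // Registers a logical file and assigns replicas according to the rules.
    // The actual file transfer (e.g. an HTTP PUT per replica) is out of scope here.
    public Set<String> deposit(String logicalPath, String contentType) {
        Set<String> urls = new TreeSet<>();
        for (String host : rules.getOrDefault(contentType, Collections.<String>emptySet())) {
            urls.add("http://" + host + logicalPath);
        }
        replicas.put(logicalPath, urls);
        return locate(logicalPath);
    }

    // Administrator operation: add one replica by hand.
    public void addReplica(String logicalPath, String url) {
        replicas.computeIfAbsent(logicalPath, k -> new TreeSet<>()).add(url);
    }

    // Administrator operation: returns true if the replica was known and removed.
    public boolean removeReplica(String logicalPath, String url) {
        Set<String> urls = replicas.get(logicalPath);
        return urls != null && urls.remove(url);
    }

    // Resolves a logical path to a read-only view of its physical locations.
    public Set<String> locate(String logicalPath) {
        Set<String> urls = replicas.get(logicalPath);
        return urls == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(urls);
    }

    public static void main(String[] args) {
        ReplicaMap map = new ReplicaMap();
        map.addRule("video", "eu.example.org", "us.example.org");
        map.deposit("/talks/intro.avi", "video");
        map.removeReplica("/talks/intro.avi", "http://us.example.org/talks/intro.avi");
        System.out.println(map.locate("/talks/intro.avi"));
    }
}
```

The point of the sketch is that users only ever see the logical path, while rules and administrators decide which physical URLs back it; a real system would persist both maps and trigger transfers whenever the replica set changes.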

To clarify, I am aware of distributed file systems (*FS); this is not what I
am looking for. Rather, I am looking for (1) the management piece (Web-based
interface) described above, (2) the physical transfer layer, and (3) the
layer that maps logical files to physical files.

So my questions are: does Jackrabbit provide what we want? Having read through
the documentation, I am still unclear on whether it is a framework or a
full-blown system, and whether it has only back-end components or also a
front-end. In short, I'm trying to understand how much we would need to
develop. I saw that Day is developing a commercial system, but for a hefty
sum; is there a free alternative? Also, I understand that Jackrabbit
currently supports JCR 1.x, which does not include distribution across
locations. Is there an estimate of when it will support JCR 2.x, i.e.
WAN distribution?

The other concept I've seen is the WebDAV spec, which seems to be about 10
years old. I don't know if there are any implementations (e.g.
http://ftp.ics.uci.edu/pub/ietf/webdav/implementation.html), or systems
built on top of it, that provide what I want.

Any suggestions, corrections, feedback? You are welcome to mail me...
Thanks!

-- Shaul

Dr. Shaul Dar
Email: shauldar@gmail.com
Web: www.shauldar.com

Re: Q: Using JackRabbit as a distributed file store

Posted by Thomas Fromm <tf...@inubit.com>.
Hi,

> So my questions are: does Jackrabbit provide what we want? Having read through
> the documentation, I am still unclear on whether it is a framework or a
> full-blown system, and whether it has only back-end components or also a
> front-end. In short, I'm trying to understand how much we would need to
> develop. I saw that Day is developing a commercial system, but for a hefty
> sum; is there a free alternative? Also, I understand that Jackrabbit
> currently supports JCR 1.x, which does not include distribution across
> locations. Is there an estimate of when it will support JCR 2.x, i.e.
> WAN distribution?

Just to clarify:

Jackrabbit is an implementation of the JCR specification.
It comes with several implementations for data persistence, e.g. you can store
the data in a normal file system or in a database.
You can also develop your own persistence layer that works over a WAN.
By default it has no web-based GUI; you need to develop your own.

*snip*

In my opinion (if I understand your requirements correctly) the easiest way for
you would be to choose a web-based document management system.
It gives you the web-based interface for management, for adding users and their
permissions, and so on. You also get change history, indexing for search, and
so on.
Most of them use databases for persistence.

To share the data across different servers I'd have a look at database features
(keywords: replication, distributed databases).
Depending on the chosen software, multiple instances of the document management
system can work on a shared/replicated database.

But don't ask me for details. I have never had such a scenario, nor have I used
any content management system :-).
These are just my first thoughts after reading your mail.

> The other concept I've seen is the WebDAV spec, which seems to be about 10
> years old. I don't know if there are any implementations (e.g.
> http://ftp.ics.uci.edu/pub/ietf/webdav/implementation.html), or systems
> built on top of it, that provide what I want.

WebDAV is still alive, and still widely used, e.g. inside Microsoft Exchange.
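
For context on what WebDAV adds to plain HTTP, here is a small Java sketch that builds a PROPFIND request asking a server for the size and modification date of the entries in a folder. It is a sketch under assumptions: the class name WebDavListing and the files.example.org URL are made up for illustration, and the request is only constructed, not sent. PROPFIND, the Depth header and the two DAV: properties come from the WebDAV specification; the HTTP classes are the standard java.net.http API (Java 11 or later).

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that prepares a WebDAV folder listing request.
class WebDavListing {
    // Builds a PROPFIND request for a collection and its direct children.
    static HttpRequest propfind(String collectionUrl) {
        // Ask only for content length and last-modified date of each entry.
        String body = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<D:propfind xmlns:D=\"DAV:\"><D:prop>"
                + "<D:getcontentlength/><D:getlastmodified/>"
                + "</D:prop></D:propfind>";
        return HttpRequest.newBuilder(URI.create(collectionUrl))
                .method("PROPFIND", HttpRequest.BodyPublishers.ofString(body))
                .header("Depth", "1") // the collection itself plus direct children
                .header("Content-Type", "application/xml; charset=utf-8")
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = propfind("http://files.example.org/videos/");
        System.out.println(request.method() + " " + request.uri());
        // To send it, one would call
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        // a WebDAV server typically answers with 207 Multi-Status and an XML listing.
    }
}
```

Uploading or fetching a file over WebDAV is an ordinary HTTP PUT or GET, so a listing request like this one is the main extra piece a transfer layer would need.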

--tf

Re: Q: Using JackRabbit as a distributed file store

Posted by Stefan Guggisberg <st...@gmail.com>.
On Mon, Feb 16, 2009 at 11:47 AM, Shaul Dar <sh...@gmail.com> wrote:
> I was told by someone that JCR 2.X will support WAN distribution, and also
> see the following in the roadmap
> (http://jackrabbit.apache.org/jackrabbit-roadmap.html)
> Am I misunderstanding? Thanks,

probably. i am still puzzled... what list item are you referring to as
'WAN distribution'?  WebDAV remoting? this refers to remoting
the JCR api by using the WebDAV transport underneath (as an
alternative to RMI).

however, i can confirm that there's no 'WAN distribution' feature
planned for JCR 2.0.

cheers
stefan

>
> -- Shaul
> Medium term
>
>   - Apache Jackrabbit 2.0
>   - JCR 2.0 support
>   - Transactional versioning
>   - WebDAV remoting
>   - Hot backup
>   - Full XPath
>
> On Fri, Feb 13, 2009 at 10:51 AM, Stefan Guggisberg <
> stefan.guggisberg@gmail.com> wrote:
>
>> what makes you think JCR 2.X is about WAN distribution?
>>
>> cheers
>> stefan
>

Re: Q: Using JackRabbit as a distributed file store

Posted by Shaul Dar <sh...@gmail.com>.
I was told by someone that JCR 2.X will support WAN distribution, and also
see the following in the roadmap
(http://jackrabbit.apache.org/jackrabbit-roadmap.html)
Am I misunderstanding? Thanks,

-- Shaul
Medium term

   - Apache Jackrabbit 2.0
   - JCR 2.0 support
   - Transactional versioning
   - WebDAV remoting
   - Hot backup
   - Full XPath

On Fri, Feb 13, 2009 at 10:51 AM, Stefan Guggisberg <
stefan.guggisberg@gmail.com> wrote:

> what makes you think JCR 2.X is about WAN distribution?
>
> cheers
> stefan

Re: Q: Using JackRabbit as a distributed file store

Posted by Stefan Guggisberg <st...@gmail.com>.
On Thu, Feb 12, 2009 at 8:12 PM, Shaul Dar <sh...@gmail.com> wrote:
> So my questions are: does Jackrabbit provide what we want? Having read through
> the documentation, I am still unclear on whether it is a framework or a
> full-blown system, and whether it has only back-end components or also a
> front-end. In short, I'm trying to understand how much we would need to
> develop. I saw that Day is developing a commercial system, but for a hefty
> sum; is there a free alternative? Also, I understand that Jackrabbit
> currently supports JCR 1.x, which does not include distribution across
> locations. Is there an estimate of when it will support JCR 2.x, i.e.
> WAN distribution?

what makes you think JCR 2.X is about WAN distribution?

cheers
stefan
