Posted to users@jackrabbit.apache.org by Eugeny N Dzhurinsky <bo...@redwerk.com> on 2007/10/30 14:29:18 UTC

using Hadoop (HDFS) as a storage in JR

Hello there!

Is it possible to use Hadoop as storage (a filesystem implementation) for
JackRabbit? Is there a ready-to-use implementation of the filesystem
interface, or do we need to create one from scratch?

Thank you in advance!

-- 
Eugene N Dzhurinsky

Re: using Hadoop (HDFS) as a storage in JR

Posted by Marcel Reutegger <ma...@gmx.net>.
Jukka Zitting wrote:
> It would be interesting to see a HDFS implementation, but I'm not sure
> if HDFS is really a good match for the needs of Jackrabbit.

I think a FileSystem implementation is of limited use, but a DataStore based on 
HDFS might be cool to have...

regards
  marcel

Re: using Hadoop (HDFS) as a storage in JR

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

I have a question: what is your motivation to use Hadoop?

FYI: you cannot run multiple instances of Jackrabbit against the same files.

Thomas

On 11/1/07, Eugeny N Dzhurinsky <bo...@redwerk.com> wrote:
> On Thu, Nov 01, 2007 at 10:27:22AM +0200, Jukka Zitting wrote:
> > Hi,
> >
> > On 10/30/07, Eugeny N Dzhurinsky <bo...@redwerk.com> wrote:
> > > Is it possible to use Hadoop as storage (a filesystem implementation) for
> > > JackRabbit? Is there a ready-to-use implementation of the filesystem
> > > interface, or do we need to create one from scratch?
> >
> > You need to add an implementation of the Jackrabbit FileSystem
> > interface to do that. The current implementations are based on memory,
> > the normal file system, and a relational database.
> >
> > It would be interesting to see a HDFS implementation, but I'm not sure
> > if HDFS is really a good match for the needs of Jackrabbit.
>
> Hello!
>
> We were thinking along the same lines and have implemented the FileSystem
> interface to support HDFS. However, for some reason it doesn't work. In fact,
> it looks like JackRabbit ignores it when creating the repository: we see that
> LocalFileSystem is being used, and none of our implementation's methods are
> called.
>
> We borrowed a working example of the repository.xml file and replaced all
> occurrences of LocalFileSystem with our HDFSFileSystem, but that didn't do the
> trick. Did we miss something? And is there an easy way to debug which
> filesystem implementation JackRabbit has chosen, and why?
>
> Thank you in advance!
>
> --
> Eugene N Dzhurinsky
>
>

Re: using Hadoop (HDFS) as a storage in JR

Posted by Eugeny N Dzhurinsky <bo...@redwerk.com>.
On Thu, Nov 01, 2007 at 10:27:22AM +0200, Jukka Zitting wrote:
> Hi,
> 
> On 10/30/07, Eugeny N Dzhurinsky <bo...@redwerk.com> wrote:
> > Is it possible to use Hadoop as storage (a filesystem implementation) for
> > JackRabbit? Is there a ready-to-use implementation of the filesystem
> > interface, or do we need to create one from scratch?
> 
> You need to add an implementation of the Jackrabbit FileSystem
> interface to do that. The current implementations are based on memory,
> the normal file system, and a relational database.
> 
> It would be interesting to see a HDFS implementation, but I'm not sure
> if HDFS is really a good match for the needs of Jackrabbit.

Hello!

We were thinking along the same lines and have implemented the FileSystem
interface to support HDFS. However, for some reason it doesn't work. In fact,
it looks like JackRabbit ignores it when creating the repository: we see that
LocalFileSystem is being used, and none of our implementation's methods are
called.

We borrowed a working example of the repository.xml file and replaced all
occurrences of LocalFileSystem with our HDFSFileSystem, but that didn't do the
trick. Did we miss something? And is there an easy way to debug which
filesystem implementation JackRabbit has chosen, and why?

Thank you in advance!

-- 
Eugene N Dzhurinsky
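
One way to approach the debugging question above is to check which classes the
configuration file actually names before starting the repository. The
following is only a rough sketch, not something taken from Jackrabbit itself:
the class name FileSystemConfigLister, the embedded SAMPLE document and the
com.example package for HDFSFileSystem are all made up for illustration. It
uses only the JDK's XML parser and prints every FileSystem element together
with its parent element and its class attribute.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FileSystemConfigLister {

    // Made-up, heavily shortened configuration used when no file is given.
    // com.example.HDFSFileSystem is a hypothetical class name.
    static final String SAMPLE =
        "<Repository>"
        + "<FileSystem class=\"org.apache.jackrabbit.core.fs.local.LocalFileSystem\"/>"
        + "<Workspace name=\"${wsp.name}\">"
        + "<FileSystem class=\"com.example.HDFSFileSystem\"/>"
        + "</Workspace>"
        + "<Versioning rootPath=\"${rep.home}/version\">"
        + "<FileSystem class=\"com.example.HDFSFileSystem\"/>"
        + "</Versioning>"
        + "</Repository>";

    /** Returns one "parent element -> class name" entry per FileSystem element. */
    static List<String> listFileSystems(InputStream xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Real configuration files carry a DOCTYPE; resolve it to an empty
            // document so the parser does not try to download the DTD.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            NodeList nodes = builder.parse(xml).getElementsByTagName("FileSystem");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element fs = (Element) nodes.item(i);
                result.add(fs.getParentNode().getNodeName()
                    + " -> " + fs.getAttribute("class"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read configuration", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of a configuration file, or run without arguments
        // to see the output for the embedded sample.
        try (InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8))) {
            for (String line : listFileSystems(in)) {
                System.out.println(line);
            }
        }
    }
}
```

It may also be worth running the same check against the workspace.xml file
inside an existing workspace directory. As far as I know, the Workspace
element in repository.xml only serves as a template for newly created
workspaces, so a workspace created before the edit would keep its old
settings.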

Re: using Hadoop (HDFS) as a storage in JR

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 10/30/07, Eugeny N Dzhurinsky <bo...@redwerk.com> wrote:
> Is it possible to use Hadoop as storage (a filesystem implementation) for
> JackRabbit? Is there a ready-to-use implementation of the filesystem
> interface, or do we need to create one from scratch?

You need to add an implementation of the Jackrabbit FileSystem
interface to do that. The current implementations are based on memory,
the normal file system, and a relational database.

It would be interesting to see a HDFS implementation, but I'm not sure
if HDFS is really a good match for the needs of Jackrabbit.

BR,

Jukka Zitting
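
To give a rough idea of what "adding an implementation" involves, here is a
small sketch. It does not use the real Jackrabbit interface
(org.apache.jackrabbit.core.fs.FileSystem), which has considerably more
methods and its own exception type. SketchFileSystem below is a hypothetical,
cut-down stand-in with method names modeled on it, and
InMemorySketchFileSystem is an illustrative memory-backed implementation
written for this example, not Jackrabbit's own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, reduced stand-in for the Jackrabbit FileSystem interface. */
interface SketchFileSystem {
    boolean exists(String path) throws IOException;
    InputStream getInputStream(String path) throws IOException;
    OutputStream getOutputStream(String path) throws IOException;
    void deleteFile(String path) throws IOException;
}

/** Keeps every file as a byte array in a map; illustration only. */
final class InMemorySketchFileSystem implements SketchFileSystem {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    public boolean exists(String path) {
        return files.containsKey(path);
    }

    public InputStream getInputStream(String path) throws IOException {
        byte[] data = files.get(path);
        if (data == null) {
            throw new FileNotFoundException(path);
        }
        return new ByteArrayInputStream(data);
    }

    public OutputStream getOutputStream(String path) {
        // The content becomes visible under the path once the stream is closed.
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                files.put(path, toByteArray());
            }
        };
    }

    public void deleteFile(String path) throws IOException {
        if (files.remove(path) == null) {
            throw new FileNotFoundException(path);
        }
    }
}

final class SketchFileSystemDemo {
    /** Writes text to the given path, reads it back and returns what was read. */
    static String roundTrip(SketchFileSystem fs, String path, String text) {
        try {
            try (OutputStream out = fs.getOutputStream(path)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            ByteArrayOutputStream copy = new ByteArrayOutputStream();
            try (InputStream in = fs.getInputStream(path)) {
                byte[] buffer = new byte[1024];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    copy.write(buffer, 0, n);
                }
            }
            return new String(copy.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchFileSystem fs = new InMemorySketchFileSystem();
        System.out.println(roundTrip(fs, "/data/hello.txt", "hello"));
    }
}
```

An HDFS-backed version would presumably keep the same shape and delegate each
method to the Hadoop client API (for example the create, open, exists and
delete methods of org.apache.hadoop.fs.FileSystem) instead of a map. That part
is left out here because it needs the Hadoop libraries and a running cluster.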