Posted to dev@jackrabbit.apache.org by "Alan D. Cabrera" <li...@toolazydogs.com> on 2007/08/24 02:43:28 UTC
A JCR with infinite capacity
I need a JCR that has infinite capacity. This is for a photo
repository that would store millions of 5-10MB photos. The photos
themselves would not change. I need to be able to add
One idea that I've been toying with is to create a persistence
manager that manages multiple stores. Some of these stores could
even be on other physical servers, maybe even be a farm of servers.
I was thinking of managing the node and property states along the
same lines as the Google File System manages blocks; I have some
ideas how this can be done simply. Property and node states could be
duplicated for resilience in the face of disk/server failures.
Another nice thing is that I could map the JCR path to a URL to the
actual file server and have the file server serve up the file w/out
having to go through the JCR metadata layer.
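To sketch the idea above: each node/property state gets routed deterministically to a replication-factor's worth of stores, GFS-style, and a read can be mapped straight to a file-server URL without touching the JCR metadata layer. This is a toy illustration under assumed names (the store URLs, the `ReplicatedRouter` class, and the modular-hash placement are all hypothetical, not Jackrabbit API):

```java
import java.util.*;

// Hypothetical sketch, not Jackrabbit API: place each state on R of N
// stores so that a single disk/server failure leaves a live replica.
public class ReplicatedRouter {
    private final List<String> storeUrls; // e.g. "http://files1.example.org"
    private final int replicas;

    public ReplicatedRouter(List<String> storeUrls, int replicas) {
        if (replicas > storeUrls.size())
            throw new IllegalArgumentException("more replicas than stores");
        this.storeUrls = storeUrls;
        this.replicas = replicas;
    }

    /** Pick R distinct stores for a node id, deterministically. */
    public List<String> storesFor(String nodeId) {
        int start = Math.floorMod(nodeId.hashCode(), storeUrls.size());
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < replicas; i++)
            picked.add(storeUrls.get((start + i) % storeUrls.size()));
        return picked;
    }

    /** Map a JCR path straight to a file-server URL, so the file
        server can serve the binary without the JCR metadata layer. */
    public String directUrl(String nodeId, String jcrPath) {
        return storesFor(nodeId).get(0) + jcrPath;
    }
}
```

A real persistence manager would of course also have to handle re-replication when a store dies, but the placement itself can be this simple.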
Another way might be to graft Hadoop as a FileSystem.
I realize that this may only make sense in the limited application
of storing photos that are not modified. Thoughts?
Regards,
Alan
Re: A JCR with infinite capacity
Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Aug 23, 2007, at 5:49 PM, Padraic I. Hannon wrote:
> However, you will probably run into search and retrieval issues? I
> am unsure as to the guts of the Lucene part of the application, but
> there may be problems there with so many files?
Yeah, I was thinking that a ton of relatively small files, compared
to the scale of files Hadoop is built to handle, might not be a good
application of Hadoop. But my reasoning could be tainted because it
would be really fun writing an infinite redundant store under
Jackrabbit.
Regards,
Alan
Re: A JCR with infinite capacity
Posted by Bertrand Delacretaz <bd...@apache.org>.
On 8/24/07, Padraic I. Hannon <pi...@wasabicowboy.com> wrote:
> ...I am
> unsure as to the guts of the Lucene part of the application, but there
> may be problems there with so many files?...
Indexing the "usual" metadata of millions of images in a Lucene index
shouldn't be a problem. Googling "Lucene millions" shows lots of
working examples.
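To illustrate why millions of metadata documents are cheap to query: an inverted index (which is what Lucene maintains internally, in a far more sophisticated form) maps each term to its matching documents, so lookup cost tracks the size of the result set rather than the size of the repository. The class and field names below are purely illustrative, not Lucene's API:

```java
import java.util.*;

// Toy inverted index over photo metadata, for illustration only.
public class MetadataIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Index a photo path under each of its metadata terms. */
    public void add(String photoPath, String... terms) {
        for (String t : terms)
            postings.computeIfAbsent(t.toLowerCase(Locale.ROOT),
                                     k -> new TreeSet<>()).add(photoPath);
    }

    /** Return all photo paths indexed under the given term. */
    public Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                                     Collections.emptySet());
    }
}
```

With a few short metadata fields per image, even millions of entries amount to a few gigabytes of index, well within what a single Lucene index handles.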
-Bertrand
Re: A JCR with infinite capacity
Posted by "Padraic I. Hannon" <pi...@wasabicowboy.com>.
However, you will probably run into search and retrieval issues? I am
unsure as to the guts of the Lucene part of the application, but there
may be problems there with so many files?
-paddy
Re: A JCR with infinite capacity
Posted by "Padraic I. Hannon" <pi...@wasabicowboy.com>.
I started to write a Hadoop DFS-based persistence manager. I can create
a Jira ticket and upload it if you would like. I am unsure whether it
works, as I have been distracted by other things over the last week or so.
-paddy
Alan D. Cabrera wrote:
> I need a JCR that has infinite capacity. This is for a photo
> repository that would store millions of 5-10MB photos. The photos
> themselves would not change. I need to be able to add
>
> One idea that I've been toying with is to create a persistence manager
> that manages multiple stores. Some of these stores could even be on
> other physical servers, maybe even be a farm of servers. I was
> thinking of managing the node and property states along the same lines
> as the Google File System manages blocks; I have some ideas how this
> can be done simply. Property and node states could be duplicated for
> resilience in the face of disk/server failures. Another nice thing is
> that I could map the JCR path to a URL to the actual file server and
> have the file server serve up the file w/out having to go through the
> JCR metadata layer.
>
> Another way might be to graft Hadoop as a FileSystem.
>
> I realize that this may only make sense in the limited application of
> storing photos that are not modified. Thoughts?
>
>
> Regards,
> Alan