Posted to user@hbase.apache.org by Amandeep Khurana <am...@gmail.com> on 2009/03/24 00:07:01 UTC

RDF store over HDFS/HBase

Has anyone explored using HDFS/HBase as the underlying storage for an RDF
store? Most solutions that I have found till now are single node and scale
up only to a couple of billion rows in the triple store. I'm wondering how
Hadoop could be leveraged here...

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

Re: RDF store over HDFS/HBase

Posted by Amandeep Khurana <am...@gmail.com>.
Yes... I was curious whether anyone has explored building an RDF store over
HBase/HDFS. I did see some proposals/ideas in the mailing list
archives but couldn't find anything concrete.

Also, I'm not sure whether the HBase model is suitable for RDF data storage.
This is, of course, debatable.

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Mon, Mar 23, 2009 at 4:17 PM, Ryan Rawson <ry...@gmail.com> wrote:

> I would expect HBase would scale well - the semantics of the data being
> stored shouldn't matter, just the size.
>
> I think there are a number of production HBase installations that have
> billions of rows.
>
> On Mon, Mar 23, 2009 at 4:10 PM, Ding, Hui <hu...@sap.com> wrote:
>
> > I remember there was a project proposal late last year.  They've
> > set up an official webpage. Not sure if they are still alive/making any
> > progress.
> > You can search in the email archive.
> >
> > -----Original Message-----
> > From: Amandeep Khurana [mailto:amansk@gmail.com]
> > Sent: Monday, March 23, 2009 4:07 PM
> > To: hbase-user@hadoop.apache.org; core-user@hadoop.apache.org;
> > core-dev@hadoop.apache.org
> > Subject: RDF store over HDFS/HBase
> >
> > Has anyone explored using HDFS/HBase as the underlying storage for an
> > RDF
> > store? Most solutions (all are single node) that I have found till now
> > scale
> > up only to a couple of billion rows in the Triple store. Wondering how
> > Hadoop could be leveraged here...
> >
> > Amandeep
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
>

Re: RDF store over HDFS/HBase

Posted by Ryan Rawson <ry...@gmail.com>.
I would expect HBase would scale well - the semantics of the data being
stored shouldn't matter, just the size.

I think there are a number of production HBase installations that have
billions of rows.

On Mon, Mar 23, 2009 at 4:10 PM, Ding, Hui <hu...@sap.com> wrote:

> I remember there was a project proposal late last year.  They've
> set up an official webpage. Not sure if they are still alive/making any
> progress.
> You can search in the email archive.
>
> -----Original Message-----
> From: Amandeep Khurana [mailto:amansk@gmail.com]
> Sent: Monday, March 23, 2009 4:07 PM
> To: hbase-user@hadoop.apache.org; core-user@hadoop.apache.org;
> core-dev@hadoop.apache.org
> Subject: RDF store over HDFS/HBase
>
> Has anyone explored using HDFS/HBase as the underlying storage for an
> RDF
> store? Most solutions (all are single node) that I have found till now
> scale
> up only to a couple of billion rows in the Triple store. Wondering how
> Hadoop could be leveraged here...
>
> Amandeep
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>

RE: RDF store over HDFS/HBase

Posted by "Ding, Hui" <hu...@sap.com>.
I remember there was a project proposal late last year.  They've
set up an official webpage. Not sure if they are still alive/making any
progress.
You can search in the email archive.

-----Original Message-----
From: Amandeep Khurana [mailto:amansk@gmail.com] 
Sent: Monday, March 23, 2009 4:07 PM
To: hbase-user@hadoop.apache.org; core-user@hadoop.apache.org;
core-dev@hadoop.apache.org
Subject: RDF store over HDFS/HBase

Has anyone explored using HDFS/HBase as the underlying storage for an
RDF
store? Most solutions (all are single node) that I have found till now
scale
up only to a couple of billion rows in the Triple store. Wondering how
Hadoop could be leveraged here...

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

Re: RDF store over HDFS/HBase

Posted by Andrew Newman <an...@gmail.com>.
So one of the things that I've thought about when I've considered
using HBase for RDF storage was how to implement blank nodes.  I've
always talked about requiring a global lock on the system in order to
ensure that if you refer to a blank node on one node in the cluster,
it is the same node on another.  There are ways around it (using
some form of context around the blank nodes, which prevents the data
relating to a blank node from being split in the first place), but it
still seems like an outstanding problem.  I'd be interested in this
part of your solution.
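
A minimal sketch of the "context around the blank nodes" idea, assuming
that each blank node label is only meaningful inside one context (for
example, the graph or document it was loaded from).  Hashing the context
together with the local label gives every machine in the cluster the same
identifier for the same blank node without taking a lock.  The function
names, the row key layout, and the separator byte are all hypothetical;
nothing here is taken from HBase, Jena, or any project mentioned in this
thread.

```python
import hashlib


def scoped_blank_node_id(context: str, local_label: str) -> str:
    """Derive a cluster-wide identifier for a blank node.

    The same (context, local_label) pair always hashes to the same value,
    so no coordination between machines is needed to agree on it.
    """
    raw = f"{context}\x00{local_label}".encode("utf-8")
    return "_:" + hashlib.sha256(raw).hexdigest()[:16]


def row_key(context: str, subject: str) -> bytes:
    """Hypothetical row key layout: context first, then subject.

    Rows that share a context prefix sort next to each other, which is
    the property this sketch relies on to keep a blank node's data
    close together.
    """
    return f"{context}\x00{subject}".encode("utf-8")


def to_rows(context, triples):
    """Rewrite blank node labels and build (row_key, predicate, object) rows."""
    rows = []
    for s, p, o in triples:
        s = scoped_blank_node_id(context, s) if s.startswith("_:") else s
        o = scoped_blank_node_id(context, o) if o.startswith("_:") else o
        rows.append((row_key(context, s), p, o))
    return rows
```

Two graphs that both use the label `_:b0` end up with different
identifiers, while every reference to `_:b0` inside one graph resolves to
the same one.  This does not address merging blank nodes across contexts,
which is the part that still seems to need some form of coordination.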

2009/3/24 Philip M. White <pm...@qnan.org>:
> On Mon, Mar 23, 2009 at 05:33:46PM -0700, stack wrote:
>> Anywhere we can go to learn more about the effort?  What can we do in HBase
>> to make the project more likely to succeed?
>
> Right now we don't have anything of value to show you, but we plan to
> move on this pretty quickly.  We're copying the functionality of using
> HBase as the persistent store from another (proprietary) project.
>
> If you (or anyone else) would like to participate in this development,
> let me know.  We can work together on this.
>
> --
> Philip
>

Re: RDF store over HDFS/HBase

Posted by Andrew Newman <an...@gmail.com>.
So one of the things that I've thought about with regard to using HBase
for RDF storage was whether to keep blank nodes or not.  When I've spoken
about supporting blank nodes I've always talked about requiring a global
lock on the system in order to ensure that if you refer to a blank
node on one node in the cluster, it is the same node on another.
I'd be interested in this part of your solution.

2009/3/24 Philip M. White <pm...@qnan.org>:
> On Mon, Mar 23, 2009 at 05:33:46PM -0700, stack wrote:
>> Anywhere we can go to learn more about the effort?  What can we do in HBase
>> to make the project more likely to succeed?
>
> Right now we don't have anything of value to show you, but we plan to
> move on this pretty quickly.  We're copying the functionality of using
> HBase as the persistent store from another (proprietary) project.
>
> If you (or anyone else) would like to participate in this development,
> let me know.  We can work together on this.
>
> --
> Philip
>

Re: RDF store over HDFS/HBase

Posted by "Philip M. White" <pm...@qnan.org>.
On Mon, Mar 23, 2009 at 05:33:46PM -0700, stack wrote:
> Anywhere we can go to learn more about the effort?  What can we do in HBase
> to make the project more likely to succeed?

Right now we don't have anything of value to show you, but we plan to
move on this pretty quickly.  We're copying the functionality of using
HBase as the persistent store from another (proprietary) project.

If you (or anyone else) would like to participate in this development,
let me know.  We can work together on this.

-- 
Philip

Re: RDF store over HDFS/HBase

Posted by stack <st...@duboce.net>.
Philip:

Anywhere we can go to learn more about the effort?  What can we do in HBase
to make the project more likely to succeed?

Thank you,
St.Ack

On Mon, Mar 23, 2009 at 5:05 PM, Philip M. White <pm...@qnan.org> wrote:

> On Mon, Mar 23, 2009 at 04:07:01PM -0700, Amandeep Khurana wrote:
> > Has anyone explored using HDFS/HBase as the underlying storage for an RDF
> > store? Most solutions (all are single node) that I have found till now
> scale
> > up only to a couple of billion rows in the Triple store. Wondering how
> > Hadoop could be leveraged here...
>
> Amandeep, the Semantic Web Research Lab of the University of Texas at
> Dallas is working on this.  We expect to have an implementation of this
> for Jena by summer.
>
> --
> Philip
>

Re: RDF store over HDFS/HBase

Posted by "Philip M. White" <pm...@qnan.org>.
On Mon, Mar 23, 2009 at 04:07:01PM -0700, Amandeep Khurana wrote:
> Has anyone explored using HDFS/HBase as the underlying storage for an RDF
> store? Most solutions (all are single node) that I have found till now scale
> up only to a couple of billion rows in the Triple store. Wondering how
> Hadoop could be leveraged here...

Amandeep, the Semantic Web Research Lab of the University of Texas at
Dallas is working on this.  We expect to have an implementation of this
for Jena by summer.

-- 
Philip

Re: RDF store over HDFS/HBase

Posted by "Edward J. Yoon" <ed...@apache.org>.
AFAIK, some people are trying to write proof-of-concept (POC) code for the
Heart project. Also, I'm all wrapped up in Hama and incidence matrix/graph
theory, thinking about a facility for contextual reasoning.

On Tue, Mar 24, 2009 at 8:55 AM, Amandeep Khurana <am...@gmail.com> wrote:
> Yes, I have. There doesn't seem to be much activity there. I don't think
> they came out with a release.
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Mon, Mar 23, 2009 at 4:45 PM, Andrew Purtell <ap...@apache.org> wrote:
>
>>
>> Have you heard of the Heart project?
>>
>>    http://rdf-proj.blogspot.com/
>>
>> I don't know of its current status.
>>
>>   - Andy
>>
>>
>> > From: Amandeep Khurana
>> > Subject: RDF store over HDFS/HBase
>> >
>> > Has anyone explored using HDFS/HBase as the underlying
>> > storage for an RDF store?
>>
>>
>>
>>
>>
>



-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org

Re: RDF store over HDFS/HBase

Posted by Amandeep Khurana <am...@gmail.com>.
Yes, I have. There doesn't seem to be much activity there. I don't think
they came out with a release.


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Mon, Mar 23, 2009 at 4:45 PM, Andrew Purtell <ap...@apache.org> wrote:

>
> Have you heard of the Heart project?
>
>    http://rdf-proj.blogspot.com/
>
> I don't know of its current status.
>
>   - Andy
>
>
> > From: Amandeep Khurana
> > Subject: RDF store over HDFS/HBase
> >
> > Has anyone explored using HDFS/HBase as the underlying
> > storage for an RDF store?
>
>
>
>
>

Re: RDF store over HDFS/HBase

Posted by Andrew Purtell <ap...@apache.org>.
Have you heard of the Heart project?

    http://rdf-proj.blogspot.com/

I don't know of its current status. 

   - Andy


> From: Amandeep Khurana 
> Subject: RDF store over HDFS/HBase
>
> Has anyone explored using HDFS/HBase as the underlying
> storage for an RDF store?



      
