Posted to dev@jena.apache.org by Dinithi Nallaperuma <di...@gmail.com> on 2014/03/17 19:14:22 UTC

[GSOC 2014] SPARQL Query Caching

Hi All,

I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in
Advanced Software Engineering at the University of Westminster, UK. I am
fairly familiar with ontologies and related subject areas.

I am interested in the project SPARQL Query Caching [1]. I would like to
know whether there is any mentor who would like to mentor this project
during the summer.

I would also like to know your opinion on using JCS [2] to implement the
proposed caching solution. I have some prior experience working with JCS,
and it caters to many of our requirements out of the box. JCS supports
in-memory caching as well as spooling to disk or to a database. Further,
it has a good API for managing the cache.
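
For reference, JCS is configured through a cache.ccf properties file. A
sketch along these lines would set up the in-memory region with disk
spooling mentioned above; the exact keys and class names below are quoted
from memory and should be checked against the JCS version actually used:

```properties
# Default region: LRU memory cache backed by the DC disk auxiliary
jcs.default=DC
jcs.default.cacheattributes=org.apache.commons.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=1000
jcs.default.cacheattributes.MemoryCacheName=org.apache.commons.jcs.engine.memory.lru.LRUMemoryCache

# DC: indexed disk cache for spooling entries evicted from memory
jcs.auxiliary.DC=org.apache.commons.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.commons.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC.attributes.DiskPath=/tmp/jcs
```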

The biggest challenge I've faced so far is coping with data modifications.
One approach is to keep a mapping between the objects and the SPARQL query
results that referred to them, and to invalidate a cache entry when an
update to the corresponding object occurs. I would much appreciate any
advice on how to cope with data modifications.
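
To make the idea concrete, here is a minimal stdlib-only sketch of that
invalidation approach: a map from each resource to the cached query results
that referred to it, so an update to the resource drops exactly those
entries. All names here (QueryResultCache, invalidate, ...) are
hypothetical, not JCS or Jena API:

```java
import java.util.*;

// Sketch of the proposed invalidation strategy (names are illustrative).
class QueryResultCache {
    private final Map<String, List<String>> resultsByQuery = new HashMap<>();
    private final Map<String, Set<String>> queriesByResource = new HashMap<>();

    // Cache a result set, recording which resources it referred to.
    void put(String query, List<String> rows, Collection<String> resources) {
        resultsByQuery.put(query, rows);
        for (String r : resources) {
            queriesByResource.computeIfAbsent(r, k -> new HashSet<>()).add(query);
        }
    }

    List<String> get(String query) {
        return resultsByQuery.get(query);
    }

    // An update to a resource invalidates every cached result that used it.
    void invalidate(String resource) {
        Set<String> stale = queriesByResource.remove(resource);
        if (stale != null) {
            for (String q : stale) {
                resultsByQuery.remove(q);
            }
        }
    }
}
```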

[1] https://issues.apache.org/jira/browse/JENA-626
[2] http://commons.apache.org/proper/commons-jcs/

Thank you and Regards,
Dinithi

Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Andy Seaborne <an...@apache.org>.
On 19/03/14 12:37, Timothy Armstrong wrote:
> I have had a really large vision about how we can enhance all our
> object-oriented technologies with the Semantic Web technologies. Running
> SPARQL on object-oriented data is part of it, and I have thought ARQ
> would be best for that purpose.  Another part is that we can enhance the
> object-oriented data model with many elements of the OWL data model,
> including much of the reasoning, anywhere the object-oriented data model
> is used.  The intention is to let people use the OWL data model in all
> the object-oriented programs they write, instead of just the
> object-oriented data model as it is.
>
> What I would really like, though, would be if we could get more data on
> the Semantic Web and make it larger, with all the object-oriented data
> in the world that people are willing to post.  I have thought what we
> really want to do is set up SPARQL endpoints on object databases.
> Another source of object-oriented data in the world is object-relational
> mapping.  I'm not entirely sure, but I have thought it might also be
> possible to set up SPARQL endpoints on data sources of object-relational
> mapping, by treating the data as object-oriented data.

Alibaba (https://bitbucket.org/openrdf/alibaba) provides an RDF ORM, so it 
works in the reverse direction to exposing existing object-oriented data 
(as far as I know).

	Andy

Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hi,

Thank you so much for the references, Andy and Ying.  I had seen AliBaba 
before, but not Tupelo.  I think these projects are different from what 
I have in mind.  I feel that it should be possible to set up SPARQL 
endpoints on all the object database software.  I thought maybe Fuseki 
could be used for that purpose. I am aware that people have done quite a 
lot of work to translate relational data into RDF and set up SPARQL 
endpoints on relational databases.  I have worked with D2RQ.

I just found the JSON-LD project, which I see has become a W3C 
recommendation.  I see that JSON-LD data can be interpreted as both 
object-oriented data and RDF at the same time.  So there is this very 
close connection between object-oriented data and RDF.

I have found that mapping Java data into RDF is extremely simple and 
requires very little effort.  We treat object-oriented attributes as RDF 
properties, as other software has done.  When a Set is 
used in an attribute, it is a non-functional property. Sometimes when 
people use Lists or arrays in attributes what they really mean is an 
rdf:List, and sometimes what they really mean is a non-functional 
property.  All that is needed to treat object-oriented data as RDF is to 
specify which interpretation for List and array attributes is really 
meant, which can be done with a simple Java annotation on the 
attribute.  Then it should be possible to run SPARQL on the RDF, 
although write operations are more complicated than read operations.
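
A sketch of that annotation idea might look like the following; @AsRdfList
is an assumed name for illustration, not an existing Jena API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;
import java.util.Set;

// Hypothetical annotation: marks whether a List or array attribute should be
// read as an ordered rdf:List, or left as a repeated (non-functional) property.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface AsRdfList {}

class Person {
    Set<String> nicknames;        // Set: always non-functional, one triple per element

    @AsRdfList
    List<String> publications;    // annotated: serialize as an ordered rdf:List

    String[] emails;              // unannotated array: treat as a repeated property
}
```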

Well, I found a way to write the TBox of any OWL ontology entirely in 
Java.  Here is the FOAF "knows" property, for instance:

@ObjectProperty
@label("knows")
@comment("A person known by this person (indicating some level of "
         + "reciprocated interaction between the parties).")
@isDefinedBy("http://xmlns.com/foaf/0.1/")
@term_status("stable")
@domain(Person.class)
@range(Person.class)
public @interface knows {}

Thanks,
Tim Armstrong


On 03/19/2014 09:35 AM, Ying Jiang wrote:
> Hi,
>
> As far as I know, for "object-oriented data" <-> "semantic data"
> mapping, Tupelo 2 has done some similar work in 2009/2010. You can
> have a look at [1] [2]
>
> Best regards,
> Ying Jiang
>
> [1] http://tupeloproject.ncsa.uiuc.edu/node/67
> [2] http://tupeloproject.ncsa.uiuc.edu/node/69


Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Ying Jiang <jp...@gmail.com>.
Hi,

As far as I know, for "object-oriented data" <-> "semantic data"
mapping, Tupelo 2 has done some similar work in 2009/2010. You can
have a look at [1] [2]

Best regards,
Ying Jiang

[1] http://tupeloproject.ncsa.uiuc.edu/node/67
[2] http://tupeloproject.ncsa.uiuc.edu/node/69





Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hello Andy,

Thank you so much for your response.  I would be very interested in 
making a new implementation of DatasetGraph, although I would have to 
learn about the issues involved in SPARQL query optimization, as I have 
not studied those issues.  I would also have to learn more about 
parallel programming.  Well, maybe it is late to be applying for GSoC.  
I would just like to get involved in an open source Semantic Web 
project, in any case.

Thank you also for the references to related work.  I shall have to look 
at them in more detail.  I also need to explore the connection with JSON-LD.

I have had a really large vision about how we can enhance all our 
object-oriented technologies with the Semantic Web technologies. Running 
SPARQL on object-oriented data is part of it, and I have thought ARQ 
would be best for that purpose.  Another part is that we can enhance the 
object-oriented data model with many elements of the OWL data model, 
including much of the reasoning, anywhere the object-oriented data model 
is used.  The intention is to let people use the OWL data model in all 
the object-oriented programs they write, instead of just the 
object-oriented data model as it is.

What I would really like, though, would be if we could get more data on 
the Semantic Web and make it larger, with all the object-oriented data 
in the world that people are willing to post.  I have thought what we 
really want to do is set up SPARQL endpoints on object databases.  
Another source of object-oriented data in the world is object-relational 
mapping.  I'm not entirely sure, but I have thought it might also be 
possible to set up SPARQL endpoints on data sources of object-relational 
mapping, by treating the data as object-oriented data.

I do have in mind how to implement at least large parts of the 
functionality of the Graph, Node, and Triple interfaces backed by 
object-oriented data, and I have a lot of code working toward that 
purpose in Java.  My understanding is that implementing the Model 
interface, or related interfaces, backed by object-oriented data would be 
sufficient for ARQ to run SPARQL SELECT, CONSTRUCT, and Update queries on 
that data.  Is that correct?

As I have in mind, there would be one implementation of Model for each 
piece of object database software, or maybe for each piece of 
object-relational mapping software, although the implementations could 
have a lot of code in common.  There would also be a Model for an 
arbitrary Java Collection of Java objects that the user would supply.  
Additionally, there would be a Model to use in Java programs that would 
consist of all the Java objects in memory that have not been garbage 
collected, which we could use to run SPARQL on all the objects in 
memory.   (I have means of accessing main memory in Java with AspectJ.)
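
The "Model over an arbitrary Java Collection" idea above can be sketched
without any Jena dependency: reflect over the objects and emit (subject,
predicate, object) triples, one per non-null instance field. Every name
here is illustrative, not part of Jena:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Stdlib-only sketch: turn a Collection of Java objects into string triples.
class ObjectTripleSource {
    static List<String[]> triples(Collection<?> objects) {
        List<String[]> out = new ArrayList<>();
        for (Object o : objects) {
            // Identity-based subject: stands in for a blank node or minted URI.
            String subject = o.getClass().getSimpleName() + "#" + System.identityHashCode(o);
            for (Field f : o.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                Object value;
                try {
                    value = f.get(o);
                } catch (IllegalAccessException e) {
                    continue; // inaccessible field: skip rather than fail
                }
                if (value != null) out.add(new String[]{subject, f.getName(), value.toString()});
            }
        }
        return out;
    }
}

class Book {
    String title = "Dune";
    String author = "Herbert";
}
```

A real implementation would plug a source like this in behind Jena's Graph
find operations rather than materializing a list.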

Well, as I say, maybe it is late to be applying for GSoC.  I have just 
been hoping that I can make a contribution to the Semantic Web with 
these ideas.  I need to find a conference to which to submit my 
article.  Thank you very much again for your response.

Tim Armstrong




Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Andy Seaborne <an...@apache.org>.
Hi Tim,

The idea of this project wasn't to implement the Model interface; it was 
to implement the storage-level DatasetGraph interface.  Jena has an 
implementation for Model in memory (actually, for Graph: Model is a 
presentation of Graph, and Graph, together with Node and Triple, are the 
key abstractions).

Aside from GSoC:

Your ideas for relating RDF access to object-oriented sounds interesting 
- do you have a particular source of object-oriented data in mind?

I don't know of any closely related work, which isn't to say there isn't 
any.  Does the work on CumulusRDF, which stores RDF molecules (if I 
remember correctly), have any relevance?  Or Haystack (MIT), which used 
adjacency lists on nodes to store RDF, a different style to the 
"traditional" triple storage style?

I suspect the W3C "CSV on the Web" Working Group might be connected: 
there, data is assumed to be in regular table structures, which can be 
viewed as a low-level object-oriented data format.

	Andy



GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)

Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hello,

I'm interested in contributing to Jena in Google Summer of Code 2014.  
I'm a computer science Ph.D. student at North Carolina State 
University.  I have studied the Semantic Web very passionately, as I 
feel it is a wonderful vision.  I have taken a course in it, worked as a 
research assistant on the Protein Ontology project ( 
http://pir.georgetown.edu/pro/pro.shtml ), and developed some open 
source software for it.  I have used Jena a lot.

I have some ideas for JENA-624 ( 
https://issues.apache.org/jira/browse/JENA-624 ), although I am very 
interested in directions you see for it, and I would be glad to work on 
other issues.  There are a lot of ideas I have had for my Semantic Web 
software that are related to Jena.  I would be very glad to contribute 
to the Jena project in GSoC, but I would also be glad to contribute 
anything in my existing software that would be useful to Jena.  Well, I 
realize that I am a bit late posting here for GSoC, and I am hurrying to 
get my software's web site and article in a presentable form.

I came up with a very simple interpretation of object-oriented 
programming, similar to connections other people have made, that treats 
all object-oriented data as triples in RDF.  It means in part that we 
can run SPARQL queries on any object-oriented data.  I have thought it 
would be very good if we could use ARQ to run SPARQL on main memory in 
object-oriented programs and on object databases.  I found that we can 
post object-oriented data directly on the Semantic Web without having to 
write any sort of mapping like D2RQ: either by translating 
object-oriented data into an existing Semantic Web format, or by setting 
up SPARQL endpoints on object databases. Well, I would be very interested 
to hear whether any of this has been done before.

Regarding JENA-624, I have in mind how to create implementations of the 
Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java 
data.  I have been thinking that it might help to run SPARQL on Java 
data with ARQ if we could implement Model backed by Java data. I am 
wondering if you think it would be applicable to JENA-624, or to any 
other issues, if we could create implementations of Model in this 
manner.  There could be both in-memory models with Java data, and disk 
models with object databases.
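To make the "object-oriented data as triples" idea concrete, here is a rough, self-contained sketch of the mapping (it does not use the real Jena Graph/Model SPI; the class and method names are invented for illustration, and a real JENA-624 implementation would implement com.hp.hpl.jena.graph.Graph instead):

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: view a Java object's fields as
// (subject, predicate, object) triples, the way the proposal treats
// object-oriented data as RDF.
public class ObjectTriples {

    // One triple: the subject identifies the object instance, the
    // predicate is the field name, the object is the field value.
    public static final class Triple {
        public final String subject, predicate, object;
        Triple(String s, String p, String o) {
            subject = s; predicate = p; object = o;
        }
        @Override public String toString() {
            return subject + " " + predicate + " " + object;
        }
    }

    // Reflect over the declared fields of a Java object and emit one
    // triple per field.
    public static List<Triple> triplesOf(Object o) throws IllegalAccessException {
        List<Triple> triples = new ArrayList<>();
        String subject = o.getClass().getSimpleName() + "@" + System.identityHashCode(o);
        for (Field f : o.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            triples.add(new Triple(subject, f.getName(), String.valueOf(f.get(o))));
        }
        return triples;
    }

    // Example domain object.
    static class Person {
        String name = "Alice";
        int age = 30;
    }

    public static void main(String[] args) throws Exception {
        for (Triple t : triplesOf(new Person())) {
            System.out.println(t);
        }
    }
}
```

A Model backed by this kind of reflective view would let ARQ match triple patterns directly against live Java objects, with no intermediate export step.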

So, I would be very glad to contribute.

Thanks,
Tim Armstrong

Re: [GSOC 2014] SPARQL Query Caching

Posted by Andy Seaborne <an...@apache.org>.
On 19/03/14 18:32, Dinithi Nallaperuma wrote:
> Hi Andy,
>
> Thanks for your reply.
> I guess when you say 'an Apache committer' you mean an Apache committer who's
> with the Jena project.

In theory, any Apache committer, but it does need someone with 
familiarity with the Jena codebase, which in practice means someone on 
the Jena project.

Mentors have several roles:
1/ GSoC Process (so you get paid!)
2/ Technical discussion
3/ Committing to the codebase (though forking on GitHub means that isn't a 
major issue)

	Andy

> I would like to take this opportunity to ask whether any Apache committer
> subscribed here would like to mentor the project 'SPARQL Query Caching'
>
>
> On Tue, Mar 18, 2014 at 7:21 PM, Andy Seaborne <an...@apache.org> wrote:
>
>> Hi Dinithi,
>>
>> Thank you for your interest - at the moment, no one has expressed an
>> interest in mentoring that project.  Sorry about that - the project makes
>> paging results using LIMIT-OFFSET work quite nicely, but unless someone
>> steps forward (an Apache committer), the project is mentor-less.
>>
>>          Andy
>>
>>
>> On 17/03/14 18:14, Dinithi Nallaperuma wrote:
>>
>>> Hi All,
>>>
>>> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
>>> Software Engineering at the University of Westminster, UK. I am fairly
>>> familiar with ontologies and related subject areas.
>>>
>>> I am interested in the project SPARQL Query Caching [1]. I would like to
>>> know whether there is any mentor who would like to mentor this project
>>> during the summer.
>>>
>>> I would also like to know your opinion on using JCS [2] to implement the
>>> proposed caching solution. I have some prior experience working with JCS
>>> and it can cater for many of our requirements out of the box. JCS supports
>>> in-memory caching as well as spooling to disk or a database. Further, it has
>>> a good API for managing the cache.
>>>
>>> The biggest challenge I've faced so far is coping with modifications. One
>>> approach is to keep a mapping between the objects and the SPARQL query
>>> results that referred to them, and to invalidate the cache entry when an
>>> update to the object occurs. I would much appreciate any advice on how to
>>> cope with data modifications.
>>>
>>
>> This is very hard in SPARQL because of negatives (parts of data that
>> caused results to be excluded), not just the parts of the data that lead to
>> a specific variable binding.
>>
>> That said, given that most use is publishing (more read than write),
>> dumping the cache after an update is a viable scheme in many situations if
>> the impact at the time of update is tolerable, such as during a quiet
>> period (like early morning).
>>
>


Re: [GSOC 2014] SPARQL Query Caching

Posted by Dinithi Nallaperuma <di...@gmail.com>.
Hi Andy,

Thanks for your reply.
I guess when you say 'an Apache committer' you mean an Apache committer who's
with the Jena project.
I would like to take this opportunity to ask whether any Apache committer
subscribed here would like to mentor the project 'SPARQL Query Caching'.


On Tue, Mar 18, 2014 at 7:21 PM, Andy Seaborne <an...@apache.org> wrote:

> Hi Dinithi,
>
> Thank you for your interest - at the moment, no one has expressed an
> interest in mentoring that project.  Sorry about that - the project makes
> paging results using LIMIT-OFFSET work quite nicely, but unless someone
> steps forward (an Apache committer), the project is mentor-less.
>
>         Andy
>
>
> On 17/03/14 18:14, Dinithi Nallaperuma wrote:
>
>> Hi All,
>>
>> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
>> Software Engineering at the University of Westminster, UK. I am fairly
>> familiar with ontologies and related subject areas.
>>
>> I am interested in the project SPARQL Query Caching [1]. I would like to
>> know whether there is any mentor who would like to mentor this project
>> during the summer.
>>
>> I would also like to know your opinion on using JCS [2] to implement the
>> proposed caching solution. I have some prior experience working with JCS
>> and it can cater for many of our requirements out of the box. JCS supports
>> in-memory caching as well as spooling to disk or a database. Further, it has
>> a good API for managing the cache.
>>
>> The biggest challenge I've faced so far is coping with modifications. One
>> approach is to keep a mapping between the objects and the SPARQL query
>> results that referred to them, and to invalidate the cache entry when an
>> update to the object occurs. I would much appreciate any advice on how to
>> cope with data modifications.
>>
>
> This is very hard in SPARQL because of negatives (parts of data that
> caused results to be excluded), not just the parts of the data that lead to
> a specific variable binding.
>
> That said, given that most use is publishing (more read than write),
> dumping the cache after an update is a viable scheme in many situations if
> the impact at the time of update is tolerable, such as during a quiet
> period (like early morning).
>

Re: [GSOC 2014] SPARQL Query Caching

Posted by Andy Seaborne <an...@apache.org>.
Hi Dinithi,

Thank you for your interest - at the moment, no one has expressed an 
interest in mentoring that project.  Sorry about that - the project 
makes paging results using LIMIT-OFFSET work quite nicely, but unless 
someone steps forward (an Apache committer), the project is mentor-less.
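The LIMIT-OFFSET paging point is that consecutive pages share one base query, so a cache keyed on the base query can serve every page from a single stored result set. A small illustrative sketch (the base query and page size are made-up examples):

```java
// Illustrative only: building the paged SPARQL query strings that a
// result cache would make cheap to serve.
public class PagedQuery {

    // Return the SPARQL text for one page of results. All pages share
    // the same base query, so a cache keyed on the base query can
    // answer each page from one stored result set.
    public static String page(String baseQuery, int pageSize, int pageNumber) {
        return baseQuery
                + " ORDER BY ?s"                 // stable order across pages
                + " LIMIT " + pageSize
                + " OFFSET " + (pageSize * pageNumber);
    }

    public static void main(String[] args) {
        String base = "SELECT ?s WHERE { ?s ?p ?o }";
        System.out.println(page(base, 20, 0));
        System.out.println(page(base, 20, 1));
    }
}
```

Note the ORDER BY: without a stable ordering, OFFSET gives no guarantee that pages don't overlap or skip results between requests.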

	Andy

On 17/03/14 18:14, Dinithi Nallaperuma wrote:
> Hi All,
>
> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
> Software Engineering at the University of Westminster, UK. I am fairly
> familiar with ontologies and related subject areas.
>
> I am interested in the project SPARQL Query Caching [1]. I would like to
> know whether there is any mentor who would like to mentor this project
> during the summer.
>
> I would also like to know your opinion on using JCS [2] to implement the
> proposed caching solution. I have some prior experience working with JCS
> and it can cater for many of our requirements out of the box. JCS supports
> in-memory caching as well as spooling to disk or a database. Further, it has
> a good API for managing the cache.
>
> The biggest challenge I've faced so far is coping with modifications. One
> approach is to keep a mapping between the objects and the SPARQL query
> results that referred to them, and to invalidate the cache entry when an
> update to the object occurs. I would much appreciate any advice on how to
> cope with data modifications.

This is very hard in SPARQL because of negatives (parts of data that 
caused results to be excluded), not just the parts of the data that lead 
to a specific variable binding.

That said, given that most use is publishing (more read than write), 
dumping the cache after an update is a viable scheme in many situations 
if the impact at the time of update is tolerable, such as during a quiet 
period (like early morning).

>
> [1] https://issues.apache.org/jira/browse/JENA-626
> [2] http://commons.apache.org/proper/commons-jcs/
>
> Thank you and Regards,
> Dinithi
>