Posted to dev@jena.apache.org by Dinithi Nallaperuma <di...@gmail.com> on 2014/03/17 19:14:22 UTC
[GSOC 2014] SPARQL Query Caching
Hi All,
I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
Software Engineering at the University of Westminster, UK. I am fairly
familiar with ontologies and related subject areas.
I am interested in the project SPARQL Query Caching [1]. I would like to
know whether there is any mentor who would like to mentor this project
during the summer.
I would also like to know your opinion on using JCS [2] to implement the
proposed caching solution. I have some prior experience working with JCS,
and it caters to many of our requirements out of the box. JCS supports
in-memory caching as well as spooling to disk or a database. Further, it has
a good API for managing the cache.
The biggest challenge I've faced so far is coping with modifications. One
approach is to keep a mapping between objects and the SPARQL query results
that refer to them, and to invalidate a cache entry when an update to one of
its objects occurs. I would much appreciate any advice on how to cope with
data modifications.
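To make the idea concrete, here is a minimal sketch of that invalidation approach, using only java.util rather than JCS, and with illustrative class and method names (QueryResultCache is not an existing Jena or JCS type): an LRU map keyed by the query string, plus a mapping from each resource URI to the query keys whose cached results mentioned it, so an update to a resource invalidates exactly those entries.

```java
import java.util.*;

/** Illustrative sketch: cache SPARQL result sets keyed by query string,
 *  and invalidate entries when a resource they referred to is updated. */
public class QueryResultCache {
    private final int capacity;
    // LRU map (access order): query string -> result (a String stands in
    // for a serialized result set here)
    private final LinkedHashMap<String, String> results;
    // resource URI -> query keys whose cached results referred to it
    private final Map<String, Set<String>> resourceToQueries = new HashMap<>();

    public QueryResultCache(int capacity) {
        this.capacity = capacity;
        this.results = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > QueryResultCache.this.capacity;
            }
        };
    }

    /** Store a result along with the resources its bindings mention. */
    public void put(String query, String result, Collection<String> resources) {
        results.put(query, result);
        for (String r : resources) {
            resourceToQueries.computeIfAbsent(r, k -> new HashSet<>()).add(query);
        }
    }

    public String get(String query) {
        return results.get(query);
    }

    /** Called on update: drop every cached result that used the resource. */
    public void invalidate(String resource) {
        Set<String> stale = resourceToQueries.remove(resource);
        if (stale != null) {
            for (String q : stale) {
                results.remove(q);
            }
        }
    }
}
```

With JCS, the LinkedHashMap would be replaced by a JCS cache region, but the resource-to-queries index would still be needed on top.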
[1] https://issues.apache.org/jira/browse/JENA-626
[2] http://commons.apache.org/proper/commons-jcs/
Thank you and Regards,
Dinithi
Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation
(JENA-624)
Posted by Andy Seaborne <an...@apache.org>.
On 19/03/14 12:37, Timothy Armstrong wrote:
> I have had a really large vision about how we can enhance all our
> object-oriented technologies with the Semantic Web technologies. Running
> SPARQL on object-oriented data is part of it, and I have thought ARQ
> would be best for that purpose. Another part is that we can enhance the
> object-oriented data model with many elements of the OWL data model,
> including much of the reasoning, anywhere the object-oriented data model
> is used. The intention is to let people use the OWL data model in all
> the object-oriented programs they write, instead of just the
> object-oriented data model as it is.
>
> What I would really like, though, would be if we could get more data on
> the Semantic Web and make it larger, with all the object-oriented data
> in the world that people are willing to post. I have thought what we
> really want to do is set up SPARQL endpoints on object databases.
> Another source of object-oriented data in the world is object-relational
> mapping. I'm not entirely sure, but I have thought it might also be
> possible to set up SPARQL endpoints on data sources of object-relational
> mapping, by treating the data as object-oriented data.
AliBaba (https://bitbucket.org/openrdf/alibaba) provides an RDF ORM, so it
works in the reverse direction from exposing existing object-oriented data
(as far as I know).
Andy
Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation
(JENA-624)
Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hi,
Thank you so much for the references, Andy and Ying. I had seen AliBaba
before, but not Tupelo. I think these projects are different from what
I have in mind. I feel that it should be possible to set up SPARQL
endpoints on all the object database software. I thought maybe Fuseki
could be used for that purpose. I am aware that people have done quite a
lot of work to translate relational data into RDF and set up SPARQL
endpoints on relational databases. I have worked with D2RQ.
I just found the JSON-LD project, which I see has become a W3C
recommendation. I see that JSON-LD data can be interpreted as both
object-oriented data and RDF at the same time. So there is this very
close connection between object-oriented data and RDF.
I have found that mapping Java data into RDF is extremely simple and
requires little effort. We treat object-oriented
attributes as RDF properties, as other software has done. When a Set is
used in an attribute, it is a non-functional property. Sometimes when
people use Lists or arrays in attributes what they really mean is an
rdf:List, and sometimes what they really mean is a non-functional
property. All that is needed to treat object-oriented data as RDF is to
specify which interpretation for List and array attributes is really
meant, which can be done with a simple Java annotation on the
attribute. Then it should be possible to run SPARQL on the RDF,
although write operations are more complicated than read operations.
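As a sketch of what such an annotation could look like (the @RdfList name and the Person class here are hypothetical illustrations, not an existing Jena or other library API):

```java
import java.lang.annotation.*;
import java.util.List;

/** Hypothetical marker: a List or array attribute carrying this annotation
 *  would be mapped to an rdf:List; without it, each element would become a
 *  separate value of a non-functional property. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface RdfList {}

class Person {
    @RdfList
    List<String> orderedAliases;  // order matters: map to an rdf:List

    List<String> emailAddresses;  // map to a plain non-functional property
}
```

A mapper would then check for the annotation via reflection when deciding how to emit the triples for a List-valued attribute.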
Well, I found a way to write the TBox of any OWL ontology entirely in
Java. Here is the FOAF "knows" property, for instance:
@ObjectProperty
@label("knows")
@comment("A person known by this person (indicating some level of reciprocated interaction between the parties).")
@isDefinedBy("http://xmlns.com/foaf/0.1/")
@term_status("stable")
@domain(Person.class)
@range(Person.class)
public @interface knows {}
Thanks,
Tim Armstrong
On 03/19/2014 09:35 AM, Ying Jiang wrote:
> Hi,
>
> As far as I know, for "object-oriented data" <-> "semantic data"
> mapping, Tupelo 2 has done some similar work in 2009/2010. You can
> have a look at [1] [2]
>
> Best regards,
> Ying Jiang
>
> [1] http://tupeloproject.ncsa.uiuc.edu/node/67
> [2] http://tupeloproject.ncsa.uiuc.edu/node/69
>
>
>
>
> On Wed, Mar 19, 2014 at 8:37 PM, Timothy Armstrong <tj...@ncsu.edu> wrote:
>> Hello Andy,
>>
>> Thank you so much for your response. I would be very interested in making a
>> new implementation of DatasetGraph, although I would have to learn about the
>> issues involved in SPARQL query optimization, as I have not studied those
>> issues. I would also have to learn more about parallel programming. Well,
>> maybe it is late to be applying for GSoC. I would just like to get involved
>> in an open source Semantic Web project, in any case.
>>
>> Thank you also for the references to related work. I shall have to look at
>> them in more detail. I also need to explore the connection with JSON-LD.
>>
>> I have had a really large vision about how we can enhance all our
>> object-oriented technologies with the Semantic Web technologies. Running
>> SPARQL on object-oriented data is part of it, and I have thought ARQ would
>> be best for that purpose. Another part is that we can enhance the
>> object-oriented data model with many elements of the OWL data model,
>> including much of the reasoning, anywhere the object-oriented data model is
>> used. The intention is to let people use the OWL data model in all the
>> object-oriented programs they write, instead of just the object-oriented
>> data model as it is.
>>
>> What I would really like, though, would be if we could get more data on the
>> Semantic Web and make it larger, with all the object-oriented data in the
>> world that people are willing to post. I have thought what we really want
>> to do is set up SPARQL endpoints on object databases. Another source of
>> object-oriented data in the world is object-relational mapping. I'm not
>> entirely sure, but I have thought it might also be possible to set up SPARQL
>> endpoints on data sources of object-relational mapping, by treating the data
>> as object-oriented data.
>>
>> I do have in mind how to implement at least large parts of the functionality
>> of the Graph, Node, and Triple interfaces backed by object-oriented data,
>> and I have a lot of code working toward that purpose in Java. My
>> understanding of ARQ is that it would be sufficient in order to run SPARQL
>> SELECT, CONSTRUCT, and Update queries on object-oriented data if we would
>> just implement the Model interface, or related interfaces, backed by
>> object-oriented data. Is that correct?
>>
>> As I have in mind, there would be one implementation of Model for each piece
>> of object database software, or maybe for each piece of object-relational
>> mapping software, although the implementations could have a lot of code in
>> common. There would also be a Model for an arbitrary Java Collection of
>> Java objects that the user would supply. Additionally, there would be a
>> Model to use in Java programs that would consist of all the Java objects in
>> memory that have not been garbage collected, which we could use to run
>> SPARQL on all the objects in memory. (I have means of accessing main
>> memory in Java with AspectJ.)
>>
>> Well, as I say, maybe it is late to be applying for GSoC. I have just been
>> hoping that I can make a contribution to the Semantic Web with these ideas.
>> I need to find a conference to which to submit my article. Thank you very
>> much again for your response.
>>
>> Tim Armstrong
>>
>>
>>
>> On 03/18/2014 09:49 AM, Andy Seaborne wrote:
>>> Hi Tim,
>>>
>>> The idea of this project wasn't to implement the Model interface; it was
>>> to implement the storage-level DatasetGraph interface. Jena has an
>>> implementation for Model in memory (actually, for Graph: Model is a
>>> presentation of Graph, and Graph (and Node and Triple) are the key
>>> abstractions).
>>>
>>> Aside from GSoC:
>>>
>>> Your ideas for relating RDF access to object-oriented data sound
>>> interesting - do you have a particular source of object-oriented data in
>>> mind?
>>>
>>> I don't know of any closely related work, which isn't to say there isn't
>>> any. Does the work on CumulusRDF, which stores RDF molecules (if I
>>> remember correctly), have any relevance? Or Haystack (MIT), which used
>>> adjacency lists on nodes to store RDF - a different style from the
>>> "traditional" triple storage style.
>>>
>>> I suspect the W3C "CSV on the Web" Working Group might be connected -
>>> there, data is assumed to be in regular table structures, which can be
>>> viewed as a low-level object-oriented data format.
>>>
>>> Andy
>>>
>>> On 18/03/14 01:27, Timothy Armstrong wrote:
>>>> Hello,
>>>>
>>>> I'm interested in contributing to Jena in Google Summer of Code 2014.
>>>> I'm a computer science Ph.D. student at North Carolina State
>>>> University. I have studied the Semantic Web very passionately, as I
>>>> feel it is a wonderful vision. I have taken a course in it, worked as a
>>>> research assistant on the Protein Ontology project (
>>>> http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
>>>> source software for it. I have used Jena a lot.
>>>>
>>>> I have some ideas for JENA-624 (
>>>> https://issues.apache.org/jira/browse/JENA-624 ), although I am very
>>>> interested in directions you see for it, and I would be glad to work on
>>>> other issues. There are a lot of ideas I have had for my Semantic Web
>>>> software that are related to Jena. I would be very glad to contribute
>>>> to the Jena project in GSoC, but I would also be glad to contribute
>>>> anything in my existing software that would be useful to Jena. Well, I
>>>> realize that I am a bit late posting here for GSoC, and I am hurrying to
>>>> get my software's web site and article in a presentable form.
>>>>
>>>> I came up with a very simple interpretation of object-oriented
>>>> programming, similar to connections other people have made, that treats
>>>> all object-oriented data as triples in RDF. It means in part that we
>>>> can run SPARQL queries on any object-oriented data. I have thought it
>>>> would be very good if we could use ARQ to run SPARQL on main memory in
>>>> object-oriented programs and on object databases. I found that we can
>>>> post object-oriented data directly on the Semantic Web without having to
>>>> write any sort of mapping like D2RQ: either by translating
>>>> object-oriented data into an existing Semantic Web format, or by setting
>>>> up SPARQL endpoints on object databases. Well, I am very interested if
>>>> you are aware if any of this has been done before.
>>>>
>>>> Regarding JENA-624, I have in mind how to create implementations of the
>>>> Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
>>>> data. I have been thinking that it might help to run SPARQL on Java
>>>> data with ARQ if we could implement Model backed by Java data. I am
>>>> wondering if you think it would be applicable to JENA-624, or to any
>>>> other issues, if we could create implementations of Model in this
>>>> manner. There could be both in-memory models with Java data, and disk
>>>> models with object databases.
>>>>
>>>> So, I would be very glad to contribute.
>>>>
>>>> Thanks,
>>>> Tim Armstrong
>>>
Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)
Posted by Ying Jiang <jp...@gmail.com>.
Hi,
As far as I know, for "object-oriented data" <-> "semantic data"
mapping, Tupelo 2 has done some similar work in 2009/2010. You can
have a look at [1] [2]
Best regards,
Ying Jiang
[1] http://tupeloproject.ncsa.uiuc.edu/node/67
[2] http://tupeloproject.ncsa.uiuc.edu/node/69
On Wed, Mar 19, 2014 at 8:37 PM, Timothy Armstrong <tj...@ncsu.edu> wrote:
> Hello Andy,
>
> Thank you so much for your response. I would be very interested in making a
> new implementation of DatasetGraph, although I would have to learn about the
> issues involved in SPARQL query optimization, as I have not studied those
> issues. I would also have to learn more about parallel programming. Well,
> maybe it is late to be applying for GSoC. I would just like to get involved
> in an open source Semantic Web project, in any case.
>
> Thank you also for the references to related work. I shall have to look at
> them in more detail. I also need to explore the connection with JSON-LD.
>
> I have had a really large vision about how we can enhance all our
> object-oriented technologies with the Semantic Web technologies. Running
> SPARQL on object-oriented data is part of it, and I have thought ARQ would
> be best for that purpose. Another part is that we can enhance the
> object-oriented data model with many elements of the OWL data model,
> including much of the reasoning, anywhere the object-oriented data model is
> used. The intention is to let people use the OWL data model in all the
> object-oriented programs they write, instead of just the object-oriented
> data model as it is.
>
> What I would really like, though, would be if we could get more data on the
> Semantic Web and make it larger, with all the object-oriented data in the
> world that people are willing to post. I have thought what we really want
> to do is set up SPARQL endpoints on object databases. Another source of
> object-oriented data in the world is object-relational mapping. I'm not
> entirely sure, but I have thought it might also be possible to set up SPARQL
> endpoints on data sources of object-relational mapping, by treating the data
> as object-oriented data.
>
> I do have in mind how to implement at least large parts of the functionality
> of the Graph, Node, and Triple interfaces backed by object-oriented data,
> and I have a lot of code working toward that purpose in Java. My
> understanding of ARQ is that it would be sufficient in order to run SPARQL
> SELECT, CONSTRUCT, and Update queries on object-oriented data if we would
> just implement the Model interface, or related interfaces, backed by
> object-oriented data. Is that correct?
>
> As I have in mind, there would be one implementation of Model for each piece
> of object database software, or maybe for each piece of object-relational
> mapping software, although the implementations could have a lot of code in
> common. There would also be a Model for an arbitrary Java Collection of
> Java objects that the user would supply. Additionally, there would be a
> Model to use in Java programs that would consist of all the Java objects in
> memory that have not been garbage collected, which we could use to run
> SPARQL on all the objects in memory. (I have means of accessing main
> memory in Java with AspectJ.)
>
> Well, as I say, maybe it is late to be applying for GSoC. I have just been
> hoping that I can make a contribution to the Semantic Web with these ideas.
> I need to find a conference to which to submit my article. Thank you very
> much again for your response.
>
> Tim Armstrong
>
>
>
> On 03/18/2014 09:49 AM, Andy Seaborne wrote:
>>
>> Hi Tim,
>>
>> The idea of this project wasn't to implement the Model interface; it was
>> to implement the storage-level DatasetGraph interface. Jena has an
>> implementation for Model in memory (actually, for Graph: Model is a
>> presentation of Graph, and Graph (and Node and Triple) are the key
>> abstractions).
>>
>> Aside from GSoC:
>>
>> Your ideas for relating RDF access to object-oriented data sound
>> interesting - do you have a particular source of object-oriented data in
>> mind?
>>
>> I don't know of any closely related work, which isn't to say there isn't
>> any. Does the work on CumulusRDF, which stores RDF molecules (if I
>> remember correctly), have any relevance? Or Haystack (MIT), which used
>> adjacency lists on nodes to store RDF - a different style from the
>> "traditional" triple storage style.
>>
>> I suspect the W3C "CSV on the Web" Working Group might be connected -
>> there, data is assumed to be in regular table structures, which can be
>> viewed as a low-level object-oriented data format.
>>
>> Andy
>>
>> On 18/03/14 01:27, Timothy Armstrong wrote:
>>>
>>> Hello,
>>>
>>> I'm interested in contributing to Jena in Google Summer of Code 2014.
>>> I'm a computer science Ph.D. student at North Carolina State
>>> University. I have studied the Semantic Web very passionately, as I
>>> feel it is a wonderful vision. I have taken a course in it, worked as a
>>> research assistant on the Protein Ontology project (
>>> http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
>>> source software for it. I have used Jena a lot.
>>>
>>> I have some ideas for JENA-624 (
>>> https://issues.apache.org/jira/browse/JENA-624 ), although I am very
>>> interested in directions you see for it, and I would be glad to work on
>>> other issues. There are a lot of ideas I have had for my Semantic Web
>>> software that are related to Jena. I would be very glad to contribute
>>> to the Jena project in GSoC, but I would also be glad to contribute
>>> anything in my existing software that would be useful to Jena. Well, I
>>> realize that I am a bit late posting here for GSoC, and I am hurrying to
>>> get my software's web site and article in a presentable form.
>>>
>>> I came up with a very simple interpretation of object-oriented
>>> programming, similar to connections other people have made, that treats
>>> all object-oriented data as triples in RDF. It means in part that we
>>> can run SPARQL queries on any object-oriented data. I have thought it
>>> would be very good if we could use ARQ to run SPARQL on main memory in
>>> object-oriented programs and on object databases. I found that we can
>>> post object-oriented data directly on the Semantic Web without having to
>>> write any sort of mapping like D2RQ: either by translating
>>> object-oriented data into an existing Semantic Web format, or by setting
>>> up SPARQL endpoints on object databases. Well, I am very interested if
>>> you are aware if any of this has been done before.
>>>
>>> Regarding JENA-624, I have in mind how to create implementations of the
>>> Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
>>> data. I have been thinking that it might help to run SPARQL on Java
>>> data with ARQ if we could implement Model backed by Java data. I am
>>> wondering if you think it would be applicable to JENA-624, or to any
>>> other issues, if we could create implementations of Model in this
>>> manner. There could be both in-memory models with Java data, and disk
>>> models with object databases.
>>>
>>> So, I would be very glad to contribute.
>>>
>>> Thanks,
>>> Tim Armstrong
>>
>>
>
Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation
(JENA-624)
Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hello Andy,
Thank you so much for your response. I would be very interested in
making a new implementation of DatasetGraph, although I would have to
learn about the issues involved in SPARQL query optimization, as I have
not studied those issues. I would also have to learn more about
parallel programming. Well, maybe it is late to be applying for GSoC.
I would just like to get involved in an open source Semantic Web
project, in any case.
Thank you also for the references to related work. I shall have to look
at them in more detail. I also need to explore the connection with JSON-LD.
I have had a really large vision about how we can enhance all our
object-oriented technologies with the Semantic Web technologies. Running
SPARQL on object-oriented data is part of it, and I have thought ARQ
would be best for that purpose. Another part is that we can enhance the
object-oriented data model with many elements of the OWL data model,
including much of the reasoning, anywhere the object-oriented data model
is used. The intention is to let people use the OWL data model in all
the object-oriented programs they write, instead of just the
object-oriented data model as it is.
What I would really like, though, would be if we could get more data on
the Semantic Web and make it larger, with all the object-oriented data
in the world that people are willing to post. I have thought what we
really want to do is set up SPARQL endpoints on object databases.
Another source of object-oriented data in the world is object-relational
mapping. I'm not entirely sure, but I have thought it might also be
possible to set up SPARQL endpoints on data sources of object-relational
mapping, by treating the data as object-oriented data.
I do have in mind how to implement at least large parts of the
functionality of the Graph, Node, and Triple interfaces backed by
object-oriented data, and I have a lot of code working toward that
purpose in Java. My understanding is that implementing the Model interface,
or related interfaces, backed by object-oriented data would be sufficient
for ARQ to run SPARQL SELECT, CONSTRUCT, and Update queries on that data.
Is that correct?
As I have in mind, there would be one implementation of Model for each
piece of object database software, or maybe for each piece of
object-relational mapping software, although the implementations could
have a lot of code in common. There would also be a Model for an
arbitrary Java Collection of Java objects that the user would supply.
Additionally, there would be a Model to use in Java programs that would
consist of all the Java objects in memory that have not been garbage
collected, which we could use to run SPARQL on all the objects in
memory. (I have means of accessing main memory in Java with AspectJ.)
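The reflection step behind such a Collection-backed Model can be sketched without any Jena types (the ObjectTriples and Book names are illustrative, and a String array stands in for what would really be a Jena Triple):

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: view the fields of an arbitrary Java object as
 *  subject/predicate/object triples, the way a Collection-backed Model
 *  might enumerate its statements. */
public class ObjectTriples {
    /** One triple per non-null field: (objectId, fieldName, fieldValue). */
    public static List<String[]> triplesOf(Object o) throws IllegalAccessException {
        List<String[]> triples = new ArrayList<>();
        // A stable-enough subject identifier for illustration only;
        // a real mapping would mint or look up a URI/blank node.
        String subject = o.getClass().getSimpleName() + "#" + System.identityHashCode(o);
        for (Field f : o.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            Object value = f.get(o);
            if (value != null) {
                triples.add(new String[] { subject, f.getName(), value.toString() });
            }
        }
        return triples;
    }
}

/** Small demo type to enumerate. */
class Book {
    String title = "Dune";
    Integer year = 1965;
}
```

A Model implementation would wrap an iterator like this over every object in the supplied Collection, and the per-backend implementations would differ mainly in how they enumerate the objects.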
Well, as I say, maybe it is late to be applying for GSoC. I have just
been hoping that I can make a contribution to the Semantic Web with
these ideas. I need to find a conference to which to submit my
article. Thank you very much again for your response.
Tim Armstrong
On 03/18/2014 09:49 AM, Andy Seaborne wrote:
> Hi Tim,
>
> The idea of this project wasn't to implement the Model interface; it
> was to implement the storage-level DatasetGraph interface. Jena has an
> implementation for Model in memory (actually, for Graph: Model is a
> presentation of Graph, and Graph (and Node and Triple) are the key
> abstractions).
>
> Aside from GSoC:
>
> Your ideas for relating RDF access to object-oriented data sound
> interesting - do you have a particular source of object-oriented data
> in mind?
>
> I don't know of any closely related work, which isn't to say there
> isn't any. Does the work on CumulusRDF, which stores RDF molecules
> (if I remember correctly), have any relevance? Or Haystack (MIT), which
> used adjacency lists on nodes to store RDF - a different style
> from the "traditional" triple storage style.
>
> I suspect the W3C "CSV on the Web" Working Group might be connected -
> there, data is assumed to be in regular table structures, which can
> be viewed as a low-level object-oriented data format.
>
> Andy
>
> On 18/03/14 01:27, Timothy Armstrong wrote:
>> Hello,
>>
>> I'm interested in contributing to Jena in Google Summer of Code 2014.
>> I'm a computer science Ph.D. student at North Carolina State
>> University. I have studied the Semantic Web very passionately, as I
>> feel it is a wonderful vision. I have taken a course in it, worked as a
>> research assistant on the Protein Ontology project (
>> http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
>> source software for it. I have used Jena a lot.
>>
>> I have some ideas for JENA-624 (
>> https://issues.apache.org/jira/browse/JENA-624 ), although I am very
>> interested in directions you see for it, and I would be glad to work on
>> other issues. There are a lot of ideas I have had for my Semantic Web
>> software that are related to Jena. I would be very glad to contribute
>> to the Jena project in GSoC, but I would also be glad to contribute
>> anything in my existing software that would be useful to Jena. Well, I
>> realize that I am a bit late posting here for GSoC, and I am hurrying to
>> get my software's web site and article in a presentable form.
>>
>> I came up with a very simple interpretation of object-oriented
>> programming, similar to connections other people have made, that treats
>> all object-oriented data as triples in RDF. It means in part that we
>> can run SPARQL queries on any object-oriented data. I have thought it
>> would be very good if we could use ARQ to run SPARQL on main memory in
>> object-oriented programs and on object databases. I found that we can
>> post object-oriented data directly on the Semantic Web without having to
>> write any sort of mapping like D2RQ: either by translating
>> object-oriented data into an existing Semantic Web format, or by setting
>> up SPARQL endpoints on object databases. Well, I am very interested if
>> you are aware if any of this has been done before.
>>
>> Regarding JENA-624, I have in mind how to create implementations of the
>> Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
>> data. I have been thinking that it might help to run SPARQL on Java
>> data with ARQ if we could implement Model backed by Java data. I am
>> wondering if you think it would be applicable to JENA-624, or to any
>> other issues, if we could create implementations of Model in this
>> manner. There could be both in-memory models with Java data, and disk
>> models with object databases.
>>
>> So, I would be very glad to contribute.
>>
>> Thanks,
>> Tim Armstrong
>
Re: GSoC 2014 - Develop a new in-memory RDF Dataset implementation
(JENA-624)
Posted by Andy Seaborne <an...@apache.org>.
Hi Tim,
The idea of this project wasn't to implement the Model interface; it was
to implement the storage-level DatasetGraph interface. Jena has an
implementation for Model in memory (actually, for Graph: Model is a
presentation of Graph, and Graph (and Node and Triple) are the key
abstractions).
Aside from GSoC:
Your ideas for relating RDF access to object-oriented data sound interesting
- do you have a particular source of object-oriented data in mind?
I don't know of any closely related work, which isn't to say there isn't
any. Does the work on CumulusRDF, which stores RDF molecules (if I
remember correctly), have any relevance? Or Haystack (MIT), which used
adjacency lists on nodes to store RDF - a different style from the
"traditional" triple storage style.
I suspect the W3C "CSV on the Web" Working Group might be connected -
there, data is assumed to be in regular table structures, which can be
viewed as a low-level object-oriented data format.
Andy
On 18/03/14 01:27, Timothy Armstrong wrote:
> Hello,
>
> I'm interested in contributing to Jena in Google Summer of Code 2014.
> I'm a computer science Ph.D. student at North Carolina State
> University. I have studied the Semantic Web very passionately, as I
> feel it is a wonderful vision. I have taken a course in it, worked as a
> research assistant on the Protein Ontology project (
> http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
> source software for it. I have used Jena a lot.
>
> I have some ideas for JENA-624 (
> https://issues.apache.org/jira/browse/JENA-624 ), although I am very
> interested in directions you see for it, and I would be glad to work on
> other issues. There are a lot of ideas I have had for my Semantic Web
> software that are related to Jena. I would be very glad to contribute
> to the Jena project in GSoC, but I would also be glad to contribute
> anything in my existing software that would be useful to Jena. Well, I
> realize that I am a bit late posting here for GSoC, and I am hurrying to
> get my software's web site and article in a presentable form.
>
> I came up with a very simple interpretation of object-oriented
> programming, similar to connections other people have made, that treats
> all object-oriented data as triples in RDF. It means in part that we
> can run SPARQL queries on any object-oriented data. I have thought it
> would be very good if we could use ARQ to run SPARQL on main memory in
> object-oriented programs and on object databases. I found that we can
> post object-oriented data directly on the Semantic Web without having to
> write any sort of mapping like D2RQ: either by translating
> object-oriented data into an existing Semantic Web format, or by setting
> up SPARQL endpoints on object databases. Well, I am very interested if
> you are aware if any of this has been done before.
>
> Regarding JENA-624, I have in mind how to create implementations of the
> Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
> data. I have been thinking that it might help to run SPARQL on Java
> data with ARQ if we could implement Model backed by Java data. I am
> wondering if you think it would be applicable to JENA-624, or to any
> other issues, if we could create implementations of Model in this
> manner. There could be both in-memory models with Java data, and disk
> models with object databases.
>
> So, I would be very glad to contribute.
>
> Thanks,
> Tim Armstrong
GSoC 2014 - Develop a new in-memory RDF Dataset implementation (JENA-624)
Posted by Timothy Armstrong <tj...@ncsu.edu>.
Hello,
I'm interested in contributing to Jena in Google Summer of Code 2014.
I'm a computer science Ph.D. student at North Carolina State
University. I have studied the Semantic Web very passionately, as I
feel it is a wonderful vision. I have taken a course in it, worked as a
research assistant on the Protein Ontology project (
http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
source software for it. I have used Jena a lot.
I have some ideas for JENA-624 (
https://issues.apache.org/jira/browse/JENA-624 ), although I am very
interested in directions you see for it, and I would be glad to work on
other issues. There are a lot of ideas I have had for my Semantic Web
software that are related to Jena. I would be very glad to contribute
to the Jena project in GSoC, but I would also be glad to contribute
anything in my existing software that would be useful to Jena. Well, I
realize that I am a bit late posting here for GSoC, and I am hurrying to
get my software's web site and article in a presentable form.
I came up with a very simple interpretation of object-oriented
programming, similar to connections other people have made, that treats
all object-oriented data as triples in RDF. It means in part that we
can run SPARQL queries on any object-oriented data. I have thought it
would be very good if we could use ARQ to run SPARQL on main memory in
object-oriented programs and on object databases. I found that we can
post object-oriented data directly on the Semantic Web without having to
write any sort of mapping like D2RQ: either by translating
object-oriented data into an existing Semantic Web format, or by setting
up SPARQL endpoints on object databases. Well, I am very interested if
you are aware if any of this has been done before.
Regarding JENA-624, I have in mind how to create implementations of the
Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
data. I have been thinking that it might help to run SPARQL on Java
data with ARQ if we could implement Model backed by Java data. I am
wondering if you think it would be applicable to JENA-624, or to any
other issues, if we could create implementations of Model in this
manner. There could be both in-memory models with Java data, and disk
models with object databases.
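[Editor's note: a Model backed by Java data ultimately has to answer one call — find(subject, predicate, object) with wildcards in any position; Jena's GraphBase reduces the Model API to essentially that method. The standalone sketch below shows only that matching core, in plain Java; the class and method names are illustrative, not Jena's actual SPI.]

```java
import java.util.ArrayList;
import java.util.List;

public class JavaDataGraph {
    public record Triple(String s, String p, String o) {}

    private final List<Triple> triples = new ArrayList<>();

    public void add(String s, String p, String o) {
        triples.add(new Triple(s, p, o));
    }

    // null plays the role of Jena's Node.ANY wildcard: any triple
    // component matches when the corresponding pattern slot is null.
    public List<Triple> find(String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            if ((s == null || s.equals(t.s()))
                    && (p == null || p.equals(t.p()))
                    && (o == null || o.equals(t.o()))) {
                out.add(t);
            }
        }
        return out;
    }
}
```

Backing this by live Java objects rather than a list, and returning an iterator instead of a materialized list, is where the JENA-624 design work would actually lie.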
So, I would be very glad to contribute.
Thanks,
Tim Armstrong
Re: [GSOC 2014] SPARQL Query Caching
Posted by Andy Seaborne <an...@apache.org>.
On 19/03/14 18:32, Dinithi Nallaperuma wrote:
> Hi Andy,
>
> Thanks for your reply.
> I guess when you say 'an Apache committer' you mean an Apache committer who's
> with the Jena project.
In theory, any Apache committer, but it does need someone with
familiarity with the Jena codebase, which in practice means someone on
the Jena project.
Mentors have several roles:
1/ GSoC Process (so you get paid!)
2/ Technical discussion
3/ Committing to the codebase (though forking on GitHub means that isn't
a major one)
Andy
> I would like to take this opportunity to ask whether any Apache committer
> subscribed here would like to mentor the project 'SPARQL Query Caching'
>
>
> On Tue, Mar 18, 2014 at 7:21 PM, Andy Seaborne <an...@apache.org> wrote:
>
>> Hi Dinithi,
>>
>> Thank you for your interest - at the moment, no one has expressed an
>> interest in mentoring that project. Sorry about that - the project makes
>> paging results using LIMIT-OFFSET work quite nicely, but unless someone steps
>> forward (an Apache committer), the project is mentor-less.
>>
>> Andy
>>
>>
>> On 17/03/14 18:14, Dinithi Nallaperuma wrote:
>>
>>> Hi All,
>>>
>>> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
>>> Software Engineering at the University of Westminster, UK. I am fairly
>>> familiar with ontologies and related subject areas.
>>>
>>> I am interested in the project SPARQL Query Caching [1]. I would like to
>>> know whether there is any mentor who would like to mentor this project
>>> during the summer.
>>>
>>> I would also like to know your opinion on using JCS [2] to implement the
>>> proposed caching solution. I have some prior experience working with JCS
>>> and it caters to many of our requirements out of the box. JCS supports in-memory
>>> caching as well as spooling to disk or database. Further, it has a good API
>>> for managing the cache.
>>>
>>> The biggest challenge I've faced so far is coping with modifications. One
>>> approach is to keep a mapping between the objects and the SPARQL query results
>>> that referred to them, and invalidate the cache entry when an update to the
>>> object occurs. I would much appreciate any advice on how to cope with data
>>> modifications.
>>>
>>
>> This is very hard in SPARQL because of negatives (parts of data that
>> caused results to be excluded), not just the parts of the data that lead to
>> a specific variable binding.
>>
>> That said, given that most use is publishing (more read than write),
>> dumping the cache after an update is a viable scheme in many situations if
>> the impact at the time of update is tolerable, such as during a quiet
>> period (like early morning).
>>
>
Re: [GSOC 2014] SPARQL Query Caching
Posted by Dinithi Nallaperuma <di...@gmail.com>.
Hi Andy,
Thanks for your reply.
I guess when you say 'an Apache committer' you mean an Apache committer who's
with the Jena project.
I would like to take this opportunity to ask whether any Apache committer
subscribed here would like to mentor the project 'SPARQL Query Caching'
On Tue, Mar 18, 2014 at 7:21 PM, Andy Seaborne <an...@apache.org> wrote:
> Hi Dinithi,
>
> Thank you for your interest - at the moment, no one has expressed an
> interest in mentoring that project. Sorry about that - the project makes
> paging results using LIMIT-OFFSET work quite nicely, but unless someone steps
> forward (an Apache committer), the project is mentor-less.
>
> Andy
>
>
> On 17/03/14 18:14, Dinithi Nallaperuma wrote:
>
>> Hi All,
>>
>> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
>> Software Engineering at the University of Westminster, UK. I am fairly
>> familiar with ontologies and related subject areas.
>>
>> I am interested in the project SPARQL Query Caching [1]. I would like to
>> know whether there is any mentor who would like to mentor this project
>> during the summer.
>>
>> I would also like to know your opinion on using JCS [2] to implement the
>> proposed caching solution. I have some prior experience working with JCS
>> and it caters to many of our requirements out of the box. JCS supports in-memory
>> caching as well as spooling to disk or database. Further, it has a good API
>> for managing the cache.
>>
>> The biggest challenge I've faced so far is coping with modifications. One
>> approach is to keep a mapping between the objects and the SPARQL query results
>> that referred to them, and invalidate the cache entry when an update to the
>> object occurs. I would much appreciate any advice on how to cope with data
>> modifications.
>>
>
> This is very hard in SPARQL because of negatives (parts of data that
> caused results to be excluded), not just the parts of the data that lead to
> a specific variable binding.
>
> That said, given that most use is publishing (more read than write),
> dumping the cache after an update is a viable scheme in many situations if
> the impact at the time of update is tolerable, such as during a quiet
> period (like early morning).
>
Re: [GSOC 2014] SPARQL Query Caching
Posted by Andy Seaborne <an...@apache.org>.
Hi Dinithi,
Thank you for your interest - at the moment, no one has expressed an
interest in mentoring that project. Sorry about that - the project
makes paging results using LIMIT-OFFSET work quite nicely, but unless
someone steps forward (an Apache committer), the project is mentor-less.
Andy
On 17/03/14 18:14, Dinithi Nallaperuma wrote:
> Hi All,
>
> I am Dinithi Nallaperuma, a postgraduate student pursuing an MSc in Advanced
> Software Engineering at the University of Westminster, UK. I am fairly
> familiar with ontologies and related subject areas.
>
> I am interested in the project SPARQL Query Caching [1]. I would like to
> know whether there is any mentor who would like to mentor this project
> during the summer.
>
> I would also like to know your opinion on using JCS [2] to implement the
> proposed caching solution. I have some prior experience working with JCS
> and it caters to many of our requirements out of the box. JCS supports in-memory
> caching as well as spooling to disk or database. Further, it has a good API
> for managing the cache.
>
> The biggest challenge I've faced so far is coping with modifications. One
> approach is to keep a mapping between the objects and the SPARQL query results
> that referred to them, and invalidate the cache entry when an update to the
> object occurs. I would much appreciate any advice on how to cope with data
> modifications.
This is very hard in SPARQL because of negatives (parts of data that
caused results to be excluded), not just the parts of the data that lead
to a specific variable binding.
That said, given that most use is publishing (more read than write),
dumping the cache after an update is a viable scheme in many situations
if the impact at the time of update is tolerable, such as during a quiet
period (like early morning).
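[Editor's note: the dump-the-cache-after-an-update scheme Andy describes is small enough to sketch in plain Java. All names below are illustrative and not taken from Jena or JCS; JCS or another cache library would supply the actual storage.]

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DumpOnUpdateCache {
    // A plain map stands in for the cache region a library like JCS
    // would manage (keyed by query string, holding serialized results).
    private final Map<String, String> resultsByQuery = new ConcurrentHashMap<>();

    // Returns cached results, or null on a miss.
    public String get(String query) {
        return resultsByQuery.get(query);
    }

    public void put(String query, String results) {
        resultsByQuery.put(query, results);
    }

    // Called whenever the dataset is updated: drop everything, because
    // SPARQL's negation and OPTIONAL make it hard to know which cached
    // results an update silently affects.
    public void onUpdate() {
        resultsByQuery.clear();
    }
}
```

The coarse `onUpdate()` trades precision for correctness, which matches the read-heavy publishing workloads described above; fine-grained invalidation would need the object-to-result mapping Dinithi proposes, plus handling of the negation problem.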
>
> [1] https://issues.apache.org/jira/browse/JENA-626
> [2] http://commons.apache.org/proper/commons-jcs/
>
> Thank you and Regards,
> Dinithi
>