Posted to users@jena.apache.org by Andreas Grünwald <a....@gmail.com> on 2013/04/04 23:50:22 UTC
Persisting OWL in Jena
Hello,
I managed to establish a connection to my MySQL database via Jena SDB and
inserted some triples.
However, I am still unsure how OWL constructs are persisted with Jena.
Here is my Java code example:
c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db
OntModel base = c.getOntologyModel();
String SOURCE = "http://www.eswc2006.org/technologies/ontology";
String NS = SOURCE + "#";
/* Add 2 individuals of the type "paper" linked via an object property */
OntClass paper = base.getOntClass( NS + "Paper" );
Individual p1 = base.createIndividual( NS + "paper1", paper );
Individual p2 = base.createIndividual( NS + "paper2", paper );
Property hasLinkTo = base.createObjectProperty(NS + "hasLinkTo"); // hasLinkTo property
base.add(p1,hasLinkTo,p2);
---
In the database 4 nodes, viz.
- http://www.w3.org/1999/02/22-rdf-syntax-ns#type
- http://www.eswc2006.org/technologies/ontology#hasLinkTo
- http://www.w3.org/2002/07/owl#ObjectProperty
- http://www.eswc2006.org/technologies/ontology#paper1
and 2 triples, viz.
- #hasLinkTo - rdf-syntax-ns#type - owl#ObjectProperty
- #paper1 - #hasLinkTo - #paper1
are persisted.
My question is: Shouldn't the OntClass "Paper" result in an additional
node, and an additional triple in the database?
Or do I have to add this manually? How? I looked at the Jena tutorial but
couldn't find an example.
Best regards and thank you very much,
Andreas
Re: Persisting OWL in Jena
Posted by davejrdn <da...@bellsouth.net>.
Thanks so much for clarifying this. It would be great if this were all
clearly described and documented, including details about which types of queries
trigger the forward and backward reasoning, etc. I'll continue with my
benchmarking, including all the different configurations that are possible.
________________________________
From: Dave Reynolds <da...@gmail.com>
To: users@jena.apache.org
Sent: Sat, April 6, 2013 10:27:55 AM
Subject: Re: Persisting OWL in Jena
On 05/04/13 16:01, David Jordan wrote:
> I am testing under several scenarios. For some static cases, I do precompute
> the inferences and store them. For this case, I do have one open question. If
> one wants to later combine multiple ontologies and data with their own implied
> inferencings, is there ever an issue that the original non-inferenced OWL
> specifications are needed, because of their interactions with the inferencing of
> the other ontologies being combined? Will one lose some triples that would have
> been inferred if one had started with inferencing done on the original OWL code?
If I follow the question correctly then no, that's safe. All the OWL and
RDFS semantics are defined to be monotonic. Adding new statements can
only ever lead to further statements being deducible; it can't lead to
previously inferred statements becoming invalid. The deductive closure
of a model is always a superset of the original model, so no information
is lost.
> Given
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
>
> You are saying that with the following code:
> Model memmodel = ModelFactory.createDefaultModel();
> memmodel.add( dbmodel);
> OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
>
> This will cause my "database model" to be completely pulled into memory and
> placed in the memmodel, so that the OntModel can run much more efficiently?
Yes. The cost of that "pulling into memory" step may be quite high but
once it's there all the little queries made by the reasoner will perform
better. In most cases you should come out ahead.
> Whereas with the following model omodel it will always go to the database?
>
> Model dbmodel = SDBFactory.connectNamedModel(store, name);
> OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
Depends what you mean by "always".
The point is that the reasoners have to do a *lot* of queries over the
data to do their job. In the above setup, each of those queries
goes to the database. This can be very, very slow. By pulling the data
into memory you take the hit once (with a simple efficient query).
Some of the results of that reasoning are then stored in internal state
in the reasoner, so future queries to the omodel may be partially
answered by that internal state and may not trigger further database
queries.
Exactly what "partial" means in this case is complex. The rule reasoners
employ a mix of forward and backward chaining. The forward parts will
all run to completion and store their results in memory. The backward
reasoning is only triggered by the particular query goal and may invoke
further queries to the underlying model (and thus the database).
However, some parts of the backward queries are "tabled" (to stop
infinite loops as much as for performance reasons) and those tables
are in memory as well. So over time more and more of a given query can
be answered out of the in-memory state, but that never reaches 100% of
all queries.
Dave
Re: Persisting OWL in Jena
Posted by Dave Reynolds <da...@gmail.com>.
On 05/04/13 16:01, David Jordan wrote:
> I am testing under several scenarios. For some static cases, I do precompute the inferences and store them. For this case, I do have one open question. If one wants to later combine multiple ontologies and data with their own implied inferencings, is there ever an issue that the original non-inferenced OWL specifications are needed, because of their interactions with the inferencing of the other ontologies being combined? Will one lose some triples that would have been inferred if one had started with inferencing done on the original OWL code?
If I follow the question correctly then no, that's safe. All the OWL and
RDFS semantics are defined to be monotonic. Adding new statements can
only ever lead to further statements being deducible; it can't lead to
previously inferred statements becoming invalid. The deductive closure
of a model is always a superset of the original model, so no information
is lost.
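The monotonicity argument can be checked on a toy example. The sketch below is not Jena's reasoner: it is a hand-written forward chainer over two RDFS-style rules, and the class name MonotonicClosure, the ex: names and the list-based triple encoding are all assumptions made for illustration. It compares inferencing over a stored closure plus new data with inferencing over the original statements plus the same new data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy forward-chaining closure over two RDFS-style rules; not Jena's reasoner. */
final class MonotonicClosure {
    static final String TYPE = "rdf:type";
    static final String SUB = "rdfs:subClassOf";

    /** A triple is modelled as a three-element list: subject, predicate, object. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Applies the rules until a full pass adds nothing new; returns base plus inferences. */
    static Set<List<String>> closure(Set<List<String>> base) {
        Set<List<String>> all = new HashSet<>(base);
        boolean changed = true;
        while (changed) {
            changed = false;
            List<List<String>> snapshot = new ArrayList<>(all);
            for (List<String> a : snapshot) {
                boolean chains = a.get(1).equals(TYPE) || a.get(1).equals(SUB);
                if (!chains) continue;
                for (List<String> b : snapshot) {
                    // (x type C), (C sub D) => (x type D)
                    // (B sub C),  (C sub D) => (B sub D)
                    if (b.get(1).equals(SUB) && a.get(2).equals(b.get(0))) {
                        changed |= all.add(triple(a.get(0), a.get(1), b.get(2)));
                    }
                }
            }
        }
        return all;
    }

    /** Union of two triple sets. */
    static Set<List<String>> union(Set<List<String>> x, Set<List<String>> y) {
        Set<List<String>> u = new HashSet<>(x);
        u.addAll(y);
        return u;
    }

    public static void main(String[] args) {
        Set<List<String>> a = new HashSet<>(Arrays.asList(
                triple("ex:Paper", SUB, "ex:Document"),
                triple("ex:paper1", TYPE, "ex:Paper")));
        Set<List<String>> b = new HashSet<>(Arrays.asList(
                triple("ex:Document", SUB, "ex:Work")));

        Set<List<String>> storedA = closure(a); // precomputed and stored earlier
        Set<List<String>> fromStored = closure(union(storedA, b));
        Set<List<String>> fromOriginal = closure(union(a, b));

        // Both comparisons are expected to hold for any monotonic rule set.
        System.out.println("same result either way: " + fromStored.equals(fromOriginal));
        System.out.println("earlier inferences kept: " + fromStored.containsAll(storedA));
    }
}
```

Because the stored closure still contains every original statement, starting from it gives the rules at least as much to work with as starting from the original data.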
> Given
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
>
> You are saying that with the following code:
> Model memmodel = ModelFactory.createDefaultModel();
> memmodel.add( dbmodel);
> OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
>
> This will cause my "database model" to be completely pulled into memory and placed in the memmodel, so that the OntModel can run much more efficiently?
Yes. The cost of that "pulling into memory" step may be quite high but
once it's there all the little queries made by the reasoner will perform
better. In most cases you should come out ahead.
> Whereas with the following model omodel it will always go to the database?
>
> Model dbmodel = SDBFactory.connectNamedModel(store, name);
> OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
Depends what you mean by "always".
The point is that the reasoners have to do a *lot* of queries over the
data to do their job. In the above setup, each of those queries
goes to the database. This can be very, very slow. By pulling the data
into memory you take the hit once (with a simple efficient query).
Some of the results of that reasoning are then stored in internal state
in the reasoner, so future queries to the omodel may be partially
answered by that internal state and may not trigger further database
queries.
Exactly what "partial" means in this case is complex. The rule reasoners
employ a mix of forward and backward chaining. The forward parts will
all run to completion and store their results in memory. The backward
reasoning is only triggered by the particular query goal and may invoke
further queries to the underlying model (and thus the database).
However, some parts of the backward queries are "tabled" (to stop
infinite loops as much as for performance reasons) and those tables
are in memory as well. So over time more and more of a given query can
be answered out of the in-memory state, but that never reaches 100% of
all queries.
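The effect of tabling on database traffic can be illustrated with a small model. This is only a sketch under stated assumptions: CountingStore stands in for a database-backed model and counts each simulated round trip, and TabledLookup is a hypothetical goal-driven resolver with an in-memory table. Neither reflects how Jena's rule engines are actually implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy goal-driven lookup with an in-memory table; not Jena's backward chainer. */
final class TabledLookup {
    /** Stand-in for a database-backed model; counts every simulated round trip. */
    static final class CountingStore {
        private final Map<String, Set<String>> directSupers = new HashMap<>();
        int lookups = 0;

        void addSubClassOf(String sub, String sup) {
            directSupers.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
        }

        Set<String> directSuperClasses(String cls) {
            lookups++; // one simulated database query
            return directSupers.getOrDefault(cls, Collections.<String>emptySet());
        }
    }

    private final CountingStore store;
    /** Table of goals already answered: class -> all of its superclasses. */
    private final Map<String, Set<String>> table = new HashMap<>();

    TabledLookup(CountingStore store) {
        this.store = store;
    }

    /** Goal: all transitive superclasses of cls, answered from the table when possible. */
    Set<String> superClasses(String cls) {
        Set<String> cached = table.get(cls);
        if (cached != null) return cached; // no database access needed
        Set<String> result = new HashSet<>();
        // Registering the goal before recursing stops cyclic hierarchies from looping forever.
        // (A real tabling engine would also complete such partial answers; this sketch does not.)
        table.put(cls, result);
        for (String sup : store.directSuperClasses(cls)) {
            result.add(sup);
            result.addAll(superClasses(sup));
        }
        return result;
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        store.addSubClassOf("ex:Paper", "ex:Document");
        store.addSubClassOf("ex:Document", "ex:Work");
        store.addSubClassOf("ex:Report", "ex:Document");

        TabledLookup reasoner = new TabledLookup(store);
        reasoner.superClasses("ex:Paper");
        System.out.println("lookups after first goal:    " + store.lookups);
        reasoner.superClasses("ex:Paper");
        System.out.println("lookups after repeated goal: " + store.lookups);
        reasoner.superClasses("ex:Report");
        System.out.println("lookups after new goal:      " + store.lookups);
    }
}
```

A repeated goal costs no further lookups, while a new goal still needs at least one, which matches the point that the in-memory state never covers every query.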
Dave
RE: Persisting OWL in Jena
Posted by David Jordan <Da...@sas.com>.
I am testing under several scenarios. For some static cases, I do precompute the inferences and store them. For this case, I do have one open question. If one wants to later combine multiple ontologies and data with their own implied inferencings, is there ever an issue that the original non-inferenced OWL specifications are needed, because of their interactions with the inferencing of the other ontologies being combined? Will one lose some triples that would have been inferred if one had started with inferencing done on the original OWL code?
Given
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
You are saying that with the following code:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
This will cause my "database model" to be completely pulled into memory and placed in the memmodel, so that the OntModel can run much more efficiently?
Whereas with the following model omodel it will always go to the database?
Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
That probably explains the slowness. With my current code, the initialization of the Model and OntModel did seem relatively fast.
-----Original Message-----
From: Dave Reynolds [mailto:dave.e.reynolds@gmail.com]
Sent: Friday, April 05, 2013 10:39 AM
To: users@jena.apache.org
Subject: Re: Persisting OWL in Jena
On 05/04/13 15:09, David Jordan wrote:
> Dave,
> I have been getting "less than stellar" performance in my benchmarking. I would just like to be sure that the way I am using Jena IS performing inference over in-memory models. I have stored Models in the database. When I access them and create an OntModel, I do it in the following manner:
>
> Store store; // assume this is initialized
> Model model = SDBFactory.connectNamedModel(store, name);
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
> OntModel omodel = ModelFactory.createOntologyModel(spec, model);
> omodel.prepare();
>
> Does this result in an in-memory model as you recommend?
No, that's an inference model running over the database.
> If not, could you show the necessary code.
Depends on what you are trying to do. Whether your data is static. What inferences you want (all or just some interesting ones). Whether the source data is large. Whether it is available as a file or only a database model. Etc.
In the simple case your data is essentially fixed and you can precompute and store the inferences.
Model memmodel = ModelFactory.createDefaultModel();
// read data into model or use FileManager.get().loadModel instead
OntModelSpec spec =
new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
dbmodel.add( omodel );
If there are only some inferences you need then you might be more selective in what the final "add" phase puts into the database model.
Then you access that data in future uses via a non-inference model:
Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
If your data is already in the database and you want to dynamically compute the inferences over its current state then do something more like:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel );
OntModelSpec spec =
new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
// use omodel
Any updates to the data will need to be reflected into the omodel. If those updates are done in the same VM that might be OK; if they are done by other database clients then that's problematic.
Fundamentally databases and Jena's rule-based inference do not mix well.
Depending on what you need from inference you may be able to achieve the same effects by query rewriting, or query rewriting plus some simpler pre-computed closure. In the worst case you need a full deductive database.
For minimal RDFS inference there is some support in the TDB loader for computing that more efficiently at load time than the full in-memory rule systems do.
Dave
Re: Persisting OWL in Jena
Posted by Dave Reynolds <da...@gmail.com>.
On 05/04/13 15:09, David Jordan wrote:
> Dave,
> I have been getting "less than stellar" performance in my benchmarking. I would just like to be sure that the way I am using Jena IS performing inference over in-memory models. I have stored Models in the database. When I access them and create an OntModel, I do it in the following manner:
>
> Store store; // assume this is initialized
> Model model = SDBFactory.connectNamedModel(store, name);
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
> OntModel omodel = ModelFactory.createOntologyModel(spec, model);
> omodel.prepare();
>
> Does this result in an in-memory model as you recommend?
No, that's an inference model running over the database.
> If not, could you show the necessary code.
Depends on what you are trying to do. Whether your data is static. What
inferences you want (all or just some interesting ones). Whether the
source data is large. Whether it is available as a file or only a database
model. Etc.
In the simple case your data is essentially fixed and you can precompute
and store the inferences.
Model memmodel = ModelFactory.createDefaultModel();
// read data into model or use FileManager.get().loadModel instead
OntModelSpec spec =
new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
dbmodel.add( omodel );
If there are only some inferences you need then you might be more
selective in what the final "add" phase puts into the database model.
Then you access that data in future uses via a non-inference model:
Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
If your data is already in the database and you want to dynamically
compute the inferences over its current state then do something more like:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel );
OntModelSpec spec =
new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
// use omodel
Any updates to the data will need to be reflected into the omodel. If
those updates are done in the same VM that might be OK; if they are done
by other database clients then that's problematic.
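Why updates from other clients are problematic can be seen in a minimal model of the copy-into-memory approach. In this sketch, plain sets of strings stand in for the database model and the in-memory copy; the SnapshotSync class and its method names are hypothetical and do not correspond to any Jena API.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of an in-memory copy taken from a shared database; not a Jena API. */
final class SnapshotSync {
    final Set<String> database; // stand-in for the persistent model
    final Set<String> memory;   // in-memory copy the reasoner would work on

    SnapshotSync(Set<String> database) {
        this.database = database;
        this.memory = new HashSet<>(database); // one bulk copy at start-up
    }

    /** Update made through this VM: applied to both copies, so they stay in step. */
    void add(String statement) {
        database.add(statement);
        memory.add(statement);
    }

    /** True when the copy has fallen behind, e.g. after another client wrote to the database. */
    boolean isStale() {
        return !memory.equals(database);
    }

    /** Takes a fresh copy of the database contents. */
    void refresh() {
        memory.clear();
        memory.addAll(database);
    }

    public static void main(String[] args) {
        Set<String> db = new HashSet<>();
        db.add("ex:paper1 rdf:type ex:Paper");
        SnapshotSync sync = new SnapshotSync(db);

        sync.add("ex:paper2 rdf:type ex:Paper"); // same-VM update
        System.out.println("stale after local update:  " + sync.isStale());

        db.add("ex:paper3 rdf:type ex:Paper");   // simulated write by another client
        System.out.println("stale after remote update: " + sync.isStale());
    }
}
```

In this toy, staleness is detected by comparing whole sets, which is only affordable because the data is tiny; a real deployment would need some other signal that the database has changed.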
Fundamentally databases and Jena's rule-based inference do not mix well.
Depending on what you need from inference you may be able to achieve the
same effects by query rewriting, or query rewriting plus some simpler
pre-computed closure. In the worst case you need a full deductive database.
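As one illustration of query rewriting, a question such as "which individuals are instances of class C?" can be answered from asserted data alone by expanding C into itself plus all of its subclasses and querying each directly. The sketch below assumes a toy in-memory representation; the QueryRewriting class, its methods and the ex: names are hypothetical, and a real system would rewrite the query sent to the store rather than walk Java maps.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy query rewriting for subclass inference over asserted data only. */
final class QueryRewriting {
    /** Asserted rdfs:subClassOf links: subclass -> direct superclasses. */
    private final Map<String, Set<String>> superOf = new HashMap<>();
    /** Asserted rdf:type links: class -> directly typed instances. */
    private final Map<String, Set<String>> instances = new HashMap<>();

    void subClassOf(String sub, String sup) {
        superOf.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    void type(String instance, String cls) {
        instances.computeIfAbsent(cls, k -> new HashSet<>()).add(instance);
    }

    /** Rewrites "instances of cls" into the set of classes to query directly. */
    Set<String> expand(String cls) {
        Set<String> classes = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(cls);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (!classes.add(current)) continue; // already expanded
            // Queue every class that names `current` as a direct superclass.
            for (Map.Entry<String, Set<String>> e : superOf.entrySet()) {
                if (e.getValue().contains(current)) todo.push(e.getKey());
            }
        }
        return classes;
    }

    /** Runs the rewritten query: a union of plain lookups, with no reasoner involved. */
    Set<String> instancesOf(String cls) {
        Set<String> result = new TreeSet<>();
        for (String c : expand(cls)) {
            result.addAll(instances.getOrDefault(c, Collections.<String>emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        QueryRewriting q = new QueryRewriting();
        q.subClassOf("ex:Paper", "ex:Document");
        q.subClassOf("ex:Poster", "ex:Document");
        q.type("ex:doc1", "ex:Document");
        q.type("ex:paper1", "ex:Paper");
        q.type("ex:poster1", "ex:Poster");
        System.out.println(q.instancesOf("ex:Document"));
    }
}
```

The same idea only covers the inferences the rewriting anticipates (here, subclass membership); anything else would still need a precomputed closure or a reasoner.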
For minimal RDFS inference there is some support in the TDB loader
for computing that more efficiently at load time than the full in-memory
rule systems do.
Dave
RE: Persisting OWL in Jena
Posted by David Jordan <Da...@sas.com>.
Dave,
I have been getting "less than stellar" performance in my benchmarking. I would just like to be sure that the way I am using Jena IS performing inference over in-memory models. I have stored Models in the database. When I access them and create an OntModel, I do it in the following manner:
Store store; // assume this is initialized
Model model = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, model);
omodel.prepare();
Does this result in an in-memory model as you recommend?
If not, could you show the necessary code.
It would be great to discover I am doing this wrong and that there is a missing line or usage here that can make things run a lot faster...
-----Original Message-----
From: Dave Reynolds [mailto:dave.e.reynolds@gmail.com]
> c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db
Not relevant to your problem here but inference over a database will be very slow. It is better to perform any inference over in-memory models.
Re: Persisting OWL in Jena
Posted by Dave Reynolds <da...@gmail.com>.
On 04/04/13 22:50, Andreas Grünwald wrote:
> Hello,
> I managed to establish a connection to my MYSQL database via Jena SDB and
> inserted some triples.
>
> However, I still feel insecure how OWL constructs are persisted with Jena.
>
> Here is my Java code example:
> c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db
Not relevant to your problem here but inference over a database will be
very slow. It is better to perform any inference over in-memory models.
> OntModel base = c.getOntologyModel();
>
> String SOURCE = "http://www.eswc2006.org/technologies/ontology";
> String NS = SOURCE + "#";
>
> /* Add 2 individuals of the type "paper" linked via an object property */
> OntClass paper = base.getOntClass( NS + "Paper" );
I think you mean createClass(NS + "Paper") here.
> Individual p1 = base.createIndividual( NS + "paper1", paper );
> Individual p2 = base.createIndividual( NS + "paper2", paper );
> Property hasLinkTo = base.createObjectProperty(NS + "hasLinkTo"); // hasLinkTo property
> base.add(p1,hasLinkTo,p2);
>
> ---
> In the database 4 nodes, viz.
>
> - http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> - http://www.eswc2006.org/technologies/ontology#hasLinkTo
> - http://www.w3.org/2002/07/owl#ObjectProperty
> - http://www.eswc2006.org/technologies/ontology#paper1
>
> and 2 triples, viz.
>
> - #hasLinkTo - rdf-syntax-ns#type - owl#ObjectProperty
> - #paper1 - #hasLinkTo - #paper1
>
> are persisted.
> My question is: Shouldn't the OntClass "Paper" result in an additional
> node, and an additional triple in the database?
If you mean to create and use a class then you need to use the
createClass method.
The getOntClass method, as it says in the javadoc, returns null if there
is no such class currently defined. So in your code the variable /paper/
is null which in turn means the createIndividual calls cannot assign any
rdf:type to paper1 and paper2.
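The consequence of a lookup that returns null can be modelled without Jena. The sketch below is a toy stand-in: the method names are borrowed from the discussion above, but the GetVersusCreate class and the method bodies are assumptions made purely to show the behaviour described, not Jena's implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of "look up" versus "create" for classes; not Jena's OntModel. */
final class GetVersusCreate {
    static final String TYPE = "rdf:type";
    static final String OWL_CLASS = "owl:Class";

    /** Triples held by this toy model: subject, predicate, object. */
    final Set<List<String>> triples = new LinkedHashSet<>();

    /** Lookup only: returns null unless the class has already been declared. */
    String getOntClass(String uri) {
        return triples.contains(Arrays.asList(uri, TYPE, OWL_CLASS)) ? uri : null;
    }

    /** Declares the class by adding its typing triple, then returns it. */
    String createClass(String uri) {
        triples.add(Arrays.asList(uri, TYPE, OWL_CLASS));
        return uri;
    }

    /** Adds a typing triple for the individual only when a class is supplied. */
    String createIndividual(String uri, String cls) {
        if (cls != null) triples.add(Arrays.asList(uri, TYPE, cls));
        return uri;
    }

    public static void main(String[] args) {
        GetVersusCreate viaGet = new GetVersusCreate();
        String missing = viaGet.getOntClass("ex:Paper"); // class was never declared
        viaGet.createIndividual("ex:paper1", missing);
        viaGet.createIndividual("ex:paper2", missing);
        System.out.println("triples after lookup:   " + viaGet.triples.size());

        GetVersusCreate viaCreate = new GetVersusCreate();
        String paper = viaCreate.createClass("ex:Paper");
        viaCreate.createIndividual("ex:paper1", paper);
        viaCreate.createIndividual("ex:paper2", paper);
        System.out.println("triples after creation: " + viaCreate.triples.size());
    }
}
```

In the lookup variant no typing triples are produced at all, which is consistent with the class and the individuals' types being absent from the database in the original report.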
Dave