Posted to users@jena.apache.org by Andreas Grünwald <a....@gmail.com> on 2013/04/04 23:50:22 UTC

Persisting OWL in Jena

Hello,
I managed to establish a connection to my MySQL database via Jena SDB and
inserted some triples.

However, I am still unsure how OWL constructs are persisted with Jena.

Here is my Java code example:
c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db
OntModel base = c.getOntologyModel();

String SOURCE = "http://www.eswc2006.org/technologies/ontology";
String NS = SOURCE + "#";

/* Add 2 individuals of the type "paper" linked via an object property */
OntClass paper = base.getOntClass( NS + "Paper" );
Individual p1 = base.createIndividual( NS + "paper1", paper );
Individual p2 = base.createIndividual( NS + "paper2", paper );
Property hasLinkTo = base.createObjectProperty(NS + "hasLinkTo"); // hasLinkTo property
base.add(p1,hasLinkTo,p2);

 ---
In the database 4 nodes, viz.

   - http://www.w3.org/1999/02/22-rdf-syntax-ns#type
   - http://www.eswc2006.org/technologies/ontology#hasLinkTo
   - http://www.w3.org/2002/07/owl#ObjectProperty
   - http://www.eswc2006.org/technologies/ontology#paper1

and 2 triples, viz.

   - #hasLinkTo - rdf-syntax-ns#type - owl#ObjectProperty
   - #paper1 - #hasLinkTo - #paper1

are persisted.
My question is: Shouldn't the OntClass "Paper" result in an additional
node, and an additional triple in the database?
Or do I have to add this manually? How? I looked at the Jena tutorial but
couldn't find an example.

Best regards and thank you very much,
Andreas

Re: Persisting OWL in Jena

Posted by davejrdn <da...@bellsouth.net>.

Thanks so much for clarifying this. It would be great if this stuff were all 
clearly described and documented, including details about which types of queries 
trigger the forward and backward reasoning, etc. I'll continue with my 
benchmarking, including all the different configurations that are possible.

________________________________
From: Dave Reynolds <da...@gmail.com>
To: users@jena.apache.org
Sent: Sat, April 6, 2013 10:27:55 AM
Subject: Re: Persisting OWL in Jena


Re: Persisting OWL in Jena

Posted by Dave Reynolds <da...@gmail.com>.

On 05/04/13 16:01, David Jordan wrote:
> I am testing under several scenarios. For some static cases, I do precompute the inferences and store them. For this case, I do have one open question. If one wants to later combine multiple ontologies and data with their own implied inferencings, is there ever an issue that the original non-inferenced OWL specifications are needed, because of their interactions with the inferencing of the other ontologies being combined? Will one lose some triples that would have been inferred if one had started with inferencing done on the original OWL code?

If I follow the question correctly then no, that's safe. All the OWL and 
RDFS semantics are defined to be monotonic. Adding new statements can 
only ever lead to further statements being deducible; it can't lead to 
previously inferred statements becoming invalid. The deductive closure 
of a model is always a superset of the original model, so no information 
is lost.
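
To make the monotonicity argument concrete, here is a minimal plain-Java sketch. It is not Jena code: triples are modelled as three-element string lists, only two RDFS-style rules are applied, and the class and helper names (MonotonicClosureSketch, t, union, closure) are made up for illustration. Under those assumptions it shows that a stored closure still contains the original triples, and that combining the stored closure with another ontology gives the same result as starting again from the raw data.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy forward-chaining closure over string triples. Plain Java, not Jena. */
final class MonotonicClosureSketch {
    static final String TYPE = "rdf:type";
    static final String SUB = "rdfs:subClassOf";

    /** Builds a triple as an immutable 3-element list (hypothetical representation). */
    static List<String> t(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /** Returns the union of two triple sets without modifying either. */
    static Set<List<String>> union(Set<List<String>> a, Set<List<String>> b) {
        Set<List<String>> u = new HashSet<>(a);
        u.addAll(b);
        return u;
    }

    /**
     * Applies two RDFS-style rules until nothing new is derived:
     *   (x rdf:type C), (C rdfs:subClassOf D)        => (x rdf:type D)
     *   (B rdfs:subClassOf C), (C rdfs:subClassOf D) => (B rdfs:subClassOf D)
     */
    static Set<List<String>> closure(Set<List<String>> base) {
        Set<List<String>> result = new HashSet<>(base);
        boolean changed = true;
        while (changed) {
            changed = false;
            List<List<String>> snapshot = new ArrayList<>(result);
            for (List<String> a : snapshot) {
                boolean typeOrSub = a.get(1).equals(TYPE) || a.get(1).equals(SUB);
                if (!typeOrSub) continue;
                for (List<String> b : snapshot) {
                    // Join on: object of a == subject of a subClassOf triple b.
                    if (b.get(1).equals(SUB) && a.get(2).equals(b.get(0))) {
                        changed |= result.add(t(a.get(0), a.get(1), b.get(2)));
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> data = Set.of(
                t("paper1", TYPE, "Paper"), t("Paper", SUB, "Document"));
        Set<List<String>> extra = Set.of(t("Document", SUB, "Artifact"));

        Set<List<String>> stored = closure(data);                 // precomputed and persisted
        Set<List<String>> fromStored = closure(union(stored, extra));
        Set<List<String>> fromRaw = closure(union(data, extra));

        System.out.println(stored.containsAll(data));     // the closure keeps the original triples
        System.out.println(fromStored.equals(fromRaw));   // storing the closure loses nothing
    }
}
```

The equality in the last line holds for any monotonic, idempotent closure operator, which is why the stored inferences can be combined with further ontologies later.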

> Given
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
>
> You are saying that with the following code:
> Model memmodel = ModelFactory.createDefaultModel();
> memmodel.add( dbmodel);
> OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
>
> This will cause my "database model" to be completely pulled into memory and placed in the memmodel, so that the OntModel can run much more efficiently?

Yes. The cost of that "pulling into memory" step may be quite high but 
once it's there all the little queries made by the reasoner will perform 
better. In most cases you should come out ahead.

> Whereas with the following model omodel it will always go to the database?
>
> Model dbmodel = SDBFactory.connectNamedModel(store, name);
> OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);

Depends what you mean by "always".

The point is that the reasoners have to do a *lot* of queries over the 
data to do their job. In the above setup, each of those queries 
goes to the database, which can be very slow. By pulling the data 
into memory you take the hit once (with a simple efficient query).
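
The trade-off can be illustrated with a small plain-Java sketch. Nothing here is Jena or SDB API: RemoteStore is a hypothetical stand-in for the database that simply counts round trips, and the "reasoner" is just a loop of pattern lookups. It only shows why many small lookups against the store cost one round trip each, while a bulk copy costs a single round trip followed by in-memory lookups.

```java
import java.util.ArrayList;
import java.util.List;

/** Counts round trips for per-lookup access versus one bulk copy. Not Jena code. */
final class BulkCopySketch {

    /** Hypothetical remote store; every call simulates one database round trip. */
    static final class RemoteStore {
        private final List<String[]> rows;
        int roundTrips = 0;

        RemoteStore(List<String[]> rows) {
            this.rows = rows;
        }

        /** One round trip per pattern lookup (null acts as a wildcard). */
        List<String[]> find(String s, String p) {
            roundTrips++;
            return match(rows, s, p);
        }

        /** One round trip that returns a copy of everything. */
        List<String[]> fetchAll() {
            roundTrips++;
            return new ArrayList<>(rows);
        }
    }

    /** In-memory pattern match over {subject, predicate, object} rows. */
    static List<String[]> match(List<String[]> rows, String s, String p) {
        List<String[]> out = new ArrayList<>();
        for (String[] r : rows) {
            if ((s == null || r[0].equals(s)) && (p == null || r[1].equals(p))) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"paper1", "rdf:type", "Paper"},
                new String[] {"paper2", "rdf:type", "Paper"},
                new String[] {"paper1", "hasLinkTo", "paper2"});

        // Reasoner-style access directly against the store: one trip per lookup.
        RemoteStore direct = new RemoteStore(rows);
        for (int i = 0; i < 100; i++) {
            direct.find("paper1", null);
        }

        // Copy once, then run the same lookups in memory.
        RemoteStore copied = new RemoteStore(rows);
        List<String[]> local = copied.fetchAll();
        for (int i = 0; i < 100; i++) {
            match(local, "paper1", null);
        }

        System.out.println(direct.roundTrips + " round trips without copying");
        System.out.println(copied.roundTrips + " round trip(s) with a bulk copy");
    }
}
```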

Some of the results of that reasoning are then stored in internal state 
in the reasoner so future queries to the omodel may be partially 
answered by that internal state and may not trigger further database 
queries.

Exactly what "partial" means in this case is complex. The rule reasoners 
employ a mix of forward and backward chaining. The forward parts will 
all run to completion and store their results in memory. The backward 
reasoning is only triggered by the particular query goal and may invoke 
further queries to the underlying model (and thus the database). 
However, some parts of the backward queries are "tabled" (to stop 
infinite loops as much as for performance reasons) and those tables 
are in memory as well. So over time more and more of a given query can 
be answered out of the in-memory state but that never reaches 100% of 
all queries.
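
As a rough illustration of the tabling behaviour described above, here is a plain-Java sketch. It is not how Jena's rule engine is implemented: the goal is simply "all superclasses of a class", the backing store is a hypothetical map that counts lookups, and the names (TabledBackwardSketch, superClasses, storeQueries) are invented for the example. The test data is assumed to be acyclic; registering a goal in the table before recursing only guarantees termination on cycles, whereas a real tabling engine would also iterate to a fixpoint.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Goal-driven lookup with an in-memory table of solved goals. Not Jena code. */
final class TabledBackwardSketch {
    private final Map<String, Set<String>> directSupers;              // stands in for the database
    private final Map<String, Set<String>> table = new HashMap<>();   // solved goals, kept in memory
    private int storeQueries = 0;

    TabledBackwardSketch(Map<String, Set<String>> directSupers) {
        this.directSupers = directSupers;
    }

    /** Number of simulated round trips to the underlying store so far. */
    int storeQueries() {
        return storeQueries;
    }

    /** Simulates one query against the underlying store. */
    private Set<String> queryStore(String cls) {
        storeQueries++;
        return directSupers.getOrDefault(cls, Set.of());
    }

    /** Goal: all (direct and indirect) superclasses of cls, solved only on demand. */
    Set<String> superClasses(String cls) {
        Set<String> cached = table.get(cls);
        if (cached != null) {
            return cached;               // answered from the table, no store query
        }
        Set<String> result = new HashSet<>();
        table.put(cls, result);          // register the goal first so cycles terminate
        for (String sup : queryStore(cls)) {
            result.add(sup);
            result.addAll(superClasses(sup));
        }
        return result;
    }

    public static void main(String[] args) {
        TabledBackwardSketch r = new TabledBackwardSketch(Map.of(
                "Thesis", Set.of("Paper"),
                "Paper", Set.of("Document"),
                "Document", Set.of("Artifact")));

        r.superClasses("Paper");
        System.out.println("after first goal: " + r.storeQueries());
        r.superClasses("Paper");         // same goal again: served from the table
        System.out.println("after repeated goal: " + r.storeQueries());
        r.superClasses("Thesis");        // new goal: one new store query, rest is tabled
        System.out.println("after new goal: " + r.storeQueries());
    }
}
```

Repeated goals cost nothing extra, while a goal that has not been seen before still has to touch the store, which is the sense in which the in-memory share grows but never reaches 100%.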

Dave


RE: Persisting OWL in Jena

Posted by David Jordan <Da...@sas.com>.

I am testing under several scenarios. For some static cases, I do precompute the inferences and store them. For this case, I do have one open question. If one wants to later combine multiple ontologies and data with their own implied inferencings, is there ever an issue that the original non-inferenced OWL specifications are needed, because of their interactions with the inferencing of the other ontologies being combined? Will one lose some triples that would have been inferred if one had started with inferencing done on the original OWL code?

Given
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);

You are saying that with the following code:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);

This will cause my "database model" to be completely pulled into memory and placed in the memmodel, so that the OntModel can run much more efficiently?
Whereas with the following model omodel it will always go to the database?

Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);

That probably explains the slowness. With my current code, the initialization of the Model and OntModel did seem relatively fast.

-----Original Message-----
From: Dave Reynolds [mailto:dave.e.reynolds@gmail.com] 
Sent: Friday, April 05, 2013 10:39 AM
To: users@jena.apache.org
Subject: Re: Persisting OWL in Jena


Re: Persisting OWL in Jena

Posted by Dave Reynolds <da...@gmail.com>.

On 05/04/13 15:09, David Jordan wrote:
> Dave,
> I have been getting "less than stellar" performance in my benchmarking. I would just like to be sure that the way I am using Jena IS performing inference over in-memory models. I have stored Models in the database. When I access them and create an OntModel, I do it in the following manner:
>
> Store store; // assume this is initialized
> Model model = SDBFactory.connectNamedModel(store, name);
> OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
> OntModel omodel = ModelFactory.createOntologyModel(spec, model);
> omodel.prepare();
>
> Does this result in an in-memory model as you recommend?

No, that's an inference model running over the database.

> If not, could you show the necessary code.

Depends on what you are trying to do. Whether your data is static. What 
inferences you want (all or just some interesting ones). Whether the 
source data is large. Whether it is available as a file or only as a 
database model. Etc.

In the simple case your data is essentially fixed and you can precompute 
and store the inferences.

   Model memmodel = ModelFactory.createDefaultModel();
   // read data into memmodel, or use FileManager.get().loadModel(...) instead
   OntModelSpec spec =
                  new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
   OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
   // dbmodel is the database-backed model, e.g. from SDBFactory.connectNamedModel
   dbmodel.add( omodel );

If there are only some inferences you need then you might be more 
selective in what the final "add" phase puts into the database model.

Then you access that data in future uses via a non-inference model:

   Model dbmodel = SDBFactory.connectNamedModel(store, name);
   OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
   OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);

If your data is already in the database and you want to dynamically 
compute the inferences over its current state then do something more like:

   Model memmodel = ModelFactory.createDefaultModel();
   memmodel.add( dbmodel );
   OntModelSpec spec =
                  new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
   OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
   // use omodel

Any updates to the data will need to be reflected into the omodel. If 
those updates are done in the same VM that might be OK; if they are done 
by other database clients then that's problematic.

Fundamentally databases and Jena's rule-based inference do not mix well.

Depending on what you need from inference you may be able to achieve the 
same effects by query rewriting, or query rewriting plus some simpler 
pre-computed closure. In the worst case you need a full deductive database.

For minimal RDFS inference, there is some support in the TDB loader 
for computing that more efficiently at load time than the full in-memory 
rule systems do.

Dave

RE: Persisting OWL in Jena

Posted by David Jordan <Da...@sas.com>.

Dave,
I have been getting "less than stellar" performance in my benchmarking. I would just like to be sure that the way I am using Jena IS performing inference over in-memory models. I have stored Models in the database. When I access them and create an OntModel, I do it in the following manner:

Store store; // assume this is initialized
Model model = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, model);
omodel.prepare();

Does this result in an in-memory model as you recommend?
If not, could you show the necessary code.
It would be great to discover I am doing this wrong and that there is a missing line or usage here that can make things run a lot faster...

-----Original Message-----
From: Dave Reynolds [mailto:dave.e.reynolds@gmail.com] 

> c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db

Not relevant to your problem here but inference over a database will be very slow. It is better to perform any inference over in-memory models.

Re: Persisting OWL in Jena

Posted by Dave Reynolds <da...@gmail.com>.

On 04/04/13 22:50, Andreas Grünwald wrote:
> Hello,
> I managed to establish a connection to my MySQL database via Jena SDB and
> inserted some triples.
>
> However, I am still unsure how OWL constructs are persisted with Jena.
>
> Here is my Java code example:
> c.initSdb(OntModelSpec.OWL_MEM_MICRO_RULE_INF); //connect to MYSQL db

Not relevant to your problem here but inference over a database will be 
very slow. It is better to perform any inference over in-memory models.

> OntModel base = c.getOntologyModel();
>
> String SOURCE = "http://www.eswc2006.org/technologies/ontology";
> String NS = SOURCE + "#";
>
> /* Add 2 individuals of the type "paper" linked via an object property */
> OntClass paper = base.getOntClass( NS + "Paper" );

I think you mean createClass(NS + "Paper") here.

> Individual p1 = base.createIndividual( NS + "paper1", paper );
> Individual p2 = base.createIndividual( NS + "paper2", paper );
> Property hasLinkTo = base.createObjectProperty(NS + "hasLinkTo"); // hasLinkTo property
> base.add(p1,hasLinkTo,p2);
>
>   ---
> In the database 4 nodes, viz.
>
>     - http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>     - http://www.eswc2006.org/technologies/ontology#hasLinkTo
>     - http://www.w3.org/2002/07/owl#ObjectProperty
>     - http://www.eswc2006.org/technologies/ontology#paper1
>
> and 2 triples, viz.
>
>     - #hasLinkTo - rdf-syntax-ns#type - owl#ObjectProperty
>     - #paper1 - #hasLinkTo - #paper1
>
> are persisted.
> My question is: Shouldn't the OntClass "Paper" result in an additional
> node, and an additional triple in the database?

If you mean to create and use a class then you need to use the 
createClass method.

The getOntClass method, as it says in the javadoc, returns null if there 
is no such class currently defined. So in your code the variable /paper/ 
is null which in turn means the createIndividual calls cannot assign any 
rdf:type to paper1 and paper2.
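
The difference between looking a class up and creating it can be shown with a small plain-Java stand-in. This is not the Jena OntModel: the class below is a made-up toy that borrows the method names getOntClass, createClass and createIndividual and mimics only the behaviour described above (lookup returns null when nothing is declared; a null class means no rdf:type triple is added). Triples are modelled as three-element string lists.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model mimicking lookup-versus-create semantics. Not the Jena API. */
final class GetVersusCreateSketch {
    static final String TYPE = "rdf:type";
    static final String OWL_CLASS = "owl:Class";

    /** The "persisted" triples, each a {subject, predicate, object} list. */
    final Set<List<String>> triples = new LinkedHashSet<>();

    /** Lookup only: returns null when no class declaration is stored. */
    String getOntClass(String uri) {
        return triples.contains(List.of(uri, TYPE, OWL_CLASS)) ? uri : null;
    }

    /** Declares the class by adding its declaration triple, then returns it. */
    String createClass(String uri) {
        triples.add(List.of(uri, TYPE, OWL_CLASS));
        return uri;
    }

    /** Adds an rdf:type triple only when a class is actually supplied. */
    String createIndividual(String uri, String cls) {
        if (cls != null) {
            triples.add(List.of(uri, TYPE, cls));
        }
        return uri;
    }

    public static void main(String[] args) {
        String ns = "http://www.eswc2006.org/technologies/ontology#";

        GetVersusCreateSketch lookupOnly = new GetVersusCreateSketch();
        String missing = lookupOnly.getOntClass(ns + "Paper");     // nothing declared yet
        lookupOnly.createIndividual(ns + "paper1", missing);
        System.out.println("lookup only: " + lookupOnly.triples.size() + " triple(s)");

        GetVersusCreateSketch created = new GetVersusCreateSketch();
        String paper = created.createClass(ns + "Paper");          // declaration is stored
        created.createIndividual(ns + "paper1", paper);
        System.out.println("with createClass: " + created.triples.size() + " triple(s)");
    }
}
```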


Dave