Posted to users@jena.apache.org by "windman83@libero.it" <wi...@libero.it> on 2012/05/23 12:38:11 UTC

TDB on disk Reasoning

Dear Jena users,
I was wondering whether, in the case of large datasets, it would be more 
appropriate to store inferences immediately on disk rather than create an 
in-memory model and afterwards add the inferred statements to the base 
model. So far I have only managed to achieve the second solution.

String directory = "MyDatabases/Dataset1";
Dataset dataset = TDBFactory.createDataset(directory);
Model base = dataset.getDefaultModel();
OntModel o = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, base);
printStatements(o); // check base model content
load(base); // some reading from owl files (only for the first run)
OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_RULE_INF, base);
m.prepare();
base.add(m); // if this line is removed, the next time the model is created
             // printStatements(o) won't show the inferred knowledge
base.close();
dataset.close();

Is there a way to make the reasoner work directly on the TDB model, 
without needing base.add(m)? If there are many entailments, could I get an 
OutOfMemoryError from the m model?

I tried creating an InfModel, because I found this suggestion at 
http://answers.semanticweb.com/questions/956/how-to-set-up-jena-sdb-with-inference-capability 
and used the following implementation:

String directory = "MyDatabases/Dataset1";
Dataset dataset = TDBFactory.createDataset(directory);
Model base = dataset.getDefaultModel();
OntModel o = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, base);
printStatements(o); // check base model content
load(base); // some reading from owl files (only for the first run)
Reasoner r = OWLFBRuleReasonerFactory.theInstance().create(null);
InfModel inf = ModelFactory.createInfModel(r, base);
inf.prepare();
base.add(inf); // if this line is removed, the next time the model is created
               // printStatements(o) won't show the inferred knowledge
base.close();
dataset.close();

... same results :( on the second run printStatements(o) prints the same things.

BR, Paolo

Re: TDB on disk Reasoning

Posted by Andy Seaborne <an...@apache.org>.
On 23/05/12 12:03, Dave Reynolds wrote:
> On 23/05/12 11:38, windman83@libero.it wrote:
>> [...]
>
> [...]
>
> If you only need a subset of RDFS inference then you can compute them on
> the fly as you load (I believe tdbloader has some options to enable this).

It's a separate program at the moment:

riotcmd.infer | tdbloader --loc=... -- -

for the loader to read from stdin.
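
For example, assuming infer's --rdfs option and placeholder file names 
(schema.ttl for the vocabulary, data.ttl for the instance data):

riotcmd.infer --rdfs=schema.ttl data.ttl | tdbloader --loc=MyDatabases/Dataset1 -- -

The RDFS expansion is applied to the data as it streams past, so only the 
vocabulary has to be held in memory, not the data.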

	Andy

> [...]


Re: TDB on disk Reasoning

Posted by Dave Reynolds <da...@gmail.com>.
On 23/05/12 11:38, windman83@libero.it wrote:
> Dear Jena users,
> [...]
>
> Is there a way to make the reasoner work directly on the TDB model,
> without needing base.add(m)?

What you have does work directly over TDB: "base" is not an in-memory 
copy but a reference to TDB.

When the reasoner runs, all of its internal workings, together with all 
results, are stored in memory, and typically those are larger than your 
base model.

> If there are many entailments, could I get an
> OutOfMemoryError from the m model?

Yes, the inference itself is in memory. Keeping your base data in TDB 
just has the effect of slowing the reasoning down; it doesn't increase 
the scaling.

If you only need a subset of RDFS inference then you can compute them on 
the fly as you load (I believe tdbloader has some options to enable this).

If you need more inference then, at present, you'll just need enough 
memory. Indeed, you'll get much better performance by reading the TDB 
model into a memory model and reasoning over that.
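
A minimal sketch of that, in the same fragment style as your snippets and 
assuming the Jena 2.x package names (com.hp.hpl.jena.*):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.rulesys.OWLFBRuleReasonerFactory;
import com.hp.hpl.jena.tdb.TDBFactory;

Dataset dataset = TDBFactory.createDataset("MyDatabases/Dataset1");
Model base = dataset.getDefaultModel();

// Copy the persisted triples into a plain memory model so that the
// reasoner works against fast in-memory storage rather than TDB.
Model mem = ModelFactory.createDefaultModel();
mem.add(base);

Reasoner r = OWLFBRuleReasonerFactory.theInstance().create(null);
InfModel inf = ModelFactory.createInfModel(r, mem);
inf.prepare(); // the inference now runs entirely in memory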

What you can do is take the results of inference and add() them to a TDB 
model for storage. Then, on future occasions, use that materialized set 
of results instead of running a reasoner.
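
Continuing the sketch above (one extra import, 
com.hp.hpl.jena.rdf.model.StmtIterator), the materialize-once pattern 
might look like this:

// First run: write everything the reasoner can derive into TDB.
// inf is backed by the memory copy, so adding to base does not
// disturb the model being enumerated.
base.add(inf);
base.close();
dataset.close();

// Later runs: no reasoner at all; just read the materialized triples.
Dataset dataset2 = TDBFactory.createDataset("MyDatabases/Dataset1");
Model stored = dataset2.getDefaultModel();
StmtIterator it = stored.listStatements();
while (it.hasNext()) {
    System.out.println(it.next());
}
dataset2.close();

Note that base.add(inf) copies the asserted triples as well as the derived 
ones; since TDB stores each triple only once, re-adding them is harmless.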

Dave

> [...]