Posted to users@jena.apache.org by m....@utwente.nl on 2012/03/08 15:37:30 UTC
Loading model into memory
Hi,
I want to apply the OntTools.findShortestPath function to YAGO.
I am using the following code to load the model:
Model model = TDBFactory.createModel(FullYagoDirectory);
The findShortestPath function is taking too much time to return a result.
I wonder if it is possible to load the model into main memory to make it faster or if there is any other way to make FindShortestPath much faster.
Thanks a lot
mena
----------------------------------------------------
Mena B. Habib
PhD Student
University of Twente
Faculty of Electrical Engineering, Mathematics and Computer Science.
Database Chair
7500AE Enschede, Netherlands
mail: m.b.habib@ewi.utwente.nl
website: http://wwwhome.ctit.utwente.nl/~badiehm/
Phone: +31 53 489 4549
Fax: +31 53 489 2927
Mobile: +31 68 183 2680
Re: Loading model into memory
Posted by Andy Seaborne <an...@apache.org>.
On 15/03/12 14:15, m.badiehhabibmorgan@utwente.nl wrote:
> Hi Andy
> To use ARQ 2.9.1 I also need to use TDB 0.9. In the new TDB,
> the function CreateModel(String dir) no longer exists.
CreateModel or createModel?
TDBFactory.createModel() exists, but is deprecated.
> How do I read
> a model from disc (not from memory) with the new versions?
Better to use a dataset and get the default model.
> Furthermore, when I use ARQ 2.9.1 I always get this error:
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/jena/iri/IRIFactory
> I am using jena-iri-0.9.0-incubating.jar
> Thanks a lot
You need jena-iri-0.9.1-incubating-SNAPSHOT with ARQ
2.9.1-incubating-SNAPSHOT as given by the POM.
We have seen some problems with Maven not properly updating the snapshot
dependencies - use mvn -U to force snapshots to be updated.
Andy
>
> Mena
RE: Loading model into memory
Posted by m....@utwente.nl.
Hi Andy
To use ARQ 2.9.1 I also need to use TDB 0.9. In the new TDB, the function CreateModel(String dir) no longer exists. How do I read a model from disc (not from memory) with the new versions?
Furthermore, when I use ARQ 2.9.1 I always get this error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/jena/iri/IRIFactory
I am using jena-iri-0.9.0-incubating.jar
Thanks a lot
Mena
RE: Loading model into memory
Posted by m....@utwente.nl.
Hi Andy
To use ARQ 2.9.0 I also need to use TDB 0.9.
In the new TDB, the function CreateModel(String dir) no longer exists. How do I read a model from disc (not from memory) with the new versions?
Thanks a lot
Mena
Re: Loading model into memory
Posted by Andy Seaborne <an...@apache.org>.
On 13/03/12 12:07, m.badiehhabibmorgan@utwente.nl wrote:
> Hi Andy
>
> I tried this query :
> Query query = QueryFactory.create("SELECT * WHERE { \"http://www.mpii.de/yago/resource/Alexandria\" DISTINCT(path) \"http://www.mpii.de/yago/resource/Egypt\" } ");
> But it gives me this error:
> Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "distinct" "DISTINCT "" at line 1, column 64.
>
> I am using ARQ 2.9.0, Core 2.7.0, TDB 0.9.0
> I could not find ARQ 2.9.1
That query has a number of problems:
1/ DISTINCT(..) is only in the development build of ARQ 2.9.1 (location
below).
2/ It's a syntax extension; you need "create(..., Syntax.syntaxARQ)" to
enable it.
3/ You need to put a path expression inside DISTINCT(...). It does not
find arbitrary paths; it matches a path you give it. You can't have
variables there either. Not sure what YAGO has for properties. A FOAF
example is:
?s distinct(foaf:knows+) ?t
although, for work-in-progress reasons, it is currently the same as
?s foaf:knows+ ?t
unlike in ARQ 2.9.0.
See also this email:
http://mail-archives.apache.org/mod_mbox/incubator-jena-users/201203.mbox/%3C4F5A6D38.5070204%40apache.org%3E
ARQ 2.9.1 development build:
https://repository.apache.org/content/repositories/snapshots/org/apache/jena/jena-arq/2.9.1-incubating-SNAPSHOT/
>
> Thanks a lot
>
> Mena
Andy
RE: Loading model into memory
Posted by m....@utwente.nl.
Hi Andy
I tried this query:
Query query = QueryFactory.create("SELECT * WHERE { \"http://www.mpii.de/yago/resource/Alexandria\" DISTINCT(path) \"http://www.mpii.de/yago/resource/Egypt\" } ");
But it gives me this error:
Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "distinct" "DISTINCT "" at line 1, column 64.
I am using ARQ 2.9.0, Core 2.7.0, TDB 0.9.0
I could not find ARQ 2.9.1
Thanks a lot
Mena
Re: Loading model into memory
Posted by Andy Seaborne <an...@apache.org>.
On 08/03/12 17:26, m.badiehhabibmorgan@utwente.nl wrote:
> It can be useful in this way. But how do I run it using the Jena library? Is there a function I can use to apply such a query to a model?
You can make a SPARQL query or call the path evaluator PathEval directly.
You will need ARQ 2.9.1-incubating-SNAPSHOT from the Apache snapshot
repository.
Andy
>
> Thanks a lot
>
> Mena
RE: Loading model into memory
Posted by m....@utwente.nl.
It can be useful in this way. But how do I run it using the Jena library? Is there a function I can use to apply such a query to a model?
Thanks a lot
Mena
Re: Loading model into memory
Posted by Andy Seaborne <an...@apache.org>.
On 08/03/12 16:06, m.badiehhabibmorgan@utwente.nl wrote:
> What I want to do is see how two entities are related. I am using the shortest path as a measure of relatedness. I don't care what kind of relations they have. This is why I can't specify a particular property.
> I wonder how { :x DISTINCT(path) ?y } works. What is the expected output of this query? Is it the same as getting the shortest path from x to y?
It's not the shortest path - in fact, it does not say what the path is
at all, only that ?y is connected to :x.
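The difference between connectivity and a shortest path can be illustrated with a small sketch. This is not how ARQ evaluates paths; it is plain Java over a hypothetical in-memory adjacency map, and the class name, method name and the "Africa" node are made up for illustration. It returns only the set of nodes connected to a start node by one or more edges, with no record of how each was reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: connectivity only, no path information is kept.
class ReachabilitySketch {

    /** Returns every node reachable from start by one or more edges. */
    static Set<String> reachableFrom(Map<String, List<String>> adjacency, String start) {
        Set<String> reached = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            for (String next : adjacency.getOrDefault(node, List.of())) {
                // add() returns false for nodes already seen, so cycles terminate
                if (reached.add(next)) {
                    stack.push(next);
                }
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        // Made-up data, not taken from YAGO.
        Map<String, List<String>> graph = Map.of(
            "Alexandria", List.of("Egypt"),
            "Egypt", List.of("Africa"));
        System.out.println(reachableFrom(graph, "Alexandria").contains("Africa")); // prints true
    }
}
```

The result says that a node is connected, but the route and its length are not available from it.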
Andy
>
> Thanks a lot
>
> Mena
RE: Loading model into memory
Posted by m....@utwente.nl.
What I want to do is see how two entities are related. I am using the shortest path as a measure of relatedness. I don't care what kind of relations they have. This is why I can't specify a particular property.
I wonder how { :x DISTINCT(path) ?y } works. What is the expected output of this query? Is it the same as getting the shortest path from x to y?
Thanks a lot
Mena
Re: Loading model into memory
Posted by Andy Seaborne <an...@apache.org>.
On 08/03/12 15:03, Chris Dollin wrote:
> Mena said:
>
>> I want to apply the OntTools.findShortestPath function to YAGO.
>> I am using the following code to load the model:
>>
>> Model model = TDBFactory.createModel(FullYagoDirectory);
>>
>> The findShortestPath function is taking too much time to return a result.
>> I wonder if it is possible to load the model into main memory to make it
>> faster or if there is any other way to make findShortestPath much faster.
>
> Model model = ModelFactory.createDefaultModel().add( TDBFactory.createModel(FullYagoDirectory) );
>
> Of course you may then run out of memory if the model is big.
>
> Chris
>
> ("Default" models are in-memory models.)
IIRC YAGO(2) is a bit big. The core is something like 30 million
triples and the full version 80 million triples, I think.
Bit big for memory unless you have a big server.
Do you need "shortest path" or is just connectivity of entities acceptable?
ARQ now has DISTINCT for paths and executes it (more) efficiently:
{ :x DISTINCT(path) ?y }
in the ARQ language.
(more to come here ... "soon")
If you do want "shortest path", you may need to simplify the problem.
Jena's OntTools shortest path is quite general - can you work with, say,
the path being a fixed property?
If so, maybe extract all the occurrences of that property and make a
subgraph, hopefully smaller.
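A minimal sketch of the subgraph idea, assuming the triples are available as plain {subject, property, object} string arrays rather than Jena statements. The class name, method name, and the property names "locatedIn" and "label" are hypothetical and are not taken from YAGO.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keep only the edges labelled with one fixed property.
class PropertySubgraphSketch {

    /** Builds an adjacency list containing only edges whose property equals the given one. */
    static Map<String, List<String>> extract(List<String[]> triples, String property) {
        Map<String, List<String>> adjacency = new HashMap<>();
        for (String[] t : triples) {
            // t[0] = subject, t[1] = property, t[2] = object
            if (t[1].equals(property)) {
                adjacency.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t[2]);
            }
        }
        return adjacency;
    }

    public static void main(String[] args) {
        // Made-up triples for illustration only.
        List<String[]> triples = List.of(
            new String[] {"Alexandria", "locatedIn", "Egypt"},
            new String[] {"Alexandria", "label", "Alexandria (city)"},
            new String[] {"Egypt", "locatedIn", "Africa"});
        System.out.println(extract(triples, "locatedIn").get("Alexandria")); // prints [Egypt]
    }
}
```

The resulting map holds only the edges of interest, so any later path search touches less data than the full graph.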
You may need to look at a graph algorithm like the Floyd-Warshall
algorithm [*] which is space-consuming and O(N^3) in time. Being able
to reduce to something smaller helps with the space consumption.
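For reference, a compact sketch of Floyd-Warshall on a directed graph where every edge has length 1. It assumes nodes have already been mapped to integer indexes 0..n-1; the class name and the INF sentinel are illustrative choices. The n-by-n matrix is the space cost mentioned above: for a hypothetical one million nodes it would need about 4 TB as 4-byte ints, which is why reducing the graph first matters.

```java
import java.util.Arrays;

// Hypothetical sketch: all-pairs shortest hop counts, O(n^3) time and O(n^2) space.
class FloydWarshallSketch {

    // "No path" marker; halved so that INF + INF cannot overflow an int.
    static final int INF = Integer.MAX_VALUE / 2;

    /** edges[i] = {from, to}; returns dist where dist[i][j] is the hop count from i to j. */
    static int[][] allPairs(int n, int[][] edges) {
        int[][] dist = new int[n][n];
        for (int[] row : dist) {
            Arrays.fill(row, INF);
        }
        for (int i = 0; i < n; i++) {
            dist[i][i] = 0;
        }
        for (int[] e : edges) {
            dist[e[0]][e[1]] = Math.min(dist[e[0]][e[1]], 1);
        }
        // Allow each node k in turn as an intermediate stop.
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (dist[i][k] + dist[k][j] < dist[i][j]) {
                        dist[i][j] = dist[i][k] + dist[k][j];
                    }
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] dist = allPairs(4, new int[][] {{0, 1}, {1, 2}, {2, 3}});
        System.out.println(dist[0][3]); // prints 3
    }
}
```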
(OntTools.findShortestPath is a simple breadth-first search.)
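To show what a breadth-first shortest-path search involves, here is a generic sketch over a hypothetical in-memory adjacency map. It is not Jena's implementation; the class and method names and the "Africa" node are assumptions made for illustration. Each visited node remembers the node it was reached from, so the path can be rebuilt once the goal is found.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: breadth-first search that ignores edge labels.
class BfsShortestPathSketch {

    /** Returns the nodes on a shortest path from start to goal, or an empty list if none exists. */
    static List<String> shortestPath(Map<String, List<String>> adjacency, String start, String goal) {
        Map<String, String> parent = new HashMap<>(); // visited node -> node it was reached from
        Deque<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(goal)) {
                // Walk the parent links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String n = goal; n != null; n = parent.get(n)) {
                    path.add(n);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : adjacency.getOrDefault(node, List.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }

    public static void main(String[] args) {
        // Made-up data, not taken from YAGO.
        Map<String, List<String>> graph = Map.of(
            "Alexandria", List.of("Egypt"),
            "Egypt", List.of("Africa"));
        System.out.println(shortestPath(graph, "Alexandria", "Africa")); // prints [Alexandria, Egypt, Africa]
    }
}
```

Because every reachable node may be visited before the goal turns up, the cost grows with the size of the connected region, which is consistent with the slow results reported on a graph of this size.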
Andy
[*] http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
Andy
Re: Loading model into memory
Posted by Chris Dollin <ch...@epimorphics.com>.
Mena said:
> I want to apply the OntTools.findShortestPath function to YAGO.
> I am using the following code to load the model:
>
> Model model = TDBFactory.createModel(FullYagoDirectory);
>
> The findShortestPath function is taking too much time to return a result.
> I wonder if it is possible to load the model into main memory to make it
> faster or if there is any other way to make findShortestPath much faster.
Model model = ModelFactory.createDefaultModel().add( TDBFactory.createModel(FullYagoDirectory) );
Of course you may then run out of memory if the model is big.
Chris
("Default" models are in-memory models.)
--
"I don't want to know what the Structuralists think! I want /Archer's Goon/
to know what YOU think!"
Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)