Posted to users@jena.apache.org by Martin Vassilev <mr...@gmail.com> on 2013/08/05 15:15:14 UTC
jena framework performance
Hi all,
I would like to ask you a few questions about Jena.
I read in the documentation that
*Note:* Although OWL version 1.1 is now a W3C recommendation, Jena's
support for OWL 1.1 features is limited
1. Can you tell me exactly what is not supported (or what is supported)
from OWL 1.1, and what version of OWL is fully supported?
2. Can Jena handle 2 or 3 billion triples using TDB?
3. Is it possible to switch TDB with SDB relatively easily?
Martin
Re: TDBLoader vs TDBloader2
Posted by Andy Seaborne <an...@apache.org>.
On 09/08/13 01:02, Charles Li wrote:
> Hi, All:
tdbloader is a script, as is tdbloader2.
TDBLoader is a class in TDB that implements tdbloader.
There is no equivalent class for tdbloader2 - it simply does not work
that way.
> 1. Is there any difference between the resultant data stores from the same RDF file using TDBLoader and TDBLoader2?
They contain the same triples and behave exactly the same. A store built
by tdbloader2 is slightly more compact, but that compactness will slowly
disappear as data is added to the store.
Both loaders are optimized for the case of loading an empty store.
If the store already has data, tdbloader will fall back to loading into
the existing store.
tdbloader2 won't - it'll destroy the old data.
> 2. What are some of the highlights of differences between TDBLoader and TDBLoader2?
tdbloader2 can be faster. It only runs on Linux/Unix machines, not MS
Windows.
Indeed, if your sort(1) has --parallel, tdbloader2 can do some work in
parallel. See the script for details.
>
> 3. What are the guidelines on when to use TDBLoader and when to use TDBLoader2?
Use tdbloader unless you're into the hundreds of millions of triples.
Then try each and see what happens.
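The rules of thumb in this reply can be condensed into a small decision helper. This is only a sketch in plain Java: the class `LoaderChoice`, its method and the 100 million cut-off are hypothetical names and values chosen for illustration (the thread gives no exact figure), and nothing here is part of Jena or TDB.

```java
import java.util.Locale;

/** Hypothetical helper encoding the advice in this thread; not part of Jena. */
final class LoaderChoice {
    enum Loader { TDBLOADER, TDBLOADER2 }

    // Assumed cut-off for "hundreds of millions"; the thread gives no exact number.
    static final long LARGE_LOAD_THRESHOLD = 100_000_000L;

    static Loader choose(long tripleCount, boolean storeIsEmpty, boolean unixLike) {
        // tdbloader2 is a Unix-only script and destroys any existing data,
        // so it is only a candidate for a fresh store on Linux/Unix.
        if (!unixLike || !storeIsEmpty) {
            return Loader.TDBLOADER;
        }
        // Below the threshold, the advice is simply to use tdbloader.
        if (tripleCount < LARGE_LOAD_THRESHOLD) {
            return Loader.TDBLOADER;
        }
        // Large load into an empty store: tdbloader2 is worth trying,
        // but it should still be benchmarked against tdbloader.
        return Loader.TDBLOADER2;
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name").toLowerCase(Locale.ROOT);
        boolean unixLike = !os.startsWith("windows");
        System.out.println(choose(500_000_000L, true, unixLike));
    }
}
```

Even when the helper suggests tdbloader2, the advice above still applies: try each loader on your own data and machine and compare.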
>
> Thanks a lot in advance!
> - Charles
>
Andy
TDBLoader vs TDBloader2
Posted by Charles Li <ch...@gmail.com>.
Hi, All:
1. Is there any difference between the resultant data stores from the same RDF file using TDBLoader and TDBLoader2?
2. What are some of the highlights of differences between TDBLoader and TDBLoader2?
3. What are the guidelines on when to use TDBLoader and when to use TDBLoader2?
Thanks a lot in advance!
- Charles
Re: jena framework performance
Posted by Andy Seaborne <an...@apache.org>.
Hi Martin,
Jena provides APIs for all SPARQL functionality, so you can use it with
any SPARQL-compliant store. 4Store is one such store (nearly complete
SPARQL 1.1 - no property paths currently, IIRC). It was built to handle
data in the volume you're talking about.
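What makes this combination work is the SPARQL protocol: a query is sent to the store's HTTP endpoint as a `query` parameter. The sketch below shows that request using only the JDK (Java 11 or later). The class name and the endpoint URL `http://localhost:8000/sparql/` are assumptions for illustration, not a documented 4store default, and the helper assumes the endpoint URL has no query string of its own. In practice, Jena's remote query support (for example `QueryExecutionFactory.sparqlService` in Jena 2.x) takes care of this for you.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a SPARQL protocol GET request; not Jena code. */
final class SparqlProtocolSketch {

    /** Builds the GET URI that carries the query in the "query" parameter. */
    static URI queryUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    /** Builds (but does not send) a request asking for JSON results. */
    static HttpRequest request(String endpoint, String sparql) {
        return HttpRequest.newBuilder(queryUri(endpoint, sparql))
                .header("Accept", "application/sparql-results+json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Assumed endpoint; replace with wherever your store is listening.
        String endpoint = "http://localhost:8000/sparql/";
        String sparql = "SELECT * WHERE { ?s ?p ?o } LIMIT 10";
        System.out.println(request(endpoint, sparql).uri());
    }
}
```

Because the request is plain HTTP, the same client code can be pointed at any store that implements the protocol by changing only the endpoint URL.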
Andy
On 06/08/13 14:33, Martin Vassilev wrote:
> OK, thanks. I see that Jena TDB/SDB won't fit my needs. Is it possible
> to use Jena with 4store, i.e. take advantage of 4store's performance and
> scalability and Jena's API?
>
> Martin
Re: jena framework performance
Posted by Martin Vassilev <mr...@gmail.com>.
OK, thanks. I see that Jena TDB/SDB won't fit my needs. Is it possible
to use Jena with 4store, i.e. take advantage of 4store's performance and
scalability and Jena's API?
Martin
Re: jena framework performance
Posted by Andy Seaborne <an...@apache.org>.
On 05/08/13 23:01, Dave Reynolds wrote:
> On 05/08/13 14:15, Martin Vassilev wrote:
>> Hi all,
>>
>> I would like to ask you a few questions about Jena.
>>
>> I read in the documentation that
>> *Note:* Although OWL version 1.1 is now a W3C recommendation, Jena's
>> support for OWL 1.1 features is limited
>
> I assume you mean OWL 2. If I recall correctly, that was called OWL 1.1
> for a while before the full extent of the changes was apparent.
>
>> 1. Can you tell me exactly what is not supported (or what is supported)
>> from OWL 1.1, and what version of OWL is fully supported?
>
> Jena provides convenience methods for the complete OWL 1 language. It
> provides built-in rule-based inference for subsets of the OWL Full
> dialect of OWL 1, as described in [1]. Third-party tools like Pellet
> provide complete OWL DL implementations compatible with Jena.
>
> Jena currently has no direct support for any of the OWL 2 extensions
> beyond OWL 1. However, since OWL 2 can be fully encoded in RDF (the
> normative syntax) and Jena can handle any RDF, you can read and
> write any OWL 2 ontology; you just need to do a little more work.
> Certainly people have built commercial OWL 2 tools on top of Jena.
>
> There is no built-in inference support for any of OWL 2 beyond OWL 1. To
> find out exactly what Pellet et al. provide for OWL 2, you would need to
> check with them directly.
>
>> 2. Can Jena handle 2 or 3 billion triples using TDB?
>
> It depends.
>
> TDB has no hard size limit in that range, so with a big enough
> machine you should be able to load that much data. It is getting into
> sizes where things can get slow, though, so whether query performance
> will be adequate for you will depend on your data, your queries and
> your machine. The only way to be sure is to try it.
It will take a long time to load and only be able to answer simple
queries (basically, look up a resource by URI or inverse functional
property and get the values of properties of that object).
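To make "simple queries" concrete: looking up a resource by URI and getting the values of its properties corresponds to a single triple pattern. The sketch below builds such a query string in plain Java. The class name, method name and example URI are hypothetical, and the character check is a rough guard for illustration rather than full IRI validation.

```java
import java.util.regex.Pattern;

/** Hypothetical builder for a "properties of one resource" SPARQL query. */
final class SimpleLookupQuery {

    // Characters that cannot appear inside a SPARQL <IRI> token.
    private static final Pattern ILLEGAL = Pattern.compile("[<>\"{}|^`\\\\\\s]");

    /** Returns a query listing every property and value of one resource. */
    static String propertiesOf(String resourceUri) {
        if (ILLEGAL.matcher(resourceUri).find()) {
            throw new IllegalArgumentException("Not usable as an IRI: " + resourceUri);
        }
        return "SELECT ?property ?value WHERE { <" + resourceUri + "> ?property ?value }";
    }

    public static void main(String[] args) {
        System.out.println(propertiesOf("http://example.org/resource/1"));
    }
}
```

A query of this shape touches one subject, so it stays answerable at sizes where joins across many patterns become slow.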
>
>> 3. Is it possible to switch TDB with SDB relatively easily?
>
> Relatively, but SDB is not advised for new projects unless you have a
> strong need for it, and I very much doubt it would cope with 2-3
> billion triples.
SDB will not work at 2-3 billion triples.
If you design for SPARQL as your interface, you can use a clustered
store like 4Store. It's not full SPARQL 1.1, but it's close, is open
source (GPLv2) and runs on multiple machines for scaling.
>
> Dave
>
> [1] http://jena.apache.org/documentation/inference/#owl
Re: jena framework performance
Posted by Dave Reynolds <da...@gmail.com>.
On 05/08/13 14:15, Martin Vassilev wrote:
> Hi all,
>
> I would like to ask you a few questions about Jena.
>
> I read in the documentation that
> *Note:* Although OWL version 1.1 is now a W3C recommendation, Jena's
> support for OWL 1.1 features is limited
I assume you mean OWL 2. If I recall correctly, that was called OWL 1.1
for a while before the full extent of the changes was apparent.
> 1. Can you tell me exactly what is not supported (or what is supported)
> from OWL 1.1, and what version of OWL is fully supported?
Jena provides convenience methods for the complete OWL 1 language. It
provides built-in rule-based inference for subsets of the OWL Full
dialect of OWL 1, as described in [1]. Third-party tools like Pellet
provide complete OWL DL implementations compatible with Jena.
Jena currently has no direct support for any of the OWL 2 extensions
beyond OWL 1. However, since OWL 2 can be fully encoded in RDF (the
normative syntax) and Jena can handle any RDF, you can read and
write any OWL 2 ontology; you just need to do a little more work.
Certainly people have built commercial OWL 2 tools on top of Jena.
There is no built-in inference support for any of OWL 2 beyond OWL 1. To
find out exactly what Pellet et al. provide for OWL 2, you would need to
check with them directly.
> 2. Can Jena handle 2 or 3 billion triples using TDB?
It depends.
TDB has no hard size limit in that range, so with a big enough
machine you should be able to load that much data. It is getting into
sizes where things can get slow, though, so whether query performance
will be adequate for you will depend on your data, your queries and
your machine. The only way to be sure is to try it.
> 3. Is it possible to switch TDB with SDB relatively easily?
Relatively, but SDB is not advised for new projects unless you have a
strong need for it, and I very much doubt it would cope with 2-3
billion triples.
Dave
[1] http://jena.apache.org/documentation/inference/#owl