Posted to users@jena.apache.org by Mikael Pesonen <mi...@lingsoft.fi> on 2020/12/15 14:33:03 UTC
Efficient SPARQL query for RDF collections
Hi,
I am querying for subclasses of a class (here id:365852007) that appears
as a member of an RDF collection, like this:
id:1 rdf:type owl:Class ;
    rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ] ;
    skos:prefLabel "something"@en .
id:2 rdf:type owl:Class ;
    rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ] ;
    skos:prefLabel "something else"@en .
Here "..." denotes an arbitrary number of other elements in the list.
SPARQL:
select * where
{
    ?subclass rdfs:subClassOf [ owl:intersectionOf/rdf:rest*/rdf:first id:365852007 ] .
    ?subclass skos:prefLabel ?label .
}
This is slow (~10 secs) and we need to extend the query to subclasses of
the subclasses and so on.
Is there any faster way to get this done?
Thanks!
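For readers unfamiliar with how a Turtle collection is stored, here is a minimal Python sketch of the rdf:first/rdf:rest encoding that the path rdf:rest*/rdf:first walks over. It is only an illustration, not Jena code: the blank node labels (_:b0, _:b0_1, ...) and the extra members id:10 and id:20 are made up for the example.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def collection_to_triples(head, members):
    """Expand a Turtle collection ( m1 m2 ... ) into rdf:first/rdf:rest triples."""
    if not members:
        raise ValueError("an empty collection is just rdf:nil and has no cells")
    # One list cell per member; the labels are hypothetical blank node names.
    cells = [head] + [f"{head}_{i}" for i in range(1, len(members))]
    triples = []
    for i, member in enumerate(members):
        nxt = cells[i + 1] if i + 1 < len(cells) else RDF_NIL
        triples.append((cells[i], RDF_FIRST, member))
        triples.append((cells[i], RDF_REST, nxt))
    return triples


def members_via_path(triples, head):
    """Follow rdf:rest*/rdf:first from the head cell and collect the members in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    out, cell = [], head
    while cell in first:  # stops at rdf:nil, which has no rdf:first
        out.append(first[cell])
        cell = rest.get(cell)
    return out


triples = collection_to_triples("_:b0", ["id:10", "id:365852007", "id:20"])
members = members_via_path(triples, "_:b0")
print(members)
```

Each member costs two triples, so a store with many long collections has a large number of rdf:first and rdf:rest triples to search when neither end of the path is fixed.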
Re: Efficient SPARQL query for RDF collections
Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Hi,
With 3.17 the timing is about the same, but breaking up the query helped
partially.
Original single-subclass query: 28 s; broken-up query: 31 ms (returns 14
items).
Original subclass-of-subclass query: 150 s (returns 23 items); the broken-up
query doesn't finish in 15 minutes and Fuseki seems to hang.
Here's the broken-up query for subclass of subclass:
    ?Z rdf:rest*/rdf:first id:106048009 .
    ?X owl:intersectionOf ?Z .
    ?subclass rdfs:subClassOf ?X .
    ?subclass skos:prefLabel ?label .
    ?Z2 rdf:rest*/rdf:first ?subclass .
    ?X2 owl:intersectionOf ?Z2 .
    ?subclass2 rdfs:subClassOf ?X2 .
    ?subclass2 skos:prefLabel ?label2 .
BR
--
Lingsoft - 30 years of Leading Language Management
www.lingsoft.fi
Speech Applications - Language Management - Translation - Reader's and Writer's Tools - Text Tools - E-books and M-books
Mikael Pesonen
System Engineer
e-mail: mikael.pesonen@lingsoft.fi
Tel. +358 2 279 3300
Time zone: GMT+2
Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND
Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND
Re: Efficient SPARQL query for RDF collections
Posted by Andy Seaborne <an...@apache.org>.
On 14/01/2021 15:18, Mikael Pesonen wrote:
>
> Okay, so comparing the same examples,
>
> fast one:
>
> (base <http://example/base/>
>   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>            (skos: <http://www.w3.org/2004/02/skos/core#>)
>            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
>            (id: <http://example.com/id/>)
>            (list: <http://jena.apache.org/ARQ/list#>))
>     (project (?class ?classLabel)
>       (sequence
>         (sequence
>           (bgp (triple ??P23 rdf:first id:308925008))
>           (path ?scIntsec (path* rdf:rest) ??P23))
Note order: bgp-path
>         (bgp
>           (triple ?scTmp owl:intersectionOf ?scIntsec)
>           (triple ?class rdfs:subClassOf ?scTmp)
>           (triple ?class skos:prefLabel ?classLabel)
>         )))))
>
>
> slow one:
>
> (base <http://example/base/>
>   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>            (skos: <http://www.w3.org/2004/02/skos/core#>)
>            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
>            (id: <http://example.com/id/>)
>            (list: <http://jena.apache.org/ARQ/list#>))
>     (project (?class ?classLabel)
>       (sequence
>         (table (vars ?superClass)
>           (row [?superClass id:308925008])
>         )
>         (sequence
>           (sequence
>             (path ?scIntsec (path* rdf:rest) ??P24)
>             (bgp (triple ??P24 rdf:first ?superClass)))
>           (bgp
>             (triple ?scTmp owl:intersectionOf ?scIntsec)
>             (triple ?class rdfs:subClassOf ?scTmp)
>             (triple ?class skos:prefLabel ?classLabel)
>           ))))))
>
>
> The slow one has an unbound section that gets executed:
It's not completely unbound - it is in a (sequence) so the ?superClass
is passed in.
It does not, however, reverse the order path-bgp.
>
>             (path ?scIntsec (path* rdf:rest) ??P24)
>             (bgp (triple ??P24 rdf:first ?superClass)))
>
> and has tens of thousands of results to handle, with the table construct
> attached only later?
>
Re: Efficient SPARQL query for RDF collections
Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Okay, so comparing the same examples,
fast one:
(base <http://example/base/>
  (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
           (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
           (skos: <http://www.w3.org/2004/02/skos/core#>)
           (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
           (id: <http://example.com/id/>)
           (list: <http://jena.apache.org/ARQ/list#>))
    (project (?class ?classLabel)
      (sequence
        (sequence
          (bgp (triple ??P23 rdf:first id:308925008))
          (path ?scIntsec (path* rdf:rest) ??P23))
        (bgp
          (triple ?scTmp owl:intersectionOf ?scIntsec)
          (triple ?class rdfs:subClassOf ?scTmp)
          (triple ?class skos:prefLabel ?classLabel)
        )))))
slow one:
(base <http://example/base/>
  (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
           (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
           (skos: <http://www.w3.org/2004/02/skos/core#>)
           (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
           (id: <http://example.com/id/>)
           (list: <http://jena.apache.org/ARQ/list#>))
    (project (?class ?classLabel)
      (sequence
        (table (vars ?superClass)
          (row [?superClass id:308925008])
        )
        (sequence
          (sequence
            (path ?scIntsec (path* rdf:rest) ??P24)
            (bgp (triple ??P24 rdf:first ?superClass)))
          (bgp
            (triple ?scTmp owl:intersectionOf ?scIntsec)
            (triple ?class rdfs:subClassOf ?scTmp)
            (triple ?class skos:prefLabel ?classLabel)
          ))))))
The slow one has an unbound section that gets executed:
            (path ?scIntsec (path* rdf:rest) ??P24)
            (bgp (triple ??P24 rdf:first ?superClass)))
and has tens of thousands of results to handle, with the table construct
attached only later?
Re: Efficient SPARQL query for RDF collections
Posted by Andy Seaborne <an...@apache.org>.
Paste the query into
http://www.sparql.org/query-validator.html
and use the "SPARQL algebra (general optimizations)" output,
or on the command line:
qparse --print=opt
On 14/01/2021 10:53, Mikael Pesonen wrote:
>
> Would be really helpful to know how variables are bound etc if there
> exists a simple guide for that.
>
> This query
>
> SELECT count(?scIntsec)
> WHERE
> {
> ?scIntsec rdf:first ?superClass .
> }
>
> returns 851029 which must be the reason for slowness in my case. But I'm
> still not getting why it matters and why doesn't
>
> VALUES ?superClass {id:308925008 }
Because that can be multiple values, and the optimizer doesn't make one
value a special case,
and then
    ?scIntsec rdf:rest*/rdf:first ?superClass .
is a compound expression (you could write it out and control the order
if you want)
    ?X rdf:first ?superClass .
    ?scIntsec rdf:rest* ?X .
and there is no local information that the other way is better. (ARQ does
not do much scheduling of paths and triple patterns.)
Andy
>
> limit the search space.
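To make the ordering point concrete, here is a toy Python model of the two evaluation orders. It is a sketch under stated assumptions, not ARQ's implementation: the triples are a plain Python list, the cell labels (_:l0c0, ...) and filler members are invented, and "steps" is just a count of list cells touched. first_then_path mimics the bgp-path order (look up rdf:first with the bound class, then walk rdf:rest backwards); path_then_first mimics the path-bgp order (expand rdf:rest* from every node, then test rdf:first).

```python
from collections import defaultdict

FIRST, REST, NIL = "rdf:first", "rdf:rest", "rdf:nil"


def encode(lists):
    """Encode non-empty Python lists as rdf:first/rdf:rest triples (toy cell labels)."""
    triples = []
    for i, members in enumerate(lists):
        for j, member in enumerate(members):
            cell = f"_:l{i}c{j}"
            nxt = f"_:l{i}c{j + 1}" if j + 1 < len(members) else NIL
            triples += [(cell, FIRST, member), (cell, REST, nxt)]
    return triples


def first_then_path(triples, target):
    """bgp-path order: find cells whose rdf:first is target, then walk rdf:rest backwards."""
    rest_inverse = defaultdict(list)
    for s, p, o in triples:
        if p == REST:
            rest_inverse[o].append(s)
    stack = [s for s, p, o in triples if p == FIRST and o == target]
    found, steps = set(), 0
    while stack:
        cell = stack.pop()
        steps += 1
        if cell not in found:
            found.add(cell)
            stack.extend(rest_inverse[cell])
    return found, steps


def path_then_first(triples, target):
    """path-bgp order: expand rdf:rest* from every node, then test rdf:first."""
    rest = {s: o for s, p, o in triples if p == REST}
    holds_target = {s for s, p, o in triples if p == FIRST and o == target}
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    found, steps = set(), 0
    for start in nodes:  # the path subject is unbound, so every node is a start
        cell = start
        while cell is not None:
            steps += 1
            if cell in holds_target:
                found.add(start)
            cell = rest.get(cell)
    return found, steps


# One list containing the class of interest, plus 50 unrelated two-element lists.
lists = [["id:1", "id:308925008", "id:2"]] + [[f"id:{n}", f"id:{n + 1}"] for n in range(10, 60)]
triples = encode(lists)
fast, fast_steps = first_then_path(triples, "id:308925008")
slow, slow_steps = path_then_first(triples, "id:308925008")
print(fast == slow, fast_steps, slow_steps)
```

Both orders return the same cells; only the amount of work differs, and in the toy model the gap grows with the number of unrelated list cells in the data.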
>
>
Re: Efficient SPARQL query for RDF collections
Posted by Mikael Pesonen <mi...@lingsoft.fi>.
It would be really helpful to know how variables are bound etc., if a
simple guide for that exists.
This query
SELECT (count(?scIntsec) AS ?count)
WHERE
{
    ?scIntsec rdf:first ?superClass .
}
returns 851029, which must be the reason for the slowness in my case. But I'm
still not getting why it matters and why
VALUES ?superClass { id:308925008 }
doesn't limit the search space.
Re: Efficient SPARQL query for RDF collections
Posted by Steve Vestal <st...@adventiumlabs.com>.
I'm going to generalize this request a bit. I earlier reported
restructuring a query to get from 25 minutes to 90 ms, where I found the
"Learning SPARQL" book to be helpful. Do you know of any more detailed
tutorial or guidelines focusing on performance issues, something that
goes into a bit more detail about ordering, indexing, interactions
between different SPARQL features, and such?
Re: Efficient SPARQL query for RDF collections
Posted by Mikael Pesonen <mi...@lingsoft.fi>.
I wasn't even aware of such a thing. So
SELECT ?class ?classLabel
WHERE
{
    VALUES ?superClass { id:308925008 }
    ?scIntsec list:member ?superClass .
    ?scTmp owl:intersectionOf ?scIntsec .
    ?class rdfs:subClassOf ?scTmp .
    ?class skos:prefLabel ?classLabel .
}
runs in under 20 ms.
Thanks!
On 14/01/2021 14.08, Andy Seaborne wrote:
> Have you tried using the property function "list:member"?
>
> Andy
>
Re: Efficient SPARQL query for RDF collections
Posted by Andy Seaborne <an...@apache.org>.
Have you tried using the property function "list:member"?
Andy
Re: Efficient SPARQL query for RDF collections
Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Related to this: this query returns in 70 ms
SELECT ?class ?classLabel
WHERE
{
    ?scIntsec rdf:rest*/rdf:first id:308925008 .
    ?scTmp owl:intersectionOf ?scIntsec .
    ?class rdfs:subClassOf ?scTmp .
    ?class skos:prefLabel ?classLabel .
}
but this one doesn't finish in 15 minutes (I aborted it then)
SELECT ?class ?classLabel
WHERE
{
    VALUES ?superClass { id:308925008 }
    ?scIntsec rdf:rest*/rdf:first ?superClass .
    ?scTmp owl:intersectionOf ?scIntsec .
    ?class rdfs:subClassOf ?scTmp .
    ?class skos:prefLabel ?classLabel .
}
How come they are so different when they do the same thing?
Re: Efficient SPARQL query for RDF collections
Posted by Andy Seaborne <an...@apache.org>.
On 15/12/2020 14:33, Mikael Pesonen wrote:
>
> Hi,
>
> I am querying subclasses of class (here id:365852007) which belongs to
> RDF collection like this
>
> id:1 rdf:type owl:Class ;
> rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ] ;
> skos:prefLabel "something"@en .
> id:2 rdf:type owl:Class ;
> rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ] ;
> skos:prefLabel "something else"@en .
>
>
> ... denotes random number of other elements in the list.
>
> SPARQL:
>
> select * where
> {
> ?subclass rdfs:subClassOf [ owl:intersectionOf /rdf:rest*/rdf:first
> id:365852007 ] .
> ?subclass skos:prefLabel ?label .
Try breaking the pattern up:
    ?Z rdf:rest*/rdf:first id:365852007 .
    ?X owl:intersectionOf ?Z .
    ?subclass rdfs:subClassOf ?X .
    ?subclass skos:prefLabel ?label .
and use 3.17.0
> }
>
>
> This is slow (~10 secs) and we need to extend the query to subclasses of
> the subclasses and so on.
>
> Is there any faster way to get this done?
>
> Thanks!
>