Posted to users@jena.apache.org by Mikael Pesonen <mi...@lingsoft.fi> on 2020/12/15 14:33:03 UTC

Efficient SPARQL query for RDF collections

Hi,

I am querying for subclasses of a class (here id:365852007) that appears in
an RDF collection, like this:

id:1 rdf:type owl:Class ;
         rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ]  ;
         skos:prefLabel "something"@en .
id:2 rdf:type owl:Class ;
         rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ]  ;
         skos:prefLabel "something else"@en .


... denotes an arbitrary number of other elements in the list.

SPARQL:

select * where
{
     ?subclass rdfs:subClassOf [ owl:intersectionOf/rdf:rest*/rdf:first id:365852007 ] .
     ?subclass skos:prefLabel ?label .
}


This is slow (~10 secs) and we need to extend the query to subclasses of 
the subclasses and so on.

Is there any faster way to get this done?

Thanks!

Re: Efficient SPARQL query for RDF collections

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Hi,

3.17 seems to be about the same, but breaking up the query helped
partially.

Original single-subclass query: 28 s; broken-up query: 31 ms (returns 14
items).

Original subclass-of-subclass query: 150 s (returns 23 items); the broken-up
query doesn't finish in 15 minutes and Fuseki seems to hang.


Here's the broken-up query for subclass of subclass:

?Z rdf:rest*/rdf:first id:106048009 . ?X owl:intersectionOf ?Z .
?subclass rdfs:subClassOf ?X .
?subclass skos:prefLabel ?label .

?Z2 rdf:rest*/rdf:first ?subclass . ?X2 owl:intersectionOf ?Z2 .
?subclass2 rdfs:subClassOf ?X2 .
?subclass2 skos:prefLabel ?label2 .


BR

On 15/12/2020 20.30, Andy Seaborne wrote:
>
>
> On 15/12/2020 14:33, Mikael Pesonen wrote:
>>
>> Hi,
>>
>> I am querying subclasses of class (here id:365852007) which belongs 
>> to RDF collection like this
>>
>> id:1 rdf:type owl:Class ;
>>          rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... 
>> ) ]  ;
>>          skos:prefLabel "something"@en .
>> id:2 rdf:type owl:Class ;
>>          rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... 
>> ) ]  ;
>>          skos:prefLabel "something else"@en .
>>
>>
>> ... denotes random number of other elements in the list.
>>
>> SPARQL:
>>
>> select * where
>> {
>>      ?subclass rdfs:subClassOf [ owl:intersectionOf 
>> /rdf:rest*/rdf:first id:365852007  ] .
>>      ?subclass skos:prefLabel ?label .
>
> Try breaking the pattern up:
>
>     ?Z rdf:rest*/rdf:first id:365852007   .
>     ?X owl:intersectionOf ?Z .
>     ?subclass rdfs:subClassOf ?X .
>     ?subclass skos:prefLabel ?label .
>
> and use 3.17.0
>
>> }
>>
>>
>> This is slow (~10 secs) and we need to extend the query to subclasses 
>> of the subclasses and so on.
>>
>> Is there any faster way to get this done?
>>
>> Thanks!
>>

-- 
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: mikael.pesonen@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND


Re: Efficient SPARQL query for RDF collections

Posted by Andy Seaborne <an...@apache.org>.

On 14/01/2021 15:18, Mikael Pesonen wrote:
> 
> Okay, so comparing the same examples,
> 
> fast one:
> 
>    1 (base <http://example/base/>
>    2   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
>    3            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>    4            (skos: <http://www.w3.org/2004/02/skos/core#>)
>    5            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
>    6            (id: <http://example.com/id/>)
>    7            (list: <http://jena.apache.org/ARQ/list#>))
>    8     (project (?class ?classLabel)
>    9       (sequence
>   10         (sequence
>   11           (bgp (triple ??P23 rdf:first id:308925008))
>   12           (path ?scIntsec (path* rdf:rest) ??P23))

Note order: bgp-path

>   13         (bgp
>   14           (triple ?scTmp owl:intersectionOf ?scIntsec)
>   15           (triple ?class rdfs:subClassOf ?scTmp)
>   16           (triple ?class skos:prefLabel ?classLabel)
>   17         )))))
> 
> 
> slow one:
> 
>    1 (base <http://example/base/>
>    2   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
>    3            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>    4            (skos: <http://www.w3.org/2004/02/skos/core#>)
>    5            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
>    6            (id: <http://example.com/id/>)
>    7            (list: <http://jena.apache.org/ARQ/list#>))
>    8     (project (?class ?classLabel)
>    9       (sequence
>   10         (table (vars ?superClass)
>   11           (row [?superClass id:308925008])
>   12         )
>   13         (sequence
>   14           (sequence
>   15             (path ?scIntsec (path* rdf:rest) ??P24)
>   16             (bgp (triple ??P24 rdf:first ?superClass)))
>   17           (bgp
>   18             (triple ?scTmp owl:intersectionOf ?scIntsec)
>   19             (triple ?class rdfs:subClassOf ?scTmp)
>   20             (triple ?class skos:prefLabel ?classLabel)
>   21           ))))))
> 
> 
> Slow one has unbounded section which gets executed:

It's not completely unbound: it is in a (sequence), so the ?superClass
binding is passed in.

The optimizer does not, however, reverse the path-bgp order.

> 
> 15             (path ?scIntsec (path* rdf:rest) ??P24)
> 16             (bgp (triple ??P24 rdf:first ?superClass)))
> 
> and has 10000s of results to handle and table construct is attached later?
> 
> 
> 

Re: Efficient SPARQL query for RDF collections

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Okay, so comparing the same examples,

fast one:

   1 (base <http://example/base/>
   2   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
   3            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
   4            (skos: <http://www.w3.org/2004/02/skos/core#>)
   5            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
   6            (id: <http://example.com/id/>)
   7            (list: <http://jena.apache.org/ARQ/list#>))
   8     (project (?class ?classLabel)
   9       (sequence
  10         (sequence
  11           (bgp (triple ??P23 rdf:first id:308925008))
  12           (path ?scIntsec (path* rdf:rest) ??P23))
  13         (bgp
  14           (triple ?scTmp owl:intersectionOf ?scIntsec)
  15           (triple ?class rdfs:subClassOf ?scTmp)
  16           (triple ?class skos:prefLabel ?classLabel)
  17         )))))


slow one:

   1 (base <http://example/base/>
   2   (prefix ((owl: <http://www.w3.org/2002/07/owl#>)
   3            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
   4            (skos: <http://www.w3.org/2004/02/skos/core#>)
   5            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
   6            (id: <http://example.com/id/>)
   7            (list: <http://jena.apache.org/ARQ/list#>))
   8     (project (?class ?classLabel)
   9       (sequence
  10         (table (vars ?superClass)
  11           (row [?superClass id:308925008])
  12         )
  13         (sequence
  14           (sequence
  15             (path ?scIntsec (path* rdf:rest) ??P24)
  16             (bgp (triple ??P24 rdf:first ?superClass)))
  17           (bgp
  18             (triple ?scTmp owl:intersectionOf ?scIntsec)
  19             (triple ?class rdfs:subClassOf ?scTmp)
  20             (triple ?class skos:prefLabel ?classLabel)
  21           ))))))


The slow one has an unbound section which gets executed:

15             (path ?scIntsec (path* rdf:rest) ??P24)
16             (bgp (triple ??P24 rdf:first ?superClass)))

and so has tens of thousands of results to handle, with the table construct attached only later?



On 14/01/2021 16.35, Andy Seaborne wrote:
> http://www.sparql.org/query-validator.html
>
> and use the "SPARQL algebra (general optimizations)"
>
> and on the command line:
>
> qparse --print=opt


Re: Efficient SPARQL query for RDF collections

Posted by Andy Seaborne <an...@apache.org>.
To see the algebra a query is turned into, try

http://www.sparql.org/query-validator.html

and use the "SPARQL algebra (general optimizations)" output,

or on the command line:

qparse --print=opt

On 14/01/2021 10:53, Mikael Pesonen wrote:
> 
> Would be really helpful to know how variables are bound etc if there 
> exists a simple guide for that.
> 
> This query
> 
> SELECT count(?scIntsec)
> WHERE
> {
>    ?scIntsec rdf:first ?superClass .
> }
> 
> returns 851029 which must be the reason for slowness in my case. But I'm 
> still not getting why it matters and why doesn't
> 
> VALUES ?superClass  {id:308925008 }

Because that can be multiple values, and the optimizer doesn't make one
value a special case,

and then

?scIntsec rdf:rest*/rdf:first ?superClass .

is a compound expression (you could write it out and control the order 
if you want)

?X rdf:first ?superClass .
?scIntsec rdf:rest*  ?X .

and there is no local information that the other order is better. (ARQ
doesn't do much scheduling of paths and triple patterns.)
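
Written out in full, with the rdf:first pattern placed before the rdf:rest* path, the VALUES query from this thread might look like the sketch below. The prefix IRIs are the ones shown in the qparse output quoted in this thread, and ?cell is a variable name chosen here for illustration. It is untested against the poster's data; whether the written order survives optimization can be checked with qparse --print=opt.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX id:   <http://example.com/id/>

SELECT ?class ?classLabel
WHERE
{
  VALUES ?superClass { id:308925008 }

  # Find the list cells holding the class first (selective) ...
  ?cell rdf:first ?superClass .
  # ... then walk back along rdf:rest to the cells that reach them.
  ?scIntsec rdf:rest* ?cell .

  ?scTmp owl:intersectionOf ?scIntsec .
  ?class rdfs:subClassOf ?scTmp .
  ?class skos:prefLabel ?classLabel .
}
```

The intent is that the rdf:rest* path starts from the few cells bound by the first pattern rather than from every list cell in the data.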

     Andy



> 
> limit the search space.
> 
> 

Re: Efficient SPARQL query for RDF collections

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
It would be really helpful to know how variables are bound etc., if there
is a simple guide for that.

This query

SELECT (count(?scIntsec) AS ?count)
WHERE
{
   ?scIntsec rdf:first ?superClass .
}

returns 851029, which must be the reason for the slowness in my case. But I'm
still not getting why it matters, and why

VALUES ?superClass  {id:308925008 }

doesn't limit the search space.


On 13/01/2021 17.12, Steve Vestal wrote:
> I'm going to generalize this request a bit.  I earlier reported 
> restructuring a query to get from 25 minutes to 90 ms, where I found 
> the "Learning SPARQL" book to be helpful.  Do you know of any more 
> detailed tutorial or guidelines focusing on performance issues, 
> something that goes into a bit more detail about ordering, indexing, 
> interactions between different SPARQL features, and such?


Re: Efficient SPARQL query for RDF collections

Posted by Steve Vestal <st...@adventiumlabs.com>.
I'm going to generalize this request a bit.  I earlier reported 
restructuring a query to get from 25 minutes to 90 ms, where I found the 
"Learning SPARQL" book to be helpful.  Do you know of any more detailed 
tutorial or guidelines focusing on performance issues, something that 
goes into a bit more detail about ordering, indexing, interactions 
between different SPARQL features, and such?

On 1/13/2021 8:28 AM, Mikael Pesonen wrote:
>
> Related to this. This query returns in 70ms
>
> SELECT ?class ?classLabel
> WHERE
> {
>   ?scIntsec rdf:rest*/rdf:first id:308925008 .
>   ?scTmp owl:intersectionOf ?scIntsec .
>   ?class rdfs:subClassOf ?scTmp.
>   ?class skos:prefLabel ?classLabel .
> }
>
> but this doesn't finish in 15 minutes (aborted then)
>
> SELECT ?class ?classLabel
> WHERE
> {
>   VALUES ?superClass  {id:308925008 }
>   ?scIntsec rdf:rest*/rdf:first ?superClass .
>   ?scTmp owl:intersectionOf ?scIntsec .
>   ?class rdfs:subClassOf ?scTmp.
>   ?class skos:prefLabel ?classLabel .
> }
>
> How come they are so different since they do the same thing?


Re: Efficient SPARQL query for RDF collections

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
I wasn't even aware of such a thing. So

SELECT ?class ?classLabel
WHERE
{
   VALUES ?superClass  {id:308925008}

   ?scIntsec list:member ?superClass .
   ?scTmp owl:intersectionOf ?scIntsec .
   ?class rdfs:subClassOf ?scTmp .
   ?class skos:prefLabel ?classLabel .
}

runs in under 20 ms.
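
For the original goal of also reaching subclasses of the subclasses, the same pattern could be repeated one level down, as in the untested sketch below. The variable names ending in 2 are chosen here for illustration, the prefix IRIs are those shown in the qparse output in this thread (list: is the ARQ list namespace), and no timing is claimed for it.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX list: <http://jena.apache.org/ARQ/list#>
PREFIX id:   <http://example.com/id/>

SELECT ?class ?classLabel ?class2 ?class2Label
WHERE
{
  VALUES ?superClass { id:308925008 }

  # First level: classes whose intersection list contains ?superClass.
  ?scIntsec list:member ?superClass .
  ?scTmp owl:intersectionOf ?scIntsec .
  ?class rdfs:subClassOf ?scTmp .
  ?class skos:prefLabel ?classLabel .

  # Second level: classes whose intersection list contains ?class.
  ?scIntsec2 list:member ?class .
  ?scTmp2 owl:intersectionOf ?scIntsec2 .
  ?class2 rdfs:subClassOf ?scTmp2 .
  ?class2 skos:prefLabel ?class2Label .
}
```

As written, first-level classes that have no such subclasses of their own drop out of the results; wrapping the second-level patterns in OPTIONAL { } would keep them.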

Thanks!


On 14/01/2021 14.08, Andy Seaborne wrote:
> Have you tried using the property function "list:member"?
>
>     Andy


Re: Efficient SPARQL query for RDF collections

Posted by Andy Seaborne <an...@apache.org>.
Have you tried using the property function "list:member"?

     Andy

On 13/01/2021 14:28, Mikael Pesonen wrote:
> 
> Related to this. This query returns in 70ms
> 
> SELECT ?class ?classLabel
> WHERE
> {
>    ?scIntsec rdf:rest*/rdf:first id:308925008 .
>    ?scTmp owl:intersectionOf ?scIntsec .
>    ?class rdfs:subClassOf ?scTmp.
>    ?class skos:prefLabel ?classLabel .
> }
> 
> but this doesn't finish in 15 minutes (aborted then)
> 
> SELECT ?class ?classLabel
> WHERE
> {
>    VALUES ?superClass  {id:308925008 }
>    ?scIntsec rdf:rest*/rdf:first ?superClass .
>    ?scTmp owl:intersectionOf ?scIntsec .
>    ?class rdfs:subClassOf ?scTmp.
>    ?class skos:prefLabel ?classLabel .
> }
> 
> How come they are so different since they do the same thing?

Re: Efficient SPARQL query for RDF collections

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Related to this: this query returns in 70 ms

SELECT ?class ?classLabel
WHERE
{
   ?scIntsec rdf:rest*/rdf:first id:308925008 .
   ?scTmp owl:intersectionOf ?scIntsec .
   ?class rdfs:subClassOf ?scTmp.
   ?class skos:prefLabel ?classLabel .
}

but this one doesn't finish in 15 minutes (I aborted it then)

SELECT ?class ?classLabel
WHERE
{
   VALUES ?superClass  {id:308925008 }
   ?scIntsec rdf:rest*/rdf:first ?superClass .
   ?scTmp owl:intersectionOf ?scIntsec .
   ?class rdfs:subClassOf ?scTmp.
   ?class skos:prefLabel ?classLabel .
}

How come they are so different since they do the same thing?




Re: Efficient SPARQL query for RDF collections

Posted by Andy Seaborne <an...@apache.org>.

On 15/12/2020 14:33, Mikael Pesonen wrote:
> 
> Hi,
> 
> I am querying subclasses of class (here id:365852007) which belongs to 
> RDF collection like this
> 
> id:1 rdf:type owl:Class ;
>          rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ]  ;
>          skos:prefLabel "something"@en .
> id:2 rdf:type owl:Class ;
>          rdfs:subClassOf [ owl:intersectionOf ( ... id:365852007 ... ) ]  ;
>          skos:prefLabel "something else"@en .
> 
> 
> ... denotes random number of other elements in the list.
> 
> SPARQL:
> 
> select * where
> {
>      ?subclass rdfs:subClassOf [ owl:intersectionOf /rdf:rest*/rdf:first 
> id:365852007  ] .
>      ?subclass skos:prefLabel ?label .

Try breaking the pattern up:

     ?Z rdf:rest*/rdf:first id:365852007   .
     ?X owl:intersectionOf ?Z .
     ?subclass rdfs:subClassOf ?X .
     ?subclass skos:prefLabel ?label .

and use 3.17.0
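
As a complete query, the broken-up pattern might read as follows. This is a sketch: the prefix IRIs are the ones that appear in the qparse output elsewhere in this thread, and skos:prefLabel is used for the label, as in the original query.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX id:   <http://example.com/id/>

SELECT *
WHERE
{
  # List cells from which a cell holding the class is reachable.
  ?Z rdf:rest*/rdf:first id:365852007 .
  # The anonymous class whose intersection list starts at ?Z.
  ?X owl:intersectionOf ?Z .
  ?subclass rdfs:subClassOf ?X .
  ?subclass skos:prefLabel ?label .
}
```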

> }
> 
> 
> This is slow (~10 secs) and we need to extend the query to subclasses of 
> the subclasses and so on.
> 
> Is there any faster way to get this done?
> 
> Thanks!
>