Posted to users@jena.apache.org by "Wen, Chen" <cw...@regenstrief.org> on 2016/06/30 17:51:00 UTC

High CPU usage with Apache Jena Fuseki

Hi,
I am having a problem with fuseki-server. Every time I run an ontology-based query, or just click "count triples in all graphs", CPU usage goes to almost 100% and stays there. I have to terminate the process to bring CPU usage back down.

I have a customized config.ttl for tdb:
<#tdb>  rdf:type fuseki:Service ;
    fuseki:name              "tdb" ;             # http://host/inf
    fuseki:serviceQuery               "sparql" ;   # SPARQL query service
    fuseki:serviceQuery               "query" ;    # SPARQL query service (alt name)
    fuseki:serviceUpdate              "update" ;   # SPARQL update service
    fuseki:serviceUpload              "upload" ;   # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ;     # SPARQL Graph store protocol (read and write)
    # A separate read-only graph store endpoint:
    fuseki:serviceReadGraphStore      "get" ;      # SPARQL Graph store protocol (read only)
    fuseki:dataset           <#dataset2> ;       #select which set to
    .

tdb:GraphTDB    rdfs:subClassOf  ja:Model .

<#dataset2> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model2>;
    .

And I also increased JVM memory as below in fuseki-server.bat:
java -cp jena-tdb-3.1.0.jar:jena-arq-3.1.0.jar -Xms1g -Xmx15g -XX:NewSize=4g -XX:MaxNewSize=4g -XX:SurvivorRatio=8 -jar fuseki-server.jar %*

I have only 124 tuples loaded, and it works if I run a query without any specific criteria, such as:
select ?s ?p ?o
where
{
  ?s ?p ?o .
}
limit 100

However, if I run a simple ontology-specific query, CPU usage climbs and never recovers:
SELECT ?patient
WHERE
{
    ?patient <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://sample.org/dental-ontology/RIDO_0000083> .
}
limit 100

Am I missing anything? Can somebody advise?

Re: High CPU usage with Apache Jena Fuseki

Posted by Dave Reynolds <da...@gmail.com>.

One option to try would be to use the OWLMicro reasoner configuration: 
http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner
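
A minimal sketch of that change, assuming the <#model2> definition posted elsewhere in this thread with only the reasoner URL swapped, and the same prefixes as the existing config.ttl:

```turtle
# Sketch only: same InfModel as in the posted config, but using the
# OWLMicro rule reasoner instead of the full OWL rule reasoner.
<#model2> a ja:InfModel ;
    ja:baseModel
        [ a ja:MemoryModel ;
          ja:content [ ja:externalContent <file:///E:/sample-dental-ontology-rdfxml.owl> ] ] ;
    ja:reasoner
        [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] ;
    .
```

The dataset definition (ja:defaultGraph <#model2>) would stay as it is.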

If that is still too slow, and if your data is static, then
perform the inference ahead of time, store the inference closure and
serve that without further runtime inference.
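
As a rough sketch of the serving side, assuming the closure has already been computed offline and written to a file (the file name below is hypothetical, and the prefixes are those of the existing config.ttl), the model could be a plain in-memory model with no reasoner attached:

```turtle
# Sketch only: there is no ja:reasoner here. The file is assumed to already
# contain the base data plus all inferred triples.
<#model2> a ja:MemoryModel ;
    ja:content [ ja:externalContent <file:///E:/dental-closure.ttl> ] ;   # hypothetical file name
    .
```

Queries then run against stored triples only, so no inference work happens at query time.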

Dave

On 02/07/16 10:43, Andy Seaborne wrote:
> It is possible that the inference you are using is causing a lot of
> calculation.  That's driven by what's in sample-dental-ontology-rdfxml.owl.
>
>      Andy

Re: High CPU usage with Apache Jena Fuseki

Posted by Andy Seaborne <an...@apache.org>.

It is possible that the inference you are using is causing a lot of 
calculation.  That's driven by what's in sample-dental-ontology-rdfxml.owl.

	Andy

On 01/07/16 14:33, Wen, Chen wrote:
> Thank you Andy. This machine has 64G memory. Below is the model2 config. Do you see anything wrong?
>
> <#model2> a ja:InfModel;
>      ja:baseModel
>          [a ja:MemoryModel ;
>                   ja:content [ja:externalContent <file:///E:/sample-dental-ontology-rdfxml.owl>]] ;
>      ja:reasoner
>           [ ja:reasonerURL
>             <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>];
>      .

RE: High CPU usage with Apache Jena Fuseki

Posted by "Wen, Chen" <cw...@regenstrief.org>.

Thank you Andy. This machine has 64G memory. Below is the model2 config. Do you see anything wrong?

<#model2> a ja:InfModel;
    ja:baseModel
        [a ja:MemoryModel ;
                 ja:content [ja:externalContent <file:///E:/sample-dental-ontology-rdfxml.owl>]] ;
    ja:reasoner
         [ ja:reasonerURL 
           <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>];
    .


-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org] 
Sent: Thursday, June 30, 2016 4:35 PM
To: users@jena.apache.org
Subject: Re: High CPU usage with Apache Jena Fuseki


Re: High CPU usage with Apache Jena Fuseki

Posted by Andy Seaborne <an...@apache.org>.

On 30/06/16 18:51, Wen, Chen wrote:
> Hi,
> I am having a problem with fuseki-server. Every time I run an ontology-based query, or just click "count triples in all graphs", CPU usage goes to almost 100% and stays there. I have to terminate the process to bring CPU usage back down.
>
> I have a customized config.ttl for tdb:
> <#tdb>  rdf:type fuseki:Service ;
>      fuseki:name              "tdb" ;             # http://host/inf
>      fuseki:serviceQuery               "sparql" ;   # SPARQL query service
>      fuseki:serviceQuery               "query" ;    # SPARQL query service (alt name)
>      fuseki:serviceUpdate              "update" ;   # SPARQL update service
>      fuseki:serviceUpload              "upload" ;   # Non-SPARQL upload service
>      fuseki:serviceReadWriteGraphStore "data" ;     # SPARQL Graph store protocol (read and write)
>      # A separate read-only graph store endpoint:
>      fuseki:serviceReadGraphStore      "get" ;      # SPARQL Graph store protocol (read only)
>      fuseki:dataset           <#dataset2> ;       #select which set to
>      .
>
> tdb:GraphTDB    rdfs:subClassOf  ja:Model .
>
> <#dataset2> rdf:type ja:RDFDataset ;
>      ja:defaultGraph <#model2>;
>      .

Where is <#model2> defined in the config?

>
> And I also increased JVM memory as below in fuseki-server.bat:
> java -cp jena-tdb-3.1.0.jar:jena-arq-3.1.0.jar -Xms1g -Xmx15g -XX:NewSize=4g -XX:MaxNewSize=4g -XX:SurvivorRatio=8 -jar fuseki-server.jar %*

Only -jar is needed, not -cp; the JVM ignores the classpath option when -jar is given.

How big is the physical RAM in the machine?

If it is, say, 16G, then -Xmx15g is not a good idea, as it may force the OS
to swap the Java heap.

For TDB, much of the caching is off-heap, so -Xmx15g takes memory away from
that.  Allow 2G per TDB database + 2G for Fuseki.

> I have only 124 tuples loaded, and it works if I run a query without any specific criteria, such as:
> select ?s ?p ?o
> where
> {
>    ?s ?p ?o .
> }
> limit 100
>
> However, if I run a simple ontology-specific query, CPU usage climbs and never recovers:
> SELECT ?patient
> WHERE
> {
>      ?patient <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://sample.org/dental-ontology/RIDO_0000083> .
> }
> limit 100
>
> Am I missing anything? Can somebody advise?
>