Posted to dev@jena.apache.org by "Benjamin Geer (JIRA)" <ji...@apache.org> on 2016/01/19 13:23:39 UTC

[jira] [Comment Edited] (JENA-1121) Performance regression in Jena 3.0.1 / Fuseki 2.3.1

    [ https://issues.apache.org/jira/browse/JENA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106655#comment-15106655 ] 

Benjamin Geer edited comment on JENA-1121 at 1/19/16 12:23 PM:
---------------------------------------------------------------

Here are some shorter examples with various things removed. The original query above:

* 10 ms with Fuseki 2.3.0
* 8.6 s with Fuseki 2.3.1

Removing DISTINCT, ORDER BY, and GROUP BY:

* 12 ms with Fuseki 2.3.0
* 8.1 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    BIND(STR("de") AS ?preferredLanguage)
    BIND(STR("en") AS ?fallbackLanguage)

    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .

        OPTIONAL {
            ?resourceProperty rdfs:label ?preferredLanguageResourcePropertyLabel .
            FILTER (LANG(?preferredLanguageResourcePropertyLabel) = ?preferredLanguage) .
        }

        OPTIONAL {
            ?resourceProperty rdfs:label ?fallbackLanguageResourcePropertyLabel .
            FILTER (LANG(?fallbackLanguageResourcePropertyLabel) = ?fallbackLanguage) .
        }

        OPTIONAL {
            ?resourceProperty rdfs:label ?anyLanguageResourcePropertyLabel .
        }

        BIND(COALESCE(str(?preferredLanguageResourcePropertyLabel), str(?fallbackLanguageResourcePropertyLabel), str(?anyLanguageResourcePropertyLabel)) AS ?propertyLabel)

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Removing some of the inner OPTIONALs:

* 9 ms with Fuseki 2.3.0
* 4.8 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .

        OPTIONAL {
            ?resourceProperty rdfs:label ?propertyLabel .
        }

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Making the remaining inner OPTIONAL pattern non-optional:

* 9 ms with Fuseki 2.3.0
* 1.9 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
        ?resourceProperty rdfs:label ?propertyLabel .

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Removing the MINUSes as well (the inner pattern stays non-optional):

* 8 ms with Fuseki 2.3.0
* 10 ms with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
        ?resourceProperty rdfs:label ?propertyLabel .

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Keeping the inner OPTIONAL but removing the MINUSes:

* 2.8 s with Fuseki 2.3.0
* 2.8 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
    
        OPTIONAL {
            ?resourceProperty rdfs:label ?propertyLabel .
        }

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

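Summary: in these examples, Fuseki 2.3.0 is hundreds of times faster than Fuseki 2.3.1 whenever the MINUSes are present. Once the MINUSes are removed, the two versions perform about the same: both are fast when the inner pattern is non-optional, and both are slow (2.8 s) when the inner OPTIONAL is kept.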

Note that none of the MINUSes in these examples should actually remove any results (knora-base:isDeleted does not occur in the test data).
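
A minimal sketch of how that last point could be checked against the test data. This query is only an illustration and is not one of the measured queries above; it should return false if the predicate is indeed absent from the graph being queried:

{noformat}
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

# Does any triple use knora-base:isDeleted as its predicate?
ASK {
    ?subject knora-base:isDeleted ?deleted .
}
{noformat}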



was (Author: benjamingeer):
Here are some shorter examples with various things removed. The original query above:

* 10 ms with Fuseki 2.3.0
* 8.6 s with Fuseki 2.3.1

Removing DISTINCT, ORDER BY, and GROUP BY:

* 12 ms with Fuseki 2.3.0
* 8.1 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    BIND(STR("de") AS ?preferredLanguage)
    BIND(STR("en") AS ?fallbackLanguage)

    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .

        OPTIONAL {
            ?resourceProperty rdfs:label ?preferredLanguageResourcePropertyLabel .
            FILTER (LANG(?preferredLanguageResourcePropertyLabel) = ?preferredLanguage) .
        }

        OPTIONAL {
            ?resourceProperty rdfs:label ?fallbackLanguageResourcePropertyLabel .
            FILTER (LANG(?fallbackLanguageResourcePropertyLabel) = ?fallbackLanguage) .
        }

        OPTIONAL {
            ?resourceProperty rdfs:label ?anyLanguageResourcePropertyLabel .
        }

        BIND(COALESCE(str(?preferredLanguageResourcePropertyLabel), str(?fallbackLanguageResourcePropertyLabel), str(?anyLanguageResourcePropertyLabel)) AS ?propertyLabel)

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Removing some of the inner OPTIONALs:

* 9 ms with Fuseki 2.3.0
* 4.8 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .

        OPTIONAL {
            ?resourceProperty rdfs:label ?propertyLabel .
        }

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Making the remaining inner OPTIONAL pattern non-optional:

* 9 ms with Fuseki 2.3.0
* 1.9 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    MINUS {
        ?s knora-base:isDeleted true .
    }

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
        ?resourceProperty rdfs:label ?propertyLabel .

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)

        MINUS {
            ?resIri knora-base:isDeleted true .
        }
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Removing the MINUSes as well (the inner pattern stays non-optional):

* 8 ms with Fuseki 2.3.0
* 10 ms with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
        ?resourceProperty rdfs:label ?propertyLabel .

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Keeping the inner OPTIONAL but removing the MINUSes:

* 2.8 s with Fuseki 2.3.0
* 2.8 s with Fuseki 2.3.1

{noformat}
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix knora-base: <http://www.knora.org/ontology/knora-base#>

SELECT
    ?resourceIri
    ?resourceLabel
    ?match

WHERE {
    ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .

    OPTIONAL {
        ?s a ?valueObjectType .
        ?valueObjectType rdfs:subClassOf+ knora-base:Value .
        ?resIri ?resourceProperty ?s .
        ?s knora-base:valueHasString ?literal .
    
        OPTIONAL {
            ?resourceProperty rdfs:label ?propertyLabel .
        }

        BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?match)
    }

    BIND(COALESCE(?resIri, ?s) AS ?resourceIri)

    ?resourceIri a ?resourceClass .
    ?resourceClass rdfs:subClassOf+ knora-base:Resource .
    ?resourceIri rdfs:label ?resourceLabel .
}
{noformat}

Summary: in these examples, Fuseki 2.3.0 is hundreds of times faster than Fuseki 2.3.1, except when the MINUSes are removed, in which case both versions are slow.

Note that none of the MINUSes in these examples should actually remove any results (knora-base:isDeleted does not occur in the test data).


> Performance regression in Jena 3.0.1 / Fuseki 2.3.1
> ---------------------------------------------------
>
>                 Key: JENA-1121
>                 URL: https://issues.apache.org/jira/browse/JENA-1121
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Jena
>    Affects Versions: Jena 3.0.1, Fuseki 2.3.1, Jena 3.1.0, Fuseki 2.4.0
>         Environment: Mac OS X 10.10.5, iMac, 3.4 GHz Intel Core i7, 32 GB RAM
>            Reporter: Benjamin Geer
>            Priority: Critical
>              Labels: performance
>
> We seem to have encountered a severe performance regression in Jena 3.0.1 / Fuseki 2.3.1 as compared with Jena 3.0.0 / Fuseki 2.3.0. A number of our queries are running between 2 and 20 times slower. Here's one small example with configuration for Fuseki. With Fuseki 2.3.0, the query below takes about 200 milliseconds. With Fuseki 2.3.1, it takes 9 seconds. I've also tried it with the latest Fuseki snapshot (apache-jena-fuseki-2.4.0-20160117.183513-33.zip), and got the same result as with the 2.3.1 release.
> Here's the test data and configuration:
> https://www.dropbox.com/s/b9aepexij5e7noj/jena-performance-test.zip?dl=0
> Here's the query:
> {noformat}
> prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> prefix knora-base: <http://www.knora.org/ontology/knora-base#>
> SELECT DISTINCT
>     ?resourceIri
>     ?resourceLabel
>     (SAMPLE(?anyMatch) AS ?match)
> WHERE {
>     BIND(STR("de") AS ?preferredLanguage)
>     BIND(STR("en") AS ?fallbackLanguage)
>     ?s <http://jena.apache.org/text#query> 'Zeitglöcklein' .
>     MINUS {
>         ?s knora-base:isDeleted true .
>     }
>     OPTIONAL {
>         ?s a ?valueObjectType .
>         ?valueObjectType rdfs:subClassOf+ knora-base:Value .
>         ?resIri ?resourceProperty ?s .
>         ?s knora-base:valueHasString ?literal .
>         OPTIONAL {
>             ?resourceProperty rdfs:label ?preferredLanguageResourcePropertyLabel .
>             FILTER (LANG(?preferredLanguageResourcePropertyLabel) = ?preferredLanguage) .
>         }
>         OPTIONAL {
>             ?resourceProperty rdfs:label ?fallbackLanguageResourcePropertyLabel .
>             FILTER (LANG(?fallbackLanguageResourcePropertyLabel) = ?fallbackLanguage) .
>         }
>         OPTIONAL {
>             ?resourceProperty rdfs:label ?anyLanguageResourcePropertyLabel .
>         }
>         BIND(COALESCE(str(?preferredLanguageResourcePropertyLabel), str(?fallbackLanguageResourcePropertyLabel), str(?anyLanguageResourcePropertyLabel)) AS ?propertyLabel)
>         BIND(CONCAT(STR(?valueObjectType), "|", STR(?propertyLabel), "|", STR(?literal)) AS ?anyMatch)
>         MINUS {
>             ?resIri knora-base:isDeleted true .
>         }
>     }
>     BIND(COALESCE(?resIri, ?s) AS ?resourceIri)
>     ?resourceIri a ?resourceClass .
>     ?resourceClass rdfs:subClassOf+ knora-base:Resource .
>     ?resourceIri rdfs:label ?resourceLabel .
> }
> GROUP BY
>     ?resourceIri
>     ?resourceLabel
> ORDER BY ?resourceIri
> {noformat}
> Best regards,
> Benjamin Geer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)