Posted to dev@jena.apache.org by "Andy Seaborne (JIRA)" <ji...@apache.org> on 2014/10/01 12:54:34 UTC
[jira] [Closed] (JENA-779) Filter placement should be able to break up extend
[ https://issues.apache.org/jira/browse/JENA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andy Seaborne closed JENA-779.
------------------------------
> Filter placement should be able to break up extend
> --------------------------------------------------
>
> Key: JENA-779
> URL: https://issues.apache.org/jira/browse/JENA-779
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ, Optimizer
> Affects Versions: Jena 2.12.0
> Reporter: Rob Vesse
> Assignee: Andy Seaborne
> Fix For: Jena 2.12.1
>
> Attachments: JENA-779-filter-extend-extend, JENA-779-filter-extend_distinct.patch, JENA-779-single-extend.patch, JENA-779.patch
>
>
> The following query produces a query plan, seen internally, which is considered sub-optimal:
> {noformat}
> SELECT DISTINCT ?domainName
> {
> { ?uri ?p ?o }
> UNION
> {
> ?sub ?p ?uri
> FILTER(isIRI(?uri))
> }
> BIND(str(?uri) as ?s)
> FILTER(STRSTARTS(?s, "http://"))
> BIND(IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
> }
> {noformat}
> Which ARQ optimises as follows:
> {noformat}
> (distinct
> (project (?domainName)
> (filter (strstarts ?s "http://")
> (extend ((?s (str ?uri)) (?domainName (iri (concat "http://" (strbefore (substr ?s 8) "/")))))
> (union
> (bgp (triple ?uri ?p ?o))
> (filter (isIRI ?uri)
> (bgp (triple ?sub ?p ?uri))))))))
> {noformat}
> This makes the query engine do a lot of unnecessary work: it computes both {{BIND}} expressions for lots of possible solutions that the filter will then reject, when for many of them it would only be necessary to compute the first, simple {{BIND}} expression.
> It would be better if the query was planned as follows:
> {noformat}
> (distinct
> (project (?domainName)
> (extend (?domainName (iri (concat "http://" (strbefore (substr ?s 8) "/"))))
> (filter (strstarts ?s "http://")
> (extend (?s (str ?uri))
> (union
> (bgp (triple ?uri ?p ?o))
> (filter (isIRI ?uri)
> (bgp (triple ?sub ?p ?uri)))))))))
> {noformat}
> Essentially, when we try to push a filter through an {{extend}} and determine that we cannot push it all the way through, we should check whether we can split the {{extend}} instead, resulting in a partial push.
> Note that a user can re-write the original query to yield this plan if they make the second {{BIND}} a project expression like so:
> {noformat}
> SELECT DISTINCT (IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
> {
> { ?uri ?p ?o }
> UNION
> {
> ?sub ?p ?uri
> FILTER(isIRI(?uri))
> }
> BIND(str(?uri) as ?s)
> FILTER(STRSTARTS(?s, "http://"))
> }
> {noformat}
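
The splitting idea described in the issue can be sketched as a toy model. This is only an illustration under stated assumptions, not ARQ's optimizer code: the function name split_extend is hypothetical, assignments are plain (variable, expression-string) pairs rather than algebra objects, and variables are found by a simple textual match on "?name". Variables the filter uses that the extend does not assign are assumed to be bound further down the plan.

```python
import re

# Assumption: variables are recognised textually as "?name".
VAR = re.compile(r"\?\w+")


def split_extend(assignments, filter_expr):
    """Split an ordered list of (var, expr) assignments around a filter.

    Returns (below, above): `below` holds the assignments that must be
    evaluated before the filter, `above` those that can wait until after it.
    Hypothetical helper for illustration only.
    """
    needed = set(VAR.findall(filter_expr))
    assigned = {var for var, _ in assignments}
    # Only variables introduced by this extend constrain the split point.
    pending = needed & assigned
    below = []
    above = list(assignments)
    # Move assignments below the filter, in order, until every variable
    # the filter depends on has been defined.
    while pending and above:
        var, expr = above.pop(0)
        below.append((var, expr))
        pending.discard(var)
    return below, above


assignments = [
    ("?s", "(str ?uri)"),
    ("?domainName", '(iri (concat "http://" (strbefore (substr ?s 8) "/")))'),
]
below, above = split_extend(assignments, '(strstarts ?s "http://")')
print("below filter:", [v for v, _ in below])
print("above filter:", [v for v, _ in above])
```

For the query in this issue, the sketch keeps the ?s assignment below the filter and leaves the ?domainName assignment above it, which matches the shape of the preferred plan. If the filter uses none of the assigned variables, `below` is empty and the filter can go beneath the whole extend.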
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)