Posted to users@jena.apache.org by Rob Walpole <ro...@gmail.com> on 2013/01/29 19:11:14 UTC

Binding causes hang in Fuseki

Hi there,

The following SPARQL query responds quickly when entered into our Fuseki
endpoint...

DESCRIBE ?ancestor
WHERE
{
    BIND(URI("http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess") AS ?readyStatus)
    ?export rdfs:member ?member ;
            dri:username "rwalpole"^^xsd:string ;
            dri:exportStatus ?readyStatus .
    OPTIONAL
    {
        {
            SELECT ?deselected ?ancestor
            WHERE
            {
                BIND(URI("http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b") AS ?deselected)
                # get the ancestors of the deselected item
                ?deselected dri:parent+ ?ancestor .

                # get the ancestor that is a member of the export list
                FILTER EXISTS { ?export rdfs:member ?ancestor } .
            }
        }
    }
}

But when the ?deselected variable binding is moved out of the optional
section (as shown below) it grinds to a halt...

DESCRIBE ?ancestor
WHERE
{
    BIND(URI("http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess") AS ?readyStatus)
    BIND(URI("http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b") AS ?deselected)
    ?export rdfs:member ?member ;
            dri:username "dfreeman"^^xsd:string ;
            dri:exportStatus ?readyStatus .
    OPTIONAL
    {
        {
            SELECT ?deselected ?ancestor
            WHERE
            {
                # get the ancestors of the deselected item
                ?deselected dri:parent+ ?ancestor .

                # get the ancestor that is a member of the export list
                FILTER EXISTS { ?export rdfs:member ?ancestor } .
            }
        }
    }
}

Am I doing something wrong here?

Thanks
Rob

-- 

Rob Walpole
Email robkwalpole@gmail.com
Tel. +44 (0)7969 869881
Skype: RobertWalpole
http://www.linkedin.com/in/robwalpole

Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Andy,

Just to say this is a low priority issue for us. The query works fine for
what we need with the second binding in the sub query. It was more of a
heads up and trying to understand what was wrong with doing it the other
way.

Cheers
Rob



Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
On 01/02/13 17:44, Paul Gearon wrote:
> On Fri, Feb 1, 2013 at 12:21 PM, Andy Seaborne <an...@apache.org> wrote:
>
>> On 01/02/13 16:51, Paul Gearon wrote:
>>
>>>
>>> On Fri, Feb 1, 2013 at 10:38 AM, Andy Seaborne <andy@apache.org> wrote:
>>>
>>>      Nasty things RDF alt/bag/seq from the POV of a database.
>>>
>>>
>>> Could I get your perspective on this please? I accept that best practice
>>> abandoned them long ago, and RDF 1.1 is deprecating them. I also
>>> appreciate the mathematical elegance of the list structure. However, I
>>> don't understand why Containers are considered nasty... with the
>>> exception of the rdf:_nnn properties. Is that what you're referring to?
>>> Or is it something else?

>> It's the DB implications, not the modelling issues, I was poking at.
>>
>> ?x rdfs:member ?o
>>
>> is basically a whole DB  scan looking for rdf:_NNN
>>
>> Yuk.
>>
>
> I've always dealt with it in one of two ways:
>
> a) Look for rdfs:member (often not possible, but useful when available)
> b) Use an index that orders by IRI. Since most indexes are tree based, then
> this is easy enough. In Mulgara's case I created a "magic" predicate that
> could match IRIs by some prefix, then looked for the prefix of
> http://www.w3.org/1999/02/22-rdf-syntax-ns#_. We have a few range lookups
> in the literals, and IRIs are stored similarly, so the code to find and
> join this data to the triples already existed.

TDB indexes are 8 byte NodeId so they are not sorted by IRI.

But they could be :-)

By using the inline NodeId encodings, certain well-known IRIs could get 
sort form NodeIds (but this is a format change - old/current and new 
won't mix properly as they will not compare equal by NodeId).

rdf:_ could have its own inline NodeId space using, say, 32 bits for the 
number part.  Then they would be sorted by NodeId.

>
>
>> Even with ?x, need to scan for rdf:_1, rdf:_2, etc rdf:_99186
>> Could specially handle ... but Jena does not (ditto lists)
>>
>
> Well at least lists now have support with property paths in SPARQL.
>
> While property paths give you the tools you need to read and manipulate
> lists, there's still some work to be done for the programmer who uses them.
> We probably need to get out a tutorial on common list operations in SPARQL,
> since I find that to be a very common question.
>
>
>> A trie on IRIs?
>>
>
> The IRIs here all start with the same prefix, so I find that a simple
> ordered index is enough.
>
>
>
>> There are modelling-wise issues:
>>
>> Can add rdf:_1 to have two of them.  In a Seq ??!!?
>> Merge two seqs with the same URI and you get ... a mess.  Lists at least
>> will be bnodes.
>>
>
> Not easy at the modeling level no. Programming in the guts of the DB is
> easy enough, since you can just append one container to another, or
> whatever approach you choose. However, once the operations get abstracted
> up the stack (in SPARQL, for instance), then it becomes something between
> hard and intractable.
>
>
>> Practically:
>>
>> No Turtle support
>>
>> You/app writer see the horrible encoding.
>>
>
> I've started dropping RDF/XML support, so that's certainly an issue. I do
> like Turtle list syntax.

:-)

>> Really RDF needs a list-ish thing as a first class data type, not encoded
>> as triples.  If triples, when you work with the data, you see triples and
>> the app or library has to reconstruct.  But you can also have mal-formed
>> encodings in triples and the code has to cope.
>
>
> James Leigh proposed this before the new RDF WG was started. Working with
> Datomic right now (a non-RDF triple-store, which has no support for this
> either) has made it very clear how important first class collections would
> be. I ended up adopting RDF-like practices in that case.
>
> Paul
>

	Andy


Re: Binding causes hang in Fuseki

Posted by Paul Gearon <ge...@ieee.org>.
On Fri, Feb 1, 2013 at 12:21 PM, Andy Seaborne <an...@apache.org> wrote:

> On 01/02/13 16:51, Paul Gearon wrote:
>
>>
>> On Fri, Feb 1, 2013 at 10:38 AM, Andy Seaborne <andy@apache.org> wrote:
>>
>>     Nasty things RDF alt/bag/seq from the POV of a database.
>>
>>
>> Could I get your perspective on this please? I accept that best practice
>> abandoned them long ago, and RDF 1.1 is deprecating them. I also
>> appreciate the mathematical elegance of the list structure. However, I
>> don't understand why Containers are considered nasty... with the
>> exception of the rdf:_nnn properties. Is that what you're referring to?
>> Or is it something else?
>>
>> Regards,
>> Paul Gearon
>>
>
> (on list?)
>

Seemed a little off-topic, but sure...



> It's the DB implications, not the modelling issues, I was poking at.
>
> ?x rdfs:member ?o
>
> is basically a whole DB  scan looking for rdf:_NNN
>
> Yuk.
>

I've always dealt with it in one of two ways:

a) Look for rdfs:member (often not possible, but useful when available)
b) Use an index that orders by IRI. Since most indexes are tree based, then
this is easy enough. In Mulgara's case I created a "magic" predicate that
could match IRIs by some prefix, then looked for the prefix of
http://www.w3.org/1999/02/22-rdf-syntax-ns#_. We have a few range lookups
in the literals, and IRIs are stored similarly, so the code to find and
join this data to the triples already existed.
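
At the SPARQL level, option (b) can be approximated by matching the
predicate IRI against the rdf:_ prefix. This is a minimal sketch, assuming
the standard SPARQL 1.1 string functions are available. Unlike an ordered
index, most engines will probably answer it by scanning every triple, so it
illustrates the match rather than the speed-up:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Find container membership triples by testing whether the predicate IRI
# starts with the rdf:_ prefix (rdf:_1, rdf:_2, ...).
SELECT ?container ?p ?member
WHERE
{
    ?container ?p ?member .
    FILTER(STRSTARTS(STR(?p), "http://www.w3.org/1999/02/22-rdf-syntax-ns#_"))
}
```

STRSTARTS only checks the prefix, so it would also accept a malformed
predicate such as rdf:_abc; a REGEX on the numeric suffix would be stricter.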


> Even with ?x, need to scan for rdf:_1, rdf:_2, etc rdf:_99186
> Could specially handle ... but Jena does not (ditto lists)
>

Well at least lists now have support with property paths in SPARQL.

While property paths give you the tools you need to read and manipulate
lists, there's still some work to be done for the programmer who uses them.
We probably need to get out a tutorial on common list operations in SPARQL,
since I find that to be a very common question.
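
As one example of the kind of list operation meant here, a property path
can collect every member of an RDF list. This is a sketch only: the ex:
namespace, ex:thing and ex:items are hypothetical names chosen for
illustration.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>    # hypothetical namespace

# Walk zero or more rdf:rest links from the head of the list, then take
# rdf:first at each cell to reach the member stored there.
SELECT ?item
WHERE
{
    ex:thing ex:items ?list .
    ?list rdf:rest*/rdf:first ?item .
}
```

The results are not returned in list order; recovering the position of each
member takes extra work, which is part of why a tutorial would be useful.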


> A trie on IRIs?
>

The IRIs here all start with the same prefix, so I find that a simple
ordered index is enough.



> There are modelling-wise issues:
>
> Can add rdf:_1 to have two of them.  In a Seq ??!!?
> Merge two seqs with the same URI and you get ... a mess.  Lists at least
> will be bnodes.
>

Not easy at the modeling level no. Programming in the guts of the DB is
easy enough, since you can just append one container to another, or
whatever approach you choose. However, once the operations get abstracted
up the stack (in SPARQL, for instance), then it becomes something between
hard and intractable.


> Practically:
>
> No Turtle support
>
> You/app writer see the horrible encoding.
>

I've started dropping RDF/XML support, so that's certainly an issue. I do
like Turtle list syntax.



> Really RDF needs a list-ish thing as a first class data type, not encoded
> as triples.  If triples, when you work with the data, you see triples and
> the app or library has to reconstruct.  But you can also have mal-formed
> encodings in triples and the code has to cope.


James Leigh proposed this before the new RDF WG was started. Working with
Datomic right now (a non-RDF triple-store, which has no support for this
either) has made it very clear how important first class collections would
be. I ended up adopting RDF-like practices in that case.

Paul

Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
On 01/02/13 15:07, Rob Walpole wrote:
> Hi Andy,
>
> On Fri, Feb 1, 2013 at 12:16 PM, Andy Seaborne <an...@apache.org> wrote:
>
>> Rob - I notice you use rdfs:member which is a calculated property in ARQ.
>>
>> There is a bug in ARQ (JENA-340) which means this isn't handled as a
>> calculated property inside a FILTER.
>
>
>> You have:
>>
>> FILTER EXISTS { ?export rdfs:member ?ancestor } .
>>
>
> On its own this doesn't seem to cause us any problems... perhaps because
> we are specifically inserting these triples into TDB rather than relying on
> them being calculated?

good - it checks for both cases - explicit and implicit

Nasty things RDF alt/bag/seq from the POV of a database.

	Andy

>
> Thanks
> Rob
>


Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Hi Andy,

On Fri, Feb 1, 2013 at 12:16 PM, Andy Seaborne <an...@apache.org> wrote:

> Rob - I notice you use rdfs:member which is a calculated property in ARQ.
>
> There is a bug in ARQ (JENA-340) which means this isn't handled as a
> calculated property inside a FILTER.


> You have:
>
> FILTER EXISTS { ?export rdfs:member ?ancestor } .
>

On its own this doesn't seem to cause us any problems... perhaps because
we are specifically inserting these triples into TDB rather than relying on
them being calculated?

Thanks
Rob

Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
Rob - I notice you use rdfs:member which is a calculated property in ARQ.

There is a bug in ARQ (JENA-340) which means this isn't handled as a 
calculated property inside a FILTER.

You have:
FILTER EXISTS { ?export rdfs:member ?ancestor } .

This probably does not relate to the issue (I'm still looking).

	Andy




Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Thanks Andy... answers below...

> Could you send complete, unbroken queries please?  I fixed up the last ones.
>

PREFIX api: <http://purl.org/linked-data/api/vocab#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dri: <http://nationalarchives.gov.uk/terms/dri#>
PREFIX elda: <http://www.epimorphics.com/vocabularies/lda#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX spec: <http://nationalarchives.gov.uk/terms/cat#>
PREFIX tna: <http://nationalarchives.gov.uk/vocab#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DESCRIBE ?ancestor
WHERE
{
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> AS ?readyStatus)
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b> AS ?deselected)
    ?export rdfs:member ?member ;
            dri:username "dfreeman"^^xsd:string ;
            dri:exportStatus ?readyStatus .
    OPTIONAL
    {
        # get the ancestors of the deselected item
        ?deselected (dri:parent)+ ?ancestor .
        # get the ancestor that is a member of the export list
        FILTER EXISTS { ?export rdfs:member ?ancestor } .
    }
}


>
> What's the data? How much etc?
>

About 9m triples stored in embedded TDB - equates to approx 1GB data.


> Which version of Fuseki are you running?
>

0.2.5


> What's the config?
>

Fuseki is run using the following command:

/opt/jena-fuseki-0.2.5/fuseki-server
--loc=/opt/jena-fuseki-0.2.5/Data/clean/db --mgtPort=3131 --update
/catalogue

No special config other than this.


> How much heap?
>
>
Looks like the default Xmx1200 is being set within fuseki-server script -
we don't override this.

So I stopped and restarted Fuseki and then ran the above query which
resulted in the attached stack trace after about 5 mins.

Many thanks
Rob


Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
Could you send complete, unbroken queries please?  I fixed up the last ones.

What's the data? How much etc?

	Andy



Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
Unclear - queries [23] and [25] are still running, maybe others.

The fact that query [34] sees GC problems does not mean [34] is the real 
cause.

Which version of Fuseki are you running?
What's the config?
How much heap?

	Andy

Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Further update... The Fuseki console is showing an OutOfMemoryError that
relates to this query which I have attached.



Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Ok, just looking at this problem again. It seems that even without the
nested select the binding still causes a problem in the main body of the
query. So...

DESCRIBE ?ancestor
WHERE
{
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> AS ?readyStatus)
    ?export rdfs:member ?member ;
            dri:username "dfreeman"^^xsd:string ;
            dri:exportStatus ?readyStatus .
    OPTIONAL
    {
        BIND(<http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b> AS ?deselected)
        # get the ancestors of the deselected item
        ?deselected (dri:parent)+ ?ancestor .
        # get the ancestor that is a member of the export list
        FILTER EXISTS { ?export rdfs:member ?ancestor } .
    }
}

...returns in seconds whereas...

DESCRIBE ?ancestor
WHERE
{
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> AS ?readyStatus)
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b> AS ?deselected)
    ?export rdfs:member ?member ;
            dri:username "dfreeman"^^xsd:string ;
            dri:exportStatus ?readyStatus .
    OPTIONAL
    {
        # get the ancestors of the deselected item
        ?deselected (dri:parent)+ ?ancestor .
        # get the ancestor that is a member of the export list
        FILTER EXISTS { ?export rdfs:member ?ancestor } .
    }
}

...just hangs. Any more thoughts?

Thanks
Rob
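
One variant that may be worth trying, given that the slow form binds
?deselected outside the OPTIONAL: write the two IRIs directly into the
patterns instead of binding them to variables, so the property path always
starts from a constant. This is only a sketch and has not been run against
this dataset. The prefixes are the ones used elsewhere in this thread, and
whether ARQ in Fuseki 0.2.5 evaluates it quickly is an assumption.

```sparql
PREFIX dri:  <http://nationalarchives.gov.uk/terms/dri#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

DESCRIBE ?ancestor
WHERE
{
    ?export rdfs:member ?member ;
            dri:username "dfreeman"^^xsd:string ;
            dri:exportStatus <http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> .
    OPTIONAL
    {
        # start the path from the constant item IRI, so the engine never
        # has to enumerate every (?deselected, ?ancestor) pair
        <http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b> dri:parent+ ?ancestor .
        # keep only ancestors that are members of the export list
        FILTER EXISTS { ?export rdfs:member ?ancestor }
    }
}
```

Because the query only describes ?ancestor, dropping ?readyStatus and
?deselected as variables should not change what is returned, but that is
worth checking against the real data.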



Re: Binding causes hang in Fuseki

Posted by Rob Walpole <ro...@gmail.com>.
Cool, thanks guys, will give this a try tomorrow :-)

Rob


On Tue, Jan 29, 2013 at 7:36 PM, Andy Seaborne <an...@apache.org> wrote:

> On 29/01/13 18:21, Alexander Dutton wrote:
>
>>
>> Hi Rob,
>>
>> On 29/01/13 18:11, Rob Walpole wrote:
>>
>>> Am I doing something wrong here?
>>>
>>
>> The short answer is that the inner SELECT is evaluated first, leading to
>> the results being calculated in the second case in a rather inefficient
>> way.
>>
>> In the first inner SELECT ?deselected is bound, so it's quite quick to
>> find all its ancestors.
>>
>> In the second, all possible ?deselected and ?ancestor pairs are returned
>> by the inner query, which are then (effectively) filtered to remove all
>> the pairs where ?deselected isn't whatever it was BINDed to.
>>
>> Here's more from the spec:
>> <http://www.w3.org/TR/sparql11-query/#subqueries>.
>>
>> I /think/ ARQ is able to perform some optimisations along these lines,
>> but obviously not for your query.
>>
>
> Spot on.
>
> If you remove the inner SELECT it should do better.
>
>
>
>   { BIND(... AS ?readyStatus)
>     BIND(... AS ?deselected)
>     ?export rdfs:member ?member .
>     ?export dri:username "rwalpole"^^xsd:string .
>     ?export dri:exportStatus ?readyStatus
>     OPTIONAL
>       { ?deselected (dri:parent)+ ?ancestor
>
>         FILTER EXISTS {?export rdfs:member ?ancestor }
>       }
>   }
>
> but technically this is a different query so it'll depend on your data as
> to whether it is right.
>
> http://www.sparql.org/query-validator.html
>
>         Andy
>
>
>
>> Best regards,
>>
>> Alex
>>
>> PS. You don't need to do URI("http://..."); you can do a straight IRI
>> literal: <http://...>
>>
>> - --
>> Alexander Dutton
>> Developer, Office of the CIO; data.ox.ac.uk, OxPoints
>> IT Services, University of Oxford
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.13 (GNU/Linux)
>> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>>
>> iQEcBAEBAgAGBQJRCBMZAAoJEPotabD1ANF7Fb0H/jeCedjfCIuhI2KTNETOcrVR
>> Gvl8N4k9ty4AN4F0xFKA3kcGCTR2CIpgz/hez6BM5s8mDqLc7ViNPXWxbUhb4kHh
>> fxVuuoYBr13VhGnyufvWFliFeT3xSVLO3eDUilzoja2pvH/Cx/sNQvcHbi2Ee+EX
>> MoWLyfSvtSGY2rXDmMAXvBz49wgk42mC2Bsr5ptNUfXWQjzz6BXp5SxTKADySBXG
>> Tm/DmqGRclHxw233I6EcB9lKfDytTosVugH1Yl0BGEHiFPL2/wkkB+AZiLIwCmb/
>> cy+Y8/I9PlD4onvYlDMRmP169HQVYt849Skx5/TnTyjMBBNIgQiE8+cj0a/oDc8=
>> =ZQec
>> -----END PGP SIGNATURE-----
>>
>>
>


-- 

Rob Walpole
Email robkwalpole@gmail.com
Tel. +44 (0)7969 869881
Skype: RobertWalpole
http://www.linkedin.com/in/robwalpole

Re: Binding causes hang in Fuseki

Posted by Andy Seaborne <an...@apache.org>.
On 29/01/13 18:21, Alexander Dutton wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Rob,
>
> On 29/01/13 18:11, Rob Walpole wrote:
>> Am I doing something wrong here?
>
> The short answer is that the inner SELECT is evaluated first, leading to
> the results being calculated in the second case in a rather inefficient way.
>
> In the first inner SELECT ?deselected is bound, so it's quite quick to
> find all its ancestors.
>
> In the second, all possible ?deselected and ?ancestor pairs are returned
> by the inner query, which are then (effectively) filtered to remove all
> the pairs where ?deselected isn't whatever it was BINDed to.
>
> Here's more from the spec:
> <http://www.w3.org/TR/sparql11-query/#subqueries>.
>
> I /think/ ARQ is able to perform some optimisations along these lines,
> but obviously not for your query.

Spot on.

If you remove the inner SELECT it should do better.



   { BIND(... AS ?readyStatus)
     BIND(... AS ?deselected)
     ?export rdfs:member ?member .
     ?export dri:username "rwalpole"^^xsd:string .
     ?export dri:exportStatus ?readyStatus
     OPTIONAL
       { ?deselected (dri:parent)+ ?ancestor
         FILTER EXISTS {?export rdfs:member ?ancestor }
       }
   }

but technically this is a different query so it'll depend on your data 
as to whether it is right.
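
To make the difference concrete, here is a sketch of the full rewritten query,
with the elided BIND arguments filled in from the original post. The dri:
namespace below is a placeholder (the real one is not given in this thread),
and the IRIs are written as plain IRI literals rather than URI("...") calls.
As far as I can tell, one reason it may not be equivalent is that, without the
sub-SELECT, the ?export inside FILTER EXISTS is the same variable as the outer
?export, whereas inside the sub-SELECT it was local to the subquery and could
match any export list.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX dri:  <http://example.org/dri#>   # placeholder namespace (assumption)

DESCRIBE ?ancestor
WHERE
{
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> AS ?readyStatus)
    BIND(<http://nationalarchives.gov.uk/dri/catalogue/item/c2433752-b1e5-44e4-a271-c36d29aa6a3b> AS ?deselected)

    ?export rdfs:member ?member ;
            dri:username "rwalpole"^^xsd:string ;
            dri:exportStatus ?readyStatus .

    OPTIONAL
    {
        # ?deselected is already bound here, so the path starts from one item
        ?deselected dri:parent+ ?ancestor .

        # ?export is the outer variable: only ancestors in *this* export list match
        FILTER EXISTS { ?export rdfs:member ?ancestor }
    }
}
```

If the ancestor only has to be a member of some export list, rather than the
one matched by the outer pattern, the sub-SELECT form from the original post
is the one that expresses that.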

http://www.sparql.org/query-validator.html

	Andy

>
> Best regards,
>
> Alex
>
> PS. You don't need to do URI("http://..."); you can do a straight IRI
> literal: <http://...>
>
> - --
> Alexander Dutton
> Developer, Office of the CIO; data.ox.ac.uk, OxPoints
> IT Services, University of Oxford
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.13 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQEcBAEBAgAGBQJRCBMZAAoJEPotabD1ANF7Fb0H/jeCedjfCIuhI2KTNETOcrVR
> Gvl8N4k9ty4AN4F0xFKA3kcGCTR2CIpgz/hez6BM5s8mDqLc7ViNPXWxbUhb4kHh
> fxVuuoYBr13VhGnyufvWFliFeT3xSVLO3eDUilzoja2pvH/Cx/sNQvcHbi2Ee+EX
> MoWLyfSvtSGY2rXDmMAXvBz49wgk42mC2Bsr5ptNUfXWQjzz6BXp5SxTKADySBXG
> Tm/DmqGRclHxw233I6EcB9lKfDytTosVugH1Yl0BGEHiFPL2/wkkB+AZiLIwCmb/
> cy+Y8/I9PlD4onvYlDMRmP169HQVYt849Skx5/TnTyjMBBNIgQiE8+cj0a/oDc8=
> =ZQec
> -----END PGP SIGNATURE-----
>


Re: Binding causes hang in Fuseki

Posted by Alexander Dutton <al...@it.ox.ac.uk>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Rob,

On 29/01/13 18:11, Rob Walpole wrote:
> Am I doing something wrong here?

The short answer is that the inner SELECT is evaluated first, leading to
the results being calculated in the second case in a rather inefficient way.

In the first inner SELECT ?deselected is bound, so it's quite quick to
find all its ancestors.

In the second, all possible ?deselected and ?ancestor pairs are returned
by the inner query, which are then (effectively) filtered to remove all
the pairs where ?deselected isn't whatever it was BINDed to.

Here's more from the spec:
<http://www.w3.org/TR/sparql11-query/#subqueries>.

I /think/ ARQ is able to perform some optimisations along these lines,
but obviously not for your query.
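
One way to see the cost, as a sketch: run the inner SELECT from the slow query
on its own. The dri: namespace here is a placeholder, since the real one is
not given in this thread. With nothing fixing ?deselected, the path has to be
walked from every item, so this counts every (item, ancestor) pair in the data
and may itself be slow on a large store.

```sparql
PREFIX dri: <http://example.org/dri#>   # placeholder namespace (assumption)

# The slow query's inner pattern, evaluated in isolation.
# ?deselected is unbound here, so every (item, ancestor) pair is counted.
SELECT (COUNT(*) AS ?pairs)
WHERE
{
    ?deselected dri:parent+ ?ancestor .
}
```

Replacing ?deselected with the item's IRI gives the size of the fast case for
comparison.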

Best regards,

Alex

PS. You don't need to do URI("http://..."); you can do a straight IRI
literal: <http://...>
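
For example, the status triple can take the IRI directly in object position.
This is a sketch only: the dri: namespace is a placeholder (the real one is
not given in this thread) and the SELECT projection is just for illustration.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX dri:  <http://example.org/dri#>   # placeholder namespace (assumption)

SELECT ?export ?member
WHERE
{
    # The IRI is written inline instead of BIND(URI("...") AS ?readyStatus)
    ?export rdfs:member ?member ;
            dri:username "rwalpole"^^xsd:string ;
            dri:exportStatus <http://nationalarchives.gov.uk/dri/catalogue/exportStatus/ReadyToProcess> .
}
```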

- -- 
Alexander Dutton
Developer, Office of the CIO; data.ox.ac.uk, OxPoints
IT Services, University of Oxford
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRCBMZAAoJEPotabD1ANF7Fb0H/jeCedjfCIuhI2KTNETOcrVR
Gvl8N4k9ty4AN4F0xFKA3kcGCTR2CIpgz/hez6BM5s8mDqLc7ViNPXWxbUhb4kHh
fxVuuoYBr13VhGnyufvWFliFeT3xSVLO3eDUilzoja2pvH/Cx/sNQvcHbi2Ee+EX
MoWLyfSvtSGY2rXDmMAXvBz49wgk42mC2Bsr5ptNUfXWQjzz6BXp5SxTKADySBXG
Tm/DmqGRclHxw233I6EcB9lKfDytTosVugH1Yl0BGEHiFPL2/wkkB+AZiLIwCmb/
cy+Y8/I9PlD4onvYlDMRmP169HQVYt849Skx5/TnTyjMBBNIgQiE8+cj0a/oDc8=
=ZQec
-----END PGP SIGNATURE-----