Posted to dev@marmotta.apache.org by Sergio Fernández <se...@salzburgresearch.at> on 2014/07/24 20:58:09 UTC

Fwd: SPARQL subset as a PATCH format for LDP

LDPath as influence to LDPatch?

Well, I'm still waiting to see what it contributes beyond what RDF Patch already does...

---------- Forwarded message ----------
From: Alexandre Bertails <al...@bertails.org>
Date: Jul 24, 2014 8:30 PM
Subject: SPARQL subset as a PATCH format for LDP
To: "public-ldp-wg@w3.org" <pu...@w3.org>
Cc: 

> All, 
>
> I have been thinking a lot about the SPARQL subset idea and I would 
> like to share some thoughts. As you could have expected from the last 
> call, I am not in favor of it, so I have taken the time to document my 
> issues with the approach. 
>
> First, let me remind you of the scope of LD Patch. It is a PATCH format
> for partial updates of LDP-RS. So it's only about RDF graphs. It is not
> intended for updating quad stores, nor named graphs. Also, it is not
> meant to be a high-level language but rather an assembly one. For that
> reason, the editors challenged themselves not to add higher-level
> features.
>
> Skolemization is not used. The assumption is that bnodes form tree 
> structures. The idea is that most of those trees (and the bnodes in 
> them) can be distinguished by filtering on sub-components of those 
> trees. I recommend [1] for a recent and thorough analysis confirming 
> those assumptions. 
>
> That is the very reason behind the LD Path (no 'c') algebra, which
> shares some similarities with XPath. Path steps are applied
> left-to-right, and recursively for path constraints. The semantics
> formally specifies the order in which those operations must be
> evaluated. So LDP application writers can rely on that semantics for
> runtime characteristics, for example by restricting the node sets as
> early as possible in the path, probably by starting from the leaves of
> the tree, and then moving up in the tree, until reaching the bnode.
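
[Editorial sketch, not part of the original message.] The left-to-right evaluation described above can be illustrated with a minimal Python sketch. It is not the LD Patch algebra itself: the toy graph, the operation names ("step", "reverse", "filter") and the helper functions are all assumptions made for illustration. It only shows how each operation narrows the current node set before the next one runs.

```python
# Toy graph: a set of (subject, predicate, object) triples.
# All identifiers below are illustrative assumptions.
TRIPLES = {
    ("ev1", "schema:url", "http://conferences.ted.com/TED2009/"),
    ("ev1", "schema:startDate", "2009-02-04"),
    ("ev1", "schema:location", "_:loc1"),
    ("_:loc1", "schema:name", "Long Beach, California"),
    ("ev2", "schema:url", "http://conferences.ted.com/TED2009/"),
    ("ev2", "schema:startDate", "2010-01-01"),
}


def step(nodes, predicate, reverse=False):
    """Follow one predicate from every node; reverse=True walks object -> subject."""
    if reverse:
        return {s for (s, p, o) in TRIPLES if p == predicate and o in nodes}
    return {o for (s, p, o) in TRIPLES if p == predicate and s in nodes}


def constrain(nodes, predicate, value):
    """Keep only the nodes that have `predicate` pointing at `value`."""
    return {n for n in nodes if (n, predicate, value) in TRIPLES}


def evaluate(start, path):
    """Apply the operations strictly in order, shrinking the node set as we go."""
    nodes = {start}
    for op in path:
        kind = op[0]
        if kind == "step":
            nodes = step(nodes, op[1])
        elif kind == "reverse":
            nodes = step(nodes, op[1], reverse=True)
        elif kind == "filter":
            nodes = constrain(nodes, op[1], op[2])
        else:
            raise ValueError(f"unknown operation: {kind}")
    return nodes


PATH = [
    ("reverse", "schema:url"),                     # both events match here
    ("filter", "schema:startDate", "2009-02-04"),  # narrowed to one event
    ("step", "schema:location"),
    ("filter", "schema:name", "Long Beach, California"),
]
result = evaluate("http://conferences.ted.com/TED2009/", PATH)
print(result)  # {'_:loc1'}
```

Because the order is fixed, the size of every intermediate node set is predictable from the path alone, with no optimiser involved.
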
>
> So, SPARQL. Yes, you can consider a subset with similar expressive 
> power. People seem to think that defining the concrete syntax would be 
> enough, and that it would be as easy if not easier than LD Patch. I 
> disagree. First, the two concrete syntaxes would share a lot of the 
> production rules, basically all the ones borrowed from Turtle. The 
> additional ones are no issue in both cases. 
>
> Then, I have heard people saying that we wouldn't need to write down
> the operational semantics, because we could say it's the same as
> SPARQL Update, but for that subset of the syntax. I disagree. Because
> as a developer and as a user, I would have to be sure I understand
> the SPARQL semantics well to either implement LD Patch (if I don't
> want to depend on an existing SPARQL implementation), or to use it. So
> I'd argue that the semantics _has_ to be written. And I'd have to
> reject valid SPARQL Update queries which are not in the subset.
>
> Another issue is that we will still need Basic Graph Patterns, the (S 
> P O .)-s in the WHERE clause, which rely on intermediate ResultSet-s 
> for their semantics. 
>
> For example: 
>
> Bind ?event <http://conferences.ted.com/TED2009/>
>   /-schema:url[/schema:startDate="2009-02-04"]
>   /schema:location[/schema:name="Long Beach, California"]
>     [/schema:geo[/schema:latitude][/schema:longitude]]
>
> would be equivalent to something like this:
>
> WHERE { 
>   ?event schema:url <http://conferences.ted.com/TED2009/> . 
>   ?event schema:startDate "2009-02-04" . 
>   ?event schema:location ?loc . 
>   ?loc schema:name "Long Beach, California" . 
>   ?loc schema:geo ?geo . 
>   ?geo schema:latitude [] . 
>   ?geo schema:longitude [] . 
> } 
>
> If we want the same performance characteristics (mainly, predictability),
> we would have to refine the SPARQL semantics so that the order of the
> clauses matters (i.e. no need to depend on a query optimiser). And we
> would need to do some static analysis on the query to make sure that
> ResultSet-s are not needed. In any case, it goes beyond the idea of
> using a subset of the syntax + a pointer to SPARQL Update semantics.
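
[Editorial sketch, not part of the original message.] To make the point about intermediate ResultSet-s concrete, here is a naive basic graph pattern evaluator in Python. It is an assumption-laden toy, not how any real SPARQL engine works: the function names, the data and the join strategy are all invented for illustration. It joins the patterns in the order given and records how many intermediate solutions exist after each one.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against the existing binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def eval_bgp(patterns, triples):
    """Join patterns in the given order; return (solutions, intermediate sizes)."""
    solutions = [{}]
    sizes = []
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
        sizes.append(len(solutions))
    return solutions, sizes


# Hypothetical data: three events have a location, only one has the URL.
TRIPLES = [
    ("ev1", "schema:url", "ted2009"),
    ("ev1", "schema:location", "loc1"),
    ("ev2", "schema:location", "loc2"),
    ("ev3", "schema:location", "loc3"),
]
URL = ("?event", "schema:url", "ted2009")
LOC = ("?event", "schema:location", "?loc")

selective_first, sizes_a = eval_bgp([URL, LOC], TRIPLES)
broad_first, sizes_b = eval_bgp([LOC, URL], TRIPLES)
print(sizes_a)  # [1, 1]
print(sizes_b)  # [3, 1]
```

Both orders give the same final answer, but the intermediate solution sets differ in size. Under SPARQL semantics the order is left to the implementation, which is why a subset aiming for predictable cost would have to say more than the syntax does.
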
>
> Another problem is the support for rdf:list. I have just finished 
> writing down the semantics for UpdateList and based on that 
> experience, I know this is something I want to rely on as a user, 
> because it is so easy to get it wrong, so I want native support for 
> it. And I don't think it is possible to do something equivalent in 
> SPARQL Update. That is a huge drawback, as list manipulation (e.g. in
> JSON-LD, or Turtle) is an everyday task.
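
[Editorial sketch, not part of the original message.] An rdf:List is a chain of cells linked by rdf:first and rdf:rest and terminated by rdf:nil, so even a simple append means rewiring triples by hand. The Python sketch below illustrates that; it is not the UpdateList semantics from the LD Patch draft, and the function names, the `ex:items` predicate and the blank node labels are assumptions.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def read_list(head, triples):
    """Walk rdf:first/rdf:rest links from `head` and return the items in order."""
    first = {s: o for (s, p, o) in triples if p == RDF_FIRST}
    rest = {s: o for (s, p, o) in triples if p == RDF_REST}
    items = []
    while head != RDF_NIL:
        items.append(first[head])
        head = rest[head]
    return items


def append_item(subject, predicate, item, triples, new_cell):
    """Return a new triple set with `item` appended to the list at (subject, predicate)."""
    triples = set(triples)
    head = next(o for (s, p, o) in triples if s == subject and p == predicate)
    if head == RDF_NIL:
        # Empty list: the link from the subject itself has to be replaced.
        triples.remove((subject, predicate, RDF_NIL))
        triples.add((subject, predicate, new_cell))
    else:
        # Non-empty list: find the last cell and redirect its rdf:rest.
        rest = {s: o for (s, p, o) in triples if p == RDF_REST}
        tail = head
        while rest[tail] != RDF_NIL:
            tail = rest[tail]
        triples.remove((tail, RDF_REST, RDF_NIL))
        triples.add((tail, RDF_REST, new_cell))
    triples.add((new_cell, RDF_FIRST, item))
    triples.add((new_cell, RDF_REST, RDF_NIL))
    return triples


# Hypothetical graph: <s> ex:items ( "a" "b" ) .
GRAPH = {
    ("s", "ex:items", "_:c0"),
    ("_:c0", RDF_FIRST, "a"), ("_:c0", RDF_REST, "_:c1"),
    ("_:c1", RDF_FIRST, "b"), ("_:c1", RDF_REST, RDF_NIL),
}
updated = append_item("s", "ex:items", "c", GRAPH, "_:c2")
print(read_list("_:c0", updated))  # ['a', 'b', 'c']
```

Forgetting the empty-list branch, or leaving the old rdf:rest triple in place, silently produces a malformed list, which is the kind of mistake a native list operation is meant to rule out.
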
>
> So to summarize my issues with the approach: 
>
> 1. semantics is not that easy to define 
> 2. performance characteristics 
> 3. no native support for rdf:list 
> 4. the need to explain to users how it differs from existing SPARQL
> Update
>
> SPARQL Update is good at doing what it was designed for, but there is 
> little interest in being syntax compatible with it. 
>
> Regards, 
>
> Alexandre 
>
> [1] http://www.websemanticsjournal.org/index.php/ps/article/view/365 
>

Re: Fwd: SPARQL subset as a PATCH format for LDP

Posted by Andy Seaborne <an...@apache.org>.
On 24/07/14 19:58, Sergio Fernández wrote:
> LDPath as influence to LDPatch?
>
> Well, I'm still waiting to see what it contributes beyond what RDF Patch already does...

RDF Patch is certainly at the lower, machine level, with a slant towards
processing concerns over "readability".  It combines with binary RDF [1],
which in my experiments parses at 560K triples/s, or 3x faster than I
can get N-Triples to go.

But these engineering issues are less of a concern for LDP which,
realistically, is slanted towards smaller data as the unit of change.

	Andy

[1] https://github.com/afs/rdf-thrift
