Posted to users@jena.apache.org by Chris Tomlinson <ch...@gmail.com> on 2019/05/16 17:35:16 UTC
question about RDFPatch headers
Hi,
We’re building an editing service for our RDF Linked Data Service and are thinking of using at least some of the features of RDFPatch/RDFDelta.
We use named graphs for the various Entities that we model: Works, Persons, Places, Lineages and so on. We want to include in the patch some headers indicating which graphs the patch updates and which graphs it creates. This would give the editing service easy access to that information without analyzing the patch or doing other work to discover what’s being created and so on.
At first we thought of using a couple of keywords like graph and create:
H graph <http://purl.bdrc.io/graph/0686c69d-8f89-4496-acb5-744f0157a8db> .
H graph <http://purl.bdrc.io/graph/3ee0eca0-6d5f-4b4d-85db-f69ab1167eb1> .
H create <http://purl.bdrc.io/graph/b1167eb1-85db-4b4d-6d5f-3ee0eca0f69a> .
H create <http://purl.bdrc.io/graph/0157a8db-acb5-4496-8f89-0686c69d744f> .
H id … .
but org.seaborne.patch.PatchHeader uses a Map, so we can have only one H graph … and one H create … in the patch. We’ve considered two alternatives. The first is to use a String of comma-separated graphIds:
H graph "0686c69d-8f89-4496-acb5-744f0157a8db , 3ee0eca0-6d5f-4b4d-85db-f69ab1167eb1" .
H create "b1167eb1-85db-4b4d-6d5f-3ee0eca0f69a , 0157a8db-acb5-4496-8f89-0686c69d744f" .
which is plausible, but in some cases the list of graphIds could become quite long, so very large strings could become an issue down the line.
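To make the comma-separated alternative concrete, here is a minimal sketch of how an editing service might split such a header value back into graphIds and rebuild it. It uses only the Java standard library. GraphIdList and its methods are hypothetical names for illustration and are not part of RDFPatch; how the string value is obtained from the patch header is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of RDFPatch: handles a comma-separated list of graphIds. */
final class GraphIdList {
    private GraphIdList() {}

    /** Splits the header string value on commas, trimming whitespace and skipping empty entries. */
    static List<String> parse(String headerValue) {
        List<String> ids = new ArrayList<>();
        for (String part : headerValue.split(",")) {
            String id = part.trim();
            if (!id.isEmpty()) {
                ids.add(id);
            }
        }
        return ids;
    }

    /** Joins graphIds using the " , " separator shown in the example headers. */
    static String format(List<String> ids) {
        return String.join(" , ", ids);
    }
}
```

The whole list still travels as one string, so the concern about very large values remains; the sketch only shows that reading the list back is cheap.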
A second idea was to add the notion of a preamble to the patch using PS, for preamble start, and PE, for preamble end, which would separate our extensions from the defined RDFPatch structure:
PS .
H graph <http://purl.bdrc.io/graph/0686c69d-8f89-4496-acb5-744f0157a8db> .
H graph <http://purl.bdrc.io/graph/3ee0eca0-6d5f-4b4d-85db-f69ab1167eb1> .
H create <http://purl.bdrc.io/graph/b1167eb1-85db-4b4d-6d5f-3ee0eca0f69a> .
H create <http://purl.bdrc.io/graph/0157a8db-acb5-4496-8f89-0686c69d744f> .
PE .
H id … .
TX .
…
We would then pre-parse the patch payload up to the PE and submit the remainder to RDFPatch, and so on.
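The pre-parse step could be sketched roughly as follows, assuming the preamble, when present, starts on the first line with "PS ." and ends at the first "PE ." line. PreambleSplitter and Split are hypothetical names, and PS/PE are the proposed extension from this message, not part of RDFPatch. The code only separates the text; handing the remainder to the RDFPatch reader is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-parser for the proposed PS/PE preamble; not part of RDFPatch. */
final class PreambleSplitter {
    private PreambleSplitter() {}

    /** Preamble lines (without PS/PE) and the rest of the payload. */
    static final class Split {
        final List<String> preamble;
        final String remainder;

        Split(List<String> preamble, String remainder) {
            this.preamble = preamble;
            this.remainder = remainder;
        }
    }

    static Split split(String payload) {
        String[] lines = payload.split("\\R");
        List<String> preamble = new ArrayList<>();
        // No preamble: pass the payload through unchanged.
        if (lines.length == 0 || !lines[0].trim().equals("PS .")) {
            return new Split(preamble, payload);
        }
        int i = 1;
        while (i < lines.length && !lines[i].trim().equals("PE .")) {
            preamble.add(lines[i].trim());
            i++;
        }
        if (i == lines.length) {
            throw new IllegalArgumentException("PS without matching PE");
        }
        // Everything after PE is the ordinary patch, starting with its H rows.
        StringBuilder rest = new StringBuilder();
        for (int j = i + 1; j < lines.length; j++) {
            rest.append(lines[j]).append('\n');
        }
        return new Split(preamble, rest.toString());
    }
}
```

Because the preamble is removed before the standard reader sees the payload, repeated H graph and H create rows inside it never reach the header Map.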
A third possibility is to consider some extension to RDFPatch to use a different signature for the Map in PatchHeader. This seems rather involved.
So we’re asking what approaches others might have taken for this sort of use-case or how best to accommodate this in RDFPatch as is.
Thanks very much,
Chris
Re: question about RDFPatch headers
Posted by Chris Tomlinson <ch...@gmail.com>.
Hi Andy,
We appreciate the ideas. If we go with a linking approach we’ll have to add some more machinery, which is fine.
Thanks again,
Chris
> On May 17, 2019, at 10:40 AM, Andy Seaborne <an...@apache.org> wrote:
>
> Hi Chris,
>
> If the "meta" part becomes complicated, it might be better to put a link in the header that goes to another file. There is a balance to be struck between arbitrary structures and simple processing.
>
> It does make some sense to have a multi-valued patch header.
>
> (all one line with or without separating comma)
>
> H graph <http://purl.bdrc.io/graph/0686c69d-8f89-4496-acb5-744f0157a8db> <http://purl.bdrc.io/graph/3ee0eca0-6d5f-4b4d-85db-f69ab1167eb1> .
>
> Having the header as a map makes mixing header entries and reprocessing them work better. No assumed order creeps in and no confusion about duplicates for things that must be unique (like id). c.f. HTTP headers.
>
> If you think the metadata is going to get large, then a link to elsewhere may be better for other reasons like using the metadata without needing to access the patch in the log.
>
> Andy
Re: question about RDFPatch headers
Posted by Andy Seaborne <an...@apache.org>.
Hi Chris,
If the "meta" part becomes complicated, it might be better to put a link
in the header that goes to another file. There is a balance to be struck
between arbitrary structures and simple processing.
It does make some sense to have a multi-valued patch header.
(all one line with or without separating comma)
H graph <http://purl.bdrc.io/graph/0686c69d-8f89-4496-acb5-744f0157a8db> <http://purl.bdrc.io/graph/3ee0eca0-6d5f-4b4d-85db-f69ab1167eb1> .
Having the header as a map makes mixing header entries and reprocessing
them work better. No assumed order creeps in and no confusion about
duplicates for things that must be unique (like id). c.f. HTTP headers.
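As an illustration of the multi-valued idea, the sketch below reads header lines of the form "H key <iri> <iri> ." into a map from key to a list of values, and still rejects a repeated key so that the map semantics are kept. This is not the RDFPatch parser, and the thread does not establish whether the current RDFPatch reader accepts several values on one header row. MultiValuedHeaders is a hypothetical name, and the code assumes all values are IRIs in angle brackets, with or without separating commas.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical reader for multi-valued header rows; not the RDFPatch parser. */
final class MultiValuedHeaders {
    // "H", a key, the values, then the terminating " ."
    private static final Pattern HEADER = Pattern.compile("^H\\s+(\\S+)\\s+(.*?)\\s*\\.\\s*$");
    // Assumption: every value is an IRI written as <...>
    private static final Pattern IRI = Pattern.compile("<([^<>\\s]*)>");

    private MultiValuedHeaders() {}

    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> headers = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher h = HEADER.matcher(line.trim());
            if (!h.matches()) {
                throw new IllegalArgumentException("Not a header line: " + line);
            }
            String key = h.group(1);
            List<String> values = new ArrayList<>();
            Matcher iri = IRI.matcher(h.group(2));
            while (iri.find()) {
                values.add(iri.group(1));
            }
            // One row per key, as with the existing Map-based header.
            if (headers.putIfAbsent(key, values) != null) {
                throw new IllegalArgumentException("Duplicate header: " + key);
            }
        }
        return headers;
    }
}
```

Keeping one row per key means a unique entry such as id cannot be silently duplicated, while graph and create can each carry a list.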
If you think the metadata is going to get large, then a link to
elsewhere may be better for other reasons like using the metadata
without needing to access the patch in the log.
Andy