Posted to issues@jena.apache.org by GitBox <gi...@apache.org> on 2022/04/09 20:48:38 UTC

[GitHub] [jena] fatzh opened a new issue, #1254: json-ld serialisation and URIs/predicate overlapping

fatzh opened a new issue, #1254:
URL: https://github.com/apache/jena/issues/1254

   First off, thanks for the great software, we use it a lot and it's brilliant ;) 
   
   I stumbled upon something today while trying to parse a json-ld response from Jena/Fuseki. I have a test store with a few books, their URIs look like this: `<http://onbetween.ch/3ms/cms#book_1>`.
   
   If there's a custom predicate `<http://onbetween.ch/3ms/cms#book>`, there's an overlap with the book URIs and the JSON-LD serialisation is no longer valid :-/ as I get book URIs like this: `book:_1`.
   
   Here's the very simple CONSTRUCT query that I send to Fuseki (version 4.4.0):
   
   ```
   In [67]: response = requests.post('http://localhost:3030/threems_example', data={'query': """
       ...: PREFIX cms: <http://onbetween.ch/3ms/cms#>
       ...:
       ...: CONSTRUCT {
       ...: ?a cms:book ?c
       ...: }
       ...: FROM <http://example/cmstest_data>
       ...: WHERE {
       ...: ?a a cms:Collection.
       ...: VALUES ?a { cms:collection_1 }.
       ...: ?c a cms:Book.
       ...: }
       ...: """})
   
   In [68]: print(response.content.decode())
   @prefix schema: <http://schema.org/> .
   @prefix threems: <http://onbetween.ch/3ms/core#> .
   @prefix owl:   <http://www.w3.org/2002/07/owl#> .
   @prefix cms:   <http://onbetween.ch/3ms/cms#> .
   @prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
   @prefix skos:  <http://www.w3.org/2004/02/skos/core#> .
   @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix cmsapi: <http://onbetween.ch/3ms/cmsapi#> .
   @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix xml:   <http://www.w3.org/XML/1998/namespace> .
   @prefix cmsui: <http://onbetween.ch/3ms/cmsui#> .
   @prefix dc:    <http://purl.org/dc/elements/1.1/> .
   
   cms:collection_1  cms:book  cms:book_B , cms:book_C , cms:book_2 , cms:book_1 , cms:book_A .
   ```
   
   Which is correct, got a simple collection with 5 books.
   
   Now if I request JSON-LD from Fuseki (just adding the header `Accept: application/ld+json` to the query):
   
   ```
   In [73]: response = requests.post('http://localhost:3030/threems_example', data={'query': """
       ...: PREFIX cms: <http://onbetween.ch/3ms/cms#>
       ...:
       ...: CONSTRUCT {
       ...: ?a cms:book ?c
       ...: }
       ...: FROM <http://example/cmstest_data>
       ...: WHERE {
       ...: ?a a cms:Collection.
       ...: VALUES ?a { cms:collection_1 }.
       ...: ?c a cms:Book.
       ...: }
       ...: """}, headers={'Accept':  'application/ld+json'})
   
   In [74]: print(response.content.decode())
   {
     "@id" : "cms:collection_1",
     "book" : [ "book:_B", "book:_C", "book:_2", "book:_1", "book:_A" ],
     "@context" : {
       "book" : {
         "@id" : "http://onbetween.ch/3ms/cms#book",
         "@type" : "@id"
       },
       "schema" : "http://schema.org/",
       "threems" : "http://onbetween.ch/3ms/core#",
       "owl" : "http://www.w3.org/2002/07/owl#",
       "cms" : "http://onbetween.ch/3ms/cms#",
       "xsd" : "http://www.w3.org/2001/XMLSchema#",
       "skos" : "http://www.w3.org/2004/02/skos/core#",
       "rdfs" : "http://www.w3.org/2000/01/rdf-schema#",
       "cmsapi" : "http://onbetween.ch/3ms/cmsapi#",
       "xml" : "http://www.w3.org/XML/1998/namespace",
       "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
       "cmsui" : "http://onbetween.ch/3ms/cmsui#",
       "dc" : "http://purl.org/dc/elements/1.1/"
     }
   }
   ```
   
   Fuseki serializes the books like this:
   
   `  "book" : [ "book:_B", "book:_C", "book:_2", "book:_1", "book:_A" ],`
   
   I wasn't sure if that's actually correct json-ld serialisation, but trying on the json-ld playground [here](https://tinyurl.com/y8jlny9t) I get this interpretation:
   
   ```
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <book:_1> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <book:_2> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <book:_A> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <book:_B> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <book:_C> .
   ```
   
   Which is incorrect. Should be:
   
   ```
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <http://onbetween.ch/3ms/cms#book_1> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <http://onbetween.ch/3ms/cms#book_2> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <http://onbetween.ch/3ms/cms#book_A> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <http://onbetween.ch/3ms/cms#book_B> .
   <http://onbetween.ch/3ms/cms#collection_1> <http://onbetween.ch/3ms/cms#book> <http://onbetween.ch/3ms/cms#book_C> .
   ```
   
   It's a bit of an edge case, but actually for us this may happen when working with organisation specific ontologies. 
   
   I'm more of a python dev nowadays but if I can help let me know. If you can confirm it's a bug, I can also look into it, but I'm actually not 100% sure if that's not a JSON-LD spec issue. 
   
   Also, if the predicate serialisation used the prefixes, i.e. `cms:book`, this wouldn't happen; we would have `cms:book_1`, something like:
   
   ```
   {
     "@id": "cms:collection_1",
     "cms:book": [
       "cms:book_B",
       "cms:book_C",
       "cms:book_2",
       "cms:book_1",
       "cms:book_A"
     ],
     "@context": {
       "cms:book": {
         "@id": "http://onbetween.ch/3ms/cms#book",
         "@type": "@id"
       },
       "schema": "http://schema.org/",
       "threems": "http://onbetween.ch/3ms/core#",
       "owl": "http://www.w3.org/2002/07/owl#",
       "cms": "http://onbetween.ch/3ms/cms#",
       "xsd": "http://www.w3.org/2001/XMLSchema#",
       "skos": "http://www.w3.org/2004/02/skos/core#",
       "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
       "cmsapi": "http://onbetween.ch/3ms/cmsapi#",
       "xml": "http://www.w3.org/XML/1998/namespace",
       "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
       "cmsui": "http://onbetween.ch/3ms/cmsui#",
       "dc": "http://purl.org/dc/elements/1.1/"
     }
   }
   ```
   
   What do you think ?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@jena.apache.org
For additional commands, e-mail: issues-help@jena.apache.org


[GitHub] [jena] afs commented on issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
afs commented on issue #1254:
URL: https://github.com/apache/jena/issues/1254#issuecomment-1102192335

   @gkellogg -- thanks for the details.




[GitHub] [jena] afs commented on issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
afs commented on issue #1254:
URL: https://github.com/apache/jena/issues/1254#issuecomment-1101575333

   @gkellogg -- Hi Gregg, the 1.0 and 1.1 playgrounds confirm this difference in behaviour for the "book:YYY".
   
   If you have a moment, could you point to which of the items in https://www.w3.org/TR/json-ld11/#changes-from-10 is causing this?
   
   Simplified version:
   ```json
   {
     "@id" : "http://example/collection",
     "http://example/p" : [ "book:ZZZ" ],
     "book" : [ "book:YYY" ],
     "@context" : {
       "book" : {
         "@id" : "http://onbetween.ch/3ms/cms#",
         "@type" : "@id"
       }
     }
   }
   ```
   gives 1.0 playground:
   ```nt
   <http://example/collection> <http://example/p> "book:ZZZ" .
   <http://example/collection> <http://onbetween.ch/3ms/cms#> <http://onbetween.ch/3ms/cms#YYY> .
   
   ```
   or 1.1 playground:
   ```nt
   <http://example/collection> <http://example/p> "book:ZZZ" .
   <http://example/collection> <http://onbetween.ch/3ms/cms#> <book:YYY> .
   ```
   




[GitHub] [jena] afs commented on issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
afs commented on issue #1254:
URL: https://github.com/apache/jena/issues/1254#issuecomment-1100612572

   Hi @fatzh,
   
   Jena uses [jsonld-java](https://github.com/jsonld-java/jsonld-java) for reading JSON-LD 1.0 and uses [titanium-json-ld](https://github.com/filip26/titanium-json-ld) to parse JSON-LD 1.1.
   
   Jena uses jsonld-java for writing JSON-LD (so JSON-LD 1.0). Note - your data does not have "@ version" (space added to not name a GH user!)
   
   When I parse the `[ "book:_B", "book:_C", "book:_2", "book:_1", "book:_A" ]` I get different RDF between JSON-LD 1.0 and 1.1 across jsonld-java and titanium. The JSON-LD playground shows the same difference between JSON-LD 1.0 and 1.1.
   
   See a users@jena thread:
   https://lists.apache.org/thread/zl0c6jgxnc9ckmc5pvhcoy72ypyr41fp
   
   Suggestion - could you add a prefix to the data for `book:`? The writer tries to use the prefixes to build the context.
   
   It looks like a difference between JSON-LD 1.0 and 1.1.
   




[GitHub] [jena] fatzh commented on issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
fatzh commented on issue #1254:
URL: https://github.com/apache/jena/issues/1254#issuecomment-1100751582

   hi @afs ! thanks, indeed it seems ok when parsing with `@version: 1.0`. I guess at some point jsonld-java will support 1.1, they seem to be on it.
   
   > Note - your data does not have "@ version"
   
   the data is what I get back from Jena, I guess I can live with it for now, but good to know.




[GitHub] [jena] fatzh closed issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
fatzh closed issue #1254: json-ld serialisation and URIs/predicate overlapping
URL: https://github.com/apache/jena/issues/1254




[GitHub] [jena] gkellogg commented on issue #1254: json-ld serialisation and URIs/predicate overlapping

Posted by GitBox <gi...@apache.org>.
gkellogg commented on issue #1254:
URL: https://github.com/apache/jena/issues/1254#issuecomment-1101793606

   Yes, this was intentional, as terms were used too liberally as prefixes, which caused unintended consequences. The note in the [Changes since 1.0 Recommendation of 16 January 2014](https://www.w3.org/TR/json-ld11-api/#changes-since-1-0-recommendation-of-16-january-2014) says the following:
   
   > In JSON-LD 1.1, terms will be used as [compact IRI](https://www.w3.org/TR/json-ld11/#dfn-compact-iri) prefixes when compacting only if a [simple term definition](https://www.w3.org/TR/json-ld11/#dfn-simple-term-definition) is used where the value ends with a URI [gen-delim](https://tools.ietf.org/html/rfc3986#section-2.2) character, or if their [expanded term definition](https://www.w3.org/TR/json-ld11/#dfn-expanded-term-definition) contains an @prefix [entry](https://infra.spec.whatwg.org/#map-entry) with the value true. The 1.0 algorithm has been updated to only consider terms that map to a value that ends with a URI [gen-delim](https://tools.ietf.org/html/rfc3986#section-2.2) character.
   
   The operative step is Step 10 in the [Create Term Definition Algorithm](https://www.w3.org/TR/json-ld11-api/#algorithm-0)
   
   > Create a new [term definition](https://www.w3.org/TR/json-ld11/#dfn-term-definition), definition, initializing [prefix flag](https://www.w3.org/TR/json-ld11-api/#dfn-prefix-flag) to false, [protected](https://www.w3.org/TR/json-ld11-api/#dfn-protected) to protected, and [reverse property](https://www.w3.org/TR/json-ld11-api/#dfn-reverse-property) to false.
   
   And step 14.2.5:
   
   > If term contains neither a colon (:) nor a slash (/), simple term is true, and if the [IRI](https://tools.ietf.org/html/rfc3987#section-2) mapping of definition is either an IRI ending with a [gen-delim](https://tools.ietf.org/html/rfc3986#section-2.2) character, or a [blank node identifier](https://www.w3.org/TR/rdf11-concepts/#dfn-blank-node-identifier), set the [prefix flag](https://www.w3.org/TR/json-ld11-api/#dfn-prefix-flag) in definition to true.
   
   The operative bit is that this is not a _simple term_.
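
The rule quoted above can be sketched in a few lines of Python. This is a simplified illustration, not the full Create Term Definition or IRI Expansion algorithm: it skips error handling, keyword checks and `@vocab`, and the names `prefix_flag` and `expand_compact_iri` are hypothetical helpers made up for this example. It only shows why `book:_1` is left alone by a JSON-LD 1.1 processor while `cms:book_1` is expanded.

```python
# RFC 3986 gen-delims: ":" "/" "?" "#" "[" "]" "@"
GEN_DELIMS = frozenset(":/?#[]@")


def prefix_flag(term, definition):
    """Simplified JSON-LD 1.1 check: may `term` act as a compact IRI prefix?"""
    if isinstance(definition, dict):
        # Expanded term definition: a prefix only with an explicit "@prefix": true.
        return definition.get("@prefix") is True
    # Simple term definition (a plain string IRI mapping).
    if ":" in term or "/" in term:
        return False
    # The IRI mapping must end in a gen-delim or be a blank node identifier.
    return definition.startswith("_:") or definition[-1:] in GEN_DELIMS


def expand_compact_iri(value, context):
    """Expand "prefix:suffix" only when the prefix term carries the prefix flag."""
    prefix, sep, suffix = value.partition(":")
    if not sep or suffix.startswith("//"):
        return value  # no colon, or already an absolute IRI such as http://...
    definition = context.get(prefix)
    if definition is None or not prefix_flag(prefix, definition):
        return value  # treated as an IRI with scheme "prefix"
    base = definition["@id"] if isinstance(definition, dict) else definition
    return base + suffix


# Context shaped like the one in the Fuseki output from this issue.
context = {
    "book": {"@id": "http://onbetween.ch/3ms/cms#book", "@type": "@id"},
    "cms": "http://onbetween.ch/3ms/cms#",
}

print(expand_compact_iri("book:_1", context))     # book:_1
print(expand_compact_iri("cms:book_1", context))  # http://onbetween.ch/3ms/cms#book_1
```

Under JSON-LD 1.0 any term could serve as a prefix, which is why the same document expands `book:_1` to `http://onbetween.ch/3ms/cms#book_1` there.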
   
   This can be changed by adding `"@prefix": true` to the term definition ([playground link](https://json-ld.org/playground/#startTab=tab-nquads&json-ld=%7B%22%40id%22%3A%22http%3A%2F%2Fexample%2Fcollection%22%2C%22http%3A%2F%2Fexample%2Fp%22%3A%5B%22book%3AZZZ%22%5D%2C%22book%22%3A%5B%22book%3AYYY%22%5D%2C%22%40context%22%3A%7B%22book%22%3A%7B%22%40id%22%3A%22http%3A%2F%2Fonbetween.ch%2F3ms%2Fcms%23%22%2C%22%40type%22%3A%22%40id%22%2C%22%40prefix%22%3Atrue%7D%7D%7D)):
   
   ```json
   {
     "@id" : "http://example/collection",
     "http://example/p" : [ "book:ZZZ" ],
     "book" : [ "book:YYY" ],
     "@context" : {
       "book" : {
         "@id" : "http://onbetween.ch/3ms/cms#",
         "@type" : "@id",
         "@prefix": true
       }
     }
   }
   ```
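
For clients that receive JSON-LD 1.0 style output like the Fuseki response earlier in this thread and feed it to a JSON-LD 1.1 processor, one possible workaround is to patch the context before parsing. The sketch below is an assumption-laden illustration, not something Jena or jsonld-java provides: `collect_strings` and `mark_prefix_terms` are hypothetical helper names, and the code assumes `@context` is a single JSON object sitting at the top level of the document. It adds `"@prefix": true` to every expanded term definition whose term appears before a colon somewhere in the document body.

```python
import copy


def collect_strings(node):
    """Yield every object key and every string value in a JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from collect_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_strings(item)
    elif isinstance(node, str):
        yield node


def mark_prefix_terms(doc):
    """Return a copy of `doc` whose prefix-like expanded terms have "@prefix": true."""
    patched = copy.deepcopy(doc)
    context = patched.get("@context", {})
    # Look only at the document body, not at the context itself.
    body = {key: value for key, value in patched.items() if key != "@context"}
    for text in collect_strings(body):
        prefix, sep, _ = text.partition(":")
        definition = context.get(prefix)
        # Only expanded (object) term definitions need the explicit flag.
        if sep and isinstance(definition, dict) and "@prefix" not in definition:
            definition["@prefix"] = True
    return patched


doc = {
    "@id": "cms:collection_1",
    "book": ["book:_B", "book:_1"],
    "@context": {
        "book": {"@id": "http://onbetween.ch/3ms/cms#book", "@type": "@id"},
        "cms": "http://onbetween.ch/3ms/cms#",
    },
}

print(mark_prefix_terms(doc)["@context"]["book"])
```

The check is deliberately crude: any string containing `term:` triggers the flag, including plain literals, so it may mark more terms than strictly necessary. The patched document is meant for JSON-LD 1.1 processors only, since a processor running in 1.0 mode is expected to reject `@prefix`.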
   
   
   

