Posted to dev@jena.apache.org by stain <gi...@git.apache.org> on 2018/01/11 18:33:07 UTC
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
GitHub user stain opened a pull request:
https://github.com/apache/jena/pull/341
JENA-1462: Tests RDF/XML parsing newer URI schemes
Tests for [JENA-1462](https://issues.apache.org/jira/browse/JENA-1462)
RIOT fails to parse RDF/XML when the base URI uses a scheme other than http, https, or file, such as `ssh://example.com/nested/`.
Note that this PR does not fix JENA-1462; it only adds the unit tests and test files.
The tests also highlight a bug in parsing URIs like `file://example.com/etc/passwd`, as described in [JENA-1463](https://issues.apache.org/jira/browse/JENA-1463).
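As a rough illustration of the outcome these tests expect, the sketch below resolves relative references against an `ssh://` base. It uses the JDK's `java.net.URI` rather than Jena's own IRI code, so it shows the expected result of base resolution, not how RIOT performs it. The class name, method name, and relative references are assumptions made for the example and are not taken from the PR.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Jena or of this PR.
final class BaseResolutionSketch {

    // Resolve a (possibly relative) reference against a base URI
    // with the JDK resolver, which does not restrict the scheme.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        String base = "ssh://example.com/nested/";
        System.out.println(resolve(base, "foo.txt"));   // relative path
        System.out.println(resolve(base, "/bar.txt"));  // absolute path, same authority
    }
}
```

The scheme plays no part in the resolution itself: the same calls behave identically for `http://`, `app://`, or `ssh://` bases.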
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/stain/jena JENA-1462
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/341.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #341
----
commit 6ecd48af6967ca48f985850393ac3b16df31a314
Author: Stian Soiland-Reyes <st...@...>
Date: 2018-01-11T18:12:33Z
JENA-1462: Tests RDF/XML parsing newer URI schemes
RIOT parsing RDF/XML with a base URI different from http/https/file,
such as ssh://, fails.
Note as JENA-1462 is not fixed, this only adds the unit tests.
----
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by afs <gi...@git.apache.org>.
Github user afs commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161378376
--- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
@@ -0,0 +1 @@
+<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
--- End diff --
N-Triples is handled differently - there is no IRI base resolution in normal parsing; this is intentional.
The command line "riot" specially adds IRI checking for N-Triples. Please remove the NT tests.
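To illustrate why base resolution does not matter for N-Triples: every IRI in N-Triples is already absolute, and resolving an absolute IRI against any base returns it unchanged. The sketch below checks this with the JDK's `java.net.URI`, not with Jena's parser; the class and method names are hypothetical, and `example.com` stands in for the UUID-like host in the test file.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Jena.
final class AbsoluteIriSketch {

    // True when the IRI is absolute and resolving it against the
    // given base leaves it unchanged.
    static boolean unaffectedByBase(String iri, String base) {
        URI u = URI.create(iri);
        return u.isAbsolute() && URI.create(base).resolve(u).equals(u);
    }

    public static void main(String[] args) {
        String base = "http://example.org/base/";
        System.out.println(unaffectedByBase("app://example.com/nested/foo.txt", base));
        System.out.println(unaffectedByBase("foo.txt", base));
    }
}
```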
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by stain <gi...@git.apache.org>.
Github user stain commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161559455
--- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
@@ -0,0 +1 @@
+<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
--- End diff --
Should I still remove `*.nt`? For all we know, future code might use different IRI factories for relative resolution and for NT/NQ parsing, since all the other formats use `file:///current/directory` as the base by default.
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by stain <gi...@git.apache.org>.
Github user stain commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161557584
--- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
@@ -0,0 +1,8 @@
+<rdf:RDF
+ xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
--- End diff --
I commented on this in the Java file: it was just to verify that a UUID-like hostname was not what caused the problem. It could be a valid HTTP URL, with a local DNS search domain at least :)
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/jena/pull/341
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by afs <gi...@git.apache.org>.
Github user afs commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161378072
--- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
@@ -0,0 +1,8 @@
+<rdf:RDF
+ xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
--- End diff --
Wrong URI scheme; was "file://example.com/nested/" intended?
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by stain <gi...@git.apache.org>.
Github user stain commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161557853
--- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
@@ -0,0 +1 @@
+<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
--- End diff --
I included this to verify that the IRIs were acceptable for parsing overall, precisely because N-Triples has no base resolution.
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by stain <gi...@git.apache.org>.
Github user stain commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161643496
--- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
@@ -0,0 +1,8 @@
+<rdf:RDF
+ xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
--- End diff --
Bah, sorry.. silly me, I thought the comment was on http-base.rdf! Fixed in 33063f335d0cb9224a64dc2f77dbaa9963d92ecb
`fileBaseRDF()` now passes after your JENA-1463 fix 1d037e80dfd8a91e9c4042f65695cadec4b17097 from #342 :-) -- only `fileRefRDF()` still fails (passing the base to `RDFDataMgr.read`)
```turtle
<file:///example.com/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <file:///example.com/bar.txt> .
```
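For readers comparing the two forms: in generic URI syntax, `file://example.com/...` carries `example.com` as the authority (host), while `file:///example.com/...` has an empty authority and `example.com` becomes the first path segment. The sketch below shows the difference with the JDK's `java.net.URI`. It is not Jena code, and the class and method names are hypothetical.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Jena.
final class FileUriSketch {

    // Report the host and path components the JDK parser extracts.
    static String describe(String uri) {
        URI u = URI.create(uri);
        return "host=" + u.getHost() + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: example.com is the host.
        System.out.println(describe("file://example.com/nested/foo.txt"));
        // Three slashes: no host, example.com is part of the path.
        System.out.println(describe("file:///example.com/nested/foo.txt"));
    }
}
```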
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by stain <gi...@git.apache.org>.
Github user stain commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161643602
--- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
@@ -0,0 +1 @@
+<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
--- End diff --
Removed *.nt tests in 5200369653704be86e277774e6ce6d856369eec6
---
[GitHub] jena issue #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by afs <gi...@git.apache.org>.
Github user afs commented on the issue:
https://github.com/apache/jena/pull/341
To be clear: RIOT-based parsing does not enforce IANA checking. It was just that the RDF/XML parser didn't pick up the right IRIFactory via RIOT.
https://github.com/apache/jena/pull/342/commits/aecfaac0d631995120b77f22059291ffe6b9f06f
---
[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes
Posted by afs <gi...@git.apache.org>.
Github user afs commented on a diff in the pull request:
https://github.com/apache/jena/pull/341#discussion_r161625830
--- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
@@ -0,0 +1,8 @@
+<rdf:RDF
+ xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
--- End diff --
I don't see a comment about file-base.rdf following a different pattern from the other "base.rdf" files. As it stands, file-base.rdf is the same as http-base.rdf.
All the other "X-base.rdf" files use "X:" as the scheme in xml:base. Without that, nothing tests xml:base with a file: URI.
---