Posted to dev@jena.apache.org by stain <gi...@git.apache.org> on 2018/01/11 18:33:07 UTC

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

GitHub user stain opened a pull request:

    https://github.com/apache/jena/pull/341

    JENA-1462: Tests RDF/XML parsing newer URI schemes

    Tests for [JENA-1462](https://issues.apache.org/jira/browse/JENA-1462)
    
    RIOT parsing RDF/XML with a base URI different from http/https/file, such as `ssh://example.com/nested/`, fails.
    
    Note that, because JENA-1462 is not fixed by this PR, it only adds the unit tests and test files.
    
    The tests also highlight a bug in parsing URIs like `file://example.com/etc/passwd`, as described in [JENA-1463](https://issues.apache.org/jira/browse/JENA-1463).
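
For reference, the kind of resolution these tests expect can be sketched with the JDK's `java.net.URI`. This is only a stand-in and not the IRI machinery Jena uses. The class name `BaseResolutionSketch` and the relative references `foo.txt` and `../bar.txt` are assumptions made for illustration. The point is that generic relative-reference resolution does not depend on the scheme of the base.

```java
import java.net.URI;

class BaseResolutionSketch {
    // Resolve a relative reference against a base and return the absolute form.
    // Uses java.net.URI only; this is not Jena's resolver.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Hypothetical bases: one common scheme, one "newer" scheme, one file: IRI with a host.
        String[] bases = {
            "http://example.com/nested/",
            "ssh://example.com/nested/",
            "file://example.com/nested/"
        };
        for (String base : bases) {
            // The scheme and authority of the base are carried over unchanged.
            System.out.println(resolve(base, "foo.txt") + " " + resolve(base, "../bar.txt"));
        }
    }
}
```

With this stand-in, the `ssh:` and `file:` bases resolve the same way as the `http:` one, which is the behaviour the new tests look for from the RDF/XML parser.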

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/stain/jena JENA-1462

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #341
    
----
commit 6ecd48af6967ca48f985850393ac3b16df31a314
Author: Stian Soiland-Reyes <st...@...>
Date:   2018-01-11T18:12:33Z

    JENA-1462: Tests RDF/XML parsing newer URI schemes
    
    RIOT parsing RDF/XML with a base URI different from http/https/file,
    such as ssh://, fails.
    
    Note as JENA-1462 is not fixed, this only adds the unit tests.

----


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by afs <gi...@git.apache.org>.
GitHub user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161378376
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
    @@ -0,0 +1 @@
    +<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
    --- End diff --
    
    N-Triples is handled differently - there is no IRI base resolution in normal parsing; this is intentional.
    
    The command line "riot" specially adds IRI checking for N-Triples. Please remove the NT tests.
    



---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by stain <gi...@git.apache.org>.
GitHub user stain commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161559455
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
    @@ -0,0 +1 @@
    +<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
    --- End diff --
    
    Should I still remove `*.nt`? For all we know, future code might use different IRI factories for relative resolution and for NT/NQ parsing, as all the other formats use `file:///current/directory` as the base by default.


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by stain <gi...@git.apache.org>.
GitHub user stain commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161557584
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
    @@ -0,0 +1,8 @@
    +<rdf:RDF
    +    xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
    --- End diff --
    
    As commented in the Java file, this was just to verify that the UUID-like hostname was not what caused the problem. It could be a valid HTTP URL, with a local DNS search domain at least :)


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/jena/pull/341


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by afs <gi...@git.apache.org>.
GitHub user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161378072
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
    @@ -0,0 +1,8 @@
    +<rdf:RDF
    +    xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
    --- End diff --
    
    Wrong URI scheme; was "file://example.com/nested/" intended?


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by stain <gi...@git.apache.org>.
GitHub user stain commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161557853
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
    @@ -0,0 +1 @@
    +<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
    --- End diff --
    
    I included this to verify that the IRIs were acceptable for parsing overall, precisely as N-Triples has no base resolution.


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by stain <gi...@git.apache.org>.
GitHub user stain commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161643496
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
    @@ -0,0 +1,8 @@
    +<rdf:RDF
    +    xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
    --- End diff --
    
    Bah, sorry.. silly me, I thought the comment was on http-base.rdf! Fixed in 33063f335d0cb9224a64dc2f77dbaa9963d92ecb
    
    `fileBaseRDF()` now passes after your JENA-1463 fix 1d037e80dfd8a91e9c4042f65695cadec4b17097 from #342 :-) Only `fileRefRDF()` still fails (the one passing the base to `RDFDataMgr.read`):
    
    ```turtle
    <file:///example.com/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <file:///example.com/bar.txt> .
    ```
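
The triple above uses `file:///example.com/...` (three slashes), whereas the base mentioned earlier in the thread is `file://example.com/nested/`. A minimal sketch of why the two forms are not interchangeable, again using the JDK's `java.net.URI` rather than Jena's IRI code. The class name `FileAuthoritySketch` is hypothetical.

```java
import java.net.URI;

class FileAuthoritySketch {
    // Report how java.net.URI splits an IRI into authority and path.
    static String describe(String iri) {
        URI u = URI.create(iri);
        return "authority=" + u.getAuthority() + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: "example.com" is the authority (a host name).
        System.out.println(describe("file://example.com/nested/foo.txt"));
        // Three slashes: no authority, and "example.com" becomes the first path segment.
        System.out.println(describe("file:///example.com/nested/foo.txt"));
    }
}
```

In the two-slash form `example.com` names a host; in the three-slash form it is a directory on the local file system, so a parser that turns one into the other changes what the IRI identifies.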


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by stain <gi...@git.apache.org>.
GitHub user stain commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161643602
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/app.nt ---
    @@ -0,0 +1 @@
    +<app://2dee5b0a-6100-470a-a67f-1399518cb470/nested/foo.txt> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <app://2dee5b0a-6100-470a-a67f-1399518cb470/bar.txt> .
    --- End diff --
    
    Removed *.nt tests in 5200369653704be86e277774e6ce6d856369eec6


---

[GitHub] jena issue #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by afs <gi...@git.apache.org>.
GitHub user afs commented on the issue:

    https://github.com/apache/jena/pull/341
  
    To be clear: RIOT-based parsing does not enforce IANA checking. It was just that the RDF/XML parser didn't pick up the right IRIFactory via RIOT.
    
    https://github.com/apache/jena/pull/342/commits/aecfaac0d631995120b77f22059291ffe6b9f06f


---

[GitHub] jena pull request #341: JENA-1462: Tests RDF/XML parsing newer URI schemes

Posted by afs <gi...@git.apache.org>.
GitHub user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/341#discussion_r161625830
  
    --- Diff: jena-arq/testing/RIOT/URISchemes/file-base.rdf ---
    @@ -0,0 +1,8 @@
    +<rdf:RDF
    +    xml:base="http://2dee5b0a-6100-470a-a67f-1399518cb470/nested/"
    --- End diff --
    
    I don't see a comment about file-base.rdf following a different pattern from the other "base.rdf" files. As it stands, file-base.rdf is the same as http-base.rdf.
    
    All the other "X-base.rdf" files use the "X:" scheme for xml:base. Otherwise nothing tests xml:base with a "file:" IRI.


---